# About the course

# Goals and objectives

# Jupyter Notebook

- The Jupyter Notebook is a powerful tool for interactively developing and presenting data science projects.
- The terms "notebook" and "notebook document" denote documents that contain both code and rich text elements, such as figures, links, and equations.
- Because they mix code and text elements, these documents are the ideal place to bring together an analysis description and its results, and they can be executed to perform the data analysis in real time.
- "Jupyter" is a loose acronym meaning Julia, Python, and R. These programming languages were the first target languages of the Jupyter application, but nowadays, the notebook technology also supports many other languages. 

In [None]:
from IPython.display import display, HTML
display(HTML("""<a href = "https://github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages/">IPython kernels for other languages</a>"""))

The main components of the whole environment are the notebooks themselves and the application, along with the notebook kernels and the notebook dashboard.

## What is the Jupyter Notebook App?

As a server-client application, the Jupyter Notebook App allows you to edit and run your notebooks via a web browser. The application can be executed on a PC without Internet access or it can be installed on a remote server, where you can access it through the Internet.

Its two main components are the kernels and a dashboard.

A kernel is a program that runs and introspects the user’s code. The Jupyter Notebook App has a kernel for Python code, but there are also kernels available for other programming languages.

The dashboard of the application not only shows you the notebook documents that you have made and can reopen, but can also be used to manage the kernels: you can see which ones are running and shut them down if necessary.

## The history of IPython and Jupyter Notebooks

- In the late 1980s, Guido van Rossum began working on Python at the National Research Institute for Mathematics and Computer Science in the Netherlands.
- In 2001, Fernando Pérez started developing IPython.
- In 2005, both Robert Kern and Fernando Pérez attempted to build a notebook system, but the prototype never became fully usable.
- In 2007, the IPython team made another attempt at implementing a notebook-type system.
- In October 2010, there was a prototype of a web notebook. In the summer of 2011, this prototype was incorporated into IPython, and it was released with version 0.12 on December 21, 2011.
- In subsequent years, the team received awards, such as the Award for the Advancement of Free Software for Fernando Pérez on March 23, 2013 and the Jolt Productivity Award, as well as funding from the Alfred P. Sloan Foundation, among others.
- In 2014, Project Jupyter started as a spin-off project from IPython. IPython is now the name of the Python backend, which is also known as the kernel.

If you want to know more details, check out the personal accounts of Fernando Pérez and William Stein about the history of their notebooks.

## How to install Jupyter Notebook

### Running Jupyter Notebooks with the Anaconda Python Distribution

One of the requirements here is Python 3. The general recommendation is that you use the Anaconda distribution to install Python and the notebook application.

The advantage of Anaconda is that you have access to over 720 packages that can easily be installed with Anaconda's conda, a package, dependency, and environment manager. You can download Anaconda and follow the installation instructions at this link:

In [None]:
from IPython.display import display, HTML
display(HTML("""<a href = "https://www.anaconda.com/download/">www.anaconda.com</a>"""))

## Running Jupyter

On Windows, you can run Jupyter via the shortcut Anaconda adds to your start menu, which will open a new tab in your default web browser that should look something like the following screenshot.

![image.png](attachment:image.png)

Note that the URL for the dashboard is something like http://localhost:8888/tree. Localhost is not a website, but indicates that the content is being served from your local machine: your own computer. Jupyter's Notebooks and dashboard are web apps, and Jupyter starts up a local Python server to serve these apps to your web browser, making it essentially platform independent and opening the door to easier sharing on the web.
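
To make the parts of the dashboard address concrete, the short sketch below splits the example URL from the text using Python's standard library. The port 8888 is simply the value shown above; your own server may use a different one.

```python
from urllib.parse import urlsplit

# Example dashboard address taken from the text above.
dashboard_url = "http://localhost:8888/tree"

parts = urlsplit(dashboard_url)
print(parts.scheme)    # protocol used between the browser and the server
print(parts.hostname)  # "localhost" refers to your own machine
print(parts.port)      # port the local server listens on
print(parts.path)      # "/tree" is the dashboard's file listing
```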



To create your first notebook, click the "New" drop-down button in the top-right and select "Python 3".

![image.png](attachment:image.png)

![image.png](attachment:image.png)

There are two terms that you should notice: cells and kernels. They are key both to understanding Jupyter and to what makes it more than just a word processor.

- A kernel is a "computational engine" that executes the code contained in a notebook document.

- A cell is a container for text to be displayed in the notebook or code to be executed by the notebook's kernel.

## Cells

There are two main cell types that we will cover:
    
- A code cell contains code to be executed in the kernel and displays its output below.
- A Markdown cell contains text formatted using Markdown and displays its output in-place when it is run.
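
Behind the scenes, a notebook file is plain JSON in which every cell records its type. The sketch below builds a minimal, hand-written notebook dictionary in the version 4 layout and reads the cell types back; the cell contents are made up for illustration.

```python
import json

# A hand-written, minimal notebook in the nbformat 4 layout (illustrative content).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# My first notebook"],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,  # stays empty until the cell is run
            "outputs": [],
            "source": ["print('Hello World!')"],
        },
    ],
}

# Round-trip through JSON text, as it would be stored in an .ipynb file.
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)

cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```

Only code cells carry `execution_count` and `outputs`, which is why only they show a result below the cell.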

Let's test it out with a classic hello world example:

In [None]:
print("Hello World!")





Let's try some new code to see what happens:


In [None]:
def say_hello(recipient):
    return "Hello, {}!".format(recipient)

say_hello("Jack")

## Keyboard shortcuts

Keyboard shortcuts are a very popular aspect of the Jupyter environment because they facilitate a speedy cell-based workflow. Many of these are actions you can carry out on the active cell when it's in command mode.


Below, you'll find a list of some of Jupyter's keyboard shortcuts.

* Toggle between command and edit mode with Esc and Enter, respectively.
* Once in command mode:
   - Scroll up and down your cells with your Up and Down keys.
   - Press A or B to insert a new cell above or below the active cell.
   - M will transform the active cell to a Markdown cell.
   - Y will set the active cell to a code cell.
   - D + D (D twice) will delete the active cell.
   - Z will undo cell deletion.
   - Hold Shift and press Up or Down to select multiple cells at once.
* In edit mode, Ctrl + Shift + - will split the active cell at the cursor.

For more information, go to "Help > Keyboard Shortcuts" in the notebook menu.

## Kernels

## Choosing a kernel

You have the option to change the kernel in Jupyter. Back when you created a new notebook from the dashboard by selecting a Python version, you were actually choosing which kernel to use.

Not only are there kernels for different versions of Python, but also for over 100 languages including Java, C, and even Fortran.
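
Each installed kernel is described by a small JSON file, usually named kernel.json, that tells Jupyter how to launch it. The dictionary below is a sketch of what such a file typically contains for the Python kernel; the exact values on your machine may differ.

```python
import json

# Illustrative kernel spec; values on a real installation may differ.
python_kernel_spec = {
    "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
    "display_name": "Python 3",
    "language": "python",
}

# The display name is the label you pick when creating a new notebook.
print(python_kernel_spec["display_name"])
print(json.dumps(python_kernel_spec, indent=2))
```

On a machine with Jupyter installed, running `jupyter kernelspec list` in a terminal shows the kernels that are actually available.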

## Naming your notebooks

You can use either the dashboard or your file browser to rename the .ipynb file; the default notebook file name is Untitled.ipynb.

You cannot rename a notebook while it is running, so you've first got to shut it down. The easiest way to do this is to select "File > Close and Halt" from the notebook menu. However, you can also shut down the kernel either by going to "Kernel > Shutdown" from within the notebook app or by selecting the notebook in the dashboard and clicking "Shutdown".

You can then select your notebook and click "Rename" in the dashboard controls.

## Save and checkpoint

Pressing Ctrl + S will save your notebook by calling the "Save and Checkpoint" command, but what is this checkpoint thing?

Every time you create a new notebook, a checkpoint file is created as well as your notebook file; it will be located within a hidden subdirectory of your save location called .ipynb_checkpoints and is also a .ipynb file.

By default, Jupyter will autosave your notebook every 120 seconds to this checkpoint file without altering your primary notebook file. When you "Save and Checkpoint," both the notebook and checkpoint files are updated. Therefore, the checkpoint enables you to recover your unsaved work in the event of an unexpected issue. You can revert to the checkpoint from the menu via "File > Revert to Checkpoint."
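
To see where a checkpoint would live on disk, the sketch below derives the conventional path for a given notebook. The helper name `checkpoint_path` is hypothetical, and the `-checkpoint` file name suffix assumes the default file-based checkpoints of the classic notebook server.

```python
from pathlib import Path

def checkpoint_path(notebook_path):
    """Return the conventional checkpoint location for a notebook file."""
    nb = Path(notebook_path)
    # e.g. Untitled.ipynb -> .ipynb_checkpoints/Untitled-checkpoint.ipynb
    return nb.parent / ".ipynb_checkpoints" / f"{nb.stem}-checkpoint{nb.suffix}"

print(checkpoint_path("projects/Untitled.ipynb"))
```

Because the folder name starts with a dot, many file browsers hide it unless hidden files are shown.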

#### In this course we will work with several libraries in Python
- NumPy to work with mathematical modeling
- pandas to work with our data and statistical model
- Matplotlib to plot charts
- Seaborn to make our charts prettier
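
Before starting, it can help to check that these libraries are available in your environment. The sketch below uses only the standard library to look up each module by its import name, so it runs even if some of them are missing.

```python
from importlib.util import find_spec

# Import names of the course libraries (module names, all lowercase).
course_libraries = ["numpy", "pandas", "matplotlib", "seaborn"]

# find_spec returns None when a top-level module cannot be found.
installed = {name: find_spec(name) is not None for name in course_libraries}

for name, available in installed.items():
    print(f"{name}: {'installed' if available else 'missing'}")
```

By common convention these libraries are imported as `np`, `pd`, `plt` (for `matplotlib.pyplot`), and `sns`.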

## Data structures and sequences

Data structures are a way of organizing and storing data so that they can be accessed and worked with efficiently. They define the relationship between the data, and the operations that can be performed on the data.
Many kinds of data structures have been defined that make it easier for data scientists and computer engineers alike to concentrate on the bigger picture of solving larger problems rather than getting lost in the details of data description and access.

## Primitive data structures

These are the most basic data structures. They are the building blocks for data manipulation and contain pure, simple values. Python has four primitive variable types:

- Integers
- Floats
- Strings
- Booleans
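
The first type in the list, integers, can be sketched as follows. The values are arbitrary; the point is the difference between floor division and true division.

```python
# Integers

x = 7
y = 2

print(x + y)    # addition
print(x // y)   # floor division keeps an integer result
print(x % y)    # remainder
print(x / y)    # true division always returns a float
print(type(x))  # the type of an integer value
```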

In [None]:
# Floats

x = 4.0
y = 2.0

print(x + y)
print(x * y)
print(x / y)
print(x % y)      # Returns the remainder
print(abs(x))     # Absolute value
print(x ** y)     # x to the power y

In [None]:
# String

x = "Cake"
y = "Cookie"

x + " & " + y

Here are some other basic operations that you can perform with strings. For example, you can use * to repeat a string a certain number of times:

In [None]:
# repeat

x * 3

You can also slice strings, which means that you select parts of strings:

In [None]:
# Range slicing: everything from index 2 to the end

z1 = x[2:]

print(z1)


# Indexing: pick single characters and concatenate them

z2 = y[0] + y[1]

print(z2)
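
Slicing follows the general pattern `s[start:stop:step]`, where `stop` is excluded and any part may be omitted. The sketch below shows a few more variations on a sample string.

```python
word = "Cookie"

print(word[1:4])   # characters at positions 1, 2 and 3
print(word[:3])    # from the start up to, but not including, position 3
print(word[-2:])   # the last two characters
print(word[::2])   # every second character
print(word[::-1])  # the string reversed
```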

Note that strings can also contain digits, but the + operator still concatenates them as text rather than adding them as numbers.

In [None]:
x = "4"
y = "3"

x + y
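
If you want numeric addition instead, convert the strings first. The sketch below contrasts the two behaviors using the same values.

```python
x = "4"
y = "3"

print(x + y)            # string concatenation
print(int(x) + int(y))  # integer addition after conversion
print(str(4 + 3))       # converting a number back to a string
```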

In [None]:
# Boolean

x = 4
y = 2

print(x == y)   # False: 4 is not equal to 2
print(x > y)    # True: 4 is greater than 2

In [None]:
x = 4
y = 2
z = (x == y)      # Comparison expression

if z:             # Branch on the truth value of z
    print("Cookie")
else:
    print("No Cookie")

To check the type of an object in Python, use the built-in type() function, just like in the lines of code below:

In [27]:
a = 4.0

b = "Class"

c = 12



In [None]:
type(a)

In [None]:
type(b)

In [None]:
type(c)
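
When you want to test for a type rather than display it, the built-in `isinstance()` function is a common alternative. The sketch below reuses the same three example values.

```python
a = 4.0
b = "Class"
c = 12

print(isinstance(a, float))
print(isinstance(b, str))
print(isinstance(c, (int, float)))  # a tuple checks several types at once
print(isinstance(True, int))        # bool is a subclass of int in Python
```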

## Non-Primitive data structures

#### The non-primitive data structures in Python are divided into:

- Arrays

- Lists

- Files
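
As a quick, hedged illustration of the three categories above, the sketch below creates one of each using only the standard library. The file name `example.txt` and the sample values are arbitrary.

```python
from array import array
from pathlib import Path
import tempfile

# Array: a compact sequence whose items all share one type ("i" = signed int).
numbers = array("i", [1, 2, 3])
numbers.append(4)

# List: an ordered, mutable sequence that may mix types.
mixed = [1, "two", 3.0]
mixed.append(True)

# File: data stored on disk; a temporary directory keeps the example tidy.
with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = Path(tmp_dir) / "example.txt"
    file_path.write_text("Cake\nCookie\n")
    lines = file_path.read_text().splitlines()

print(numbers.tolist())
print(mixed)
print(lines)
```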

### Arrays

### Lists

### Files

Resources:

- https://www.dataquest.io
- http://jupyter.org/
- https://www.datacamp.com