# Introduction to Jupyter Notebook

## What is Jupyter Notebook?

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. 

Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.

In this context, a "notebook" or "notebook document" is a document that contains both code and rich text elements, such as figures, links, and equations. Because they mix code and text, these documents are an ideal place to bring together the description of an analysis and its results, and they can be executed to perform the data analysis in real time.

**"Jupyter"** is a loose acronym meaning **Julia, Python, and R**. These programming languages were the first target languages of the Jupyter application, but nowadays, the notebook technology also supports many other languages. 

All the code in the course is compatible with any text editor or IDE that works with Python.

In [1]:
%pwd

'/Users/oluwafemifabiyi/Desktop/MSc Analytics/Autumn 2020/MSCA 37010 2 - Programming for Analytics/Week 6/Lecture'

## What Is the Jupyter Notebook app?

As a server-client application, the Jupyter Notebook app allows you to edit and run your notebooks via a web browser. The application can be executed on a PC without internet access, or it can be installed on a remote server, where you can access it through the Internet.

Its two main components are the **kernels** and a **dashboard**.

- A **kernel** is a program that runs the user’s code. The Jupyter Notebook app has a kernel for Python code, but there are also kernels available for other programming languages.

- The **dashboard** of the application not only shows you the notebook documents that you have made and can reopen but can also be used to manage the kernels: you can use it to decide which ones are running and shut them down if necessary.

## The History of IPython and Jupyter Notebook

To fully understand what the Jupyter Notebook is and what functionality it has to offer, it helps to know how it originated.

- **In the late 1980s**, Guido van Rossum began to work on Python at the National Research Institute for Mathematics and Computer Science in the Netherlands.

- **In 2001**, Fernando Pérez started developing IPython.

- **In 2005**, both Robert Kern and Fernando Pérez attempted to build a notebook system. Unfortunately, the prototype never became fully usable.

- **In 2007**, the IPython team made another attempt at implementing a notebook-type system.

- **By October 2010**, there was a prototype of a web notebook.

- **In the summer of 2011**, this prototype was incorporated into IPython, and it was released as version 0.12 on December 21, 2011.

- **In subsequent years**, the team received awards, such as the Award for the Advancement of Free Software, presented to Fernando Pérez on March 23, 2013, and the Jolt Productivity Award, as well as funding from the Alfred P. Sloan Foundation, among others.

- **Lastly, in 2014**, Project Jupyter started as a spin-off project from IPython. IPython is now the name of the Python backend, which is also known as the kernel. More recently, the next generation of the Jupyter Notebook interface, called JupyterLab, has been introduced to the community.

After all this, you might wonder where the idea of notebooks came from and how it occurred to the creators.

A brief look into the history of these notebooks shows that Fernando Pérez and Robert Kern were working on a notebook at the same time as the Sage notebook was a work in progress. Since the layout of the Sage notebook was based on the layout of Google notebooks, you can also conclude that Google had a notebook feature around that time.

As for the idea of the notebook itself, it seems that both Fernando Pérez and William Stein, one of the creators of the Sage notebook, have confirmed that they were avid users of Mathematica notebooks and Maple worksheets. The Mathematica notebooks were created as a front end, or GUI, in 1988 by Theodore Gray.

The concept of a notebook, which contains ordinary text and calculation and/or graphics, was definitely not new. 


## How to Use Jupyter Notebook

### 1. Two ways to open Jupyter Notebook:

1. Click on the Anaconda icon

2. Type "cmd" --> "Anaconda Prompt" --> Type in "jupyter notebook" --> Hit "Enter"


### 2. Run your Python command
Hit `Ctrl` + `Enter` to run your Python command in the current cell

Hit `Shift` + `Enter` to run your Python command in the current cell and select the cell below

Hit `Alt` + `Enter` to run your Python command in the current cell and insert a new cell below (Windows users)

Hit `Option` + `Enter` to run your Python command in the current cell and insert a new cell below (Mac users)

Hit `A` to add a cell above and `B` to add a cell below when the current cell is not in edit mode (a blue bar, instead of a green bar, is shown on the left of the current cell)

#### Note: Go to "Help" --> "Keyboard Shortcuts" for more shortcuts

### 3. Walk through the main buttons

### 4. Type your Python command. It can be a multi-line command too.

In [2]:
name = input("Enter your name: ")
print("Hello", name)

Enter your name:  Femi


Hello Femi


### 5. Stop the running of your Python command

If you want to stop the running code, go to `Kernel` --> `Restart`. This shuts the kernel down and starts a fresh one, so all variables are lost.

If you want to stop the running code but keep your variables, go to `Kernel` --> `Interrupt` (gentler than "Restart", but it does not always work if the code is stuck in a very bad loop)

If you want to stop the running code and clear the output, go to `Kernel` --> `Restart & Clear Output`
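
As a rough illustration of why `Interrupt` keeps your variables: in a Python kernel, an interrupt surfaces inside the running code as a `KeyboardInterrupt` exception, which the code can catch. The helper name `run_until_interrupted` and the simulated interrupt below are assumptions made for this sketch and are not part of Jupyter.

```python
def run_until_interrupted(step, max_steps=1_000_000):
    """Call step() repeatedly; stop cleanly if the kernel is interrupted."""
    completed = 0
    try:
        for _ in range(max_steps):
            step()
            completed += 1
    except KeyboardInterrupt:
        # Kernel --> Interrupt ends up here; variables defined so far survive.
        print(f"Interrupted after {completed} steps")
    return completed


# Simulate an interrupt on the fourth call so the sketch runs unattended.
calls = {"n": 0}


def fake_step():
    calls["n"] += 1
    if calls["n"] == 4:
        raise KeyboardInterrupt


steps_done = run_until_interrupted(fake_step)
print(steps_done)
```

In a real notebook you would pass your own function as `step` and use `Kernel` --> `Interrupt` instead of the simulated exception.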

In [None]:
print("Your Name")

In [None]:
# A simple Python code sample
x = 34 - 23    # A comment
xy = "Hello"   # Another comment
xyz = 3.45

if xyz == 3.45 or xy == "Hello":
    x = x + 1
    xy = xy + " World"   # String concatenation
print(x)
print(xy)

In [None]:
x = 34 - 23    # A comment
x

#### Note: `=` is an assignment operator vs. `==` is an equality operator
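
A small sketch of the difference, using made-up variable names (`total`, `is_eleven`, `is_twelve`) that do not appear elsewhere in this notebook:

```python
total = 34 - 23          # `=` assigns: total now refers to the value 11
is_eleven = total == 11  # `==` compares and returns a bool
is_twelve = total == 12

print(total)
print(is_eleven)
print(is_twelve)
```

The first line changes what `total` refers to; the next two lines only ask a question about it and leave it unchanged.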

In [None]:
# Practice: print "My name is XXX. Hello Python World!"
print('My name is ' + name + '. Hello Python World!')

### 6. Start typing, then press `Tab`

If possible, Jupyter will auto-complete your expression (e.g., Python commands or variables that you have already defined). If there is more than one possibility, you can choose from a drop-down menu.

In [None]:
# Example: type the line below, then press Tab with the cursor right after the dot
xyz.


### 7. Create Folders and Files

Go to the Jupyter Notebook Home Page: 

--> Click on the `New` drop-down box in the upper right corner.

--> Select either `Python 3` to create a new notebook with Python 3 or `Folder` to create a new folder.

In [None]:
# Exercise: Create your own Assignment Folder


### 8. How to create a .py file from your notebook in order to open the script in other applications
Jupyter Notebook saves your work as an `.ipynb` file. If you need a `.py` file that can be opened in other applications, follow these steps:

--> Go to `File`

--> Then go to `Download as`

--> Select the file type `.py` 

Or, use the command-line tool `nbconvert`
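
To give a feel for what such a conversion involves, here is a minimal sketch that uses only the standard `json` module rather than nbconvert itself. It assumes the usual `.ipynb` layout (a JSON document with a `cells` list whose entries carry `cell_type` and `source`). The function name `notebook_code` and the tiny hand-written notebook are made up for this illustration.

```python
import json


def notebook_code(ipynb_text):
    """Return the code cells of a notebook (given as JSON text) as one script."""
    notebook = json.loads(ipynb_text)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip Markdown and raw cells
        source = cell.get("source", "")
        # The source may be stored as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)
    return "\n\n".join(chunks) + "\n"


# A tiny hand-written notebook, used here instead of a real file.
demo = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["x = 1\n", "print(x)"]},
        {"cell_type": "code", "source": "y = x + 1"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})

script = notebook_code(demo)
print(script)
```

This sketch copies cell text verbatim, so IPython magics such as `%pwd` would not run in a plain Python interpreter. For real notebooks, the `Download as` menu or nbconvert is the safer route.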

### 9. Path and Folder Management (see 1.2 for more details)

#### Get Work Directory

In [None]:
# To find out where your notebooks are, run %pwd in a cell. This prints your working directory.
%pwd

### The `os` module (`os` stands for operating system)

**The `os` module in Python provides functions for interacting with the operating system.** It is part of Python's standard library and provides a portable way of using operating-system-dependent functionality. The `os` and `os.path` modules include many functions for interacting with the file system.

Below are some basic commands; `PythonIntro-1.2` covers in more detail how to import this module and how to use it to navigate, create, delete, and modify files and folders.

In [None]:
import os
os.getcwd()    # print the current directory vs. RStudio: getwd()

In [None]:
print(os.getcwd())

In [None]:
os.listdir(os.getcwd())       # list directory contents
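
As a small sketch of how `os` and `os.path` work together, the cell below creates a folder and a file and then tells them apart in the listing. It works inside a temporary directory so that nothing in your real working directory is touched. The names `Assignment` and `notes.txt` are placeholders chosen for this example.

```python
import os
import tempfile

# Work inside a throwaway directory that is deleted automatically afterwards.
with tempfile.TemporaryDirectory() as workdir:
    os.makedirs(os.path.join(workdir, "Assignment"))          # create a folder
    with open(os.path.join(workdir, "notes.txt"), "w") as f:  # create a file
        f.write("hello")

    entries = sorted(os.listdir(workdir))                     # list directory contents
    folders = [e for e in entries if os.path.isdir(os.path.join(workdir, e))]
    files = [e for e in entries if os.path.isfile(os.path.join(workdir, e))]

print(entries)
print(folders)
print(files)
```

`os.path.join` builds the path with the separator of the current operating system, which is part of what makes the module portable.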

#### Change Work Directory

In [None]:
os.chdir(r'C:\Downloads')     # vs. setwd() in RStudio
print(os.getcwd())
# Or, go to cmd --> Anaconda Prompt --> type in "cd" followed by the folder path to change directory

In [None]:
os.chdir(r'C:\Users\chris\PythonProgramming')

# Note: An error might occur if you write a Windows path as a normal string, because a backslash
# starts an escape sequence (for example, \U in 'C:\Users' is read as a Unicode escape).
# To fix this, put r before the string to turn it into a raw string.
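
A short sketch of what the `r` prefix changes, reusing the example path from the cell above. Nothing here touches the file system, so it runs on any operating system.

```python
from pathlib import PureWindowsPath

raw_path = r'C:\Users\chris\PythonProgramming'        # raw string: backslashes kept as typed
escaped_path = 'C:\\Users\\chris\\PythonProgramming'  # normal string: each backslash doubled

same_path = raw_path == escaped_path
print(same_path)

# In a normal string, \n is one newline character; in a raw string it is two characters.
print(len('\n'), len(r'\n'))

# pathlib can split a Windows-style path into its parts without accessing the disk.
folder_name = PureWindowsPath(raw_path).name
print(folder_name)
```

Both spellings produce the same string; the raw form is simply easier to type and to read.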

In [None]:
%pwd

### 10. Markdown for Jupyter Notebook: 

### 1) Select Cell Type:

Go to `Cell` --> `Cell Type` --> `Markdown`

Note: `Raw NBConvert` cells are only useful if you intend to use the command-line tool nbconvert to convert the notebook to another format (such as HTML or LaTeX). When you do, cells marked as Raw NBConvert will be converted in a way specific to the output format you're targeting.

### 2) 4 Different Text Sizes:


# Size 1. title 
## Size 2. major headings 
### Size 3. subheadings 
#### Size 4. 4th level subheadings

#### Note: Use this code
`#` Size 1. title

`##` Size 2. major headings

`###` Size 3. subheadings

`####` Size 4. 4th level subheadings

### 3) Emphasis

Use this code: 

Bold: `__`string`__` or `**`string`**`

Italic: `_`string`_` or `*`string`*`

In [None]:
# Exercise: Create a sentence in bold and italic


### 4) Mathematical symbols 
Use this code: 
`$` mathematical symbols `$`

### 5) Line Breaks
Sometimes Markdown doesn't make line breaks when you want them. End the line with two spaces, or use this code for a manual line break:

e.g.1:

`<br>`


e.g.2:

### 6) Colors: 
Use this code: `<font color=blue|red|green|pink|yellow>Text</font>`. Not all Markdown code works within a font tag, so review your colored text carefully.

e.g.: <font color=blue>Text</font>

### 7) Indented quoting: 
Use a `greater than` sign (`>`) and then `a space`, then type the text. The text is indented and has a gray horizontal line to the left of it until the next carriage return.

### 8) Bullets: 
Use the dash sign (`- `) with a space after it, or a space, a dash, and a space (` - `), to create a circular bullet.
To create a sub-bullet, use a tab followed by a dash and a space. You can also use an asterisk instead of a dash, and it works the same.

### 9) Numbered lists: 
Start with 1. followed by a space, then it starts numbering for you. Start each line with some number and a period, then a space. Tab to indent to get subnumbering.

### 10) Graphics: 
You can attach image files directly to a notebook only in Markdown cells:

--> Set the Cell Type to be Markdown type

--> Drag and drop your image into the Markdown cell to attach it to the notebook.

![Quote.JPG](attachment:Quote.JPG)

In [None]:
# Exercise: attach a panda picture or any picture to this notebook


### 11) Horizontal lines: 
Use three asterisks: `***`

***

### Internal links: 
To link to a section, use this code: `[section title](#section-title)`. For the text in the parentheses, replace spaces and special characters with a hyphen. Make sure to test all the links!

Alternatively, you can add an ID for a section right above the section title. Use this code: `<a id="section_ID"></a>`. Make sure that the section_ID is unique within the notebook.

Then use this code for the link, and make sure to test all the links: `[section title](#section_ID)`
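
The replacement rule for link targets can be sketched in a few lines of Python. The helper names `section_anchor` and `section_link` are hypothetical, and the sketch only follows the rule as stated above. Jupyter front ends may generate anchors slightly differently, so still test the resulting links.

```python
import re


def section_anchor(title):
    """Build a link target from a heading by replacing spaces and special characters."""
    # Every character that is not a letter or digit becomes a hyphen.
    return "#" + re.sub(r"[^A-Za-z0-9]", "-", title.strip())


def section_link(title):
    """Return Markdown link code that points at the section with this title."""
    return f"[{title}]({section_anchor(title)})"


print(section_link("How to Use Jupyter Notebook"))
```

Paste the printed text into a Markdown cell and click the rendered link to confirm that it jumps to the right heading.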

### External links: 
Use this code and test all links: `[link text](http://url)`

[link text](https://www.solutionreach.com/blog/which-wins-the-national-average-no-show-rate-or-yours)

### 11. Help

The regular Python `help()` function still works in a notebook, and you can use the magic command `%quickref` to show a quick reference sheet of the magic commands. You will see a whole list of them. Some, such as `%save`, `%clear`, or `%debug`, are easy to grasp, but others will be less straightforward.

If you're looking for more information on the magic commands or on functions, you can always use the `?`, just like this:

In [None]:
%quickref

In [None]:
# Retrieving documentation on the alias_magic command
?%alias_magic

In [None]:
# Retrieving information on the range() function
?range

In [None]:
help()

#### Or, search Google and Stack Overflow for your error message and see if you can find a posted solution

#### How to get the Docstring and method list pop-ups in Jupyter Notebook: 

Use Tab with your cursor directly after a defined variable to see the list of methods. 

In [1]:
# Example
A_list = [1,2,3,4] # type: A_list. (ending with the dot) and then press Tab to see the list of methods. 
                   # For the docstrings of functions, use Shift+Tab with your cursor right after the function.

In [2]:
A_list

[1, 2, 3, 4]
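
The information behind these pop-ups is also available from plain Python, which helps when you work outside the notebook. The sketch below uses the built-in `dir()` and the standard `inspect` module. It approximates what the pop-ups show and is not how Jupyter itself implements them.

```python
import inspect

A_list = [1, 2, 3, 4]

# Public methods of the list, similar to what the Tab pop-up offers
methods = [name for name in dir(A_list) if not name.startswith("_")]
print(methods)

# Docstring of one method, similar to what Shift+Tab displays
doc = inspect.getdoc(A_list.append)
print(doc)
```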

### 12. Making your Jupyter Notebook magical

If you want to get the most out of your notebooks with the IPython kernel, you should consider learning about the so-called "magic commands".

You might also consider adding more interactivity to your notebook so that it becomes an interactive dashboard for others.

The notebook has built-in commands: predefined "magic functions" that will make your work a lot more interactive.

To see which magic commands you have available in your interpreter, you can simply run the following:

In [3]:
%lsmagic

Available line magics:
%alias  %alias_magic  %autoawait  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %conda  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %pip  %popd  %pprint  %precision  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%js  %%latex  %%markdown  %%perl  %%prun  %%pypy  %%

In [None]:
# CPU or execution time - measures how much time the CPU spent executing a program

# Wall-clock time - also called elapsed or running time; measures the total time spent executing a program on a computer.


In [9]:
import time
time.time()    # wall-clock time: seconds since the epoch, as a float
time.gmtime()  # current UTC time as a struct_time (for CPU time, use time.process_time())

time.struct_time(tm_year=2020, tm_mon=11, tm_mday=13, tm_hour=17, tm_min=46, tm_sec=32, tm_wday=4, tm_yday=318, tm_isdst=0)

#### `%%time` times the entire cell, whereas `%time` times only the statement on its own line

Using `%%time` or `%time` prints two values: CPU times and wall time

In [6]:
%%time
for i in range (0,20):
    print ("I'm timing output {}".format(i))

I'm timing output 0
I'm timing output 1
I'm timing output 2
I'm timing output 3
I'm timing output 4
I'm timing output 5
I'm timing output 6
I'm timing output 7
I'm timing output 8
I'm timing output 9
I'm timing output 10
I'm timing output 11
I'm timing output 12
I'm timing output 13
I'm timing output 14
I'm timing output 15
I'm timing output 16
I'm timing output 17
I'm timing output 18
I'm timing output 19
CPU times: user 458 µs, sys: 235 µs, total: 693 µs
Wall time: 508 µs


In [None]:
%time
for i in range (0,20):
    print ("I'm timing output {}".format(i))
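
The two values reported by the timing magics can also be measured by hand with the standard `time` module, which makes the difference between CPU time and wall time visible. This is a minimal sketch: the 0.2-second sleep and the small sum are arbitrary stand-ins for "waiting" and "computing".

```python
import time

wall_start = time.perf_counter()  # wall-clock timer
cpu_start = time.process_time()   # CPU time used by this process

time.sleep(0.2)                             # waiting: wall time passes, the CPU is idle
total = sum(i * i for i in range(100_000))  # computing: uses the CPU

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print(f"Wall time: {wall_elapsed:.3f} s")
print(f"CPU time:  {cpu_elapsed:.3f} s")
```

Because the sleep counts toward wall time but not toward CPU time, the wall time should come out clearly larger here.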

### 13. Popular Data Science Python Libraries

In [None]:
# Import Python libraries and press Shift+Enter to execute the Jupyter cell
import numpy as np        # data prep 
import pandas as pd       # data prep
import scipy as sp        # math and stat
import matplotlib as mpl  # data visualization
import seaborn as sbn     # data visualization
import sklearn as skl     # machine learning

### Numpy

- Introduces objects for multidimensional arrays and matrices, as well as functions that let you easily perform advanced mathematical and statistical operations on those objects
- Provides vectorization of mathematical operations on arrays and matrices, which significantly improves performance
- Many other Python libraries are built on NumPy
- http://www.numpy.org/

### Pandas
- Adds data structures and tools designed to work with table-like data (similar to series and data frames in R) 
- Provides tools for data manipulation: reshaping, merging, sorting, slicing, aggregation etc. 
- Allows handling missing data
- http://pandas.pydata.org/


### Scipy: 
- Collection of algorithms for linear algebra, differential equations, numerical integration, optimization, statistics and more 
- Part of SciPy Stack 
- Built on Numpy
- https://www.scipy.org/scipylib/

### Matplotlib:
- Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats

- A set of functionalities similar to those of MATLAB

- Line plots, scatter plots, barcharts, histograms, pie charts etc.

- Relatively low-level; some effort needed to create advanced visualization
- https://matplotlib.org/


### Seaborn:

- Based on matplotlib
- Provides high level interface for drawing attractive statistical graphics
- Similar (in style) to the popular ggplot2 library in R
- https://seaborn.pydata.org/

### Sklearn (scikit-learn):
- Provides machine learning algorithms: classification, regression, clustering, model validation etc. 
- Built on NumPy, SciPy and matplotlib
- http://scikit-learn.org/

#### Note: The course materials are developed mainly based on personal experience and contributions from the Python learning community
Reference book: Learning Python, 5th Edition by Mark Lutz
    