# Introduction to Jupyter Notebook

* Reference: http://programminghistorian.org/en/lessons/jupyter-notebooks

## What is Jupyter Notebook?

* `Jupyter Notebook` is a tool that we can use to write, edit and execute a computer program. The statements of the computer program we write and the results of executing the program are saved in the same document (notebook). The document has an extension `.ipynb`.
    * Jupyter Notebook can work with **different programming languages**, such as Python and R. The component that runs the code for a given language is called a `kernel`. 
    * Jupyter Notebook enables you to develop **reproducible** data analysis pipelines. 
    * Using Jupyter Notebook, you can load, transform, and analyze data. 
    * You can also develop and experiment with models. 
* Notebooks (Galea 2018):
    - Lab-style notebooks: serve as the programming analog of research journals. They should contain all of the work that has been done, including loading, processing, analyzing, and modeling the data. 
    - Deliverable notebooks: intended to be presentable; they should contain only selected parts of the lab-style notebooks. 
* What to learn with the workbook? 
    - Useful Jupyter Notebook features
    - Introductions to the Python libraries we'll use
* Reference: https://athena.brynmawr.edu/jupyter/hub/dblank/public/Jupyter%20Notebook%20Users%20Manual.ipynb
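
To make the point about code and results living in one `.ipynb` document more concrete, here is a minimal sketch. It relies on the fact that a notebook file is plain JSON; the cell contents and the kernel metadata below are hand-written for illustration and are not taken from a real notebook.

```python
import json

# A hand-written, minimal notebook structure (nbformat 4).
# The cell contents are illustrative assumptions, not a real saved notebook.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {"kernelspec": {"name": "python3", "display_name": "Python 3"}},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Sales example"]},
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print(100 * 15.5)"],
            # The result of running the cell is stored next to its source.
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["1550.0\n"]}
            ],
        },
    ],
}

# This text is what would be written to a .ipynb file.
text = json.dumps(notebook, indent=1)

# Reading it back gives the same cells, with source and outputs together.
loaded = json.loads(text)
for cell in loaded["cells"]:
    print(cell["cell_type"], "->", "".join(cell["source"]))
```

In everyday use you never edit this JSON by hand; the Jupyter interface reads and writes it for you.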

## Markdown
* Reference: https://www.tutorialspoint.com/jupyter/jupyter_notebook_markdown_cells.htm
* Markdown is a lightweight markup language that is popular in the field of data science.
* Syntax:
    - Markdown style
    - HTML style
* Components
    - headings
    - lists
    - tables
    - italic and bold
* <a href='https://rmd4sci.njtierney.com/math'>Formula Markdown</a> 
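
As a rough illustration of the components listed above, the sketch below builds a small piece of Markdown text in Python and prints it. The sample text is made up for this example; pasting the printed text into a Markdown cell should render a heading, a list, a table, emphasized text, an HTML-style tag, and a formula.

```python
# Hypothetical sample text, written only to illustrate the syntax.
markdown_text = "\n".join([
    "## A heading",
    "",
    "- first list item",
    "- second list item",
    "",
    "| Item | Qty |",
    "| --- | --- |",
    "| Pen | 100 |",
    "",
    "*italic* and **bold**",
    "",
    "<b>bold, HTML style</b>",
    "",
    "$a^2 + b^2 = c^2$",
])

print(markdown_text)
```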

## An Illustrative Example 
* Here, we will use a simple example to show how to use Jupyter Notebook. 
* We will write statements that read input from the user, do a simple calculation, and print the result. 

In [3]:
# -- Load necessary libraries --
# -- For now, consider libraries as toolsets -- 
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

ModuleNotFoundError: No module named 'numpy'

In [28]:
# -- Ask for quantity --
qty = input("Enter quantity: ")

Enter quantity: 100


In [29]:
# -- Ask for unit price --
unitPrice = input("Enter unit price: ")

Enter unit price: 15.5


In [31]:
# -- Calculate sales amount (input() returns text, so convert to float first) --
amount = float(qty) * float(unitPrice)

In [32]:
# -- Print out sales amount -- 
print('Sales amount: ',amount)

Sales amount:  1550.0
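
The cells above can also be collected into one small function. This is a sketch only: the name `sales_amount` is made up for this example, and the two values are passed in as text to mimic what `input()` returns, so the block runs without waiting for keyboard input.

```python
def sales_amount(qty_text, unit_price_text):
    """Convert two text entries to numbers and return their product."""
    # input() always returns a string, so both values are converted first.
    return float(qty_text) * float(unit_price_text)


# The same values that were typed in above, given as strings.
amount = sales_amount("100", "15.5")
print(f"Sales amount: {amount}")

# Without the conversion, * repeats the text instead of multiplying.
print("100" * 2)
```

The last line shows why the conversion matters: multiplying the string `"100"` by 2 gives `"100100"`, not 200.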


In [34]:
input?

# Jupyter Features and Functions

## Getting Help: Access to Documentation
* "**Being an effective practitioner of data science is less about memorizing the tool or command you should use for every possible situation, and more about learning to effectively find the information you don't know, whether through a web search engine or another means.**" (p. 3, VanderPlas, 2017). 
* Three IPython tools: 
    - `?`: explore documentation; add a question mark to the end of a command or object name
    - `??`: explore source code
    - Tab key: autocompletion

### Explore Documentation
**docstring**: Most Python objects carry a reference to a string called a **docstring**, which summarizes the object and how to use it. 
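
The `?` syntax only works inside IPython and Jupyter. As a minimal sketch of what it reads, the docstring can also be reached from plain Python through the `__doc__` attribute or `inspect.getdoc`. The function `cube` is a made-up example.

```python
import inspect


def cube(a):
    """Return the cube of a."""
    return a ** 3


# The docstring is stored on the object itself.
print(cube.__doc__)

# inspect.getdoc returns the same text with indentation cleaned up.
print(inspect.getdoc(cube))

# Built-in objects carry docstrings too.
print(inspect.getdoc(sorted))
```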

In [4]:
# -- Get the pandas read_csv help (docstring) --
pd.read_csv?

Object `pd.read_csv` not found.


In [15]:
# -- Get the pandas groupby help (docstring) -- 
pd.DataFrame.groupby?

In [16]:
# -- Get the Python built-in sorted() help (docstring) --
sorted?

In [17]:
# -- Get the pandas groupby help (docstring) by using Python built-in help() -- 
help(pd.DataFrame.groupby)

Help on function groupby in module pandas.core.frame:

groupby(self, by=None, axis=0, level=None, as_index: bool = True, sort: bool = True, group_keys: bool = True, squeeze: bool = False, observed: bool = False) -> 'groupby_generic.DataFrameGroupBy'
    Group DataFrame using a mapper or by a Series of columns.
    
    A groupby operation involves some combination of splitting the
    object, applying a function, and combining the results. This can be
    used to group large amounts of data and compute operations on these
    groups.
    
    Parameters
    ----------
    by : mapping, function, label, or list of labels
        Used to determine the groups for the groupby.
        If ``by`` is a function, it's called on each value of the object's
        index. If a dict or Series is passed, the Series or dict VALUES
        will be used to determine the groups (the Series' values are first
        aligned; see ``.align()`` method). If an ndarray is passed, the
        values are used 

### Access Source Code with ??
* If the object is implemented in Python, then `??` displays its source code. 
* If `??` shows no source code (only the same output as `?`), the object is usually implemented in C or another compiled language. 

In [18]:
def square(a):
    """Return the square of a."""
    return a ** 2


In [19]:
square??
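
The difference between Python-level and C-level objects can also be checked from plain Python with the standard `inspect` module. This is a sketch under assumptions: the helper `source_or_none` is hypothetical, and whether the source of `square` can be retrieved depends on how the code was loaded.

```python
import inspect


def square(a):
    """Return the square of a."""
    return a ** 2


def source_or_none(obj):
    """Return the Python source of obj, or None when it cannot be retrieved."""
    try:
        return inspect.getsource(obj)
    except (TypeError, OSError):
        # TypeError: built-in object with no Python source.
        # OSError: the source file is not available.
        return None


print(inspect.isfunction(square))  # True: defined in Python
print(inspect.isbuiltin(len))      # True: implemented in C
print(source_or_none(len))         # None: no Python source to show
```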

## Tab Completion




### Tab completion when Importing
- listing available modules and names on import   
`import <tab>`   
`from numpy import <tab>`
- listing available attributes after import         
`np.<tab>`   



### Tab Completion of Objects
- function completion    
`np.ar<tab>`   
`sor<tab>([2, 3, 1])`   
- variable completion    
`myvar_1 = 5`   
`myvar_2 = 6`   
`my<tab>`   
- listing relative path directory contents   
`../<tab>`   
(then press Enter on a folder and Tab again to show its contents)   
- listing private/special methods   
`M._<tab>`  
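
Tab completion itself only works interactively, but the idea behind it can be sketched in plain Python: list the names an object offers and keep those that start with what has been typed so far. The helper `complete` is made up for this example and is a simplification, not how IPython implements completion.

```python
import math


def complete(obj, prefix):
    """Return the attribute names of obj that start with prefix."""
    return [name for name in dir(obj) if name.startswith(prefix)]


# Roughly what math.sq<tab> would offer.
print(complete(math, "sq"))

# Roughly what str.is<tab> would offer.
print(complete(str, "is"))

# Private/special methods appear once the prefix starts with an underscore.
print(complete(str, "_")[:5])
```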


#### Tab Completion: Wildcard matching
* To match characters in the middle or at the end of a word, use the `*` wildcard character together with `?`   
`*Warning?`   
`str.*find?`   
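
The two wildcard searches above can be approximated in plain Python with the standard `fnmatch` module. This is only a sketch of the matching idea; the helper `wildcard` is a made-up name, and IPython's own search may differ in detail.

```python
import builtins
from fnmatch import fnmatchcase


def wildcard(obj, pattern):
    """Return the attribute names of obj that match a * wildcard pattern."""
    return [name for name in dir(obj) if fnmatchcase(name, pattern)]


# Similar in spirit to: str.*find?
print(wildcard(str, "*find"))  # ['find', 'rfind']

# Similar in spirit to: *Warning?
print(wildcard(builtins, "*Warning"))
```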

## Basic Keyboard Shortcuts
- `Shift + Enter` to run cell and move to next cell
- `Control + Enter` to run cell and stay on it
- `Escape` to leave cell
    - `m` to change cell to Markdown (after pressing escape)
    - `y` to change cell to Code (after pressing escape)
    - Arrow keys move between cells (after pressing escape)
- `Enter` to enter cell

In [20]:
# -- Enter pd. and then hit <tab> key --


In [21]:
# -- Enter pd.read and then hit <tab> key --

# Menu of Jupyter Notebook