# Jupyter Notebook

## Shortcuts
`ESC` Switch from edit mode (green cell border) to command mode (blue cell border); press `Enter` to go back to edit mode. The shortcuts below work in command mode.

`a` add a cell Above

`b` add a cell Below

`x` Cut a cell (delete)

`v` Paste a cell

## Help and Documentation

### **TAB**

Press `TAB` after typing part of a name (for example after `df.`) to list the possible completions

In [1]:
import pandas as pd

In [2]:
df = pd.DataFrame({"a": [1,2,3], "b": [4,5,6]})

In [3]:
df.melt

<bound method DataFrame.melt of    a  b
0  1  4
1  2  5
2  3  6>

`TAB` completion also works inside import statements

In [4]:
from matplotlib import pyplot 

### Wildcard matching

In [5]:
str.*fi*?
# gives the list of matching methods

str.find
str.isidentifier
str.rfind
str.zfill
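Outside IPython, a similar search can be approximated with the standard library. The sketch below filters `dir(str)` with `fnmatch`; the helper name `match_attributes` is made up for this example, and the exact result depends on the Python version, since newer versions add more `str` methods.

```python
from fnmatch import fnmatchcase

def match_attributes(obj, pattern):
    """Return the public attribute names of obj that match a shell-style pattern."""
    return [name for name in dir(obj)
            if not name.startswith("_") and fnmatchcase(name, pattern)]

# every public str attribute whose name contains "fi"
matches = match_attributes(str, "*fi*")
print(matches)
```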

### **?** or **help(command)**

Append `?` to a name, or call `help()` on it, to display its documentation

In [6]:
help(len)
# inline help

Help on built-in function len in module builtins:

len(obj, /)
    Return the number of items in a container.
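The text shown by `help()` and `?` comes from the object's docstring, which can also be read programmatically. A minimal sketch using the standard `inspect` module (this illustrates the idea and is not a description of how IPython builds its help output):

```python
import inspect

# inspect.getdoc returns the cleaned-up docstring, or None if there is none
doc = inspect.getdoc(len)
print(doc)

# inspect.signature can raise ValueError for some built-ins, so guard the call
try:
    print(inspect.signature(len))
except ValueError:
    print("signature not available")
```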



In [7]:
len?
# in separate window

Signature: len(obj, /)
Docstring: Return the number of items in a container.
Type:      builtin_function_or_method


In [8]:
df?

Type:        DataFrame
String form:
   a  b
0  1  4
1  2  5
2  3  6
Length:      3
File:        /usr/lib/python3/dist-packages/pandas/core/frame.py
Docstring:
Two-dimensional size-mutable, potentially heterogeneous tabular data
structure with labeled axes (rows and columns). Arithmetic operations
align on both row and column labels. Can be thought of as a dict-like
container for Series objects. The primary pandas data structure.

Parameters
----------
data : ndarray (structured or homogeneous), Iterable, dict, or DataFrame
    Dict can contain Series, arrays, constants, or list-like objects

    .. versionchanged :: 0.23.0
       If data is a dict, column order follows insertion-order for
       Python 3.6 and later.

    .. versionchanged :: 0.25.0
       If data is a list of dicts, column order follows insertion-order
       for Python 3.6 and later.

index : Index or array-like
    Index to use for resulting frame. Will defaul

### Source code **??**

Append `??` to a name to display its source code when it is available

In [9]:
df.mean??

Signature: df.mean(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
Docstring:
Return the mean of the values for the requested axis.

Parameters
----------
axis : {index (0), columns (1)}
    Axis for the function to be applied on.
skipna : bool, default True
    Exclude NA/null values when computing the result.
level : int or level name, default None
    If the axis is a MultiIndex (hierarchical), count along a
    particular level, collapsing into a Series.
numeric_only : bool, default None
    Include only float, int, boolean columns. If None, will attempt to use
    everything, then use only numeric data. Not implemented for Series.
**kwargs
    Additional keyword arguments to be passed to t
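In plain Python, source code can be retrieved with `inspect.getsource`. The sketch below uses `textwrap.dedent` purely as an example of a function written in Python, and assumes the standard library source files are installed. Objects implemented in C, such as `len`, have no Python source to return.

```python
import inspect
import textwrap

# works for functions written in Python, such as textwrap.dedent
source = inspect.getsource(textwrap.dedent)
print(source.splitlines()[0])

# built-ins implemented in C have no Python source to show
try:
    inspect.getsource(len)
except TypeError as exc:
    print("no source available:", exc)
```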

## Shell Commands

Common shell commands such as `ls`, `cd`, `mkdir` and `pwd` can be run directly in a cell, just as in a terminal

In [10]:
ls

jupyter_cheetsheet.ipynb


In [11]:
cd ./../jupyter/

[Errno 2] No such file or directory: './../jupyter/'
/home/shasthamsa/work/sphinx-walkthrough/notebooks


In [12]:
mkdir -p blah

In [13]:
ls

blah/  jupyter_cheetsheet.ipynb


### Passing shell command output to Python

In [14]:
pwd

'/home/shasthamsa/work/sphinx-walkthrough/notebooks'

In [15]:
current_dir = !pwd

In [16]:
current_dir

['/home/shasthamsa/work/sphinx-walkthrough/notebooks']
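The `!command` syntax only works inside IPython. In a regular Python script, a comparable result can be obtained with `subprocess`. This is a sketch that assumes a Unix-like system where `pwd` is available on the `PATH`:

```python
import subprocess

# run `pwd` and capture its standard output as text
result = subprocess.run(["pwd"], capture_output=True, text=True, check=True)

# split into lines, mirroring the list that `!pwd` returns
current_dir = result.stdout.splitlines()
print(current_dir)
```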

## Magic functions
Magic functions are extra commands provided by IPython, the kernel behind Jupyter Notebook. They are not Python code; they tell Jupyter to process a line or a cell in a specific way.

Line magics start with `%` and cell magics start with `%%`.

In [17]:
%magic


IPython's 'magic' functions

The magic function system provides a series of functions which allow you to
control the behavior of IPython itself, plus a lot of system-type
features. There are two kinds of magics, line-oriented and cell-oriented.

Line magics are prefixed with the % character and work much like OS
command-line calls: they get as an argument the rest of the line, where
arguments are passed without parentheses or quotes.  For example, this will
time the given statement::

        %timeit range(1000)

Cell magics are prefixed with a double %%, and they are functions that get as
an argument not only the rest of the line, but also the lines below it in a
separate argument.  These magics are called with two arguments: the rest of the
call line and the body of the cell, consisting of the lines below the first.
For example::

        %%timeit x = numpy.random.randn((100, 100))
        numpy.linalg.svd(x)

will time the execution of the numpy svd routine, running the assignment 

In [18]:
%lsmagic

Available line magics:
%alias  %alias_magic  %autoawait  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %conda  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %pip  %popd  %pprint  %precision  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%js  %%latex  %%markdown  %%perl  %%prun  %%pypy  %%

## Creating modules

In [19]:
%%file my_script.py
print("this is the 1st line")
print("this is a test")

Writing my_script.py


In [20]:
%run my_script.py

this is the 1st line
this is a test
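For comparison, the same two steps can be done without magics: write the file with `pathlib` and execute it with `runpy`. This is only a rough equivalent under the assumption that a temporary directory is acceptable for the script; `%run` additionally loads the script's variables into the notebook namespace, whereas `runpy.run_path` returns them as a dictionary.

```python
import contextlib
import io
import runpy
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "my_script.py"
    # write the script, as %%file does
    script.write_text('print("this is the 1st line")\nprint("this is a test")\n')

    # run it as a top-level script, capturing what it prints
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        runpy.run_path(str(script), run_name="__main__")

output = buffer.getvalue()
print(output, end="")
```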


## Error modes / Debugging

In [21]:
def errorfunc():
    return 1.0 / 0.0

In [22]:
%xmode Plain

Exception reporting mode: Plain


In [23]:
errorfunc()

ZeroDivisionError: float division by zero

In [None]:
%xmode Verbose

Exception reporting mode: Verbose


In [None]:
errorfunc()

ZeroDivisionError: float division by zero
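`%xmode` only changes how IPython displays an exception. In a script, the standard `traceback` module gives similar control over how much detail is reported. The sketch below is an illustration of that idea, not how IPython implements `%xmode`; the variable names are chosen for this example.

```python
import traceback

def errorfunc():
    return 1.0 / 0.0

try:
    errorfunc()
except ZeroDivisionError as exc:
    # exception type and message only, comparable to a terse report
    short_report = traceback.format_exception_only(type(exc), exc)[-1].strip()
    # full traceback including the call stack
    full_report = traceback.format_exc()

print(short_report)
print(full_report)
```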

In [None]:
%debug

> <ipython-input-4-973b4026275a>(2)errorfunc()
      1 def errorfunc():
----> 2     return 1.0 / 0.0

ipdb> q


### %timeit 

`%timeit` times a single statement by running it many times; `%%timeit` does the same for a whole cell

In [None]:
%timeit L = [n ** 2 for n in range(1000)]

10000 loops, best of 3: 66 µs per loop


In [None]:
%%timeit
L = []
for n in range(1000):
    L.append(n ** 2)

10000 loops, best of 3: 111 µs per loop
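Outside a notebook, the same comparison can be made with the standard `timeit` module, which `%timeit` is built on. The function names and the repeat counts below are arbitrary choices for this sketch; absolute timings depend on the machine.

```python
import timeit

def build_with_comprehension():
    return [n ** 2 for n in range(1000)]

def build_with_loop():
    result = []
    for n in range(1000):
        result.append(n ** 2)
    return result

# 3 runs of 200 calls each; report the best run as time per call
for func in (build_with_comprehension, build_with_loop):
    best = min(timeit.repeat(func, number=200, repeat=3)) / 200
    print(f"{func.__name__}: {best * 1e6:.1f} µs per call")
```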


In [None]:
def square(n):
    return [x ** 2 for x in range(n)]

In [None]:
%timeit square(1000)

10000 loops, best of 3: 69.7 µs per loop


In [None]:
def Fibonacci(n):
    if n == 0: return 0
    elif n == 1: return 1
    else: return Fibonacci(n-1) + Fibonacci(n-2)

In [None]:
%timeit Fibonacci(30)

1 loop, best of 3: 326 ms per loop


### %prun
Use `%prun` to profile a statement and find out where the time is spent

In [None]:
%prun Fibonacci(20)
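The `%prun` report is not captured in the cell output above. A comparable report can be produced as text with `cProfile` and `pstats` from the standard library. This sketch redefines the recursive function under the lowercase name `fibonacci` so that the block is self-contained:

```python
import cProfile
import io
import pstats

def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# profile a single call and keep its return value
profiler = cProfile.Profile()
result = profiler.runcall(fibonacci, 20)

# sort by cumulative time and write the top entries to a string
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The `ncalls` column of the report shows how many times the recursive function was entered, which is what makes the naive implementation slow.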

 

In [None]:
from math import sqrt
def F(n):
    return ((1+sqrt(5))**n-(1-sqrt(5))**n)/(2**n*sqrt(5))

In [None]:
%timeit F(30)

The slowest run took 11.01 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 628 ns per loop
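Because `F` works with floating-point numbers, it is worth checking that it agrees with an exact integer computation before trusting the faster timing. A small sketch, with function names chosen for this example, that compares the closed form against an iterative version for n up to 30:

```python
from math import sqrt

def fib_closed_form(n):
    # same formula as F(n) above
    return ((1 + sqrt(5)) ** n - (1 - sqrt(5)) ** n) / (2 ** n * sqrt(5))

def fib_iterative(n):
    # exact integer arithmetic
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# the closed form returns a float, so round before comparing
agree = all(round(fib_closed_form(n)) == fib_iterative(n) for n in range(31))
print(agree)
print(fib_iterative(30))
```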


In [None]:
%prun F(50)

 

In [None]:
# install the extension first with
# pip install line_profiler
# and then load it:
%load_ext line_profiler

In [None]:
# lprun requires the function to profile to be named explicitly with -f
%lprun -f Fibonacci Fibonacci(30)

## Profiling Memory Use
### %memit and %mprun

In [None]:
# requires another extension memory_profiler
# pip install memory_profiler
%load_ext memory_profiler

In [None]:
%memit Fibonacci(30)

peak memory: 39.36 MiB, increment: 0.06 MiB
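If installing `memory_profiler` is not an option, the standard library's `tracemalloc` gives a rough alternative. Note the assumption behind this sketch: `tracemalloc` tracks memory allocated by Python objects after `start()` is called, not the memory use of the whole process, so its numbers are not directly comparable to the `%memit` output.

```python
import tracemalloc

tracemalloc.start()
squares = [n ** 2 for n in range(100_000)]
# both values are in bytes and cover only allocations made since start()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current: {current / 2**20:.2f} MiB, peak: {peak / 2**20:.2f} MiB")
```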


In [None]:
# mprun only works on functions defined in a separate module file, not in the notebook itself, so we create a module first

In [None]:
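%%file transform.py
import pandas as pd

def transform(df):
    def upper(s):
        # upper-case every string in the Series
        return s.str.upper()

    for col in df:
        # split each value on whitespace and store the result as column "<col>_t"
        df['{}_t'.format(col)] = df[col].apply(lambda x: x.split()).apply(pd.Series)
    # upper-case row by row and return the result
    return df.apply(upper, axis=1)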

Overwriting transform.py


In [None]:
from transform import transform
import pandas as pd
%mprun -f transform transform(pd.DataFrame([['abc','def','ghi'],['klm','nop','qrs']]))

('',)


In [None]:
%matplotlib inline
# tells Jupyter notebook to display Matplotlib plots in the notebook
# (as opposed to opening a native window as would be the case outside of Jupyter)