# IPython: beyond plain Python

Updated by [@espg](https://github.com/espg) from the 2019 ICESat2 Hackweek [intro-jupyter-git](https://github.com/ICESAT-2HackWeek/intro-jupyter-git) session, written by [@fperez](https://github.com/fperez).

When executing code in IPython, all valid Python syntax works as-is, but IPython provides a number of features designed to make the interactive experience more fluid and efficient.

## First things first: running code, getting help

In the notebook, to run a cell of code, hit `Shift-Enter`. This executes the cell and puts the cursor in the next cell below, or makes a new one if you are at the end. Alternatively, you can use:
    
- `Alt-Enter` to force the creation of a new cell unconditionally (useful when inserting new content in the middle of an existing notebook).
- `Control-Enter` to execute the cell and keep the cursor in the same cell (useful for quick experimentation with snippets that you don't need to keep permanently).

In [None]:
print("Hi")

Getting help:

In [None]:
?

Typing `object_name?` will print all sorts of details about any object, including docstrings, function definition lines (for call arguments) and constructor details for classes.

In [None]:
import numpy as np
np.linspace?

In [None]:
np.isclose??

In [None]:
*int*?
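
Outside IPython, similar information can be pulled out with the standard library. The sketch below is only an approximation of what `?` and the wildcard search report, not how IPython implements them; the `scale` function is a made-up example.

```python
import builtins
import fnmatch
import inspect


def scale(values, factor=2.0):
    """Multiply every element of values by factor."""
    return [v * factor for v in values]


# Roughly what `scale?` reports: the call signature and the docstring.
signature = inspect.signature(scale)
docstring = inspect.getdoc(scale)
print(f"Signature: scale{signature}")
print(f"Docstring: {docstring}")

# Roughly what `*int*?` does: a wildcard search over the names in a namespace
# (here only the builtins, as an assumption for the example).
matches = [name for name in dir(builtins) if fnmatch.fnmatchcase(name, "*int*")]
print(matches)
```

The `?` syntax is much more convenient interactively, but `inspect` is handy when you need the same details inside a script.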

An IPython quick reference card:

In [None]:
%quickref

## Tab completion

Tab completion, especially for attributes, is a convenient way to explore the structure of any object you’re dealing with. Simply type `object_name.<TAB>` to view the object’s attributes. Besides Python objects and keywords, tab completion also works on file and directory names.

In [None]:
np.
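
As a rough mental model, attribute completion is a prefix filter over the names an object exposes. The helper below (`complete` is a hypothetical name) is a minimal sketch using `dir()`; IPython's real completer is considerably more sophisticated.

```python
import math


def complete(obj, prefix):
    """Return the attribute names of obj that start with prefix."""
    return sorted(name for name in dir(obj) if name.startswith(prefix))


# Comparable to typing `math.is<TAB>` in IPython.
candidates = complete(math, "is")
print(candidates)
```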

Tab completion also works for accessing documentation. A single tab after the parens will display the function or class signature; a second tab will open a scrollable window; finally, a third tab will detach the docstring and append it as a frame within the notebook:

In [None]:
np.zeros()

## The interactive workflow: input, output, history

In [None]:
2+10

In [None]:
_+10

You can suppress the storage and rendering of output by appending `;` to the last line of a cell (this comes in handy when plotting with matplotlib, for example):

In [None]:
10+20;

In [None]:
_

The output is stored in `_N` and `Out[N]` variables:

In [None]:
_12 == Out[12]

Previous inputs are available, too:

In [None]:
In[11]

In [None]:
_i

In [None]:
%history -n 1-5
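
To see how `In`, `Out`, `_` and the trailing `;` fit together, here is a toy model of the bookkeeping. The `OutputCache` class is hypothetical and is not IPython's implementation; it only mimics the behavior shown in the cells above.

```python
class OutputCache:
    """Toy model of IPython's In/Out bookkeeping (not the real implementation)."""

    def __init__(self):
        self.inputs = [""]   # index 0 is unused so prompt numbers start at 1
        self.outputs = {}    # prompt number -> result, like Out[N]
        self.last = None     # plays the role of `_`

    def run(self, source):
        """Evaluate one expression and record it under the next prompt number."""
        self.inputs.append(source)
        number = len(self.inputs) - 1
        stripped = source.rstrip()
        suppress = stripped.endswith(";")
        value = eval(stripped.rstrip(";"), {"_": self.last})
        # A trailing `;` means the result is neither stored nor shown.
        if value is not None and not suppress:
            self.outputs[number] = value
            self.last = value
        return number


cache = OutputCache()
for line in ("2+10", "_+10", "10+20;"):
    cache.run(line)

print(cache.outputs)    # results that were stored
print(cache.last)       # value `_` would refer to
print(cache.inputs[3])  # inputs are kept even when output is suppressed
```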

**Exercise**

Use `%history?` to have a look at `%history`'s magic documentation, and write the last 10 lines of history to a file named `log.py`.

Note that the history is tracked by the IPython kernel separately from the Jupyter notebook. If your web browser crashes before you have saved your notebook, you can start a new IPython session from the terminal and still access the history of the previous session, regardless of whether it is still running.

Hint: You can use the `-g` flag to get the global history, instead of just the current session.

## Accessing the underlying operating system

You can invoke the command line from within the notebook by prepending the "bang" ( `!` ) operator:  


In [None]:
!pwd

Some of the most common shell commands will work without prepending:

In [None]:
ls

In [None]:
pwd

...but others require the bang operator:

In [None]:
du

In [None]:
!du

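It's best practice to use the operator in all cases, since it makes it more explicit when you are calling the shell. Shell output can also be captured in a Python variable, and Python variables can be passed back to the shell: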

In [None]:
files = !ls 
print("files in this directory:")
print(files)

In [None]:
files

In [None]:
!echo $files

In [None]:
!echo {files[0].upper()}

Note that all this is available even in multiline blocks:

In [None]:
import os
for i,f in enumerate(files):
    if f.endswith('ipynb'):
        !echo {"%02d" % i} - "{os.path.splitext(f)[0]}"
    else:
        print('--')
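
In a plain Python script, where the `!` syntax is not available, capturing command output takes a few more lines. This is a minimal sketch with `subprocess`; to stay portable it launches the Python interpreter as a stand-in for `ls`, and the file names it prints are invented for the example.

```python
import subprocess
import sys

# Stand-in command that prints one "file name" per line, like `ls` would.
command = [sys.executable, "-c", "print('a.ipynb'); print('b.txt')"]

# Run it and capture standard output as text.
completed = subprocess.run(command, capture_output=True, text=True, check=True)

# Split into a list of lines, similar to what `files = !ls` gives you.
files = completed.stdout.splitlines()
notebooks = [name for name in files if name.endswith("ipynb")]
print(files)
print(notebooks)
```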

## Beyond Python: magic functions

The IPython 'magic' functions are a set of commands, invoked by prepending one or two `%` signs to their name, that live in a namespace separate from your normal Python variables and provide a more command-like interface.  They take flags with `--` and arguments without quotes, parentheses or commas. The motivation behind this system is two-fold:
    
- To provide a namespace for controlling IPython itself and exposing other system-oriented functionality that is separate from your Python variables and functions.  This lets you have a `cd` command accessible as a magic regardless of whether you have a Python `cd` variable.

- To expose a calling mode that requires minimal verbosity and typing while working interactively.  Thus the inspiration taken from the classic Unix shell style for commands.

In [None]:
%magic
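
The separate-namespace idea can be illustrated with a toy dispatcher. Everything here (`magics`, `register_magic`, `run_line`) is hypothetical and far simpler than IPython's machinery; it only shows how a `cd` magic and a `cd` variable can coexist, and how arguments are passed without quotes or parentheses.

```python
import shlex

# Magics live in their own registry, separate from the user's variables.
magics = {}
user_namespace = {"cd": "a plain Python variable named cd"}


def register_magic(func):
    """Store func in the magic registry under its own name."""
    magics[func.__name__] = func
    return func


@register_magic
def cd(args):
    # A real magic would change directory; this one only reports its argument.
    return f"would change directory to {args[0]}"


def run_line(line):
    """Dispatch '%name arg1 arg2' to a magic, otherwise look up a variable."""
    if line.startswith("%"):
        name, *args = shlex.split(line[1:])
        return magics[name](args)
    return user_namespace[line]


print(run_line("%cd /tmp"))  # handled by the magic
print(run_line("cd"))        # handled by the ordinary variable
```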

Line vs cell magics:

Magics can be applied at the single-line level or to entire cells. Line magics are identified with a single `%` prefix, while cell magics use `%%` and can only be used as the first line of the cell (since they apply to the entire cell). Some magics, like the convenient `%timeit` that ships built-in with IPython, can be called in either mode, while others may be line- or cell-only (you can see all magics with `%lsmagic`).

Let's see this with some `%timeit` examples:

In [None]:
%timeit list(range(1000))

In [None]:
%%timeit
# comment here

list(range(10))
list(range(100))

Line magics can be used even inside code blocks:

In [None]:
for i in range(1, 5):
    size = i*100
    print('size:', size, end=' ')
    %timeit list(range(size))
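
For comparison, the same kind of measurement can be made in a script with the standard `timeit` module. This is a sketch under assumptions: unlike `%timeit`, which picks the number of loops automatically, the `number` and `repeat` values below are chosen by hand.

```python
import timeit

best_times = {}
for size in (100, 200, 300, 400):
    timer = timeit.Timer(lambda: list(range(size)))
    # Each measurement runs the statement 1000 times; keep the best of 3.
    best = min(timer.repeat(repeat=3, number=1000))
    best_times[size] = best / 1000  # seconds per single call
    print(f"size: {size}  {best_times[size] * 1e6:.2f} us per loop")
```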

Magics can do anything they want with their input, so it doesn't have to be valid Python (note that the below may not work on a Windows machine, depending on how you are running Jupyter on it):

In [None]:
%%bash
echo "My shell is:" $SHELL
echo "My disk usage is:"
df -h

Another interesting cell magic: create any file you want locally from the notebook:

In [None]:
%%writefile test.txt
This is a test file!
It can contain anything I want...

And more...



In [None]:
!cat test.txt
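
Without the magic, writing and reading back the same file looks like this. It is a minimal sketch using `pathlib`, and it writes into a temporary directory (an assumption made here so the example does not touch your working directory).

```python
import tempfile
from pathlib import Path

content = (
    "This is a test file!\n"
    "It can contain anything I want...\n"
    "\n"
    "And more...\n"
)

# Write the text to test.txt inside a fresh temporary directory.
target = Path(tempfile.mkdtemp()) / "test.txt"
target.write_text(content, encoding="utf-8")

# Read it back, which is what `!cat test.txt` displays.
read_back = target.read_text(encoding="utf-8")
print(read_back)
```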

Let's see what other magics are currently defined in the system:

In [None]:
%lsmagic

## Plotting in the notebook

In [None]:
%matplotlib inline

In [None]:
import numpy as np
import matplotlib.pyplot as plt

In [None]:
from osgeo import gdal

In [None]:
data = gdal.Open('./tile.tif')

In [None]:
x = np.linspace(0, 2*np.pi, 300)
y = np.sin(x**2)
plt.plot(x, y)
plt.title("A little chirp")

Another, lazier option is to use the `pylab` magic, which overwrites the working namespace with imports from numpy and matplotlib:

In [None]:
imshow(data.ReadAsArray()), zeros(10)

In [None]:
%pylab inline

In [None]:
imshow(data.ReadAsArray()), zeros(10)

## Profiling Code

The following example is taken from the scikit-learn (sklearn) library's OPTICS clustering demo:

In [None]:
from sklearn.cluster import OPTICS, cluster_optics_dbscan

# Generate sample data

np.random.seed(0)
n_points_per_cluster = 500

C1 = [-5, -2] + .8 * np.random.randn(n_points_per_cluster, 2)
C2 = [4, -1] + .1 * np.random.randn(n_points_per_cluster, 2)
C3 = [1, -2] + .2 * np.random.randn(n_points_per_cluster, 2)
C4 = [-2, 3] + .3 * np.random.randn(n_points_per_cluster, 2)
C5 = [3, -2] + 1.6 * np.random.randn(n_points_per_cluster, 2)
C6 = [5, 6] + 2 * np.random.randn(n_points_per_cluster, 2)
X = np.vstack((C1, C2, C3, C4, C5, C6))

clust = OPTICS(min_samples=50, xi=.05, min_cluster_size=.05)

In [None]:
%prun clust.fit(X)
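
`%prun` runs a statement through Python's profiler. In a script you can get a comparable report with `cProfile` and `pstats` directly; the sketch below profiles a made-up `slow_sum` function rather than the OPTICS fit, so that it runs without scikit-learn.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately loop-heavy function to give the profiler something to see."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Profile a single call and keep its return value.
profiler = cProfile.Profile()
result = profiler.runcall(slow_sum, 100_000)

# Format the five most expensive entries, sorted by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```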

## Simple Code Optimization

Python 'for' loops are not particularly fast due to interpreter overhead and object inspection. Usually, the best way to write fast code is to use numpy vectorized operations, which call underlying compiled libraries to optimize compute-intensive loops. However, some functions are not easy to express in pure numpy vector functions; here's an example of a function that convolves a DEM to an ICESat waveform (the original ICESat sensor):

In [None]:
def waveformC(raster):
    """Takes a fixed 100 by 100 meter raster at
    0.5 meter resolution, and converts it to an
    unconvolved waveform"""
    # Define constants
    cumpowr=np.zeros((544,1))   # Signal length
    # Define spatial weighting 
    x, y = np.r_[-50:50:0.5], np.r_[-50:50:0.5]
    X, Y = np.meshgrid(x,y)
    w=np.exp(-(X**2 + Y**2)/2/17.5**2)
    # Conversion to nanoseconds... 'D' is nanoseconds
    Z = raster[:,:] - np.max(raster[:,:])
    D = (abs(Z) / 0.15) + 10    # Start all recordings after 10 ns
    for i in range(200):
        for j in range(200):
            ibin=int(D[i,j]) #finding ibin (np.int was removed in NumPy 1.24)
            #add power within pixel (i,j)-which is w(i,j) to the returned
            #power in the bin number ibin
            if ibin < 544:
                cumpowr[ibin]=cumpowr[ibin]+w[i,j]
    return cumpowr

It has two nested for loops, and takes about a tenth of a second per simulation:

In [None]:
%%timeit 
waveformC(data.ReadAsArray())

We can use Cython or Numba to speed up the execution of the function. We'll use Numba, since it requires just one line to use (see a more detailed discussion [here](https://towardsdatascience.com/speed-up-your-algorithms-part-2-numba-293e554c5cc1)):

In [None]:
from numba import jit

In [None]:
wave_numba = jit(waveformC)

In [None]:
%%timeit 
wave_numba(data.ReadAsArray());

In [None]:
plot(wave_numba(data.ReadAsArray()))

Using the first function on a quarter million tiles would take over 6 hours; using the second takes closer to 20 minutes. Note that the speedup really only applies to code that has for loops to optimize. If you take a function that's already vectorized in numpy, you won't see as much of a difference:

In [None]:
np.random.seed(444)
testdata = np.random.choice([False, True], size=100000)

In [None]:
def count_transitions(x):
    return(np.count_nonzero(x[:-1] < x[1:]))    

In [None]:
numba_count = jit(count_transitions)

In [None]:
%timeit numba_count(testdata)

In [None]:
%timeit count_transitions(testdata)

Also note that you should always test that you get the same result from the accelerated version as from the pure Python one!

In [None]:
count_transitions(testdata) == numba_count(testdata)
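
A single comparison is a good start, but checking several random inputs (including empty ones) gives more confidence. The sketch below uses two hypothetical standard-library implementations in place of the numpy and numba versions, so it runs anywhere.

```python
import random


def count_transitions_loop(flags):
    """Reference version: explicit loop counting False -> True steps."""
    count = 0
    for i in range(len(flags) - 1):
        if flags[i] < flags[i + 1]:
            count += 1
    return count


def count_transitions_fast(flags):
    """Candidate faster version using zip and sum."""
    return sum(a < b for a, b in zip(flags, flags[1:]))


# Compare both versions on inputs of varying length, including length 0.
rng = random.Random(444)
for trial in range(20):
    flags = [rng.choice([False, True]) for _ in range(rng.randint(0, 200))]
    assert count_transitions_loop(flags) == count_transitions_fast(flags)

print("all trials agree")
```

For functions that return arrays, such as the waveform above, `==` compares element by element, so a check like `np.array_equal` or `np.allclose` is the better fit.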

## Running normal Python code: execution and errors

Not only can you input normal Python code, you can even paste straight from a Python or IPython shell session:

In [None]:
>>> # Fibonacci series:
... # the sum of two elements defines the next
... a, b = 0, 1
>>> while b < 10:
...     print(b)
...     a, b = b, a+b

In [None]:
In [1]: for i in range(10):
   ...:     print(i, end=' ')
   ...:     

And when your code produces errors, you can control how they are displayed with the `%xmode` magic:

In [None]:
%%writefile mod.py

def f(x):
    return 1.0/(x-1)

def g(y):
    return f(y+1)

Now let's call the function `g` with an argument that would produce an error:

In [None]:
import mod
mod.g(0)
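
In the notebook, `%xmode plain` or `%xmode verbose` changes how much detail the traceback shows. In a plain script the standard `traceback` module gives you the same raw material; this sketch redefines `f` and `g` inline, instead of importing `mod`, so that it is self-contained.

```python
import traceback


def f(x):
    return 1.0 / (x - 1)


def g(y):
    return f(y + 1)


try:
    g(0)  # f receives 1, so the denominator is zero
except ZeroDivisionError:
    # Capture the formatted traceback as a string instead of crashing.
    report = traceback.format_exc()
    print(report)
```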

## Basic debugging

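When running code interactively, it can be tricky to figure out how to debug. The `%debug` magic opens an interactive post-mortem debugger at the point of the most recent exception: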

In [None]:
%debug
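
Post-mortem debugging works by inspecting the stack frames saved in the exception's traceback. The sketch below walks to the innermost frame by hand and reads its local variables, which is roughly the first thing you would do at the debugger prompt; `f` and `g` are redefined inline as an assumption so the example stands alone.

```python
import sys


def f(x):
    return 1.0 / (x - 1)


def g(y):
    return f(y + 1)


try:
    g(0)
except ZeroDivisionError:
    tb = sys.exc_info()[2]
    # Follow the chain of frames down to where the error was raised.
    while tb.tb_next is not None:
        tb = tb.tb_next
    failing_function = tb.tb_frame.f_code.co_name
    failing_locals = dict(tb.tb_frame.f_locals)

print(failing_function)
print(failing_locals)
```

Calling `pdb.post_mortem(tb)` on the traceback would open the interactive debugger in a script, much like `%debug` does in the notebook.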

In [None]:
enjoy = input('Are you enjoying this tutorial? ')
print('enjoy is:', enjoy)

## Running code in other languages with special `%%` magics

In [None]:
%%perl
@months = ("July", "August", "September");
print $months[0];
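
Conceptually, these script magics hand the body of the cell to another interpreter. A minimal sketch of that idea with `subprocess` follows; `run_script` is a hypothetical helper, and the Python interpreter is used as the stand-in program so the example does not depend on Perl being installed.

```python
import subprocess
import sys


def run_script(interpreter_argv, cell_body):
    """Feed cell_body to the interpreter on standard input and return its output."""
    completed = subprocess.run(
        interpreter_argv,
        input=cell_body,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


# `python -` reads the program from standard input.
cell = "months = ['July', 'August', 'September']\nprint(months[0])\n"
output = run_script([sys.executable, "-"], cell)
print(output)
```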