# Introduction to Jupyter

* Notebook: https://jupyter-notebook.readthedocs.io/en/stable/
* Lab: https://jupyterlab.readthedocs.io/en/latest/user/interface.html


Cells can contain Markdown (like this one) or code.
Even LaTeX formulas are supported:

$$ \int_a^b f(x) dx $$

In [13]:
# This is a code cell

print("Hello world!")

Hello world!


In [3]:
# Shell commands can be evaluated directly from code cells

!ls -l

total 944
-rw-r--r-- 1 jovyan users  32049 May 13 12:39 example.png
drwxr-xr-x 2 jovyan users    986 May 13 10:37 img
-rw-r--r-- 1 jovyan users    875 May 13 14:06 P1.1 Jupyter.ipynb
-rw-r--r-- 1 jovyan users   4463 May 13 13:58 P1.2 - Numpy.ipynb
-rw-r--r-- 1 jovyan users   2364 May 13 13:59 P1.2 - Numpy (solution).ipynb
-rw-r--r-- 1 jovyan users  58409 May 13 12:41 P1.3 - Matplotlib.ipynb
-rw-r--r-- 1 jovyan users 831703 May 13 12:40 P1.3 - Matplotlib (solved).ipynb
-rw-r--r-- 1 jovyan users    555 May 12 12:45 P2 - Histograms.ipynb
-rw-r--r-- 1 jovyan users    555 May 12 12:46 P3 - Curve Fitting.ipynb
-rw-r--r-- 1 jovyan users   3271 May 13 12:43 T1.1 - Visualizing Data.ipynb
-rw-r--r-- 1 jovyan users    555 May 12 12:45 T2 - Histograms.ipynb
-rw-r--r-- 1 jovyan users    555 May 12 12:45 T3 - Regression and Correlation.ipynb
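
The `!` prefix only works inside IPython/Jupyter. As a rough plain-Python counterpart to the `!ls -l` cell above, one option is `subprocess.run` from the standard library. This is only a sketch and assumes a Unix-like system where `ls` is on the PATH; the variable names `result` and `listing` are illustrative.

```python
import subprocess

# Assumption: a Unix-like system with `ls` available on the PATH.
# capture_output=True collects stdout/stderr; text=True decodes them to str.
result = subprocess.run(["ls", "-l"], capture_output=True, text=True)

# The directory listing as one string, similar to what the cell above displays.
listing = result.stdout
print(listing)
```

Unlike the `!` form, this also works in ordinary scripts, and the listing is available as a string for further processing.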


In [8]:
# You can get help on a Python function by evaluating ?<function name>
?print

Docstring:
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file:  a file-like object (stream); defaults to the current sys.stdout.
sep:   string inserted between values, default a space.
end:   string appended after the last value, default a newline.
flush: whether to forcibly flush the stream.
Type:      builtin_function_or_method
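
The `?` syntax is specific to IPython. In plain Python the same docstring can be read with the standard library, for example via `inspect.getdoc` (or interactively with `help(print)`). A minimal sketch; the exact docstring wording depends on the Python version, so no output is shown:

```python
import inspect

# inspect.getdoc returns the cleaned-up docstring of an object (or None).
doc = inspect.getdoc(print)

# Show only the first line of the docstring to keep the output short.
print(doc.splitlines()[0])
```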


In [12]:
# Alternatively, place the cursor right after the "(" and press SHIFT-TAB
print("Hello world")

Hello world


Jupyter supports a number of magic commands.

Docs: https://ipython.org/ipython-doc/3/interactive/magics.html#line-magics

The most important ones for us are:

- %matplotlib for controlling how plots are displayed (we will see that later)
- %time and %timeit for benchmarking
- %prun for profiling

In [51]:
import math
def f():
    return sorted([ math.sin(x**x/math.pi) for x in range(100) ])
    
%time f(); pass # the trailing no-op keeps the return value of f() from being displayed

CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 119 µs


In [52]:
%timeit f()

79.1 µs ± 948 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
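
`%timeit` is only available inside IPython/Jupyter. A comparable measurement in a plain script can be sketched with the standard library's `timeit` module. Here `repeat=7` mirrors the "7 runs" reported above, while `number=200` is an arbitrary choice made for this sketch (`%timeit` picks the loop count automatically). Timings vary by machine, so no expected output is shown.

```python
import math
import timeit

def f():
    return sorted([math.sin(x**x / math.pi) for x in range(100)])

number = 200  # calls per run (arbitrary choice for this sketch)

# timeit.repeat returns one total time in seconds per run.
runs = timeit.repeat(f, repeat=7, number=number)

# Convert the total time of each run into seconds per call.
per_call = [t / number for t in runs]
best = min(per_call)
print(f"best of {len(runs)} runs: {best * 1e6:.1f} µs per call")
```

Reporting the minimum rather than the mean is a common convention with `timeit`, since slower runs are usually caused by other processes and not by the code itself.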


In [53]:
%prun f()

         106 function calls in 0.001 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.001    0.001 <ipython-input-51-07504ce4fdbe>:3(<listcomp>)
      100    0.000    0.000    0.000    0.000 {built-in method math.sin}
        1    0.000    0.000    0.001    0.001 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.sorted}
        1    0.000    0.000    0.001    0.001 <ipython-input-51-07504ce4fdbe>:2(f)
        1    0.000    0.000    0.001    0.001 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
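
The `%prun` magic is a front end to Python's built-in profiler. Outside a notebook, a similar report can be produced with `cProfile` and `pstats` directly. The sketch below is one possible way to do this; the names `profiler`, `buffer` and `report` are illustrative, and the exact rows differ between Python versions.

```python
import cProfile
import io
import math
import pstats

def f():
    return sorted([math.sin(x**x / math.pi) for x in range(100)])

# Profile a single call to f().
profiler = cProfile.Profile()
profiler.enable()
f()
profiler.disable()

# Write the report into a string buffer, sorted by internal time
# (the same ordering as the %prun output above), limited to 5 rows.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("tottime").print_stats(5)

report = buffer.getvalue()
print(report)
```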

 

In [31]:
# Exercises

# 1. List all files in the ../datasets folder using ! commands
# 2. Install the Python package csvkit via pip
# 3. Run csvstat on the web_request_rate\:4w@5M.csv dataset

# 4. Profile print("Hello World") with %time, %timeit and %prun