# Welcome to the Jupyter notebooks introduction course (for researchers)

Jupyter notebooks can be a useful addition to a researcher's toolbox. They allow fast prototyping and make it quick to present or share ideas.


To make the naming clear: **Jupyter Notebook** was the original project. It is still supported and used. However, it will gradually be replaced by **Jupyter Lab**, which serves as a drop-in replacement for Notebook.

There are many ways to work on and present your work in Jupyter:
 - A local instance of Jupyter Notebook. Your hardware is the only limitation, and there are no software restrictions.
 - GitHub knows how to render the content of notebooks when you browse an online repository.
 - [Google Colab](https://colab.research.google.com/) offers a free-of-charge tier of Jupyter Notebook instances running on Google's hardware. Google offers CPU-only, GPU, and TPU instances. The free version limits resources (CPU, memory) and running time (6 hours). Google Drive is your storage.
 - JupyterHub is designed to support classrooms. It has no restrictions, but you have to deploy it yourself.
 - [MyBinder](https://mybinder.readthedocs.io/en/latest/introduction.html) is another option to host and run your notebooks. It requires a GitHub account and a repository to work from. It then creates a container with your notebook on its infrastructure, and the notebook is publicly available.

Each Jupyter notebook is a file with the `.ipynb` extension (IPython NoteBook). We start Jupyter Notebook with the following CLI command:
```bash
jupyter notebook
```

When we start Jupyter Notebook, it launches the default web browser and opens a tab at [http://localhost:8888](http://localhost:8888). By default, your running notebook is **NOT** accessible from an external network.

For Windows users, I personally recommend the [**Anaconda** Python distribution][1]. It comes with all dependencies taken care of, no matter the platform. The Anaconda package is **HUGE**, since it contains its own copy of most libraries. To run Jupyter Notebook on Windows, Anaconda offers a graphical user interface (GUI), where you can start Notebook with a few clicks.

One major advantage of Anaconda is its portability: even on an *ancient* Linux distribution (e.g. Ubuntu 9.10), you can still install Python 3.7, which was recently released, without any privilege-escalation commands (i.e. `sudo`, `su`).

[1]: https://www.anaconda.com/distribution/

## Cells
In notebooks, you will notice two major types of cells: **code** cells and **Markdown** cells. Each cell can only be of **one** type. The type of a cell can easily be changed from the dropdown menu in the toolbar.
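
On disk, the cell type is just a field stored with each cell. The following is a minimal sketch, assuming the version 4 notebook JSON layout; the notebook is written by hand and its cell contents are made up for illustration.

```python
import json
from collections import Counter

# A hand-written, minimal notebook in the version 4 JSON layout.
# The cell contents are illustrative only.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# A title written in Markdown"],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["a = 16"],
        },
    ],
}

# Serialise and parse again, roughly what happens on save and open.
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)

# Every cell carries exactly one type.
cell_types = Counter(cell["cell_type"] for cell in loaded["cells"])
print(dict(cell_types))
```

Because a notebook is plain JSON, it can be inspected or post-processed with ordinary tools, which is also why version control diffs of notebooks tend to be noisy.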

### Markdown cells
Markdown is a lightweight markup language. This means that, with a few notation conventions, we can mark text as **bold** or *italic*. Something can be [a link](#). We can even write a table, like this:

| Column A      | Column B      | Price |
| ------------- |:-------------:| -----:|
| col 3 is      | right-aligned | 30€   |
| col 2 is      | centered      | 12€   |
| zebra stripes | are neat      |  1€   |
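
Tables like the one above can also be generated from data instead of being typed by hand. Here is a small sketch in plain Python; the helper name `markdown_table` and its parameters are hypothetical, not part of Jupyter.

```python
def markdown_table(headers, rows, align=None):
    """Build a Markdown table string.

    `align` holds one of "left", "center" or "right" per column.
    """
    align = align or ["left"] * len(headers)
    # The position of the colon in the separator row sets the alignment.
    separators = {"left": "---", "center": ":---:", "right": "---:"}

    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join(separators[a] for a in align) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


table = markdown_table(
    headers=["Column A", "Column B", "Price"],
    rows=[["col 3 is", "right-aligned", "30€"], ["col 2 is", "centered", "12€"]],
    align=["left", "center", "right"],
)
print(table)
```

Inside a notebook, wrapping the string in `IPython.display.Markdown` and passing it to `display` should render it as a formatted table rather than raw text.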

### Code cells
The second type of cell is the **code cell**. The same rules apply to code cells as to ordinary Python code, with a few extensions.

In [1]:
a = 16 # assign integer to variable 'a', for example

In [2]:
print(a) # this will print the value of variable 'a' as plain text

16


In [3]:
display(a) # this will also print, but certain objects will be presented in a richer, more colorful way

16

In [4]:
a # if the last line of a cell is simply a variable name, the cell shows its value

16

In [5]:
a = [51, 32, 14, 3.14]

In [6]:
print(a)

[51, 32, 14, 3.14]
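
The difference between `print`, `display` and a bare variable name comes down to which method of the object is consulted. The sketch below uses a toy class, `Temperature`, which is hypothetical and exists only for illustration. It runs in plain Python, so the methods are called directly instead of relying on a notebook front end.

```python
class Temperature:
    """Toy class (hypothetical) with three different representations."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __str__(self):
        # Used by print()
        return f"{self.celsius} °C"

    def __repr__(self):
        # Plain-text fallback for display() and for a bare last expression
        return f"Temperature(celsius={self.celsius})"

    def _repr_html_(self):
        # Rich representation that a notebook prefers when it can render HTML
        return f"<b>{self.celsius}</b> &deg;C"


t = Temperature(21.5)
print(t)                # plain text via __str__
print(repr(t))          # the plain-text fallback
print(t._repr_html_())  # the markup a notebook would render
```

For simple values such as the integer `16`, all three paths give the same text, which is why the cells above look identical.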


Jupyter notebooks use the IPython interpreter underneath. IPython has an extension called *magic commands*. Here are a few examples:

In [7]:
%timeit sorted(a) # measures performance (execution time) of a command

229 ns ± 6.25 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
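
Outside IPython, a similar measurement can be made with the standard library `timeit` module. This is a rough sketch, not a reimplementation of `%timeit`: the loop count is fixed by hand here, and the exact statistic IPython reports may differ.

```python
import statistics
import timeit

a = [51, 32, 14, 3.14]

# Assumed settings: 7 runs, mirroring the "7 runs" in the output above,
# with a small fixed number of loops per run to keep this quick.
runs, loops = 7, 10_000
totals = timeit.repeat("sorted(a)", globals={"a": a}, repeat=runs, number=loops)

# timeit.repeat returns the total time of each run, so divide by the loop count.
per_loop = [total / loops for total in totals]
mean = statistics.mean(per_loop)
std = statistics.stdev(per_loop)
print(f"{mean * 1e9:.0f} ns ± {std * 1e9:.2f} ns per loop "
      f"(mean ± std. dev. of {runs} runs, {loops} loops each)")
```

The numbers depend on the machine and its current load, so they will not match the output shown above.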


In [8]:
%pip install imbalanced-learn # we can install packages from inside Jupyter; the same goes for %conda

Note: you may need to restart the kernel to use updated packages.
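
After an install, it can be useful to check that the running kernel actually sees the package. A minimal sketch using only the standard library follows; the helper name `is_installed` is hypothetical.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the running interpreter can find `module_name`."""
    return importlib.util.find_spec(module_name) is not None


# The interpreter behind the current kernel; %pip installs for this one.
print(sys.executable)

# imbalanced-learn is imported under the name `imblearn`.
print(is_installed("imblearn"))
```

Note that the install name (`imbalanced-learn`) and the import name (`imblearn`) differ, which is common for Python packages.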


In [9]:
%lsmagic # Will list all available magic commands

Available line magics:
%alias  %alias_magic  %autoawait  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %conda  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %pip  %popd  %pprint  %precision  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%js  %%latex  %%markdown  %%perl  %%prun  %%pypy  %%

Another type of *magic* command starts with an exclamation mark "!" and runs the rest of the line as a shell command. Here are a few examples:

In [10]:
!ls # run the ls command directly in the cell

lab1-basics.ipynb	     lab4-plots-for-publications.ipynb	README.md
lab2-examine-datasets.ipynb  lab5-extra.ipynb
lab3-simulation-data.ipynb   lab6-cpp-engine.ipynb


In [11]:
!nproc # run nproc program

8
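
The `!` syntax only works inside IPython. In a plain Python script, the standard library offers comparable tools. The sketch below assumes nothing beyond a working Python installation, so it uses the Python interpreter itself as the external program.

```python
import os
import subprocess
import sys

# Logical CPU count as seen by Python; nproc may report fewer if the
# process is restricted to a subset of CPUs.
print(os.cpu_count())

# Run an external program and capture its output as text.
# The Python interpreter is used as the program here so that the example
# does not depend on which shell tools are installed.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

Passing the command as a list avoids going through a shell, so arguments with spaces or special characters need no quoting.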


In [12]:
# Show the first lines of a Linux system file
!head -n20 /proc/cpuinfo

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 94
model name	: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
stepping	: 3
microcode	: 0xd6
cpu MHz		: 1846.205
cache size	: 6144 KB
physical id	: 0
siblings	: 8
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1
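
Output like this is easy to turn into a Python dictionary for later use. The following is a sketch under the assumption that every line has the `key : value` shape seen above; the function name `parse_cpuinfo` is hypothetical, and the sample is an excerpt of the output rather than a live read of `/proc/cpuinfo`.

```python
def parse_cpuinfo(text):
    """Parse `key : value` lines into a dict (first processor block only)."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line separates one processor from the next
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    return info


# Excerpt copied from the output above.
sample = (
    "processor\t: 0\n"
    "vendor_id\t: GenuineIntel\n"
    "model name\t: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz\n"
    "cpu cores\t: 4\n"
)
cpu = parse_cpuinfo(sample)
print(cpu["model name"], "with", cpu["cpu cores"], "cores")
```

On a Linux machine, the same function could be fed the real file contents, for example via `open("/proc/cpuinfo").read()`.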

In [13]:
# List the working directory in verbose form
!ls -lah

total 592K
drwxr-xr-x  4 gcerar gcerar 4,0K feb 26 14:15 .
drwxr-xr-x 27 gcerar gcerar 4,0K feb 26 13:46 ..
drwxr-xr-x  8 gcerar gcerar 4,0K feb 26 14:13 .git
-rw-r--r--  1 gcerar gcerar 1,8K feb 26 14:12 .gitignore
drwxr-xr-x  2 gcerar gcerar 4,0K feb 26 14:15 .ipynb_checkpoints
-rw-rw-r--  1 gcerar gcerar  18K feb 26 14:15 lab1-basics.ipynb
-rw-rw-r--  1 gcerar gcerar 359K feb 26 13:47 lab2-examine-datasets.ipynb
-rw-rw-r--  1 gcerar gcerar  91K feb 26 13:47 lab3-simulation-data.ipynb
-rw-rw-r--  1 gcerar gcerar  82K feb 26 13:47 lab4-plots-for-publications.ipynb
-rw-rw-r--  1 gcerar gcerar 4,9K feb 26 13:47 lab5-extra.ipynb
-rw-r--r--  1 gcerar gcerar 1,1K feb 26 13:48 lab6-cpp-engine.ipynb
-rw-r--r--  1 gcerar gcerar  107 feb 26 13:53 README.md
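
A portable alternative that does not rely on `ls` is `pathlib`. This is a rough sketch that prints only names and sizes; the helpers `human_size` and `list_directory` are hypothetical, and the size formatting only approximates what `ls -h` prints.

```python
from pathlib import Path


def human_size(num_bytes):
    """Format a byte count with binary prefixes, roughly like `ls -h`."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G"):
        if size < 1024:
            return f"{size:.0f}{unit}" if unit == "B" else f"{size:.1f}{unit}"
        size /= 1024
    return f"{size:.1f}T"


def list_directory(path="."):
    """Return (name, size) pairs for every entry, hidden ones included."""
    entries = sorted(Path(path).iterdir(), key=lambda p: p.name)
    # lstat() does not follow symlinks, so broken links do not raise.
    return [(p.name, human_size(p.lstat().st_size)) for p in entries]


for name, size in list_directory("."):
    print(f"{size:>8}  {name}")
```

Unlike plain `ls`, `Path.iterdir()` includes hidden entries such as `.git`, but it never yields `.` or `..`.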


In [14]:
# Fetch a file for later use (e.g. a dataset)
!wget https://gist.githubusercontent.com/netj/8836201/raw/6f9306ad21398ea43cba4f7d537619d0e07d5ae3/iris.csv

--2020-02-26 14:16:08--  https://gist.githubusercontent.com/netj/8836201/raw/6f9306ad21398ea43cba4f7d537619d0e07d5ae3/iris.csv
Resolving gist.githubusercontent.com (gist.githubusercontent.com)... 151.101.240.133
Connecting to gist.githubusercontent.com (gist.githubusercontent.com)|151.101.240.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3975 (3,9K) [text/plain]
Saving to: ‘iris.csv’


2020-02-26 14:16:08 (37,8 MB/s) - ‘iris.csv’ saved [3975/3975]
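
Where `wget` is not available (for example on a default Windows setup), the download can be done from Python itself. Below is a minimal sketch with the standard library; the function name `fetch` is hypothetical, and the block only defines the function without making a network request.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch(url, destination):
    """Download `url` to `destination` and return the number of bytes written."""
    destination = Path(destination)
    # Stream the response to disk instead of holding it all in memory.
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
    return destination.stat().st_size
```

Calling `fetch(url, "iris.csv")` with the URL from the cell above should save the same file, assuming the machine has network access.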



In [15]:
# Print the first few lines of the file with Linux tools
!head iris.csv

"sepal.length","sepal.width","petal.length","petal.width","variety"
5.1,3.5,1.4,.2,"Setosa"
4.9,3,1.4,.2,"Setosa"
4.7,3.2,1.3,.2,"Setosa"
4.6,3.1,1.5,.2,"Setosa"
5,3.6,1.4,.2,"Setosa"
5.4,3.9,1.7,.4,"Setosa"
4.6,3.4,1.4,.3,"Setosa"
5,3.4,1.5,.2,"Setosa"
4.4,2.9,1.4,.2,"Setosa"
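
The same peek at the data can be done in Python. The sketch below uses the standard library `csv` module on an inline excerpt of the file, so it runs without `iris.csv` being present; with the real file, an open file object would take the place of `io.StringIO`.

```python
import csv
import io

# First rows of iris.csv, copied from the output above.
sample = '''"sepal.length","sepal.width","petal.length","petal.width","variety"
5.1,3.5,1.4,.2,"Setosa"
4.9,3,1.4,.2,"Setosa"
'''

# DictReader uses the header row as keys; io.StringIO stands in for an open file.
rows = list(csv.DictReader(io.StringIO(sample)))

for row in rows:
    # All values arrive as strings, so numeric columns need converting.
    print(row["variety"], float(row["petal.width"]))
```

Values such as `.2` have no leading zero in this file, but `float` accepts them as written.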
