# Introduction to Python and Jupyter Notebook

Andrea Volkamer - adapted by Gautier Peyrat

This notebook is an adaptation of a notebook from [Samo Turk's GitHub page](https://github.com/samoturk/).

## Python

**Python** is a widely used general-purpose high-level programming language. 

Its design philosophy emphasizes code readability. It is very popular in science.

## Jupyter

The **Jupyter notebook** is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.  

* Evolved from IPython notebook
* In addition to Python, it supports many other programming languages (Julia, R, [Haskell](https://github.com/IHaskell/IHaskell), etc.)
* http://jupyter.org/

Jupyter can be easily installed using **Anaconda/Conda**:

* https://www.continuum.io/downloads
* This notebook uses Python 3.6

## The notebook: cell types - markdown and code

This is a **Markdown** cell.

### Code cells allow you to enter and run code

Run a code cell using `Shift-Enter` or pressing the <button class='btn btn-default btn-xs'><i class="icon-step-forward fa fa-step-forward"></i></button> button in the toolbar above:

In [None]:
print('This is a cell with code')

Shortcuts for running code:

- `Shift-Enter` (`⇧-Enter`) : run cell, select below
- `Ctrl-Enter` : run cell
- `Alt-Enter` : run cell, insert below

### Managing the Kernel

Code is run in a separate process called the kernel. Jupyter provides kernels for different languages, but each notebook is associated with a single kernel.

This notebook is associated with the IPython kernel and therefore runs Python code.

The Kernel can be interrupted or restarted.  Try running the following cell and then hit the <button class='btn btn-default btn-xs'><i class='icon-stop fa fa-stop'></i></button> button in the toolbar above.

In [None]:
import time
time.sleep(10)

### Cell menu

The "Cell" menu has a number of menu items for running code in different ways. These include:

* Run and Select Below
* Run and Insert Below
* Run All
* Run All Above
* Run All Below

### Restarting the kernels

The kernel maintains the state of a notebook's computations, such as the variables and imports defined by the cells you have run. You can reset this state by restarting the kernel. This is done by clicking the <button class='btn btn-default btn-xs'><i class='fa fa-repeat icon-repeat'></i></button> button in the toolbar above.

### Output is asynchronous

All output is displayed asynchronously as it is generated in the Kernel. If you execute the next cell, you will see the output one piece at a time, not all at the end.

In [None]:
import time
for i in range(8):
    print(i)
    time.sleep(0.5)

### Large outputs

To better handle large outputs, the output area can be collapsed. Run the following cell and then single- or double-click on the active area to the left of the output:

In [None]:
for i in range(50):
    print(i)

Beyond a certain point, output will scroll automatically:

In [None]:
for i in range(500):
    print(2**i - 1)

## Keyboard shortcuts

[A popular blog post introducing all shortcuts, although a little bit old (Dec 2017)](https://towardsdatascience.com/jypyter-notebook-shortcuts-bf0101a98330)

In the notebook interface, the full list is also available from the menu bar: Help --> Keyboard Shortcuts

### Command Mode (press Esc to enable)
- `Enter` : enter edit mode
- `Shift + Enter`  (`⇧ + Enter`) : run cell, select below
- `Y` : to code
- `M` : to markdown
- `up`  (`↑`) : select cell above
- `down`  (`↓`) : select cell below
- `A` : insert cell above
- `B` : insert cell below
- `X` : cut selected cell
- `C` : copy selected cell
- `V` : paste cell below
- `D, D` (press the key twice) : delete selected cells
- `Z` : undo last cell deletion

### Edit Mode (press Enter to enable)
- `Esc` : enter command mode
- `Tab`  (`⇥`) : code completion or indent
- `Shift + Tab`  (`⇧ + ⇥`) : unindent
- `Ctrl + A` : select all
- `Ctrl + C` : copy
- `Ctrl + V` : paste
- `Ctrl + Z` : undo
- `Ctrl + Y` : redo
- `Ctrl + S` : save and checkpoint


## Some simple Python lines...

### Variables, lists and dictionaries

In [None]:
# integer
var1 = 1

In [None]:
var1

Mathematical operations

- `+` addition
- `-` subtraction
- `*` multiplication
- `/` division

In [None]:
var1+var1
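
The four operators listed above can be tried on a pair of numbers. This is a minimal sketch; the names `a` and `b` and their values are illustrative and not part of the original notebook. Note that in Python 3, `/` always returns a float, even when both operands are integers.

```python
# Illustrative values (not from the original notebook)
a = 7
b = 2

total = a + b        # addition
difference = a - b   # subtraction
product = a * b      # multiplication
quotient = a / b     # division: always a float in Python 3

print(total, difference, product, quotient)
```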

In [None]:
# lists
my_list = [1, 2, 3, 'x', 'y']
my_list

In [None]:
# note we always start counting from 0:
my_list[0]

In [None]:
# slice: elements at index 1 and 2 (the end index 3 is excluded)
my_list[1:3]
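
Slicing has a few more forms that are worth knowing. The sketch below reuses the same list; the variable names `first_two`, `from_third` and `last_item` are only illustrative.

```python
my_list = [1, 2, 3, 'x', 'y']

first_two = my_list[:2]    # from the start up to, but not including, index 2
from_third = my_list[2:]   # from index 2 to the end
last_item = my_list[-1]    # negative indices count from the end

print(first_two, from_third, last_item)
```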

In [None]:
# dictionaries (key:value)
salaries = {'Mike':2000, 'Ann':3000}

In [None]:
salaries['Mike']

In [None]:
# add a new entry
salaries['Jake'] = 2500

In [None]:
salaries
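
Two other common dictionary operations are looping over the entries and looking up a key that may be missing. A small sketch, assuming the same `salaries` data; the key `'Zoe'` is a hypothetical missing entry chosen for illustration.

```python
salaries = {'Mike': 2000, 'Ann': 3000, 'Jake': 2500}

# loop over key/value pairs
for name, salary in salaries.items():
    print(name, salary)

# .get() returns a default value instead of raising KeyError for a missing key
unknown = salaries.get('Zoe', 0)  # 'Zoe' is a hypothetical missing key
print(unknown)
```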

### Strings

In [None]:
# string
my_string = "This is a string"

In [None]:
print(my_string)

In [None]:
# newline is indicated by special character '\n'
long_string = 'This is a string \n Second line of the string'

In [None]:
print(long_string)

In [None]:
long_string.split(" ")

In [None]:
long_string.split("\n")

In [None]:
long_string.count('s') # case sensitive!

In [None]:
long_string.upper()
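
String methods can be chained with list operations. As a short sketch (the names `lines` and `joined` are illustrative), the example below splits the string into lines, removes the surrounding spaces with `strip()`, and joins the result back together with `join()`.

```python
long_string = 'This is a string \n Second line of the string'

# split into lines, then strip leading/trailing whitespace from each line
lines = [line.strip() for line in long_string.split('\n')]
print(lines)

# join the cleaned lines back together with a separator
joined = ' | '.join(lines)
print(joined)
```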

### Conditionals

In [None]:
# if-else clause
if long_string.startswith('X'):
    print('It starts with X')
elif long_string.startswith('T'):
    print('It starts with T')
else:
    print('No')

### Loops

In [None]:
for line in long_string.split('\n'):
    print(line)

In [None]:
c = 0
while c < 10:
    c += 2
    print(c)
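
The `while` loop above prints the even numbers from 2 to 10. One possible alternative is a `for` loop over `range`, sketched below together with `enumerate`, which gives the position of each item. The list `evens` and the letters are illustrative only.

```python
# range(start, stop, step): the stop value is excluded
evens = []
for c in range(2, 11, 2):
    evens.append(c)
print(evens)

# enumerate yields the position together with the item
for index, letter in enumerate(['a', 'b', 'c']):
    print(index, letter)
```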

### File operations

Read a whole file and get its content as a string

In [None]:
with open('./data/EGFR-course.csv', 'r') as f:
    content = f.read()
print(content)

Read a file row by row

In [None]:
with open('./data/EGFR-course.csv', 'r') as f:
    for row in f:
        print(row)
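
Each `row` obtained by iterating over a file is a plain string that still ends with a newline character. For comma-separated files, the standard-library `csv` module can split each row into fields. The sketch below does not use the course file: it writes a small hypothetical table to a temporary file so that it runs on its own.

```python
import csv
import os
import tempfile

# Hypothetical two-column content, only for illustration
sample = 'name,value\nA,1\nB,2\n'

# Write the sample to a temporary file so the example is self-contained
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False, newline='') as tmp:
    tmp.write(sample)
    path = tmp.name

rows = []
with open(path, 'r', newline='') as f:
    for row in csv.reader(f):  # each row is a list of strings
        rows.append(row)
os.remove(path)  # clean up the temporary file

print(rows)
```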

### Functions

In [None]:
def average(numbers):
    return float(sum(numbers)/len(numbers))

In [None]:
my_numbers = [1, 2, 2, 2.5, 3]
average(my_numbers)
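
Functions can also take optional arguments with default values and carry a docstring. The variant below is only a sketch: the name `average_rounded`, the `digits` parameter and the empty-list check are assumptions added for illustration, not part of the original notebook.

```python
def average_rounded(numbers, digits=2):
    """Return the mean of `numbers`, rounded to `digits` decimal places."""
    if len(numbers) == 0:
        # dividing by zero would fail, so report the problem explicitly
        raise ValueError('cannot average an empty list')
    return round(sum(numbers) / len(numbers), digits)

result = average_rounded([1, 2, 2, 2.5, 3])
print(result)
```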

During GSON lessons, we will use auto-evaluation exercises in which you write functions that are then checked automatically.

##### Try to create a Python function for f(x) = (5 * x / 2)²

#### First, import the exercise

In [None]:
import sys
sys.path.insert(1, '../corrections/exercices')
from intro import exo_calculation

#### Second, print some example results

In [None]:
exo_calculation.example(3)

#### Then, create the function

In [None]:
def calculation(x):
    # hint: pow(x, 2) returns x squared (same as x ** 2)
    return

#### Finally, check if the function works correctly

In [None]:
exo_calculation.correction(calculation)

### Python libraries

A library is a collection of resources. These include pre-written code, subroutines, classes, etc.

In [None]:
from math import exp

In [None]:
exp(2)  # press Shift + Tab inside the parentheses to access the documentation

In [None]:
import math

In [None]:
math.exp(10)
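
The two import styles above give access to the same function under different names. The sketch below checks this and reads the function's docstring directly, which is also available outside the notebook interface. The variable names `same_function` and `doc` are illustrative.

```python
import math
from math import exp

# both names refer to the same function object
same_function = math.exp is exp
print(same_function)

# the docstring holds the documentation text of the function
doc = math.exp.__doc__
print(doc)
```

Calling `help(math.exp)` prints the same documentation in a formatted way.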

### Packages we might need during the course

In [None]:
import numpy as np # NumPy - package for scientific computing, alias : np

In [None]:
import pandas as pd # Pandas - package for working with data frames (tables), alias : pd

In [None]:
import sklearn # Scikit-learn - package for machine learning

In [None]:
import rdkit # RDKit - package for cheminformatics
from rdkit import Chem # Chem - RDKit module for working with molecules

In [None]:
print(np.__name__, np.__version__)
print(pd.__name__, pd.__version__)
print(sklearn.__name__, sklearn.__version__)
print(rdkit.__name__, rdkit.__version__)

Get the list of attributes and methods of any object

In [None]:
dir(Chem.Mol)
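
The output of `dir` also contains many names that start with an underscore, which are internal or special attributes. A possible way to keep only the public names is sketched below. It uses the built-in `str` type as a stand-in so that it runs without RDKit; the same filter can be applied to `Chem.Mol`.

```python
# keep only the public names (those not starting with an underscore)
public_names = [name for name in dir(str) if not name.startswith('_')]

# show the first few entries
print(public_names[:5])
```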

### Plotting

In [None]:
%matplotlib inline

In [None]:
import matplotlib.pyplot as plt

In [None]:
x_values = np.arange(0, 20, 0.1)
y_values = [math.sin(x) for x in x_values]

In [None]:
plt.plot(x_values, y_values);

In [None]:
plt.scatter(x_values, y_values);

In [None]:
plt.boxplot(y_values);

## Quiz

In [None]:
from nbautoeval import run_yaml_quiz

In [None]:
run_yaml_quiz("../corrections/quiz/intro.yaml", "theoric-quiz")

In [None]:
run_yaml_quiz("../corrections/quiz/intro.yaml", "code-quiz")