
## TCD_19 Tutorial 2. Intro to Jupyter and Python



Created by Emanuel Flores-Bautista, 2019. All content contained in this notebook is licensed under a [Creative Commons License 4.0](https://creativecommons.org/licenses/by/4.0/). The code is licensed under a [MIT license](https://opensource.org/licenses/MIT).



Hi and welcome to **Taller de Ciencia de Datos 2019 (TCD_19)**. In this notebook we'll go through some basic features of the amazing Jupyter Notebooks. We will be using Python in this notebook but Jupyter supports more than 40 programming languages. If you want to know more about the Jupyter notebook project, check out their [website](http://jupyter.org/). Here is a very interesting [article](https://www.theatlantic.com/science/archive/2018/04/the-scientific-paper-is-obsolete/556676/) that describes why Jupyter Notebooks just might be the future for scientific papers! If you want to check some cool projects using Jupyter, check out [NBviewer](http://nbviewer.jupyter.org/). 

First we'll import some libraries that we will be using throughout the course.


In [1]:
##This Python Magic command allows graphs to be plotted in the notebook
%matplotlib inline
##This command sets the graphs format to svg
%config InlineBackend.figure_format = 'svg'

import numpy as np
import pandas as pd
from scipy.integrate import odeint
import matplotlib.pyplot as plt
import seaborn as sns

### Before we continue, you might be wondering *what the heck  is Jupyter?*

As you might have already seen, Jupyter is a way to combine text, math (with LaTeX), and code in an easy-to-read document that runs in a web browser. Jupyter is also an open-source project that spun out of [IPython](https://ipython.org/) (hence the `.ipynb` extension of our notebook files), a project that set out to make prototyping code in Python fast, fun, and interactive.

To see some Jupyter tricks and shortcuts, such as the IPython magic commands, check out this
[post](https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/).




The name "Jupyter" is a combination of [Julia](https://julialang.org/) (a programming language), [Python](https://www.python.org/) (which we'll be using), and [R](https://www.r-project.org/) (a statistical software environment). However, you can currently run over 40 different languages in a Jupyter notebook, not just Julia, Python, and R.

Okay, enough history about Jupyter notebooks. Let's see their power in action.

* To launch a Jupyter Notebook, open a command line (the Terminal for Mac/Unix users; Windows users can instead open the search bar and type Jupyter Notebook) and type




> `jupyter notebook`

When you launch Jupyter, you will be presented with a menu of files in your current working directory to choose to edit. You can click "New" in the upper right corner to get a new Jupyter notebook. 

### Why Python? 

* Python is a clear and powerful programming language
* Python is easy for beginners to learn but goes a long way
* Elegant syntax makes programs easy to read
* Easy to learn, write, debug, and maintain code
* Large standard library supports many common programming tasks
*  Python has an extremely useful interactive mode for experimenting with code
* Python can be extended by adding modules implemented in other languages
* Runs on Windows, Mac OS X, Unix, Linux, Solaris, cell phones, etc.
* Python is free to download and use
* Very large online user community

### Cells

A Jupyter notebook consists of cells. There are two main types of cells you will use: **code cells and markdown cells**.

Markdown cells contain text. The text is written in Markdown, a lightweight markup language. You can read about its syntax [here](https://daringfireball.net/projects/markdown/syntax), and you can find a useful Markdown cheatsheet [here](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet). Note that you can also insert HTML into markdown cells, and it will be rendered properly. When you run a markdown cell, its contents are rendered as formatted text. In fact, everything above is written in Markdown except for the cell that imports the libraries.

To execute the content of a cell, just press `Shift` and `Enter`. 

The shortcut to add a cell below is to press `Esc + b`.

The shortcut to add a cell above is to press `Esc + a`.

The shortcut to turn a code cell into a markdown cell is to press `Esc + m`.

There are quite a few useful shortcuts in Jupyter notebooks, which you can look up by clicking the keyboard icon in the toolbar. You can also perform all of these actions manually from the toolbar.


### Common data structures in Python

##### Strings

Here is an example of using strings in Python.

In [2]:
name = input(" Enter your name : ")
print (" Hello " + name + "!")

 Enter your name : 
 Hello !


Some math: integers are `int`s and floating points are `float`s


You can use code cells as a calculator. 

In [3]:
5+3

8

In [4]:
type(8)

int

In [5]:
type(8.8)

float

In [6]:
x =5*4
x

20

The following won't work as $5^{2}$, because in Python `^` is the bitwise XOR operator, not exponentiation:

In [7]:
5^2

7

This is the proper way to do it in Python: 

In [8]:
y= 5**2 
y

25
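To see why `5^2` evaluates to 7, it can help to look at the binary representations. The short sketch below (the variable names are illustrative and not part of the original notebook) compares the `^` operator with `**` and the built-in `pow()`:

```python
# `^` is bitwise XOR: it compares the bits of both numbers
print(bin(5), bin(2))             # 0b101 0b10
xor_result = 5 ^ 2                # 0b101 XOR 0b010 = 0b111
print(xor_result)                 # 7

# `**` and the built-in pow() both compute powers
power_result = 5 ** 2
print(power_result)               # 25
print(pow(5, 2) == power_result)  # True
```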

##### Boolean

In [9]:
x<y

True

In [10]:
x is not y

True

In [11]:
x == y

False

In [12]:
z = 20 

In [13]:
x is z

True

In [14]:
x == z

True

In [15]:
type(True)

bool

In [20]:
True is not False

True
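Note that `==` compares values, while `is` checks whether two names refer to the very same object. For small integers like the ones above the two often agree, because CPython reuses small integer objects, but that is an implementation detail. A minimal sketch with lists (illustrative names) makes the difference visible:

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a different list object with the same contents
c = a           # a second name for the same object

print(a == b)   # True: same values
print(a is b)   # False: two distinct objects
print(a is c)   # True: same object
```

As a rule of thumb, use `==` to compare values and reserve `is` for checks such as `x is None`.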

##### Lists and tuples

In [21]:
x = (2,9,5) ##This is a tuple
print(max(x)) # return the largest item using the built-in max() function
print(type(x)) # Show that x is a tuple

9
<class 'tuple'>


In [22]:
# return a number rounded to n digits after the decimal point ( floating point )
round (12.345 , 1) 

12.3

In [23]:
# return a number rounded to the nearest integer ( int )
round (12.345)

12

In [24]:
# return a new sorted list from the items in iterable
sorted ([48 ,20 ,1.3 ,1])

[1, 1.3, 20, 48]

In [25]:
# logical operations
2 < 3 < 4

True

In [26]:
y = np.array([1 ,3 ,9])
print('the mean of y is ', y. mean())
print('the std deviation of y is ', y.std())
print('the shape of the array y is ', y.shape)

the mean of y is  4.333333333333333
the std deviation of y is  3.39934634239519
the shape of the array y is  (3,)


In [29]:
import random
random.random() ##returns a random number from 0 to 1

0.456686447043239

In [30]:
random.choice ([10 , 15, 20, 25, 30, 35, 40])

10

In [31]:
random.choice ([" Monday ", " Wednesday ", " Friday"])

' Wednesday '

In [32]:
S = " Python"
len(S) ##length

7

In [33]:
S[0] ## Returns the first character of the string S (here, a space)


' '

In [34]:
S [0:3] # slicing, starts at zero, second term is non-inclusive

' Py'

In [36]:
S[-1] # negative indices count from the end: this is the last character


'n'

In [38]:
S[:] # a slice with no bounds returns the whole string

' Python'

In [39]:
"y" in S # membership

True

In [40]:
"z" not in S

True
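The general slice syntax is `S[start:stop:step]`, where any of the three parts can be omitted. As a small sketch (using a hypothetical string `word` without the leading space, so the results are easier to read):

```python
word = "Python"

print(word[:-1])    # Pytho   everything but the last character
print(word[1:])     # ython   everything but the first character
print(word[::2])    # Pto     every other character
print(word[::-1])   # nohtyP  the string reversed
```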

In [41]:
names = [" Peter ", " John ", " Mary "]

In [46]:
names[0]

' Peter '

In [47]:
names [1]

' John '

Lists are mutable. The `.append()` method is a quite useful method to add elements to a list. 

In [48]:
names.append(" Tom ")
names

[' Peter ', ' John ', ' Mary ', ' Tom ']
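Mutability is the main practical difference between lists and tuples. The sketch below (with illustrative names) changes a list in place and shows that the same assignment on a tuple raises a `TypeError`:

```python
point_list = [2, 9, 5]
point_tuple = (2, 9, 5)

point_list[0] = 100          # lists can be changed in place
print(point_list)            # [100, 9, 5]

try:
    point_tuple[0] = 100     # tuples cannot be changed
except TypeError as error:
    print("TypeError:", error)
```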

In [None]:
a = []

In [52]:
#Initialize list
new_names= []

#Start a for loop
for name in names: 
    
    new_names.append(name)

In [53]:
print(new_names)

[' Peter ', ' John ', ' Mary ', ' Tom ']


In [54]:
names = names + [" Jim ", "Pepito"] ## you can also add elements in this way 
names

[' Peter ', ' John ', ' Mary ', ' Tom ', ' Jim ', 'Pepito']

In [57]:
random.choice( names )

' John '

In [59]:
x = np.array([1,1,0])

x.shape

(3,)

In [60]:
names.reverse() # reverses the list in place ( common mutable sequence operation )

In [61]:
names

['Pepito', ' Jim ', ' Tom ', ' Mary ', ' John ', ' Peter ']

In [65]:
sorted(names) # returns a new sorted list; names itself is unchanged (names.sort() would sort in place)

[' Jim ', ' John ', ' Mary ', ' Peter ', ' Tom ', 'Pepito']

In [63]:
names[::-1]

[' Peter ', ' John ', ' Mary ', ' Tom ', ' Jim ', 'Pepito']
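Some list operations modify the list in place, while others return a new list and leave the original alone. A minimal sketch of the difference, using a hypothetical list `names_demo`:

```python
names_demo = ["Tom", "Mary", "Peter"]

new_list = sorted(names_demo)   # returns a new sorted list
print(new_list)                 # ['Mary', 'Peter', 'Tom']
print(names_demo)               # ['Tom', 'Mary', 'Peter'] (unchanged)

result = names_demo.sort()      # sorts in place and returns None
print(result)                   # None
print(names_demo)               # ['Mary', 'Peter', 'Tom']
```

The same pattern applies to reversing: `names.reverse()` works in place, while `names[::-1]` builds a reversed copy.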

##### Ranges

Ranges are immutable sequences of integers normally used in for loops.

In [66]:
range(5) # stop

range(0, 5)

In [67]:
range (1 ,6) # start and stop

range(1, 6)

In [68]:
range (0 ,11 ,2) # start and stop and step

range(0, 11, 2)

In [71]:
for i in range(0,11):
    print(i)

0
1
2
3
4
5
6
7
8
9
10


In [72]:
x

array([1, 1, 0])

Oh yes, I can also write text!!

In [76]:
?enumerate

In [75]:
for i, letra in enumerate('gatito mojado'):
    
    print(i, letra)

0 g
1 a
2 t
3 i
4 t
5 o
6  
7 m
8 o
9 j
10 a
11 d
12 o


In [73]:
list(x)

[1, 1, 0]

In [74]:
list(range (0,11 ,2)) # for printing it out

[0, 2, 4, 6, 8, 10]
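A range does not store its elements; it computes them on demand, which is why we needed `list()` to see them. It still behaves like a sequence, as this short sketch (with an illustrative name `evens`) shows:

```python
evens = range(0, 11, 2)

print(len(evens))     # 6
print(evens[2])       # 4
print(8 in evens)     # True
print(7 in evens)     # False
print(list(evens))    # [0, 2, 4, 6, 8, 10]
```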

##### Numpy arange and linspace. 

The following is taken from Jake VanderPlas's awesome [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/02.02-the-basics-of-numpy-arrays.html):

In [77]:
np.arange(0,10) 

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [86]:
x = np.linspace(0, 10, 10) #linear space

In [87]:
x

array([ 0.        ,  1.11111111,  2.22222222,  3.33333333,  4.44444444,
        5.55555556,  6.66666667,  7.77777778,  8.88888889, 10.        ])
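The two functions differ in what you specify: `np.arange` takes a step and excludes the stop value, while `np.linspace` takes a number of points and (by default) includes the stop value. A small sketch with illustrative names:

```python
import numpy as np

# arange: fixed step, the stop value is excluded
by_step = np.arange(0, 1, 0.25)
print(by_step)            # [0.   0.25 0.5  0.75]

# linspace: fixed number of points, the stop value is included
by_count = np.linspace(0, 1, 5)
print(by_count)           # [0.   0.25 0.5  0.75 1.  ]
```

This is why `np.linspace(0, 10, 10)` above has a spacing of 10/9 rather than 1: ten points leave nine intervals between 0 and 10.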

In [101]:
np.random.seed(0)  # seed for reproducibility

x1 = np.random.randint(0, 10, size=6)  # One-dimensional array
x2 = np.random.randint(0, 10, size=(3, 4))  # Two-dimensional array
x3 = np.random.randint(0, 10, size=(3, 4, 5))  # Three-dimensional array

In [107]:
x3

array([[[8, 1, 5, 9, 8],
        [9, 4, 3, 0, 3],
        [5, 0, 2, 3, 8],
        [1, 3, 3, 3, 7]],

       [[0, 1, 9, 9, 0],
        [4, 7, 3, 2, 7],
        [2, 0, 0, 4, 5],
        [5, 6, 8, 4, 1]],

       [[4, 9, 8, 1, 1],
        [7, 9, 9, 3, 6],
        [7, 2, 0, 3, 5],
        [9, 4, 4, 6, 4]]])

In [108]:
print(x3.dtype) # data type of the array

int64


In [110]:
print('x3 number of dimensions', x3.ndim) #array's number of dimensions
print('x3 size of each dimension', x3.shape) #array shape
print('x3 total number of elements ', x3.size)# array's number of elements 
print('size in bytes of each element', x3.itemsize, 'bytes') #bytes of each array element  
print('total size of the array', x3.nbytes, 'bytes') #total size in bytes of the array (elements*bytes/element)



x3 number of dimensions 3
x3 size of each dimension (3, 4, 5)
x3 total number of elements  60
size in bytes of each element 8 bytes
total size of the array 480 bytes


Now let's access the matrix elements. 

In [None]:
x3

In [None]:
x3[0]

In [None]:
x3[0][1]

In [None]:
x3[0][1][2]
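NumPy also accepts all the indices at once, separated by commas, which selects the same element as the chained form. The sketch below uses a hypothetical array `arr` with the same shape as `x3` but known values, so the results are easy to check:

```python
import numpy as np

arr = np.arange(60).reshape(3, 4, 5)   # same shape as x3, known values

print(arr[0][1][2])    # 7, chained indexing as above
print(arr[0, 1, 2])    # 7, the same element with a single index tuple
print(arr[0, 1])       # [5 6 7 8 9], one row of the first block
```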

### One-dimensional arrays

In [None]:
x = np.arange(10)
x

In [None]:
x[:4]

In [None]:
x[4:]

In [None]:
x[4:7]

In [None]:
x[::2] # every other element

In [None]:
x[1::2]# every other element starting at index 1

In [None]:
x[::-1]  # all elements, reversed

### Back to 3D arrays

Subarrays can be reversed together

In [None]:
x3[:, :, 0]

In [None]:
x3[:, ::-1, 0]

In [None]:
x3[::-1, ::-1, 0]

Reshaping arrays

In [None]:
grid = np.arange(1, 10)
grid

In [None]:
grid_modified = grid.reshape((3, 3))
print(grid_modified)
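Two details of `reshape` are worth knowing. You can pass `-1` for one dimension and let NumPy infer it, and for a contiguous array such as this one the result is a view that shares data with the original. A minimal sketch (the name `square` is illustrative):

```python
import numpy as np

grid = np.arange(1, 10)
square = grid.reshape((3, -1))   # -1 lets NumPy infer the missing dimension
print(square.shape)              # (3, 3)

# For a contiguous array like this one, reshape returns a view,
# so both names share the same data
square[0, 0] = 100
print(grid[0])                   # 100
```

If you need an independent array, call `.copy()` on the result.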

In [None]:
x = np.array([1, 2, 3])
y = np.array([3, 2, 1])
np.concatenate([x, y])

In [None]:
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])


# concatenate along the first axis
np.concatenate([grid, grid])


In [None]:
# concatenate along the second axis
np.concatenate([grid, grid], axis = 1)

In [None]:
x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7],
                 [6, 5, 4]])

# vertically stack the arrays
np.vstack([x, grid])

In [None]:
grid

In [None]:
# horizontally stack the arrays
y = np.array([[99],
              [99]])
np.hstack([grid, y])

### Sets

Sets are unordered collections of distinct hashable objects.

In [None]:
nodes = set ([2 ,4 ,6 ,8 ,10 ,12])
nodes

In [None]:
nodes.add (14)
nodes

In [None]:
nodes.add(6) # add a single item (6 is already in the set, so nothing changes)
print(nodes)
nodes.update([6, 14]) # add multiple items at once (duplicates are ignored)
print(nodes)

In [None]:
word = "an-ti-dis-es-tab-lish-ment-ar-i-an-ism" # syllabified
word = word.replace("-", "") # remove the hyphens so only letters remain
letters = set(word)
count = len(letters)

In [None]:
count

In [None]:
count = len(set("floc-ci-nau-ci-ni-hil-i-pil-i-fi-ca-tion".replace("-", "")))
count
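Sets also support the usual operations from set theory. A short sketch with two hypothetical sets (`sorted()` is used only to print the results in a predictable order):

```python
evens = {2, 4, 6, 8}
multiples_of_3 = {3, 6, 9}

print(evens & multiples_of_3)          # {6}                 intersection
print(sorted(evens | multiples_of_3))  # [2, 3, 4, 6, 8, 9]  union
print(sorted(evens - multiples_of_3))  # [2, 4, 8]           difference
```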

##### Dictionaries

Dictionaries are mappings from key objects to value objects.

In [None]:
md = {}      # an empty dictionary, created with a literal
md = dict()  # the same thing, created with the constructor
age = {'tim': 29, 'jim': 31, 'pam': 27}
age['tim']


In [None]:
age ['tim'] = age ['tim'] + 1


In [None]:
age ['tim']

In [None]:
age ['tim'] += 1
age['tim']

In [None]:
age ['tom'] = 45

In [None]:
age.keys()

In [None]:
age.values()

In [None]:
age['tim'] ## a way to get the values of a specific key

In [None]:
for name in sorted (age.keys() , reverse = True ):
    print (name , age[name])
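Two other dictionary methods come up often: `.items()` iterates over keys and values together, and `.get()` looks up a key without raising a `KeyError` when it is missing. A minimal sketch using a separate, hypothetical dictionary `age_demo`:

```python
age_demo = {'tim': 29, 'jim': 31, 'pam': 27}

# .items() yields (key, value) pairs
for name, years in age_demo.items():
    print(name, years)

# .get() returns a default instead of raising KeyError for a missing key
print(age_demo.get('ana', 'unknown'))   # unknown
print('ana' in age_demo)                # False
```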

#### List comprehension

In [None]:
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
squares = []
for n in numbers :
    squares.append (n*n)

In [None]:
print(numbers, squares)


If we just care about the squares we can do the following 

In [None]:
squares = [n*n for n in range (0 ,11)]

In [None]:
squares

We can do multiple iterations. 

In [None]:
[(i, j) for i in range(2) for j in range(3)]

We can also include a conditional inside the comprehension:

In [None]:
[val for val in range(20) if val % 3 > 0]

Which is equivalent to running the following:

In [None]:
L = []
for val in range(20):
    if val % 3:
        L.append(val)
L

We can also write set comprehensions using curly braces:

In [1]:
{n**2 for n in range(12)}


{0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121}

Recall that a `set` is a collection that contains no duplicates. The `set` comprehension respects this rule, and eliminates any duplicate entries:

In [None]:
{a % 3 for a in range(1000)}

You can also write `dict` comprehensions.

In [3]:
{x:x**2 for x in range(6)}

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

##### Functions

In [None]:
def add (a,b):
    result = a + b
    return result

In [None]:
add(2,9)

In [None]:
def password(length):
    '''Return a random password of lowercase letters with the given length.'''
    characters = "abcdefghijklmnopqrstuvwxyz"
    pw = str()
    for i in range(length):
        pw = pw + random.choice(characters)
    return pw

In [None]:
password(12)

Let's define a (silly) function. We'll just showcase the use of a `for` loop and an `if/elif` statement.

In [None]:
def dont_print_tom_pam(names):
    '''Iterate over the names dictionary and print every name except tom and pam.'''
    for name in names.keys():
        if name == 'tom':    # compare strings with ==, not `is`
            continue         # skip this name but keep looping
        elif name == 'pam':
            continue
        print(name)

In [None]:
dont_print_tom_pam(age) ## executing the function on the age dictionary 
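When skipping items in a loop, the choice between `continue` and `break` matters: `continue` jumps to the next iteration, while `break` leaves the loop entirely. A small sketch with a hypothetical list `people` shows the difference:

```python
people = ['tim', 'tom', 'jim']

skipped = []
for person in people:
    if person == 'tom':
        continue            # skip 'tom', keep looping
    skipped.append(person)
print(skipped)              # ['tim', 'jim']

stopped = []
for person in people:
    if person == 'tom':
        break               # leave the loop entirely
    stopped.append(person)
print(stopped)              # ['tim']
```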

### Plotting 

Let's make some plots. `matplotlib.pyplot` is the most widely used plotting library in Python. Remember we imported it as `plt`.

In [None]:
# Generate data to plot
x = np.linspace(0, 10, 200)
y = np.sin(x)


In [None]:
plt.plot(x,y)
plt.title('Plotting Sine in Python is Easy as 123', fontsize = 18)

### Differential Equations

Let's code the [Predator-Prey Model](http://www.scholarpedia.org/article/Predator-prey_model). 
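For reference, here is the system that the code in this section integrates. The symbols $x$ (prey population, `P[0]` in the code) and $y$ (predator population, `P[1]`) are notation introduced here for readability, while $a$, $b$, $c$, and $d$ are the parameters defined in the code:

$$
\frac{dx}{dt} = x\,(a - b\,y), \qquad \frac{dy}{dt} = -y\,(c - d\,x)
$$

Setting both derivatives to zero gives the nontrivial fixed point $x^* = c/d$, $y^* = a/b$, around which the populations oscillate.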

In [None]:
?odeint

In [None]:
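# a: prey growth rate, b: predation rate,
# c: predator death rate, d: predator growth per prey eaten
a, b, c, d = 1, 1.1, 1, 1.4

def dP_dt(P, t):
    '''Right-hand side of the model: P[0] is the prey, P[1] the predators.'''
    return [P[0]*(a - b*P[1]), -P[1]*(c - d*P[0])]

ts = np.linspace(0, 25, 10000)   # time points
P0 = [1.0, 2]                    # initial prey and predator populations
Ps = odeint(dP_dt, P0, ts)       # one row per time point
prey = Ps[:, 0]
predator = Ps[:, 1]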

Let's plot our results.

In [None]:
plt.plot(ts, prey, 'c-', label= 'Rabbits')
plt.plot(ts, predator, 'y-', label= 'Foxes')
plt.xlabel('$Time$')
plt.ylabel('Population')
plt.title('Predator-Prey Model', fontsize= 18)
plt.legend()
plt.show()

Let's look at the phase space. 

In [None]:
plt.plot(prey, predator, 'c-')
plt.xlabel('Rabbits')
plt.ylabel('Foxes')
plt.title('Predator-Prey Model Phase Space', fontsize= 18)
plt.show()

Let's code up several trajectories to see if the dynamics are similar. 

In [None]:
prey_states = np.linspace(1., 5, 5)
for prey0 in prey_states:
    # Start from a different prey population each time, with 3 predators
    P0 = [prey0, 3]
    Ps = odeint(dP_dt, P0, ts)
    plt.plot(Ps[:, 0], Ps[:, 1],
             label='Prey starting condition = {0}'.format(prey0))

plt.xlabel('Rabbits')
plt.ylabel('Foxes')
plt.legend()

plt.title('Phase Space', fontsize=18)

Let's write it into a function.

In [None]:
def plot_phase_space(dP_dt, ts, n_states):
    '''Plot several trajectories of the Predator-Prey Model, one for each
    of n_states starting prey populations between 1 and 5.'''

    prey_states = np.linspace(1., 5, n_states)

    for prey0 in prey_states:
        P0 = [prey0, 3]
        Ps = odeint(dP_dt, P0, ts)
        plt.plot(Ps[:, 0], Ps[:, 1],
                 label='Prey starting condition = {0}'.format(prey0))

    plt.xlabel('Rabbits')
    plt.ylabel('Foxes')
    plt.legend()

    plt.title('Phase Space', fontsize=18)
    


In [None]:
plot_phase_space(dP_dt, ts, 3) ### Calling the function

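We can see that each trajectory forms a closed loop: the populations oscillate periodically, and each initial condition produces its own orbit. Note that this differs from a [limit cycle](https://en.wikipedia.org/wiki/Limit_cycle), which is an isolated closed trajectory that neighboring trajectories spiral toward or away from; here the size of the oscillation is set by the starting populations.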
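One way to check numerically that a trajectory closes on itself is to track a quantity that these equations conserve. The sketch below recomputes one trajectory with the same parameters and evaluates $V = d\,x - c\ln x + b\,y - a\ln y$ along it, where $x$ is the prey and $y$ the predator population. The names `V` and `drift` are introduced here for illustration. If the integration is accurate, `V` should stay nearly constant.

```python
import numpy as np
from scipy.integrate import odeint

a, b, c, d = 1, 1.1, 1, 1.4          # same parameters as above

def dP_dt(P, t):
    return [P[0]*(a - b*P[1]), -P[1]*(c - d*P[0])]

ts = np.linspace(0, 25, 10000)
Ps = odeint(dP_dt, [1.0, 2], ts)
x, y = Ps[:, 0], Ps[:, 1]

# Quantity conserved by these equations (constant along each trajectory)
V = d*x - c*np.log(x) + b*y - a*np.log(y)
drift = V.max() - V.min()
print('V at t=0:', V[0])
print('largest variation of V along the trajectory:', drift)
```

Different initial conditions give different values of `V`, which is why each starting point traces its own loop in the phase space.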

#### Best practices for code cells

Here is a summary of some general rules for composing and formatting your code cells.

1. Keep the width of code in cells below 80 characters.
2. Keep your code cells short. If you find yourself having one massive code cell, break it up.
3. Always properly comment your code. Provide complete docstrings for any functions you define.
4. Import one module per line.
5. To publish notebooks, always display your graphics in the notebook.

#### Conclusions and additional remarks

In this tutorial we covered what Jupyter is, what the basic components of Jupyter notebooks are, and what we can do with them.

Python's modules offer a great deal of tutorials through their documentation. I strongly encourage you to check them out. Moreover, in practice, one can find a lot of information for a specific problem by googling and using Stack Overflow. 


You can find a great follow-up [tutorial](https://learnxinyminutes.com/docs/python3/) here. Learnxiny is a great platform to quickly learn the syntax of almost any programming language in a practical way.