#  A new interactive world
*with Python Notebooks*

<!-- logos -->
<table>
<tr><td colspan=3> <img src='http://www.hpc.cineca.it/sites/all/themes/scai/logo.png' width=600> </td>
</tr><tr>
<td> <img src='http://upload.wikimedia.org/wikipedia/commons/thumb/f/f8/Python_logo_and_wordmark.svg/2000px-Python_logo_and_wordmark.svg.png' width=300> </td>
<td> <img src='https://raw.githubusercontent.com/jupyter/nature-demo/master/images/jupyter-logo.png' width=300> </td>
</tr>
</table>

## Here we go 
...with live code from the beginning!

In [11]:
# Install an extension from my repo
! wget -q "http://j.mp/pdonorio_py" -O whois_pdonorio.py

In [9]:
# Load the installed extension
%load_ext whois_pdonorio

<small> wait, what is that?? </small>

A command about *myself*

In [3]:
# This is a new command added from the extension 'whois_pdonorio'
%helloworld

Hello World!
	Who am i?

email: p.donoriodemeo@cineca.it
twitter: @paolo.donorio
github: @pdonorio


42
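
For the curious, an extension like the one loaded above can be sketched in a few lines. This is not the actual content of `whois_pdonorio.py`: the module name `hello_ext` and the function body are assumptions made for illustration. The only parts taken from IPython itself are the `load_ipython_extension` hook and the shell's `register_magic_function` method.

```python
# hello_ext.py: a hypothetical extension module, NOT the author's actual file


def helloworld(line):
    """Line magic body: receives the rest of the line as a string."""
    print("Hello World!")
    return 42


def load_ipython_extension(ipython):
    """Hook that IPython calls when `%load_ext hello_ext` is executed."""
    # Expose the function above as the line magic %helloworld
    ipython.register_magic_function(
        helloworld, magic_kind="line", magic_name="helloworld"
    )
```

With such a file on the Python path, `%load_ext hello_ext` followed by `%helloworld` would run the function inside the notebook.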

### Uhm. Let's *double check*
* note to self: click on the cell below
* then press shift+enter

In [2]:
import time
print ("Today is " + time.strftime("%d/%m/%Y"))

Today is 09/10/2015


## A notebook and a Calculator

Many of the things I used to use a calculator for, I now use Python for:

In [15]:
2+2

4

In [16]:
(50-5*6)/4

5.0

In [18]:
7/3

2.3333333333333335

<small>Note: in Python 2, 7/3 is an integer division and returns 2, so this last command would have given a different answer.</small>
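
The division behaviour mentioned in the note can be checked directly. The short sketch below shows the two division operators of Python 3; the variable names are only illustrative.

```python
# True division always returns a float in Python 3
true_division = 7 / 3

# Floor division discards the fractional part (what Python 2 did for integers)
floor_division = 7 // 3

# divmod returns the quotient and the remainder together
quotient, remainder = divmod(7, 3)

print(true_division)
print(floor_division)
print(quotient, remainder)
```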

## What did we get so far?

* integers, floating point numbers
* extensions
* the standard Python import of a library
    - libraries are called **modules**

In [11]:
# An example of using a module
from math import sqrt
sqrt(81)

9.0

In [12]:
# Or you can simply import the math library itself
import math
math.sqrt(81)

9.0

You can define variables using the equals (=) sign:

In [13]:
width = 20
length = 30
area = length*width
area

600

If you try to access a variable that you haven't yet defined, you get an error (**traceback**)

In [14]:
volume

NameError: name 'volume' is not defined

and you need to define it:

In [None]:
depth = 10
volume = area*depth
volume

# How to get up-to-date

What you will see:

* What Python scripting is and how it works at a low level
* The IPython project and its Jupyter spin-off
* How notebooks are easy and powerful
* How to write extensions for notebook and how to use them
* Access the MapReduce world with notebooks

# The evolution

`Scripting` -> *Python* Interpreter -> *IPython* Shell -> *Jupyter* Notebooks -> **Awesome!**

## About '*scripting*'

## Compilers

* **Computer programs** are "*binary*" instructions (an alphabet of only 0 and 1), the only language the machine understands.


* A **programming language** (e.g. `C` or `Fortran`) sits halfway between human language (e.g. English) and machine language (0s and 1s).


* A person writes code in `C`, and a ***compiler*** translates it into machine language. The **code** is translated into **binaries** or **executables**.

## Interpreters

* **Scripting languages** are programming languages that *don't require an explicit compilation step*. For example: PHP, JavaScript, Perl, Python, R.


* You have to compile a C program before you can run it. You don't have to compile a JavaScript program before using it.


* Scripting languages use **interpreters** (instead of compilers) to translate source code into machine-executable instructions at run time.
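
To make the run-time translation a little more concrete, here is a small sketch for the standard CPython interpreter, which first turns source text into an intermediate *bytecode* and then executes it. The `area` function and the `"<example>"` file name are illustrative only, and the exact bytecode listing printed by `dis` changes between Python versions.

```python
import dis

# Translate a piece of source text into a code object, at run time
source = "2 + 2"
code = compile(source, "<example>", "eval")
print(type(code).__name__)

# Execute the translated code object
result = eval(code)
print(result)


def area(length, width):
    return length * width


# Show the bytecode instructions the interpreter will execute for `area`
dis.dis(area)
```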

## What is Python?

[Python](http://www.python.org/) is: 

> a modern, general-purpose, object-oriented, high-level programming language.

### General characteristics of Python

* **clean and simple language:** 
    * Easy-to-read and intuitive code
    * easy-to-learn minimalistic syntax
    * maintainability scales well with size of projects

* **expressive language:** 
    * Fewer lines of code
    * fewer bugs
    * easier to maintain

### Some *technical* details

* **dynamically typed:** 
    * No need to define the type of variables, function arguments or return types.
* **automatic memory management:** 
    * No need to explicitly allocate and deallocate memory for variables and data arrays
        * Far fewer memory-leak bugs
* **interpreted:** 
    * No need to compile the code
        * The Python interpreter reads and executes the python code directly
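
Dynamic typing is easy to see in practice. In the sketch below (the names `value` and `double` are just examples) the same variable is bound to objects of different types, and one function works with any argument that supports the `*` operator.

```python
# The same name can be bound to objects of different types
value = 42
print(type(value).__name__)

value = "forty-two"
print(type(value).__name__)


def double(x):
    # No type declaration: any object supporting `* 2` is accepted
    return x * 2


print(double(21))
print(double("ab"))
print(double([1, 2]))
```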

### Advantages

* The main advantage is ease of programming
    - minimizing the **time required** to develop, debug and maintain the code
* Well designed language that encourages many good programming practices
    - Modular, with good system for packaging and re-use of code
    - This often results in more transparent, maintainable and bug-free code
* Self describing (*introspection*)
    - Documentation tightly integrated with the code
* A large standard library, and a large collection of add-on packages

In [4]:
def myfunc(par1, par2="test"):
    """ I may help """
    print(par1, par2)

In [5]:
?myfunc

In [7]:
myfunc("one")
myfunc("one", "two")

one test
one two


In [8]:
myfunc()

TypeError: myfunc() missing 1 required positional argument: 'par1'

### Disadvantages

* Code written in an interpreted, dynamically typed language **may run slower than code written in a compiled** language
* Very different from functional programming, which you may already know a little bit
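
A rough way to get a feeling for the speed issue is to time an explicit Python loop against the built-in `sum`, which in the standard CPython interpreter is implemented in C. This is only a sketch: `loop_sum` is an illustrative helper, and the timings depend on the machine and the Python version, so no numbers are shown here.

```python
import timeit


def loop_sum(n):
    """Add the integers below n with an explicit, interpreted loop."""
    total = 0
    for i in range(n):
        total += i
    return total


n = 100000

# Run each version 20 times and measure the total elapsed time
loop_time = timeit.timeit(lambda: loop_sum(n), number=20)
builtin_time = timeit.timeit(lambda: sum(range(n)), number=20)

print("explicit loop: {:.4f} s".format(loop_time))
print("built-in sum:  {:.4f} s".format(builtin_time))
```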

### Popularity

source: http://githut.info/
<img src='http://j.mp/1NwNIdj'>

# Python interpreter

The standard way to use the Python programming language is to run Python code with the Python interpreter.

* The Python interpreter is a program that reads and executes the Python code in the files passed to it as arguments
* At the command prompt, the command ``python`` is used to invoke the Python interpreter

For example, to run a file ``my-program.py`` that contains python code from the command prompt, use:

    $ python my-program.py
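
As an illustration, a file like ``my-program.py`` could contain something as small as the sketch below. The content is hypothetical: the `greeting` and `main` functions are assumptions made for this example.

```python
import sys


def greeting(name):
    """Build the message to print."""
    return "Hello, {}!".format(name)


def main(argv):
    # Use the first command-line argument if present, else a default
    name = argv[1] if len(argv) > 1 else "world"
    print(greeting(name))


# Run main() only when the file is executed as a script,
# not when it is imported as a module
if __name__ == "__main__":
    main(sys.argv)
```

Running `python my-program.py` would print the default greeting, while `python my-program.py Python` would greet the given name.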

We can also start the interpreter by simply typing ``python`` at the command line, and interactively type python code into the interpreter. 

<img src="https://raw.githubusercontent.com/cineca-scai/lectures/master/pydata/images/python-screenshot.jpg" width="700">


The most notable feature of the Python language syntax is the use of **indentation** to define code blocks, instead of the classic '*braces*' used by languages such as C.
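
A tiny sketch of what indentation-based blocks look like (the `classify` function is just an example): the body of the function, of the `if` and of the `for` loop is whatever is indented under them.

```python
def classify(number):
    # Everything indented under `def` belongs to the function
    if number % 2 == 0:
        # Everything indented under `if` runs only when the test is true
        return "even"
    return "odd"


for number in range(3):
    # The loop body is the indented block below the `for` line
    print(number, classify(number))
```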

Let's see a quick shell demo.

### IPython

IPython is an interactive shell that addresses the limitations of the standard Python interpreter

...it is a work-horse for scientific use of python! 

It provides an interactive prompt to the python interpreter with a greatly improved user-friendliness.

<img src="https://raw.githubusercontent.com/cineca-scai/lectures/master/pydata/images/ipython-screenshot.jpg" width="800">

IPython was born in 2001 as the work of a student, *Fernando Perez*.

He based it on features he liked in *Mathematica*, aiming to create a system for everyday scientific computing.

Some of the many useful features of IPython include:

* Command history, which can be browsed with the up and down arrows on the keyboard.
* Tab auto-completion.
* In-line editing of code.
* Object introspection, and automatic extraction of documentation strings from Python objects like classes and functions.
* Good interaction with the operating system shell.
* Support for multiple parallel back-end processes that can run on computing clusters or cloud services like Amazon EC2.


Let's see a quick shell demo.

* colors
* output
* magic commands
* history
* bash commands

Note: we are currently running *Docker* for building our environment.

To open a bash shell attached to the running container you may use the command:

```
docker exec -it $(docker ps | grep client_ | awk '{print $1}') bash
```

Inside that shell you may run '`python`' and '`ipython`' commands.

# IPython notebook

[IPython notebook](http://ipython.org/notebook.html) is an HTML-based notebook environment for Python

* Based on the IPython shell
* Provides a web-based, cell-based interactive environment powered by JavaScript
* System profiles to access unix-terminal-like capability
* Comments and notes with HTML and markdown formats
* Integrates embedded plots


<img src="https://raw.githubusercontent.com/cineca-scai/lectures/master/pydata/images/ipython-notebook-screenshot.jpg" width="800">

Although they use a web browser as the graphical interface,

IPython notebooks are usually run **locally**,

on the same computer that runs the browser.


To start a new IPython notebook session, run the following command:

    $ ipython notebook

from a directory where you want the notebooks to be stored. 

<small>(This will open a new browser window with a running explorer of the current path)</small>

# Jupyter project

> In 2014, Fernando Perez announced a spin-off project from IPython called Project Jupyter.

> IPython will continue to exist as a Python shell and a kernel for Jupyter, 

> while **the notebook and other language-agnostic parts of IPython** will move under the Jupyter name. 

> Jupyter added support for Julia, R, Haskell and Ruby.

source: https://en.wikipedia.org/wiki/IPython#Project_Jupyter

## Kernels

Jupyter notebooks are based on **ipython kernels**.

> A ‘kernel’ is a program that runs and introspects the user’s code. 

IPython includes a kernel for Python code

People have written kernels for [several other languages](https://github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages).

<small>Note: 
If you wish to **write a Kernel for a missing language**, you can read the ipython development documentation:
http://ipython.readthedocs.org/en/stable/development/kernels.html.</small>

<small>Note: When IPython starts a kernel, it passes it a connection file. 
This specifies how to set up communications with the frontend.</small>
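
To give an idea of what the connection file looks like, the sketch below parses a sample one. The field names follow the layout described in the Jupyter client documentation as far as we recall it; the port numbers and the key are made-up values, and `socket_addresses` is a hypothetical helper written only for this example.

```python
import json

# Sample connection file content: ports and key are invented values
SAMPLE_CONNECTION_FILE = """
{
  "transport": "tcp",
  "ip": "127.0.0.1",
  "shell_port": 57503,
  "iopub_port": 40885,
  "stdin_port": 52597,
  "control_port": 50160,
  "hb_port": 42540,
  "signature_scheme": "hmac-sha256",
  "key": "a0436f6c-1916-498b-8eb9-e81ab9368e84"
}
"""


def socket_addresses(connection):
    """Map each channel name to the address a frontend would connect to."""
    base = "{}://{}".format(connection["transport"], connection["ip"])
    return {
        name[: -len("_port")]: "{}:{}".format(base, port)
        for name, port in connection.items()
        if name.endswith("_port")
    }


connection = json.loads(SAMPLE_CONNECTION_FILE)
addresses = socket_addresses(connection)
for channel in sorted(addresses):
    print(channel, addresses[channel])
```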

# Welcome, 
## new *you* 
## as a <u>notebooker</u> data scientist

We can try some features together

## A notebook is crazy simple and fun

- explorer, create new, remove, rename

- move inside, run cell code, help

- the kernel, start and stop, cell types 

- shortcuts: becoming an editor

## A notebook is crazy simple and fun

- markdown and notes <small>(consider [learning the markdown language](http://markdowntutorial.com/))</small>

- download ipynb, python, html

- install a library and use it

- slideshow

## Introspection

Everything is an object

The `dir` and `type` functions help you check that anything in Python is an object

In [7]:
import math
print("\nENV:\n", dir())
print("\nMODULE:\n", dir(math))


ENV:
 ['In', 'Out', '_', '_1', '__', '___', '__builtin__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', '_dh', '_i', '_i1', '_i2', '_i3', '_i4', '_i5', '_i6', '_i7', '_ih', '_ii', '_iii', '_oh', '_sh', 'exit', 'get_ipython', 'math', 'quit']

MODULE:
 ['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'hypot', 'isfinite', 'isinf', 'isnan', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'trunc']


In [71]:
print(math.__doc__)
math?

This module is always available.  It provides access to the
mathematical functions defined by the C standard.


In [15]:
# ?math.trunc is equivalent
help(math.trunc)

Help on built-in function trunc in module math:

trunc(...)
    trunc(x:Real) -> Integral
    
    Truncates x to the nearest Integral toward 0. Uses the __trunc__ magic method.



In [55]:
math.trunc(5.676876)

5

In [60]:
type(math.trunc)

builtin_function_or_method

In [54]:
math.trunc

<function math.trunc>

In [62]:
print(dir(math.trunc))

['__call__', '__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__name__', '__ne__', '__new__', '__qualname__', '__reduce__', '__reduce_ex__', '__repr__', '__self__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__text_signature__']


In [53]:
math.trunc.__doc__

'trunc(x:Real) -> Integral\n\nTruncates x to the nearest Integral toward 0. Uses the __trunc__ magic method.'

In [63]:
math.trunc.__name__

'trunc'

We can assign the function to another object

In [24]:
myfun = math.trunc
myfun(5.676876)

5

In [65]:
print(type(5.676876))
print(type(myfun(5.676876)))

<class 'float'>
<class 'int'>


Now that we understand how easy it is to interact with parts of Python's own core, let's write a small helper:

In [51]:
def one_line_help(obj):
    """Get the docstring of an object and read the first four lines"""
    print("\n".join(obj.__doc__.split("\n")[0:4]))
    
one_line_help(dir)

dir([object]) -> list of strings

If called without an argument, return the names in the current scope.
Else, return an alphabetized list of names comprising (some of) the attributes


In [68]:
# Another way of inspecting is via inspect module
import inspect
inspect.getdoc(one_line_help)

'Get the docstring of an object and read the first four lines'
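
The `inspect` module can report more than docstrings. As a small sketch, the code below reads the signature of a function similar to the `myfunc` defined earlier (rewritten here so the block is self-contained) and lists which parameters have a default value.

```python
import inspect


def myfunc(par1, par2="test"):
    """ I may help """
    return (par1, par2)


# The signature object describes the parameters of the function
signature = inspect.signature(myfunc)
print(signature)

# Collect the parameters that are mandatory (no default value)
required = [
    name
    for name, parameter in signature.parameters.items()
    if parameter.default is inspect.Parameter.empty
]
print("required:", required)
```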

## Exercises 

* Compute the sine of the base-10 logarithm of 4

* Move a file into the system 'tmp' directory
    - hint1: there is a useless `tar.gz` file in the `/opt` directory
    - hint2: check `shutil`

## Python and modules versions

We have installed an extension here that records the versions of Python and of the modules you use, which helps to ensure the reproducibility of your Python code.

In [3]:
%reload_ext version_information

%version_information numpy, pandas, matplotlib

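<table>
<tr><th>Software</th><th>Version</th></tr>
<tr><td>Python</td><td>3.4.3 64bit [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]</td></tr>
<tr><td>IPython</td><td>4.0.0</td></tr>
<tr><td>OS</td><td>Linux 4.0.9 boot2docker x86_64 with debian jessie sid</td></tr>
<tr><td>numpy</td><td>1.9.3</td></tr>
<tr><td>pandas</td><td>0.16.2</td></tr>
<tr><td>matplotlib</td><td>1.4.3</td></tr>
<tr><td colspan=2>Thu Oct 08 15:16:54 2015 UTC</td></tr>
</table>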
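
When the extension is not available, a similar report can be approximated with the standard library alone. The sketch below is not how `version_information` works internally; `module_version` and `version_report` are hypothetical helpers, and modules that are not installed are simply reported as such.

```python
import importlib
import platform


def module_version(name):
    """Return the version string of a module, if it can be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return "not installed"
    # Many packages expose their version as __version__, but not all
    return getattr(module, "__version__", "unknown")


def version_report(module_names):
    """Collect interpreter, OS and module versions in a dictionary."""
    report = {
        "Python": platform.python_version(),
        "OS": platform.platform(),
    }
    for name in module_names:
        report[name] = module_version(name)
    return report


report = version_report(["numpy", "pandas", "matplotlib"])
for software in sorted(report):
    print("{}: {}".format(software, report[software]))
```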


```
Note: 

now that containers are widespread, you may use Docker to freeze a version of your environment.

This is very useful for others... and for yourself! ;)
```

# you can catch a break

In [None]:
import time
time.sleep(15*60)
print("Our mind is clearer, let's start again.")