<img src="http://hilpisch.com/tpq_logo.png" alt="The Python Quants" width="45%" align="right" border="4">

# Introduction to Python, IPython & Jupyter

Dr. Yves J. Hilpisch

The Python Quants GmbH

<a href='http://fpq.io'>http://fpq.io</a> | <a href='mailto:team@tpq.io'>team@tpq.io</a>

## Python Environments

### Anaconda

A **consistent Python distribution** is important for interactive analytics, prototyping and development. **Anaconda** is one excellent option; it is targeted at

* corporate and financial institutions,
* data scientists,
* quantitative and financial analysts as well as
* academics, researchers, teachers

You can download it from the <a href="http://www.continuum.io/downloads" target="_blank">Anaconda page</a>. However, you do not necessarily need a local installation, since the Python Quant Platform is available.

### Quant Platform

Alternatively, you can use the Web-based financial analytics environment **Python Quant Platform** (<a href="http://quant-platform.com" target="_blank">http://quant-platform.com</a>) where you find a **complete, browser-based Python analytics and development environment** with, among others, a full Anaconda Python distribution already installed (both 2.7 and 3.4 versions).

In [None]:
from IPython.display import IFrame, Image, HTML
Image('http://hilpisch.com/pqp_overview.png')

## IPython

IPython is one of the most popular and most **powerful interactive analytics environments** for Python and other languages like R. It comes in three **technological flavours**:

* IPython Shell
* IPython QTConsole
* IPython Notebook (now Jupyter Notebook)



Cf. http://ipython.readthedocs.org/en/stable/

On the **Quant Platform**, you can use both the **IPython Shell** and **IPython Notebook** (now **Jupyter Notebook**).

## First Steps with Python

### Using Jupyter Notebook

IPython/Jupyter (Notebook) allows for easy, fail-safe Python development and interactive analytics. It supports the user with a number of tools:

* magic commands that bring magic to the command line
* help system for fast help access
* tab completion for inspection of available names, attributes and methods
* system shell
* others ...

#### Calculations

On a rather fundamental level, IPython can work as a calculator.

In [None]:
3 + 4

In [None]:
3 * 5

In [None]:
3 / 4 
 # Python 2.x --> result 0
 # Python 3.x --> result 0.75

In [None]:
3 / 4.
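The version-dependent result of `3 / 4` can be avoided by stating the intent explicitly. As a minimal sketch using only built-in operators: `//` performs floor division, and converting one operand to a float gives true division, on both Python 2.x and 3.x. The variable names are chosen for illustration only.

```python
# floor division: same result on Python 2.x and 3.x
floor_result = 3 // 4

# true division: make one operand a float explicitly
true_result = float(3) / 4

print(floor_result)
print(true_result)
```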

In [None]:
log(1)

In [None]:
import math

In [None]:
math.log(1)

In [None]:
math.log?
# reading the help text

In [None]:
math.  # tab completion

#### Magic Commands

Magic commands are specific to IPython; some are available only in the IPython (Jupyter) Notebook.

In [None]:
%magic

In [None]:
%lsmagic

In [None]:
%prun?
 # get help on a command with ?

In [None]:
Image('http://matplotlib.org/mpl_examples/pie_and_polar_charts/polar_scatter_demo.hires.png')

In [None]:
# displaying graphics within the Notebook
%matplotlib inline
# suppressing non-critical warnings
import warnings; warnings.simplefilter('ignore')

In [None]:
%load http://matplotlib.org/mpl_examples/shapes_and_collections/scatter_demo.py

#### System Shell

In [None]:
%run dx_example.py

In [None]:
!ls -n

In [None]:
mkdir test

In [None]:
ls

In [None]:
cd test

In [None]:
cd ..

In [None]:
rmdir test

In [None]:
ls

### Interactive Python Coding

#### Deciding Prime Characteristic of Integer

As an exercise, we want to implement a function that decides whether a given integer is prime or not. The function shall check:

* whether the input is indeed an integer
* whether the number is both positive and not "too small"
* whether it has the prime characteristic

Let's start with the basic function definition.

In [None]:
def is_prime(I):
    pass

In [None]:
is_prime(1)

In [None]:
is_prime('Python')

Let's add type checking.

In [None]:
def is_prime(I):
    print("Type of I is %s" % type(I))

In [None]:
is_prime(1)

In [None]:
is_prime('Python')

We only accept the 'int' type.

In [None]:
def is_prime(I):
    if type(I) != int:
        raise TypeError("Input has not the right type.")  # call syntax works on Python 2.x and 3.x
    print("Input type is ok.")

In [None]:
is_prime(1)

In [None]:
is_prime('Python')

In [None]:
def is_prime(I):
    if not isinstance(I, int):  # alternative inspection
        raise TypeError("Input has not the right type.")  # call syntax works on Python 2.x and 3.x
    print("Input type is ok.")

In [None]:
is_prime(1)

In [None]:
is_prime('Python')

We also have to exclude negative and "too small" numbers.

In [None]:
def is_prime(I):
    if type(I) != int:
        raise TypeError("Input has not the right type.")
    if I <= 3:
        raise ValueError("Number too small.")
    print("Input is ok.")

In [None]:
is_prime(1)

In [None]:
is_prime(5)

Finally, we add the functionality to check the prime characteristic.

In [None]:
def is_prime(I):
    if type(I) != int:
        raise TypeError("Input has not the right type.")
    if I <= 3:
        raise ValueError("Number too small.")
    else:
        for i in xrange(2, I):
            if I % i == 0:
                print("Number is not prime, it is divided by %d." % i)
                break
            if i == I - 1:
                print("Number is prime.")

In [None]:
is_prime(1)

In [None]:
is_prime('Python')

In [None]:
is_prime(5)

In [None]:
is_prime(6)

In [None]:
%time is_prime(18000001)

In [None]:
%time is_prime(int(1e8) + 7)

In [None]:
%time is_prime(int(1e8) + 3)

If this is too long, we can implement two simple optimizations:

* we only need to check **odd numbers**
* we only need to check numbers up to the **square root of the input number**

In [None]:
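import numbers

def is_prime(I):
    # numbers.Integral covers int (and long on Python 2.x)
    if not isinstance(I, numbers.Integral):
        raise TypeError("Input has not the right type.")
    if I <= 1:
        raise ValueError("Number too small.")
    if I == 2:  # the only even prime
        print("Number is prime.")
        return None
    if I % 2 == 0:
        print("Number is even, therefore not prime.")
        return None
    end = int(I ** 0.5) + 1
    for i in xrange(3, end, 2):  # odd candidates up to the square root
        if I % i == 0:
            print("Number is not prime, it is divided by %d." % i)
            break
    else:  # loop ended without finding a divisor
        print("Number is prime.")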
With the **improved algorithm**, the check needs far fewer loop iterations and therefore runs much faster.

In [None]:
%time is_prime(int(1e8) + 7)

In [None]:
%time is_prime(int(1e8) + 3)

In [None]:
p = 100109100129162907  # larger prime
%time is_prime(p)

However, performance can be further enhanced by an even better algorithm.

In [None]:
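def is_prime_2(n):
    # returns True for primes, otherwise a tuple (False, d);
    # d is a divisor of n (for n < 2, d is only a placeholder)
    if n == 2 or n == 3:
        return True
    if n < 2 or n % 2 == 0:
        return False, 2
    if n < 9:
        return True
    if n % 3 == 0:
        return False, 3
    r = int(n ** 0.5)
    f = 5
    while f <= r:  # candidates of the form 6k - 1 and 6k + 1
        if n % f == 0:
            return False, f
        if n % (f + 2) == 0:
            return False, f + 2
        f += 6
    return True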

In [None]:
%time is_prime_2(int(1e8) + 7)

In [None]:
%time is_prime_2(int(1e8) + 3)

In [None]:
%time is_prime_2(p)  # about twice as fast as the previous one
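`is_prime_2` only tests candidate divisors `f` and `f + 2` for `f = 5, 11, 17, ...`, that is, numbers of the form 6k - 1 and 6k + 1. This is sufficient because every prime larger than 3 has that form; all other remainders modulo 6 imply divisibility by 2 or 3. A small sketch to check this empirically, where `trial_division` is a reference helper introduced here for illustration only:

```python
def trial_division(n):
    """Plain trial division, used only as a reference check."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

primes = [n for n in range(5, 10000) if trial_division(n)]

# every prime above 3 leaves remainder 1 or 5 when divided by 6
all_of_form = all(n % 6 in (1, 5) for n in primes)
print(all_of_form)
```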

In [None]:
type(range(10))

In [None]:
type(xrange(10))

In [None]:
for x in range(10):
    print (x)

In [None]:
for x in xrange(10):
    print (x)

In [None]:
import sys

In [None]:
sys.getsizeof(range(1000))

In [None]:
sys.getsizeof(xrange(1000))
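The `xrange` cells above apply to Python 2.x. On Python 3.x, `xrange` no longer exists and `range` itself is evaluated lazily. As a version-independent sketch of the same idea, the following compares a fully materialized list with a generator expression; exact byte counts depend on the interpreter, so none are stated here.

```python
import sys

eager = list(range(1000))        # all 1,000 elements held in memory
lazy = (i for i in range(1000))  # elements produced one at a time

size_eager = sys.getsizeof(eager)
size_lazy = sys.getsizeof(lazy)
print(size_eager > size_lazy)
```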

### Dynamic Compiling

In [None]:
def is_prime_3(n):
    if n % 2 == 0:
        return False
    from_i = 3
    to_i = n ** 0.5 + 1
    for i in xrange(from_i, int(to_i), 2):
        if n % i == 0:
            return False
    return True

In [None]:
%time is_prime_3(p)  # pure Python slightly slower

In [None]:
import numba as nb

In [None]:
is_prime_nb = nb.jit(is_prime_3)

In [None]:
%timeit is_prime_nb(p)  # compiled version about 5 times faster

## Modelling Data

### Basic Data Types

#### Integers

In [None]:
a = 10

In [None]:
a

In [None]:
type(a)

In [None]:
a.bit_length()

In [None]:
a = 1000000

In [None]:
a.bit_length()

In [None]:
googol = 10 ** 100
googol

In [None]:
googol.bit_length()

In [None]:
1 + 4

In [None]:
1 / 4

In [None]:
type(1 / 4)

#### Floats

In [None]:
b = 1.

In [None]:
b

In [None]:
type(b)

In [None]:
1. / 4

In [None]:
type(1. / 4)

In [None]:
b = 0.35

In [None]:
b

Floating point numbers are not stored with perfect precision ...

In [None]:
b + 0.1

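This is due to the binary representation of floats: a number $0 < n < 1$ is stored as a finite sum of binary fractions of the form $n = \frac{x}{2} + \frac{y}{4} + \frac{z}{8} + ...$ with $x, y, z, ... \in \{0, 1\}$. Many decimal numbers, such as 0.35, have no finite expansion of this kind and can therefore only be approximated.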

In [None]:
c = 0.5

In [None]:
c.as_integer_ratio()

In [None]:
b.as_integer_ratio()

The exact value would, of course, be $0.35 = \frac{7}{20}$.
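To make the binary expansion concrete, the following sketch computes the first binary digits of a fraction with exact arithmetic. The helper `binary_digits` is hypothetical and not part of any library. For $\frac{7}{20}$ the digits repeat without end, so a float with a finite number of bits can only store an approximation.

```python
from fractions import Fraction

def binary_digits(x, n_digits):
    """First n_digits binary fraction digits of 0 < x < 1, computed exactly."""
    x = Fraction(x)
    digits = []
    for _ in range(n_digits):
        x *= 2
        bit = int(x >= 1)  # next digit is 1 if doubling reaches 1
        digits.append(bit)
        x -= bit
    return digits

print(binary_digits(Fraction(1, 2), 4))    # terminates after one digit
print(binary_digits(Fraction(7, 20), 12))  # repeating pattern

# the float 0.35 is not exactly 7/20
print(Fraction(0.35) == Fraction(7, 20))
```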

In [None]:
import decimal
from decimal import Decimal

In [None]:
decimal.getcontext()

In [None]:
d = Decimal(1) / Decimal (11)
d

In [None]:
decimal.getcontext().prec = 4  # lower precision than default

In [None]:
e = Decimal(1) / Decimal (11)
e

In [None]:
decimal.getcontext().prec = 50  # higher precision than default

In [None]:
f = Decimal(1) / Decimal (11)
f

In [None]:
g = d + e + f  # and mix it up
g
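One practical consequence, shown as a minimal sketch: `Decimal` objects constructed from strings keep decimal values exact, whereas `float` arithmetic on the same values does not. The variable names are chosen for illustration.

```python
from decimal import Decimal

float_sum = 0.35 + 0.1
decimal_sum = Decimal('0.35') + Decimal('0.1')

print(float_sum == 0.45)               # float comparison
print(decimal_sum == Decimal('0.45'))  # exact decimal comparison
```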

### Strings

In [None]:
t = 'this is a string object'

In [None]:
t.capitalize()

In [None]:
t.upper()

In [None]:
t.split()

In [None]:
t.find('string')  # returns index value/position

In [None]:
t.replace(' ', '|')

In [None]:
'http://www.python.org'.strip('htp:/')  # delete leading/trailing characters

In [None]:
'http://www.python.org'.strip('htp:/w.')

In [None]:
t[4:8]  # slicing is also possible

In [None]:
help(''.strip)

**Regular expressions** are really helpful when working with strings.

In [None]:
import re

In [None]:
series = """
'01/18/2014 13:00:00', 100, '1st';
'01/18/2014 13:30:00', 110, '2nd';
'01/18/2014 14:00:00', 120, '3rd'
"""

In [None]:
dt = re.compile(r"'[0-9/:\s]+'")  # describes a 'datetime'; raw string keeps \s intact

In [None]:
result = dt.findall(series)
result

The results can then be parsed and transformed into Python datetime objects.

In [None]:
from datetime import datetime
pydt = datetime.strptime(result[0].replace("'", ""),
                         '%m/%d/%Y %H:%M:%S')
pydt

In [None]:
print(pydt)

In [None]:
pydt.__str__()
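The same steps can be applied to every match at once. The following sketch repeats the sample data so that it is self-contained; the names `pattern` and `timestamps` are chosen for illustration.

```python
import re
from datetime import datetime

series = """
'01/18/2014 13:00:00', 100, '1st';
'01/18/2014 13:30:00', 110, '2nd';
'01/18/2014 14:00:00', 120, '3rd'
"""

pattern = re.compile(r"'[0-9/:\s]+'")  # quoted date-time strings

# strip the surrounding quotes, then parse each match
timestamps = [datetime.strptime(match.strip("'"), '%m/%d/%Y %H:%M:%S')
              for match in pattern.findall(series)]

for ts in timestamps:
    print(ts)
```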

### Basic Data Structures

#### Tuples

In [None]:
t = (1, 2.5, 'data')
type(t)

In [None]:
t = 1, 2.5, 'data'
type(t)

In [None]:
t[2]

In [None]:
type(t[2])

In [None]:
t.count('data')

In [None]:
t.index(1)

#### Lists

In [None]:
l = [1, 2.5, 'data']
l[2]

In [None]:
l = list(t)
l

In [None]:
type(l)

In [None]:
l.append([4, 3])  # append list at the end
l

In [None]:
l.extend([1.0, 1.5, 2.0])  # append elements of list
l

In [None]:
l.insert(1, 'insert')  # insert object before index position
l

In [None]:
l.remove('data')  # remove first occurrence of object
l

In [None]:
p = l.pop(3)  # removes and returns object at index
print(l, p)

In [None]:
l[2:5]  # 3rd to 5th element

In [None]:
for element in l[2:5]:
    print(element ** 2)

In [None]:
r = range(0, 8, 1)  # start, end, step width
r

In [None]:
type(r)

#### Dictionaries

In [None]:
d = {
     'Name' : 'Angela Merkel',
     'Country' : 'Germany',
     'Profession' : 'Chancellor',
     'Age' : 60
     }
type(d)

In [None]:
print(d['Name'], d['Age'])

In [None]:
d.keys()

In [None]:
d.values()

In [None]:
d.items()  # key-value pairs

In [None]:
birthday = True
if birthday:
    d['Age'] += 1
print(d['Age'])

In [None]:
for item in d.items():
    print(item)

In [None]:
for value in d.values():
    print(type(value))

#### Sets

In [None]:
s = set(['u', 'd', 'ud', 'du', 'd', 'du'])
s

In [None]:
t = set(['d', 'dd', 'uu', 'u'])

In [None]:
s.union(t)  # all of s and t

In [None]:
s.intersection(t)  # both in s and t

In [None]:
s.difference(t)  # in s but not t

In [None]:
t.difference(s)  # in t but not s

In [None]:
s.symmetric_difference(t)  # in either one but not both

One application of set objects is to get rid of duplicates in a list object, for example.

In [None]:
from random import randint
l = [randint(0, 10) for i in range(1000)]
    # 1,000 random integers between 0 and 10
len(l)  # number of elements in l

In [None]:
l[:20]

In [None]:
s = set(l)
s

In [None]:
for number in s:
    print("Number %2d occurs %3d times in the data set." % (number, l.count(number)))
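Counting with `l.count(number)` scans the whole list once per distinct value. As an alternative sketch, `collections.Counter` from the standard library gathers all frequencies in a single pass. The seed value is arbitrary and only serves reproducibility; `data` and `counts` are names chosen for illustration.

```python
import random
from collections import Counter

random.seed(42)  # arbitrary fixed seed for reproducibility
data = [random.randint(0, 10) for _ in range(1000)]

counts = Counter(data)  # maps each distinct value to its frequency
for number, count in sorted(counts.items()):
    print("Number %2d occurs %3d times in the data set." % (number, count))
```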

## Selected Idioms

#### For Loops and If-Elif-Else

In [None]:
for i in range(2, 5):
    print(l[i] ** 2)

In [None]:
for i in range(1, 10):
    if i % 2 == 0:  # % is for modulo
        print("%d is even" % i)
    elif i % 3 == 0:
        print("%d is multiple of 3" % i)
    else:
        print("%d is odd" % i)

#### While Loops

In [None]:
total = 0
while total < 100:
    total += 1
print(total)

#### Try-Except-Else-Finally

In [None]:
import math

In [None]:
# raises error intentionally
for i in [2, 3, -4, 6, -5, -4]:
    print(math.sqrt(i))

In [None]:
for i in [2, 3, -4, 6, -5, -4]:
    try:
        print ("square root of %d is %f" % (i, math.sqrt(i)))
    except:
        print ("cannot calculate square root of %d" % i) 

In [None]:
for i in [2, 3, -4, 6, 'Python', -5, -4]:
    try:
        print ("square root of %d is %f" % (i, math.sqrt(i)))
    except ValueError:
        print ("cannot calculate square root of %d" % i)
    except TypeError:
        print ("the input '%s' is not a number" % i)

In [None]:
for i in [2, 3, -4, 6, 'Python', -5, -4]:
    try:
        print ("square root of %d is %f" % (i, math.sqrt(i)))
    except ValueError:
        print ("cannot calculate square root of %d" % i)
    except TypeError:
        print ("the input '%s' is not a number" % i)
    else:  # only if there is no exception
        print ("simply continuing the iteration")

In [None]:
for i in [2, 3, -4, 6, 'Python', -5, -4]:
    try:
        print ("square root of %d is %f" % (i, math.sqrt(i)))
    except ValueError:
        print ("cannot calculate square root of %d" % i)
    except TypeError:
        print ("the input '%s' is not a number" % i)
    else:  # only if there is no exception
        print ("simply continuing the iteration")
    finally:  # executed in any case
        print ("I am executed no matter what\n")
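The same pattern is often wrapped in a function so that callers receive a value instead of an exception. A minimal sketch, where `safe_sqrt` is a hypothetical helper and returning `None` on failure is just one possible design choice:

```python
import math

def safe_sqrt(x):
    """Return the square root of x, or None if it cannot be computed."""
    try:
        result = math.sqrt(x)
    except ValueError:  # negative number
        return None
    except TypeError:   # input is not a number
        return None
    else:               # runs only if no exception was raised
        return result

values = [safe_sqrt(i) for i in [4, -4, 'Python']]
print(values)
```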

#### List Comprehension

In [None]:
m = [i ** 2 for i in range(5)]  # simple
m

In [None]:
m = []
for i in range(5):
    m.append(i ** 2)

In [None]:
m

In [None]:
n = [[i ** 3 for i in range(j)] for j in range(7)]  # nested

In [None]:
n

#### Dict Comprehension

In [None]:
a = ['a', 'b', 'c']
b = [1, 2, 3]

In [None]:
zip(a, b)

In [None]:
{k: v for k, v in zip(a, b)}

In [None]:
import string

In [None]:
%pprint

In [None]:
zip(string.ascii_lowercase, range(26))

In [None]:
{k: v ** 2 for (k, v) in zip(string.ascii_lowercase, range(26))}
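When no transformation of keys or values is needed, `dict(zip(...))` gives the same result as the comprehension; the comprehension pays off once values are computed on the fly. A short sketch with illustrative names:

```python
keys = ['a', 'b', 'c']
values = [1, 2, 3]

by_comprehension = {k: v for k, v in zip(keys, values)}
by_constructor = dict(zip(keys, values))             # same mapping
squared = {k: v ** 2 for k, v in zip(keys, values)}  # values transformed

print(by_comprehension == by_constructor)
print(squared)
```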

#### Set Comprehension

In [None]:
{i ** 2 for i in xrange(10)}  # set comprehension; equivalent to set(i ** 2 for i in xrange(10))

In [None]:
import random

In [None]:
l = [random.randint(1, 10) for _ in xrange(1000)]

In [None]:
# l

In [None]:
{i ** 2 for i in l}  # set comprehension removes duplicate squares

#### Functions

In [None]:
def f(x):
    return x ** 2
f(25)

In [None]:
results = [f(x) for x in m]
results

In [None]:
def even(x):
    return x % 2 == 0
even(3)

In [None]:
even(20)

#### Look Behind the Scenes

In [None]:
import dis

In [None]:
dis.dis(f)

In [None]:
dis.dis(even)

#### `*args, **kwargs`

In [None]:
# flexible argument handling (iterable as input, e.g. tuple)
def iterable(*args):
    return args[0] + args[1]

In [None]:
iterable(2, 4)

In [None]:
iterable('Welcome to ', 'NYC.')

In [None]:
# flexible argument handling (dict as input)
def keyvalue(**kwargs):
    return kwargs

In [None]:
keyvalue(x=2, y='Python')

In [None]:
# flexible argument handling (dict as input)
def keyvalue(**kwargs):
    return kwargs['x'], kwargs['y']

In [None]:
keyvalue(x=2, y='Python')
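Both mechanisms can be combined in one signature, and the same `*` and `**` operators unpack an existing tuple or dict at the call site. A minimal sketch; `describe`, `positional` and `options` are hypothetical names used for illustration only.

```python
def describe(*args, **kwargs):
    """Return positional arguments as a tuple and keyword arguments as a dict."""
    return args, kwargs

positional = (2, 4)
options = {'x': 2, 'y': 'Python'}

# unpack the tuple and the dict into the call
args, kwargs = describe(*positional, **options)
print(args)
print(kwargs)
```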

#### Interlude: Name Spaces

In [None]:
# globals()  # dictionary

In [None]:
# locals()  # dictionary

In [None]:
list(globals().items())[:2]  # list() makes this work on Python 2.x and 3.x

In [None]:
def f(x, y):
    local = 'A string object.'
    for i in locals().items():
        print(i)
    # print local

In [None]:
f(2, 'Python')

In [None]:
local

#### Generators

In [None]:
def my_range(start, end):
    while start < end:
        yield start
        start += 1

In [None]:
mr = my_range(1, 10)

In [None]:
next(mr)  # built-in next() works on Python 2.6+ and 3.x

In [None]:
next(mr)

In [None]:
dis.dis(my_range)

In [None]:
mr = my_range(0, 10)
for number in mr:
    print(number),

In [None]:
for number in xrange(10):
    print(number),

In [None]:
type(my_range)

In [None]:
mr = my_range(0, 10)
type(mr)
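A generator can be consumed only once. The following sketch repeats `my_range` so that it is self-contained, and shows the built-in `next()` with a default value as well as a generator expression; the variable names are chosen for illustration.

```python
def my_range(start, end):
    while start < end:
        yield start
        start += 1

mr = my_range(0, 3)
first = next(mr)            # fetch one value
rest = list(mr)             # consume the remaining values
exhausted = next(mr, None)  # default is returned instead of raising StopIteration

# generator expression: values are produced lazily and summed on the fly
squares_total = sum(x ** 2 for x in my_range(0, 4))
print(first, rest, exhausted, squares_total)
```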

#### Functional Programming

In [None]:
map(even, range(10))

In [None]:
map(lambda x: x ** 2, range(10))

In [None]:
filter(even, range(15)) 

In [None]:
from functools import reduce  # required on Python 3.x, harmless on 2.x
%time reduce(lambda x, y: x + y, range(100000))

In [None]:
def cumsum(l):
    total = 0
    for elem in l:
        total += elem
    return total
%time cumsum(range(100000))

In [None]:
%time sum(range(100000))

In [None]:
%time sum(xrange(100000))
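On Python 3.x, `map` and `filter` return lazy iterators instead of lists, and `reduce` lives in `functools`. A sketch that behaves the same on both major versions, assuming the `even` function as defined earlier:

```python
from functools import reduce  # built-in on Python 2.x, imported on 3.x

def even(x):
    return x % 2 == 0

# list() materializes the results on Python 3.x
flags = list(map(even, range(4)))
squares = list(map(lambda x: x ** 2, range(5)))
evens = list(filter(even, range(10)))
total = reduce(lambda x, y: x + y, range(100000))

print(flags)
print(squares)
print(evens)
print(total == sum(range(100000)))
```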

## Python Best Practices

A good starting point for Python best practices, with many further links, is [https://gist.github.com/sloria/7001839](https://gist.github.com/sloria/7001839). Do not worry too much about it at the beginning of your Python career, but do not forget to come back to the material later.

### Syntax

The most important guideline for writing Python code is probably **PEP 8** (PEP = Python Enhancement Proposal); cf. [http://www.python.org/dev/peps/pep-0008/](http://www.python.org/dev/peps/pep-0008/).

The easiest way to get used to it is to work with an editor that has built-in syntax and PEP 8 checking, such as Spyder.

### Documentation

Most documentation takes the form of inline comments. Do not overdo it, however.

In [None]:
3 + 4  # this adds 3 + 4

The comment is superfluous &ndash; the code is self-explanatory. 

It is important to use docstrings regularly and correctly.

In [None]:
def f(x):
    ''' Function that returns the square of x.
    
    Parameters
    ==========
    x : float
        input value, real number
    
    Returns
    =======
    f(x) : float
        square of x
        
    Raises
    ======
    TypeError
        if x is not float or int
    '''
    if not isinstance(x, (int, float)):
        raise TypeError('Not the right input type.')
    return x ** 2

In [None]:
f?
# get the docstring as help

In [None]:
f??
# get even the full code

In [None]:
f(10)

In [None]:
f('Test')

### Importing

Avoid using the "star import" and abbreviate library names when appropriate.

Avoid:

In [None]:
from math import *
exp(1)

Do:

In [None]:
import math
math.exp(1)

In [None]:
import pandas as pd

### Testing

Strive for complete test coverage. At the very least, implement unit tests.

In [None]:
import nose.tools as nt

In [None]:
def test_f_calculation():
    ''' Test if it calculates correctly. '''
    nt.assert_equal(f(4), 16)

In [None]:
test_f_calculation()
  # no output = test passes

In [None]:
def test_f_type_error():
    ''' Tests if type error is raised. '''
    nt.assert_raises(TypeError, f, 'test')

In [None]:
test_f_type_error()

In [None]:
def test_f_fail():
    ''' Test if test fails. '''
    nt.assert_equal(f(4), 15)

In [None]:
test_f_fail()
  # intentional fail of test

Or all at once:

In [None]:
test_f_calculation()
test_f_type_error()
# test_f_fail()
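The `nose` package may not be available in every environment, since it is no longer actively maintained. As an alternative sketch, the same two passing tests can be written with `unittest` from the standard library. The class name `TestF` is chosen for illustration, and `f` is repeated in shortened form so that the example is self-contained.

```python
import unittest

def f(x):
    """Return the square of x; raise TypeError for non-numeric input."""
    if not isinstance(x, (int, float)):
        raise TypeError('Not the right input type.')
    return x ** 2

class TestF(unittest.TestCase):
    def test_calculation(self):
        self.assertEqual(f(4), 16)

    def test_type_error(self):
        self.assertRaises(TypeError, f, 'test')

# collect and run the tests programmatically, e.g. inside a notebook
suite = unittest.TestLoader().loadTestsFromTestCase(TestF)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```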

### Version Control

**GitHub** has become a de facto standard platform for version control and collaboration. Alternatively, you can use **Git** in combination with an internally hosted **Git server**.

In [None]:
from IPython.display import Image

In [None]:
Image('http://hilpisch.com/github.png', width="100%")

### Keep it Simple

In addition, a couple of general rules should be followed:

* **avoid duplication**: organize your code to avoid redundancies
* **think of others and the "later you"**: consider yourself 6-18 months from now and ask if you will understand everything then (for sure?)
* **document as much as necessary and as concisely as possible**: look for the right balance
* **do not reinvent the wheel**: Python provides many useful libraries with thousands of valuable functions ...

<a href="http://tpq.io" target="_blank">http://tpq.io</a> | <a href="mailto:yves@tpq.io">yves@tpq.io</a> | <a href="http://twitter.com/dyjh" target="_blank">@dyjh</a> | <a href="http://hilpisch.com" target="_blank">http://hilpisch.com</a> 

**Quant Platform** &mdash; <a href="http://quant-platform.com" target="_blank">http://quant-platform.com</a>

**Python for Finance** &mdash; <a href="http://python-for-finance.com" target="_blank">http://python-for-finance.com</a>

**Derivatives Analytics with Python** &mdash; <a href="http://derivatives-analytics-with-python.com" target="_blank">http://derivatives-analytics-with-python.com</a>

**Python Trainings** &mdash; <a href="http://training.tpq.io" target="_blank">http://training.tpq.io</a>