# Data Science in Python (DSiP) - Programming Basics

by [Michael Granitzer (michael.granitzer@uni-passau.de)](http://www.mendeley.com/profiles/michael-granitzer/) 

   
based on the following sources

* [J.R. Johansson (robert@riken.jp)](http://dml.riken.jp/~rob/), 
  [Scientific Programming in Python](http://github.com/jrjohansson/scientific-python-lectures)

__License__

This work is licensed under a [Creative Commons Attribution 3.0 Unported License](http://creativecommons.org/licenses/by/3.0/)

# Python Files and Modules

## Python Files

__Structure of the Code file__
    
* Code files end with "`.py`":

        myprogram.py

* Every line is a Python statement (or part of one).

* Comment lines start with `#`.

__Run Python Program from command line__

* Execute with the Python interpreter:

        $ python myprogram.py

* Run directly on UNIX systems:
    - Put the path to the interpreter on the first line of the program, written as a __comment__:

        #!/usr/bin/env python


    - Once the executable flag of the file is set (for example with `chmod +x myprogram.py`), we can run the program directly from the shell:

        $ ./myprogram.py
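
As a minimal sketch of such a file, a hypothetical `myprogram.py` might look as follows. The function name `main` and the message are illustrative assumptions, not part of the lecture. The `if __name__ == "__main__":` guard is a common idiom that runs `main()` only when the file is executed as a program, not when it is imported as a module.

```python
#!/usr/bin/env python
"""myprogram.py: a hypothetical, minimal Python program."""


def main():
    # The actual work of the program goes here
    message = "Hello from myprogram.py"
    print(message)
    return message


# True only when the file is run directly, e.g. `python myprogram.py`
if __name__ == "__main__":
    main()
```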


## Python Modules

### Overview
 * **Modules** group related Python functions, classes and variables
 * __Modules are files__ containing Python code (e.g. `myprogram.py`)
 * **Using modules** requires importing them first with the __`import` statement__
 * **Scope:** each module defines its own namespace; after `import module`, its functions are used as `module.function`, or they can be imported directly into the current namespace
 
The Python Standard Library is a large collection of modules that provides *cross-platform* implementations of common facilities such as access to the operating system, file I/O, string management, network communication, and much more.

### References
 
 * The Python Language Reference: http://docs.python.org/2/reference/index.html
 * The Python Standard Library: http://docs.python.org/2/library/

### Module import

For example, to import the module `math`, which contains many standard mathematical functions, we can do:

In [15]:
import math

This imports the **whole module** and makes it available under **the module's namespace**. For example, we can do:

In [16]:
import math

x = math.cos(2 * math.pi)

print(x)

1.0


<img src="files/images/python namespaces.png" width="60%">

<div class="alert alert-warning">
**Note**: Unlike a Java file, a Python file can contain any mix of classes, variables, functions and main code. The filename defines the module name and hence the namespace used to access its functions.
</div>

### Import a Module in current Namespace

* Goal: avoid writing the prefix `module.` by __importing all symbols (functions and variables)__ of one module into the current namespace 

In [17]:
from math import *

x = cos(2 * pi)

print(x)

1.0


* **Caveat:** possible namespace conflicts in large programs
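
A small sketch of such a conflict, using the standard-library modules `math` and `cmath`, which both define a function called `sqrt`. The variable names `real_root` and `complex_root` are chosen here only for illustration. Whichever import runs last silently wins:

```python
import math
import cmath

from math import sqrt
real_root = sqrt(4)
print(real_root, type(real_root))

# Same name again: this silently replaces the sqrt imported above
from cmath import sqrt
complex_root = sqrt(4)
print(complex_root, type(complex_root))

# Qualified names stay unambiguous, whatever the import order
print(math.sqrt(4), cmath.sqrt(4))
```

Using the `module.function` form avoids this kind of surprise at the cost of a little more typing.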

### Import selected Functions into the current Namespace 

Selective import of functions and variables from a module:

In [18]:
from math import cos, pi

x = cos(2 * pi)

print(x)

1.0


### Changing the Name/Namespace of Functions/Modules at Import
The names of variables and functions, and the namespace of a module, can be changed at import using the **`as`** keyword:

In [19]:
from math import cos as c
from math import pi as p
import math as m
x = c(2 * p)
y = m.cos(2 * m.pi)
print(x, y)

1.0 1.0


**Caveat:** Readability may suffer.

### Looking at what a module contains

Once a module is imported, we can list the symbols it provides using the `dir` function:

In [20]:
import math

print(dir(math))

['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'nan', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'tau', 'trunc']


### Getting Help on a Function

Using the function `help`, we can get a description of almost every function. Not all functions have docstrings, as these descriptions are technically called, but the vast majority of functions are documented this way.

In [21]:
help(math.log)

Help on built-in function log in module math:

log(...)
    log(x[, base])
    
    Return the logarithm of x to the given base.
    If the base not specified, returns the natural logarithm (base e) of x.



In [22]:
log(10)

2.302585092994046

In [23]:
math.log(1)

0.0

### Getting Help on a Module

Use the `help` function directly on a module:

    help(math) 

Some very useful modules from the Python standard library are:

 * `os`
 * `sys`
 * `math`
 * `shutil`
 * `re`
 * `subprocess`
 * `multiprocessing`
 * `threading`

Complete lists of the standard modules for Python 2 and Python 3 are available at http://docs.python.org/2/library/ and http://docs.python.org/3/library/, respectively.

### Exercise: Explore the math package

**Setup**

 * Open the Jupyter notebook using the command `jupyter notebook` from the directory where you want to store your exercises
 * Create a new notebook and use a meaningful naming scheme, for example `'Name-coursename-semester'`
 * Create a first markdown block and create a header with title, name, course and semester
 * Work through the IPython notebook tutorial at http://ipython.org/ipython-doc/2/notebook/notebook.html (Yes, that's for the "old" IPython notebook, but the tutorial is better than the port to Jupyter.) 
   - Learn the principles behind the notebook, how it is started, how you edit code and text, and the IPython magic commands
 * Create a first code block and start with the following exercise
 
**Exercise**

Explore the math package and make the following calculations

* `cos(0.7) + sin(0.3)`
* factorial of 20
* round down the following numbers: `1.4`, `3.5`, `4.8`
* check the run-time behaviour of different functions in the `math` package 

In [24]:
import math as m

print(dir(m))
print(help(m.cos))
print(help(m.sin))
print(help(m.factorial))
print(help(m.floor))

print(m.cos(0.7) + m.sin(0.3))
print(m.factorial(20))
print([m.floor(i) for i in [1.4,3.5,4.8]])

%timeit m.factorial(20) 
%timeit m.cos(0.7)

['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'nan', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'tau', 'trunc']
Help on built-in function cos in module math:

cos(...)
    cos(x)
    
    Return the cosine of x (measured in radians).

None
Help on built-in function sin in module math:

sin(...)
    sin(x)
    
    Return the sine of x (measured in radians).

None
Help on built-in function factorial in module math:

factorial(...)
    factorial(x) -> Integral
    
    Find x!. Raise a ValueError if x is negative or non-integral.

None
Help on built-in function floor in module math:

floor(...)
    floor(x)
    
  

# Variables and Types in Python

## Syntax and Naming Convention

 * __Variable names__ contain 
    * alphanumerical characters `a-z`, `A-Z`, `0-9` (but they must not start with a digit)
    * the underscore `_`
   

 * __Convention__ 
     * Variable names start with a lower-case letter.
     * Class names start with a capital letter. 
     * **Public (visible) names** start with a letter. 
     * **Non-public ("hidden") names** start with an underscore `_`; inside a class, names starting with a double underscore `__` are additionally name-mangled by Python.
     

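 * __Python keywords__ are not usable as variable names. The list below is for Python 3 (`async` and `await` are reserved since Python 3.7); Python 2 additionally reserved `exec` and `print`:
 
 
    False, None, True, and, as, assert, async, await, break, class, continue,
    def, del, elif, else, except, finally, for, from, global, if, import, in,
    is, lambda, nonlocal, not, or, pass, raise, return, try, while, with, yield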
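
The keyword list depends on the Python version, so it can be safer to ask the interpreter itself. The sketch below uses the standard-library module `keyword`; the helper `is_valid_name` is a hypothetical function written only for this illustration.

```python
import keyword

# The keywords reserved by the interpreter that runs this code
print(keyword.kwlist)

is_reserved = keyword.iskeyword("lambda")
print(is_reserved)


def is_valid_name(name):
    """Hypothetical helper: True if `name` can be used as a variable name."""
    return name.isidentifier() and not keyword.iskeyword(name)


print(is_valid_name("my_variable"))  # a normal name
print(is_valid_name("2fast"))        # starts with a digit
print(is_valid_name("class"))        # reserved keyword
```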
    

## Assignment


The assignment operator in Python is **`=`**. 

The type is determined **dynamically** on assignment; no explicit declaration is needed.

In [25]:
# variable assignments
x = 1.0
my_variable = 12.2

The type is derived from the value that was assigned (dynamic typing):

In [26]:
type(x)

float

If we assign a new value to a variable, its type can change.

In [27]:
x = 1

In [28]:
type(x)

int

If we try to use a variable that has not yet been defined, we get a `NameError`:

In [29]:
print(z)

NameError: name 'z' is not defined

## Basic Types

In [30]:
# integers
x = 1
type(x)

int

In [31]:
# float
x = 1.0
type(x)

float

In [32]:
# boolean
b1 = True
b2 = False

type(b1)

bool

In [33]:
# complex numbers: note the use of `j` to specify the imaginary part
x = 1.0 - 1.0j
type(x)

complex

In [34]:
print(x)

(1-1j)


In [35]:
print(x.real, x.imag)

1.0 -1.0


## Type utility functions


The module `types` contains a number of type name definitions that can be used to test if variables are of certain types:

In [36]:
import types

# print all types defined in the `types` module
print(dir(types))

['AsyncGeneratorType', 'BuiltinFunctionType', 'BuiltinMethodType', 'CodeType', 'CoroutineType', 'DynamicClassAttribute', 'FrameType', 'FunctionType', 'GeneratorType', 'GetSetDescriptorType', 'LambdaType', 'MappingProxyType', 'MemberDescriptorType', 'MethodType', 'ModuleType', 'SimpleNamespace', 'TracebackType', '_GeneratorWrapper', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_ag', '_calculate_meta', '_collections_abc', '_functools', 'coroutine', 'new_class', 'prepare_class']


In [38]:
x = 1.

# check if the variable x is a float
type(x) is float

True

In [27]:
# check if the variable x is an int
type(x) is int

False

We can also use the built-in `isinstance` function for testing the types of variables:

In [28]:
isinstance(x, float)

True
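
The two checks are not always equivalent. The following sketch (variable names are illustrative) shows that `type(x) is T` demands an exact type match, whereas `isinstance` also accepts subclasses and can test against several types at once:

```python
flag = True

# bool is a subclass of int, so the two checks disagree
exact_match = type(flag) is int          # exact type is bool, not int
subclass_match = isinstance(flag, int)   # accepts subclasses of int
print(exact_match, subclass_match)

# isinstance accepts a tuple of types
value = 1.5
is_number = isinstance(value, (int, float, complex))
print(is_number)
```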

## Type casting

In [29]:
x = 1.5

print(x, type(x))

1.5 <class 'float'>


In [30]:
x = int(x)

print(x, type(x))

1 <class 'int'>


In [43]:
z = complex(x)

print(z, type(z))

(1+0j) <class 'complex'>


In [46]:
float(z.imag)

0.0

#### Complex Numbers

Complex variables cannot be cast to floats or integers. We need to use `z.real` or `z.imag` to extract the part of the complex number we want:

In [30]:
y = bool(z.real)

print(z.real, " -> ", y, type(y))

y = bool(z.imag)

print(z.imag, " -> ", y, type(y))

1.0  ->  True <class 'bool'>
0.0  ->  False <class 'bool'>
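
To make the restriction concrete, here is a small sketch with an example value chosen for illustration. Casting the complex number directly raises a `TypeError`, while its parts and its magnitude are ordinary floats:

```python
z = complex(3, 4)

try:
    float(z)              # not allowed for complex numbers
    cast_failed = False
except TypeError:
    cast_failed = True

real_part = z.real        # real part as a float
magnitude = abs(z)        # absolute value |z| as a float

print(cast_failed, real_part, magnitude)
```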


# Operators and Comparisons

Most operators and comparisons in Python work as one would expect. We will briefly outline the following operators

 * Arithmetic Operators
 * Boolean Operators 
 * Comparison Operators


## Arithmetic Operators


* Arithmetic operators: `+`, `-`, `*`, `/`, `//` (integer division), `**` (power)


In [33]:
1 + 2, 1 - 2, 1 * 2, 1 / 2

(3, -1, 2, 0.5)

In [34]:
1.0 + 2.0, 1.0 - 2.0, 1.0 * 2.0, 1.0 / 2.0

(3.0, -1.0, 2.0, 0.5)

In [49]:
# Integer division of float numbers
3.0 // 2.

1.0

In [36]:
# Note! The power operator in Python isn't ^, but **
2 ** 2

4
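
A few more cases that tend to surprise newcomers, as a brief sketch with illustrative variable names. `^` is a valid operator but means bitwise XOR, and `//` rounds towards negative infinity rather than towards zero:

```python
xor_result = 2 ^ 3        # bitwise XOR of 0b10 and 0b11
power_result = 2 ** 3     # 2 to the power of 3

floor_pos = 7 // 2        # integer division of positive numbers
floor_neg = -7 // 2       # rounds down, towards negative infinity
remainder = 7 % 2         # modulo operator

# divmod returns quotient and remainder in one call
quotient_and_rest = divmod(7, 2)

print(xor_result, power_result, floor_pos, floor_neg, remainder, quotient_and_rest)
```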

## Boolean Operators 



* The boolean operators are spelled out as words `and`, `not`, `or`. 

In [37]:
True and False

False

In [38]:
not False

True

In [39]:
True or False

True

## Comparison Operators



* Comparison operators: `>`, `<`, `>=` (greater or equal), `<=` (less or equal), `==` (equality), `is` (identity).

In [38]:
2 > 1, 2 < 1

(True, False)

In [39]:
2 > 2, 2 < 2

(False, False)

In [40]:
2 >= 2, 2 <= 2

(True, True)

In [41]:
# equality
[1,2] == [1,2]

True

In [54]:
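# objects identical?
l1 = [1, 2]
l2 = [1, 2]
l1 = l2      # l1 and l2 now refer to the same list object
l2[1] = 3    # so this change is also visible through l1
l1 is l2     # True, but not displayed because it is not the last expression
print(l1)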

[1, 3]


In [43]:
[1,2] is [1,2]

False
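
The difference between equality and identity matters most when copying. In this sketch (names are illustrative), plain assignment only creates a second name for the same list, while `list(...)` creates a new, independent list with equal contents:

```python
original = [1, 2]
alias = original           # second name for the same object
copy = list(original)      # new list with equal contents

alias[1] = 3               # modifies the one shared object

print(original)            # changed through the alias
print(copy)                # unaffected
print(alias is original, copy is original)
print(copy == original)
```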

# Basic Data Structures 



Python provides a large set of fast, easy-to-use data structures. In particular:

 * Strings
 * Lists (mutable arrays)
 * Tuples (immutable lists)
 * Dictionaries (maps)
 
In the reference implementation (CPython), the built-in data structures are implemented in C.


## Strings


Strings are the variable type used for storing text as a sequence of characters.

In [57]:
s = "Hello world"
type(s)

str

In [58]:
# length of the string: the number of characters
len(s)

11

In [59]:
# replace a substring in a string with something else
s2 = s.replace("world", "test")
print(s2)

Hello test


We can index a character in a string using `[]`: 

**Note:** Indexing starts at 0!

In [61]:
s[0]

'H'

### Basic Indexing

We can extract a part of a string using the syntax `[start:stop]`, which extracts the characters from index `start` up to, but not including, index `stop`:

In [48]:
s[0:5]

'Hello'

If we omit either (or both) of `start` or `stop` from `[start:stop]`, the default is the beginning and the end of the string, respectively:

In [49]:
s[:5]

'Hello'

In [50]:
s[6:]

'world'

In [51]:
s[:]

'Hello world'

### Advanced Indexing using :
We can also define the step size using the syntax **`[start:stop:step]`** (the default value for `step` is 1, as we saw above):

In [44]:
s[::1]

'Hello world'

In [45]:
s[::2]

'Hlowrd'

This technique is called *slicing*. Read more about the syntax here: http://docs.python.org/release/2.7.3/library/functions.html?highlight=slice#slice
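
Slices also accept negative values, which count from the end of the string. The following sketch reuses the string from above; the other variable names are illustrative:

```python
s = "Hello world"

last_word = s[-5:]         # the last five characters
without_last = s[:-1]      # everything except the last character
reversed_s = s[::-1]       # a negative step walks backwards

# A slice can also be stored in a slice object and reused
first_five = slice(0, 5)
greeting = s[first_five]

print(last_word, without_last, reversed_s, greeting)
```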

Python has a very rich set of functions for text processing. See for example http://docs.python.org/2/library/string.html for more information.

### String formatting examples

In [54]:
print("str1", "str2", "str3")  # The print function joins its arguments with a space

str1 str2 str3


In [55]:
print("str1", 1.0, False, -1j)  # The print function converts all arguments to strings

str1 1.0 False (-0-1j)


In [56]:
print("str1" + "str2" + "str3") # strings added with + are concatenated without space

str1str2str3


In [57]:
print("value = %f" % 1.0)       # we can use C-style string formatting

value = 1.000000


In [58]:
# this formatting creates a string
s2 = "value1 = %.2f. value2 = %d" % (3.1415, 1.5)

print(s2)

value1 = 3.14. value2 = 1


In [59]:
# alternative, more intuitive way of formatting a string 
s3 = 'value1 = {0}, value2 = {1}'.format(3.1415, 1.5)

print(s3)

value1 = 3.1415, value2 = 1.5
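
The `format` method accepts the same kind of format specifications as the C-style syntax, and Python 3.6 and later additionally offer formatted string literals ("f-strings"). This is a short sketch assuming such a Python version; the variable names are illustrative:

```python
value1 = 3.1415
value2 = 1.5

# format specifications follow the colon inside the braces
s4 = "value1 = {0:.2f}, value2 = {1:d}".format(value1, int(value2))
print(s4)

# f-strings (Python 3.6+) evaluate the expressions inside the braces
s5 = f"value1 = {value1:.2f}, value2 = {value2}"
print(s5)
```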


### Question
Print the last character of a given string.

In [60]:
word = "Have fun"

In [61]:
# Write your code here to print the last character


Execute the next cell to see a sample solution

In [46]:
print('''
You can access the last character of a string by using negative indexing. The following instruction prints the last character:

    print(word[-1])


 You can also use the length of the string to access the last element.
 Since indexing starts at 0, the length of the string minus one is the index of the last element,
 which is why len(word)-1 is the index of the last element:
    
    print(word[len(word)-1])''')



You can access the last character of a string by using negative indexing. The following instruction prints the last character:

    print(word[-1])


 You can also use the length of the string to access the last element.
 Since indexing starts at 0, the length of the string minus one is the index of the last element,
 which is why len(word)-1 is the index of the last element:
    
    print(word[len(word)-1])


## List



Lists are very similar to strings, except that __each element can be of any type.__

The syntax for creating lists in Python is `[...]`:

In [63]:
l = [1,2,3,4]

print(type(l))
print(l)

<class 'list'>
[1, 2, 3, 4]


See `help(list)` for more details, or read the online documentation.

### List Indexing 
We can use the __same slicing techniques__ to manipulate lists as we could use on __strings__:

In [64]:
print(l)

print(l[1:3])

print(l[::2])

[1, 2, 3, 4]
[2, 3]
[1, 3]


**Note:** Indexing starts at 0!

In [65]:
l[0]

1

### Heterogeneous Types and Nesting
Elements in a list do not all have to be of the same type:

In [47]:
l = [1, 'a', 1.0, 1-1j]

print(l)

[1, 'a', 1.0, (1-1j)]


Python lists can be inhomogeneous and arbitrarily nested:

In [48]:
nested_list = [1, [2, [3, [4, [5]]]]]

nested_list

[1, [2, [3, [4, [5]]]]]

### Lists and flow control

Lists play a very important role in Python, and are for example used in loops and other flow control structures (discussed below). There are a number of convenient functions for generating lists of various types, for example the `range` function:

In [49]:
start = 10
stop = 30
step = 2

range(start, stop, step)

range(10, 30, 2)

In [50]:
# in Python 3, range returns a lazy range object, which can be converted to a list using 'list(...)'. In Python 2, range already returns a list, so list(...) has no effect
list(range(start, stop, step))

[10, 12, 14, 16, 18, 20, 22, 24, 26, 28]

In [51]:
list(range(-10, 10))

[-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [52]:
s

'Hello world'

In [53]:
# convert a string to a list by type casting:

s2 = list(s)

s2

['H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']

In [54]:
# sorting lists
s2.sort()

print(s2)

[' ', 'H', 'd', 'e', 'l', 'l', 'l', 'o', 'o', 'r', 'w']


### Adding, inserting, modifying, and removing elements from lists

In [55]:
# create a new empty list
l = []

# add elements using `append`
l.append("A")
l.append("d")
l.append("d")

print(l)

['A', 'd', 'd']


We can modify lists by assigning new values to elements in the list. In technical jargon, lists are *mutable*.

In [56]:
l[1] = "p"
l[2] = "p"

print(l)

['A', 'p', 'p']


In [57]:
l[1:3] = ["d", "d"]

print(l)

['A', 'd', 'd']


#### Insert

Insert an element at a specific index using `insert`:

In [58]:
l.insert(0, "i")
l.insert(1, "n")
l.insert(2, "s")
l.insert(3, "e")
l.insert(4, "r")
l.insert(5, "t")

print(l)

['i', 'n', 's', 'e', 'r', 't', 'A', 'd', 'd']


#### Remove
Remove the first element with a specific value using `remove`:

In [59]:
l.remove("A")

print(l)

['i', 'n', 's', 'e', 'r', 't', 'd', 'd']


Remove an element at a specific location using `del`:

In [60]:
del l[7]
del l[6]

print(l)

['i', 'n', 's', 'e', 'r', 't']
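
Two related details, sketched here with a hypothetical list `letters`: `pop` removes an element by position and returns it, and `remove` raises a `ValueError` when the value is not in the list, so it is often guarded with an `in` test.

```python
letters = ["i", "n", "s", "e", "r", "t"]

last = letters.pop()       # removes and returns the last element
first = letters.pop(0)     # removes and returns the element at index 0
print(last, first, letters)

# remove raises ValueError if the value is missing
try:
    letters.remove("z")
    removed = True
except ValueError:
    removed = False
print(removed)

# guard with `in` to avoid the exception
if "s" in letters:
    letters.remove("s")
print(letters)
```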


### Question


Print the length of a given list like [9,5,1,2]

In [61]:
#Write your solution here

Execute the next cell to see a sample solution

In [62]:
print("""print(len([9,5,1,2]))""")

print(len([9,5,1,2]))


Check whether 3 belongs to the list `[6,9,5,3]`

In [65]:
#Write your solution here

False

Execute the next cell to see a sample solution

In [83]:
print("""     3 in [6,9,5,3]

The result of this expression is a boolean """)

     3 in [6,9,5,3]

The result of this expression is a boolean 


## Tuples



Tuples are like lists, except that they cannot be modified once created, that is they are *immutable*. 

In Python, tuples are created using the syntax `(..., ..., ...)`, or even `..., ...`:

In [84]:
point = (10, 20)

print(point, type(point))

(10, 20) <class 'tuple'>


In [85]:
point = 10, 20

print(point, type(point))

(10, 20) <class 'tuple'>


### Unpacking tuples

We can unpack a tuple by assigning it to a comma-separated list of variables:

In [86]:
x, y = point

print("x =", x)
print("y =", y)

x = 10
y = 20


#### Tuples are Immutable

If we try to assign a new value to an element in a tuple we get an error:

In [66]:
point[0] = 20

TypeError: 'tuple' object does not support item assignment
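
A short sketch of how to work with this restriction; the names `moved` and `labels` are illustrative. Assigning to a tuple element raises a `TypeError`, so a changed tuple is built as a new object. Because tuples of immutable values are hashable, they can also serve as dictionary keys, which lists cannot:

```python
point = (10, 20)

# item assignment on a tuple raises TypeError
try:
    point[0] = 20
    error_name = None
except TypeError as err:
    error_name = type(err).__name__
print(error_name)

# build a new tuple instead of modifying the old one
moved = (point[0] + 10, point[1])
print(point, moved)

# tuples can be used as dictionary keys
labels = {(0, 0): "origin", point: "target"}
print(labels[(10, 20)])
```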

## Dictionaries



Dictionaries are also like lists, except that each element is a key-value pair. The syntax for dictionaries is `{key1 : value1, ...}`:

In [67]:
params = {"parameter1" : 1.0,
          "parameter2" : 2.0,
          "parameter3" : 3.0,}

print(type(params))
print(params)

<class 'dict'>
{'parameter1': 1.0, 'parameter3': 3.0, 'parameter2': 2.0}


In [68]:
print("parameter1 = " + str(params["parameter1"]))
print("parameter2 = " + str(params["parameter2"]))
print("parameter3 = " + str(params["parameter3"]))

parameter1 = 1.0
parameter2 = 2.0
parameter3 = 3.0


In [69]:
params["parameter1"] = "A"
params["parameter2"] = "B"

# add a new entry
params["parameter4"] = "D"

print("parameter1 = " + str(params["parameter1"]))
print("parameter2 = " + str(params["parameter2"]))
print("parameter3 = " + str(params["parameter3"]))
print("parameter4 = " + str(params["parameter4"]))

parameter1 = A
parameter2 = B
parameter3 = 3.0
parameter4 = D
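
Accessing a key that does not exist with `params[key]` raises a `KeyError`. The following sketch, using a small illustrative dictionary, shows the usual ways to test for a key, read it with a fallback value, and delete an entry:

```python
params = {"parameter1": 1.0, "parameter2": 2.0}

has_p1 = "parameter1" in params            # membership test on the keys
missing = params.get("parameter9", 0.0)    # fallback value instead of KeyError

del params["parameter2"]                   # remove an entry
keys = sorted(params.keys())

print(has_p1, missing, keys)
```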


### Now it is time to do Exercise 2.1.a.


See the exercise subfolder.
  

# Control Flow

## Conditional statements: if, elif, else



The Python syntax for conditional execution of code uses the keywords `if`, `elif` (else if), and `else`:

In [70]:
statement1 = True
statement2 = False

if statement1:
    print("statement1 is True")

elif statement2:
    print("statement2 is True")
    
else:
    print("statement1 and statement2 are False")

statement1 is True


### Blocks are defined by indentation

Here we encounter, for the first time, a peculiar and unusual aspect of the Python programming language: program blocks are defined by their indentation level. 

Compare to the equivalent C code:

    if (statement1)
    {
        printf("statement1 is True\n");
    }
    else if (statement2)
    {
        printf("statement2 is True\n");
    }
    else
    {
        printf("statement1 and statement2 are False\n");
    }

In C, blocks are defined by the enclosing curly brackets `{` and `}`, and the level of indentation (white space before the code statements) does not matter (it is completely optional). 

In Python, however, the extent of a code block is defined by the indentation level (usually four spaces, or a tab). This means that we have to be careful to indent our code correctly, or else we will get syntax errors. 



### IF Example - Blocks by indentation

In [92]:
statement1 = statement2 = True

if statement1:
    if statement2:
        print("both statement1 and statement2 are True")


both statement1 and statement2 are True


In [93]:
# Bad indentation!
if statement1:
    if statement2:
    print("both statement1 and statement2 are True")  # this line is not properly indented

IndentationError: expected an indented block


In [94]:
statement1 = False 

if statement1:
    print("printed if statement1 is True")
    
    print("still inside the if block")

In [95]:
if statement1:
    print("printed if statement1 is True")
    
print("now outside the if block")

now outside the if block


## Loops



In Python, loops can be programmed in a number of different ways. The most common is the `for` loop, which is used together with iterable objects, such as lists. The basic syntax is:


### `for` loops:

In [96]:
for x in [1,2,3]:
    print(x)
y=[1,2,3]
print(y)

1
2
3
[1, 2, 3]


### Counting `for` loops

The `for` loop iterates over the elements of the supplied list and executes the contained block once for each element. Any iterable object can be used in a `for` loop, not only lists. For example:

In [97]:
for x in range(6): # by default range starts at 0
    print(x)

0
1
2
3
4
5


Note: `range(6)` does not include 6!

In [98]:
for x in range(-3,3):
    print(x)

-3
-2
-1
0
1
2


In [99]:
for word in ["data", "science", "with", "python"]:
    print(word)

data
science
with
python


### Iterating over Dictionaries

To iterate over key-value pairs of a dictionary:

In [71]:
for key, value in params.items():
    print(key + " = " + str(value))

parameter1 = A
parameter3 = 3.0
parameter4 = D
parameter2 = B


Sometimes it is useful to have access to the indices of the values when iterating over a list. We can use the `enumerate` function for this:

In [72]:
for idx, x in enumerate(range(-3,3)):
    print(idx, x)

0 -3
1 -2
2 -1
3 0
4 1
5 2


### List comprehensions: Creating lists using `for` loops

A convenient and compact way to initialize lists:

In [73]:
l1 = [x**2 for x in range(0,5)]

print(l1)

[0, 1, 4, 9, 16]


List comprehensions can be conditional, which makes them very powerful, especially when working with data structures:

In [74]:
l1 = [x**2 for x in range(0,5) if x<3]

print(l1)

[0, 1, 4]
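
To see what a conditional comprehension does, it can help to compare it with the equivalent explicit loop. The sketch below uses illustrative variable names and also shows the analogous dictionary comprehension:

```python
# explicit loop
squares_loop = []
for x in range(0, 5):
    if x < 3:
        squares_loop.append(x**2)

# equivalent list comprehension
squares_comp = [x**2 for x in range(0, 5) if x < 3]

# the same idea builds a dictionary when written with braces and key: value
square_of = {x: x**2 for x in range(0, 5) if x < 3}

print(squares_loop, squares_comp, square_of)
```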


### `while` loops

In [75]:
i = 0

while i < 5:
    print(i)
    
    i = i + 1
    
print("done")

0
1
2
3
4
done


Note that the `print("done")` statement is not part of the `while` loop body because of the difference in indentation.

# Functions



A function in Python is defined using the keyword `def`, followed by a function name, a signature within parentheses `()`, and a colon `:`. The following code, with one additional level of indentation, is the function body.

In [76]:
def func0():   
    print("test")

In [77]:
func0()

test


## Documenting a Function


Optionally, but highly recommended, we can define a so-called "docstring", which is a description of the function's purpose and behavior. The docstring should follow directly after the function definition line, before the code in the function body.

In [78]:
def func1(s):
    """
    Print a string 's' and tell how many characters it has 
    """
    
    print(s + " has " + str(len(s)) + " characters")

In [79]:
help(func1)

Help on function func1 in module __main__:

func1(s)
    Print a string 's' and tell how many characters it has



In [80]:
func1("test")

test has 4 characters


##  Returning values



Functions that return a value use the `return` keyword, which can return __any object, including another function__:

In [81]:
def square(x):
    """
    Return the square of x.
    """
    return x ** 2

In [82]:
square(4)

16
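
Since functions are ordinary objects, a function can also build and return another function. A minimal sketch, with the hypothetical names `make_power` and `cube`:

```python
def make_power(p):
    """
    Return a function that raises its argument to the power p.
    """
    def power(x):
        # p is remembered from the enclosing call to make_power
        return x ** p
    return power


cube = make_power(3)   # cube is itself a function
print(cube(2))
print(make_power(2)(4))
```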

### Returning Multiple Values
We can return multiple values from a function using tuples (see above):

In [83]:
def powers(x):
    """
    Return a few powers of x.
    """
    return x ** 2, x ** 3, x ** 4

In [84]:
h=powers(3)
print(type(h))
print(h)
print(h[0])

<class 'tuple'>
(9, 27, 81)
9


In [85]:
x2, x3, x4 = powers(3)

print(x3)

27


## Default arguments and keyword arguments



In a definition of a function, we can give default values to the arguments the function takes:

In [86]:
def myfunc(x, p=2, debug=False):
    if debug:
        print("evaluating myfunc for x = " + str(x) + " using exponent p = " + str(p))
    return x ** p

If we don't provide a value for the `p` or `debug` argument when calling the function `myfunc`, it defaults to the value provided in the function definition:

In [87]:
myfunc(5)

25

In [88]:
myfunc(5, debug=True)

evaluating myfunc for x = 5 using exponent p = 2


25

### Keyword Arguments
If we explicitly list the names of the arguments in the function call, they do not need to come in the same order as in the function definition.

These are called *keyword* arguments, and they are often very useful in functions that take a lot of optional arguments.

In [89]:
myfunc(p=3, debug=True, x=7)

evaluating myfunc for x = 7 using exponent p = 3


343

## Unnamed functions (lambda function)



In Python we can also create unnamed functions, using the `lambda` keyword:

In [119]:
f1 = lambda x: x**2
    
# is equivalent to 

def f2(x):
    return x**2

In [120]:
f1(2), f2(2)

(4, 4)

### (Lambda) Functions as Argument

This technique is useful for example when we want to pass a simple function as an argument to another function, like this:

In [90]:
# map is a built-in Python function that applies a given function to each item of an iterable and returns the results
# map(function, iterable, ...)
map(lambda x: x**2, range(-3,4))

<map at 0x10836a550>

In [122]:
# in python 3 we can use `list(...)` to convert the iterator to an explicit list
list(map(lambda x: x**2, range(-3,4)))

[9, 4, 1, 0, 1, 4, 9]
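
Other built-ins take a function argument in the same way. As a brief sketch with illustrative variable names, `sorted` accepts a `key` function that computes the value to sort by, and `filter` keeps only the items for which the given function returns a true value:

```python
words = ["data", "science", "with", "python"]

# sort by word length; words of equal length keep their original order
by_length = sorted(words, key=lambda w: len(w))
print(by_length)

# keep only the even numbers
evens = list(filter(lambda n: n % 2 == 0, range(-3, 4)))
print(evens)
```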

**Now it would be a good time to do Exercise 2.1.b**

See the exercise subfolder ([local](exercises/Exercise%20DSiP-2-1-Python%20Standard%20Data%20Structures.ipynb)|[online](http://nbviewer.ipython.org/github/mgrani/LODA-lecture-notes-on-data-analysis/blob/master/I-Data-Science-in-Python/exercises/Exercise%20DSiP-2-1-Python%20Standard%20Data%20Structures.ipynb)).

## Practice

### Exercise 1
Write a Python function that can calculate the length of a given string

In [123]:
def s_length(string):
    # write your code here and then remove the `pass`
    pass

Execute the next cell to test your solution

In [124]:
# the sample solution is defined first, so that it is available even if the test fails
solution = """def s_length(string):
    c = 0
    for char in string:
        c += 1
    return c"""

try:
    assert s_length("python") == 6
    print("success")
except AssertionError:
    print('try again')

try again

Execute the next cell to see the solution

In [125]:
print(solution)

def s_length(string):
    c = 0
    for char in string:
        c += 1
    return c

### Exercise 2
Write a function that returns a single string built from two strings, separated by a space, with the first characters of the two strings swapped.

Sample String : 'abc', 'xyz' 
Expected Result : 'xbc ayz'

In [126]:
def char_mix(a, b):
    # write your code here and then remove the `pass`
    pass

Execute the next cell to test your solution

In [127]:
# the sample solution is defined first, so that it is available even if the test fails
sol = """def char_mix(a, b):
  mix_a = b[:1] + a[1:]
  mix_b = a[:1] + b[1:]
  return mix_a + ' ' + mix_b"""

try:
    assert char_mix("abc","jhg") == "jbc ahg"
    print("success")
except AssertionError:
    print('try again')

try again

Execute the next cell to see the solution

In [128]:
print(sol)

def char_mix(a, b):
  mix_a = b[:1] + a[1:]
  mix_b = a[:1] + b[1:]
  return mix_a + ' ' + mix_b

# Classes



Classes are the key feature of object-oriented programming.

A class is a structure for representing an object and the operations that can be performed on the object. 

A class can contain *attributes* (variables) and *methods* (functions).

A class is defined using the `class` keyword followed by a number of method definitions (a method is a function defined inside a class).

* Each method should have **an argument `self`** as its first argument. This object is a reference to the instance on which the method is called.

* Some class method names have special meaning, for example:

 * `__init__`: The name of the method that is invoked when the object is first created.
 * `__str__` : A method that is invoked when a simple string representation of the class is needed, as for example when printed.
 * There are many more, see http://docs.python.org/2/reference/datamodel.html#special-method-names

In [91]:
class Point:
    """
    Simple class for representing a point in a Cartesian coordinate system.
    """
    
    def __init__(self, x, y):
        """
        Create a new Point at x, y.
        """
        self.x = x
        self.y = y
        
    def translate(self, dx, dy):
        """
        Translate the point by dx and dy in the x and y direction.
        """
        self.x += dx
        self.y += dy
        
    def __str__(self):
        return("Point at [%f, %f]" % (self.x, self.y))

## Instance Creation



To create a new instance of a class:

In [92]:
p1 = Point(0, 0) # this will invoke the __init__ method in the Point class

print(p1)         # this will invoke the __str__ method

Point at [0.000000, 0.000000]


To invoke a method on the class instance `p1`:

In [93]:
p2 = Point(1, 1)

p1.translate(0.25, 1.5)

print(p1)
print(p2)

Point at [0.250000, 1.500000]
Point at [1.000000, 1.000000]


Note that calling a method can modify the state of that particular class instance, but does not affect other class instances or any global variables.

That is one of the nice things about object-oriented design: code such as functions and related variables are grouped in separate and independent entities. 
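
The role of `self` can be made explicit with a small sketch. The class `Counter` below is a hypothetical example written only for this illustration. Calling `a.increment(2)` is equivalent to calling the function on the class and passing the instance as the first argument:

```python
class Counter:
    """
    Hypothetical example class, used only to illustrate `self`.
    """

    def __init__(self, start=0):
        self.count = start

    def increment(self, step=1):
        self.count += step

    def __str__(self):
        return "Counter at %d" % self.count


a = Counter()
b = Counter(10)

a.increment()             # Python passes `a` as `self`
Counter.increment(a, 2)   # the same call with `self` written out

print(a)   # state of a has changed
print(b)   # b is an independent instance and is unchanged
```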

## Practice

### Exercise 1
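Write a class named `Circle` that is constructed from a radius and has two methods, `perimeter` and `area`, which compute the perimeter and the area of the circle. Use 3.14 as the value of pi, since the test below compares against results computed with that value.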

In [2]:
class Circle():
    #write your code here and then remove the `pass`
    pass

Execute the next cell to test your solution

In [3]:
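# the sample solution is defined first, so that it is available even if the test fails
sol1 = """class Circle():
    def __init__(self, r):
        self.radius = r

    def area(self):
        return self.radius**2*3.14
    
    def perimeter(self):
        return 2*self.radius*3.14"""

try:
    New_Circle = Circle(4)
    assert New_Circle.area() == 50.24
    assert New_Circle.perimeter() == 25.12
    print("success")
except (AssertionError, AttributeError, TypeError):
    # an unfinished class raises TypeError or AttributeError instead of AssertionError
    print('try again')

try again

Execute the next cell to see the solution

In [4]:
print(sol1)

class Circle():
    def __init__(self, r):
        self.radius = r

    def area(self):
        return self.radius**2*3.14
    
    def perimeter(self):
        return 2*self.radius*3.14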

# Creating Modules


One of the most important concepts in good programming is to reuse code and avoid repetitions.

The idea is to write functions and classes with a well-defined purpose and scope, and to reuse these instead of repeating similar code in different parts of a program (modular programming). This usually improves the readability and maintainability of a program considerably. In practice, our programs have fewer bugs and are easier to extend, debug, and troubleshoot.

Python supports modular programming at different levels. Functions and classes are examples of tools for low-level modular programming. Python modules are a higher-level modular programming construct, where we can collect related variables, functions and classes in a module. A Python module is defined in a Python file (with the file ending `.py`), and it can be made accessible to other Python modules and programs using the `import` statement.

Consider the following example: the file `mymodule.py` contains simple example implementations of a variable, function and a class:

In [135]:
%%file mymodule.py
"""
Example of a python module. Contains a variable called my_variable,
a function called my_function, and a class called MyClass.
The code is stored in mymodule.py using the %%file magic
"""

my_variable = 0

def my_function():
    """
    Example function
    """
    return my_variable
    
class MyClass:
    """
    Example class.
    """

    def __init__(self):
        self.variable = my_variable
        
    def set_variable(self, new_value):
        """
        Set self.variable to a new value
        """
        self.variable = new_value
        
    def get_variable(self):
        return self.variable

Writing mymodule.py
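
Once the file exists, the module can be imported like any other. The following is a minimal sketch that is not part of the original notebook: so that it runs on its own, it writes a shortened copy of `mymodule.py` into a temporary directory and puts that directory on `sys.path`. In the notebook itself, the file written by `%%file` sits in the working directory, so a plain `import mymodule` should be enough.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Shortened copy of the module source shown above (docstrings omitted).
MODULE_SOURCE = '''
my_variable = 0

def my_function():
    return my_variable

class MyClass:
    def __init__(self):
        self.variable = my_variable

    def set_variable(self, new_value):
        self.variable = new_value

    def get_variable(self):
        return self.variable
'''

# Assumption: a temporary directory stands in for the notebook's working directory.
module_dir = tempfile.mkdtemp()
Path(module_dir, "mymodule.py").write_text(MODULE_SOURCE)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

import mymodule

print(mymodule.my_variable)       # module-level variable
print(mymodule.my_function())     # module-level function

obj = mymodule.MyClass()
obj.set_variable(10)
print(obj.get_variable())         # the instance state changed ...
print(mymodule.my_variable)       # ... but the module variable did not
```

The four `print` calls output 0, 0, 10 and 0: names defined in the module are reached as `mymodule.name`, and changing an instance of `MyClass` leaves the module-level variable untouched.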


# Exceptions



In Python, errors are managed with a special language construct called exceptions. When an error occurs, an exception can be raised, which interrupts the normal program flow and transfers control to the closest enclosing `try`-`except` statement.


##  Generating Exceptions



To generate an exception we can use the `raise` statement, which takes an argument that must be an instance of the class `BaseException` or a class derived from it. 

In [136]:
raise Exception("description of the error")

Exception: description of the error

A typical use of exceptions is to abort functions when some error condition occurs, for example:

    def my_function(arguments):
    
        if not verify(arguments):
            raise Exception("Invalid arguments")
        
        # rest of the code goes here
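
As a concrete, runnable version of this pattern, the sketch below validates its argument before doing any work. The function name `mean` and the choice of `ValueError` (a standard subclass of `Exception`) are illustrative assumptions, not part of the original material. The `try` and `except` statements used to keep the example from aborting are explained in the next section.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if len(values) == 0:
        # Abort early: the code below would otherwise divide by zero
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)

print(mean([1, 2, 3]))

try:
    mean([])
except ValueError as e:
    print("Invalid call:", e)
```

Raising a specific exception class with a descriptive message tells the caller what went wrong, which is usually more helpful than the `ZeroDivisionError` that would otherwise surface further down.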

## Catching Exceptions

To gracefully catch errors that are generated by functions and class methods, or by the Python interpreter itself, use the `try` and `except` statements:

    try:
        # normal code goes here
    except:
        # code for error handling goes here
        # this code is not executed unless the code
        # above generated an error

For example:

In [94]:

try:
    print("test")
    # generate an error: the variable test is not defined
    print(test)
except:
    print("Caught an exception")

test
Caught an exception
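
An exception does not have to be caught in the function where it occurs: it propagates up through the calling functions until it reaches the closest enclosing `try`-`except`. The sketch below illustrates this with two hypothetical helper functions, `parse_number` and `load_values`, which are introduced only for this example.

```python
def parse_number(text):
    # int() raises ValueError for text that is not a valid integer
    return int(text)

def load_values(texts):
    # No try-except here: an error in parse_number passes straight through
    return [parse_number(t) for t in texts]

try:
    values = load_values(["1", "2", "three"])
    print("This line is skipped when an error occurs")
except ValueError:
    values = []
    print("Caught the error two calls up the stack")

print(values)
```

The error is raised inside `parse_number`, yet it is handled at the top level. The rest of the `try` block is skipped, so `values` ends up as the empty list assigned in the handler.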


## Message of an Exception




To get information about the error, we can bind the exception instance that describes it to a name, for example:

    except Exception as e:

In [95]:
try:
    print("test")
    # generate an error: the variable test is not defined
    print(test)
except Exception as e:
    print("Caught an exception:" + str(e))

test
Caught an exception:name 'test' is not defined
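
Catching `Exception` handles almost every error in the same way. When different errors call for different reactions, several `except` clauses naming more specific classes can be listed, and Python runs the first clause that matches. The sketch below shows this with a hypothetical function `describe_division`; `type(e).__name__` gives the name of the exception class.

```python
def describe_division(a, b):
    """Try to compute a / b and describe the outcome as a string."""
    try:
        result = a / b
    except ZeroDivisionError as e:
        return "cannot divide by zero (%s)" % type(e).__name__
    except TypeError as e:
        return "wrong operand types (%s)" % type(e).__name__
    return "result is %s" % result

print(describe_division(6, 3))
print(describe_division(1, 0))
print(describe_division("1", 2))
```

Any exception that matches none of the listed classes is not handled here and continues to propagate to the caller.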


To print the stack trace of an exception, use `traceback.print_exc()`:

In [96]:
import traceback
def tb_test():
    try:
        print("test")
        # generate an error: the variable test is not defined
        print(test)
    except Exception as e:
        print("Caught an exception:" + str(e))
        print("And here comes the Traceback:")
        traceback.print_exc()

tb_test()

test
Caught an exception:name 'test' is not defined
And here comes the Traceback:


Traceback (most recent call last):
  File "<ipython-input-96-1205d0fe34a1>", line 6, in tb_test
    print(test)
NameError: name 'test' is not defined
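
`traceback.print_exc()` writes the trace to standard error. When the text is needed as a string instead, for example to store it in a log file, the standard library also offers `traceback.format_exc()`. A minimal sketch follows; the function name `tb_as_text` and the variable name `undefined_variable` are assumptions made for this example.

```python
import traceback

def tb_as_text():
    try:
        # generate an error: undefined_variable is not defined anywhere
        print(undefined_variable)
    except Exception:
        # format_exc() returns the text that print_exc() would print
        return traceback.format_exc()

trace = tb_as_text()
print(trace.splitlines()[0])    # header line of the traceback
print(trace.splitlines()[-1])   # last line: exception class and message
```

Because the trace is an ordinary string, it can be searched, shortened, or written to a file like any other text.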


# Further reading



* http://www.python.org - The official web page of the Python programming language.
* http://www.python.org/dev/peps/pep-0008 - Style guide for Python programming. Highly recommended. 
* http://www.greenteapress.com/thinkpython/ - A free book on Python programming.
* [Python Essential Reference](http://www.amazon.com/Python-Essential-Reference-4th-Edition/dp/0672329786) - A good reference book on Python programming.

### Versions

In [98]:
import sys
import IPython

In [99]:
print("This notebook was evaluated with: Python %s and IPython %s." % (sys.version, IPython.__version__))

This notebook was evaluated with: Python 3.5.2 |Anaconda custom (x86_64)| (default, Jul  2 2016, 17:52:12) 
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] and IPython 5.1.0.
