# Introduction: python

*Davide Gerosa (Milano-Bicocca)*

**Sources**: Michael Zingale at Stony Brook University: https://sbu-python-class.github.io

## Why python?

* Python is a very high-level language

  * it provides many complex data-structures (lists, dictionaries, ...)

  * your code is shorter than a comparable algorithm in a compiled language

* Many powerful libraries to perform complex tasks

  * parse structured input files

  * send e-mail

  * interact with the operating system

  * make plots

  * make GUIs

  * do scientific computations

  * ...

* Python makes it easy to prototype new tools

* Python is cross-platform and Free

## Language Features

Some of the language features are:

* Dynamic typing

* Object-oriented foundation

* Extensible (easy to call Fortran, C/C++, ...)

* Automatic memory management (garbage collection)

* Ease of readability (whitespace matters)


## Scientific python

Perhaps most importantly, and why we are here:

> Python has been widely adopted in the scientific community.



## Questions


- What's your research about? What are you most/least interested in for this class?

- Have you ever used python?
    1. Never ever ever
    2. Seen it but never used it for research
    3. Used it but only some basic operations
    4. I'm up to speed
    5. I'm a Python developer already

I designed this class both to provide an introduction for people who have never used python and to give solid foundations to those who have used it already (this also applies to me: preparing this class definitely improved my python skills!).

### Some of my coding projects

I used python and/or git for 

- Black-hole spins: https://github.com/dgerosa/precession
- Gravitational waves detectability with machine learning: https://github.com/dgerosa/pdetclassifier 
- Write papers: https://github.com/dgerosa/writeapaper
- Handle latex bibliographies: https://github.com/dgerosa/filltex
- Keep my CV automatically updated:  https://github.com/dgerosa/CV
- Teaching: https://github.com/dgerosa/astrostatistics_bicocca_2023 


Ok, let's go...

## Getting python

You will want to install python and the associated libraries on your
laptop that you can bring to classes. Hopefully you have done this already.

On Linux machines, you probably already have python and you can get
the needed libraries through your system package manager.

For Windows, I recommend the free Anaconda distribution:

https://www.anaconda.com/products/individual

For Mac, I recommend using Homebrew (not just for python, it's a general package manager for Mac)

https://brew.sh/

If you have python successfully installed, you should be able to start
the python interpreter at the command line as: `python`.  A shell will
come up, and you can try out your first program:

```
print("hello, world")
```

You will also need to be able to install packages. So try

```
pip install jupyter
```

and see if that works. You might need to add the `--user` option, depending on what privileges you have on your laptop and your python installation. 


## Following along in class

All of the class notes are hosted as Jupyter notebooks on github:

https://github.com/dgerosa/scientificcomputing_bicocca_2023

If you know some git already, go ahead and fork + clone the repository. If "fork" still makes you think of lunch instead of computers no worries, we'll talk about git later on. For now just use the green download button for dummies.

# Jupyter

We'll be using Jupyter for all of our examples -- this allows us to run python in a web-based notebook, keeping a history of input and output, along with text and images.

For Jupyter help, visit:
https://jupyter.readthedocs.io/en/latest/content-quickstart.html

We interact with python by typing into _cells_ in the notebook.  By default, a cell is a _code_ cell, which means that you can enter any valid python code into it and run it.  Another important type of cell is a _markdown_ cell.  This lets you put text, with different formatting (italics, bold, etc) that describes what the notebook is doing.

You can change the cell type via the menu at the top, or using the shortcuts:

  * ctrl-m m : markdown cell
  * ctrl-m y : code cell

Some useful short-cuts:

 * shift+enter = run cell and jump to the next (creating a new cell if there is none below)
 * ctrl+enter = run cell in place
 * alt+enter = run cell and insert a new one below

ctrl+m h lists other commands

A "markdown cell" enables you to typeset LaTeX equations right in your notebook.  Just put them in `$` or `$$`:

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho U) = 0$$

<div class="alert alert-block alert-danger">
    
**Important**: when you work through a notebook, everything you did in previous cells is still in memory and _known_ by python, so you can refer to functions and variables that were previously defined.  Even if you go up to the top of a notebook and insert a cell, everything defined earlier in your notebook session is still available -- what matters is the order in which cells were run, not where they sit in the notebook.  If you want to reset things, you can use the options under the _Kernel_ menu.
</div>

<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:

Create a new cell below this one.  Make sure that it is a _code_ cell, and enter the following code and run it:
    
```

 print("Hello, World")
 
```
</div>

`print()` is a _function_ in python that takes arguments (in the `()`) and outputs to the screen.  You can print multiple quantities at once like:

In [2]:
print(1, 2, 3)

1 2 3


In [3]:
1

1

Note that the default behavior in Jupyter is to print the return value from the last statement in a cell, so we don't need to `print` if we just want the value of something like:

In [4]:
a = 10
a

10

# Basic Python Datatypes

Python is a dynamically typed language -- this means that you don't
need to specify ahead of time what kind of data you are going to store
in a variable.  Nevertheless, there are some core datatypes that we
need to become familiar with as we use the language.

The first set of datatypes are similar to those found in other
languages (like C/C++ and Fortran): floating point numbers, integers,
and strings.

Floating point is essential for computational science.  A great
introduction to floating point and its limitations is: [What every
computer scientist should know about floating-point
arithmetic](http://dl.acm.org/citation.cfm?id=103163) by
D. Goldberg.

The next set of datatypes are containers.  In python, unlike some
languages, these are built into the language and make it very easy to
do complex operations.  We'll look at these later.


Some examples come from the python tutorial:
http://docs.python.org/3/tutorial/

## Integers

Integers are numbers without a decimal point.  They can be positive or negative.  Most programming languages use a fixed amount of memory to store a single integer, but python will expand the amount of memory as necessary to store large integers.

The basic operators, `+`, `-`, `*`, and `/` work with integers

In [5]:
2+2+3

7

In [3]:
2*-4

-8

Note: integer division is one place where python 2 and python 3 differ.

In python 3.x, dividing 2 integers results in a float.  In python 2.x, dividing 2 integers results in an integer.  The latter is consistent with many statically typed programming languages (like Fortran or C), where the result has the same datatype as the inputs, but the former is more in line with our expectations.

In [8]:
3/2

1.5

To get an integer result, we can use the `//` (floor division) operator.

In [7]:
3//2

1
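As a small illustrative sketch (not part of the original notes), the cell below shows how `//` behaves for negative numbers: it rounds toward negative infinity rather than toward zero. The variable names are just placeholders.

```python
# Floor division rounds down (toward negative infinity), not toward zero
pos = 7 // 2           # 3
neg = -7 // 2          # -4, because floor(-3.5) is -4
trunc = int(-7 / 2)    # -3, int() truncates toward zero instead

# divmod() returns the floor quotient and the remainder together
q, r = divmod(-7, 2)   # q = -4, r = 1, so that q*2 + r == -7

print(pos, neg, trunc, q, r)
```

The remainder operator `%` is defined consistently with `//`, so `(a // b) * b + a % b` always gives back `a`.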

Python is a _dynamically-typed language_&mdash;this means that we do not need to declare the datatype of a variable before initializing it.  

Here we'll create a variable (think of it as a descriptive label that can refer to some piece of data).  The `=` operator assigns a value to a variable.  

In [6]:
a = 1
b = 2

Operators and functions act on variables and return a result.  Here, Jupyter displays the value of each expression; calling `print()` would output the same values to the screen.

In [7]:
a + b

3

In [8]:
a * b

2

Note that variable names are case sensitive, so `a` and `A` are different

In [9]:
A = 2048

In [10]:
print(a, A)

1 2048


Here we initialize 3 variables all to `0`.  Assigning a new value to one of them later does not affect the others.

In [9]:
x = y = z = 0

In [10]:
print(x, y, z)

0 0 0


In [11]:
z = 1

In [12]:
z

1

Python has some built in help (and Jupyter/ipython has even more)

try doing:
```
help(x)
```

alternatively, try:
```
x?
```

(this only works in Jupyter/IPython)

In [13]:
help(z)

Help on int object:

class int(object)
 |  int([x]) -> integer
 |  int(x, base=10) -> integer
 |  
 |  Convert a number or string to an integer, or return 0 if no arguments
 |  are given.  If x is a number, return x.__int__().  For floating point
 |  numbers, this truncates towards zero.
 |  
 |  If x is not a number or if base is given, then x must be a string,
 |  bytes, or bytearray instance representing an integer literal in the
 |  given base.  The literal can be preceded by '+' or '-' and be surrounded
 |  by whitespace.  The base defaults to 10.  Valid bases are 0 and 2-36.
 |  Base 0 means to interpret the base from the string as an integer literal.
 |  >>> int('0b100', base=0)
 |  4
 |  
 |  Built-in subclasses:
 |      bool
 |  
 |  Methods defined here:
 |  
 |  __abs__(self, /)
 |      abs(self)
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __and__(self, value, /)
 |      Return self&value.
 |  
 |  __bool__(self, /)
 |      True if self else False
 |

Another function, `type()` returns the data type of a variable

In [14]:
type(x)

int

Note that in languages like Fortran and C, you specify the amount of memory an integer can take (usually 2 or 4 bytes).  This puts a restriction on the largest integer that can be represented.  Python will adapt the size of the integer so you don't *overflow*

In [15]:
a = 12345678901234567890123456789012345123456789012345678901234567890
print(a)
print(a.bit_length())
print(type(a))

12345678901234567890123456789012345123456789012345678901234567890
213
<class 'int'>


## Floating point

When operating with both floating point and integers, the result is promoted to a float.

In [17]:
1. + 2

3.0

But note that the floor division operator `//` still rounds down, even though the result is a float

In [18]:
1.//2

0.0

It is important to understand that there are infinitely many real numbers between any two bounds, but a computer can only store a finite set of them, so most real numbers have to be approximated.  There is an IEEE standard for floating point (IEEE 754) that pretty much all languages and processors follow.  

This means two things

* not every real number will have an exact representation in floating point
* there is a finite precision to numbers -- below this we lose track of differences (this is usually called *roundoff* error)

This paper is an amazing reference to understand how a computer stores numbers:

[Goldberg 1991: What every computer scientist should know about floating-point arithmetic](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html)

Consider the following expression, for example:

In [19]:
0.3/0.1 - 3

-4.440892098500626e-16

In [16]:
0.3/0.1 == 3

False
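Because of roundoff, floating point numbers are usually compared with a tolerance rather than with `==`. The sketch below uses `math.isclose()` from the standard library; the absolute tolerance `1e-12` in the hand-written comparison is an arbitrary choice for illustration.

```python
import math

x = 0.3 / 0.1

exact = (x == 3)               # False: x is slightly smaller than 3
close = math.isclose(x, 3)     # True with the default relative tolerance of 1e-9
by_hand = abs(x - 3) < 1e-12   # True: explicit absolute tolerance (arbitrary value)

print(exact, close, by_hand)
```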

Here's another example: The number 0.1 cannot be exactly represented in binary floating point.  In our print, we use a format specifier (the stuff inside of the `{}`) to ask for more precision to be shown:

In [20]:
a = 0.1
print("{:30.20}".format(a))

        0.10000000000000000555


We can ask python to report the limits on floating point

In [21]:
import sys
sys.float_info

sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)

Note that this says that the (normalized) floating point numbers we can store have magnitudes between 2.2250738585072014e-308 and 1.7976931348623157e+308
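To get a feeling for these limits, here is a minimal sketch of what happens when a result falls outside the representable range. The variable names and the scaling factors are arbitrary choices for illustration.

```python
import sys

largest = sys.float_info.max
overflow = largest * 10          # too large to represent: becomes inf

smallest = sys.float_info.min
subnormal = smallest / 2         # still nonzero, but stored with reduced precision
underflow = smallest * 1e-20     # too small even for that: becomes 0.0

print(overflow, subnormal, underflow)
```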

We also see that the precision is 2.220446049250313e-16 (this is commonly called _machine epsilon_).  To see this, consider adding a small number to 1.0.  We'll use the equality operator (`==`) to test if two numbers are equal:

<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:

Define two variables, $a = 1$, and $e = 10^{-16}$.

Now define a third variable, `b = a + e`

We can use the python `==` operator to test for equality.  What do you expect `b == a` to return? Run it and see if it agrees with your guess.
</div>

## Modules

The core python language is rather small; most functionality lives in modules.  Some modules are part of the standard library, which ships with python, and we can _import_ them into our python session (or program).

The `math` module provides functions that do the basic mathematical operations as well as provide constants (note there is a separate `cmath` module for complex numbers).

In python, you `import` a module.  The functions are then defined in a separate _namespace_&mdash;this is a separate region that defines names and variables, etc.  A variable in one namespace can have the same name as a variable in a different namespace, and they don't clash.  You use the "`.`" operator to access a member of a namespace.

By default, when you type stuff into the python interpreter or here in the Jupyter notebook, or in a script, it is in its own default namespace, and you don't need to prefix any of the variables with a namespace indicator.

In [17]:
import math

`math` provides the value of pi

In [18]:
math.pi

3.141592653589793

This is distinct from any variable `pi` we might define here

In [20]:
pi = 3

In [21]:
print(pi, math.pi)

3 3.141592653589793


Note here that `pi` and `math.pi` are distinct from one another&mdash;they are in different namespaces.
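There are a few common variants of the `import` statement. The short sketch below shows an alias (`m` is an arbitrary name chosen here) and a selective import; it also repeats the local `pi = 3` from above to stress that the module's value is untouched.

```python
import math
import math as m          # same module, shorter prefix
from math import sqrt     # brings only `sqrt` into the current namespace

pi = 3                    # our own variable, unrelated to math.pi

root = sqrt(16.0)
print(root)               # 4.0
print(m.pi == math.pi)    # True: both names refer to the same module
print(pi)                 # 3
```

Keeping the module prefix (`math.` or `m.`) makes it obvious where a name comes from, which is why `from module import *` is generally discouraged.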

### Floating point operations

The same operators, `+`, `-`, `*`, `/` work as usual for floating point numbers.  To raise a number to a power, we use the `**` operator (this is the same as Fortran)

In [22]:
R = 2.0

In [23]:
math.pi * R**2

12.566370614359172

Operator precedence follows that of most languages.  See

https://docs.python.org/3/reference/expressions.html#operator-precedence
    
in order of precedence:
* quantities in `()`
* slicing, calls, subscripts
* exponentiation (`**`)
* `+x`, `-x`, `~x`
* `*`, `@`, `/`, `//`, `%`
* `+`, `-`

(after this are bitwise operations and comparisons)

Parentheses can be used to override the precedence.

<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:

Consider the following expressions.  Using the ideas of precedence, think about what value will result, then try it out in the cell below to see if you were right.

  * `1 + 3*2**2`
  * `1 + (3*2)**2`
  * `2**3**2`

</div>

The math module provides a lot of the standard math functions. Most are actually repeated in `numpy`, so in practice I personally almost never use `math`.

For the trig functions, the expectation is that the argument to the function is in radians&mdash;you can use `math.radians()` to convert from degrees to radians, ex:

In [24]:
math.cos(math.radians(45))

0.7071067811865476

Notice that in that statement we are feeding the output of one function (`math.radians()`) into a second function, `math.cos()`

When in doubt, ask for help to discover all of the things a module provides:

In [25]:
help(math.sin)

Help on built-in function sin in module math:

sin(x, /)
    Return the sine of x (measured in radians).
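`help(math)` prints the documentation for the whole module, which can be long. As a lighter alternative, the built-in `dir()` function lists the names a module defines. The sketch below filters out the names starting with an underscore; `public_names` is just a placeholder name.

```python
import math

# dir() returns the names defined in the module as a list of strings
public_names = [name for name in dir(math) if not name.startswith("_")]

print(len(public_names))
print(public_names[:5])
```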



## Complex numbers

python uses '`j`' to denote the imaginary unit

In [32]:
1.0 + 2j

(1+2j)

In [26]:
a = 1j
b = 3.0 + 2.0j
print(a + b)
print(a * b)

(3+3j)
(-2+3j)


we can use `abs()` to get the magnitude and separately get the real or imaginary parts 

In [27]:
print("magnitude: ", abs(b))
print("real part: ", a.real)
print("imag part: ", a.imag)

magnitude:  3.605551275463989
real part:  0.0
imag part:  1.0
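The functions in `math` only accept real numbers; for complex arguments the standard library provides the `cmath` module mentioned earlier. A minimal sketch, with `z` chosen arbitrarily:

```python
import cmath

z = 3.0 + 4.0j

magnitude = abs(z)           # 5.0
conj = z.conjugate()         # (3-4j)
root = cmath.sqrt(-1)        # 1j, whereas math.sqrt(-1) raises a ValueError
angle = cmath.phase(1j)      # argument of the complex number, here pi/2

print(magnitude, conj, root, angle)
```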


## Strings

Python doesn't care if you use single or double quotes for strings:

In [29]:
a = "this is my string"
b = 'another string'

In [30]:
print(a)
print(b)

this is my string
another string


Many of the usual mathematical operators are defined for strings as well.  For example to concatenate or duplicate:

In [31]:
a + b

'this is my stringanother string'

In [32]:
a + ". " + b

'this is my string. another string'

In [33]:
a * 2

'this is my stringthis is my string'

There are several escape codes that are interpreted in strings.  These start with a backslash, `\`.  E.g., you can use `\n` for new line

In [34]:
a = a + "\n" + "hello"
print(a)

this is my string
hello


<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:
    
The `input()` function can be used to ask the user for input.

  * Use `help(input)` to see how it works.  
  * Write code to ask for input and store the result in a variable.  `input()` will return a string.

  * Use the `float()` function to convert a number entered as input to a floating point variable.  
  * Check to see if the conversion worked using the `type()` function.
</div>

Triple quotes """ can enclose multiline strings.  This is useful for docstrings at the start of functions (more on that later...)

In [41]:
c = """
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor 
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis 
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. 
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore 
eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt 
in culpa qui officia deserunt mollit anim id est laborum."""

In [42]:
print(c)


Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor 
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis 
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. 
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore 
eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt 
in culpa qui officia deserunt mollit anim id est laborum.


A raw string does not replace escape sequences (like `\n`).  Just put an `r` before the first quote:

In [35]:
d = r"this is a raw string\n hello"
d

'this is a raw string\\n hello'

Slicing is used to access a portion of a string.

Slicing a string can seem a bit counterintuitive if you are coming from C or Fortran.  The trick is to think of the index as representing the left edge of a character in the string.  When we do arrays later, the same will apply.

Also note that python (like C) uses 0-based indexing

Negative indices count from the right.

In [40]:
a[0:5:2]

'ti '

In [44]:
a = "this is my string"
print(a)
print(a[5:7])
print(a[0])
print(d)
print(d[-2])

this is my string
is
t
this is a raw string\n hello
l
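The general form of a slice is `s[start:stop:step]`: it includes the character at `start`, excludes the one at `stop`, and any of the three parts can be omitted. Here is a short sketch on an arbitrary example string:

```python
s = "python"

first_two = s[0:2]     # 'py'     characters 0 and 1 (stop is excluded)
from_two = s[2:]       # 'thon'   from index 2 to the end
last_three = s[-3:]    # 'hon'    negative indices count from the right
every_other = s[::2]   # 'pto'    step of 2 over the whole string
reverse = s[::-1]      # 'nohtyp' negative step walks backwards

print(first_two, from_two, last_three, every_other, reverse)
```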


<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:

Strings have a lot of _methods_ (functions that know how to work with a particular datatype, in this case strings).  A useful method is `.find()`.  For a string `a`,
`a.find(s)` will return the index of the first occurrence of `s`.

For our string `c` above, find the first `.` (identifying the first full sentence), and print out just the first sentence in `c` using this result

</div>

There are also a number of methods and functions that work with strings.  Here are some examples:

In [41]:
print(a.replace("this", "that"))
print(len(a))
print(a.strip())    # strip() removes leading and trailing whitespace
print(a.strip()[-1])

that is my string
17
this is my string
g


Note that our original string, `a`, has not changed.  In python, strings are *immutable*.  Operations on strings return a new string.

In [46]:
a

'this is my string'

In [42]:
type(a)

str

As usual, ask for help to learn more:

In [48]:
#help(str)

We can format strings when we are printing to insert quantities in particular places in the string.  A `{}` serves as a placeholder for a quantity and is replaced using the `.format()` method:

In [49]:
a = 1
b = 2.0
c = "test"
print("a = {}; b = {}; c = {}".format(a, b, c))

a = 1; b = 2.0; c = test


But the more modern way to do this is to use *f-strings*

In [50]:
print(f"a = {a}; b = {b}; c = {c}")

a = 1; b = 2.0; c = test


Note the `f` preceding the starting `"`
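Format specifiers, like the one used earlier to print 0.1 with more digits, also work inside f-strings: they go after a `:` within the braces. A minimal sketch with arbitrarily chosen values:

```python
import math

x = math.pi
n = 42

fixed = f"{x:.3f}"       # 3 digits after the decimal point
sci = f"{x:10.2e}"       # scientific notation, padded to a width of 10
padded = f"{n:05d}"      # integer padded with leading zeros to 5 characters

print(fixed)
print(sci)
print(padded)
```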

# Advanced Datatypes

These notes follow the official python tutorial pretty closely: http://docs.python.org/3/tutorial/

## Lists

Lists group together data.  Many languages have arrays (we'll look at those in a bit in python).  But unlike arrays in most languages, lists can hold data of all different types -- they don't need to be homogeneous.  The data can be a mix of integers, floating point or complex numbers, strings, or other objects (including other lists).

A list is defined using square brackets:

In [51]:
a = [1, 2.0, "my list", 4]

In [52]:
a

[1, 2.0, 'my list', 4]

We can index a list to get a single element -- remember that python starts counting at 0:

In [53]:
a[2]

'my list'

In [54]:
a+["hello"]

[1, 2.0, 'my list', 4, 'hello']

Like with strings, mathematical operators are defined on lists:

In [55]:
a*2

[1, 2.0, 'my list', 4, 1, 2.0, 'my list', 4]

The `len()` function returns the length of a list

In [56]:
len(a)

4

Unlike strings, lists are _mutable_ -- you can change elements in a list easily

In [57]:
print(a)
a[1] = -2.0
print(a)

[1, 2.0, 'my list', 4]
[1, -2.0, 'my list', 4]


In [58]:
a

[1, -2.0, 'my list', 4]

In [7]:
a[0:1] = [-1, -2.1]   # this will put two items in the spot where 1 existed before
a

[-1, -2.1, -2.0, 'my list', 4]

Note that lists can even contain other lists:

In [59]:
a[1] = ["other list", 3]
a

[-1, ['other list', 3], -2.0, 'my list', 4]

Just like everything else in python, a list is an object that is the instance of a class.  Classes have methods (functions) that know how to operate on an object of that class.

There are lots of methods that work on lists.  Two of the most useful are `append()`, to add to the end of a list, and `pop()`, to remove the last element:

In [60]:
a.append(6)
a

[-1, ['other list', 3], -2.0, 'my list', 4, 6]

In [61]:
a.pop()

6

In [63]:
a

[-1, ['other list', 3], -2.0, 'my list', 4]

<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:

An operation we'll see a lot is to begin with an empty list and add elements to it.  An empty list is created as:
```

 a = []
 
```

  * Create an empty list
  * Append the integers 1 through 10 to it.  
  * Now pop them out of the list one by one.
  
</div>

### Copying lists

Copying may seem a little counterintuitive at first.  The best way to think about this is that your list lives in memory somewhere and when you do 

```
a = [1, 2, 3, 4]
```

then the variable `a` is set to point to that location in memory, so it refers to the list.

If we then do
```
b = a
```
then `b` will also point to that same location in memory -- the exact same list object.

Since these are both pointing to the same location in memory, if we change the list through `a`, the change is reflected in `b` as well:

In [65]:
a = [1, 2, 3, 4]
b = a  # both a and b refer to the same list object in memory
print(a)
a[0] = "changed"
print(b)

[1, 2, 3, 4]
['changed', 2, 3, 4]


If you want to create a new object in memory that is a copy of another, then you can either index the list, using `:` to get all the elements, or use the `list()` function:

In [66]:
c = list(a)   # you can also do c = a[:], which basically slices the entire list
a[1] = "two"
print(a)
print(c)

['changed', 'two', 3, 4]
['changed', 2, 3, 4]


Things get a little complicated when a list contains another mutable object, like another list.  Then the copy we looked at above is only a _shallow copy_.

When in doubt, use the `id()` function to figure out where in memory an object lies (you shouldn't worry about what the numbers you get from `id` mean, just whether they are the same as those for another object)

In [67]:
print(id(a), id(b), id(c))

4372641728 4372641728 4372278592


Or use the `is` operator

In [65]:
a is b

True

In [66]:
a is c

False
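Coming back to the shallow copy caveat mentioned above, the sketch below shows the difference when a list contains another list. It uses `copy.deepcopy()` from the standard library; the variable names are placeholders.

```python
import copy

nested = [1, [2, 3]]

shallow = list(nested)          # new outer list, but the inner list is shared
deep = copy.deepcopy(nested)    # inner objects are copied recursively

nested[1].append(4)             # modify the inner list through the original

print(shallow)                  # [1, [2, 3, 4]]  the change is visible
print(deep)                     # [1, [2, 3]]     fully independent copy
print(shallow[1] is nested[1])  # True: same inner object
```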

There are lots of other methods that work on lists (remember, ask for help)

In [67]:
my_list = [10, -1, 5, 24, 2, -1, 9]
my_list.sort()
my_list

[-1, -1, 2, 5, 9, 10, 24]

In [68]:
my_list.count(-1)

2

We can also insert elements

In [69]:
a.insert(3, "my inserted element")
a

['changed', 'two', 3, 'my inserted element', 4]

Joining two lists is simple.  Like with strings, the `+` operator concatenates:

In [68]:
b = [1, 2, 3]
c = [4, 5, 6]
d = b + c
print(d)

[1, 2, 3, 4, 5, 6]


## Dictionaries

A dictionary stores data as `key:value` pairs.  Unlike a list, where you access an element by its position, a dictionary lets you look up a value directly by its key:

In [70]:
my_dict = {"key1":1, "key2":2, "key3":3}

In [71]:
my_dict["key1"]

1

You can add a new `key:value` easily, and it can be of any type

In [72]:
my_dict["newkey"] = "new"
my_dict

{'key1': 1, 'key2': 2, 'key3': 3, 'newkey': 'new'}

You can also easily get the list of keys that are defined in a dictionary

In [73]:
keys = list(my_dict.keys())
keys

['key1', 'key2', 'key3', 'newkey']

and check easily whether a key exists in the dictionary using the `in` operator

In [74]:
print("key1" in keys)
print("invalidKey" in keys)

True
False
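The `in` operator also works on the dictionary itself, so building the list of keys first is not required. The sketch below additionally uses the `.get()` method, which returns a default value (here an arbitrary `0`) instead of raising a `KeyError` when the key is missing.

```python
my_dict = {"key1": 1, "key2": 2, "key3": 3}

has_key = "key1" in my_dict              # True: `in` tests the keys directly
missing = "invalidKey" in my_dict        # False
fallback = my_dict.get("invalidKey", 0)  # 0: default returned for a missing key

print(has_key, missing, fallback)
```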


<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:

Create a dictionary where the keys are the string names of the numbers zero to nine and the values are their numeric representation (0, 1, ... , 9)

</div>

## List Comprehensions

List comprehensions provide a compact way to initialize lists.  Some examples from the tutorial

In [78]:
squares = [x**2 for x in range(10)]

In [79]:
squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Here we use another python type, the tuple, to combine numbers from two lists into a pair

In [28]:
[(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]

[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
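It may help to see this comprehension unrolled into explicit loops (loops and `if` tests are covered in the Control Flow part of these notes). The sketch below should build the same list; `pairs` is just a placeholder name.

```python
pairs = []
for x in [1, 2, 3]:          # the first `for` in the comprehension is the outer loop
    for y in [3, 1, 4]:      # the second `for` is the inner loop
        if x != y:           # the trailing `if` filters the combinations
            pairs.append((x, y))

print(pairs)
```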

<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:

Use a list comprehension to create a new list from `squares` containing only the even numbers.  It might be helpful to use the modulus operator, `%`

</div>

## Tuples

Tuples are immutable -- they cannot be changed, but they are useful for organizing data in some situations.  We use () to indicate a tuple:

In [75]:
a = (1, 2, 3, 4)
a

(1, 2, 3, 4)

We can unpack a tuple:

In [76]:
w, x, y, z = a

In [77]:
w

1

Since a tuple is immutable, we cannot change an element:

In [78]:
a[0] = 2

TypeError: 'tuple' object does not support item assignment

But we can turn it into a list, and then we can change it

In [79]:
z = list(a)

In [80]:
z[0] = "new"

In [81]:
z

['new', 2, 3, 4]

It is often not clear how tuples differ from lists.  The most obvious way is that they are immutable.  Often we'll see tuples used to store related data that should all be interpreted together.  A good example is a Cartesian point, (x, y).  Here is a list of points:

In [36]:
points = []
points.append((1,2))
points.append((2,3))
points.append((3,4))
points

[(1, 2), (2, 3), (3, 4)]

We can even generate these for a curve using a list comprehension:

In [37]:
points = [(x, 2*x + 5) for x in range(10)]
points

[(0, 5),
 (1, 7),
 (2, 9),
 (3, 11),
 (4, 13),
 (5, 15),
 (6, 17),
 (7, 19),
 (8, 21),
 (9, 23)]

# Control Flow

These notes follow the official python tutorial pretty closely: http://docs.python.org/3/tutorial/

To write a program, we need the ability to iterate and take action based on the values of a variable.  This includes if-tests and loops.

Python uses whitespace to denote a block of code.

## While loop

A simple while loop&mdash;notice the indentation to denote the block that is part of the loop.

Here we also use the compact `+=` operator: `n += 1` is the same as `n = n + 1`

In [85]:
n = 0
while n < 10:
    print(n)
    n += 1

0
1
2
3
4
5
6
7
8
9


This was a very simple example.  But often we'll use a `for` loop together with the `range()` function in this situation.  Note that `range()` can take a stride.

In [86]:
for n in range(2, 10, 2):
    print(n)

2
4
6
8
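`range()` accepts one, two, or three arguments: `range(stop)`, `range(start, stop)`, and `range(start, stop, step)`. The stop value is always excluded. Wrapping it in `list()` is a convenient way to look at the values it produces:

```python
up_to_five = list(range(5))          # [0, 1, 2, 3, 4]
evens = list(range(2, 10, 2))        # [2, 4, 6, 8]
countdown = list(range(10, 0, -3))   # [10, 7, 4, 1]  negative stride counts down

print(up_to_five)
print(evens)
print(countdown)
```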


## If statements

`if` allows for branching. Until version 3.10 python did not have a select/case statement like some other languages (3.10 introduced `match`/`case`), but `if`, `elif`, and `else` can reproduce any branching functionality you might need.

In [3]:
x = 0

if x < 0:
    print("negative")
elif x == 0:
    print("zero")
else:
    print("positive")


zero


## Iterating over elements

It's easy to loop over items in a list or any _iterable_ object. Looping over indexes like you would do in C is definitely not pythonic. The `in` operator is the key here.

In [87]:
alist = [1, 2.0, "three", 4]
for a in alist:
    print(a)

1
2.0
three
4


In [89]:
for i in [0,1,2,3]:
    print(alist[i])

1
2.0
three
4


In [90]:
for c in "this is a string":
    print(c)

t
h
i
s
 
i
s
 
a
 
s
t
r
i
n
g


We can combine loops and if-tests to do more complex logic, like break out of the loop when you find what you're looking for

In [91]:
n = 0
for a in alist:
    if a == "three":
        break
    else:
        n += 1

print(n)


2


(for that example, however, there is a simpler way)

In [85]:
alist.index("three")

2

For dictionaries, you can also loop over the elements

In [92]:
my_dict = {"key1":1, "key2":2, "key3":3}

for k, v in my_dict.items():
    print("key = {}, value = {}".format(k, v))    # notice how we do the formatting here


key = key1, value = 1
key = key2, value = 2
key = key3, value = 3


In [93]:
for k in sorted(my_dict):
    print(k, my_dict[k])

key1 1
key2 2
key3 3


Sometimes we want to loop over a list element and know its index -- `enumerate()` helps here:

In [88]:
for n, a in enumerate(alist):
    print(n, a)

0 1
1 2.0
2 three
3 4


<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:
    
`zip()` allows us to loop over two iterables at the same time.  Consider the following two
lists:

```

 a = [1, 2, 3, 4, 5, 6, 7, 8]
 b = ["a", "b", "c", "d", "e", "f", "g", "h"]
 
```

`zip(a, b)` will act like a list with each element a tuple with one item from `a` and the corresponding element from `b`. 

Try looping over these lists together (using `zip()`) and print the corresponding elements from each list together on a single line.

</div>

In [101]:
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = ["a", "b", "c", "d", "e", "f", "g", "h"]


for x,y in zip(a,b):
    print(x,y)



1 a
2 b
3 c
4 d
5 e
6 f
7 g
8 h


In [97]:
list(zip(a,b))

[(1, 'a'),
 (2, 'b'),
 (3, 'c'),
 (4, 'd'),
 (5, 'e'),
 (6, 'f'),
 (7, 'g'),
 (8, 'h')]

In [99]:
type(zip(a,b))

zip

<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:
    

The `.split()` method on a string can split it into words (separating on whitespace by default).  

Using `.split()`, loop over the words in the string

`a = "The quick brown fox jumped over the lazy dog"`

and print one word per line

</div>

# Functions and Classes

Functions and classes are the building blocks of complex programs.
These allow you to organize your code into logical units that can
be reused.


# Functions

Functions are used to organize program flow, especially to allow us to easily do commonly needed tasks over and over again.  We've already used a lot of functions, such as those that work on lists (`append()` and `pop()`) or strings (like `replace()`).  Here we see how to write our own functions

A function takes arguments, listed in the `()` and returns a value.  Even if you don't explicitly give a return value, one will be returned (e.g., `None`). 

Here's a simple example of a function that takes a single argument, `i`

In [107]:
def my_fun(i):
    print(f"in the function, i = {i}")
    
my_fun(10)
my_fun(5)

in the function, i = 10
in the function, i = 5


In [110]:
def great(x):
    alist='hello'
    print(x)
    print(alist)


In [111]:
great(x)

8
hello


In [112]:
print(great(x))

8
hello
None


Functions are one place where _scope_ comes into play.  A function has its own _namespace_.  If a variable is not defined in that function, python looks for it in the enclosing namespaces, ending with the global namespace of the module where the function was defined.  

However, you should avoid relying on this as much as possible (variables that are visible across namespaces in this way are called global variables).
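
A minimal sketch of these scoping rules; the names `scale`, `rescale`, and `shadow` are made up for illustration. A function can read a variable from the global namespace, while assigning to a name inside the function creates a new local variable and leaves the outer one untouched.

```python
scale = 10          # defined at the top level (global namespace)

def rescale(x):
    # `scale` is not defined inside the function, so python finds the global one
    return scale * x

def shadow(x):
    # assigning here creates a *local* `scale`; the global one is unchanged
    scale = 2
    return scale * x

print(rescale(3))
print(shadow(3))
print(scale)
```

Passing `scale` as an explicit argument is usually the cleaner design, since the function then no longer depends on the state of the surrounding code.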

Functions always return a value.  If one is not explicitly given, they return `None`; otherwise, they can return values (even multiple values) of any type.

In [113]:
a = my_fun(10)
a

in the function, i = 10


Here's a simple function that takes two numbers and returns their product.

In [114]:
def multiply(a, b):
    return a*b

c = multiply(3, 4)
c

12

<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:
    

Write a simple function that takes a sentence (as a string) and returns an integer equal to the length of the longest word in the sentence.  The `len()` function and the `.split()` methods will be useful here.

</div>

`None` is a special quantity in python (analogous to `null` in some other languages).  We can test on `None`&mdash;the preferred manner is to use `is`:

In [116]:
def do_nothing():
    pass

a = do_nothing()
if a is None:
    print("we didn't do anything")

we didn't do anything


In [90]:
a is None

True

## More Complex Functions

Here's a more complex example.  We return a pair of variables&mdash;behind the scenes in python this is done by packing them into a tuple and then unpacking on the calling end.  Also note the _docstring_ here.

In [117]:
def fib2(n): # return Fibonacci series up to n (from the python tutorial)
    """Return a list containing the Fibonacci series up to n."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)    # see below
        a, b = b, a+b
    return result, len(result)

fib, n = fib2(250)
fib

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233]

Note that this function includes a docstring (just after the function definition).  This is used by the help system

In [118]:
help(fib2)

Help on function fib2 in module __main__:

fib2(n)
    Return a list containing the Fibonacci series up to n.



In IPython or Jupyter, the `?` shortcut displays the same information:

In [119]:
?fib2

You can have optional arguments which provide defaults.  Here's a simple function that validates an answer, with an optional argument that can provide the correct answer.

In [120]:
def check_answer(val, correct_answer="a"):
    return val == correct_answer

print(check_answer("a"))
print(check_answer("a", correct_answer="b"))

True
False


It is important to note that python evaluates the optional arguments once&mdash;when the function is defined.  This means that if you make the default an empty object, for instance, it will persist across all calls.

**This leads to one of the most common errors for beginners**

Here's an example of trying to initialize to an empty list:

In [94]:
def f(a, L=[]):
    L.append(a)
    return L

print(f(1))
print(f(2))
print(f(3))

[1]
[1, 2]
[1, 2, 3]


Notice that each call does not create its own separate list.  Instead a single empty list was created when the function was first processed, and this list persists in memory as the default value for the optional argument `L`.  

If we want a unique list created each time (e.g., a separate place in memory), we instead initialize the argument's value to `None` and then check its actual value and create an empty list in the function body itself if the default value was unchanged.

In [121]:
def fnew(a, L=None):
    if L is None:
        L = []
    L.append(a)
    return L

print(fnew(1))
print(fnew(2))
print(fnew(3))

[1]
[2]
[3]


In [122]:
L = fnew(1)
print(fnew(2, L=L))

[1, 2]


Notice that the same `None` that we saw previously comes into play here.  

In [123]:
L

[1, 2]

## Lambdas

Lambdas are "disposable" functions.  These are small, nameless functions that are often used as arguments in other functions. The following are equivalent:

In [9]:
def square(x):
    return x**2

square = lambda x : x**2

For instance: we have a list of tuples and we want to sort the list based on the second item in the tuple.  The `sort` method can take an optional `key` argument: a function applied to each list item that returns the value to sort on.

In [131]:
pairs = [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]
pairs.sort(key=lambda p: p[1])
pairs

[(4, 'four'), (1, 'one'), (3, 'three'), (2, 'two')]

In [128]:
pairs = [(10, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]

pairs.sort()
pairs

[(2, 'two'), (3, 'three'), (4, 'four'), (10, 'one')]

In [129]:
pairs

[(2, 'two'), (3, 'three'), (4, 'four'), (10, 'one')]

Here we use a lambda to extract elements from a list (with the `filter` function):

In [135]:
list(filter(lambda x:x==1, [1,2,3]))

[1]

In [136]:
squares = [x**2 for x in range(100)]
sq = list(filter(lambda x : x%2 == 0 and x%3 == 0, squares))
sq

[0,
 36,
 144,
 324,
 576,
 900,
 1296,
 1764,
 2304,
 2916,
 3600,
 4356,
 5184,
 6084,
 7056,
 8100,
 9216]
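
For comparison, the same selection can be written with a list comprehension, which many python programmers find easier to read than `filter` with a lambda. This is a sketch of an equivalent form, not a replacement for the cell above.

```python
squares = [x**2 for x in range(100)]

# filter + lambda, as above
with_filter = list(filter(lambda x: x % 2 == 0 and x % 3 == 0, squares))

# the same condition moved into a list comprehension
with_comprehension = [x for x in squares if x % 2 == 0 and x % 3 == 0]

print(with_filter == with_comprehension)
```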

# Classes

Classes are the fundamental concept for object oriented programming.  A class defines a data type with both data and functions that can operate on the data.  An object is an instance of a class.  Each object will have its own namespace (separate from other instances of the class and other functions, etc. in your program).

We use the dot operator, `.`, to access members of the class (data or functions).  We've already been doing this a lot: strings, ints, lists, ... are all objects in python.
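
A short illustration that the built-in types really are objects, with methods reached through the dot operator (the variable names are arbitrary):

```python
s = "hello"
n = 255
items = [3, 1, 2]

print(s.upper())          # a method of the str class
print(n.bit_length())     # a method of the int class
items.sort()              # a method of the list class, acting in place
print(items)

# every object knows which class it is an instance of
print(type(s), type(n), type(items))
```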

## Naming conventions

The python community has some naming conventions, defined in PEP-8:

https://www.python.org/dev/peps/pep-0008/

The widely adopted ones are:

* class names start with an uppercase, and use "camelcase" for multiword names, e.g. `ShoppingCart`

* variable names (including objects which are instances of a class) are lowercase and use underscores to separate words, e.g., `shopping_cart`

* module names should be lowercase with underscores



## A simple class

Here's a class that holds some student info

In [137]:
class Student:
    def __init__(self, name, grade=None):
        self.name = name
        self.grade = grade

This has a function, `__init__()` which is called automatically when we create an instance of the class.  

The argument `self` refers to the object that we will create, and points to the memory that the object will use to store the class's contents.

In [139]:
a = Student("Mike", 18)
print(a.name)
print(a.grade)

Mike
18
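
To make the role of `self` more concrete, here is a small sketch. The `describe` method is added purely for illustration (it is not part of the class above). Calling a method on an instance is equivalent to calling the function on the class and passing the instance as the first argument.

```python
class Student:
    def __init__(self, name, grade=None):
        self.name = name
        self.grade = grade

    def describe(self):
        # hypothetical helper, added for this example only
        return f"{self.name}: {self.grade}"

a = Student("Mike", 18)
b = Student("Sara")           # grade falls back to the default, None

# these two calls do the same thing: `a` is what gets bound to `self`
print(a.describe())
print(Student.describe(a))

# each instance carries its own data
print(b.describe())
```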


Let's create a bunch of them, stored in a list

In [140]:
students = []
students.append(Student("fry", "19"))
students.append(Student("leela", "30"))
students.append(Student("zoidberg", "23"))
students.append(Student("hubert", "10"))
students.append(Student("bender", "26"))
students.append(Student("calculon", "27"))
students.append(Student("amy", "30"))
students.append(Student("hermes", "30"))
students.append(Student("scruffy", "12"))
students.append(Student("flexo", "18"))
students.append(Student("morbo", "23"))
students.append(Student("hypnotoad", "30lode"))
students.append(Student("zapp", "14"))

<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:


Loop over the students in the `students` list and print out the name and grade of each student, one per line.

</div>

In [143]:
students[0].name

'fry'

We can use list comprehensions with our list of objects.  For example, let's find all the students who have 30's

In [17]:
As = [q.name for q in students if q.grade.startswith("30")]
As

['leela', 'amy', 'hermes', 'hypnotoad']

## Playing Cards

Here's a more complicated class that represents a playing card.  Notice that we are using unicode to represent the suits.

In [145]:
class Card:
    
    def __init__(self, suit=1, rank=2):
        if suit < 1 or suit > 4:
            print("invalid suit, setting to 1")
            suit = 1
            
        self.suit = suit
        self.rank = rank
        
    def value(self):
        """ we want things order primarily by rank then suit """
        return self.suit + (self.rank-1)*14
    
    # we include this to allow for comparisons with < and > between cards 
    def __lt__(self, other):
        return self.value() < other.value()

    def __eq__(self, other):
        return self.rank == other.rank and self.suit == other.suit
    
    def __repr__(self):
        return self.__str__()
    
    def __str__(self):
        suits = [u"\u2660",  # spade
                 u"\u2665",  # heart
                 u"\u2666",  # diamond
                 u"\u2663"]  # club
        
        r = str(self.rank)
        if self.rank == 11:
            r = "J"
        elif self.rank == 12:
            r = "Q"
        elif self.rank == 13:
            r = "K"
        elif self.rank == 14:
            r = "A"
                
        return r +':'+suits[self.suit-1]

We can create a card easily.

In [146]:
c1 = Card()

We can pass arguments to `__init__` when we create the object:

In [147]:
c2 = Card(suit=2, rank=2)

Once we have our object, we can access any of the functions in the class using the `dot` operator

In [148]:
c2.value()

16

In [149]:
c3 = Card(suit=0, rank=4)

invalid suit, setting to 1


The `__str__` method converts the object into a string that can be printed.

In [150]:
print(c1)
print(c2)

2:♠
2:♥


The value method assigns a value to the object that can be used in comparisons, and the `__lt__` method is what does the actual comparing

In [151]:
print(c1 > c2)
print(c1 < c2)

False
True
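
The comparison `c1 > c2` works even though only `__lt__` is defined, because python falls back on the reflected operation `c2 < c1`. Operators such as `<=` and `>=` are not available this way. One possible approach, sketched here on a stripped-down stand-in class (`SimpleCard` is hypothetical), is the `functools.total_ordering` decorator from the standard library, which fills in the remaining comparisons from `__eq__` and `__lt__`.

```python
from functools import total_ordering

@total_ordering
class SimpleCard:
    """ stand-in for the Card class, using the same value() ordering """

    def __init__(self, suit=1, rank=2):
        self.suit = suit
        self.rank = rank

    def value(self):
        return self.suit + (self.rank-1)*14

    def __eq__(self, other):
        return self.value() == other.value()

    def __lt__(self, other):
        return self.value() < other.value()

c1 = SimpleCard(suit=1, rank=2)
c2 = SimpleCard(suit=2, rank=2)

print(c1 <= c2)   # provided by total_ordering
print(c1 >= c2)
```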


Note that not every operator is defined for our class, so, for instance, we cannot add two cards together:

In [152]:
c1 + c2

TypeError: unsupported operand type(s) for +: 'Card' and 'Card'

<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:

 * Create a "hand" corresponding to a straight (5 cards of any suit, but in sequence of rank)
 * Create another hand corresponding to a flush (5 cards all of the same suit, of any rank)
 * Finally create a hand with one of the cards duplicated&mdash;this should not be allowed in a standard deck of cards.  How would you check for this?

</div>

## Deck of Cards

Classes can include other classes as data objects&mdash;here's a deck of cards.  Note that we are using the python random module here.

In [153]:
import random

class Deck:
    """ the deck is a collection of cards """

    def __init__(self):

        self.nsuits = 4
        self.nranks = 13
        self.minrank = 2
        self.maxrank = self.minrank + self.nranks - 1

        self.cards = []

        for rank in range(self.minrank,self.maxrank+1):
            for suit in range(1, self.nsuits+1):
                self.cards.append(Card(rank=rank, suit=suit))

    def shuffle(self):
        random.shuffle(self.cards)

    def get_cards(self, num=1):
        hand = []
        for n in range(num):
            hand.append(self.cards.pop())

        return hand
    
    def __str__(self):
        string = ""
        for c in self.cards:
            string += str(c) + " "
        return string

Let's create a deck, shuffle, and deal a hand (for a poker game)

In [154]:
mydeck = Deck()
print(mydeck)
print(len(mydeck.cards))

2:♠ 2:♥ 2:♦ 2:♣ 3:♠ 3:♥ 3:♦ 3:♣ 4:♠ 4:♥ 4:♦ 4:♣ 5:♠ 5:♥ 5:♦ 5:♣ 6:♠ 6:♥ 6:♦ 6:♣ 7:♠ 7:♥ 7:♦ 7:♣ 8:♠ 8:♥ 8:♦ 8:♣ 9:♠ 9:♥ 9:♦ 9:♣ 10:♠ 10:♥ 10:♦ 10:♣ J:♠ J:♥ J:♦ J:♣ Q:♠ Q:♥ Q:♦ Q:♣ K:♠ K:♥ K:♦ K:♣ A:♠ A:♥ A:♦ A:♣ 
52


Notice that there is no error handling in this class.  `get_cards()` will deal cards from the deck, removing them in the process.  Eventually we'll run out of cards.

In [158]:
mydeck.shuffle()

hand = mydeck.get_cards(5)
for c in sorted(hand): print(c)

4:♥
6:♠
6:♦
9:♠
K:♠
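
One way to add the missing error handling is to check the request against the number of cards left and raise an exception. The sketch below uses a simplified stand-in (`SmallDeck`, holding plain integers instead of `Card` objects), and the choice of `ValueError` is an assumption, not something the original class does.

```python
class SmallDeck:
    """ simplified stand-in for Deck: cards are just the integers 0..n-1 """

    def __init__(self, ncards=52):
        self.cards = list(range(ncards))

    def get_cards(self, num=1):
        # refuse the request up front, so a failed deal removes nothing
        if num > len(self.cards):
            raise ValueError(f"asked for {num} cards, only {len(self.cards)} left")
        return [self.cards.pop() for _ in range(num)]

deck = SmallDeck(ncards=6)
hand = deck.get_cards(5)

try:
    deck.get_cards(5)
    status = "dealt"
except ValueError as err:
    status = f"error: {err}"

print(hand)
print(status)
```

Checking before popping means the deck is left unchanged when the request cannot be satisfied.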


## Operators

We can define operations like `+` and `-` that work on our objects.  Here's a simple example of currency&mdash;we keep track of the country and the amount

In [119]:
class Currency:
    """ a simple class to hold foreign currency """
    
    def __init__(self, amount, country="US"):
        self.amount = amount
        self.country = country
        
    def __add__(self, other):
        if self.country != other.country:
            return None
        return Currency(self.amount + other.amount, country=self.country)

    def __sub__(self, other):
        return Currency(self.amount - other.amount, country=self.country)

    def __str__(self):
        return f"{self.amount} {self.country}"

We can now create some monetary amounts for different countries

In [120]:
d1 = Currency(10, "US")
d2 = Currency(15, "US")
print(d2 - d1)

5 US


<div class="alert alert-block alert-warning">
    
<span class="fa fa-flash"></span> Quick Exercise:
    

As written, our Currency class has a bug: `__sub__` does not check whether the amounts are in the same country before subtracting.  Modify the `__sub__` method to first check if the countries are the same, as `__add__` already does.  If they are, return the new `Currency` object with the difference, otherwise, return `None`.

</div>

## Vectors Example

Here we write a class to represent 2-d vectors.  Vectors have a direction and a magnitude.  We can represent them as a pair of numbers, representing the `x` and `y` lengths.  We'll store these as the attributes `x` and `y`.

We want our class to do all the basic operations we do with vectors: add them, multiply by a scalar, cross product, dot product, return the magnitude, etc.

We'll use the math module to provide some basic functions we might need (like sqrt)

This example will show us how to overload the standard operations in python.  Here's a list of the builtin methods:
https://docs.python.org/3/reference/datamodel.html

To make it really clear what's being called when, I've added prints in each of the functions

In [121]:
import math

In [130]:
class Vector:
    """ a general two-dimensional vector """
    
    def __init__(self, x, y):
        print("in __init__")
        self.x = x
        self.y = y
        
    def __str__(self):
        print("in __str__")        
        return f"({self.x} î + {self.y} ĵ)"
    
    def __repr__(self):
        print("in __repr__")        
        return f"Vector({self.x}, {self.y})"

    def __add__(self, other):
        print("in __add__")        
        if isinstance(other, Vector):
            return Vector(self.x + other.x, self.y + other.y)
        else:
            # it doesn't make sense to add anything but two vectors
            print(f"we don't know how to add a {type(other)} to a Vector")
            raise NotImplementedError

    def __sub__(self, other):
        print("in __sub__")        
        if isinstance(other, Vector):
            return Vector(self.x - other.x, self.y - other.y)
        else:
            # it doesn't make sense to subtract anything but two vectors
            print(f"we don't know how to subtract a {type(other)} from a Vector")
            raise NotImplementedError

    def __mul__(self, other):
        print("in __mul__")        
        if isinstance(other, int) or isinstance(other, float):
            # scalar multiplication changes the magnitude
            return Vector(other*self.x, other*self.y)
        else:
            print("we don't know how to multiply two Vectors")
            raise NotImplementedError

    def __matmul__(self, other):
        print("in __matmul__")
        # a dot product
        if isinstance(other, Vector):
            return self.x*other.x + self.y*other.y
        else:
            print("matrix multiplication not defined")
            raise NotImplementedError

    def __rmul__(self, other):
        print("in __rmul__")        
        return self.__mul__(other)

    def __truediv__(self, other):
        print("in __truediv__")        
        # we only know how to divide by a scalar
        if isinstance(other, int) or isinstance(other, float):
            return Vector(self.x/other, self.y/other)

    def __abs__(self):
        print("in __abs__")        
        return math.sqrt(self.x**2 + self.y**2)

    def __neg__(self):
        print("in __neg__")        
        return Vector(-self.x, -self.y)

    def cross(self, other):
        # a vector cross product -- we return the magnitude, since it will
        # be in the z-direction, but we are only 2-d 
        return abs(self.x*other.y - self.y*other.x)

This is a basic class that provides two methods `__str__` and `__repr__` to show a representation of it. These two functions provide a readable version of our object.

The convention is that `__str__` is human readable while `__repr__` should be a form that can be used to recreate the object (e.g., via `eval()`).  See:

http://stackoverflow.com/questions/1436703/difference-between-str-and-repr-in-python

In [131]:
v = Vector(1,2)
v

in __init__
in __repr__


Vector(1, 2)

In [132]:
print(v)

in __str__
(1 î + 2 ĵ)
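
The `eval()` convention for `__repr__` can be checked directly. This sketch uses a tiny stand-in class (`Point`, hypothetical, without the prints) so that it runs on its own:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        # for humans
        return f"({self.x}, {self.y})"

    def __repr__(self):
        # valid python that rebuilds the object
        return f"Point({self.x}, {self.y})"

p = Point(1, 2)
q = eval(repr(p))     # round trip through the string "Point(1, 2)"

print(str(p))
print(repr(p))
print(q.x, q.y)
```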


Vectors have a length, and we'll use the `abs()` builtin to provide the magnitude.  For a vector:

$$\vec{v} = \alpha \hat{i} + \beta \hat{j}$$

we have

$$|\vec{v}| = \sqrt{\alpha^2 + \beta^2}$$

In [133]:
abs(v)

in __abs__


2.23606797749979

Let's look at mathematical operations on vectors now.  We want to be able to add and subtract two vectors as well as multiply and divide by a scalar.

In [134]:
u = Vector(3,5)

in __init__


In [135]:
w = u + v
print(w)

in __add__
in __init__
in __str__
(4 î + 7 ĵ)


In [136]:
u - v

in __sub__
in __init__
in __repr__


Vector(2, 3)

It doesn't make sense to add a scalar to a vector, so we didn't implement this -- what happens?

In [137]:
u + 2.0

in __add__
we don't know how to add a <class 'float'> to a Vector


NotImplementedError: 
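
The class above raises `NotImplementedError` by hand. A common alternative in python, shown here as a sketch on a hypothetical stripped-down class `Vec` (this is not the behaviour of the `Vector` class above), is to return the special value `NotImplemented`. Python then gives the other operand a chance to handle the operation and raises a `TypeError` itself if nobody can.

```python
class Vec:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        if isinstance(other, Vec):
            return Vec(self.x + other.x, self.y + other.y)
        # signal "I don't know how" and let python decide what to do next
        return NotImplemented

u = Vec(3, 5)
w = u + Vec(1, 2)

try:
    u + 2.0
    message = "no error"
except TypeError as err:
    message = str(err)

print(w.x, w.y)
print(message)
```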

Now multiplication.  It makes sense to multiply by a scalar, but there are multiple ways to define multiplication of two vectors.  

Note that python provides both a `__mul__` and a `__rmul__` function to define what happens when we multiply a vector by a quantity and what happens when we multiply something else by a vector.

In [138]:
u*2.0

in __mul__
in __init__
in __repr__


Vector(6.0, 10.0)

In [139]:
2.0*u

in __rmul__
in __mul__
in __init__
in __repr__


Vector(6.0, 10.0)

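and division: `__truediv__` implements the `/` operator in python 3, while `__floordiv__` implements floor division, `//` (in python 2, `/` applied to two integers behaved like `//`).

We can divide a vector by a scalar, but dividing a scalar by a vector doesn't make sense, so we did not define `__rtruediv__`: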

In [140]:
u/5.0

in __truediv__
in __init__
in __repr__


Vector(0.6, 1.0)

In [141]:
5.0/u

TypeError: unsupported operand type(s) for /: 'float' and 'Vector'

Python 3.5 introduced a new matrix multiplication operator, `@` -- we'll use this to implement a dot product between two vectors:

In [142]:
u @ v

in __matmul__


13

For a cross product, we don't have an obvious operator, so we'll use a function.  For 2-d vectors, this will result in a scalar

In [143]:
u.cross(v)

1

Finally, negation is a separate operation:

In [144]:
-u

in __neg__
in __init__
in __repr__


Vector(-3, -5)

# Modules

Here we import our own module called `myprofile` (we have it in the same directory as this notebook)

In [161]:
import myprofile

We have a docstring at the top -- the comments there are what appear when we ask for help

In [160]:
help(myprofile)

Help on module myprofile:

NAME
    myprofile

DESCRIPTION
    A very simple profiling class.  Define some timers and methods
    to start and stop them.  Nesting of timers is tracked so we can
    pretty print the profiling information.
    
    # define a timer object, labeled 'my timer'
    a = timer('my timer')
    
    This will add 'my timer' to the list of keys in the 'my timer'
    dictionary.  Subsequent calls to the timer class constructor
    will have no effect.
    
    # start timing the 'my timer' block of code
    a.begin()
    
    ... do stuff here ...
    
    # end the timing of the 'my timer' block of code
    a.end()
    
    for best results, the block of code timed should be large
    enough to offset the overhead of the timer class method
    calls.
    
    Multiple timers can be instantiated and nested.  The stackCount
    global parameter keeps count of the level of nesting, and the
    timerNesting data structure stores the nesting level for each
    define

This module simply provides a way to time routines (python and ipython have built-in methods for this too)

In [162]:
t = myprofile.Timer("main loop")
t.begin()

total = 0.0   # avoid the name `sum`, which would hide the builtin function
for n in range(1000):
    total += n**2


t.end()
myprofile.time_report()

print(total)

main loop:  0.00036907196044921875
332833500.0
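
For reference, the standard library can do the same kind of timing. This is a sketch using `time.perf_counter` and `timeit.timeit`; the function name `sum_of_squares` is made up for the example, and the measured times will differ from machine to machine.

```python
import time
import timeit

def sum_of_squares(nmax):
    total = 0.0
    for n in range(nmax):
        total += n**2
    return total

# manual timing of a single call
start = time.perf_counter()
result = sum_of_squares(1000)
elapsed = time.perf_counter() - start

# timeit repeats the call many times and returns the total time in seconds
total_time = timeit.timeit(lambda: sum_of_squares(1000), number=100)

print(result)
print(f"single call: {elapsed:.2e} s, average over 100 calls: {total_time/100:.2e} s")
```

In a notebook, the `%timeit` magic wraps the same machinery.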


In the file `myprofile.py`, you will see a block of code under

```
if __name__ == "__main__":
```

That code is executed if the file is run directly, either from the commandline as:

`python myprofile.py`

or through the `%run` magic

In [156]:
%run myprofile

1:  10.005089044570923
2:  25.006304025650024
   3:  20.003389835357666


# Exceptions

Python raises exceptions when it encounters an error.  The idea is that you can trap these exceptions and take an appropriate action instead of causing the code to crash.  The mechanism for this is `try` / `except`.  Here's an example that causes an exception, `ZeroDivisionError`:

In [163]:
a = 1/0

ZeroDivisionError: division by zero

and here we handle this  

In [164]:
try:
    a = 1/0
except ZeroDivisionError:
    print("warning: you divided by zero")
    a = 1

a

warning: you divided by zero


1

In [171]:
def rec(a):
    try:
        b=1/a
    except ZeroDivisionError:
        print("warning: you divided by zero")
        b=None
    return b

In [173]:
print(rec(0))

warning: you divided by zero
None


Another example&mdash;trying to access a key that doesn't exist in a dictionary:

In [174]:
mydict = {"a":1, "b":2, "c":3}   # avoid the name `dict`, which would hide the builtin type
print(mydict["d"])


KeyError: 'd'

`KeyError` is the exception that was raised.  We can check for this and take the appropriate action instead

In [175]:
try:
    val = mydict["d"]
except KeyError:
    val = None

print(val)

None


There are a lot of different types of exceptions that you can catch, and you can catch multiple ones per except clause or have multiple except clauses.  You probably won't be able to anticipate every failure mode in advance.  In that case, when you run and your code crashes because of an exception, the python interpreter will print out the name of the exception and you can then modify your code to take the appropriate action.
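
As a sketch of these options, the hypothetical function `safe_ratio` below catches two exception types in one clause, uses a separate clause for a third, and shows the optional `else` block, which runs only if no exception was raised.

```python
def safe_ratio(data, key, denominator):
    """ return data[key] / denominator, or None if that is not possible """
    try:
        result = data[key] / denominator
    except (KeyError, ZeroDivisionError) as err:
        # one clause handling two different exceptions
        print(f"warning: {type(err).__name__}")
        return None
    except TypeError:
        # a separate clause for a different failure mode
        print("warning: cannot divide these types")
        return None
    else:
        # only reached when the try block succeeded
        return result

values = {"a": 1, "b": 2}
print(safe_ratio(values, "b", 4))
print(safe_ratio(values, "d", 4))
print(safe_ratio(values, "a", 0))
print(safe_ratio(values, "a", "x"))
```

Listing the specific exceptions you expect, rather than catching everything, keeps unexpected errors visible.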

# File I/O

One of the main things that we want to do in scientific computing is get data into and out of our programs.  In addition to plain text files, there are modules that can read lots of different data formats we might encounter.

As expected, a file is an object.  Here we'll use the `try`, `except` block to capture exceptions (like if the file cannot be opened). 

In [176]:
try:
    f = open("./mywrite.txt", "w")   # open for writing -- any file of the same name will be overwritten
except OSError:
    print("cannot open the file")

print(f)


<_io.TextIOWrapper name='./mywrite.txt' mode='w' encoding='UTF-8'>


In [178]:
f.write("this is my first write\n")
f.close()

We can easily loop over the lines in a file

In [179]:
try: 
    f = open("./test.txt", "r")
except OSError:
    print("error: cannot open the file")
    
for line in f:
    print(line.split())
    
f.close()

['Lorem', 'ipsum', 'dolor', 'sit', 'amet,', 'consectetur', 'adipisicing', 'elit,', 'sed', 'do']
['eiusmod', 'tempor', 'incididunt', 'ut', 'labore', 'et', 'dolore', 'magna', 'aliqua.', 'Ut', 'enim', 'ad']
['minim', 'veniam,', 'quis', 'nostrud', 'exercitation', 'ullamco', 'laboris', 'nisi', 'ut']
['aliquip', 'ex', 'ea', 'commodo', 'consequat.', 'Duis', 'aute', 'irure', 'dolor', 'in']
['reprehenderit', 'in', 'voluptate', 'velit', 'esse', 'cillum', 'dolore', 'eu', 'fugiat', 'nulla']
['pariatur.', 'Excepteur', 'sint', 'occaecat', 'cupidatat', 'non', 'proident,', 'sunt', 'in']
['culpa', 'qui', 'officia', 'deserunt', 'mollit', 'anim', 'id', 'est', 'laborum.']
[]


In [180]:
try: 
    f = open("./tessasssst.txt", "r")
except OSError:
    print("error: cannot open the file")

error: cannot open the file


As mentioned earlier, there are lots of string functions.  Above we used `split()` to break each line into a list of words, which also discards the trailing whitespace and newline characters.

In practice, we rarely read files explicitly like this but use some library, such as `numpy.genfromtxt`. Specific data formats often have dedicated modules, like `csv` for CSV files and `configparser` for INI files.
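
As a sketch of reading a file through a dedicated module, the example below writes and reads a small CSV file with the standard-library `csv` module. It also uses the `with` statement, which closes the file automatically, even if an error occurs inside the block. The file name and its contents are invented for the example, and a temporary directory is used so nothing is left behind.

```python
import csv
import os
import tempfile

rows = [["name", "grade"], ["fry", "19"], ["leela", "30"]]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "grades.csv")

    # write the rows; newline="" is what the csv module documentation recommends
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)

    # read them back; each row comes out as a list of strings
    with open(path, "r", newline="") as f:
        read_back = list(csv.reader(f))

print(read_back)
print(f.closed)
```

Note that everything comes back as strings; converting the grades to numbers is left to the caller.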

# Exercises

Work on those you like the most. For the exam, preparing 2 of these is enough (or 3 if you pick those that are very short).

## Q1: Machine precision

When talking about floating point, we discussed _machine epsilon_, $\epsilon$&mdash;this is the smallest number that when added to 1 is still different from 1.

We'll compute $\epsilon$ here:

  * Pick an initial guess for $\epsilon$ of `eps = 1`.  

  * Create a loop that checks whether `1 + eps` is different from `1`
  
  * Each loop iteration, cut the value of `eps` in half
  
What value of $\epsilon$ do you find?



## Q2: Iterations

### Part 1

To iterate over tuples, where the _i_-th tuple contains the _i_-th elements of certain sequences, we can use the `zip(*sequences)` function.

We will iterate over two lists, `names` and `age`, and print out the resulting tuples.

  * Start by initializing lists `names = ["Mary", "John", "Sarah"]` and `age = [21, 56, 98]`.
  
  * Iterate over the tuples containing a name and an age, the `zip(list1, list2)` function might be useful here.
  
  * Print out formatted strings of the type "*NAME is AGE years old*".
  

### Part 2

The function `enumerate(sequence)` returns tuples containing indices of objects in the sequence, and the objects. 

The `random` module provides tools for working with random numbers. In particular, `random.randint(start, end)` generates a random integer not smaller than `start`, and not bigger than `end`.

  * Generate a list of 10 random numbers from 0 to 9.
  
  * Using the `enumerate(random_list)` function, iterate over the tuples of random numbers and their indices, and print out *"Match: NUMBER and INDEX"* if the random number and its index in the list match.

## Q3: Books

Here is a list of book titles (from http://thegreatestbooks.org).  Loop through the list and capitalize each word in each title. 

In [25]:
titles = ["don quixote", 
          "in search of lost time", 
          "ulysses", 
          "the odyssey", 
          "war and peace", 
          "moby dick", 
          "the divine comedy", 
          "hamlet", 
          "the adventures of huckleberry finn", 
          "the great gatsby"]

## Q4: Word counts

Here's some text (the Gettysburg Address).  Our goal is to count how many times each word repeats.  We'll do a brute force method first, and then we'll look at ways to do it more efficiently (and compactly).

In [26]:
gettysburg_address = """
Four score and seven years ago our fathers brought forth on this continent, 
a new nation, conceived in Liberty, and dedicated to the proposition that 
all men are created equal.

Now we are engaged in a great civil war, testing whether that nation, or 
any nation so conceived and so dedicated, can long endure. We are met on
a great battle-field of that war. We have come to dedicate a portion of
that field, as a final resting place for those who here gave their lives
that that nation might live. It is altogether fitting and proper that we
should do this.

But, in a larger sense, we can not dedicate -- we can not consecrate -- we
can not hallow -- this ground. The brave men, living and dead, who struggled
here, have consecrated it, far above our poor power to add or detract.  The
world will little note, nor long remember what we say here, but it can never
forget what they did here. It is for us the living, rather, to be dedicated
here to the unfinished work which they who fought here have thus far so nobly
advanced. It is rather for us to be here dedicated to the great task remaining
before us -- that from these honored dead we take increased devotion to that
cause for which they gave the last full measure of devotion -- that we here
highly resolve that these dead shall not have died in vain -- that this
nation, under God, shall have a new birth of freedom -- and that government
of the people, by the people, for the people, shall not perish from the earth.
"""

We've already seen that the `.split()` method will, by default, split on whitespace (spaces, tabs, and newlines), so it will split this into words, producing a list:

In [27]:
ga = gettysburg_address.split()

In [182]:
ga

['Four',
 'score',
 'and',
 'seven',
 'years',
 'ago',
 'our',
 'fathers',
 'brought',
 'forth',
 'on',
 'this',
 'continent,',
 'a',
 'new',
 'nation,',
 'conceived',
 'in',
 'Liberty,',
 'and',
 'dedicated',
 'to',
 'the',
 'proposition',
 'that',
 'all',
 'men',
 'are',
 'created',
 'equal.',
 'Now',
 'we',
 'are',
 'engaged',
 'in',
 'a',
 'great',
 'civil',
 'war,',
 'testing',
 'whether',
 'that',
 'nation,',
 'or',
 'any',
 'nation',
 'so',
 'conceived',
 'and',
 'so',
 'dedicated,',
 'can',
 'long',
 'endure.',
 'We',
 'are',
 'met',
 'on',
 'a',
 'great',
 'battle-field',
 'of',
 'that',
 'war.',
 'We',
 'have',
 'come',
 'to',
 'dedicate',
 'a',
 'portion',
 'of',
 'that',
 'field,',
 'as',
 'a',
 'final',
 'resting',
 'place',
 'for',
 'those',
 'who',
 'here',
 'gave',
 'their',
 'lives',
 'that',
 'that',
 'nation',
 'might',
 'live.',
 'It',
 'is',
 'altogether',
 'fitting',
 'and',
 'proper',
 'that',
 'we',
 'should',
 'do',
 'this.',
 'But,',
 'in',
 'a',
 'larger',
 '

Now, the next problem is that some of these still have punctuation.  In particular, we see "`.`", "`,`", and "`--`".

When considering a word, we can get rid of these by using the `replace()` method:

In [183]:
a = "end.,"
b = a.replace(".", "").replace(",", "")
b

'end'

Another problem is case&mdash;we want to count "but" and "But" as the same.  Strings have a `lower()` method that can be used to convert a string:

In [184]:
a = "But"
b = "but"
a == b

False

In [185]:
a.lower() == b.lower()

True

Recall that strings are immutable, so `replace()` produces a new string on output.

### Your task

Create a dictionary that uses the unique words as keys and has as a value the number of times that word appears.  

Write a loop over the words in the string (using our split version) and do the following:
  * remove any punctuation
  * convert to lowercase
  * test if the word is already a key in the dictionary (using the `in` operator)
     - if the key exists, increment the word count for that key
     - otherwise, add it to the dictionary with the appropriate count of `1`.

At the end, print out the words and a count of how many times they appear

### More compact way

We can actually do this a lot more compactly by using list comprehensions and another python datatype called a set.  A set is a group of items, where each item is unique (e.g., no repetitions).

Here's a list comprehension that removes all the punctuation and converts to lower case:

In [186]:
words = [q.lower().replace(".", "").replace(",", "") for q in ga]

and by using the `set()` function, we turn the list into a set, removing any duplicates:

In [187]:
unique_words = set(words)

now we can loop over the unique words and use the `count` method of a list to find how many there are

In [188]:
count = {}
for uw in unique_words:
    count[uw] = words.count(uw)
    
count

{'of': 5,
 'ago': 1,
 'their': 1,
 'perish': 1,
 'civil': 1,
 'should': 1,
 'four': 1,
 'do': 1,
 'not': 5,
 'freedom': 1,
 'altogether': 1,
 'a': 7,
 'unfinished': 1,
 'measure': 1,
 'or': 2,
 'little': 1,
 'before': 1,
 'proposition': 1,
 'live': 1,
 'all': 1,
 'created': 1,
 'living': 2,
 'struggled': 1,
 'say': 1,
 'great': 3,
 'world': 1,
 'note': 1,
 'fitting': 1,
 'add': 1,
 'in': 4,
 'resolve': 1,
 'rather': 2,
 'but': 2,
 'nation': 5,
 'detract': 1,
 'forget': 1,
 'power': 1,
 'dedicate': 2,
 'nor': 1,
 'far': 2,
 'cause': 1,
 'did': 1,
 'people': 3,
 'new': 2,
 'earth': 1,
 'long': 2,
 'have': 5,
 'hallow': 1,
 'final': 1,
 'these': 2,
 'proper': 1,
 'testing': 1,
 'here': 8,
 'be': 2,
 'score': 1,
 'come': 1,
 'whether': 1,
 'now': 1,
 'liberty': 1,
 'endure': 1,
 'is': 3,
 'honored': 1,
 'dead': 3,
 'might': 1,
 'above': 1,
 'battle-field': 1,
 'government': 1,
 'portion': 1,
 'dedicated': 4,
 'any': 1,
 'increased': 1,
 'continent': 1,
 'can': 5,
 'larger': 1,
 'task': 1,


Even shorter -- we can use a dictionary comprehension, which works like a list comprehension but builds `key: value` pairs

In [189]:
c = {uw: words.count(uw) for uw in unique_words}

In [190]:
c

{'of': 5,
 'ago': 1,
 'their': 1,
 'perish': 1,
 'civil': 1,
 'should': 1,
 'four': 1,
 'do': 1,
 'not': 5,
 'freedom': 1,
 'altogether': 1,
 'a': 7,
 'unfinished': 1,
 'measure': 1,
 'or': 2,
 'little': 1,
 'before': 1,
 'proposition': 1,
 'live': 1,
 'all': 1,
 'created': 1,
 'living': 2,
 'struggled': 1,
 'say': 1,
 'great': 3,
 'world': 1,
 'note': 1,
 'fitting': 1,
 'add': 1,
 'in': 4,
 'resolve': 1,
 'rather': 2,
 'but': 2,
 'nation': 5,
 'detract': 1,
 'forget': 1,
 'power': 1,
 'dedicate': 2,
 'nor': 1,
 'far': 2,
 'cause': 1,
 'did': 1,
 'people': 3,
 'new': 2,
 'earth': 1,
 'long': 2,
 'have': 5,
 'hallow': 1,
 'final': 1,
 'these': 2,
 'proper': 1,
 'testing': 1,
 'here': 8,
 'be': 2,
 'score': 1,
 'come': 1,
 'whether': 1,
 'now': 1,
 'liberty': 1,
 'endure': 1,
 'is': 3,
 'honored': 1,
 'dead': 3,
 'might': 1,
 'above': 1,
 'battle-field': 1,
 'government': 1,
 'portion': 1,
 'dedicated': 4,
 'any': 1,
 'increased': 1,
 'continent': 1,
 'can': 5,
 'larger': 1,
 'task': 1,


## Q5: Foxes and dogs

### Part 1. Short words

Let's practice functions.  Here's a simple function that takes a string and returns a list of all the 4 letter words:

In [32]:
def four_letter_words(message):
    words = message.split()
    four_letters = [w for w in words if len(w) == 4]
    return four_letters

In [33]:
message = "The quick brown fox jumps over the lazy dog"
print(four_letter_words(message))

['over', 'lazy']


Write a version of this function that takes a second argument, n, that is the word length we want to search for
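
A minimal sketch of one way to generalize the function, assuming (as in `four_letter_words`) that words are separated by whitespace and that any attached punctuation counts toward the length. The name `n_letter_words` is just a suggestion.

```python
def n_letter_words(message, n):
    """return a list of the words in message that have exactly n characters"""
    return [w for w in message.split() if len(w) == n]


message = "The quick brown fox jumps over the lazy dog"
print(n_letter_words(message, 4))
print(n_letter_words(message, 5))
```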

### Part 2: Pangrams

A _pangram_ is a sentence that includes all 26 letters of the alphabet, e.g., "_The quick brown fox jumps over the lazy dog_."

Write a function that takes as an argument a sentence and returns `True` or `False`, indicating whether the sentence is a pangram.
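
One possible approach, sketched here with sets: collect the characters of the lowercased sentence and check that every letter in `string.ascii_lowercase` is among them. The function name `is_pangram` is an assumption, and only the 26 unaccented English letters are considered.

```python
import string


def is_pangram(sentence):
    """return True if every letter a-z appears at least once in sentence"""
    letters = set(sentence.lower())
    # `<=` between sets tests whether the left set is a subset of the right
    return set(string.ascii_lowercase) <= letters


print(is_pangram("The quick brown fox jumps over the lazy dog."))
print(is_pangram("Hello world"))
```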

## Q6: Catch errors

We want to safely convert a string into a float, int, or leave it as a string, depending on its contents.  As we've already seen, python provides `float()` and `int()` functions for this:

In [30]:
a = "2.0"
b = float(a)
print(b, type(b))

2.0 <class 'float'>


But these throw exceptions if the conversion is not possible

In [31]:
a = "this is a string"
b = float(a)

ValueError: could not convert string to float: 'this is a string'

In [195]:
a = "1.2345"
b = int(a)
print(b, type(b))

ValueError: invalid literal for int() with base 10: '1.2345'

In [196]:
b = float(a)
print(b, type(b))

1.2345 <class 'float'>


Notice that an int can be converted to a float, but if you convert a float to an int, you lose the fractional part.  A string that does not represent a number cannot be converted to either.

### Your task

Write a function, `convert_type(a)` that takes a string `a`, and converts it to a float if it is a number with a decimal point, an int if it is an integer, or leaves it as a string otherwise, and returns the result.  You'll want to use exceptions to prevent the code from aborting.
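
A sketch of one way to do this with `try` / `except`: attempt the stricter conversion (`int()`) first and fall back to `float()`, then to the original string. Note that `float()` also accepts strings without a decimal point, such as `"1e3"`, so this is slightly more permissive than the wording of the task.

```python
def convert_type(a):
    """convert the string a to an int or a float if possible, else return it unchanged"""
    try:
        return int(a)
    except ValueError:
        pass  # not an integer literal; try a float next

    try:
        return float(a)
    except ValueError:
        return a  # not a number at all; leave it as a string


for s in ["2", "2.0", "this is a string"]:
    b = convert_type(s)
    print(b, type(b))
```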

## Q7: Tic-tac-toe

Here we'll write a simple tic-tac-toe game that 2 players can play.  First we'll create a string that represents our game board:

In [197]:
board = """
 {s1:^3} | {s2:^3} | {s3:^3}
-----+-----+-----
 {s4:^3} | {s5:^3} | {s6:^3}
-----+-----+-----      123
 {s7:^3} | {s8:^3} | {s9:^3}       456
                       789  
"""

This board will look a little funny if we just print it -- the spacing is set to look right when we replace the `{}` with `x` or `o`

In [198]:
print(board)


 {s1:^3} | {s2:^3} | {s3:^3}
-----+-----+-----
 {s4:^3} | {s5:^3} | {s6:^3}
-----+-----+-----      123
 {s7:^3} | {s8:^3} | {s9:^3}       456
                       789  



and we'll use a dictionary to denote the status of each square, "x", "o", or empty, ""

In [199]:
play = {}

def initialize_board(play):
    for n in range(9):
        play["s{}".format(n+1)] = ""

initialize_board(play)
play

{'s1': '',
 's2': '',
 's3': '',
 's4': '',
 's5': '',
 's6': '',
 's7': '',
 's8': '',
 's9': ''}

Note that our `{}` placeholders in the `board` string have identifiers (the names `s1` through `s9` inside the `{}`).  We can use these to match the variables we want to print to the placeholders in the string, regardless of their order in the `format()` call

In [200]:
a = "{s1:} {s2:}".format(s2=1, s1=2)
a

'2 1'

Here's an easy way to add the values of our dictionary to the appropriate squares in our game board.  First note that each of the `{}` is labeled with a name that matches one of the keys in our dictionary.  Python provides a way to unpack a dictionary into keyword arguments, using `**`.
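
As a small illustration of this unpacking, here is a sketch with a two-square template (the names `template` and `squares` are made up for this example). Writing `format(**squares)` is equivalent to writing `format(s1="x", s2="o")`.

```python
template = "{s1:^3}|{s2:^3}"
squares = {"s1": "x", "s2": "o"}

# ** turns each key/value pair into a keyword argument
row = template.format(**squares)
print(row)
```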

This lets us write a function to show the tic-tac-toe board.

In [201]:
def show_board(play):
    """ display the playing board.  We take a dictionary with the current state of the board
    We rely on the board string to be a global variable"""
    print(board.format(**play))
    
show_board(play)


     |     |    
-----+-----+-----
     |     |    
-----+-----+-----      123
     |     |           456
                       789  



Now we need a function that asks a player for a move:

In [202]:
def get_move(n, xo, play):
    """ ask the current player, n, to make a move -- make sure the square was not 
        already played.  xo is a string of the character (x or o) we will place in
        the desired square """
    valid_move = False
    while not valid_move:
        idx = input("player {}, enter your move (1-9): ".format(n)).strip()
        key = "s{}".format(idx)
        if key not in play:
            # anything other than 1-9 would otherwise raise a KeyError
            print("invalid: enter a number from 1 to 9")
        elif play[key] == "":
            valid_move = True
        else:
            print("invalid: square {} already holds {}".format(idx, play[key]))
            
    play[key] = xo

In [203]:
help(get_move)

Help on function get_move in module __main__:

get_move(n, xo, play)
    ask the current player, n, to make a move -- make sure the square was not 
    already played.  xo is a string of the character (x or o) we will place in
    the desired square



### Your task

Using the functions defined above,
  * `initialize_board()`
  * `show_board()`
  * `get_move()`

fill in the function `play_game()` below to complete the game, asking for the moves one at a time, alternating between player 1 and 2

In [None]:
def play_game():
    """ play a game of tic-tac-toe """
    
    play = {}
    initialize_board(play)
    show_board(play)

In [None]:
play_game()
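
One ingredient of `play_game()` is deciding whose turn it is. A minimal sketch, using a hypothetical helper `player_for_turn()` that maps a turn counter to the player number and symbol (the assignment of `x` to player 1 is an arbitrary choice):

```python
def player_for_turn(turn):
    """map a 0-based turn counter to (player number, symbol)"""
    if turn % 2 == 0:
        return 1, "x"
    return 2, "o"


# the first four turns alternate between the two players
schedule = [player_for_turn(t) for t in range(4)]
print(schedule)
```

Inside `play_game()`, a loop over at most nine turns could pass these two values to `get_move(n, xo, play)` and then call `show_board(play)`. Detecting a winner is not covered by this sketch.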

## Q8: Shopping cart

Let's write a simple shopping cart class -- this will hold the items that you intend to purchase, along with their quantities, and allow you to add / remove items, get a subtotal, etc.

We'll use two classes: `Item` will be a single item and `ShoppingCart` will be the collection of items you wish to purchase.

First, our store needs an inventory -- here's what we have for sale:

In [204]:
INVENTORY_TEXT = """
apple, 0.60
banana, 0.20
grapefruit, 0.75
grapes, 1.99
kiwi, 0.50
lemon, 0.20
lime, 0.25
mango, 1.50
papaya, 2.95
pineapple, 3.50
blueberries, 1.99
blackberries, 2.50
peach, 0.50
plum, 0.33
clementine, 0.25
cantaloupe, 3.25
pear, 1.25
quince, 0.45
orange, 0.60
"""

# this will be a global -- convention is all caps
INVENTORY = {}
for line in INVENTORY_TEXT.splitlines():
    if line.strip() == "":
        continue
    item, price = line.split(",")
    INVENTORY[item] = float(price)


In [205]:
INVENTORY

{'apple': 0.6,
 'banana': 0.2,
 'grapefruit': 0.75,
 'grapes': 1.99,
 'kiwi': 0.5,
 'lemon': 0.2,
 'lime': 0.25,
 'mango': 1.5,
 'papaya': 2.95,
 'pineapple': 3.5,
 'blueberries': 1.99,
 'blackberries': 2.5,
 'peach': 0.5,
 'plum': 0.33,
 'clementine': 0.25,
 'cantaloupe': 3.25,
 'pear': 1.25,
 'quince': 0.45,
 'orange': 0.6}

### Item 

Let's write an item class now -- we want it to hold the name and quantity.  

You should have the following features:

* The name should be something in our inventory

* Our shopping cart will include a list of all the items we want to buy, so we want to be able to check for duplicates.  Implement the equal test, `==`, using `__eq__`

* We'll want to consolidate duplicates, so implement the `+` operator, using `__add__` so we can add items together in our shopping cart.  Note, add should raise a ValueError if you try to add two `Items` that don't have the same name.

Here's a start:

In [206]:
class Item:
    """ an item to buy """
    
    def __init__(self, name, quantity=1):
        """keep track of an item that is in our inventory"""
        if name not in INVENTORY:
            raise ValueError("invalid item name")
        self.name = name
        self.quantity = quantity
        
    def __repr__(self):
        return "{}: {}".format(self.name, self.quantity)
        
    def __eq__(self, other):
        """check if the items have the same name"""
        return self.name == other.name
    
    def __add__(self, other):
        """add two items together if they are the same type"""
        if self.name == other.name:
            return Item(self.name, self.quantity + other.quantity)
        else:
            raise ValueError("names don't match")

Here are some tests your code should pass:

In [207]:
a = Item("apple", 10)
b = Item("banana", 20)

In [208]:
c = Item("apple", 20)

In [209]:
# won't work
a + b

ValueError: names don't match

In [210]:
# will work
a += c
print(a)

apple: 30


In [211]:
d = Item("dog")

ValueError: invalid item name

In [212]:
# should be False
a == b

False

In [213]:
# should be True -- they have the same name
a == c

True

How do they behave in a list?

In [214]:
items = []
items.append(a)
items.append(b)
items

[apple: 30, banana: 20]

In [215]:
# should be True -- they have the same name
c in items

True

### ShoppingCart

Now we want to create a shopping cart.  The main thing it will do is hold a list of items.

In [216]:
class ShoppingCart:
    
    def __init__(self):
        # the list of items we control
        self.items = []
        
    def subtotal(self):
        """ return a subtotal of our items """
        pass

    def add(self, name, quantity):
        """ add an item to our cart -- if an item of the same name already
        exists, then increment the quantity.  Otherwise, add a new item
        to the cart with the desired quantity."""
        pass
        
    def remove(self, name):
        """ remove all of item name from the cart """
        pass
        
    def report(self):
        """ print a summary of the cart """
        for item in self.items:
            print(f"{item.name} : {item.quantity}")

Here are some tests

In [217]:
sc = ShoppingCart()
sc.add("orange", 19)

In [218]:
sc.add("apple", 2)

In [219]:
sc.report()

In [220]:
sc.add("apple", 9)

In [221]:
# apple should only be listed once in the report, with a quantity of 11
sc.report()

In [222]:
sc.subtotal()

In [223]:
sc.remove("apple")

In [224]:
# apple should no longer be listed
sc.report()
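
One way the stubs might be filled in is sketched below. To keep the block runnable on its own, it repeats a trimmed-down `INVENTORY` (three entries only) and a minimal copy of the `Item` class; `add()` relies on `Item.__eq__` and `Item.__add__` to find and merge duplicates.

```python
# trimmed stand-in for the INVENTORY dictionary built earlier
INVENTORY = {"apple": 0.60, "banana": 0.20, "orange": 0.60}


class Item:
    """minimal copy of the Item class, so this block runs on its own"""

    def __init__(self, name, quantity=1):
        if name not in INVENTORY:
            raise ValueError("invalid item name")
        self.name = name
        self.quantity = quantity

    def __eq__(self, other):
        return self.name == other.name

    def __add__(self, other):
        if self.name != other.name:
            raise ValueError("names don't match")
        return Item(self.name, self.quantity + other.quantity)


class ShoppingCart:

    def __init__(self):
        self.items = []

    def subtotal(self):
        """ return a subtotal of our items """
        return sum(INVENTORY[item.name] * item.quantity for item in self.items)

    def add(self, name, quantity):
        """ add an item, merging it with an existing item of the same name """
        new_item = Item(name, quantity)
        if new_item in self.items:            # uses Item.__eq__
            idx = self.items.index(new_item)
            self.items[idx] = self.items[idx] + new_item   # uses Item.__add__
        else:
            self.items.append(new_item)

    def remove(self, name):
        """ remove all of item name from the cart """
        self.items = [item for item in self.items if item.name != name]

    def report(self):
        """ print a summary of the cart """
        for item in self.items:
            print(f"{item.name} : {item.quantity}")


sc = ShoppingCart()
sc.add("orange", 19)
sc.add("apple", 2)
sc.add("apple", 9)
sc.report()
print(sc.subtotal())
```

Because prices are stored as floats, the subtotal can show small rounding errors; formatting it with two decimals when printing is one way to hide them.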

## Q9: Poker odds

Use the deck of cards class from the notebook we worked through in class to write a _Monte Carlo_ code that plays a lot of hands of straight poker (like 100,000).  Count how many of these hands contain a particular poker hand (like 3-of-a-kind).  The ratio of the number of hands with 3-of-a-kind to the total number of hands approximates the probability of getting a 3-of-a-kind in poker.

### Bonus: 
Just to practice modules, write that code into a `.py` file so that you can import and reuse it here.
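
The Monte Carlo idea can be sketched without the deck class from the lecture notebook. The block below is a stand-in under stated assumptions: a card is represented by an integer from 0 to 51, its rank is `card % 13`, and suits are ignored because they do not matter for 3-of-a-kind. The function names are hypothetical.

```python
import random
from collections import Counter


def is_three_of_a_kind(ranks):
    """True if exactly three cards share a rank and the other two differ"""
    return sorted(Counter(ranks).values()) == [1, 1, 3]


def estimate_probability(n_hands, seed=None):
    """deal n_hands random 5-card hands and return the fraction with 3-of-a-kind"""
    rng = random.Random(seed)
    deck = list(range(52))
    hits = 0
    for _ in range(n_hands):
        hand = rng.sample(deck, 5)   # 5 distinct cards from a full deck
        if is_three_of_a_kind([card % 13 for card in hand]):
            hits += 1
    return hits / n_hands


p = estimate_probability(20000, seed=1)
print(p)
```

The exact probability of being dealt 3-of-a-kind is 54912/2598960, about 2.1%, so the estimate should land close to that value and get closer as the number of hands grows.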

## Q10: Tic-Tac-Toe again

Revisit the tic-tac-toe game you developed in the functions exercises but now write it as a class with methods to do each of the main steps.  

## Q11: Rock-Paper-Scissors

Implement a set of games of rock-paper-scissors against the computer.  

  * Ask for input from the user ("rock", "paper", or "scissors") and then randomly select one of these for the computer's play.
  * Announce who won.
  * Keep playing until a player says that they no longer want to play.
  * When all games are done, print out how many games were won by the player and by the computer 
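
The core of a single round can be sketched as follows. The dictionary `BEATS` and the function `winner()` are hypothetical names; the input loop and the score keeping are left out.

```python
import random

CHOICES = ("rock", "paper", "scissors")

# each key beats the value it maps to
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def winner(player, computer):
    """return "player", "computer", or "tie" for one round"""
    if player == computer:
        return "tie"
    if BEATS[player] == computer:
        return "player"
    return "computer"


computer = random.choice(CHOICES)
print("computer plays", computer, "->", winner("rock", computer))
```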

## Q12: Pascal's triangle

Pascal's triangle is built so that each row has one more element than the previous one.  The numbers on the two sides are `1`s, and each interior number is the sum of the two numbers above it, e.g.:
```
            1
          1   1
        1   2   1
      1   3   3   1
    1   4   6   4   1
  1   5   10  10  5   1
```

1. Write a function to return the first `n` rows of Pascal's triangle.  The return should be a list of length `n`, with each element itself a list containing the numbers for that row.
2. Write a function to print out Pascal's triangle with proper formatting, so the numbers in each row are centered between the ones in the row above
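
A sketch of both parts, with hypothetical function names. The formatting simply centers each row within the width of the last row, which lines things up well for small triangles but is only approximate once the numbers have several digits.

```python
def pascal_rows(n):
    """return the first n rows of Pascal's triangle as a list of lists"""
    rows = []
    for i in range(n):
        row = [1] * (i + 1)          # start with all 1s; the sides stay 1
        for j in range(1, i):
            # interior entries are the sum of the two entries above
            row[j] = rows[i - 1][j - 1] + rows[i - 1][j]
        rows.append(row)
    return rows


def format_pascal(rows):
    """return a string with each row centered on the width of the last row"""
    lines = ["   ".join(str(x) for x in row) for row in rows]
    width = len(lines[-1]) if lines else 0
    return "\n".join(line.center(width).rstrip() for line in lines)


print(format_pascal(pascal_rows(6)))
```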

## Q13: Calendar events

We want to keep a schedule of events.  We will do this by creating a class called `Day`.  It is sketched out below.  A `Day` holds a list of events and has methods that allow you to add and delete events.  Our events will be instances of a class `Event`, which holds the time, location, and description of the event.

Finally, we can keep track of a list of all the `Day`s for which we have events to make our schedule.

Fill in these classes and write some code to demonstrate their use:

  * Create a full week of days in your calendar
  * Add an event every day at noon called "lunch"
  * Randomly add some other events to fill out your calendar
  * Write some code that tells you the start time of your first meeting and the end time of your last meeting (the time between them is the length of your work day)

In [225]:
class Day:
    """a single day keeping track of the events scheduled"""
    def __init__(self, month, day, year):
        # store the month, day, and year as data in the class
        
        # keep track of the events
        self.events = []
    
    def add_event(self, name, time=None, location=None):
        pass
    
    def delete_event(self, name):
        pass
    
    
class Event:
    """a single event in our calendar"""
    def __init__(self, name, time=9, location=None, duration=1):
        self.name = name
        self.time = time
        self.location = location
        self.duration = duration
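
One way these classes might be filled in is sketched below. Several choices are assumptions rather than part of the exercise: times and durations are taken to be in hours, `add_event()` gains a `duration` argument and defaults to `time=9` like `Event`, and the helper `work_day()` and the date used are made up for this example.

```python
class Event:
    """a single event in our calendar (time and duration assumed to be in hours)"""
    def __init__(self, name, time=9, location=None, duration=1):
        self.name = name
        self.time = time
        self.location = location
        self.duration = duration


class Day:
    """a single day keeping track of the events scheduled"""
    def __init__(self, month, day, year):
        self.month = month
        self.day = day
        self.year = year
        self.events = []

    def add_event(self, name, time=9, location=None, duration=1):
        self.events.append(Event(name, time, location, duration))

    def delete_event(self, name):
        # keep every event whose name does not match
        self.events = [e for e in self.events if e.name != name]

    def work_day(self):
        """return (first start time, last end time), or None if the day is empty"""
        if not self.events:
            return None
        start = min(e.time for e in self.events)
        end = max(e.time + e.duration for e in self.events)
        return start, end


day = Day(3, 14, 2024)   # arbitrary date
day.add_event("lunch", time=12)
day.add_event("group meeting", time=10, duration=2)
day.add_event("seminar", time=16, duration=1.5)
print(day.work_day())
```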