# Introduction to programming with Python

## Python brief history 

Created in 1989 by Guido van Rossum, a Monty Python fan (hence the name and the jokes!).

+ Currently there are two versions available
  + 2.7.x (2010)
  + 3.7.x (2018)

+ Python 2.7 will be supported until 2020, but users are
  encouraged to move to Python 3 as soon as possible 

## The Zen of Python, by Tim Peters

Beautiful is better than ugly.  
Explicit is better than implicit.  
Simple is better than complex.  
Complex is better than complicated.  
Flat is better than nested.  
Sparse is better than dense.  
Readability counts.  
Special cases aren't special enough to break the rules.  
Although practicality beats purity.  
Errors should never pass silently.  
Unless explicitly silenced.  
In the face of ambiguity, refuse the temptation to guess.  
There should be one-- and preferably only one --obvious way to do it.  
Although that way may not be obvious at first unless you're Dutch.  
Now is better than never.  
Although never is often better than *right* now.  
If the implementation is hard to explain, it's a bad idea.  
If the implementation is easy to explain, it may be a good idea.  
Namespaces are one honking great idea -- let's do more of those!  

## Working with Jupyter notebook 

+ Edit cells
+ Command completion and help
+ Type of cells
+ Executing cells (ctrl+enter, shift+enter)
+ Command mode vs Edit mode (esc, enter)
+ Inserting new cells above and below (a,b)
+ Deleting cells (d d)
+ Saving

## Data types and Variables

Variables in Python hide what is inside of them. It is _extremely_ important that, whenever you are working with variables, you have in mind what TYPE of thing the variable holds. One way to help yourself is to use helpful names with variables (hint: don't use "x"!).

You can check the type of a variable with the built-in "type" function: 

```python
x = 5
type(x)
```

Use this to check the type of all the variables in the cells below (NOTE: Jupyter will automatically display the value of the last expression evaluated in the cell): 

In [None]:
# Numbers

x = 3
y = 3.2

In [None]:
# Strings are written between double quotes ("") or single quotes ('')

x = "om" 
y = 'om'

In [None]:
# Boolean

x = True  
x = False 

In [None]:
# Null value in Python is a NoneType, written as "None"

x = None

## Basic operations with numerical data



```python
# Assignment
a = 10      # 10

# Increment/Decrement
a += 1      # 11
a -= 1      # 10

# Operations
b = a + 1   # 11
c = a - 1   # 9

d = a * 2   # 20
e = a / 2   # 5.0 (division always returns a float)
f = a % 3   # 1   (modulo or integer remainder) 
g = a ** 2  # 100 (a to the power of 2)

# Operations with other variables
d = a + b   # 21
```

## Basic operations with strings


You can concatenate strings together with the + operator: 

```python
"Hello" + " " + "World"
```

The built-in function "len" can be used to find the length of a string: 

```python
len("foo")
```

## Printing

It can be useful to "print" what we are doing to the screen. This can be done with the built-in "print" function.

You might have noticed that a Jupyter notebook automatically displays the value of the last expression in a cell when you execute it, so you don't need to print that!

In [None]:
# Printing - note how Jupyter automatically prints "y" to "out", 
# but we need to manually print "x" if we want to see it!

x = 15 / 2
print(x)
y = x > 2
y

## Logical operations

We can check the relationships between different types of data in Python.

The output of such a comparison is a boolean value.

Let's see some examples:

In [None]:
# Comparing numbers

x = 1 >= 2 
y = 1 == 2 
w = 1 != 2

In [None]:
# It is canonical to use "is" instead of == for checking for NoneType:

x = None
y = x is None
z = x is not None

In [None]:
# There is a boolean algebra to combine comparisons (and/or)
# Note: parentheses not needed here, but they help readability!


x = (1 <= 2) and (1 > 0)
y = (1 > 2) or (1 < 3)

## Indentation 

Most languages don’t care about indentation.

Most humans do. We tend to group similar things together. 

Python encourages “readable” code by enforcing indentation.
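
As a minimal sketch of what enforcing indentation means in practice: the indented lines below form the body of the `if`, while the line back at the left margin runs regardless. The variable names are illustrative and not part of the original notebook.

```python
# temperature and advice are made-up names for this sketch
temperature = 30
advice = "none"

if temperature > 25:
    print("It is hot")       # indented: part of the if block
    advice = "drink water"   # indented: also part of the if block

print("Advice:", advice)     # not indented: runs whatever the temperature is
```

Removing the indentation from one of the two lines inside the block would change which lines belong to the `if`, so the layout of the code is part of its meaning.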

## Making decisions on the basis of comparisons ("control flow"): if 

The structure is 

```python
if BOOLEAN: 
    ACTION 1
    ACTION 2 # note indentation! 
else: 
    ACTION 3
```

For example: 

```python
gender = "male"
age = 20 
if gender == "female":
    if age > 18:
        print("woman")
    else: 
        print("girl")
```

What will it print???

In [None]:
# Practise control flow: 
# write some code that prints "high" if
# the number in x is greater than 5, and "low"
# if the number is less than 5

x = 5

# your code here! Hint: use if/else and print.

## Looping 





<table><tr>
<td>  **This is the process of repeating a set of operations *when an index varies within a set*! Within the loop the data used in the operations can change** </td>
<td> <img src="dullboy.jpg"> </td>
</tr></table>


Looping is fundamental in Python! Let's see examples. 

## Lists

In order to loop, we need something to loop over! In Python, things that can be looped over are called "iterables". 

Whenever you want to loop in Python, think of everything that needs to change inside the loop and try to put that into an iterable. This might be different than you are used to in other languages!

One of the simplest iterables in Python is a list. Lists are created with square brackets: 

```python
my_list = ["She", "turned", "me", "into", "a", "newt"]
```

Here you can see we created a list of strings. We can also create a list of integers: 

```python
ages = [1,5,10]
```

But Python lists don't have to be homogeneous: you can mix types! This is most useful for including the NoneType: 

```python
ages = [1, None, 10]
```

In [None]:
# A list!
ages = [1, 2, 10, None, 100]

# A loop! (note the indentation)
for x in ages:
    print("This person's age is: ", x)
print("Done")

In [None]:
# Challenge: repeat the above cell, but only print the age if the
# age exists (i.e. is not None). HINT: Look back at the cells on
# logical operations to see how to check if a value is None in Python


# Your code here

## Using Lists


Sometimes you want to access individual elements from a list. You can do this using square brackets together with the "index" of the element: 

```python
ages = [1,5,10,20,30]
ages[0]
```

The first element is indexed at 0, the second element at 1, etc. 

You can also access a contiguous range of elements: 

```python
ages[1:3] # second item (index 1) and third item (index 2) only!
```

You can also use negative indices to access items from the end. For example, the last item: 

```python
ages[-1]
```

You can concatenate multiple lists together with the +: 

```python
ages + [40, 50, 60]
```

And you can check for membership with "in": 

```python
"foo" in ["foo", "bar", "baz"]
```

## Operations on Lists

In data science, we deal with data! Data, being the plural of datum, are often stored in lists (or list-like structures). 

There are three main operations we perform with lists: 

1. Aggregate the elements (reduce)
2. Apply a function to each element (map)
3. Filter the elements

Let's look at examples to understand what these terms mean.

In [None]:
# Aggregation: 
# Summing the numbers in a list: 

nums = [30,1,4,3,10.5,100]

total = 0 

for num in nums:
    total += num
    
total

In [None]:
# Aggregation: 
# Finding the minimum number in a list of numbers: 

nums = [30,1,4,3,10.5,100]

min_num = nums[0]  # start from the first element

for num in nums: 
    if num < min_num:
        min_num = num
    
min_num

In [None]:
# Challenge!

# Aggregation: 
# Counting the number of NoneTypes in a list: 

nums = [30, None, 4, 3, None, 10.5, 100]

total = 0 

for num in nums:
    # Your code here
    
total

In [None]:
# Applying a function: 
# Squaring each number in a list

nums = [30,1,4,3,10.5,100]

# This is called a "list comprehension"
# and is the Pythonic way to apply a function to 
# every element in a list
squared_nums = [num**2 for num in nums]
    
squared_nums

In [None]:
# Applying a function: 
# Getting the length of each string in a list: 

names = ["foo", "bar", "baz", "foobarbaz"]

# Note the "len" command to get the length of a string. 
# Hint: this same command can be used to get the length of a list!
lengths = [len(name) for name in names]

lengths

In [None]:
# Filter:
# Remove all values less than 18:

ages = [0, 3, 21, 45, 10, 97]

adults = [a for a in ages if a > 17]

adults

In [None]:
# Filter:
# Remove NoneTypes from a list: 

names = ["foo", "bar", None, "baz"]

only_names = [name for name in names if name is not None]

only_names
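
The loops above spell out each step on purpose. As a hedged aside, Python's built-in functions `sum`, `min` and `len` cover the most common aggregations, and a single comprehension can filter and apply a function in one pass. The sketch reuses the numbers from the earlier cells; the name `big_squares` is illustrative.

```python
nums = [30, 1, 4, 3, 10.5, 100]

total = sum(nums)        # aggregate: same result as the summing loop
smallest = min(nums)     # aggregate: same result as the minimum loop
count = len(nums)        # aggregate: number of elements

# Filter and map in a single comprehension:
# keep the numbers greater than 3, then square them
big_squares = [num**2 for num in nums if num > 3]

print(total, smallest, count)
print(big_squares)
```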

## Tuples

Another iterable is called a "tuple". Rather than using square brackets, tuples are created with parentheses: 

```python
x = ("foo", 1)
```

But they can also be created without any parentheses, implied by the comma: 

```python
x = "foo", 1
```

Elements in the tuple are also accessed via the index (like lists): 

```python
x[0]
```

Lists can be used in most places where a tuple is used, so the difference between the two can be confusing. Besides technical differences that we won't go into here, the following rules can help you decide when to use a tuple and when to use a list: 

* LIST: Potentially many elements, unknown number of elements, relatively homogeneous elements.
* TUPLE: Few elements, fixed number of elements, completely heterogeneous elements.


The name comes from here: _double, triple, quadruple, quintuple, sextuple, septuple, octuple._ Which gives a hint that they should be of fixed length! Because of this, we rarely iterate over them in a for loop like lists. 

Because they have a fixed length, we often use them with destructuring: 

```python
name,num = x
```
Now the variable "name" contains the value "foo" and the variable "num" contains the value 1. This may not seem particularly useful at the moment, but we will soon see how it can be used. 

In [None]:
# Destructuring tuples in a for loop:

# Note: a list of tuples is a useful data structure 
# when your data is a set of "pairs":

scoreboard = [("om", 100), ("nandan", 10000), ("arapakis", 55)]

for name,score in scoreboard: 
    print(f"{name} has scored {score} points") # string interpolation with f""!

In [None]:
# Aggregating a list of tuples: 

# Challenge:
# Return the name of the top scoring teacher.
# Hint: this is an aggregation!

scoreboard = [("om", 100), ("nandan", 10000), ("arapakis", 55)]

# Your code here

## Dictionaries


We saw that it can be great to put our data into a tuple if it is easily represented as a pair (or a triple, quadruple, etc.). But sometimes our data is more complicated than that, and we don't want to try and remember the "order" of each distinct part (as we need in a tuple). 

Dictionaries are another basic type in Python. 

They are "associative" data structures. Like the eponymous dictionary, they associate a KEY with a VALUE and are created with the {}: 

```python
teacher = {"name": "nandan", "score": 10000}
```

You can access the value via the key:

```python
teacher["name"]
```

You can also set a value in a similar way: 

```python
teacher["name"] = "nandan rao"
```

Note that each key can ONLY HAVE ONE VALUE. In the above example, I have overwritten the original "name" key with a new value.
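
A short sketch of the points above, using the same `teacher` dictionary: setting an existing key replaces its value, setting a new key adds an entry, and `in` checks whether a key exists. The `"likes"` and `"age"` keys are only illustrations.

```python
teacher = {"name": "nandan", "score": 10000}

teacher["score"] = 10001          # existing key: the old value is replaced
teacher["likes"] = ["ice cream"]  # new key: a new entry is added

print(len(teacher))               # number of keys
print("name" in teacher)          # membership checks the keys
print("age" in teacher)

# Looping over key/value pairs, destructuring each pair as with tuples
for key, value in teacher.items():
    print(key, "->", value)
```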

In [None]:
# Challenge: 
# Collect all the likes of the teachers into one list: 
# Hint: this is an aggregation!

# Notice: What is "teachers"? 
# A list of dictionaries, but each dictionary has three keys 
# and the "likes" key contains a list of strings!
teachers = [{"name": "om", "score": 100, "likes": ["statistics", "more statistics", "even more statistics"]},
            {"name": "nandan", "score": 10000, "likes": ["ice cream"]},
            {"name": "arapakis", "score": 55, "likes": ["R", "D3"]}]

# Your code here

## Advanced operations with strings

Strings are actually iterables, just like lists! They can be sliced just like lists: 

```python
x = "my python string"
x[3:9]
```

You can also turn a string into a list of strings via the "split" method: 

```python
x = "my python string"
y = x.split(" ")
y == ["my", "python", "string"]
```

And the reverse is also possible via the "join" method: 

```python
space = " "
space.join(y)
space.join(y) == x
```


You can also make everything lower (or upper!) case, replace certain substrings with other substrings, and check for the existence of a substring with "in":  

```python
z = "My Python String"
z.lower() == x
z.upper() == "MY PYTHON STRING"
z.replace("Python", "R")
"Python" in z
```
There are many more easy-to-use, built-in tools for working with text data in Python. You can read more here: https://docs.python.org/3/library/stdtypes.html#string-method

In [None]:
# Challenge: 
# Count the instances of "foo" in the following text, 
# ignoring case:

x = "Hello, I would like a foo. Foo went for a walk. Foo bar baz. Baz to the foo."

# Your code here

## Data and operations in a bundle: functions

Input data (if any) --> Set of operations --> Output data (if any)

```python
def name(input):
    operations
    return output
```

For example, here is a function that takes a number and returns its square:

```python
def squared(x):
    return x**2
```

Here is a more general function that takes a number and the power, and returns the number to that power: 

```python
def power(x, n):
    return x**n
```

Here is a function that returns the minimum and the sum of a list of numbers:

```python
def minsumfun(x):
    minx = x[0] if len(x) > 0 else None
    sumx = 0.0  
    for y in x: 
        if y < minx:
            minx = y
        sumx += y 
    return minx,sumx # notice multiple outputs, technically a tuple!
```

Now we can call that function: 

```python
m,s = minsumfun([1,5,0.3,-1]) # Destructuring! Very Pythonic! :)
```
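
The "if any" in the diagram above matters: a function needs neither inputs nor a `return`. As a small sketch (the function names `greet` and `describe` are made up for illustration), a function without a `return` statement gives back `None`:

```python
def greet():
    # No inputs, and no return statement
    print("Hello!")

def describe(x):
    # One input, one output
    return f"{x} squared is {x**2}"

result = greet()          # prints "Hello!"; result is None
text = describe(4)

print(result is None)
print(text)
```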

In [None]:
# Functions are very useful for transforming data:
# Let's say we want the highest scores for each teacher: 

teachers = [{"name": "om", "scores": [100, 200, 150]},
            {"name": "nandan", "scores": [10000, 9999, 99987 ]},
            {"name": "arapakis", "scores": [55, 100, 5]}]



# Fill in this function! 
# It should take a dictionary, as in the list above, 
# and it should return a 2-tuple with their name and
# their highest score: (name, score)
def highest_score(person):
    # Your code here



# Apply the function to each element in the list "teachers": 

# Your code here


## Reading and writing files

We can actually think of files as data types! Python effectively does this, so you should not be surprised that they have attributes, methods, etc.

There are two main modes: read and write.

In short, the built-in `open` function creates a Python file object, which serves as a link
to a file residing on your machine. After calling `open`, you can transfer strings of data
to and from the associated external file by calling the returned file object’s methods.

`with` is a file context manager which allows us to wrap file-processing code in a logic layer that
ensures that the file will be closed automatically on exit.

To understand how Python works with such data, let's work with the text file *textfile.txt*.


In [None]:
# Let's read the lines in the file
with open("textfile.txt","r") as f:
    content = f.readlines()
print(content)
print(content[0])

In [None]:
# Since we have a list, we can resort to the usual tricks, e.g. 
with open("textfile.txt","r") as f:
    linesf = f.readlines()
    for line in linesf: 
        print(line)

In [None]:
# We can also read single lines

with open("textfile.txt","r") as f:
    line1 = f.readline()
    print(line1)
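
The cells above only read. Writing follows the same pattern with mode `"w"`, which creates the file or overwrites it if it already exists. The sketch below writes to a temporary directory so that it does not touch *textfile.txt*; the file name `notes.txt` is a placeholder.

```python
import os
import tempfile

# Work in a throwaway directory so no existing file is overwritten
folder = tempfile.mkdtemp()
path = os.path.join(folder, "notes.txt")   # hypothetical file name

with open(path, "w") as f:
    f.write("first line\n")    # write() does not add a newline for you
    f.write("second line\n")

# Read the file back to check what was written
with open(path, "r") as f:
    lines = f.readlines()

print(lines)
```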

## More advanced concepts: behind the scenes of value assignment

This is again a deep concept that connects to the notion of *memory pointers*. A working knowledge of it, however, is critical to avoid creating an unintentional mess!

The best way to understand what this is about is with the following example. What do you think will happen to objects a and b after these operations? 

 

In [None]:
a = [1,2,5]
b = a
b[2] = 10

## Copy vs assignment

What happens is that really a and b are *pointers* to the same address in the memory and share the same data. 

The way to create an object that will *copy* the data in a but not *share* the data with a is to do a ... copy! 

```python
b = a.copy()
```
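
One way to see the difference, sketched below, is the `is` operator, which checks whether two names refer to the very same object, whereas `==` only compares the contents. The name `c` is introduced here for the copy.

```python
a = [1, 2, 5]
b = a            # assignment: b is another name for the same list
c = a.copy()     # copy: c is a new list with the same contents

print(b is a)    # same object
print(c is a)    # different object...
print(c == a)    # ...with equal contents

c[2] = 10        # changing the copy leaves a untouched
b[2] = 99        # changing b changes a as well
print(a, b, c)
```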

Things become a little trickier (although it does make perfect sense!) when you deal with lists of lists; for this reason there is also the deepcopy. Try at home what happens with the following example

In [None]:
a = [1,[2,3],5,"om"] 
b = a.copy()
b[1].append(100)
## Print a, b and be surprised!

In [None]:
## Try now 
from copy import deepcopy
b = deepcopy(a)
b[1].pop()
print(b)
print(a)

# Python and Object Oriented Programming (OOP)


This is a topic that can get very deep but a working knowledge is *critical* to start benefiting from Python's possibilities. 

Key ideas like classes, instances and inheritance relate to this.

## More advanced concepts: classes, instances and inheritance



The short, precise but incomprehensible story: classes are factories that produce instances.

The longer, looser but workable story: a class is a vehicle to define a particular "class" of data together with predefined ways of interacting with it (accessing and modifying it).

The ability to carry attributes is what makes a class distinct from a function, the other main tool in Python to work with data and operations. 

In fact, functions can become attributes of a class, and we have been using this all along! These are called methods.  
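
For instance, the string and list operations used earlier in this notebook are methods: functions attached to the object's class and called with the dot notation. A minimal sketch, with made-up variable names:

```python
word = "python"
nums = [3, 1, 2]

shout = word.upper()      # a method of the str class
nums.append(10)           # a method of the list class (changes nums in place)

# The same function can be reached through the class itself,
# passing the instance explicitly as the first argument
shout_again = str.upper(word)

print(shout, shout_again, nums)
```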


In [None]:
## Example of class: 

class Shape2d:
    def __init__(self,x,y): # this is a way to pass data into the 
                            # instances of this class
        self.x = x          # it is a function!
        self.y = y
    
    def area(self): 
        return self.x*self.y
    
    def perimeter(self):
        return 2 * self.x + 2 * self.y


In [None]:
# Instantiating the class

my2dshape = Shape2d(3,4)

In [None]:
# See the methods at work! recall TAB!

my2dshape.area()

In [None]:
# Example of inheritance

class Shape3d(Shape2d):
    def __init__(self,x,y,z):
        self.z = z
        self.x = x
        self.y = y
        
    def volume(self):
        return self.area()*self.z
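
To see inheritance at work, the sketch below repeats the two classes in shortened form and instantiates the child. It also uses `super().__init__`, a common alternative to setting `x` and `y` again by hand; this is a variation on the cell above, and the instance name `box` is illustrative.

```python
class Shape2d:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def area(self):
        return self.x * self.y


class Shape3d(Shape2d):
    def __init__(self, x, y, z):
        super().__init__(x, y)   # let the parent class store x and y
        self.z = z

    def volume(self):
        return self.area() * self.z   # area() is inherited from Shape2d


box = Shape3d(3, 4, 5)
print(box.area())                  # method defined in the parent
print(box.volume())                # method defined in the child
print(isinstance(box, Shape2d))    # a Shape3d is also a Shape2d
```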

## More advanced concepts: modules and imports

*Modules* are Python files, recognised by the computer as filename.py.

Data and methods (functions) defined in the module can become part of Python's *namespace* by using *import*.

To appreciate what the namespace contains, let's experiment with the following:

In [None]:
x = sin(5)

In [None]:
from math import sin
x = sin(5)
print(x)

In [None]:
sin = 3
print(sin(3))

## Typical import structures

```python
from math import sin # imports a single function

from math import sin as sinus # nickname, this is useful when the function imported has long and complicated name


import math # this imports the module in the name space, methods can then be accessed e.g.
math.sin(3)

from math import * #imports all methods in the namespace, not recommended!

```

In [None]:
# Another import example: importing my own functions

from omsuselessfunctions import themostuselessfunctionever as f1

f1()


## More advanced concepts: default values and variable number of arguments in functions

In Python we get to assign default values to inputs of functions. For example 

```python
def f(a=1, b=2):
    return a+b

# This can validly be called in the following ways (guess the answers!)

f()
f(10)
f(b=4)
f(10,4)
f(a=10,b=4)
f(b=4,a=10)

# but NOT like this!!! A positional argument cannot follow a keyword argument:
# f(a=10,4)  # SyntaxError
```

## Variable number of arguments

In intermediate programming with Python, there are cases where we want a function to handle a number of arguments that is not specified in advance.

An example: I want to write a function that returns the maximum of a set of numbers, that will be passed on as arguments, and will work regardless of how many numbers are passed on. 

I would also like the function to default to Inf if no number is passed on. 

In [None]:
# Function for maximum of a variable number of inputs

def maxmany(x=float("inf"),*extra): # this specification makes 
    #                               extra a tuple!!
    runmax = x
    for y in extra: 
        if runmax<y:
            runmax = y
    return runmax

# What is going on behind the scenes is that * packs the extra arguments into a tuple!!
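
A usage sketch, repeating the function so the example runs on its own. In the definition, `*extra` collects any additional positional arguments into a tuple; at the call site, `*` does the reverse and spreads a list or tuple into separate arguments. The list `scores` is made up for illustration.

```python
def maxmany(x=float("inf"), *extra):
    runmax = x
    for y in extra:
        if runmax < y:
            runmax = y
    return runmax


print(maxmany(1, 5, 3))     # x=1, extra=(5, 3)
print(maxmany(7))           # x=7, extra=()
print(maxmany())            # nothing passed: falls back to the default

scores = [2, 9, 4]
print(maxmany(*scores))     # same as maxmany(2, 9, 4)
```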

## Passing on dictionaries as inputs

The previous idea can be taken a step further by passing on dictionaries as inputs. 

Dictionaries are unpacked by ** 

This is particularly useful since often we like to specify parameters in a function and refer to them with intuitive names. 
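
A minimal sketch of the mechanics (the functions `collect` and `label` and their parameters are invented for illustration): in a definition, `**params` gathers keyword arguments into a dictionary; in a call, `**` spreads a dictionary out into keyword arguments.

```python
def collect(**params):
    # params is an ordinary dictionary of the keyword arguments
    return params

def label(mu, sigmasq):
    return f"mean {mu}, variance {sigmasq}"

settings = collect(mu=0, sigmasq=1)
print(settings)

# Spreading the dictionary: same as label(mu=0, sigmasq=1)
print(label(**settings))
```

The dictionary keys must match the parameter names of the function being called, otherwise Python raises a `TypeError`.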

Consider the following example (in which we also use some advanced Python constructs, such as *decorators*) 


In [None]:
from math import log
from abc import ABC, abstractmethod


class Density(ABC):
    
    # Destructuring in parameter declarations
    def __init__(self, **params):
        self.params = params
    
    @abstractmethod
    def log_density(self, some_parameter):
        pass       
    
    def dens(self, x):
        
        # Destructuring in function invocation - the opposite!
        return self.log_density(x, **self.params)
    
    
class Gaussian(Density):    
    def log_density(self, x, mu, sigmasq):
        return -((x-mu)**2)/(2*sigmasq)       
    
class Exponential(Density):
    def log_density(self, x, theta):
        return -theta*x
    
class Bernoulli(Density):
    def log_density(self, x, p):
        return x*log(p)+(1-x)*log(1-p)
    
models = [
    Bernoulli(p = .2),
    Gaussian(mu = 0, sigmasq = 1),
    Exponential(theta = .1)
]

# This is polymorphism! They are different classes, but they have
# the same INTERFACE. This means that at this point in the code,
# we don't need to know how the different models were parameterized, 
# we call the same method on all!

for m in models:
    print(m.dens(.5))

## Programming project: basic text analytics

Open the file "textfile.txt", which contains a passage copied from the book "Learning Python". 

The task is to read the text into Python and do some basic text analysis. In particular, write Python code that:

+ counts how many sentences there are in the text
+ counts how many words there are in the text
+ finds all the different words in the text and for each computes the frequency of its appearance and stores the output of this analysis in a convenient way so that it is easy later to find out how often a given word appears

As a first step, and as a simpler exercise, do the above for a single paragraph of the text. Do this first and if you do not manage to finish the larger project submit this as your solution. 

