# A quick introduction to Python 3 for CSC course takers

## ToDo: add images before/after cells where student should write their own code or experiment

A word of warning to start with: this tutorial covers the newer version of Python, Python 3. Version 2.7 is still very widely used. Where the two versions differ, the difference is marked with *Version 2.7 works differently*.

It is a cliche that every programming tutorial starts with a hello world, and this one is no exception.


In [1]:
print("Hello World!")

Hello World!


Try it yourself!

Write a similar print statement in the field below and press the button that looks like play (in the toolbar) to evaluate the code you have just written.
![Play button](//csc-it-center-for-science.github.io/python-data/assets/img/play.png "play")

The name Python comes from the comedy group Monty Python and it's customary for programming examples to contain references to Monty Python sketches. If you can recall any, feel free to use one as your example string.

If you don't like Monty Python then we apologize for your poor sense of humor.

Throughout this document we'll mark down places to type your own content with ![Python image](//csc-it-center-for-science.github.io/python-data/assets/img/python-50px.png "python")

In [None]:
# write your answer here and press the play-like button in the bar


What you just witnessed was a call to the built-in function *print()* with a string parameter of your choosing. *Version 2.7 works differently, print is not a function there*

For the built-in functions see [this list](https://docs.python.org/3/library/functions.html). You'll run into them several times in this notebook.

In Python strings can be denoted with single ' or double " quotes. The suggested way is to pick one and stick to it.

**Comparison** is done as in most other languages

    "example" == 'example'
    2 <= 5
    object() is not None

**None** is the Python equivalent of null: a special object that stands for the absence of a value. The keyword *is* should be used to compare object identity and the double equals (==) to compare values.
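
The difference between identity and value is easiest to see with two lists that have the same contents. The following is a minimal sketch; the variable names are made up for this illustration.

```python
# Two separate lists with the same contents, plus a second name for the first list.
list_a = [1, 2, 3]
list_b = [1, 2, 3]
same_list = list_a

print(list_a == list_b)     # True: the values are equal
print(list_a is list_b)     # False: they are two distinct objects
print(list_a is same_list)  # True: both names refer to the same object

result = None
print(result is None)       # True: the usual way to test for None
```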

Let's try those out

![Python image](//csc-it-center-for-science.github.io/python-data/assets/img/python-50px.png "python")

In [13]:
print("example" != 'example')
print(2 <= 5)
print("string" is not None)
# comments are prepended by a #, 
# there is no block comment syntax

False
True
True


Variable assignment uses a single **=**

Variables *don't* need to be declared. This is convenient but means that you have to be wary of typos in variable names (or use a linting tool).

Try it out

![Python image](//csc-it-center-for-science.github.io/python-data/assets/img/python-50px.png "python")

In [58]:
first = 5
second = 3
print(first + second)


8


All variables have a type, even though you don't declare it. Python is **strongly typed** so it will not automatically convert between unrelated types such as strings and integers.

You can check the type of a variable using the [built-in](https://docs.python.org/3/library/functions.html) *type* function

Try out the following

![Python image](//csc-it-center-for-science.github.io/python-data/assets/img/python-50px.png "python")

In [20]:
str_ = "This is example number"
int_ = 5
print(type(str_))
print(type(int_))

print(str_ + int_) # this will fail


<class 'str'>
<class 'int'>


TypeError: Can't convert 'int' object to str implicitly

It produces an error because *string* and *int* have different types.

Note how the variables str_ and int_ have an _ appended to them to avoid redefining [built-ins](https://docs.python.org/3/library/functions.html) str and int, respectively. Python would let you redefine those but it's a good convention to avoid doing so.

The *str()* function converts an integer to a string, and two strings can be concatenated using +.

Go ahead and fix the above error so that the print function will succeed.

### Program structure

Python is _whitespace_ aware, i.e. code blocks are indented using **whitespace** (preferably spaces) and code indented to the same level belongs to the same block. There are **no curly braces** around blocks. Parentheses are used to denote operator precedence in a statement in case there is risk of confusion.

A colon (:) at the end of the previous line indicates that the next line will begin a new code block. 

A block must never be empty. If you must have a block but don't want to execute anything the *pass* statement is used.

An example of this is the if-elif-else statement

In [22]:
var = 5
if var < 5:
    print("var is less than 5")
elif var == 5:
    print("var is precisely 5")
elif var == 6:
    pass # having no statement here would be an error
else:
    print("var is more than 6")


var is precisely 5


### Task: flooring division

Some programming languages use floor integer division, i.e. 

    (3 / 2) == 1 

and others convert the output to a floating point, i.e. 
    
    (3 / 2) == 1.5

Construct an if-else statement that prints out either "Python 3 uses floor integer division" 
or "Python 3 converts to a float" to test how Python 3 does things. *Version 2.7 works differently*

It's often easier to just test what something does in a Python shell or notebook than to read through the documentation.
                                                                    
![Python image](//csc-it-center-for-science.github.io/python-data/assets/img/python-50px.png "python")

In [None]:
#write your code here

## Functions

The core of any programming language is a **function**. A simple example of a function is 

    def add(a, b):
        """ adds two arguments together.
        """
        return a + b
        
The first line defines (*def*) a function called *add* that takes in two parameters, a and b.

The text inside triple quotes is a docstring that documents what the function does. There are several docstring formatting conventions but those are outside the scope of this tutorial. However, it is always a good idea to write down at least a sentence or two about what you think the function does.

Functions can have arguments and keyword arguments. Arguments are compulsory, while keyword arguments have a default value and can be left out of the call. In the function definition all the arguments must come before the keyword arguments.

    def add(a, b=1):
        """ adds two arguments together or 1 if called with one parameter
        """
        return a + b
        
Note how the first add-function could be called with strings or integers but the second one can be called with two strings, two integers or one integer but not with one string (it would result in a TypeError). Yet we're not explicitly testing for the types before using the +-operator. The Python philosophy is that it's easier to ask for forgiveness than permission. This philosophy is often abbreviated as EAFP.
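
To make the calling rules concrete, here is a small sketch that calls the second *add* in the ways described above. The try/except syntax used at the end is covered later in this notebook; here it only keeps the failing call from stopping the cell.

```python
def add(a, b=1):
    """ adds two arguments together or 1 if called with one parameter
    """
    return a + b

print(add(2, 3))            # 5
print(add(2))               # 3, b falls back to its default value
print(add(b=10, a=2))       # 12, arguments passed by name can come in any order
print(add("spam", "eggs"))  # spameggs

try:
    add("spam")             # "spam" + 1 is not defined
except TypeError as error:
    print("TypeError: " + str(error))
```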

In Python functions are first-class citizens. That is to say that you can pass a function as the parameter of another function. Like this:
        

In [3]:
def apply_(fun, arg1, arg2): # Python 2 had a builtin called apply, hence the underscore
    print(fun)
    print(arg1)
    print(arg2)
    return fun(arg1, arg2) # *args and **kwargs are a handy way to collect all the arguments
    
def subtract(a, b):
    return a - b

apply_(subtract, 5, 3)

<function subtract at 0x104671268>
5
3


2

The apply_-function isn't very generic because it can only handle the two-variable case.

Python has a handy notation for referring to all the arguments and keyword arguments.

In [33]:
def apply_(fun, *args, **kwargs):
    print(fun)
    print(args)
    print(kwargs)
    return fun(*args, **kwargs)
    

def add(*args):
    return sum(args) #sum is another builtin function

apply_(add, 1, 2, 3, 4, 5)

<function add at 0x104591ea0>
(1, 2, 3, 4, 5)
{}


15
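
The call above leaves *kwargs* empty. As a rough sketch of the keyword side, the hypothetical helper *call_with* below (a quiet variant of apply_, not part of the original notebook) passes a keyword argument through to a made-up *greet* function.

```python
def call_with(fun, *args, **kwargs):
    # hypothetical helper: same idea as apply_, without the printing
    return fun(*args, **kwargs)

def greet(name, greeting="Hello"):
    return greeting + ", " + name + "!"

print(call_with(greet, "Brian"))                      # Hello, Brian!
print(call_with(greet, "Brian", greeting="Welcome"))  # Welcome, Brian!

# The same stars unpack an existing tuple or dict at the call site.
arguments = ("Brian",)
options = {"greeting": "Goodbye"}
print(greet(*arguments, **options))                   # Goodbye, Brian!
```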

## Task: simple function
Okay, do it yourself: write a function that subtracts the second parameter from the first one.
    
Then call it with two parameters and make sure the outcome works out to what you'd expect it to.

![Python image](//csc-it-center-for-science.github.io/python-data/assets/img/python-50px.png "python")

## Task: no return
In Python all functions have a return value, even if you don't use the *return* keyword. Consequently it's often easy to mess up by forgetting to return a real value.

Use the cell below to try out what a Python function without a *return* statement returns. Remember that you can create a code-block that does nothing with the keyword *pass*. You can for instance print the return value of a function that doesn't return anything.

![Python image](//csc-it-center-for-science.github.io/python-data/assets/img/python-50px.png "python")

## Task: recursion

Write a recursive function that computes the factorial for a number. Remember that there is a special place in hell for programmers who don't check for the end condition in recursion.

![Python image](//csc-it-center-for-science.github.io/python-data/assets/img/python-50px.png "python")



## Extra fun: Recursion depth

Python can be used as a functional language but it doesn't optimize tail recursion, so every recursive call takes up a new stack frame. Therefore there is such a thing as a maximum recursion depth.

Use the following count-function to experiment with how deep down the rabbit hole you can go before exceptions occur.

Food for thought: why might the factorial program be a bad candidate for testing maximum recursion depth? Try out if you can overflow the Python integer by counting a large(ish) factorial.

![Python image](//csc-it-center-for-science.github.io/python-data/assets/img/python-50px.png "python")


In [4]:
def count(value):
    if value == 0:
        return 0
    return count(value-1) +1

count(2)

2
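
If you'd rather look the limit up than find it by trial and error, the standard library's *sys* module can report it. This sketch assumes the limit is at its usual modest value (often 1000); your environment may be configured differently.

```python
import sys

def count(value):
    if value == 0:
        return 0
    return count(value - 1) + 1

limit = sys.getrecursionlimit()  # depends on the environment
print("recursion limit: " + str(limit))

try:
    count(limit * 10)  # far deeper than the interpreter allows
except RecursionError as error:
    print("RecursionError: " + str(error))
```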

## Data types and iteration

Python has integers, strings and floating point numbers like most programming languages, and we've already worked with them.

It also has lists,

    list_ = [1, 2, 3, 4] # the [] brackets imply a list
    
tuples

    tuple_ = (1, 2, 3, 4) # () are used to signify a tuple, but actually just the comma is sufficient so
    tuple_2 = 1, 2, 3, 4
    
dictionaries

    dict_ = {
                "key": "value",
                "key2": 5
            }
                   
Data structures in general aren't typed so you can put anything in them. The one requirement is that the keys in a dictionary must be hashable, i.e. they cannot be mutable. In plain English this means that you can't use lists as dictionary keys.
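
A quick sketch of the hashability rule, using made-up data: a tuple works as a key, a list does not. The try/except syntax is explained later in this notebook.

```python
grid = {(0, 0): "origin", (2, 3): "some point"}  # tuples are hashable
print(grid[(2, 3)])  # some point

try:
    grid[[1, 1]] = "a list as a key"  # lists are mutable and therefore unhashable
except TypeError as error:
    print("TypeError: " + str(error))

print(len(grid))  # 2, the failed assignment added nothing
```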

For iterating through a structure there is a very simple syntax:

In [34]:
list_ = [2, "foobar", False, print] #switch the brackets to parentheses to see that the syntax works for tuples too
for item in list_:
    print(item)

2
foobar
False
<built-in function print>


Lists are accessed by [] notation, which is smart and supports negative indexing and slicing. 

Indices naturally start from 0. Python was written by/for **people with good taste**, not barbarians.

In [50]:
list_ = [1, 2, 3, 4]
print(list_[0])
print(list_[-1])
print(list_[1:3])

#the built-in len()-function tells you the size of various collections

len(list_)

# lists also support some features familiar from other data structures, e.g.
list_.pop()
# you should know when/if you need them

len(list_)

# and of course sorting and other batteries are included

list_.sort(reverse=True) # note that sort() modifies the list in place and returns None
print(list_)

1
4
[2, 3]
[3, 2, 1]
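
Slices have a few more forms than the one used above. A short sketch, with a list named *numbers* chosen just for this example:

```python
numbers = [1, 2, 3, 4]

print(numbers[:2])    # [1, 2], a missing start means "from the beginning"
print(numbers[2:])    # [3, 4], a missing end means "to the end"
print(numbers[::2])   # [1, 3], the third value is a step
print(numbers[::-1])  # [4, 3, 2, 1], a negative step walks backwards
print(numbers)        # [1, 2, 3, 4], slicing returns a new list
```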


There is no table concept in the Python language. Libraries like NumPy operate on objects that are like tables, but Python-the-language in itself doesn't have any. You can do the same with nested lists if you need to:

    double_list = [
                    [1,2],
                    [3, 4]
                   ]
                   
Complete the tasks in the comments below
![Python image](//csc-it-center-for-science.github.io/python-data/assets/img/python-50px.png "python")

In [48]:
double_list = [
              [1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]
              ]
# Access the first list of the lists
# Access the last item in the last list
# Access the middle 2 elements of the second list

Dictionaries are also accessed by []-notation.

They can be iterated over, and iteration happens over the keys. *Version 2.7 works differently when you call keys(), values() or items(): they return lists there instead of views*

In [44]:
dict_ = {
        "key": "value",
        (2, 3): 5,
        5: False
        }

dict_["yet_another_key"] = "value"

for key in dict_:
    val = dict_[key]
    print(str(key) + " " + str(val))

key value
(2, 3) 5
5 False
yet_another_key value
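
Looking up the value inside the loop works, but dictionaries can also hand out keys and values together. A minimal sketch with made-up data:

```python
menu = {"spam": 2, "eggs": 3}  # hypothetical example data

# items() yields (key, value) pairs that can be unpacked in the for statement
for key, value in menu.items():
    print(str(key) + " " + str(value))

print(list(menu.keys()))    # ['spam', 'eggs']
print(list(menu.values()))  # [2, 3]
print("spam" in menu)       # True, membership tests look at the keys
```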


What happens if you try to access an index or a key that does not exist? Try it out:
![Python image](//csc-it-center-for-science.github.io/python-data/assets/img/python-50px.png "python")

In [None]:
list_ = [1, 2]
dict_ = {"only_key": 0}
# go ahead and try accessing index 2 or "another_key" from the above data structures.
# in fact, try accessing an index in a dict and a key in a list and see what happens

### List comprehensions

How often do you write code that, in Python, would look something like the following?

    #you have list1 defined
    list2 = []
    for item in list1:
        list2.append(some_function(item))
        
Python programmers have determined that the answer to that is "way too often". This is why there 
is a special syntax for something called a **list comprehension**.

In [61]:
#range is another built-in that returns an iterable
list1 = range(10)

def square(x):
    return x*x

list2 = [square(number) for number in list1]
list2

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

List comprehensions are the idiomatic Python way of dealing with cases like this. There is extra syntax to add filtering to make the expression more powerful.

In [63]:
def even(x):
    return x % 2 == 0

list3 = [square(number) for number in list1 if even(number)]
list3

[0, 4, 16, 36, 64]
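
If the filtered comprehension looks dense, it may help to see it next to the loop it replaces. The sketch below redefines *square* and *even* so that it runs on its own; *list3_loop* is a name invented for the comparison.

```python
def square(x):
    return x * x

def even(x):
    return x % 2 == 0

# the comprehension with a filter
list3 = [square(number) for number in range(10) if even(number)]

# the same thing written as an explicit loop
list3_loop = []
for number in range(10):
    if even(number):
        list3_loop.append(square(number))

print(list3 == list3_loop)  # True
```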

## Exceptions and try/except

A lot of things in Python raise exceptions. It's idiomatic to assume keys and indices are present and catch errors. This coding style is called "easier to ask forgiveness than permission" and sometimes abbreviated EAFP. The language supports this by making exception handling a relatively fast task unlike some other languages.

    variable = 0
    output = 0
    try:
        output = 5/variable
    except ZeroDivisionError as error:
        print("stupid programmer tried to divide by zero: " + str(error))
    except Exception as exc:
        print("something totally unexpected happened: " + str(exc))
    finally:
        print("output value is "+ str(output))
        
The syntax reads as follows: the code inside the try-block is run. If an exception is raised, the except-statements are read through until the first one matches and that one is executed. Nearly all exceptions and errors inherit from the Exception class (more on classes later) and a subclass matches its parent classes, so remember to order the exceptions from the specific to the generic if catching many exceptions.

The code inside the finally block is guaranteed to run, whether or not an exception occurred.
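
To see that guarantee in action, here is a small sketch in which the finally block writes to a list both when the division succeeds and when it fails. The function *safe_divide* and the list *log* are made up for this example.

```python
log = []

def safe_divide(numerator, denominator):
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return None
    finally:
        # runs on both paths, even though the blocks above return
        log.append("finished " + str(numerator) + "/" + str(denominator))

print(safe_divide(6, 3))  # 2.0
print(safe_divide(1, 0))  # None
print(log)                # ['finished 6/3', 'finished 1/0']
```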

It's also possible to catch multiple exceptions in the same except

    try:
        call_a_function()
    except (RuntimeError, TypeError, NameError):
        pass
 
While convenient, it's often desirable to act differently when encountering different exceptions or to just catch a superclass that covers the most likely exceptions (IOError, which is an alias of OSError in Python 3, is a good example of this).

Sometimes it's a good idea to intercept an exception and then re-raise it to inform other parts of the software.

    try:
        call_a_function()
    except IOError as err:
        print("the world is flawed! " + str(err))
        raise



## Classes

Python is a so-called **multiparadigm programming language**. Unlike e.g. Java or C# it's not absolutely necessary to encapsulate everything in classes. This is a benefit in short, powerful scripts but many programmers prefer to encapsulate their code in classes as the complexity increases.

The following is a minimal class

In [89]:
import math #import math functions, we'll discuss imports in a bit

class Vector(object):
    """ Implements a vector
    """
    
    def __init__(self, dimensions):
        self.dimensions =  dimensions
        
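    def add(self, another):
        """ returns a new Vector. 
        
            Will raise an error if vectors don't have the same number of dimensions.
        """
        # zip() silently stops at the shorter input, so check the sizes explicitly
        if len(self.dimensions) != len(another.dimensions):
            raise ValueError("vectors must have the same number of dimensions")
        new_dims = [x+y for x, y in zip(self.dimensions, another.dimensions)]
        return Vector(new_dims)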
    
    def length(self):
        return math.sqrt(sum(x ** 2 for x in self.dimensions))

p1 = Vector([3, 4]) 
p2 = Vector([1, 1]) 

p1.add(p2)

<__main__.Vector at 0x1047f1278>

The vector class we created contains an initialization method, `__init__`, that gets two parameters: itself and a list of dimensions (assumed to be valid numbers).

Note that the init method does not return anything. It just modifies the object referred to as self.

It also contains a method that creates a new vector by vector addition.

The string representation of the vector is not very informative.

Python contains many **magic methods** that are used to simplify programming with objects.
Let's extend the object a little bit with a couple of magic methods.

In [91]:
class RepresentableVector(Vector): # inheriting another class
        
    def __repr__(self):
        """string representation of the object"""
        return "<Vector: [%s]>" % ",".join([str(x) for x in self.dimensions])
    
class ComparableVector(Vector):
    
    def __gt__(self, other):
        """ greater than comparison for using the > operator """
        return self.length() > other.length()
    def __ge__(self, other):
        """ greater than or equal comparison using >= """
        return self.length() >= other.length()
    def __lt__(self, other):
        """less than comparison"""
        return self.length() < other.length()
    def __le__(self, other):
        """less than or equal"""
        return self.length() <= other.length()
    
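    def __eq__(self, other):
        """equality comparison for the == operator. Two vectors are equal if all their dimensions are equal."""
        # compare the sizes too, because zip() stops at the shorter vector
        return (len(self.dimensions) == len(other.dimensions)
                and all(a == b for a, b in zip(self.dimensions, other.dimensions)))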
    
class SmartVector(RepresentableVector, ComparableVector): #multiple inheritance
    """ A vector that can be printed, compared, indexed and used with arithmetic operators
    """
    
    def __len__(self):
        """ this is the length of the vector i.e. the number of dimensions
        """
        return len(self.dimensions)
    
    def __getitem__(self, key):
        """accesses the N:th dimension in the vector starting from 0"""
        return self.dimensions[key]
    
    def __setitem__(self, key, value):
        """sets the N:th dimension in the vector"""
        self.dimensions[key] = value
    
    def add(self, other):
        #here we call the same method of one of the parent classes
        dumb_vector = super(SmartVector, self).add(other)
        return SmartVector(dumb_vector.dimensions)
    
    def __add__(self, other):
        """vector addition with the + operator"""
        return self.add(other)
    
    def __mul__(self, scalar):
        """vector multiplication with a scalar"""
        return SmartVector([x*scalar for x in self.dimensions])


    def __neg__(self):
        """vector negation with the minus (-) operator"""
        return self * -1
    
    def __sub__(self, other):
        """vector subtraction"""
        return self.add(-other)
    


## Now you can do all kinds of cool stuff
vector1 = SmartVector([1,1])
vector2 = SmartVector([2,3])

# Like addition of Vector objects
vector3 = vector1 + (vector2*3)
vector3[0] = 20
# multiplication 
vectors = [vector1*n for n in range(5)]

vectors.append(vector3)

#And sorting too!
vectors.sort(reverse=True)
vectors


[<Vector: [20,10]>,
 <Vector: [4,4]>,
 <Vector: [3,3]>,
 <Vector: [2,2]>,
 <Vector: [1,1]>,
 <Vector: [0,0]>]

If you want, you can take the Vector objects out for a spin in the box below. If you're pressed for time don't sweat it, though.

In [18]:
## Optional space for fun Vector coding

## Importing and the Python ecosystem

Python has a nice, clean and legible syntax and it's quite usable as a programming language.

The strength of Python for rapid software development comes from the extensive **[standard library](https://docs.python.org/3/library/)** and all the packages contributed by people into the **[Python Package Index (PyPI)](https://pypi.python.org/pypi)**.

Need to open a zipped file? The [gzip](https://docs.python.org/3/library/gzip.html), [bz2](https://docs.python.org/3/library/bz2.html) and [zipfile](https://docs.python.org/3/library/zipfile.html) libraries will be helpful. 

The file is a csv and you want to parse it? [csv](https://docs.python.org/3/library/csv.html) is the library to use.

In general, the libraries in the standard library tend to work well and to be solid and actively maintained.

The libraries in the *ecosystem* as a whole can be either **stellar or of poor quality** or anything in between and it's a good idea to check the activity of a project before committing to use it in a real-world project. 

Some good examples of widely used high-quality packages are 
* [Django](https://www.djangoproject.com/), a web framework, 
* [Requests](http://docs.python-requests.org/en/master/), a library that makes HTTP a fun protocol to use
* [scikit-learn](http://scikit-learn.org/stable/), a machine learning library
* [numpy](http://www.numpy.org/), a near-replacement for Matlab and a Swiss army knife for computing in general
* [nltk](http://www.nltk.org/), the Natural Language ToolKit

OK, let's take it out for a spin.

The following snippet gets a CSV file from the Internet (hosted on GitHub Pages), parses it and prints the rows in the file. The page content returned by *requests* is a string, whereas *csv* expects a file-like object, so we wrap the string in a *StringIO* object to make it look like a file.

In [14]:
import requests
import csv
from io import StringIO

page = requests.get("https://csc-it-center-for-science.github.io/python-data/assets/data/example.csv")
reader = csv.DictReader(StringIO(page.text))
for line in reader:
    print(line)

{' val2': ' 1', 'id': 'example', ' val1': ' 0'}
{' val2': ' 3', 'id': 'example2', ' val1': ' 2'}
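
Note the leading spaces in the keys and values above: the file apparently has a space after each comma. The *csv* module can skip those with the *skipinitialspace* option. The sketch below works on a hard-coded string instead of the network, and the file contents in it are an assumption reconstructed from the output above.

```python
import csv
from io import StringIO

# Assumed file contents, reconstructed from the printed rows.
text = "id, val1, val2\nexample, 0, 1\nexample2, 2, 3\n"

# skipinitialspace=True ignores spaces that directly follow a delimiter
reader = csv.DictReader(StringIO(text), skipinitialspace=True)
rows = [dict(line) for line in reader]
for row in rows:
    print(row)
```

The values are still strings, so convert them with *int()* if you need numbers.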


This apparent simplicity of using multiple libraries to create short, powerful scripts/programs is typical of Python.

Now try it out yourself. 

There is a JSON encoded file at https://csc-it-center-for-science.github.io/python-data/assets/data/example.json

**Read** the file using *requests*, parse the resulting JSON and print the parsed object.

**Hint**: this is so common that *requests* supports it out of the box. This is the so-called batteries-included principle.

Read [the documentation](http://docs.python-requests.org/en/master/) to find out the simplest way to do it. It is, of course, permitted to use the *json* module from the standard library as well.

![Python image](//csc-it-center-for-science.github.io/python-data/assets/img/python-50px.png "python")

In [16]:
import requests
page = requests.get("https://csc-it-center-for-science.github.io/python-data/assets/data/example.json")
print(page.json())


{'nobody expects': 'the Spanish Inquisition'}
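
The *json* module from the standard library gives the same result when you already have the text in hand. This sketch parses a hard-coded string rather than fetching the file; the string is an assumption based on the output above.

```python
import json

# Assumed file contents, based on the parsed output shown above.
text = '{"nobody expects": "the Spanish Inquisition"}'

parsed = json.loads(text)      # string to Python object
print(parsed["nobody expects"])

as_text = json.dumps(parsed)   # Python object back to a JSON string
print(as_text)
```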


## The Python Philosophy

That's basically it! We glossed over working with files and string formatting and some of the finer details of the language like lambdas, but this is basically all there is to know about Python-the-language. Python-the-ecosystem is a lifelong journey and you're welcome to join the rest of us on it. 

You can only learn by doing and Python makes doing easy!


As the last task, type

        import this
        
into the input below and find an easter egg hidden in all modern Python interpreters. Some of the wording is an inside joke, but most of it bears thinking on.

![Python image](//csc-it-center-for-science.github.io/python-data/assets/img/python-50px.png "python")