<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/dtrad/geoml_course/blob/master/Practice2_IntroToPythonFull.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
</table>

# Introduction to Python
* By Ian Allison (Compute Canada, May 2020), (with some additions and changes for GOPH699.50).

For the most part Python is very forgiving and intuitive. You can get pretty far by just experimenting and doing a few Google searches. The files in this directory introduce some of the basic ideas that we will be using for the other notebooks. If you are really lost, [this list of tutorials](https://wiki.python.org/moin/BeginnersGuide/Programmers) is probably a better place to start. With Jupyter and Colab, you can skip over all of the installation instructions and just start trying things out in the cells below.

There is a companion notebook in this directory called [Types](./TypesSolved.ipynb) which delves into the Python type system (along with some other stuff). You can probably guess enough about types to read this notebook first, but it is worth checking out the Python collection types (lists, dictionaries, sets etc.): they are used _everywhere_ and can make your life much easier once you have them down.

* [Program Structure](#Python-program-structure)
* [Conditionals](#Conditionals)
* [Loops](#Loops)
* [Functions](#Functions)
* [Printing](#Printing)
* [Classes](#Classes)
* [Generators](#Generators)
* [Types](./Types.ipynb) (including collections)
 

# Python program structure

The Python program structure is:

1. Programs are composed of modules
1. Modules contain statements
1. [Statements](https://docs.python.org/3/reference/simple_stmts.html) contain expressions
1. Expressions create and process objects

The distinction between expressions and statements is a bit blurry, but here is a definition:
> A statement is a complete line of code that performs some action, while an expression is any section of the code that evaluates to a value. Expressions can be combined “horizontally” into larger expressions using operators, while statements can only be combined “vertically” by writing one after another, or with block constructs.

The sections below define some of these terms.
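As a small illustration of the difference between expressions and statements, here is a minimal sketch. The names `total` and `label` are made up for this example.

```python
# An expression evaluates to a value and can be nested inside larger expressions.
# "(2 + 3) * 4" is an expression; the whole line is an assignment statement.
total = (2 + 3) * 4

# A conditional expression is still an expression, so it can sit on the
# right-hand side of an assignment.
label = "big" if total > 10 else "small"

# Statements are combined "vertically": one after another, or inside a block.
if total > 10:
    print(total, "is", label)
```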

## Assignments

In Python, assignments are done with `=`. Names are created when you make an assignment, and you don't need to worry about declaring types.

In [62]:
a = 1

You can also assign multiple elements at the same time in either tuple or list notation

In [63]:
a = 1; b = 2
c, d = (1, 2.2)
type(c),type(d),type((c,d))

(int, float, tuple)

And you can assign to multiple names at the same time

In [64]:
a = b = 1

Python also supports the idea of Sequence assignments. If the thing on the right hand side can be considered a sequence (e.g. a string is a sequence of letters) the assignment will still work...

In [65]:
a, b, c, d, e = 'apple'
print(type(a))
print(type(a+b+c+d+e))

<class 'str'>
<class 'str'>


Since `=` is used for assignment, another operator is needed to test for equality and Python chooses `==`. The other comparison operators look similar: `>=`, `<=`, `<`, `>`, `!=`. Python 2 also accepted `<>` as an equivalent form of "not equal", but it was removed in Python 3, so always use `!=`.
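A quick sketch of the comparison operators in action (the variable names are only for illustration). Note that comparisons can be chained, which is often neater than joining two tests with `and`.

```python
a = 5
b = 5.0

is_equal = (a == b)          # True: equality compares values, so 5 equals 5.0
is_different = (a != b)      # False
in_interval = (1 < a <= 5)   # True: chained comparison, same as (1 < a) and (a <= 5)

print(is_equal, is_different, in_interval)
```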

### Objects and Identifiers
Objects are regions of memory that contain information.
Identifiers (names) are references to an object.\
We bind variables to objects so that we can use them, but the object exists independently
of any particular reference to it \
(as long as there is at least one reference to the object).


In [66]:
p=2   
b=p
print('the object is',p,'with type',type(p))
print('p identifier is',id(p))
print('b identifier is',id(b))  
print('p and b point to the same object-->',p is b)

p='two'
print('the variable p now points to a different place in memory',id(p))
print('the type of p now is',type(p))
print('the object created by b=p, i.e.', b, 'still exists and is referred to by the variable b',id(b))
b='three'
print('now neither p nor b refers to the original object')

the object is 2 with type <class 'int'>
p identifier is 140323179653456
b identifier is 140323179653456
p and b point to the same object--> True
the variable p now points to a different place in memory 140322706141808
the type of p now is <class 'str'>
the object created by b=p, i.e. 2 still exists and is referred to by the variable b 140323179653456
now neither p nor b refers to the original object
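The difference between a name and the object it refers to matters most for mutable objects such as lists. The following sketch (with made-up names `original`, `alias` and `duplicate`) shows that assignment never copies an object; it only adds another reference to it.

```python
original = [1, 2, 3]
alias = original              # no copy is made: both names refer to the same list
duplicate = list(original)    # a new list object with the same contents

alias.append(4)               # modifies the single shared object

print(original)               # [1, 2, 3, 4]
print(duplicate)              # [1, 2, 3]
print(alias is original)      # True
print(duplicate is original)  # False
```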


## Conditionals

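Python implements conditionals via `if`, `elif` (short for "else if") and `else`. Python has no `case/switch` statement in the C sense; version 3.10 added a `match` statement for structural pattern matching, but `if`/`elif` chains remain the usual choice for simple branching.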

### if/elif/else

The format for if statements is 
```python
if <test>:
  <statements>
elif <test>:
  <statements>
else:
  <statements>
```

Where only the first condition, `if`, is actually required and you can string together as many `elif`'s as you need.

In [67]:
a = 5
if a < 4:
    print("A is less than 4")
elif a == 4:
    print("A is equal to four")
else:
    print("A is greater than 4")

A is greater than 4


Python includes a `pass` statement for cases where a statement is required *syntactically*, but you don't want that statement to do anything. The `pass` statement does just that: nothing!

In [68]:
if 1 > 0:
    pass

## Loops

Python has two main types of loop, `while` and `for`. 

### While loops

Evaluate the condition at the top of the loop and if it is true, execute the body

Typical form
```python
while <test>:
    <statements>
```

In [69]:
a = 0
while True:
    a = a + 1
    if a > 10:
        break
    print(a, end=",")

1,2,3,4,5,6,7,8,9,10,

The `break`, `continue` and `pass` keywords will also modify control flow within a `while` loop. Look them up with the jupyter help system to understand more. They are useful, but can obfuscate your code so use them sparingly!
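As a rough sketch of how `continue` and `break` differ (the names `n` and `collected` are invented for this example): `continue` jumps straight back to the top of the loop, while `break` leaves the loop altogether.

```python
n = 0
collected = []
while True:
    n += 1
    if n % 2 == 0:
        continue          # skip the rest of the body for even numbers
    if n > 9:
        break             # leave the loop entirely
    collected.append(n)

print(collected)          # [1, 3, 5, 7, 9]
```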

### For loops

For loops are very common in Python and are similar to `for` in other languages, but one nice twist with Python is that you can iterate over any sequence

General form:
```python
for <target> in <object>:
     <statements>
```

For the traditional for loop over integers there is a built-in `range` type which will generate an arithmetic progression for you to loop over, but in general it's best to iterate over lists directly rather than indexing them.

In [70]:
for animal in ['cat', 'dog', 'elephant']:
    print(animal, len(animal))

cat 3
dog 3
elephant 8


In [71]:
for i in range(10):
    print((i, i**2), end="\n")

(0, 0)
(1, 1)
(2, 4)
(3, 9)
(4, 16)
(5, 25)
(6, 36)
(7, 49)
(8, 64)
(9, 81)


In [72]:
range?

Init signature: range(self, /, *args, **kwargs)
Docstring:     
range(stop) -> range object
range(start, stop[, step]) -> range object

Return an object that produces a sequence of integers from start (inclusive)
to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
These are exactly the valid indices for a list of 4 elements.
When step is given, it specifies the increment (or decrement).
Type:           type
Subclasses:     


Notice that `range` starts at zero. Have a look at the help (`range?`) to see how to change that.
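A few calls showing the `start`, `stop` and `step` arguments described in the help text. Wrapping `range` in `list` is only done here to make the values visible; the variable names are illustrative.

```python
from_one = list(range(1, 6))        # start at 1, stop before 6
by_fives = list(range(0, 20, 5))    # step of 5
countdown = list(range(5, 0, -1))   # a negative step counts down

print(from_one)    # [1, 2, 3, 4, 5]
print(by_fives)    # [0, 5, 10, 15]
print(countdown)   # [5, 4, 3, 2, 1]
```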

**Exercise**:

1. Make a for loop to store the numbers from 1 to 99.
1. Print, in another loop, the items that are multiples of 10.

In [73]:
a=[]
for i in range(1, 100):
    a.append(i)
for item in a:
    if (item%10)==0: print(item)

10
20
30
40
50
60
70
80
90


**Exercise**: for the following list of country names, print each country that also appears in the second list.

In [74]:
# This is a terrible data structure to use here, some better options 
# might be to flatten the list, or make it a dictionary of lists

countries = [
    ['Canada','USA', 'Mexico'],
    ['France', 'Germany', 'Romania'],
    ['Australia', 'New Zealand']
]

countries2=['Argentina','Brazil','Mexico']

In [75]:
countries[0][0]

'Canada'

In [76]:
for group in countries:
    for country in group:
        for country2 in countries2:
            #print(country,country2)
            if (country==country2): print(f"{country}")

Mexico


When the loop body is small and simple, you can also use a list comprehension in place of a for loop. Once you get used to the syntax these are very handy, but they can make your code a bit harder for newcomers to follow and it is easy to get carried away, so use them sparingly. The syntax is

```python
[<expression in x> for x in <iterable>]
```
and it will generate a list of the values of `<expression in x>`. You can also include an optional `if` clause after the `<iterable>` to filter the items, but again it's best to keep list comprehensions short and simple.

In [77]:
[x**2 for x in range(10)]

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
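The optional `if` filter can be sketched as follows, next to the explicit loop it replaces (the names `evens_squared` and `result` are just for illustration).

```python
# Keep only the even numbers, then square them.
evens_squared = [x**2 for x in range(10) if x % 2 == 0]
print(evens_squared)      # [0, 4, 16, 36, 64]

# The equivalent explicit loop
result = []
for x in range(10):
    if x % 2 == 0:
        result.append(x**2)
```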

**Exercise:** Write a while loop which will iterate over the numbers less than 100 and print out those divisible by 10.\
Then rewrite the loop in one line using a list comprehension.

In [78]:
i = 1
while i < 100:
    if (i % 10) == 0:
        print(i)
    i += 1

10
20
30
40
50
60
70
80
90


In [79]:
x=[print(i,end=",") for i in range(100) if (i%10)==0]

0,10,20,30,40,50,60,70,80,90,

## Functions

Functions let you encapsulate and reuse logic. They also let you break down your code into chunks which you can test, debug and extend independently. 

Defining functions in Python is very easy with the `def` keyword.

Typical form
```python
def <name>(<arguments>):
    <statements>
    return <object>
```

Here, `def` is creating some executable code and giving it the name `<name>`


In [80]:
def double(x):
    return 2 * x

double(3)

6

Adding more arguments is easy

In [81]:
def multiplyby(x, n):
    return x * n

multiplyby(5, 2)

10

The arguments and the return value(s) don't have to be simple types...

In [82]:
multiplyby("hip hip, ", 3)

'hip hip, hip hip, hip hip, '

**Exercise:** Write a function which takes two strings as its arguments and returns a tuple where the first item is both strings joined together (here with a space between them) and the second is the length of the joined string.

In [83]:
def joiner(string1, string2):
    joined_string = string1 + ' ' + string2
    return (joined_string, len(joined_string))

a,b=joiner("apple", "orange")
type(a),type(b)

(str, int)

Functions are first-class citizens in Python:\
They can be passed as arguments to other functions, returned as values from other functions, and
assigned to variables and stored in data structures.

In [84]:
def myfunc(a, b):
    return a + b
#list of functions
funcs = [myfunc]
print(funcs[0])
print(funcs[0](2, 3))

<function myfunc at 0x7f9f6c3f3b80>
5


### Scope and the LEGB rule

Python uses namespaces to keep variables from clobbering (overwriting) one another and to make modules and code more portable. For example, when you define $\pi = 3$ you don't want the value defined in the `scipy` module to clobber it. With namespacing you can safely set the variable `x` in two different contexts and not have them interfere with each other. When you _want_ to have them interfere with each other, you have to understand the hierarchy of namespaces that python defines (the scope of the name `x`).

The basic hierarchy is something like this...

* **B**uilt in: names that are always available, e.g. `open`, `range`, ...
    * **G**lobal (module): Things at the top level of a module, e.g. `random` inside `numpy`
        * **E**nclosing function locals
            * **L**ocal (function): names assigned within a function and not set global
       
The further down that list you go, the more specific the name is and the idea is that the most specific should win (like CSS etc.). \
It is usually referred to as the **LEGB** rule. As an example, if I do `from numpy import random`, then define random as a variable, my definition "wins"

In [86]:
try:
    del random
except NameError:   # nothing to delete if `random` was never defined
    pass
from numpy import random
print(type(random))
random=3
type(random)

<class 'module'>


int

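Sometimes you might need to change a variable from one of the outer scopes. By default, assigning to a name inside a function creates a new local variable, as the first cell below shows; the `global` keyword, used in the second cell, makes the assignment apply to the module-level name instead.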

In [87]:
x = 3
def increment_x():
    x = 0
    x += 1
    print('x inside=',x)
    
increment_x()
print(x)    


x inside= 1
3


In [88]:
## try not to do this in real programs, globals are bad practice!
x = 3
def increment_x():
    global x
    x += 1
    print('x inside=',x)
    
increment_x()
print(x)

x inside= 4
4
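The **E**nclosing level of the LEGB rule is not exercised by the cells above. A minimal sketch, using the hypothetical names `make_counter` and `increment`, shows how the `nonlocal` keyword plays the same role for an enclosing function's variables that `global` plays for module-level ones.

```python
def make_counter():
    count = 0                 # lives in the enclosing function's scope

    def increment():
        nonlocal count        # rebind the enclosing name rather than creating a local one
        count += 1
        return count

    return increment

counter = make_counter()
print(counter())   # 1
print(counter())   # 2
```

Each call to `make_counter` creates a fresh `count`, so two counters built this way do not interfere with each other.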


### Lambda functions

Python also has the idea of `lambda` functions. These are basically "anonymous functions". \
You can use them anywhere you would normally use a function, but you don't want to go to the bother of actually naming the thing. \
This sounds odd, given the description of modules I gave above, but it is sometimes useful, *I swear!* 

Let us create a function of functions without lambda first:

In [89]:
def pp(x,n):
    return x**n

In [90]:
def operateon(f, x, n):
    return f(x, n)

In [91]:
operateon(pp, 3, 4)

81

Now with lambda

In [92]:
operateon(lambda x, n: x**n, 3, 4)

81

Lambda functions typically come up where someone has written code which expects a function as one of the arguments (e.g. massaging numbers to look like dates so that pandas can ingest them). \
Similar to list comprehensions and generators, you might skip over lambda functions when first learning Python, but they are worth picking up sooner or later because they can make your code much neater and more concise.\
Also, in some cases lambda functions are problematic when switching between Python 2.X and 3.X. \
Often it is possible to use list comprehensions instead (see the Loops section above).
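One common place where a function is expected as an argument is the `key` parameter of the built-in `sorted`. The sketch below uses made-up data to show a lambda in that role.

```python
animals = ['elephant', 'cat', 'horse']

# Sort by word length instead of alphabetically.
by_length = sorted(animals, key=lambda name: len(name))
print(by_length)    # ['cat', 'horse', 'elephant']

# Sort tuples by their second item.
pairs = [(1, 'b'), (3, 'a'), (2, 'c')]
by_letter = sorted(pairs, key=lambda pair: pair[1])
print(by_letter)    # [(3, 'a'), (1, 'b'), (2, 'c')]
```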

**Exercise** Write a lambda function to create multiples of the input 

In [93]:
def myfunc(n):
  return lambda a : a * n

mydoubler = myfunc(2)
mytripler = myfunc(3)

print(mydoubler(11))
print(mytripler(11))

22
33


### Function arguments

Functions normally act on arguments passed to them between the parentheses. Going beyond the simple examples above, Python adds some flexibility to how arguments are specified:

  1. Argument lists can be arbitrarily long and each argument can be an arbitrary Python object.
  1. You can include both positional and keyword arguments. Positional arguments are just a list of names `(x, y, z)`, while keyword arguments include values (`x=1, y=2, z=3`). You can mix the types of arguments, but the positional arguments must come first.
  1. You can specify default values when writing keyword arguments, e.g. if you include `x=1` in the argument list but don't include a value for `x` when calling the function, the value `1` will be used.
  1. Functions can support arbitrary numbers of positional arguments. To do this, you prefix the argument with a `*`. Inside the function this argument is a tuple that you can iterate over.
  1. Functions can support arbitrary keyword arguments. To do this, you prefix the argument with `**`. Inside the function this argument is a dictionary of whatever the caller decided to pass in.
  
These last two points might sound arcane, but they are important and widely used. A good example is matplotlib where plotting functions can use hundreds of arguments. It is much easier to prepare a dictionary of all of your settings and expand that as needed.
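The same `*` and `**` prefixes also work on the calling side, which is how a prepared dictionary of settings gets expanded. The sketch below avoids a real plotting library: `describe_plot` is a hypothetical stand-in for a function with many options, and `settings` and `data` are invented for the example.

```python
def describe_plot(x, y, color="black", linewidth=1.0, **extra):
    """Hypothetical stand-in for a plotting function with many options."""
    return f"{len(x)} points, color={color}, linewidth={linewidth}, extra={sorted(extra)}"

settings = {"color": "red", "linewidth": 2.5, "marker": "o"}
data = ([0, 1, 2], [0, 1, 4])

# *data expands to the positional arguments, **settings to keyword arguments
summary = describe_plot(*data, **settings)
print(summary)
```

Keys that match a named parameter (`color`, `linewidth`) are bound to it; anything else (`marker` here) ends up in `extra`.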


In [94]:
def arguments(a, b, *args, c=1, **kwargs):
    print(f"a and b are required arguments: {a}, {b}")
    print(f"and c always has a value: {c}")
    for arg in args:
        print(f"I found an extra argument: {arg}")
    
    for key, value in kwargs.items():
        print(f"I found an extra keyword argument: {key}:{value}")
        
        
arguments(1, 2, 3, 4, 5, 6, fruit="banana", time="noon")
print('')
arguments(1,2,fruit="pear")

a and b are required arguments: 1, 2
and c always has a value: 1
I found an extra argument: 3
I found an extra argument: 4
I found an extra argument: 5
I found an extra argument: 6
I found an extra keyword argument: fruit:banana
I found an extra keyword argument: time:noon

a and b are required arguments: 1, 2
and c always has a value: 1
I found an extra keyword argument: fruit:pear


## Printing

In Python 3, `print` is a function which can take an arbitrary number of arguments. By default it separates them with a single space (`sep=' '`) and adds a line break at the end (`end='\n'`).

In [95]:
pi=3.14
z=complex(1,0)
print('{:.2f}'.format(pi),'{:}'.format(z))

3.14 (1+0j)


In [96]:
?print

Docstring:
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file:  a file-like object (stream); defaults to the current sys.stdout.
sep:   string inserted between values, default a space.
end:   string appended after the last value, default a newline.
flush: whether to forcibly flush the stream.
Type:      builtin_function_or_method
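The keyword arguments listed in the help text can be tried out as in this sketch. The `io.StringIO` buffer is used only to show the `file` argument without touching the disk; the names `buffer` and `captured` are illustrative.

```python
import io

print("a", "b", "c")                 # a b c
print("a", "b", "c", sep="")         # abc
print("a", "b", "c", sep=", ")       # a, b, c

print("no newline", end="")
print(" <- continued on the same line")

# `file` accepts any file-like object, here an in-memory text buffer.
buffer = io.StringIO()
print("x", "y", sep="-", end="!", file=buffer)
captured = buffer.getvalue()         # 'x-y!'
```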


There isn't a whole lot more to `print` itself, but it is worth looking at some other string syntaxes. My favourite one is `f-strings`, which were added in Python 3.6. To make an `f-string` just prefix the string with `f`, then you can surround your variables with curly braces.

In [97]:
name = "Ian"
trees = ["alder", "beech", "coconut"]

In [98]:
print(f"{name} is shorter than a {trees[-1]} tree.")

Ian is shorter than a coconut tree.


The stuff inside the curly braces can be a python expression

In [99]:
print(f"{1+1}")

2


You can split them across multiple lines by just writing them one after the other

In [100]:
print(
    f" This"
    f" is"
    f" a"
    f" long"
    f" string"
)

 This is a long string


When you need to adjust the default format, the syntax is `{name:format_spec}`, where the format specifiers are a [mini language](https://docs.python.org/3.6/library/string.html#format-specification-mini-language) defined as part of the language. For example, you can print an int as a float, or ask for a fixed number of decimal places:

In [101]:
f"{1/3:.3f}"

'0.333'

You can also align fields, pad them etc. Take a look at the link above then try

**Exercise**: Write a for loop to print the following fixed width table.
```
   1 | 1.0000
   2 | 0.5000
   3 | 0.3333
   4 | 0.2500
   5 | 0.2000
   6 | 0.1667
   7 | 0.1429
   8 | 0.1250
   9 | 0.1111
```

In [102]:
for i in range(1, 10):
    print(f"{i:4d} | {1/i:.4f}")

   1 | 1.0000
   2 | 0.5000
   3 | 0.3333
   4 | 0.2500
   5 | 0.2000
   6 | 0.1667
   7 | 0.1429
   8 | 0.1250
   9 | 0.1111
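A few more format specifiers from the mini language, as a sketch with invented values: `<`, `>` and `^` align a value within a field of the given width, and a leading `0` pads numbers with zeros.

```python
row = f"{'alder':<10}|{1/3:>8.2f}|"   # left-align text, right-align a number
padded = f"{7:03d}"                   # zero-padded integer
centered = f"{'hi':^6}"               # centred in a field of width 6

print(row)        # alder     |    0.33|
print(padded)     # 007
print(centered)
```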


## Classes

Object-Oriented programming in Python is a large topic, important for Machine Learning. \
The basics aren't too hard, at a bare minimum, it is worth knowing

* How to define classes.
   * Classes can have methods (functions) and attributes (variables)
* Some common methods `__str__`, `__repr__`, `__init__`, ...
* How to inherit from a class

You can get a lot done by inheriting from the right class and just tweaking a few things you need. The Jupyter ecosystem does this __a lot__.

In object oriented programming, the basic idea is to implement the logic and data of a problem with objects which share relationships. A simple example might be

In [103]:
class Vehicle:
    def honk(self):
        print("HONK!")

This is a valid class with a single method (`honk`). The idea is that from this object template we should be able to create multiple instances of Vehicle (a car, a bike, etc.). To create instances from classes you just add parentheses to the end of the class name (e.g. `Vehicle()`). Under the hood, this calls a special method called `__init__` which can do any setup logic we need. For now we'll just use the default `__init__` method.

In [104]:
BlineBus = Vehicle()
BlineBus.honk()

HONK!


The honk method definition looks like an ordinary function definition, but with the word `self` as the first argument. The reason for this will become clear when we start thinking about the namespaces of classes and instances. Buses and Cars might go honk, but if we make a bike it could have a bell. 

There is some data that is naturally associated with *instances* of the class rather than the class itself. When we call a method on an instance, the instance itself gets passed in as the first argument (conventionally called `self`) which we can then modify as we need.

In [105]:
class Vehicle:
    def alerttype(self, value):
        self.alert = value
        
    def honk(self):
        print(self.alert)

In [106]:
BlineBus = Vehicle()
IansBike = Vehicle()
IansBike.alerttype('RING!')
BlineBus.alerttype('HONK!')

IansBike.honk()
BlineBus.honk()

RING!
HONK!


We have two objects in memory, a bike and a bus. They each have their own memory where they can store their alert noise.


We glossed over the `__init__` method, but it is worth another look. To create a vehicle, we added parentheses to the class name (`Vehicle()`). Under the hood, this called a method called `__init__` which lets us take care of any setup tasks we want our objects to have.

When we define a new class it inherits definitions from any _superclasses_. Every class in Python inherits from a special class called `object` which defines `__init__` and a few other generic methods. In the examples above we were falling back on that definition but we could override it to set the alert type when creating the instances

In [107]:
print(isinstance(IansBike, object))
print(isinstance(IansBike, Vehicle))

True
True


In [108]:
class Vehicle:
    def __init__(self, value="Honk"):
        self.alert = value
        
    def honk(self):
        print(self.alert)

In [109]:
car1=Vehicle()
IansBike = Vehicle('RING1')
IansBike.honk()

RING1


Taking that a step further, we could define another class which inherits from `Vehicle`. For example, all buses are vehicles but they also have numbers, so we could inherit from `Vehicle` and only need to add the number.

In [110]:
class Bus(Vehicle):
    def __init__(self, number, alert="HONK!"):
        super().__init__(alert)
        self.number = number
        
    def __repr__(self):
        return f"Bus: {self.number}, goes {self.alert}"

In [111]:
Bline = Bus(99)
print(isinstance(Bline,Vehicle))
Bline

True


Bus: 99, goes HONK!
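The `Bus` class defines `__repr__`, which is what the notebook uses to display `Bline`. To show how `__repr__` and `__str__` differ, here is a minimal sketch with a hypothetical `Point` class that is not part of the `Vehicle` hierarchy.

```python
class Point:
    """Hypothetical example class, unrelated to Vehicle."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Unambiguous form, used by the interactive prompt and inside containers
        return f"Point(x={self.x}, y={self.y})"

    def __str__(self):
        # Reader-friendly form, used by print() and str()
        return f"({self.x}, {self.y})"

p = Point(1, 2)
print(p)          # (1, 2)
print(repr(p))    # Point(x=1, y=2)
print([p])        # [Point(x=1, y=2)]
```

If a class defines only `__repr__`, as `Bus` does, `print` falls back on it.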

### Class example 
**(from John M Stewart, Python for Scientists)** \
Below we see an example for a numerical class to operate with fractions

In [112]:
def gcdr(a,b):
    if b==0:
        return a
    else:
        return gcdr(b,a%b)
def gcd(a,b):
    while b:
        a,b=b,a%b
    return a

In [113]:
class Frac:
    """ Fractional class. A Frac is a pair of integers num, den
    with (den!=0) whose GCD (greatest common divisor) is 1.
    """
    def __init__(self,n,d):
        """ Construct a Frac from integers n and d.
            (TODO: if d=0 should output an error)
        """
        hcf=gcd(n,d)
        # integer division keeps num and den as ints (n/hcf would give floats)
        self.num, self.den= n//hcf, d//hcf
    def __str__(self):
        """ Generate a string representation of a frac"""
        return "%d/%d"%(self.num,self.den)  
    def __mul__(self,another):
        """ Multiply two Fracs to produce a Frac"""
        return Frac(self.num*another.num, self.den*another.den)
    def to_real(self):
        return float(self.num)/float(self.den)
        

In [114]:
a=Frac(3,7)

In [115]:
a?

Type:           Frac
String form:    3/7
Docstring:     
Fractional class. A Frac is a pair of integers num, den
with (den!=0) whose GCD (greatest common divisor) is 1.
Init docstring:
Construct a Frac from integers n and d.
(TODO: if d=0 should output an error)


In [116]:
type(a)

__main__.Frac

In [117]:
print('{:.0f}'.format(a.num),'/{:.0f}'.format(a.den))

3 /7


In [118]:
print(a) # uses __str__

3/7


In [119]:
b=Frac(1,2)

In [120]:
c=a*b  #notice how the "*" can be used directly without calling "__mul__"

In [121]:
print(c)

3/14


In [122]:
print(c.to_real())

0.21428571428571427


In [123]:
d=Frac(3,2)

In [124]:
print(d*c)

9/28
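The `Frac` docstring leaves the zero-denominator check as a TODO. One possible way to handle it, together with an addition operator, is sketched below. `CheckedFrac` is a hypothetical variant written for this example (it uses `math.gcd` from the standard library), not the class from the book.

```python
from math import gcd

class CheckedFrac:
    """Hypothetical variant of Frac that rejects a zero denominator."""
    def __init__(self, n, d):
        if d == 0:
            raise ZeroDivisionError("denominator must be non-zero")
        hcf = gcd(n, d)
        self.num, self.den = n // hcf, d // hcf

    def __str__(self):
        return f"{self.num}/{self.den}"

    def __add__(self, other):
        # a/b + c/d = (a*d + c*b) / (b*d); the constructor reduces the result
        return CheckedFrac(self.num * other.den + other.num * self.den,
                           self.den * other.den)

print(CheckedFrac(1, 2) + CheckedFrac(1, 3))   # 5/6
try:
    CheckedFrac(1, 0)
except ZeroDivisionError as err:
    print("rejected:", err)
```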


## Generators 

We can create functions that do not just return one result but rather an entire sequence of results, by using the `yield` statement. These functions are called generators. Generator functions are an easy way to create iterators and are especially useful as a replacement for unworkably long lists: a generator yields items one at a time rather than building the whole list in memory. The following cells compare a generator with a function that builds the equivalent list; the run times are similar, so the main benefit of the generator is the memory it saves:

In [125]:
import time

In [126]:
def oddGen(n,m):
    while n<m:
        yield n 
        n += 2

In [127]:
def oddLst(n,m):
    lst=[]
    while n<m: 
        lst.append(n) 
        n+=2
    return lst

In [131]:
#the time it takes to perform sum on an iterator 
t1 = time.time()
print(sum(oddGen(1,1000000)))
print("Time to sum an iterator: %f " % (time.time() - t1))

250000000000
Time to sum an iterator: 0.073734 


In [132]:
#the time it takes to build and sum a list 
t1=time.time()
print(sum(oddLst(1,1000000)))
print("Time to build and sum a list: %f " % (time.time() - t1))

250000000000
Time to build and sum a list: 0.075900 


In [144]:
#print odd list
for i in oddLst(1,10): print(i)

1
3
5
7
9


In [143]:
#print odd generator
for i in oddGen(1,10): print(i)

1
3
5
7
9


In [145]:
# creation of a list 
list1 = [10** i for i in range(1,5)]
print(type(list1))
print(list1)
for x in list1: print(x)

<class 'list'>
[10, 100, 1000, 10000]
10
100
1000
10000


In [146]:
# creation of generator object 
gen1 = (10** i for i in range(1,5))
print(gen1)
for x in gen1: print(x)

<generator object <genexpr> at 0x7f9f6c2157b0>
10
100
1000
10000
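Two practical differences between a list and a generator can be checked directly, as in this sketch (the names are illustrative, and the exact sizes reported by `sys.getsizeof` depend on the Python build): the generator holds only its current state, and it can be consumed only once.

```python
import sys

squares_list = [i * i for i in range(100_000)]
squares_gen = (i * i for i in range(100_000))

# The list stores every element; the generator stores only its current state.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))   # True

# A generator can be consumed only once.
first_pass = sum(squares_gen)
second_pass = sum(squares_gen)    # the generator is already exhausted
print(first_pass == sum(squares_list), second_pass)   # True 0
```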


# Importing additional files or data in Colab

When running notebooks in Colab, you may need access to additional files.\
There are 2 ways to upload data:
- You can upload files from your machine using the "upload files" button on the side. This is the easiest, but it has to be redone each time you close Colab.
- You can mount your Google Drive if you have one. This requires copying and pasting your personal key from your browser, but the mounting will be redone automatically.

In the second approach, you may want to use a flag as below so that the notebook does not give you an error when it is run locally.

In [147]:
import sys
try:
  import google.colab
  IN_COLAB = True
except ImportError:   # google.colab is only available inside Colab
  IN_COLAB = False

In [148]:
print(IN_COLAB)

False


In [149]:
#To use the gcp.py function, first we need to mount our Google Drive to Google Colab:
if (IN_COLAB):
    from google.colab import drive
    drive.mount('/content/drive')
    #Then, we need to append the directory to your python path using sys:    
    sys.path.append('/content/drive/My Drive/Colab Notebooks')   #You can change the name of the last folder in case you saved the .py file in a different one

In [150]:
#For example, try to import and run the fib.py example
import fib

In [151]:
fib.fib(10)

55
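The contents of `fib.py` are not shown in this notebook. As an assumption consistent with `fib.fib(10)` returning `55`, a file like the following sketch, saved as `fib.py` in the folder added to `sys.path`, would behave the same way.

```python
# Hypothetical contents of fib.py (the course file itself is not shown here)
def fib(n):
    """Return the n-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib(10))   # 55
```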