# Python
## By Ian Allison (Compute Canada, May 2020), 
(with some additions and changes for the Geo-Machine Learning Course)

<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/dtrad/geoml_course/blob/master/TypesSolved.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
</table>

## Types

Data in Python takes the form of _objects_. These can be built-in (intrinsic) types, classes and other objects defined in Python, or external objects imported from libraries. In general, an object has some value(s) and associated operations. You can use the built-in `type` function to examine the type of just about any object.

* [Numbers](#Numbers)
* [Strings](#Strings)
* [Lists](#Lists)
* [Dictionaries](#Dictionaries)
* [Tuples](#Tuples)
* [Sets](#Sets)
* [Files](#Files)
* [Other Types](#Other-Types)


In Python3, everything is an object, and the objects that come as part of the language standard are pretty comprehensive and powerful. For short programs, you can often get away with using only intrinsic types.

### Numbers
  
For literals, you can just treat Python as a calculator and start feeding it numbers. The usual integer and floating point operations are available. For integer arithmetic you don't need to worry about precision: Python integers have arbitrary length. The usual type casting rules apply and you have the following operations:

 * Ordinary arithmetic: `+`, `-`, `*`, `**`, `/` (use `//` if you need integer division in Python3)
 * Bitwise operators: `>>`, `&`, `|`, etc.
 * Functions: `pow`, `abs`, `round`, `int`, `hex`, `bin`, etc.

The language also includes the `math` and `random` modules in the standard library, but for the most part it's best to use numpy for anything beyond basic arithmetic.
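As a quick illustration of the operations listed above, here is a minimal sketch (the variable names are illustrative only) showing arbitrary-length integers, the difference between `/` and `//`, and a few bitwise operators and built-in functions.

```python
# Integers have arbitrary length, so large powers do not overflow
big = 2 ** 100
print(big)            # 1267650600228229401496703205376

# "/" always returns a float; "//" is floor (integer) division
true_div = 7 / 2      # 3.5
floor_div = 7 // 2    # 3
neg_floor = -7 // 2   # -4, because // rounds toward negative infinity

# Bitwise operators work on integers
shifted = 8 >> 1      # 4
masked = 6 & 3        # 2
combined = 6 | 3      # 7

# A few of the built-in functions
print(abs(-3), pow(2, 5), int("42"), hex(255), bin(5))   # 3 32 42 0xff 0b101
```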

In [1]:
z=complex(1,2)
print(z)
print(z.conjugate())

(1+2j)
(1-2j)


In [2]:
from fractions import Fraction

In [3]:
Fraction(2, 4)  # numerator, denominator; the result is reduced automatically

Fraction(1, 2)

In [4]:
Fraction(2.25)

Fraction(9, 4)

### Strings

In Python 3, there are two types that represent sequences of characters: bytes and strings. \
Instances of bytes contain raw 8-bit values.\
Instances of strings contain Unicode characters.\
Unicode is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. \
All strings are Unicode in Pythøn3, which is wonderful because it lets us type whatever we need in any language.\
The downside is that you end up encoding and decoding text whenever it moves to or from bytes. If you know the Unicode code point of the character you want to type, you can enter it as an escape sequence, e.g. "\u265E"


In [5]:
"\u265E"

'♞'
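To make the encoding and decoding step concrete, the sketch below converts a string to bytes and back. The sample text and the choice of UTF-8 are assumptions made for illustration; other encodings exist.

```python
text = "Pythøn ♞"                  # a str holds Unicode characters
data = text.encode("utf-8")        # encode: str -> bytes
print(type(text), type(data))      # <class 'str'> <class 'bytes'>

# The lengths differ because ø and ♞ need more than one byte in UTF-8
n_chars = len(text)                # 8 characters
n_bytes = len(data)                # 11 bytes

roundtrip = data.decode("utf-8")   # decode: bytes -> str
print(roundtrip == text)           # True
```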

In [7]:
s="1234"

We can also use index notation to examine parts of the string. This notation is *very* widely used in the Python ecosystem and is worth mastering before moving on. The general syntax is `sequence[start:stop:stride]`. `start` is inclusive, so the character at position `start` _will_ be included, but `stop` is exclusive! If you omit `start`, the start of the string (position `0` in Python) is assumed; if you omit `stop`, the slice runs through the end of the string (last character included); and if you omit `stride`, 1 is assumed.

In [8]:
s[0:4:2]

'13'
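The defaults for omitted slice values can be checked directly. This is a small sketch using the same string; the variable names are illustrative.

```python
s = "1234"

first_two = s[:2]      # '12'  start omitted, so the slice begins at position 0
last_two = s[2:]       # '34'  stop omitted, so the slice runs to the end
every_other = s[::2]   # '13'  start and stop omitted, stride of 2
middle = s[1:3]        # '23'  position 1 is included, position 3 is not

print(first_two, last_two, every_other, middle)   # 12 34 13 23
```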

You can also use negative numbers to specify locations relative to the _end_ of the string

In [9]:
s[-2]

'3'

Strings are immutable: once they have been created you can read values from them, but you can't update them in place. \
This isn't as limiting as it sounds, because you can assign your transformed strings to a new (or the same) variable.

In [10]:
s[0]=8

TypeError: 'str' object does not support item assignment

Strings also have lots of methods which you can use to transform them. Try typing `s.<TAB>` and see what Jupyter suggests in the completions

In [11]:
s.replace('1','9')

'9234'

### Lists

Lists might be the most flexible collection object in Python. They are ordered collections which you can fill with different types of objects (strings, numbers, other lists, etc.). You can iterate over them, add and remove elements, etc. To make a list you surround the elements with square brackets and separate the items with commas

In [12]:
L=[1,'two',3.0]
L

[1, 'two', 3.0]

You can use the same indexing syntax as before...

In [13]:
L[0:3:2]

[1, 3.0]

Lists are mutable, so you can update them in place

In [14]:
L[0] = 'one'
L

['one', 'two', 3.0]

You can also `.append` to them, `.pop` items off the end, etc. Create a list and hit `<TAB>` twice to get some ideas

In [15]:
L.append('four')
L

['one', 'two', 3.0, 'four']

In [18]:
L.reverse() # N.B. This will reverse in place, try running it twice
L

['four', 3.0, 'two', 'one']

In [19]:
L.reverse()
L

['one', 'two', 3.0, 'four']

Lists are iterable so you can easily loop over them, e.g.

In [20]:
for number in L:
    print(number)

one
two
3.0
four


Before we move on, one common idiom you will see for iterating over lists is the "list comprehension". This is a quick and neat way of building a list (it returns a list):

In [21]:
L2 = [2*number for number in range(4)]
L2

[0, 2, 4, 6]

It is possible to add conditions in the for loop or to put lots of logic in the expression at the beginning, but this is usually a bad idea: such comprehensions can quickly become unreadable.
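For reference, a single simple condition looks like the sketch below. It is shown next to the equivalent explicit loop, which is usually the clearer choice once the logic grows. The names `evens` and `evens_loop` are illustrative.

```python
# A condition after the for clause filters the items
evens = [n for n in range(10) if n % 2 == 0]
print(evens)                 # [0, 2, 4, 6, 8]

# The same logic written as an explicit loop
evens_loop = []
for n in range(10):
    if n % 2 == 0:
        evens_loop.append(n)

print(evens_loop == evens)   # True
```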

### Dictionaries

Together, dictionaries and lists cover most of the collections you will see in Python. A dictionary is a collection of key:value pairs. It is not sorted, although since Python 3.7 it does remember the order in which keys were inserted. A HUGE portion of the programming problems you will face boil down to implementing some sort of hash lookup, and this is where dictionaries excel. You can create dictionaries with the curly brace notation

In [22]:
D = {'two':2,'three':3}
D.values()

dict_values([2, 3])

The same square bracket notation is used to get items out of the dictionary

In [23]:
D['three']


3

Dictionaries are mutable so you can change items in place

In [24]:
D['three']=1
D

{'two': 2, 'three': 1}

In [25]:
D['four']=4
D

{'two': 2, 'three': 1, 'four': 4}

Dictionaries have lots of methods (look at the documentation or the `<TAB>` completions), but one particularly important one is `items()`. 
There are also a few distinct ways of iterating over a dictionary. Iteration follows the order in which the keys were inserted (guaranteed since Python 3.7), not sorted order. If you _need_ sorted order, you can get it by sorting the keys (or values) in the loop

In [26]:
for key, value in D.items():
    print(f"{key}: {value}")

two: 2
three: 1
four: 4


In [27]:
for key,value in sorted(D.items()):
    print(f"{key}: {value}")

four: 4
three: 1
two: 2


Dictionaries also have a comprehension idiom for quickly creating simple dictionaries, but again, use it sparingly and where it won't cause confusion.

In [28]:
D2 = {str(i): i for i in range(5) }
D2

{'0': 0, '1': 1, '2': 2, '3': 3, '4': 4}

The `in` keyword will test for the existence of keys in a dict, e.g.

In [29]:
'2' in D2

True

In [30]:
'33' in D2

False

Placing `**` in front of a dictionary unpacks its key:value pairs.
Using that syntax you can combine dictionaries; when the same key appears more than once, the value from the later dictionary wins.
`**` is also used in function definitions to indicate that the function accepts an arbitrary number of keyword arguments, which are collected into a dictionary

In [32]:
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
z = {**x, **y}
print(x)
print(y)
print(z)

{'a': 1, 'b': 2}
{'b': 3, 'c': 4}
{'a': 1, 'b': 3, 'c': 4}
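The use of `**` in functions can be sketched as follows. The function `describe` and the dictionary `settings` are hypothetical names introduced only for this illustration.

```python
def describe(**kwargs):
    # kwargs is an ordinary dict holding whatever keyword arguments were passed
    return ", ".join(f"{key}={value}" for key, value in kwargs.items())

# Keyword arguments are collected into the kwargs dictionary
direct = describe(a=1, b=2)
print(direct)                    # a=1, b=2

# An existing dictionary can be unpacked into keyword arguments with **
settings = {'depth': 100, 'units': 'm'}
unpacked = describe(**settings)
print(unpacked)                  # depth=100, units=m
```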


### Tuples

Tuples are conceptually similar to lists, but they are immutable. This can make them more efficient in some contexts and, because you know they aren't going to be modified, you'll quite often see functions and methods returning tuples rather than lists. The syntax for creating them is very similar to lists, but with parentheses

In [33]:
T=(1,2,3)
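A short sketch of what immutability means in practice: reading and unpacking a tuple works, while assigning to an element raises an error. The variable names are illustrative.

```python
T = (1, 2, 3)

# Reading, slicing and unpacking work just as for lists
first, second, third = T
tail = T[1:]                     # (2, 3)

# Item assignment is not allowed on a tuple
try:
    T[0] = 10
except TypeError as err:
    message = str(err)
    print(message)               # 'tuple' object does not support item assignment

# A one-element tuple needs a trailing comma; (1) is just the integer 1
single = (1,)
print(type(single), type((1)))   # <class 'tuple'> <class 'int'>
```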

Tuples kind-of have a notion of a comprehension, but the parenthesized form returns a generator rather than the complete collection. Generators are a more advanced topic, but basically they implement the idea of a stream: calling the built-in `next()` on a generator lets you walk along it, producing new values "lazily", only when they are requested.

In [34]:
a = (i for i in range(4))
print(a)
print(list(a))
print(type(a))
print(type(list(a)))

<generator object <genexpr> at 0x7ff8db3e4c80>
[0, 1, 2, 3]
<class 'generator'>
<class 'list'>


Once a generator has yielded all its contents it is exhausted, so converting it to a list again gives an empty list.

In [35]:
print(list(a))

[]
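The lazy behaviour can be seen by requesting values one at a time with the built-in `next()`. This is a minimal sketch; the generator `gen` is illustrative.

```python
gen = (i * i for i in range(3))

# Each call to next() computes and returns one more value
first = next(gen)     # 0
second = next(gen)    # 1
third = next(gen)     # 4
print(first, second, third)   # 0 1 4

# Asking for more than the generator can supply raises StopIteration
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

print(exhausted)      # True
```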


Tuples are often created implicitly to wrap several values, for example when returning from a function (a function returns a single object, but that object can be a tuple)

In [36]:
import numpy as np
def giveMaxAndMin(x):
    return x.max(),x.min()
x=np.array([1,2,4])
a,b=giveMaxAndMin(x)
print(type(a),type(b))
print(a,b)


<class 'numpy.int64'> <class 'numpy.int64'>
4 1


In [37]:
c=giveMaxAndMin(x)
print(type(c))
print(c)

<class 'tuple'>
(4, 1)


Tuples are created implicitly in many other Python statements, whenever several values have to be wrapped in one variable.

In [38]:
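a = np.linspace(0, 1, 10)
b = np.power(a, 2)
# zip pairs up the elements of a and b; each pair is a tuple unpacked into x, y
for x, y in zip(a, b):
    print("%.1f -> %.2f" % (x, y))
c = next(zip(a, b))  # the first (x, y) pair produced by zip
print(type(c))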

0.0 -> 0.00
0.1 -> 0.01
0.2 -> 0.05
0.3 -> 0.11
0.4 -> 0.20
0.6 -> 0.31
0.7 -> 0.44
0.8 -> 0.60
0.9 -> 0.79
1.0 -> 1.00
<class 'tuple'>


In [39]:
zip?

Init signature: zip(self, /, *args, **kwargs)
Docstring:     
zip(*iterables) --> A zip object yielding tuples until an input is exhausted.

   >>> list(zip('abcdefg', range(3), range(4)))
   [('a', 0, 0), ('b', 1, 1), ('c', 2, 2)]

The zip object yields n-length tuples, where n is the number of iterables
passed as positional arguments to zip().  The i-th element in every tuple
comes from the i-th iterable argument to zip().  This continues until the
shortest argument is exhausted.
Type:           type
Subclasses:     


### Sets

Sets are not very common, but they can be useful in some contexts. They are mutable, so you can add new items to them, but if the item already exists in the set no change is made. This can be useful because the items of a set are unique by construction. If you look at the methods on a set you will see the usual set operations: intersection, union, etc.

In [40]:
a = set((1, 2, 3))

a == set((1, 2, 3, 3))

True
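The set operations mentioned above can be sketched as follows. The sets `a` and `b` and the sample list are illustrative values only.

```python
a = {1, 2, 3}
b = {3, 4}

# Adding an item that is already present leaves the set unchanged
a.add(3)
size_after_add = len(a)       # still 3

# The usual set operations are available as operators (and as methods)
union = a | b                 # same as a.union(b)
common = a & b                # same as a.intersection(b)
only_in_a = a - b             # same as a.difference(b)
print(sorted(union), sorted(common), sorted(only_in_a))   # [1, 2, 3, 4] [3] [1, 2]

# Converting a list to a set is a quick way to drop duplicates
unique = set([1, 1, 2, 2, 3])
print(sorted(unique))         # [1, 2, 3]
```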

### Files

There are many different ways to interact with files.
We can use a syntax similar to C or MATLAB:

In [41]:
f = open('myfile.txt', 'w') # w is the mode of the operation, write in this case
f.write("GOPH699\n")
f.write("Lecture 02\n")
print(type(f))
f.close()

<class '_io.TextIOWrapper'>


We opened the file in write mode, which will clobber any existing file called 'myfile.txt' in this directory. Take a look at the help for `open` for append and other modes. The `close()` method at the end is important: it flushes any remaining output to the file.

In [42]:
f = open('myfile.txt', 'r')
contents = f.read() # This reads the _whole_ file! Be careful
f.close()
print(type(contents))
print(contents)

<class 'str'>
GOPH699
Lecture 02



In [44]:
!cat myfile.txt

GOPH699
Lecture 02


By default `open` works with text files, but you can add `b` to the mode (e.g. `wb`, `rb`, or `w+b`, `r+b` for combined reading and writing) to deal with binary files 

Before we move on, there is one common idiom you will see when opening files: the `with` statement, which closes the file automatically when its block ends.
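The original cell for this idiom is missing here, so the following is only a sketch of what it most likely refers to: opening the file in a `with` block, so that `close()` is called for you even if an error occurs. Writing to a temporary directory is an assumption made so that the example does not overwrite any existing file.

```python
import os
import tempfile

# Use a temporary directory so that no existing file is overwritten
path = os.path.join(tempfile.mkdtemp(), 'myfile.txt')

# The file is closed automatically when the with block ends
with open(path, 'w') as f:
    f.write("GOPH699\n")
    f.write("Lecture 02\n")

# A file object is iterable, so it can be read one line at a time
with open(path, 'r') as f:
    lines = [line.rstrip("\n") for line in f]

print(f.closed)   # True
print(lines)      # ['GOPH699', 'Lecture 02']
```

Reading line by line in this way also avoids loading the whole file into memory at once.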

### Other Types

There are lots of other types, but the two most common you will see are Booleans and the special type None.\
Booleans take one of two values, `True` or `False`, and they behave as you would expect.

`None` is a kind of placeholder, similar in spirit to NULL in other languages. You will often see it in tests (e.g. `if databaseConnection is None:`)
or passed as the value of an argument you want to ignore.

In [45]:
0 == None

False
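A common pattern is to use `None` as the default value of an optional argument and to test for it with `is`. The function `scale` below is a hypothetical helper written only for this sketch.

```python
def scale(values, factor=None):
    # Hypothetical helper: None means "no factor was supplied"
    if factor is None:
        return list(values)
    return [v * factor for v in values]

unscaled = scale([1, 2])       # [1, 2]
scaled = scale([1, 2], 3)      # [3, 6]
print(unscaled, scaled)

# None is its own object: it is not equal to 0 or False
print(None == 0, None == False, None is None)   # False False True

# Booleans behave like the integers 1 and 0 in arithmetic
true_count = sum([True, False, True])
print(true_count)              # 2
```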