### Introduction to Week 2
This week, we will look a little more into manipulating numerical variables and structures.

For the second half of the lesson, we will look at some more data structures (tuples and dictionaries), indexing, and list comprehensions. 

We will close by looking at how libraries work, and at some basic file handling.

#### Part 1.1: Numerical variables II

In [1]:
# we can also set a variable equal to a number
a = 1

In [2]:
print(a)

1


In [3]:
# We can use this fact to do basic mathematical operations. Addition:
b = 3
c = a + b
print(c)

4


In [4]:
# subtraction
d = b - a
print(d)

2


In [5]:
# multiplication
e = d * c
print(e)

8


In [6]:
# division
f = e / b
print(f)

2.6666666666666665


In [7]:
# finding the remainder of a division:
g = e % b
print('the remainder of %s divided by %s is %s' % (e, b, g))

the remainder of 8 divided by 3 is 2
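
Alongside `/` and `%`, Python also offers floor division (`//`) and the built-in `divmod()`, which returns the quotient and the remainder together. A minimal sketch, reusing the values `e = 8` and `b = 3` from the cells above (redefined here so the block runs on its own; `quotient`, `remainder` and `both` are illustrative names):

```python
# values copied from the cells above so this block is self-contained
e = 8
b = 3

quotient = e // b      # floor division: how many whole times b fits into e
remainder = e % b      # what is left over
both = divmod(e, b)    # (quotient, remainder) in a single call

print(quotient, remainder, both)

# the pieces always recombine to the original number
print(quotient * b + remainder == e)
```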


In [8]:
# raising items to the power of another item
print(e)
print(b)
print('%s cubed is %s' % (e, e ** b))

8
3
8 cubed is 512


In [9]:
# we can start to do fancier operations with numpy, e.g. array and matrix math
import numpy as np

In [10]:
Arr1 = np.array([[3,4,5],[5,7,8]])

In [11]:
Arr1

array([[3, 4, 5],
       [5, 7, 8]])

In [12]:
Arr2 = np.array([[2,2,2],[2,2,2]])
Arr2 * Arr1

array([[ 6,  8, 10],
       [10, 14, 16]])

In [13]:
# for element-wise operations like the one above, arrays need compatible shapes (here: identical shapes). If the shapes do not match, we can use commands like reshape / transpose

In [14]:
Arr2 = np.array([[1,2],[8,7],[5,7]])

In [15]:
Arr2

array([[1, 2],
       [8, 7],
       [5, 7]])

In [16]:
Arr1 * Arr2

ValueError: operands could not be broadcast together with shapes (2,3) (3,2) 

In [17]:
Arr2 = Arr2.transpose()

In [18]:
Arr2

array([[1, 8, 5],
       [2, 7, 7]])

In [19]:
Arr2 * Arr1

array([[ 3, 32, 25],
       [10, 49, 56]])
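
Note that `*` multiplies arrays element by element. For a matrix product, NumPy provides the `@` operator, which needs the inner dimensions to match rather than identical shapes. A short sketch, with the arrays from above redefined under the illustrative names `left` and `right` so the block is self-contained:

```python
import numpy as np

left = np.array([[3, 4, 5], [5, 7, 8]])      # shape (2, 3), same values as Arr1
right = np.array([[1, 2], [8, 7], [5, 7]])   # shape (3, 2), Arr2 before the transpose

# element-wise product: shapes must be compatible, so transpose first
elementwise = left * right.T

# matrix product: (2, 3) @ (3, 2) gives a (2, 2) result
matrix_product = left @ right

# broadcasting: a length-3 row is stretched across both rows of `left`
scaled = left * np.array([1, 10, 100])

print(elementwise)
print(matrix_product)
print(scaled)
```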

In [20]:
# Generate a frame of all zeroes with np.zeros()
np.zeros([3,3])

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [21]:
# Populate the numpy array with [row][column] indexing. Note - compared with (x,y) coordinates it is flipped! rows first, then columns
empty_frame = np.zeros([3,3])
empty_frame

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [22]:
empty_frame[2][2] = 9
empty_frame[0][2] = 4
empty_frame

array([[0., 0., 4.],
       [0., 0., 0.],
       [0., 0., 9.]])
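
NumPy also accepts both indices inside a single pair of brackets, `frame[row, column]`, and a colon selects a whole row or column. A minimal sketch (the names `frame`, `last_column` and `first_row` are just for illustration):

```python
import numpy as np

frame = np.zeros([3, 3])

frame[2, 2] = 9    # same cell as frame[2][2]: row 2, column 2
frame[0, 2] = 4    # row 0, column 2

last_column = frame[:, 2]   # every row, column 2
first_row = frame[0, :]     # row 0, every column

print(last_column)
print(first_row)
```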

### Part 2: Data Structures II: Tuples, Lists II, Dictionaries

#### Part 2.1: Tuples

In [23]:
# tuples are a fundamental data structure. They hold data contained within parentheses
I_am_tuple = (4,5)
I_am_also_a_tuple = ('apex','legend')

In [24]:
# They can have more than 2 elements:
mega_tuple = (4,5,6,7,8,9)

In [25]:
# you can reference within them:
mega_tuple[0]

4

In [26]:
mega_tuple[3]

7

In [27]:
mega_tuple[-1]

9

In [28]:
# tuples are immutable. This means you can't edit them once they have been created.
mega_tuple[0] = 7

TypeError: 'tuple' object does not support item assignment
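
Because a tuple cannot be edited in place, the usual workaround is to build a new one. One possible approach combines a slice with the replacement value; converting to a list and back is another. A small sketch (`updated`, `as_list` and `updated_again` are illustrative names):

```python
mega_tuple = (4, 5, 6, 7, 8, 9)

# build a new tuple: the replacement value followed by everything after index 0
updated = (7,) + mega_tuple[1:]

# alternative: go via a list, which is mutable, then convert back
as_list = list(mega_tuple)
as_list[0] = 7
updated_again = tuple(as_list)

print(updated)
print(mega_tuple)   # the original tuple is untouched
```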

In [29]:
# They are iterable, however:
for i in mega_tuple:
    print(i)

4
5
6
7
8
9


In [30]:
# The numerical ordering of tuples can be quite a useful property, if data is stored in a regular way. 
# Let's make a small phone book example

In [31]:
Frodo = ('Frodo','+202 569 8745','frodo@baggins_shire.com')
Sam = ('Sam', '+202 456 5646', 'sam@samwise_gamgee.com')

In [32]:
# Let's print just the people's names:
for person in [Frodo, Sam]:
    print(person[0])

Frodo
Sam


In [33]:
# Let's print just the people's emails:
for person in [Frodo, Sam]:
    print(person[2])

frodo@baggins_shire.com
sam@samwise_gamgee.com


In [34]:
# Let's print out a formatted string:
for person in [Frodo, Sam]:
    print('My name is %s, and you can call me on %s or email me at %s' % (person[0], person[1], person[2]))

My name is Frodo, and you can call me on +202 569 8745 or email me at frodo@baggins_shire.com
My name is Sam, and you can call me on +202 456 5646 or email me at sam@samwise_gamgee.com
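
Since every record stores its fields in the same order, the loop can also unpack each tuple into named variables instead of indexing by position. A sketch of that idea, using an f-string (available from Python 3.6) in place of `%` formatting; the variable names `name`, `phone`, `email` and `messages` are chosen for illustration:

```python
Frodo = ('Frodo', '+202 569 8745', 'frodo@baggins_shire.com')
Sam = ('Sam', '+202 456 5646', 'sam@samwise_gamgee.com')

messages = []
for name, phone, email in [Frodo, Sam]:
    # each tuple is unpacked into three variables, in order
    messages.append(f'My name is {name}, and you can call me on {phone} or email me at {email}')

for message in messages:
    print(message)
```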


#### Part 2.2: Dictionaries

In [35]:
# A dictionary is another fundamental data structure. The value of a dictionary lies in its ('key':'value') organisation, much like
# looking up a word in a standard dictionary!

In [36]:
fellowship = {'hobbit_1':'Frodo',
             'hobbit_2':'Sam',
             'hobbit_3':'Pippin',
             'hobbit_4':'Merry'}

In [37]:
# we can call the values from the keys:
fellowship['hobbit_1']

'Frodo'

In [38]:
# we can also list the keys (this returns a 'view' object - wrap it in list() if you need a true list):
fellowship.keys()

dict_keys(['hobbit_1', 'hobbit_2', 'hobbit_3', 'hobbit_4'])

In [39]:
# ...and the values:
fellowship.values()

dict_values(['Frodo', 'Sam', 'Pippin', 'Merry'])
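
To walk through keys and values together, dictionaries offer `.items()`, and `.get()` looks up a key but returns a fallback instead of raising a `KeyError` when the key is missing. A brief sketch (the key `'hobbit_5'` is deliberately absent, and `names` and `missing` are illustrative names):

```python
fellowship = {'hobbit_1': 'Frodo',
              'hobbit_2': 'Sam',
              'hobbit_3': 'Pippin',
              'hobbit_4': 'Merry'}

# .items() yields (key, value) pairs
for key, value in fellowship.items():
    print(key, '->', value)

# wrap a view in list() if an actual list is needed
names = list(fellowship.values())

# .get() returns the default rather than failing on a missing key
missing = fellowship.get('hobbit_5', 'nobody')
print(names, missing)
```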

In [40]:
# we can store complex objects in the dictionary in nested structure
fellowship_contact_info = {'Frodo':Frodo,
                          'Sam':Sam}

In [41]:
fellowship_contact_info['Frodo']

('Frodo', '+202 569 8745', 'frodo@baggins_shire.com')

In [42]:
# We can also reference objects within objects:
fellowship_contact_info['Frodo'][2]

'frodo@baggins_shire.com'

In [43]:
# remember that the components of this dictionary are tuples.
# alternatively, we could use dictionaries of dictionaries:

Frodo = {'name':'Frodo','cell':'+202 569 8745','email':'frodo@baggins_shire.com'}
Sam = {'name':'Sam','cell':'+202 456 5646','email':'sam@samwise_gamgee.com'}

# Put reformatted dicts into the contact_info dict instead of the tuples
fellowship_contact_info = {'Frodo':Frodo,
                          'Sam':Sam}

In [44]:
fellowship_contact_info['Sam']

{'name': 'Sam', 'cell': '+202 456 5646', 'email': 'sam@samwise_gamgee.com'}

In [45]:
fellowship_contact_info['Sam']['cell']

'+202 456 5646'

In [46]:
fellowship_contact_info['Sam']['email']

'sam@samwise_gamgee.com'

In [47]:
# This is essentially the structure of a JSON file - which is typically organized as nested dictionaries (and lists)!
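
To illustrate the JSON connection, the standard-library `json` module can turn a nested dictionary into JSON text and back again. A minimal sketch, with the contact data from above redefined so the block runs alone (`as_text` and `round_trip` are illustrative names):

```python
import json

fellowship_contact_info = {
    'Frodo': {'name': 'Frodo', 'cell': '+202 569 8745', 'email': 'frodo@baggins_shire.com'},
    'Sam': {'name': 'Sam', 'cell': '+202 456 5646', 'email': 'sam@samwise_gamgee.com'},
}

# dictionary -> JSON-formatted string (indent makes it human-readable)
as_text = json.dumps(fellowship_contact_info, indent=2)
print(as_text)

# JSON-formatted string -> dictionary
round_trip = json.loads(as_text)
print(round_trip['Sam']['email'])
```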

#### Part 2.3 Lists II

In [48]:
# Lists can contain many different types of objects - even other lists
A = ['Mary','Kate','Clare']
B = ['Banana','Apple','Mango']
mega_list = [A,B]

In [49]:
mega_list

[['Mary', 'Kate', 'Clare'], ['Banana', 'Apple', 'Mango']]

In [50]:
# we can also reference individual objects inside other objects if they behave in the same way
mega_list[0][0]

'Mary'

In [51]:
mega_list[0][2]

'Clare'

In [52]:
mega_list[1][2]

'Mango'

In [53]:
# We don't have to just put lists inside of lists - they are very flexible containers:
phone_numbers = {'Clare': '01789', 'Mary': '01123', 'Kate': '01456'}
emails = {'Clare': 'clare@madeup.org', 'Mary': 'Mary@madeup.org', 'Kate': None}

In [54]:
information = [phone_numbers, emails]

In [55]:
# the referencing happens from left to right, in terms of the layers of the object. 
information[1]

{'Clare': 'clare@madeup.org', 'Mary': 'Mary@madeup.org', 'Kate': None}

In [56]:
# the referencing behaviour follows the rules of the object being referenced.
# list objects can only be referenced by the 'index' - the numerical position of the object within the list, counting up from 0
# dictionaries can't be indexed this way, as we have just learned - they are indexed by the 'key'.
# The first example below will fail, as 0 is not a key in the dictionary
information[1][0]

KeyError: 0

In [57]:
information[1]['Mary']

'Mary@madeup.org'

In [58]:
# We aren't restricted to just one level... We can go as many levels deep as we want:

In [59]:
A = [[1,2,3],[4,5,6]]
B = [[65,66,67],[456,765,123]]
C = ['a surprise string']

In [60]:
D = [[A,B],C]
D

[[[[1, 2, 3], [4, 5, 6]], [[65, 66, 67], [456, 765, 123]]],
 ['a surprise string']]

In [61]:
D[0]

[[[1, 2, 3], [4, 5, 6]], [[65, 66, 67], [456, 765, 123]]]

In [62]:
D[1]

['a surprise string']

In [63]:
D[0][0]

[[1, 2, 3], [4, 5, 6]]

In [64]:
D[0][0][1]

[4, 5, 6]

In [65]:
D[0][0][1][2]

6
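
The introduction mentions list comprehensions, and nested lists are a natural place to try one. As a sketch, the comprehension below flattens one level of nesting, collecting every number from the list `A` defined above (redefined here; `flat` and `doubled` are illustrative names):

```python
A = [[1, 2, 3], [4, 5, 6]]

# read as: for each inner list in A, for each number in that inner list, keep the number
flat = [number for inner in A for number in inner]

# a comprehension can also transform items as it goes
doubled = [number * 2 for number in flat]

print(flat)
print(doubled)
```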

In [66]:
# The same is also true of dictionaries:

In [67]:
phone_numbers = {'UK':{'Clare': '01789', 'Mary': '01123', 'Kate': '01456'},
                 'USA':{'Candice': '04568','Shirley':'041478','Dolly':'4815241'}}
emails = {'UK':{'Clare': 'clare@madeup.org', 'Mary': 'Mary@madeup.org', 'Kate': None},
         'USA':{'Candice': 'candice@USAUSAUSA.org','Shirley':'shirley@USAUSAUSA.org','Dolly':'dolly@DParton.org'}}
contact_info = {'cell':phone_numbers,'email':emails}

In [68]:
contact_info['cell']['USA']['Shirley']

'041478'

In [69]:
contact_info['email']['USA']['Shirley']

'shirley@USAUSAUSA.org'

In [70]:
# the order is important! When we mess with it, we will not get the same result
contact_info['email']['Shirley']['USA']

KeyError: 'Shirley'
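
When it is unclear whether every key along the way exists, a small helper can walk the levels one at a time and return a fallback instead of raising a `KeyError`. The function `get_nested` below is a hypothetical helper written for this lesson, not something built into Python, and the dictionary is a trimmed-down copy of the one above:

```python
def get_nested(data, keys, default=None):
    """Follow `keys` through nested dictionaries, left to right."""
    current = data
    for key in keys:
        # stop as soon as a level is not a dict or lacks the key
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


contact_info = {
    'cell': {'USA': {'Shirley': '041478'}},
    'email': {'USA': {'Shirley': 'shirley@USAUSAUSA.org'}},
}

right_order = get_nested(contact_info, ['email', 'USA', 'Shirley'])
wrong_order = get_nested(contact_info, ['email', 'Shirley', 'USA'], default='not found')
print(right_order, wrong_order)
```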

In [71]:
# Python is very good about letting you use whitespace and line returns to break up ugly blocks of information.
# The above can be represented as:

phone_numbers = {'UK':
                     {'Clare': '01789', 
                      'Mary': '01123', 
                      'Kate': '01456'},
                 'USA':
                     {'Candice': '04568',
                      'Shirley':'041478',
                      'Dolly':'4815241'}
                }

emails = {'UK':
              {'Clare': 'clare@madeup.org', 
               'Mary': 'Mary@madeup.org', 
               'Kate': None},
         'USA':
              {'Candice': 'candice@USAUSAUSA.org',
               'Shirley':'shirley@USAUSAUSA.org',
               'Dolly':'dolly@DParton.org'}
         }

contact_info = {
                'cell':phone_numbers,
                'email':emails
               }

# this can really help us keep track of what is going on! Inside brackets python ignores the indentation, but keeping it consistent at each level makes the structure far easier to read

### Part 3: Libraries and File Handling

#### 3.1 Importing Libraries

In [72]:
# Most of the work you do in python will not be your own code. 
# You will be importing other people's code ("libraries") all over the place. 
# The way we do this is through the import statement - 'import' is a reserved keyword in python

In [73]:
import os

In [74]:
# The above line imports a standard library - the 'os' or 'operating system' library. 
# This library comes pre-installed with python
# Others you will have to install yourself using pip or conda - more on that later

In [75]:
# sometimes, you will not want the entire library as it is big (e.g. scikit-learn). You may want a small part of the library.
# handily, most libraries are organized into sub-modules.
# In each case you are importing either functions or classes; if you know the name of what you want, you can just import that one item:

In [76]:
from shapely.geometry import LineString

In [77]:
# Here, I import from the shapely library, looking in the 'geometry' sub-module, for the object called 'LineString'

In [78]:
# If you don't know what you are looking for, you can use the * wildcard to import everything from a location (use sparingly - it can overwrite names you have already defined):

In [79]:
from shapely.geometry import *

In [80]:
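# along a similar vein, the following two both load shapely, but they are not quite equivalent:
# 'import shapely' keeps everything behind the 'shapely.' prefix, while 'from shapely import *' copies its public names straight into your namespace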

In [81]:
import shapely 

In [82]:
from shapely import *
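
A small sketch of how the import styles differ in practice, using the standard-library `math` module so that it runs without installing anything: `import math` keeps everything behind the `math.` prefix, while `from math import sqrt` places the single name `sqrt` directly in the current namespace. The variable names `via_module` and `direct` are illustrative.

```python
import math               # names are reached through the module: math.sqrt
from math import sqrt     # only `sqrt` is brought in, usable without a prefix

via_module = math.sqrt(16)
direct = sqrt(16)

print(via_module, direct)

# `from math import *` would bring in every public name at once,
# which can silently overwrite variables that share a name with one of them
```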

In [83]:
# You will often see contractions of library names in code. 
# These are defined in the import statement through the use of the 'as' keyword.
# Some common examples:

In [84]:
import numpy as np # library for maths / matrix handling
import pandas as pd # python implementation of referenceable tabular data structures (excel, but in python)
import geopandas as gpd # geospatial version of pandas, for handling data with location attributes (excel+)

In [85]:
# At its simplest, a library is just a saved .py file that contains functions and classes (bigger libraries are folders of such files). It is nothing more complicated than that.
# If you want to be able to import functions and classes you have written before in other projects, follow these steps:

In [86]:
# assume a file called filename.py in a location saved on your PC. Importing it works like this:

import sys # import system library
path_to_my_previous_work = r'C:\Users\charl\Documents\GitHub\GOST_PublicGoods\GOSTNets\GOSTNets' # set path to the folder containing filename.py
sys.path.append(path_to_my_previous_work) # add path to your library to the system path, the list of folders python searches on import (special! ask C.F. about this...)

import filename
# or, to give it a shorter alias:
import filename as fn

ModuleNotFoundError: No module named 'filename'
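
The `ModuleNotFoundError` above simply means that no `filename.py` could be found on the search path. As a self-contained sketch of the same mechanism, the block below writes a tiny module to a temporary folder, adds that folder to `sys.path`, and imports it. The module name `my_helpers` and its function `double` are made up for the example:

```python
import importlib
import os
import sys
import tempfile

# create a throwaway folder containing a one-function module
module_folder = tempfile.mkdtemp()
with open(os.path.join(module_folder, 'my_helpers.py'), 'w') as handle:
    handle.write('def double(x):\n    return x * 2\n')

# tell python to search this folder when importing
sys.path.append(module_folder)
importlib.invalidate_caches()   # make sure the newly written file is noticed

import my_helpers as mh

result = mh.double(21)
print(result)
```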

In [87]:
# dir command is very useful for inspecting the layout of libraries and what you can do with them

In [88]:
dir(shapely)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 'affinity',
 'algorithms',
 'coords',
 'ctypes_declarations',
 'errors',
 'ftools',
 'geometry',
 'geos',
 'impl',
 'linref',
 'ops',
 'predicates',
 'prepared',
 'speedups',
 'topology',
 'wkb']

In [89]:
# names that start and end with a double underscore are special attributes holding information about the library - not sub-modules, functions or classes

In [90]:
shapely.__version__

'1.6.4.post2'

In [91]:
shapely.__path__

['/anaconda3/envs/geo5/lib/python3.6/site-packages/shapely']

In [92]:
dir(shapely.geometry)

['CAP_STYLE',
 'GeometryCollection',
 'JOIN_STYLE',
 'LineString',
 'LinearRing',
 'MultiLineString',
 'MultiPoint',
 'MultiPolygon',
 'Point',
 'Polygon',
 '__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 'asLineString',
 'asLinearRing',
 'asMultiLineString',
 'asMultiPoint',
 'asMultiPolygon',
 'asPoint',
 'asPolygon',
 'asShape',
 'base',
 'box',
 'collection',
 'geo',
 'linestring',
 'mapping',
 'multilinestring',
 'multipoint',
 'multipolygon',
 'point',
 'polygon',
 'proxy',
 'shape',
 'shapely']
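
The output of `dir()` can get long. One way to trim it, sketched here on the standard-library `math` module so that it runs anywhere, is a list comprehension that drops names beginning with an underscore (`public_names` is an illustrative name):

```python
import math

# keep only the public names, skipping anything that starts with an underscore
public_names = [name for name in dir(math) if not name.startswith('_')]

print(len(public_names))
print(public_names[:5])

# every function carries its own short documentation string
print(math.sqrt.__doc__)
```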

#### Part 3.2 File Handling

In [93]:
# methods of importing files will depend on the file type. This is not as easy as double-clicking on something in the
# Microsoft Office suite! Different libraries have different methods for loading data

In [94]:
# This is standard practice for .csv file types

In [95]:
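import os
import pandas as pd

file_name = r'' # name of the file, including its extension
file_path = r'' # folder that the file lives in
full_path = os.path.join(file_path, file_name) # folder first, then file name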

df = pd.read_csv(full_path)

FileNotFoundError: [Errno 2] File b'' does not exist: b''
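
The `FileNotFoundError` above arises because the path strings were left empty. As a runnable sketch of the same pattern, the block below first writes a small CSV to a temporary folder and then reads it back with pandas. The file name `hobbits.csv` and its contents are invented for the example:

```python
import os
import tempfile

import pandas as pd

# folder first, file name second
file_path = tempfile.mkdtemp()
file_name = 'hobbits.csv'
full_path = os.path.join(file_path, file_name)

# write a tiny example file so there is something to read
with open(full_path, 'w') as handle:
    handle.write('name,cell\nFrodo,+202 569 8745\nSam,+202 456 5646\n')

# checking that the file exists gives a clearer message than a traceback
if os.path.isfile(full_path):
    df = pd.read_csv(full_path)
    print(df)
else:
    print('No file found at', full_path)
```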

In [96]:
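# We can also use pandas to open Excel files, with a couple of small modifications
file_name = r'' # name of the file, including its extension
file_path = r'' # folder that the file lives in
full_path = os.path.join(file_path, file_name) # folder first, then file name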

df = pd.read_excel(full_path)

ImportError: Install xlrd >= 1.0.0 for Excel support