# Welcome to python in jupyter notebooks!

Google colab is an easy way to work in python: it hosts a particular **python environment** which you can access using a **jupyter notebook**.


*   **Python environment**: A term used to describe the version of python you have installed, along with any extra python packages you have enabled (more on this later).
*   **Jupyter notebook**: A useful way of working with python that's a little more user friendly than coding directly into a python file.

## Important Notice: Google colab hosts its files outside RSA's secure network. Do not upload any RSA data to Google colab under any circumstances!



## Jupyter notebook breakdown

![jupyter](https://datascience.foundation/backend/web/uploads/blog/Working-with-Python-on-Cloud.png)

As opposed to being a flat code file (like writing SQL or SAS in a single file), jupyter notebooks are made up of **cells** (think excel cells for comparison). Cells can either contain **python code** or **markdown text**. 

### Markdown text
This cell contains markdown text, which uses special characters for formatting so that it can be shown correctly on various websites. 

 * Double click this cell to make the markdown version of the text appear!
 * Click away (on another cell) and the text will display as normal.

You can easily create code or text cells by pressing the ```+ Code``` and ```+ Text``` buttons towards the top of your screen. Cells can be easily deleted by clicking in a cell and clicking the bin icon on the toolbar in the upper right of the cell.


---

### Python code

![python](https://www.python.org/static/community_logos/python-logo-master-v3-TM-flattened.png)

Code cells allow pieces of code to be easily rewritten/run in isolation, so that you can make changes without needing to rewrite/run your entire code (just like running a single part of a sql/sas file). The output of a code cell appears beneath it once it has been run, and any variables the cell creates can then be used by other cells in the notebook.

The cell directly below this one contains some simple python code. You can run it in a few different ways:
 * Click into the cell and click the "play" button to the far left of the cell.
 * Click into the cell and use the keyboard command ```ctrl + enter```, which is the keyboard shortcut for running a cell.
 * Click into the cell and use the keyboard command ```shift + enter```, which is the keyboard shortcut for running a cell and immediately moving to the next cell underneath it. If no cell exists, this command creates a new one.

Try running the cell below now. 

In [None]:
print('Hello world!')

The next two cells show how information can be shared between cells in the notebook: we're going to save some information in the first cell, and then print it in the second. 

Both cells have to be run for the code to succeed and show us the output. Once we have run our cell to save our customer's name, the notebook will remember it until it is overwritten or the notebook is restarted. 

In [None]:
customer_name = 'Harry Potter'

In [None]:
print(customer_name)

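It's important to keep track of what the notebook has in memory. The notebook remembers the values from the last time each cell was run, not what is currently typed in the cell, so after editing a cell you need to re-run it (and any cells that depend on it) to make sure your results update correctly!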

Jupyter notebook cells have lots more interesting features, but we won't go into detail on them now. Feel free to explore and play around with code and text cells - there's no better way to understand how they work!

# Variables and Functions

We're going to start simply by talking about three python basics: **comments**, **variables** and **functions**.

 * Comments in python work much the same as comments in other programming languages. I'll use them in this notebook to help give some instruction in code cells. They can be written by beginning any line of code with the '#' character.
 * Variables hold data in memory so that we can work with it easily and without needing to type it out by hand. Variables can have lots of different types, but we'll focus on the most common python types in this notebook. Hopefully a few will be familiar!
  * Int: Contains whole, integer numbers
  * Float: Contains any number, including decimals
  * String: Contains text, including single characters and full sentences
  * Bool: Contains either True or False
 * Functions are tools in python that allow us to manipulate our variables. In python, these are normally a command word followed by brackets. Some common examples are:
  * print(): Prints information as text for us to read
  * type(): Returns the type of a variable
  * max(): Returns the maximum of a list of values
  * min(): Returns the minimum of a list of values
  * Standard maths operators (+, -, *, /)   
Most functions in excel, sql, and sas will have a python equivalent - more often than not, they can be found by googling something like "excel sum function in python"
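
As a quick, illustrative sketch of the built-in functions listed above (the variable names and values are made up for this example):

```python
# max() and min() pick out the largest and smallest of the values given
highest = max(250, 310, 199)
lowest = min(250, 310, 199)
print(highest)
print(lowest)

# Standard maths operators work on numbers directly
price_gap = highest - lowest
print(price_gap)

# type() tells us what kind of value we are holding
print(type(price_gap))
```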

Let's play with some of these now! 

## Assigning variables and basic variable types

Creating a variable in python is easy: we just need to name our variable whatever we want, and set its value using '='. Variables can be named almost anything, but need to follow three rules:
 * Variables can't contain spaces (but can contain underscores)
 * Variables can't begin with numbers
 * Variables can't be named after python's reserved keywords (like 'if' or 'for'), and shouldn't reuse the names of built-in functions and types (like 'int' or 'float'), because doing so hides the original.
 
Python will automatically select an appropriate type for the variable (although this can be manually overridden later on).
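
Overriding a type by hand is usually done with the built-in conversion functions `int()`, `float()` and `str()`. A minimal sketch, with variable names and values invented for the example:

```python
# Python picks the type from the value we assign
age_text = '42'          # quotes make this a string
print(type(age_text))

# int() converts the string into a whole number
age_number = int(age_text)
print(type(age_number))

# float() converts a whole number into a decimal number
age_decimal = float(age_number)
print(age_decimal)

# str() converts a number back into text
age_label = str(age_number) + ' years old'
print(age_label)
```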

Run the cell below, which creates some variables of a fictional driver and prints their types. Try replacing the variables with values of your own, or create new variables to see what type python assigns them. 

In [None]:
# First let's assign some variables
customer_name = 'Harry Potter'
age = 42
driver_rating = 3.5
only_driver = True

# Now let's print the types of our variables
print(type(customer_name))
print(type(age))
print(type(driver_rating))
print(type(only_driver))

The best variable names are short and describe what information the variable holds, but python will let you name them whatever you like.

In [None]:
# Let's assign the same values to new variable names
name_of_my_nextdoor_neighbour = 'Ron Weasley'
number_of_dogs_I_own = 42
how_many_times_did_I_fall_asleep_at_work_today = 3.5
am_I_going_to_eat_a_whole_pizza_for_dinner = True

# And now print them just as before
print(type(name_of_my_nextdoor_neighbour))
print(type(number_of_dogs_I_own))
print(type(how_many_times_did_I_fall_asleep_at_work_today))
print(type(am_I_going_to_eat_a_whole_pizza_for_dinner))

In [None]:
# When it comes to using maths operators with variables, we have to be 
# careful about mixing variable types. 

# Set the variables below using the drop down menus, and see what
# happens when you try to add them together. Don't forget to run the
# cell once you've chosen your variables!

variable1 = 10 #@param [10, 3.75, "'Hello'"] {type: "raw"}
variable2 = 20 #@param [20, 36.3333339, "'World'"] {type: "raw"}

print(variable1 + variable2)
print(type(variable1 + variable2))

Strings can be added to strings to create longer strings (the strings are "concatenated").  
Two ints added together will remain ints, but an int added to a float will become a float.
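
Mixing a string with a number in an addition is the case that fails. A short sketch with made-up values, showing the usual fix of converting the number with `str()` first:

```python
rating = 3.5
label = 'Driver rating: '

# Uncommenting the line below raises a TypeError, because python will not
# add a string and a number together
#print(label + rating)

# Converting the number to a string first makes the addition work
combined = label + str(rating)
print(combined)
```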

In [None]:
# Multiplication has its own rules about mixing variable types. 

# Set the variables below using the drop down menus, and see what
# happens when you try to multiply them together. Don't forget to run the
# cell once you've chosen your variables!

variable1 = 10 #@param [10, 3.75, "'Hello'"] {type: "raw"}
variable2 = 20 #@param [20, 36.3333339, "'World'"] {type: "raw"}

print(variable1 * variable2)
print(type(variable1 * variable2))


Strings can be multiplied by ints to repeat them many times, but can't be multiplied by floats.   
Ints multiplied by ints remain as ints.  
Ints multiplied by floats become floats. 
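
The rules above can be checked without the drop down menus. A brief sketch with made-up values; the last lines add division, which always gives a float when `/` is used:

```python
# A string multiplied by an int is repeated that many times
cheer = 'Ha' * 3
print(cheer)

# int * int stays an int
whole = 4 * 2
print(type(whole))

# int * float becomes a float
decimal = 4 * 2.5
print(decimal)
print(type(decimal))

# Division with / always gives a float, even when the answer is whole
half = 10 / 2
print(half)
```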

# Importing packages

Importing packages is an extremely common python practice. Core python is kept lightweight, so many of the more advanced tools aren't loaded by default and have to be imported before use. Some packages come with every python installation (the "standard library"), while others are written by other authors for more specific jobs and need to be installed separately. 


It's good practice to put all of your import statements at the top of your python file/notebook, because it allows users to see if they have all of the required packages installed before running the code.

Additional packages can be imported using the import statement, which looks like:

```
import package_name
```

Sometimes, packages will be imported using an alias which is shorter and easier to type (useful for some more advanced techniques). This looks like:

```
import package_name as alias
```
An alias can be thought of like a variable name - it's completely decided by the user and is just a shorthand to access the package's information.

Some common imports to see in data science/analysis are pandas, numpy, and pyplot (a subpackage of matplotlib). We'll discuss these more in future sessions, but these imports commonly look like:
```
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
```
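
Here is a small sketch of imports in action, using two packages that ship with python itself so nothing extra needs installing. The alias `stats` and the ratings are just choices made for this example:

```python
import math
import statistics as stats

# Functions inside a package are reached with package_name.function_name
root = math.sqrt(16)
print(root)

# With an alias, we use the alias instead of the full package name
average_rating = stats.mean([3.5, 4.0, 4.5])
print(average_rating)
```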

# Tuples, dictionaries, and lists

One of python's most useful features is the ability to group multiple basic objects into a single collection which can be easily referred to and used in code. This is done through the use of **tuples**, **lists**, and **dictionaries**: all collect data of all types together, but in different ways: 

![tuples, lists, dictionaries](https://miro.medium.com/max/795/1*DxD-6bZrRg7qSa7hg2bbtA.png)

*   **Tuples** are collections of ordered data that are immutable (cannot be changed)
*   **Lists** are ordered collections of data that are mutable
*   **Dictionaries** are mutable collections of data stored in key-value pairs, where each value is looked up by its key rather than by its position
*   **Sets** aren't particularly relevant for us, so we'll skip over them for now

These definitions can be difficult to visualise, so we will explain with some examples. 




### Tuples

Tuples are created using round brackets e.g.:   
```my_tuple = (100, 150, 200)```

Individual elements of a tuple can be accessed using square brackets e.g.:   
```print(my_tuple[0])```


Remember: python starts counting at zero. This means the first element of the tuple is element 0, the second is element 1, and so on. 


```print(my_tuple[0])``` returns ```100```   
```print(my_tuple[1])``` returns ```150```   
```print(my_tuple[2])``` returns ```200```


Tuples are immutable, which means they can't be changed. New elements can't be added, and existing elements can't be deleted.   
Elements of the tuple can be accessed, but they cannot be individually changed.

In [None]:
# First let's set the tuple
# Tuples can contain a mix of data types, or contain only data of a single type
my_tuple = (0.5, 'Dog', 17)

# Print it so we can see it
print(my_tuple)

# Print the type to check it's really a tuple
print(type(my_tuple))

# Print the tuple's first element
print(my_tuple[0])

In [None]:
# Let's try changing the first element of the tuple
# Uncomment the line below and run this cell. 
# This code should have an error, because the tuple is immutable

#my_tuple[0] = 5

The main feature of tuples that makes them unique is that they are immutable. They are the best for storing data that should be protected from being changed e.g. coordinates on a graph.
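
A short sketch of the coordinates idea, with a made-up point. It also shows "unpacking", a common shortcut for reading every element of a tuple at once:

```python
# A point on a graph stored as an (x, y) tuple
point = (3, 4)

# Elements are read by position, just like before
print(point[0])

# A tuple can also be "unpacked" into separate variables in one line
x, y = point
print(x)
print(y)

# To "change" a tuple we have to build a brand new one
moved_point = (point[0] + 1, point[1])
print(moved_point)
```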

### Lists

Lists are created using square brackets e.g.:   
```my_list = [100, 150, 200]```

Lists are accessed with square brackets e.g.:   
```print(my_list[0])```

Reminder: python starts counting at zero.    
```print(my_list[0])``` returns ```100```   
```print(my_list[1])``` returns ```150```   
```print(my_list[2])``` returns ```200```

Lists are ordered and mutable. Lists can be changed after they are created, and new elements added with append always go on the end of the list. 

We can add elements to the list with append:   
```my_list.append('New element')```

We can remove elements with pop:   
```my_list.pop(0)```
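
Lists have a few other built-in tools besides append and pop. A brief sketch with a made-up list, using `len()` and the list method `insert()`:

```python
claims = [100, 150, 200]

# len() counts how many elements the list holds
print(len(claims))

# insert() adds an element at a chosen position instead of at the end
claims.insert(0, 50)
print(claims)

# pop() removes an element and also hands it back to us
removed = claims.pop(0)
print(removed)
print(claims)
```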

In [None]:
# First let's set the list
my_list = [99, 'cat', True]

# Print it so we can see it
print(my_list)

# Print the type to check it's really a list
print(type(my_list))

# Print the list's first element
print(my_list[0])

In [None]:
# First let's print the original list so we have a reference
print(my_list)

# Now let's add an element to the list and check that it's updated. 
my_list.append('New element')
print(my_list)

# And now use pop to remove the same element
# Instead of counting the list to find the index of the last number
# we can just use -1 as a shortcut
my_list.pop(-1)
print(my_list)

# And finally let's update an element of the list
my_list[0] = 'dog'
print(my_list)

### Dictionaries

Dictionaries are created using curly brackets and key-value pairs e.g.:   
```my_dict = {'name': 'john doe', 'age': 17}```   

Dictionaries are accessed using keys e.g.:   
```print(my_dict['name'])```



```print(my_dict['name'])``` returns ```'john doe'```   
```print(my_dict['age'])``` returns ```17```   


We can add new elements to the dictionary with update e.g.:   
```my_dict.update({'driver_score': 3.5})```

And we can remove elements with pop e.g.:  
```my_dict.pop('age')```
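
A minimal sketch of two other common dictionary operations, reusing the example dictionary from above: adding a key by assigning to it directly, and checking whether a key exists with `in`.

```python
my_dict = {'name': 'john doe', 'age': 17}

# Assigning to a new key adds it, much like update
my_dict['driver_score'] = 3.5
print(my_dict)

# "in" checks whether a key exists before we try to use it
print('age' in my_dict)
print('postcode' in my_dict)

# pop() removes a key and returns the value it held
removed_age = my_dict.pop('age')
print(removed_age)
print(my_dict)
```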

In [None]:
# First let's set the dictionary
my_dict = {
          'name': 'Harry Potter', 
          'age': 42
}

# Print it so we can see it
print(my_dict)

# Print the type to check it's really a dictionary
print(type(my_dict))

# Print the dict's name element
print(my_dict['name'])

In [None]:
# First let's print the original dict so we have a reference
print(my_dict)

# Now let's add an element to the dictionary and check that it's updated. 
my_dict.update({'driver score': 3.5})
print(my_dict)

# And now use pop to remove the same element
my_dict.pop('driver score')
print(my_dict)

# And finally let's update an element of the dictionary
my_dict['name'] = 'Ron Weasley'
print(my_dict)

### Nested storage

An important and frequently used feature of python is its ability to nest tuples, lists, and dictionaries. The most common examples are dictionaries that contain lists as their values, or lists of lists of information. 
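
A minimal sketch of both patterns with made-up data. Each extra pair of square brackets steps one level deeper into the nested structure:

```python
# A list of lists: each inner list holds one customer's details
customers = [
    ['Harry', 42],
    ['Ron', 41]
]

# The first index picks the inner list, the second picks an element inside it
first_customer = customers[0]
print(first_customer)
print(customers[0][0])
print(customers[1][1])

# A dictionary that holds a list as one of its values
household = {'surname': 'Weasley', 'drivers': ['Arthur', 'Molly']}
print(household['drivers'][1])
```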

Let's look at an example:

In [None]:
# This dictionary contains regular variables as values, but also tuples, lists
# and dictionaries

restaurant_dict = {
    'Name': 'Party Pizza',
    'Phone numbers': {'manager': '618-555-6178', 'restaurant': '618-555-WICH'},
    'Menu': 
      {'Classic cheese': 6.00,
       'Hawaiian heaven': 6.50,
       'Party pepperoni': 8.00,
       'Shredder special': 9.50,
       'Splinter supreme': 10.00},
    'Staff': ['Leo', 'Raph', 'Donny', 'Mike', 'April', 'Casey'],
    'Opening hours': (10.30, 00.00)
}

In [None]:
# We can store more complex items in variables to access them, or just use
# multiple indices at the same time

menu_dict = restaurant_dict['Menu']

print(menu_dict['Splinter supreme'])
print('--')
print(restaurant_dict['Menu']['Splinter supreme'])

# If statements

If statements are checks we can make in our code to run certain code when a particular condition is met. If statements are similar to WHERE or CASE WHEN statements in sql. 

![if statement](http://www.trytoprogram.com/wp-content/uploads/if.jpeg)

Let's take a look at an example:

In [None]:
# First let's set a variable to check with an if statement

number = 5 #@param [5, 10, 15, 20] {type: "raw"}

# Now let's check if the number we've chosen is less than 10

if number < 10:
  print("Number is less than 10")

This works pretty much as expected: we only see our printed message if our number is less than 10.

Notice that if we set our variable to be 10 we don't see our message (because 10 is not less than 10). This is because of the symbol we used in our if statement. We can use any mathematical comparison we like, and it works as normal i.e.:
 * less than ```<``` / greater than ```>```
 * less than or equal ```<=``` / greater than or equal ```>=```
 * equal to ```==``` (We can't just use ```=```, because python uses this to assign variables)
 * not equal to ```!=```   

There are plenty more comparisons we can make, but going through all of them would take too much time!
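
Each comparison is really just a question with a True or False answer. A small sketch, with the variable `is_under_limit` made up for the example:

```python
number = 10

# Each comparison gives back a bool: either True or False
print(number < 10)
print(number <= 10)
print(number == 10)
print(number != 10)

# The result can be stored in a variable like any other value
is_under_limit = number < 15
print(is_under_limit)
print(type(is_under_limit))
```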

Indentation is important for if statements. For example, in the following code:
```
if condition:
  python code 1
python code 2
```
the line ```python code 1``` will only be run if the condition in the if statement is met, but the line ```python code 2``` will always be run even if the condition is not met. 
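
The same rule in runnable form, with made-up values. Only the indented line depends on the condition:

```python
number = 20
messages = []

if number < 10:
  # Indented: only runs when the condition is met
  messages.append('Number is less than 10')
# Not indented: always runs
messages.append('Check finished')

print(messages)
```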

### elif and else

What if we want to make a comparison with multiple conditions? Only using if can introduce problems if we're not careful:

In [None]:
# First let's set our variable to compare

number = 5 #@param [5, 10, 15, 20] {type: "raw"}

# This time we'll use a variable for our message too, just to see why only using
# if statements can cause problems. For now let's leave it blank

message = ""

# Let's copy our last if statement, but add another condition to check if our
# number is less than 15

if number < 10:
  message = ("Number is less than 10")
if number < 15:
  message = ("Number is less than 15")

print(message)

This code behaves as intended for 10, 15, and 20 (15 and 20 match neither condition, so the message stays blank), but runs into problems with the number 5. Because both if statements are checked even after a message has been set, the second overwrites the first and our code says "Number is less than 15" when it should be saying "Number is less than 10". 

This can be fixed with elif and else statements. These join our two separate if statements together:

In [None]:
# First let's set our variable to compare

number = 5 #@param [5, 10, 15, 20] {type: "raw"}

# Just like before, we'll use a variable for our message. For now let's
# leave it blank

message = ""

# This time we'll join our checks together with elif and else, so that only
# one message can ever be set

if number < 10:
  message = "Number is less than 10"
elif number < 15:
  message = "Number is less than 15"
else:
  message = "Number is 15 or greater"

print(message)

Now if we run our code, it works correctly for all of our numbers. This is because our elif statement is only looked at if our first if statement is false. Finally, our else statement is only looked at if all other statements are false. 

# Loops

Loops allow us to repeat snippets or sections of code without having to type it out manually. The most useful type of loop is the "for loop", where the loop runs its code for every item in a given range. For example: 

* Every item in a list
* Every number between 1 and 10
* Every letter in a string

For loops take the following form: 

```
for x in y:
  code to run
```

Indentation is important for loops. After the colon, only indented lines of code will be repeated by the loop. For example, in the following code:
```
for x in y:
  python code 1
python code 2
```
only the line ```python code 1``` will be looped, and the line ```python code 2``` will be run only once as for normal code. 
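
A runnable sketch of the same layout, with a made-up list of numbers. The indented lines run three times and the final line runs once:

```python
total = 0

for number in [1, 2, 3]:
  # Indented: runs once for every item in the list
  total = total + number
  print('Running total:', total)

# Not indented: runs once, after the loop has finished
print('Final total:', total)
```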

In [None]:
# Let's create a list and then loop over it with a for loop

my_list = ['Harry', 'Ron', 'Hermione', 'Neville']

# Now we can loop over the list to print every name

for name in my_list:
  print(name)

# "name" is a variable we set to refer to whatever we're looping over
# name makes sense in this list, but we can name it just as we would any
# other variable

for really_big_cat in my_list:
  print(really_big_cat)

In [None]:
# Looping over a series of numbers is just as easy

for number in range(0, 5):
  print(number)

print('--')

# The range function is important for this to work. It starts at the first
# number and stops just before the second. If we want to start at 0, we don't
# really need to specify it in the range function but it makes it easier to read

for number in range(5):
  print(number)

print('--')

# We can also loop between numbers without starting at 0

for number in range(15, 20):
  print(number)

In [None]:
# Looping over a string is just as easy

for character in 'string':
  print(character)

In [None]:
# Loops can also be nested, which can be a very useful trick (although can
# easily become resource intensive if you're not careful!)

# First, let's define a list of lists. This is not uncommon to see, but we will
# learn in future that there are better ways to store data.

my_list = [
    ['Harry', 'Ron', 'Hermione'],
    ['Wayne', 'Garth', 'Cassandra'],
    ['Luke', 'Leia', 'Han'],
    ['Dom', 'Brian', 'Letty']
]

# Now we can loop over both lists. Indentation is key to making both of these
# loops work as we want. The innermost loop must be indented enough to be 
# completely contained within the first loop. 

for film in my_list:
  for character in film:
    print(character)

# Functions

Functions allow us to write sections of code that can easily be called back to with a single short command. While we have already seen some pre-existing functions in python like ```type``` and ```print```, we can also write our own functions. 

Functions normally have the following form:
```
def function_name(arguments):
  '''Docstring'''

  code to execute

  return output_variable #Optional

```
Defining a function has several important components:


*   The ```def``` command is what allows us to *define* our new function. 
*   The arguments within the brackets detail the variables that the function requires to run.
*   A docstring explains the purpose of a function and how it works. Normally docstrings use specific formats, but we'll keep it simple for this tutorial. 
*   Our code makes up the main body of the function, and can be in whatever form we decide
*   A return statement returns a final variable from the function, and allows our function output to be saved as a variable for us to use in our main code. This step is optional and not needed for every custom function. 

Let's take a closer look with some examples

In [None]:
# First, let's define a function, while keeping it as simple as possible. 

def print_hello():
  '''Prints the word hello'''

  print("Hello!")

In [None]:
# When the cell above is run, we can see that there is no output.
# This is because our function has been defined, but the code isn't
# run until we call our function. Let's do that now:

print_hello()

In [None]:
# Our above function didn't need a return statement, it just executes our code
# and ends immediately. Let's write a slightly more complicated function that
# adds two numbers together. 

def add(number1, number2):
  '''Returns sum of two input numbers'''

  return number1 + number2

In [None]:
# Now let's call our function. We can choose any two numbers we like, and our
# function should return their sum. We can either print that value or use it
# like we would any other variable

# We avoid the name "sum", because python already has a built-in sum function
total = add(3, 6)
print(total)

print('--')

print(add(123456789, 987654322))

print('--')

x = 9.75
y = 3.14159265
print(add(x, y))

## Optional arguments and default values

Functions have a lot of additional functionality which can be very useful for specific jobs. Optional arguments are arguments that *can* be included, but aren't required for the function to run. 

These are created by using default values, which are the values the function uses if the argument is not directly provided by the user. 
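
A minimal sketch of how a default value makes an argument optional. The `greet` function is made up for this example:

```python
def greet(name, greeting='Hello'):
  '''Builds a greeting. The greeting argument is optional.'''

  return greeting + ', ' + name + '!'

# Leaving the optional argument out uses its default value
print(greet('Harry'))

# Providing it replaces the default
print(greet('Harry', 'Welcome back'))

# Arguments can also be given by name
print(greet(name='Ron', greeting='Good morning'))
```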

In [None]:
# Let's create a new function to subtract two numbers and include an optional
# argument by using default values

def subtract(number1=0, number2=0, number3=0):
  '''Subtracts up to three numbers from each other. Right-hand numbers will
  be subtracted from the leftmost number provided.'''

  return number1-number2-number3

In [None]:
# Our function can have up to three numbers used in the subtraction. If any
# number argument is not provided, a value of 0 will be used in its place.

print(subtract(100, 25, 25))
print(subtract(100, 25))
print(subtract(100))
print(subtract())

# Error handling

So far, our functions have always worked because we have been very careful about how we use them. But what would happen if, instead of providing numbers into our add() and subtract() functions, we gave them strings or lists? 

Python's solution to unexpected behaviour is error handling. This topic can become very advanced, but the general idea is that we can raise our own errors if our functions aren't used properly. 

Some common examples of error handling are try/except statements and raise statements.


---



Try/except statements take a similar form to if/else statements:
```
try:
  code to try and run
except:
  code to run if the code fails
```
We can even add an else block, just like with if/elif/else. It only runs if the code in the try block succeeds. 
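
A small runnable sketch of try/except/else. The `to_number` function is invented for this example, and it names the specific error it expects (`ValueError`) rather than catching everything, which is generally the safer habit:

```python
def to_number(text):
  '''Tries to convert text to an int. Returns None if that fails.'''

  try:
    number = int(text)
  except ValueError:
    # Runs only if int() could not convert the text
    print('Could not convert', text)
    return None
  else:
    # Runs only if the try block succeeded
    return number

print(to_number('42'))
print(to_number('forty two'))
```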


---

We can raise an error wherever we choose, but normally they follow an if statement to check if a variable/condition is behaving as expected:
```
if x < 0:
  raise Exception("Number cannot be lower than zero")
```
If the condition is met, the code stops at the raise statement and our custom message is shown to the user as part of the error. 
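
A short sketch of raising an error inside a function and then catching it. The `check_age` function and its message are made up for this example:

```python
def check_age(age):
  '''Returns the age unchanged, or raises an error if it is negative.'''

  if age < 0:
    raise Exception('Age cannot be lower than zero')
  return age

print(check_age(42))

# Catching the error lets us see the message without stopping the notebook
try:
  check_age(-5)
except Exception as error:
  error_message = str(error)
  print('Error raised:', error_message)
```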


---

We won't go through error handling in any more detail here to save time, but it can be extremely helpful when writing code that other teams will interact with. 

# Putting it all together!

Functions can be extremely powerful, even calling other functions. Professional python code typically only consists of classes and functions (or methods, which are functions that specifically belong to classes). 

Let's write another function that uses everything we've talked about in this tutorial:

*   If statements
*   Lists
*   Dictionaries
*   Loops
*   Functions
*   Error handling

These techniques on their own are very simple, but can be used together to write some incredibly powerful and streamlined code!




In [None]:
# First, let's make some fake data for us to work with. This is a list of ten
# fake customers with only three pieces of information: 
# First name, date of birth, and driving risk

customer_info = [['Harry', '31/07/1980', 'Medium'], 
                  ['Ron', '01/03/1980', 'Medium'],
                  ['Hermione', '19/09/1979', 'Low'],
                  ['Neville', '30/07/1980', 'Low'],
                  ['Wayne', '25/05/1970', 'Medium'],
                  ['Garth', '02/06/1971', 'Low'],
                  ['Cassandra', '02/01/1972', 'Low'],
                  ['Dom', '29/08/1976', 'High'],
                  ['Brian', '14/07/1978', 'High'],
                  ['Letty', '07/09/1983', 'High']
]

Let's imagine a fictional scenario: this data has just come in and we need to input it in to our systems. Before we do that, we need to do two things:



*   Label our data. Currently all we have is a list of lists, and we need to move this into a list of dictionaries which have appropriate keys. 
*   Add a new column to our data called "Age" which contains the age of the customer.

We'll attempt this in a function based way, and then execute all of our code as the very last step. 



---

The first thing to consider is how our code should be arranged. In order to keep our code as efficient as possible, we need to make sure we don't perform any unnecessary operations. 

For example, we need to keep the number of times we loop through our data list as low as possible (as looping over a long list of data takes a long amount of time).

Let's design our solution to contain three functions:


*   A function to convert a single list into a dictionary
*   A function to calculate age from a given date of birth and add it to a dictionary
*   A function to loop over our input list once, and perform our other two functions wherever they're needed. (Functions like this that actually call all of our simpler functions are commonly called "wrapper" functions.)
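
Converting a list into a dictionary, as the first function needs to do, can be sketched with python's built-in `zip()` and `dict()`. The values below are a single made-up customer, shown outside of any function to keep things simple:

```python
keys = ['Name', 'Date of Birth', 'Driving Risk']
values = ['Harry', '31/07/1980', 'Medium']

# zip() pairs up the first key with the first value, and so on
pairs = list(zip(keys, values))
print(pairs)

# dict() turns those (key, value) pairs into a dictionary
customer = dict(zip(keys, values))
print(customer)
print(customer['Name'])
```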



In [None]:
# First things first: let's write a function that will let us change a list
# into a dictionary.

# Our function will need two input arguments: an individual customer's info
# and a list of keys for our final dictionary.
def to_dictionary(input_list, key_list):
  '''Converts a pair of lists into a dictionary. Keys provided in key_list.'''

  # Let's start with some error handling. Our code won't work if our inputs
  # aren't a pair of lists, or if our lists aren't the same length. We also need
  # to make sure our lists aren't empty of values. We check the types first,
  # because len() only works on collections like lists.

  if type(input_list) != list:
    raise Exception('Input customer list must be of type "list"')
  if type(key_list) != list:
    raise Exception('List of keys must be of type "list"')
  if len(input_list) < 1 or len(key_list) < 1:
    raise Exception('Input lists cannot be empty')
  if len(input_list) != len(key_list):
    raise Exception('Input list and list of keys must be the same length')

  
  # We can use the "zip()" function to pair up each key with its value
  key_value_pairs = zip(key_list, input_list)

  # And now all we need to do is convert our pairs into a dictionary
  cust_dict = dict(key_value_pairs)

  # The very last line in our function returns our output customer dictionary
  return cust_dict

In [None]:
# Now let's write a function that will calculate a customers age and add it to
# a customer dictionary.

# Because we'll be dealing with dates, we should import python's datetime
# package. Normally we would put this line at the top of our notebook, but for
# the sake of this tutorial we'll just do it here. The package contains a class
# that is also called datetime, so our import statement looks a little strange

from datetime import datetime

In [None]:
# With datetime imported, we can start writing our actual function. 

# Our function will only need a single argument, which is the dictionary to 
# use as input. 
def add_age(cust_dict):
  '''Calculates a customer's age from their date of birth and adds it to
  a dictionary.'''

  # Let's again start with some error handling. Our code will fail if our
  # input is not a dictionary, if it doesn't contain a DOB key, or if the DOB
  # key doesn't give us a valid date. 

  if type(cust_dict) != dict:
    raise Exception('Input customer info must be of type "dict"')
  if 'Date of Birth' not in cust_dict:
    raise Exception('Input dict must contain field "Date of Birth"')

  # The strptime function lets us use our string date with the datetime
  # package. If the string isn't a valid date, we raise an error straight
  # away rather than carrying on with bad data.
  try:
    born = datetime.strptime(cust_dict['Date of Birth'], '%d/%m/%Y')
  except (TypeError, ValueError):
    raise Exception('Date of Birth must be a valid date in DD/MM/YYYY format')

  # Now let's write the code to calculate our customer's age. We can use
  # datetime's today function and take the difference from the customer's
  # date of birth.

  today = datetime.today()

  # We can access the years, months, and days of each of our dates to work 
  # out our customer's age. We can use if statements to check if the customer's
  # birthday has happened yet this year, which will affect their final age.
  
  if (today.month, today.day) < (born.month, born.day):
    cust_age = today.year - born.year - 1
  else:
    cust_age = today.year - born.year

  # Now that we have our customer's age, the final step is to add it to the
  # original input dictionary.
  cust_dict.update({'Age': cust_age})

  # Our function doesn't need a return statement because we're updating our
  # original input dictionary, so we can end our function there.

In [None]:
# The very last step is to write our wrapper function that will tie everything
# together. We want to be able to use this function for any list that comes in
# no matter what the headings should be, so our final function needs both
# our unlabelled data and a list of headings. 

def label_data(customer_list, key_list):
  '''Create a list of labelled data dictionaries including customer age from an
  unlabelled list of customer data.'''

  # Just like always, let's start with some error handling. Most problems are
  # taken care of by the specifics of our functions, so all we need to prepare
  # for are problems specific to our wrapper function. There's only one new
  # way for this to fail: if customer_list is either not a list, or does not
  # contain any customers. 

  if type(customer_list) != list:
    raise Exception('Input customer list must be of type "list"')
  if len(customer_list) < 1:
    raise Exception('Input customer list cannot be empty')
  
  # The next step is to define an empty list to store our output dicts
  output_list = []

  # Now we loop over our list of customer info and call our other functions
  # where they are needed
  for customer in customer_list:

    # First, turn each customer's info into a labelled dictionary
    cust_dict = to_dictionary(customer, key_list)

    # Then add an age column to each dictionary
    add_age(cust_dict)

    # Finally add our dictionary to an output list for our labelled data
    output_list.append(cust_dict)

  # The very last step is to output our list of labelled dictionaries
  return output_list
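
One thing to keep in mind when writing checks like these: ```len``` raises its own ```TypeError``` if it is given something with no length (a number, for example), so a type check is safest when it runs before a length check. Below is a minimal sketch of that ordering. The helper name ```check_customer_list``` is made up for illustration and is not one of the functions defined in this notebook.

```python
def check_customer_list(customer_list):
  '''Hypothetical helper: raise an Exception unless the input is a non-empty list.'''
  # Type check first: len() raises its own TypeError on objects with no length
  if not isinstance(customer_list, list):
    raise Exception('Input customer list must be of type "list"')
  if len(customer_list) < 1:
    raise Exception('Input customer list cannot be empty')

# Try a few bad inputs and print the message each one produces
for bad_input in [42, None, []]:
  try:
    check_customer_list(bad_input)
  except Exception as error:
    print(repr(bad_input), '->', error)
```

With the checks in this order, every bad input produces one of our own error messages rather than a less friendly built-in one.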

## The moment of truth

All of our functions are coded and ready. Here's a reminder of how everything will work before the big moment:



1.   label_data will loop over our input unlabelled customer info list
2.   to_dictionary will convert each customer's individual list into a dictionary using the list of keys we provide
3.   add_age updates each dictionary by adding an age field
4.   Finally, label_data returns our output list of labelled data dictionaries to use as we please.

Let's run our final function now and see the results!



In [None]:
# First we have to create our list of keys for label_data to use
key_list = ['Name', 'Date of Birth', 'Driving Risk']

# Now let's call our function, and save the output into our cust_list variable
cust_list = label_data(customer_info, key_list)

# The final step: loop over our new list and check everything worked!
for customer in cust_list:
  print(customer)

In [None]:
# Let's print our original list as a comparison
for customer in customer_info:
  print(customer)

As you can see from our output, each of our customers has changed from a list of unlabelled data into a dictionary with appropriate headings. On top of that, we've even added a new age column (and we only had to run a single line of code!)
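
If you'd like to see the change for a single record, here is a rough sketch using made-up data. It is only an illustration: ```dict(zip(...))``` is used as a stand-in for our to_dictionary function, the customer and the variable names are invented, and a fixed reference date replaces ```datetime.today()``` so that the age is the same every time the cell is run.

```python
from datetime import datetime

# Made-up example record and headings (not the notebook's real customer_info)
example_customer = ['Jane Doe', '15/06/1990', 'Low']
example_keys = ['Name', 'Date of Birth', 'Driving Risk']

# Stand-in for to_dictionary: pair each heading with its value
example_dict = dict(zip(example_keys, example_customer))

# Fixed reference date (an assumption) instead of datetime.today()
reference = datetime(2024, 3, 1)
born = datetime.strptime(example_dict['Date of Birth'], '%d/%m/%Y')

# Same birthday check as add_age: take a year off if the birthday is still to come
if (reference.month, reference.day) < (born.month, born.day):
  example_dict['Age'] = reference.year - born.year - 1
else:
  example_dict['Age'] = reference.year - born.year

print(example_customer)   # the unlabelled list
print(example_dict)       # the labelled dictionary, now with an 'Age' key
```

On 1 March 2024 this customer's June birthday hasn't happened yet, so the age comes out as 33 rather than 34.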

# More python!

Hopefully this tutorial has been useful and given you at least a bit of insight into some of python's most basic features. Python's building blocks are very straightforward, but can be combined to create very powerful programs and applications!

This section is made of two parts and contains: 

*   Some python exercises for you to attempt on your own, which should help you test some of the techniques we've gone over here
*   Links to some additional resources to further your python training, and some examples of the things that python code can do!



## Exercises

These exercises are designed to test your skills and build on what you've learned today. There is no single perfect solution, and getting the answer right isn't even the point (although I do have some code of my attempts, if you want to check your work). 

The point of these exercises is just to give you a chance to apply your new talents - don't be afraid to look back at what's been written above, google anything that you're not clear on, or even just message me directly if you want some help!

Hopefully these will be a bit of fun, and a way to kill some time while waiting for SAS and SQL to finish running.

### Exercise 1: Fizzbuzz


---


Fizzbuzz is a classroom maths game. Kids sit in a circle and count from 1 to whatever number they decide. Each player says the next number in the sequence, unless the number is a multiple of 3 (which they replace with "Fizz"), 5 (which they replace with "Buzz"), or both (which they replace with "Fizzbuzz").

![fizzbuzz](https://code.kx.com/q/img/fizzbuzz.png)

In this exercise, try writing some code that will automatically play the game of Fizzbuzz for the numbers 1 to 100. 

1.   Write some code that loops over the numbers from 1 to 100. For each number, print the number out to be read unless:
  *   The number is a multiple of 3, in which case print the word "Fizz" (and nothing else)
  *   The number is a multiple of 5, in which case print the word "Buzz" (and nothing else)
  *   The number is a multiple of 3 *and* 5, in which case print the word "Fizzbuzz" (and nothing else)
2.   Change your code so that you now print:
  *   "Fizz" on every multiple of 2
  *   "Buzz" on every multiple of 7
  *   "Fizzbuzz" on every multiple of both
3.   Wrap your code in a custom function (if you haven't already), which should have the following arguments:
  *   The range of numbers to loop over
  *   The number for "Fizz" to be printed on each of its multiples
  *   The number for "Buzz" to be printed on each of its multiples

At the end of each step, make sure you take a look at your output and check that your function is working correctly!

Tips:
*   For this exercise, the modulo maths operator (%) will be very useful! It wasn't covered in this tutorial, but is an easy one to google.
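
To get you started, here is a small sketch of what the modulo operator does. It isn't a solution to the exercise, and the helper name ```is_multiple``` is just made up for illustration.

```python
# The % operator gives the remainder left over after a division
print(10 % 3)   # 10 = 3 * 3 + 1, so the remainder is 1
print(12 % 3)   # 12 = 3 * 4 + 0, so the remainder is 0

def is_multiple(number, divisor):
  '''Hypothetical helper: True if number is an exact multiple of divisor.'''
  # A remainder of zero means the divisor goes in exactly
  return number % divisor == 0

print(is_multiple(12, 3))
print(is_multiple(10, 3))
```

A check like this can go straight into an if statement inside your loop.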




### Exercise 2: Rock, Paper, Scissors


---



Playing a game of rock, paper, scissors is very easy for humans but requires some careful thought to program. 

Unlike the previous exercise, you'll need to import some extra packages so that you have access to some more specific functions (like selecting random choices). To program this code, you'll need to be able to both generate a random choice from a list and read in a guess from your user. 

![Rock, paper, scissors](https://miro.medium.com/max/800/1*8du96SQUQ0NlWmWvVu20Zw.png)

1.   Import the "random" package and take a look at how it works. 
2.   Write some code that selects a random choice from a list of ```['rock', 'paper', 'scissors']```, and returns the choice as a variable.
3.   Write some code that reads your guess and stores it in a variable.
4.   Write a function that takes both guesses as arguments, and prints both guesses along with whether the computer or the player is the winner.
5.   Write some error handling for your function to handle unexpected behaviour. 
     * What happens if the player spells their guess wrong? 
     * What happens if the game is a tie? 
     * What happens if the player enters a number?   

See if you can solve some of the above problems in your function so that they are no longer errors (e.g. have your game automatically restart in the case of a tie). 

Tips: 
*   To select the computer's guess, see if you can generate a whole number from 0 to 2, and remember that you can access elements of the list using indices (e.g. ```my_list[0]```).
*   There are a few ways that input can be read into a python program: text can either be read from a file, or read directly in from the keyboard using the ```input``` function. You can even try using some of google colab's advanced features if you're feeling brave!
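
As a starting point, the sketch below shows the two building blocks from the tips: picking a list element with a random index, and reading a guess. It is only one possible approach, not a full solution. The variable names are made up, and a fixed string stands in for the ```input``` call so that the cell runs without waiting for the keyboard.

```python
import random

options = ['rock', 'paper', 'scissors']

# random.randint includes both end points, so this gives 0, 1 or 2
index = random.randint(0, 2)
computer_guess = options[index]
print('Computer chose:', computer_guess)

# In the notebook the player's guess would come from the keyboard, e.g.
#   player_guess = input('rock, paper or scissors? ')
# A fixed string stands in for it here so the cell runs without typing anything
player_guess = 'paper'
print('Player chose:', player_guess)
```

Remember that whatever ```input``` reads in is always a string, even if the player types a number.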

### Exercise 3: Plot a graph


---



Using python to create graphs is very helpful for analysis, as you can visualise features of datasets without having to leave the program where you're doing your investigating. The most common way to do that is with the pyplot package from matplotlib. 

In this exercise, we'll draw our first plots using matplotlib.

![matplotlib](https://camo.githubusercontent.com/109927a15915074d15313889468aa9aa688de3b9e38cc4359a01f665d351114e/68747470733a2f2f6d6174706c6f746c69622e6f72672f5f7374617469632f6c6f676f322e737667)

We'll break this exercise into two halves: making your first plot, and improving your plotting code. 

For the first part:

1.   Import the pyplot package from matplotlib (this was used as an example in the "Importing packages" section of this notebook)
2.   Create two lists called x_pts and y_pts, which should contain the x and y coordinates of points to draw on a scatter plot. 
3.   Use pyplot to plot your points on a scatter plot.


Tips: 
*   To complete this section, you'll need to investigate matplotlib and pyplot. These are both very common tools, so you'll find lots of questions online from people attempting the same or similar tasks.

For the second part:

1.   Give your plot a title and axis labels (and feel free to play around with other design options like colours).
2.   Create another two lists, and try plotting them on the same canvas. Don't use the exact same coordinates again, or you might struggle to see your new list.
3.   Design a wrapper function that will take two lists of lists as input (one containing x coordinates, one containing y coordinates), and draw all of your lines at once on a single canvas.


Tips:
*   Just like before, there will be plenty of examples on google that you can take advantage of and use as reference. You might also find matplotlib and pyplot's official documentation useful, which should have info on the layout and colour options that each function can take as arguments. 
*   Be careful with your use of plt.show() in your functions. plt.show() should only be run when you want to draw your final figure, so be careful not to run it too soon or some of your points might not appear. 
*   This exercise can definitely be tricky - feel free to peek at the solutions for inspiration if you find yourself stuck! 

## Further courses and applications

If you're still hungry for some more python training, hopefully the following links can be of use to you: 



### Kaggle

Kaggle is a company that runs machine learning competitions, and has many free courses on how to use python for data science and analysis. Kaggle's training includes time-series forecasting, neural networks, and even more advanced techniques. All of their courses are available here: 

https://www.kaggle.com/learn

(We'll hopefully be looking at pandas and some other bits and pieces available here in a future session).

### Your first neural network

Building a neural network (a simple version of the tools companies use whenever they talk about "Artificial Intelligence") is surprisingly easy in python! 
The link below will run you through a project on creating your first neural network to predict house prices (and might give you an idea of how similar techniques could be useful for insurance): 

https://www.freecodecamp.org/news/how-to-build-your-first-neural-network-to-predict-house-prices-with-keras-f8db83049159/

### Turtle racing!

This youtube video is an hour-long tutorial on creating a simple game that races a number of turtles. While it's a little bit sillier than the other examples, it's still really good practice for coding in a function-oriented way, and uses a lot of the techniques we've gone over here (along with some new ones!)

https://www.youtube.com/watch?v=gQP0geNsO4A

### Everything else

Hopefully at least some of these will have sparked off some ideas of simple python tools you might be able to write to make your life easier. Don't be afraid to take a crack at any idea you have for even some complicated python code - google colab makes it very easy to quickly test out code without having to download anything at all!

If you are interested in using python with some customer data, send me a message and I can talk you through the options for working with python for real.