# Python Fundamentals

---

<a id="learning-objectives"></a>
## Learning Objectives
*After completing this notebook, you will be able to:*

- Use Jupyter Notebook
- Understand why Python is well suited to data science 
- Perform simple operations with integers, strings, floats, lists and dictionaries in Python
- Import libraries
- Write functions
- Use loops and if statements
- Use classes and access their methods and attributes

## Contents:
* [What's Python and why are we learning it?](#intro)
* [Learning to use Jupyter](#jupyter)
* [Your first Python code](#helloworld)
* [Types in Python](#3)
* [Working with numbers](#numbers)
* [Finding help](#help)
* [Variables](#variables)
* [Strings](#strings)
* [Lists](#lists)
* [Dictionaries](#dictionaries)
* [Libraries](#libraries)
* [Functions](#functions)
* [If Statements, For and While Loops](#loops)
    * [Boolean logic](#bool)
    * [If/Else/Elif Statements](#if)
    * [For loops](#for)
    * [While loops](#while)
* [Classes](#classes)

<a id="intro"></a>
# <font color='blue'> What's Python and why are we learning it?

### Why does Python have such a ridiculous name?

Python was created by Guido van Rossum, who started work on it in the late 1980s and released the first version in 1991. He was a big fan of Monty Python's Flying Circus, a BBC comedy sketch series televised during the late 60s and early 70s, and named the language after it. The series was conceived, written and performed by the comedians collectively known as Monty Python (or simply 'the Pythons').

<img src="img/monty_python.jpg" width=400 />


### Why Python?

**Python is open source**

This means it's free to use and distribute, and anyone can modify/customise the source code. This might seem surprising to anyone with experience of working in the private sector, but many huge advances in science and technology are only possible thanks to open source culture and a tradition of people freely sharing, distributing and modifying their own and each other's work. 

You can learn more about open source here: https://opensource.org/

**Python is an interpreted language**

If you're new to programming, you probably have no idea what this means. Let's take a quick(ish) detour through the history of computing. 

**What is programming?**

What do we actually mean when we talk about coding or programming? This sounds like an obvious question, but it's worth discussing.

When we write code (in any language from Python to C++) we're writing instructions that we want our computer to execute; those instructions could be as simple as 'add these two numbers and show me the result' or as complex as 'navigate this driverless car through rush hour traffic.' 

The CPU (central processing unit, or 'brain') in any computer can only understand and execute commands written in machine code, which looks something like this:

<img src="img/machine_code.png" width=400 />

For humans, machine code is:

* Very error prone 
* A nightmare to debug: imagine trying to spot a rogue 1 (Star Wars reference entirely unintentional but I'll take it) in a string of thousands of numbers and letters so you can fix your mistake
* Inefficient: it takes many lines of code to write even the simplest instructions

Luckily, nowadays we have at our disposal a huge range of programming languages from JavaScript to Python to C++. All of these are much more similar to the English language, have their own vocabulary (commands that can be given to the computer) and syntax (rules for combining commands to write complex instructions) and feel far more intuitive than machine code.

But if computers only understand machine code, how is this possible?

<img src="img/grace_hopper.jpg"/>

In the early 1950s Grace Hopper, a computer scientist who went on to become a US Navy Rear Admiral, was well aware of this problem. She built A-0, widely regarded as the first compiler: a middle-man piece of software that translated more human-readable instructions into computer-readable machine code. 

All the programming languages you might have heard of, including FORTRAN, C++, Python, JavaScript, Swift, R and Matlab, build on the idea behind Hopper's innovation.

These modern languages are compiled or interpreted into a format that can be directly executed by a CPU. The main difference between a compiled language (like C++) and an interpreted language (like Python) lies in the result of the compilation or interpretation process. 

Code written in an interpreted language is executed by an interpreter, which reads the source code and translates it on the fly into machine code, and gives us the desired result or output of the code- this could be a number, some text, or movement in a robotic arm. This means we can type statements into the interpreter and they are executed immediately, giving us near-instant results. The source code has to be re-interpreted each time the code is executed. 

Lines of code written in compiled languages are converted into machine code by a compiler, to produce an executable file. To get the desired results of the code, this executable file then needs to be run. Most of the software we use on a daily basis is delivered as compiled binaries.

**Python has a library for almost anything**

In programming, a library is a big bundle of code snippets (or functions) that someone else has written and made freely available for other people to download and modify (open source culture again!). A single library will usually be designed to help people write code for specific applications or purposes. For example:

* Pandas is a library for data science
* Numpy is a library for numerical computing
* NLTK is a library for natural language processing
* Tensorflow is a machine learning library released by Google 

Using libraries rather than writing your own code from scratch is always a smart move (as one of my favourite undergraduate tutors used to say, a good engineer is smart and lazy)- you'll save time, and because open source code will have been reviewed and checked by hundreds and thousands of people, code from a library is far less likely to contain bugs than something you write on your laptop at 2am after your 10th cup of coffee of the day. 

**Python is object oriented**

Object-oriented programming, or OOP for short, is a way of structuring programs so that groups of properties and behaviours are bundled into things called 'objects'.

We'll learn more about why this is useful later.

---

<a id="jupyter"></a>
# <font color='blue'> Learning to use Jupyter Notebook

### What's an IDE?

We've downloaded Python. What does that actually mean? What exactly have we just installed on our machines? 

When we download and install Python, we download the Python interpreter- that is, a program that can take lines of code written in Python and translate them down to machine code and show us the results. That's great! But we need somewhere to feed lines of Python code to that interpreter, or in other words we need somewhere to write, test, and run our code and view the results or output. That's what an IDE or integrated development environment is for. 

An IDE is anywhere that you can write and run code, and see the results. There are lots of different IDEs available- some can be used to run code in many different programming languages (like Microsoft Visual Studio Code) and some are built around one language (like PyCharm). 

Let's take a quick tour of a few different, commonly used, Python IDEs.

**The command line**

Typing ```python``` into the command line will launch Python in your Terminal window. You can then type Python commands, run them, and see the results. 

<img src="img/command_line.png" width='500' />

This is a quick and easy way of running code, but not well suited to writing longer blocks of code (we call these 'scripts') or saving the results. 

**Text editors**

You can write a Python script in any text editor (Notepad, Atom, Sublime) and, as long as you save it with a .py extension, it can be run from the command line.

<img src="img/atom.png" width='500'/>

<img src="img/run_atom.png" width='500'/>

**PyCharm**

PyCharm is a powerful IDE with lots of features to make writing, running and testing code easier. 

<img src="img/pycharm.png" width='500'/>

**Jupyter**

Jupyter Notebook is just another example of an IDE, or integrated development environment. It has some neat features that make it particularly well suited to writing and running readable, shareable, nicely formatted code.

**Text Editors**

In addition to IDEs, developers also use text editors to create or edit code and files. Text editors are more commonly used for files that are executed via the command line, as well as for software and website development.  

Some common text editors that you may see or use include:
- [Sublime](https://www.sublimetext.com/)
- [Atom](https://atom.io/)
- [Notepad++](https://notepad-plus-plus.org/) (Windows)
- [Vim](http://www.vim.org/)


### What is Jupyter?

The Jupyter Notebook (previously known as the IPython Notebook) is an IDE that launches in your web browser but runs locally on your machine. 

It allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Jupyter Notebooks have the .ipynb extension, although it is possible to export them as .py files, or even .html or .pdf files for easy sharing and presentation. 

There are a number of advantages of using a Jupyter notebook. It's interactive, easy to share and document results, and because equations/notes/graphs can be embedded in a notebook it's easy for other people to understand what you've done and why you've done it- which is incredibly important when checking, sharing, and reviewing results. Jupyter can also run many different programming languages, from R to JavaScript, through language-specific back ends called kernels, so it's versatile too. 

Many researchers (and large companies including Netflix: https://medium.com/netflix-techblog/notebook-innovation-591ee3221233) have become reliant on Jupyter notebooks to explain their results. Readers can reproduce those results and modify the code to create different output, which helps the learning process.

### Using Jupyter

**Launching** Jupyter can be launched from Anaconda Navigator. It runs in your web browser, but doesn't require an internet connection to work- remember, Python and the IDE are both running locally on your machine; your browser is just a convenient user interface. 

A new notebook can be launched by selecting New, followed by Python 3. 

<img src="img/launch_notebook.png" width='500'/>

**Cells** Cells are the building blocks of Jupyter notebooks, and can contain different types of content from code, to markdown (nicely formatted text, like this cell) to LaTeX (for writing mathematical formulae) and embedded images. During this seminar we'll be using mostly code cells (to write and run our code) and markdown cells. 

New cells are inserted by selecting Insert, followed by Insert Cell Above or Insert Cell Below. The dropdown menu allows us to switch between Markdown cells and Code cells. 

**Running cells** Cells run one after the other. You can run cells by pressing Shift + Enter. The output from a code cell (if there is any) will appear below the cell. 

**Stopping cells running** If you'd like to stop a cell running (maybe you've accidentally written some code that runs in an infinite loop) you can use the stop button (the square icon in the toolbar), or select Kernel, followed by Interrupt. 

**Saving your work** Jupyter will autosave your work once every couple of minutes, but if you've just completed a very critical piece of code, it's best to save manually as well. 

**Shortcuts** Jupyter has plenty of shortcuts to save time:

``Shift + Enter``: Run current cell

``Esc + M``: Convert current cell to markdown

``Esc + Y``: Convert current cell to code 

``Esc + A``: Insert new cell above current cell

``Esc + B``: Insert new cell below current cell

``Esc + H``: Show shortcuts menu

---

<a id="helloworld"></a>
# <font color='blue'> Your first Python code

### Hello World!

It is customary to be introduced to any new programming language by writing a line of code that prints the phrase 'hello world'.

In [None]:
print('hello world')

In [None]:
print("hello again")

Let's think about what's happening here.

We're running (or calling) Python's built in 'print' function. We'll learn more about functions in programming later on, but for now we should just bear in mind that a Python function is very similar to a mathematical function- it takes a user defined input (or argument), does something with it, and gives us an output.

Our input in this case is 'hello world' and the output is the 'hello world' message that gets printed out. This might not seem like the most useful function, but later on we'll use it to print out the results of calculations and more complex functions.

Note that single quotes and double quotes are interchangeable for writing strings: Python is one of a few programming languages where ' ' and " " have the same functionality. A general rule of thumb is to choose one and stick with it.
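Here's a small sketch of what that looks like in practice. The variable names are made up for illustration (we'll meet variables properly later on), and `==` simply asks Python whether two things are equal.

```python
# Both quote styles produce exactly the same string
single = 'hello world'
double = "hello world"
print(single == double)

# Using one style on the outside lets us use the other inside the string
apostrophe = "It's a lovely day"
quotation = 'She said "hi"'
print(apostrophe)
print(quotation)
```

One practical reason to switch styles now and then is the last two lines: a string containing an apostrophe is easiest to write inside double quotes, and vice versa.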

---

<a id="3"></a>

# <font color='blue'> Types in Python

It's also good to briefly note that the 'type' of our input is a string (in programming, a string is a sequence of characters, such as letters, digits, spaces and punctuation).

Different types of data (numbers, decimal numbers, strings of text, lists- more on those later) are treated differently by Python. This makes sense, because there are some operations or calculations that it makes sense to perform on numbers (e.g. division, multiplication) that would be nonsensical to try and perform on a piece of text, and vice versa.

Python has many different **types**, i.e. it can handle lots of different kinds of data, from numbers to text to data tables. 

We can always check the type of something in our code using Python's built-in 'type' function.

In [None]:
type('hello world')

In [None]:
type(35)

In [None]:
type(4.9)

In [None]:
type(2.3)

In [None]:
type([1,2,3,4,5])

---

## <font color='red'> Now you try

Take a look at the Python commands below. For each command, **before you run the cell**, predict:

* What the input to the command is
* What the output will look like 
* What the command is doing

The clue is often in the name of the command; don't overthink this exercise!

**It's always helpful to think of Python commands as having an input and an output; this helps us understand what's going on**


In [None]:
'CAPS LOCK MAKES IT SEEM LIKE IM SHOUTING'.lower()

In [None]:
max([3,5,2,8])

---

### A comment on comments

Comments are lines of code that aren't executed by the Python interpreter. They're explanatory notes written by programmers, to explain what the code does and why it does it. Commenting your code is incredibly important for readable code- if you don't do this, you run the risk of forgetting what your own code actually does, or annoying your teammates by sending them hard to read, uncommented code. 

Comments follow the '#' symbol.

In [None]:
# comments can be on their own line
print('testing...')

In [None]:
print('testing testing') # or they can be on the same line as your code

In [None]:
"""
strictly speaking this is a string, not a comment,
but a triple-quoted string that isn't assigned to
anything is often used for notes spread across
multiple lines
"""
print('hello')

You should comment your code as much as you need. When I first began programming, I wrote around one line of comment for every line of code- maybe this was a bit too much, but it helped me understand what I was doing.

---
<a id="numbers"></a>

# <font color='blue'> Working with numbers

###  Ints and Floats

Integers in Python (numbers without a decimal point) have the type `int`, whereas numbers with a decimal point have the type `float` (short for 'floating point').

In [None]:
x = 3
type(x)

In [None]:
x = 3.0
type(x)

### Rounding

We can round numbers using the 'round' function, either to a whole number...

In [None]:
round(3.4)

In [None]:
round(3.9)

Or to a given number of decimal places (in this case, 6 decimal places)

In [None]:
round(3.2399817, 6) # the second argument is the number of decimal places
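One thing that can catch people out: when a number sits exactly halfway between two whole numbers, Python's `round` goes to the nearest *even* number rather than always rounding up. The sketch below shows this alongside the two ways of calling `round` described above; the variable names are just for illustration.

```python
# With one argument, round() gives back a whole number (an int)
whole = round(3.4)
print(whole, type(whole))

# Exact halves go to the nearest even number, not always upwards
halves = [round(0.5), round(1.5), round(2.5), round(3.5)]
print(halves)  # [0, 2, 2, 4]

# With two arguments, the second one is the number of decimal places
rounded_2dp = round(3.2399817, 2)
print(rounded_2dp)  # 3.24
```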

### Converting between numerical types 

We can convert between integers and floats using the 'int' and 'float' functions.

In [None]:
float(3)

In [None]:
int(5.5)

We can also convert ints and floats to strings.

In [None]:
str(6)
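It's worth seeing exactly what these conversions do to the value. As a rough sketch (the variable names are invented for this example): `int` simply chops off everything after the decimal point rather than rounding, and `str` gives us text that can be joined onto other text.

```python
# int() throws away the decimal part; it does not round
print(int(5.5))    # 5
print(int(5.99))   # 5
print(int(-5.5))   # -5

# Compare with round(), which goes to the nearest whole number
print(round(5.99)) # 6

# Once a number has been converted to a string, it behaves like text
label = 'Score: ' + str(6)
print(label)       # Score: 6
```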

### Performing calculations

In [None]:
3+2 # sum two numbers

In [None]:
2-1

In [None]:
8**3 # raise one number to the power of another

In [None]:
4*3 # multiply two numbers

In [None]:
10%3 # get the remainder when one number is divided by another
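Division isn't shown above, so here's a short sketch of the two division operators, plus a reminder that Python follows the usual order of operations. The variable names are only there to make the results easy to refer to.

```python
# / always gives a float, even when the answer is a whole number
true_division = 10 / 4
exact_division = 10 / 5
print(true_division, exact_division)    # 2.5 2.0

# // divides and then rounds down to a whole number
floor_division = 10 // 4
print(floor_division)                   # 2

# Multiplication happens before addition unless brackets say otherwise
without_brackets = 2 + 3 * 4
with_brackets = (2 + 3) * 4
print(without_brackets, with_brackets)  # 14 20
```

Together, `//` and `%` give the two halves of a division with remainder: `10 // 3` is 3 and `10 % 3` is 1.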

Note that 8 ** 3 is legal/allowed, but 8 * * 3 is not. Whitespace around an operator doesn't matter. Spaces within the operator do!

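Whitespace is the name given to spaces, tabs and new lines. As long as you don't put it inside an operator or a name (e.g. you must always write the 'print' command as 'print', never 'p rint'), extra spaces within a line and blank lines between statements make Python code easier to read and don't affect how your code runs. The one big exception is whitespace at the start of a line (indentation), which Python does treat as meaningful; we'll come back to this when we meet functions and loops.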

In [None]:
8 *   * 3 # this gives an error

In [None]:
print(3+5)





print('hello')

---

## <font color='red'> Now you try

Insert new cells into this notebook to complete each of the following exercises.

1. Evaluate the following expressions: 
    
    (a) 7-15
    
    (b) $3^7+16/3$


2. The following lines of code will result in error messages when you try to run them. Fix the code so it runs without errors:

In [None]:
print('hello world!")

In [None]:
5 + '3'

In [None]:
print "hello?"

---
<a id="help"></a>

# <font color='blue'> Finding help

Do you feel like a Python programmer yet? Probably not, but don't worry- none of us, not even the ones who've been using Python for years, ever really feel like 'proper' programmers.

No one has the time to memorise every single Python function they'll ever need to use (and even if they did, it would be a very inefficient use of their time). Beyond a very small handful of functions, you're not expected to memorise the commands we cover in this seminar, and you should never hesitate to look for help online when you get stuck. 

Programmers sometimes joke that they spend 90% of their time Googling bits of code they've forgotten- that's not an exaggeration. Here are some excellent resources that I consult approximately 100 times on any given working day:

* Stackoverflow: https://stackoverflow.com/
* Python documentation: https://docs.python.org/3/
* Google (obviously)
* The person sitting next to you

Understanding that it's ok and encouraged to look for help is a very important part of your Python journey.

---

<a id="variables"></a>
# <font color='blue'> Variables

### Why do we use variables?

To understand what variables are and why they're useful in programming, let's imagine we've written a piece of code to perform some calculations with Google's share price over the past five days. For this example, let's assume Google's share prices over the past five days (in USD) have been: 

* Today: ``1300``
* 1 day ago: ``1310``
* 2 days ago: ``1290``
* 3 days ago: ``1370``
* 4 days ago: ``1365``

We can use print statements together with simple arithmetic to print out some information about how the share price has changed over the past few days.

In [None]:
print('The percentage change in share price over the past 2 days is ', 100*(1300-1290)/1290)

print('The percentage change in share price over the past 4 days is ', 100*(1300-1365)/1365)

print('The maximum share price over the past 4 days was',max([1300,1310,1290,1370,1365]))

print('The minimum share price over the past 4 days was',min([1300,1310,1290,1370,1365]))


Now imagine you're repeating this analysis one week later, with five new share prices:

* Today: ``1280``
* 1 day ago: ``1317``
* 2 days ago: ``1347.89``
* 3 days ago: ``1369.8``
* 4 days ago: ``1366.01``

---

## <font color='red'> Let's discuss

Would it be a good use of your time to rewrite the block of code above for the new set of share prices? Why, or why not?

---

### What is a variable?

A variable is a way of assigning a name to a piece of data in Python. 

To create (or **declare**) a variable, we use the assignment operator `=` (not to be confused with ‘equal to’, which is written `==`). 

The assignment ```myname = 'Bob'``` can be interpreted as "the string ```'Bob'``` is assigned to the variable ```myname```". 

Variable names should only contain numbers, letters (both upper and lower case), and underscores \_. 

**They must begin with a letter or an underscore, and variable names in Python are case sensitive.**

As with most programming languages, _reserved words_ that are part of the Python language itself (e.g. import, for, if) cannot be used as variable names. The names of built-in functions (e.g. print) should be avoided too. 

Here are some examples of variables: 

In [None]:
my_name = 'maryam'

In [None]:
my_name

In [None]:
my_list = [2,3,4,5,6,7]

In [None]:
my_list

In [None]:
share_price_2019 = 400

In [None]:
share_price_2019

Variable names are **case sensitive**, so if we try to retrieve the value of a variable called ``my_Name``, Python will tell us that variable doesn't exist because we haven't declared it yet. 

In [None]:
my_Name
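Returning to the share price example from earlier, here's a minimal sketch of how variables could save us the retyping. The variable names are made up for illustration; the point is that when next week's prices arrive, only the first five lines need to change.

```python
# Store each price once (values are the first set from the example above)
price_today = 1300
price_1_day_ago = 1310
price_2_days_ago = 1290
price_3_days_ago = 1370
price_4_days_ago = 1365

# Everything below refers to the variables, never to the raw numbers
all_prices = [price_today, price_1_day_ago, price_2_days_ago,
              price_3_days_ago, price_4_days_ago]

change_2_days = 100 * (price_today - price_2_days_ago) / price_2_days_ago
change_4_days = 100 * (price_today - price_4_days_ago) / price_4_days_ago

print('The percentage change in share price over the past 2 days is', change_2_days)
print('The percentage change in share price over the past 4 days is', change_4_days)
print('The maximum share price over the past 4 days was', max(all_prices))
print('The minimum share price over the past 4 days was', min(all_prices))
```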

### Variable names

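Python already uses some names itself, and they fall into two groups:

* Reserved words (also called keywords), like for, if, import, and, or and not. Python won't let you use these as variable names at all; trying to do so gives a syntax error.
* Names of built-in functions and types, like print, sum, list or int. Python will let you reuse these names, but you really shouldn't.

Bad things happen if you make a variable that has the same name as a built-in function or type. Don't do things like this: 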

```
print = 5 
list = [1,2,3,4]
set = 58
```

It's a good idea to choose meaningful names for variables: although it's used in the examples above, 'x' would be a very unhelpful name for a variable in practice, because the name doesn't tell us anything about what 'x' actually is. Long, meaningful names are also much less likely to clash with a name that Python already uses.

Better examples of variable names (depending on the application) could be:

In [None]:
user_age = 29
user_name = 'Sarah'
days_in_year = 365

Variable names must start with a letter of the alphabet or an underscore.

The following are illegal names for variables:

In [None]:
x: = 1.0
1X = 1
X-1 = 1
for = 1
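If you're ever unsure whether a name is allowed, you can ask Python without triggering an error. This sketch uses the standard library's `keyword` module and the string method `isidentifier`; the names being checked are just the examples from this section.

```python
import keyword

# Is the word reserved by the Python language itself?
print(keyword.iskeyword('for'))      # True
print(keyword.iskeyword('print'))    # False: print is a built-in function, not a keyword

# Does the text follow the rules for a name (letters, numbers, underscores,
# not starting with a number)?
print('user_age'.isidentifier())     # True
print('1X'.isidentifier())           # False
print('X-1'.isidentifier())          # False
```

Note that `isidentifier` only checks the spelling rules, so a name needs to pass that check *and* not be a keyword before it can be used for a variable.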

### What could go wrong?

Let's take a look at what will happen if you give a variable the same name as a built-in Python function. 

First, let's check the 'type' of the first function we met, `print`. As expected, its type is a built-in function. 

In [None]:
type(print)

In [None]:
print('hello')

Now let's intentionally make a mistake. We'll create a variable called `print` and assign it a value of 4.

In [None]:
print = 4

Now, when we try to use the `print` function, Python gives us an error. That's because `print` is no longer a function; it's an integer, because we just assigned it the value 4!

In [None]:
print('hello')

We can confirm this by checking the type of `print` again. Whoops.

In [None]:
type(print)

We can fix our mistake by deleting the variable `print` from Python's memory.

In [None]:
del print 

Now, when we check the type of `print` we can see it's back to being a built-in function as expected.

In [None]:
type(print)

<a id="strings"></a>
# <font color='blue'> Strings

Strings are essentially any combination of characters in between quotes. They are most often used as a way of storing text. Strings are used frequently, because much of the data that humans create is text-based, such as restaurant reviews or emails.

In [None]:
s = "Hello world"
type(s)

### Concatenation

Two strings can be joined/added together (concatenated) using the **+** operator:

In [None]:
"Hello, " + "world!"

This joining together of strings is very simple; however, if you want words separated by a space you have to put the space in yourself. Here are some examples:

In [None]:
"Hello, "   +  "world!" # space after the comma

In [None]:
"Hello,"  +  " world!" # space before second word

### Converting strings to integers

Strings can be converted to integers using the built-in function 'int', but only when we're asking Python to perform a sensible conversion!

In [None]:
type(int("10"))

In [None]:
int("-100")

In [None]:
int("100-10") # this gives an error

In [None]:
int("Shrubbery") # so does this!

### String Indexing

In some cases, we may want just part of a string (like the first character, for alphabetizing or categorizing). Indexing lets us extract the character at a specific position in a string.

In [None]:
# Indexing the first (index 0) character in the string:
s[0]

The number you enter after the variable name in brackets (the `[0]`) is called the index (its plural is indices).

_Counting in Python and many other programming languages begins at zero, as opposed to one. This is called zero-based indexing._

In [None]:
# This is called "slicing." We start at the left index 
#   and go up to but not include the right index.

# Objects at indexes 0, 1, and 2:
s[0:3]

Most ranges, or functions with ranges, have upper ends that are not inclusive. So, a range of `[0:5]` starts at `0` and stops before `5`.

A good mental trick is to look at something like `[5:25]` and say out loud "Starting at five and going up to (but not including) 25."

In [None]:
# From index 6 up to the end of the string:
s[6:]

In [None]:
# No start or end specified:
s[:]

In [None]:
# Can we index from the right side?
s[-1]

In addition to specifying a range, you can include a step size or character skip rate. This might be helpful if you want every other letter, for example. 

These indexing methods can also be used on lists, where asking for every other number might be a good use case.

In [None]:
# Every second character, starting at index 0 and stopping before index 10:
s[0:10:2]

In [None]:
# Define a step size of 2; i.e., every other character:
s[::2]

In [None]:
# The same, but for a list of numbers:
[0, 1, 2, 3, 4, 5, 6][::2]

### Escape sequences

Python allows escape sequences (a backslash followed by another character) to insert whitespace and other special characters into strings.

In [None]:
print("Hello, \nworld") # inserts a new line

In [None]:
print("Hello,\tworld") # inserts a tab

In [None]:
print("Hello,\\world") # inserts a single backslash
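Escape sequences can also be combined, and they're handy for putting a quote mark inside a string that uses the same kind of quote. Here's a rough sketch with made-up example strings:

```python
# \' puts a single quote inside a single-quoted string
quote_example = 'It\'s a trap'
print(quote_example)

# Several escape sequences can appear in the same string
table = "name\tage\nSarah\t29"
print(table)

# Each escape sequence counts as one character, not two
escaped_length = len("a\nb")
print(escaped_length)  # 3
```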

### Slicing strings

Slicing is a method used to extract a portion of the string, or a substring. Slicing uses square brackets and the syntax ```mystring[a:b]``` to specify which parts of the string we want to extract, where ```a``` is the index (or position) of the start of the substring and ```b-1``` is the index of the end of the substring.

Python uses zero-indexing, which means the first element in a string of length ```n``` is at position 0, and the last element in the string has an index (or position) equal to ```n-1```.  

In [None]:
mystring = "Press return to exit"
print(mystring[0:14])

A string is an immutable object. Its individual characters cannot be modified with an assignment statement, and it has a fixed length. Any attempt to violate this property will result in an error.

In [None]:
mystring[0]="p" # attempt to change P to p
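So how do we 'change' a string? One common approach, sketched below, is to build a brand new string out of pieces of the old one. The name `lowercased_first` is just for illustration.

```python
mystring = "Press return to exit"

# Join a new first character onto everything from index 1 onwards
lowercased_first = "p" + mystring[1:]
print(lowercased_first)  # press return to exit

# The original string has not been touched
print(mystring)          # Press return to exit
```

If we want the variable `mystring` itself to hold the new text, we can simply assign the new string back to it: `mystring = "p" + mystring[1:]`.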

---

## <font color='red'> Now you try

1. Declare the string "The quick brown fox jumped over the lazy dog" as a variable called ``mystring``, and write code to extract:

    (a) The following substring: "The quick brown fo"
    
    (b) The first character only
    
    (c) The last character only
    
    (d) The substring "over the lazy dog"
    
    
2. Figure out what the following syntax does: 

```mystring[i:j:k]```, ```mystring[-1]```, ```mystring[:i]```, ```mystring[i:]```, ```mystring[:]```, ```mystring[-2]```

3. Use the ``len`` function to find the length of the string.


4. The following code will throw an error when you try to run it. Use escape sequences so it prints the following result:

``She said "He's awake."``


In [None]:
print("She said "He's awake."")

---

<a id="lists"></a>
# <font color='blue'> Lists

### What's a list?

Lists are compound data types. This means they are sequences of values, where each element can be of any type. The syntax for creating a list is ``[...]`` where each element is separated with a comma.

In [None]:
list_of_stuff = ['Pizza',5,42.0,True]
list_of_stuff

Unlike strings, lists are mutable. This means their contents can be changed after creation.

In [None]:
list_of_stuff[0]="hello"
list_of_stuff

We can index lists in exactly the same way as strings.

In [None]:
primes = [ 2, 3, 5, 7, 11, 13, 17, 19]
primes[0]

We can append items to the end of a list using the 'append' **method** (we'll come back to methods later).

In [None]:
b=['peter','piper']
b.append(10)
b

To determine the length of a list, we can use the built-in function len, which also works on strings.

In [None]:
primes = [ 2, 3, 5, 7, 11, 13, 17, 19] # index runs from 0 to 7 incl.
len(primes)

Note that we always read indexing from left to right. In the example below, the interpreter first looks up the element at index 1 of the list, which is the string "Anne". Then the slice ([1:]) takes everything from index 1 of that string up to its end, evaluating to "nne".


In [None]:
['Carol', 'Anne', 'Jessica', 'Michelle'][1][1:]
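If chained indexing like this feels hard to read, it can help to split it into separate steps. This sketch does exactly the same thing as the one-liner, using some intermediate variables whose names are invented for the example:

```python
names = ['Carol', 'Anne', 'Jessica', 'Michelle']

# Step 1: index the list to get one element (a string)
second_name = names[1]
print(second_name)     # Anne

# Step 2: slice that string from index 1 to the end
rest_of_name = second_name[1:]
print(rest_of_name)    # nne

# Both steps in one go give the same answer
print(names[1][1:] == rest_of_name)  # True
```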

---

## <font color='red'> Now you try

1. Given the list below, write code to extract:

    (a) The first element

    (b) The last element

    (c) The substring 'ban' from the last element

    (d) The slice ``['a','d',1,3]`` from the list

    (e) The slice ``['a','e',3]`` from the list
    

2. Append an extra element, 'apples', to the list


3. Replace the first element of the list with the number 5


4. Reverse the order of elements in the list (you might need to Google around a bit to get the solution)


5. Use the ``len`` function to find the length of the list.

In [None]:
mylist = ['a','b','d','e',1,2,3,'124','bananas']

---

<a id="dictionaries"></a>
# <font color='blue'> Dictionaries 

Dictionaries are an alternative to lists for storing and accessing information. Instead of using an ordered index to access data stored in a dictionary, we use a system of key-value pairs.

A key is similar to a variable name.

A value is similar to the value assigned to the variable.

Instead of looking up (or 'indexing') values in a dictionary using numerical indices (e.g. ``mylist[0]``), we access elements in a dictionary using words, or keys.

We create dictionaries using curly brackets ``{}``. Remember this is different to the syntax for creating lists, which is ``[]``. 

Let's start by making a dictionary that contains the favourite foods of some people.

In [None]:
favourite_foods_dictionary = {'Alice': 'ramen',
                             'Bob': 'pizza',
                             'Sarah': 'curry' }


In [None]:
type(favourite_foods_dictionary)

In [None]:
favourite_foods_dictionary['Alice']

The keys in a dictionary can't be edited in place, but the value stored under a key can be changed:

In [None]:
favourite_foods_dictionary['Bob'] = 'fish and chips'

Adding a new dictionary entry is easy:

In [None]:
favourite_foods_dictionary['Dave'] = 'sushi'
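Asking for a key that isn't in the dictionary gives an error, so it's often useful to check first. Here's a minimal sketch of a few ways to inspect a dictionary; it uses a shortened, made-up copy of the dictionary so the cell can run on its own.

```python
favourite_foods = {'Alice': 'ramen', 'Bob': 'pizza', 'Sarah': 'curry'}

# 'in' checks whether a key exists
print('Alice' in favourite_foods)             # True
print('Zed' in favourite_foods)               # False

# .get() looks up a key, giving back a fallback instead of an error if it's missing
print(favourite_foods.get('Zed'))             # None
print(favourite_foods.get('Zed', 'unknown'))  # unknown

# .keys() and .values() list what's stored; len() counts the entries
print(list(favourite_foods.keys()))           # ['Alice', 'Bob', 'Sarah']
print(list(favourite_foods.values()))         # ['ramen', 'pizza', 'curry']
print(len(favourite_foods))                   # 3
```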

We can also nest elements in a dictionary; so an element in a dictionary can contain a list, or even another dictionary.

In [None]:
favourite_things_dictionary = {'Alice': 'ramen',
                             'Bob': ['dim sum','hotpot','noodles'],
                             'Sarah': {'food':'lasagna','drink':'coke'}}

So to find Bob's top favourite food...

In [None]:
favourite_things_dictionary['Bob'][0]

And to find Sarah's favourite drink...

In [None]:
favourite_things_dictionary['Sarah']['drink']

Elements of dictionaries can also contain multiple types: strings, ints, lists, and dictionaries.

In [None]:
people_information_dictionary = {'Alice': {'food':'pizza',
                                           'drink':['coffee','orange juice','tea'],
                                           'age':29},
                               'Bob': {'food':'dim sum',
                                          'drink':'water',
                                          'age': 21},
                               'Sarah': {'food':'lasagna',
                                         'drink': 'green tea',
                                         'age': 22}}

---

## <font color='red'> Now you try
    
Can you retrieve the following from the `people_information_dictionary` defined above?

* Bob's favourite food
* Sarah's age
* Alice's 3rd favourite drink

Here's a dictionary that contains other dictionaries. Using a combination of keys and (where necessary) list indexing, can you retrieve the following information from `company_info_dictionary`?

* The number of companies in the dataset
* The number of employees at Wiiwork
* MeTube's founder
* The 2nd of Bloogle's founders
* All of Bloogle's office locations
* The first name of Wiiwork's first founder


In [None]:
company_info_dictionary = {'meta_data':{'number_of_companies':3,
                                       'data_last_updated':'12/01/2020'},
                           
                          'company_data':[{'name':'Wiiwork',
                                         'employees':230,
                                         'office_locations':['New York'],
                                         'founders':['Andy Newman','Mike MacIntyre']},
                                          
                                          {'name':'MeTube',
                                         'employees':100,
                                         'office_locations':['San Francisco','London'],
                                         'founders':['Sarah Williams']},
                                          
                                          {'name':'Bloogle',
                                         'employees':5001,
                                         'office_locations':['Mountain View','New York','Paris','London'],
                                         'founders':['Steve Brine','Lee Peterson']}                                          
                                          
                                         ]}


<a id="libraries"></a>
# <font color='blue'> Libraries

### What's a library?

In programming, a library is a big bundle of code snippets (or functions) that someone else has written and made freely available for other people to download and modify (open source culture again!). A single library will usually be designed to help people write code for specific applications or purposes. For example:

* Pandas is a library for data science
* Numpy is a library for numerical computing
* NLTK is a library for natural language processing
* Tensorflow is a machine learning library released by Google

Let's say we have a list, and we'd like to use the statistics and/or numpy libraries to compute the median and mean of that list. The numpy and statistics libraries both offer functions that make this easy. 

Let's run numpy's `mean` function to get the mean of a list... 

In [None]:
my_list = [1,2,3,4,5]


In [None]:
mylist = [1,12,15,19,20]
numpy.mean(mylist)

### Importing libraries

Whoops! Before Python can access a library's functions, we need to ``import`` or 'load' the library. 

There are several ways to import libraries.

Avoid importing libraries like this. Typing the full library name every time is clunky and wastes time.

In [None]:
import numpy 


numpy.mean(mylist)
numpy.median(mylist)

This next method will work, but it can lead to clashes between libraries, and you can't tell which library a function has come from:

In [None]:
from numpy import *
from statistics import *

median(mylist)
mean(mylist)
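
To see why this can be confusing, here's a small sketch using two standard-library modules, ``math`` and ``cmath`` (chosen purely for illustration), which both define a function called ``sqrt``. Whichever one is imported last silently replaces the other. ``import *`` does the same thing for every public name in a library at once.

```python
from math import sqrt
print(sqrt(4))            # math's version returns a float

from cmath import sqrt    # same name: this replaces math's sqrt
print(sqrt(4))            # cmath's version returns a complex number

# Keeping the library name in front removes the ambiguity
import math
import cmath

real_root = math.sqrt(4)
complex_root = cmath.sqrt(4)
print(real_root, complex_root)
```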

This last method is best practice:

In [None]:
import numpy as np
import statistics as stats

np.median(mylist)
np.mean(mylist)

stats.median(mylist)
stats.mean(mylist)

Usually, all the library imports happen at the very top of a Python notebook or script, but for the purposes of this class that won't always be the case.

---

## <font color='red'> Now you try

1. Import the ``statistics`` library as ``stats`` and the ``numpy`` library as ``np``.


2. Use ``stats.mean()`` and ``stats.median()`` to calculate the mean and median of this list of values: ``[1.5, 2.3, 6.7, 8, 10]``


3. Repeat these calculations using the 'numpy' library and confirm the results are the same.


4. Import the `math` library without giving it a short name, then use its ``sqrt`` function to calculate the square root of 12 and try out its ``exp`` function on a number of your choice.


---

<a id="functions"></a>
# <font color='blue'> Functions
    
Functions in programming are a lot like mathematical functions. 

They take one or more inputs, do something to those inputs and then return one or more outputs.

In Python, there are built in functions (like ``print``) and library functions (like ``math.sqrt``), but we can also write our own functions to perform more specialised tasks.

You can think of writing a function as being similar to building a machine that takes inputs, transforms them, and gives you outputs. 

A function has to be defined first and then called, just like a machine has to be built before you can use it.

The syntax and structure to define a function must include:

* The def keyword, followed by the function’s name
* The function's arguments, given between parentheses and followed by a colon
* The function body, correctly indented
* Optionally, the return statement

We can define a simple function to square a number as follows.

In [None]:
def square(x):
    return x**2

Defining the function can be thought of as 'building the machine'. Now our machine is built, we can use it with our own inputs.

We can feed inputs directly into the function, and the output will be whatever is in the ``return`` statement.

In [None]:
square(5)

Or we can declare the inputs as variables, and then feed them into the function.

In [None]:
x = 5
square(x)

This works regardless of what variable name we give our inputs. The ``x`` in our function definition is just a placeholder.

In [None]:
y = 6
square(y)

We can also assign the output of a function to a variable. The variable will take the value of whatever is in the ``return`` statement of our function.

In [None]:
y = 6
squared_number = square(y)
print(squared_number)

We can't access the value of anything in our function that isn't part of the ``return`` statement. All variable names inside a function are **internal to that function** and won't be recognised outside the function unless they're part of the ``return`` statement.


A function stops running as soon as it hits a ``return`` statement, so ``return`` is normally the last line of a function; anything below it won't be executed.
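
A minimal sketch of this behaviour, using a hypothetical function name: the ``print`` line sits below the ``return`` statement, so it is never reached.

```python
def square_with_leftover(x):
    return x**2
    print('this line never runs')  # unreachable: the function has already returned

result = square_with_leftover(4)
print(result)
```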

In [None]:
def number_powers(z):
    
    squared = z**2
    cubed = z**3
    
    return squared
    

In [None]:
number_powers_result = number_powers(2)

In [None]:
cubed  # raises a NameError: 'cubed' only exists inside the function

If we want the function to return ``cubed`` as well as squared, we need to add it to the ``return`` statement.

In [None]:
def number_powers(z):
    
    squared = z**2
    cubed = z**3
    
    return squared, cubed

In [None]:
squared, cubed = number_powers(3)
print(squared, cubed)

Functions can have multiple inputs and multiple outputs. Just make sure the inputs are inside the round brackets in the first line of the ``def`` statement, and all outputs are part of the ``return`` statement.

In [None]:
def add_three_numbers(a,b,c):
    
    sum_1 = a+b
    sum_2 = a+c
    sum_3 = b+c
    
    return sum_1, sum_2, sum_3

In [None]:
add_three_numbers(5,6,7)

It's also possible to have a function without a ``return`` statement. Such a function doesn't explicitly return anything, but we can still view the results of the calculations inside the function using ``print`` statements.

In [None]:
def multiply_three_numbers(a,b,c):
    
    multiply_1 = a*b
    multiply_2 = a*c
    multiply_3 = b*c
    
    print(multiply_1,multiply_2,multiply_3)
    

In [None]:
multiply_three_numbers(3,4,5)
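
One consequence worth seeing: a function with no ``return`` statement still hands something back, namely the special value ``None``. The sketch below uses a hypothetical ``print_product`` function to show this.

```python
def print_product(a, b):
    print(a * b)      # shows the result on screen, but does not return it

result = print_product(3, 4)   # prints 12
print(result)                  # None: nothing was returned
print(result is None)
```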

To assign a function's outputs to variables, the function needs a ``return`` statement. ``multiply_three_numbers`` only prints its results, so here we use ``add_three_numbers`` from above instead:

In [None]:
sum_1, sum_2, sum_3 = add_three_numbers(5,6,7)

Again, the names of the outputs could be anything we want. Variable names are **internal to a function** and the input/output names we use when calling a function can be whatever we like. 

---

## <font color='red'> Now you try

Let's write a function to calculate the final value of an investment that's accruing compound interest.

This is described by the formula:

$$TV = PV(1+r)^n$$

Where

$TV$ is the final value of the investment

$PV$ is the initial investment

$r$ is the interest rate per year 

$n$ is the number of years

1. Write a function that computes the final value of an investment, given the initial investment, interest rate, and number of years.


2. Use your function to compute the final value of a £130 investment, assuming an interest rate of 3% over 5 years.


3. Write a function that computes the number of years needed for an investment to reach a given multiple of its initial value, also given the interest rate. (**hint: you'll need to rearrange the formula to make $n$ the subject**, and you'll need the ``math`` library)


4. Use your function to work out how many years are needed for an investment to double in value, assuming an interest rate of 0.01%

---

<a id="loops"></a>
# <font color='blue'> Boolean logic, Loops and If Statements
    
<a id="bool"></a>
## Boolean logic

Boolean logic allows us to test whether a statement is true or false.

The result of a Boolean logical test is always either ``True`` or ``False``; these two values have their own special type, the ``bool`` or Boolean type.

In [None]:
type(True)

In [None]:
type(False)

Some examples of logical tests are shown below. Just like we can use arithmetic operators (``+``, ``-`` etc) to perform calculations, Boolean logic has its own set of operators to perform logical tests:

**Equal to**

In [None]:
6==10 

**Greater than**

In [None]:
6 > 10

**Less than**

In [None]:
6 < 10

**Not equal to**

In [None]:
6 != 10

**Greater than or equal to**

In [None]:
3 >= 4

**Less than or equal to**

In [None]:
4 <= 4

## The ``and``, ``or`` and ``not`` operators

These three operators can be used to compare and combine the results of more than one logical test. The result of an ``and``, ``or``, or ``not`` operation is also a Boolean; that is, ``True`` or ``False``.

### The ``and`` operator

The result of an ``and`` operation is ``True`` if **both inputs** are ``True``, and ``False`` otherwise.

``True`` **and** ``True`` $\longrightarrow$ ``True``


``True`` **and** ``False`` $\longrightarrow$ ``False``


``False`` **and** ``True`` $\longrightarrow$ ``False``


``False`` **and** ``False`` $\longrightarrow$ ``False``


In [None]:
6<10 and 5>2

In [None]:
6<10 and 5<2

In [None]:
6==10 and 5<2

### The ``or`` operator

The result of an ``or`` operation is ``True`` if **either or both inputs** are ``True``, and ``False`` otherwise.


``True`` **or** ``True`` $\longrightarrow$ ``True``


``True`` **or** ``False`` $\longrightarrow$ ``True``


``False`` **or** ``True`` $\longrightarrow$ ``True``


``False`` **or** ``False`` $\longrightarrow$ ``False``

In [None]:
6<10 or 5>2

In [None]:
6<10 or 5<2

In [None]:
6==10 or 5<2

### The ``not`` operator

A ``not`` operation takes the result of a logical test and flips it.

**not** ``True`` $\longrightarrow$ ``False``

**not** ``False`` $\longrightarrow$ ``True``

In [None]:
not (6>10)

In [None]:
not (10>6)

In [None]:
not((5 < 3) and ((6 <= 6) or (5 != 6)))
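
The truth tables above can also be generated with code. This sketch uses ``itertools.product`` from the standard library and a ``for`` loop (both go beyond what has been covered so far) to try every combination of ``True`` and ``False``. The names ``and_table`` and ``or_table`` are just illustrative.

```python
from itertools import product

and_table = {}
or_table = {}

# product([True, False], repeat=2) yields every pair of Boolean inputs
for a, b in product([True, False], repeat=2):
    and_table[(a, b)] = a and b
    or_table[(a, b)] = a or b
    print(a, b, '| and:', a and b, '| or:', a or b)

# `not` takes a single input and flips it
for a in [True, False]:
    print('not', a, '->', not a)
```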

---

## <font color='red'> Now you try

Predict what the outcomes of these logical tests will be, then run the cells to check your predictions.

In [None]:
(6 <= 6) and (5 < 3)

In [None]:
3 < 2 or 45 % 3 == 15

In [None]:
60 - 45 / 5 + 10 == 1

In [None]:
(6 <= 6) or (5 < 3)

In [None]:
(5 < 3) and (6 <= 6) or (5 != 6)

<a id="if"></a>
## The ``if`` statement

``if`` statements are a way of making different things happen/running different bits of code depending on the outcome of a logical test.

The simplest form of an ``if`` statement is shown below:

``if logical test is true:
        do some stuff``

In [None]:
if 5>3:
    print('five is greater than three')

## The ``if-else`` statement

The ``if-else`` statement runs one bit of code if the logical test is ``True``, and another bit of code if the logical test is ``False``.

The general form of an ``if-else`` statement is below:

``if logical test is true:
        do some stuff
  else:
        do something else``

In [None]:
if 1>3:
    print('one is greater than three')
else:
    print('one is NOT greater than three')

You'll notice that the line underneath each ``if/elif/else`` block is indented by four spaces, or one tab.

We discussed whitespace earlier, and how in all but a few special cases, it is ignored by Python. Indentation is one of those special cases: Python uses it to mark the start and end of ``if`` statements and loops, so it matters here.

When using ``if/elif/else`` blocks, all of the control blocks must have the same indentation level, and all of the statements inside a control block must have the same level of indentation; otherwise Python will raise an error.

Returning to the previous indentation level instructs Python that the block is complete.
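
Here's a small sketch of these indentation rules. The ``temperature`` variable and the messages are made up for illustration.

```python
temperature = 25  # hypothetical value, just for illustration

messages = []
if temperature > 20:
    messages.append('warm')            # indented: inside the if block
    messages.append('no coat needed')  # same indentation: still inside the block
messages.append('check complete')      # back at the left margin: the block has ended

print(messages)
```

Try changing ``temperature`` to ``10``: the two indented lines are skipped, but the final line still runs because it is outside the block.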

## The ``if-elif`` statement

The ``if-elif`` statement runs one bit of code if the logical test is ``True``, and another bit of code if **another** logical test is ``True``, and so on until the end of the block is reached. An ``if-elif`` block can end with an ``else`` statement, but this isn't necessary.

The general form of an ``if-elif`` statement is below:

``if logicaltest1 is true:
        do some stuff
  elif logicaltest2 is true:
        do something else
  elif logicaltest3 is true:
        do something else entirely``
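
As a concrete sketch of the general form above, here is a hypothetical function that describes a temperature. The function name and the thresholds are arbitrary choices for illustration.

```python
def describe_temperature(celsius):
    # Tests are checked from top to bottom; only the first True branch runs
    if celsius < 0:
        description = 'freezing'
    elif celsius < 15:
        description = 'cold'
    elif celsius < 25:
        description = 'mild'
    else:
        description = 'hot'   # runs only if none of the tests above were True

    return description

print(describe_temperature(-5))
print(describe_temperature(10))
print(describe_temperature(20))
print(describe_temperature(30))
```

Notice that ``10`` is also less than ``25``, but the function never reaches that test, because the ``celsius < 15`` branch has already run.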
        

## <font color='red'> Now you try
    
Write a function called ``odd_or_even``. It should take a single integer as an input.

If the integer is even, the output of the function should be the string "even!" and if the integer is odd, the output should be "odd!"

To do this, you will need to use the ``%`` operator.
    

---

<a id="for"></a>
## The ``for`` loop

``for`` loops allow us to loop over a list, and perform some calculation on each element in the list. The ``for`` loop will iterate (or cycle) across all items in the list, beginning with the item at position ``0`` and continuing until the final element.

The generic form of a ``for`` loop is below:

``for every_element in my_list:``
        
    ``perform some calculation or operation with the current element of the list``


``for`` loops can also be used to iterate over other types (not just lists) such as ranges, tuples, arrays or matrices.

In the example below, notice how the value of **i** changes with every iteration of the loop, to take the value of the current element in the list.

In [None]:
for i in [1,2,'bananas',3,4,True]:
    
    print(i)

Here's an example of a ``for`` loop that iterates over a ``range`` rather than a list. A ``range`` is a useful way of quickly generating a sequence of numbers.

In [None]:
for x in range(0,5):
    
    print(x)
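
The same pattern works for other types too. The sketch below loops over a tuple, a string and a dictionary; the last two aren't in the list of types mentioned above, but they behave in the same way. All the names used are illustrative.

```python
# A tuple: like a list, but written with round brackets
for colour in ('red', 'green', 'blue'):
    print(colour)

# A string: the loop visits one character at a time
letters = []
for letter in 'abc':
    letters.append(letter)
print(letters)

# A dictionary: .items() gives one key-value pair per iteration
favourite_foods = {'Alice': 'ramen', 'Bob': 'pizza'}
for name, food in favourite_foods.items():
    print(name, 'likes', food)
```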

``for`` loops can also be used to update the values of variables, or append items to the end of lists.

In [None]:
mylist = [] # initialise an empty list

for y in range(0,5):
    
    mylist.append(y**2)

print(mylist)

**Indentation matters!** 

Any piece of code **inside** the indented ``for`` loop block will be executed on every cycle of the ``for`` loop. 

Any code not inside the indented block is not part of the loop, and is a regular stand-alone piece of code.

Notice the difference between the following code cell and the code in the cell above:

In [None]:
mylist = [] # initialise an empty list

for y in range(0,5):
    
    mylist.append(y**2)
    print(mylist) # on EVERY iteration of the loop, print the current state of the list
    

---
## <font color='red'> Now you try

Use a ``for`` loop together with the ``range`` function to print all the whole numbers from 1 to 100. 

But for multiples of three print "Fizz" instead of the number and for the multiples of five print "Buzz". 

For numbers which are multiples of both three and five print "FizzBuzz".

**For context**: The ``fizzbuzz`` question is well known as an exercise used in programming interviews. It was devised by Imran Ghory, and popularised by Jeff Atwood: 

https://imranontech.com/2007/01/24/using-fizzbuzz-to-find-developers-who-grok-coding/ 

---

<a id="while"></a>
## The ``while`` loop

A ``while`` loop keeps repeating an operation **while** a certain logical test remains ``True``. 

The general form is:

``while logical test remains True:
    do some stuff``


In [None]:
counter=1

while counter <= 10:
    print(counter)
    counter += 1 # update the value of 'counter'; this is shorthand for counter = counter + 1
    
print("Done") # this is outside the loop
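
``while`` loops are most useful when we don't know in advance how many repetitions are needed. As a sketch (the numbers are arbitrary), the loop below counts how many times 100 can be halved before the result drops below 1.

```python
value = 100
halvings = 0

while value >= 1:          # keep going while the test is True
    value = value / 2
    halvings += 1          # count each pass through the loop

print(halvings, value)
```

If the logical test never becomes ``False``, the loop runs forever, so make sure something inside the loop changes the value being tested.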

---
## <font color='red'> Now you try

Let's go back to our compound interest formula:

$$TV = PV(1+r)^n$$

Where

$TV$ is the final value of the investment

$PV$ is the initial investment

$r$ is the interest rate per year 

$n$ is the number of years

Suppose we'd like to know how many years we have to keep 100 pounds in a savings account for it to reach 200 pounds purely through annual interest payments at a rate of 5%. Write code using a ``while`` loop to show that this will take 15 years.

---

<a id="classes"></a>
# <font color='blue'> Classes
    
Python is an **object oriented programming language**.

This means that almost everything in Python (variables, lists, integers) is an object, with its own specific **properties** and **methods**.

A **method** is a function that can be run on a specific object. We've used a lot of methods already, like the ``append()`` method for lists or the ``upper()`` and ``lower()`` methods for strings.

A **property** (or attribute) is a piece of data stored on a specific object and accessed without brackets, like the ``__doc__`` attribute that holds a dictionary's documentation text. By contrast, ``.keys()`` is a method, because we call it with brackets.

A Class is like a "blueprint" for creating objects. For example, the **dictionary** class is a blueprint for how to create dictionaries and how they should behave, the **list** class is a blueprint for creating lists etc. 

Every time we create a new list, Python uses the **list** class as a template for creating that list. The functions that a list can call (e.g. ``append``) are also defined as part of this template.
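
To make the "blueprint" idea concrete, here's a minimal sketch of a home-made class. ``BankAccount`` and everything inside it is hypothetical, written only to show how attributes and methods are attached to objects.

```python
class BankAccount:
    """A hypothetical blueprint for bank account objects (illustration only)."""

    def __init__(self, owner, balance):
        # Attributes: data stored on each individual object
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        # A method: a function that acts on the object it is called from
        self.balance = self.balance + amount
        return self.balance

# Create two separate objects from the same blueprint
alice_account = BankAccount('Alice', 100)
bob_account = BankAccount('Bob', 50)

alice_account.deposit(25)        # call a method (with brackets)
print(alice_account.balance)     # read an attribute (no brackets)
print(bob_account.balance)       # Bob's object is unaffected
```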

We'll be **using** existing classes more than we'll be creating our own in this course, so let's start by accessing the methods and properties of different classes we've used so far.

Let's start by creating a dictionary.

In [None]:
my_dict = {'fav_food':'pizza',
          'fav_drink': 'tea',
          'shoe_size':4.5}

We can use ``dir`` to list all the **attributes** and **methods** associated with an object. **Attributes** are properties of an object. Names with double underscores on either side, like ``__len__``, are special attributes and methods that Python uses behind the scenes.

**Methods** are functions that can be called on an object of the class.

In [None]:
dir(my_dict)

Let's try out some **methods** or functions that are specific only to dictionaries.

In [None]:
my_dict.keys() # get the keys

In [None]:
my_dict.popitem() # remove and return the last item

Now let's initialise a list.

In [None]:
my_list = [1,2,3,4] 

Let's use ``dir`` to view the attributes of the list, and check some of them out.

In [None]:
dir(my_list)

Let's try out some **list methods**.

In [None]:
my_list.append(5)

In [None]:
my_list.pop()

This calls the list's ``__len__`` special method directly, which is what the ``len`` function uses behind the scenes:

In [None]:
my_list.__len__()
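
The output of ``dir`` can be long. As a sketch, the cell below filters out the double-underscore names to leave only the everyday list methods. It uses a list comprehension, which is a compact way of writing a ``for`` loop that builds a list and isn't covered in this notebook.

```python
my_list = [1, 2, 3, 4]

# len() and the __len__ special method give the same answer
print(len(my_list) == my_list.__len__())

# Keep only the names that don't start with a double underscore
public_names = [name for name in dir(my_list) if not name.startswith('__')]
print(public_names)
```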

## <font color='red'> Codealong
    
Together, let's explore the **methods** and **attributes** of the ``LinearRegression`` class in ``scikit-learn``.

Let's start by importing the class from the ``scikit-learn`` library.

In [None]:
from sklearn.linear_model import LinearRegression
import numpy as np

Let's create an instance of the ``LinearRegression`` class. This creates an object that can:

* Find the line of best fit given some training data
* Make predictions about new data points 

In [None]:
my_linear_regressor = LinearRegression()

Let's take a look at its attributes and methods using ``dir``.

Which method do you think we'll need to fit a linear regression model to some data?

In [None]:
dir(my_linear_regressor)

Let's use the ``fit`` method to fit our model to some dummy data.

In [None]:
x = np.array([1,2,3,4,5]).reshape(-1, 1)
y = np.array([1,2,3,4,5]).reshape(-1, 1)

my_linear_regressor.fit(x,y)


Now let's access some **attributes** of our newly fitted linear regression model to find out what the line of best fit is.

Let's run ``dir`` to check which attributes will give us the information we're after.

In [None]:
dir(my_linear_regressor)

Notice how the `coef_` and `intercept_` attributes weren't visible using ``dir`` **before** we fitted our model to our data? Why is that?

In [None]:
my_linear_regressor.coef_

In [None]:
my_linear_regressor.intercept_

How should we interpret the coefficient and intercept of our straight line?
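
One way to think about both questions is with a toy class of our own. The sketch below is a hypothetical stand-in, **not** scikit-learn's actual implementation: ``TinyLineFitter``, ``slope_`` and its ``predict`` method are names invented for illustration, and it handles only a single input variable. It assumes the attributes are created inside ``fit``, which would explain why they can't be seen before fitting.

```python
class TinyLineFitter:
    """Hypothetical stand-in for a regression class (not scikit-learn's code)."""

    def fit(self, xs, ys):
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        # Least-squares slope: covariance of x and y divided by variance of x
        covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        variance = sum((x - mean_x) ** 2 for x in xs)
        # These attributes only come into existence when fit() runs
        self.slope_ = covariance / variance
        self.intercept_ = mean_y - self.slope_ * mean_x
        return self

    def predict(self, x):
        # A straight line: y = slope * x + intercept
        return self.slope_ * x + self.intercept_

fitter = TinyLineFitter()
print(hasattr(fitter, 'slope_'))          # False: nothing has been fitted yet

fitter.fit([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
print(hasattr(fitter, 'slope_'))          # True: fit() created the attribute
print(fitter.slope_, fitter.intercept_)
print(fitter.predict(6))
```

With this dummy data the slope is 1 and the intercept is 0, so the fitted line is simply y = x: each increase of 1 in x raises the prediction by 1.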