# Python in 10 minutes

Python is a popular language because the barriers to learning it are relatively low. You can do quite a lot in just a few lines of code, so in the spirit of doing being a great form of learning, this notebook is designed to run you through some basics in about 10 minutes to produce a chart.

**Important:** Before starting, please make sure all the cells are cleared by typing `Shift+C`

***

## Basic calculations

Let's start by using Python for some basic tasks, such as calculation.

In the grey box below, type in any simple arithmetic calculation (e.g. `4+733` , or `898*432` ). Use the same characters for calculations as you would in Excel: + - * /

After you've typed it in and while the cursor is still blinking, run the program by pressing `Shift+Enter` , or click on the run button in the menu above ![runButton.png](attachment:runButton.png)

In [None]:
898*432

Did you get the result you were expecting?

At its heart, Python is a calculator, so it does things like that very well. Feel free to have another go and include things such as parentheses, decimal numbers or even powers _(you'll need to use a double asterisk for that:_ `3**2` _is equivalent to $3^2$)_.
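
If you'd like some ideas, here is a small sketch of the kinds of calculation mentioned above. The variable names are just labels made up for this example, so you can see which result is which.

```python
# Parentheses change the order in which the calculation is carried out
without_brackets = 2 + 3 * 4     # multiplication happens first: 14
with_brackets = (2 + 3) * 4      # the brackets are worked out first: 20

# A double asterisk raises a number to a power
squared = 3 ** 2                 # 9

# Decimal numbers work too
half_of_seven = 7 / 2            # 3.5

print(without_brackets, with_brackets, squared, half_of_seven)
```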

***

## Text strings

Python can also handle text. In the box below, try typing out `print("Hi there!")` and run the program in the same way you did before (`Shift+Enter`). If you like, you can swap the bit between the inverted commas for any text you like - just make sure to keep the "commas"!


In [None]:
print("Hi there!")

Now if that went well, you should have seen Python print the text between the inverted commas.

***

## Variables

This is all well and good, but we're not tapping into the real power of Python yet.  

An important step towards that is using variables. Variables are like the memory button on old school calculators, which would store a number and bring it back when you pressed the button. With Python we get to store as many things as we like - the key is to give each one a unique name.

Let's try it.

If you're in a restaurant and you want to calculate the tip, you'll need two things: the price of the meal and the percentage value of the tip. Let's set both of those by typing in the following:

`meal = 77.76`<br/>
`tip = 12.5/100`

Don't forget to use `Shift+Enter` when the cursor is blinking in the box to run!

In [None]:
meal = 77.76
tip = 12.5/100

Don't worry if it looked as though nothing happened there - if it didn't work then you'd see an error message! You didn't see any output because Python was only given the command to store the values, and not to output anything.

Let's check to make sure it stored those values, just in case. In the box below, type `print(meal)` and/or `print(tip)`

In [None]:
print(meal)
print(tip)

Did you see the numbers you were expecting? Now we can work out how much the tip is going to be by typing `meal*tip` and running it

In [None]:
meal*tip

Hopefully that came out as £9.72 (or something else if you decided to use different numbers above). Can you go back and work out the total bill, including the tip, based on what you know now?

If you like, you can go back and alter the values of the meal and the tip, just make sure you run each box in turn - i.e. run the box setting the variable before you run the last box!
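
If you get stuck on the total bill, here is one possible way to do it. This is only a sketch: the names `tip_amount` and `total` are made up for this example, and `round( )` is used to trim the answer to two decimal places.

```python
meal = 77.76        # price of the meal
tip = 12.5 / 100    # tip as a fraction of the price

tip_amount = meal * tip        # how much the tip comes to
total = meal + tip_amount      # the meal plus the tip

print(round(total, 2))
```

Storing the tip in its own variable means you only calculate it once and can reuse it wherever you need it.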

***

## Functions

We really start seeing the power of Python when we use functions. You've probably already used functions in other programs like Excel - things like `sum( )`, or `countif( )`. These exist in Python too - for example, you've already used the `print( )` function.

You can write functions yourself if you want, and it's sometimes useful to do so. Alternatively, you could use one of the myriad of functions which other developers have written and bundled together in what Python refers to as libraries.
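
As a rough idea of what writing your own function looks like, here is a minimal sketch. The function name `tip_amount` and its two inputs are invented for this example and are not part of Python itself.

```python
def tip_amount(meal, tip_percent):
    """Return the tip for a meal, given the tip as a percentage."""
    return meal * tip_percent / 100


# Use the function just like print( ): give it values inside the brackets
print(tip_amount(100, 10))
print(tip_amount(77.76, 12.5))
```

Once it is defined, you can reuse the function for any meal price and tip percentage without retyping the calculation.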

Here is where the real power of Python starts to reveal itself. Where Excel has a limited number of functions which might increase with each release every 2-3 years, Python has over 130,000 libraries with numerous functions in each, and it's increasing daily!

These functions cover a huge range of possibilities. If you have something you want to do in mind, it's more than likely Python has a function for it in one of its libraries.  Scraping web pages, artificial intelligence, automating software, producing PDFs, and visualising data are just the beginning. We're going to do something quickly now with the last of these.

To access a library, you first need to import it. This simply means you're loading a library that is already installed so that its functions can be used in your commands.

We're going to import two libraries, one called pandas and the other called matplotlib. Pandas is going to do the data manipulation, and matplotlib is going to make the charts for us.

In [None]:
import pandas as pd
import matplotlib as mpl

Again, no result is a good thing.
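
The same pattern works for any library. As a small sketch, the example below uses two libraries that come built in with Python, `math` and `statistics`. The nickname `st` is simply chosen for this example, in the same way `pd` and `mpl` were chosen above.

```python
import math
import statistics as st   # "st" is a short nickname chosen for this example

# Functions are reached by typing the library name (or nickname), then a dot
print(math.sqrt(16))
print(st.mean([2, 4, 6]))
```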

Now we've already loaded some sample data from the [TfL open data area](https://tfl.gov.uk/info-for/open-data-users/our-open-data) so let's use one of the functions in pandas to read it in.  This is like opening a data file in an Excel spreadsheet.

In the box below, run the command `oyster = pd.read_csv('data/Nov09JnyExport.csv')` if it's not already in there.  This simply tells Python to read in the data file from a folder and store it under the variable "oyster".

In [23]:
oyster = pd.read_csv('data/Nov09JnyExport.csv')

This dataset provides a 5% sample of all Oyster card journeys performed in a week during November 2009 on bus, Tube, DLR, TfL Rail and London Overground.

For the record, it contains over two million lines of data.  Excel has a limit of 1,048,576 rows per worksheet, so what we're doing now would not be possible in a single Excel sheet!

Just to prove it, run the command `oyster.count()` in the box below.  You should see the name of each column in the data listed, with the number of non-empty values in that column next to it.  For a column with no gaps, that number should be 2,623,487, which is the number of rows.

In [None]:
oyster.count()
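
To see what `read_csv` and `count()` are doing on a scale you can check by eye, here is a minimal sketch using a tiny made-up dataset. The column names _EntTime_ and _FinalProduct_ match two columns used later in this notebook, but the three rows are invented for illustration, and the assumption that _EntTime_ holds a number is ours. `count()` reports how many non-missing values each column holds, which matches the number of rows only when a column has no gaps.

```python
import io
import pandas as pd

# A tiny made-up CSV standing in for the real file; the values are invented
sample_csv = io.StringIO(
    "EntTime,FinalProduct\n"
    "480,Staff pass\n"
    "495,Freedom Pass (Elderly)\n"
    ",Staff pass\n"   # this row has no entry time
)

sample = pd.read_csv(sample_csv)

print(len(sample))      # total number of rows
print(sample.count())   # non-missing values in each column
```

Here `len(sample)` gives 3 rows, while `count()` gives 2 for _EntTime_ because one value is missing.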

OK, now for the real magic!

One of the columns in the dataset is called _FinalProduct_.  This lists the type of Oyster card used, for example if it's a **Freedom Pass (Elderly)**, a **Staff pass**, or even if it's an **LUL Travelcard** for the year/month/period/week.  Let's see if we can show if there's a pattern for what time of day each type of pass is commonly used.

First, because the charts could look a little messy, let's set the size of the figure by running `mpl.rcParams['figure.figsize'] = (20,20)` in the box below (again, no output):

In [None]:
mpl.rcParams['figure.figsize'] = (20,20)

OK, now we're going to create a histogram for each card type by counting all the transactions within every 15-minute period from midnight to midnight.  

First we call the _oyster_ variable which contains our dataset.  Then we run it through the _hist_ function, which will create the histograms.  Before it does that though, it needs to know a couple of things.

The time of the tap is recorded in the column _EntTime_, so we need to point the function there first by setting `column=` to 'EntTime'.

Next, we want to group by each card type, so we need to point the function to the _FinalProduct_ column where it will get these values.  Setting `by=` to 'FinalProduct' will do this.

Similarly, we want to count these swipes in groups of 15 minutes, and there are 96 periods in 24 hours, so we want to tell it to group all records into 96 buckets, or `bins=` as the function refers to them.

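Finally, we make sure the x-axes are the same for all the histograms by setting `sharex=` to `True`, and we make the charts comparable by also setting `density=` to `True`. This scales each histogram so its bars cover a total area of 1, which means a card type with few journeys can be compared with one that has many.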
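
If you're curious what `bins=` and `density=` mean in practice, the sketch below works through the idea by hand in plain Python. It is a simplification rather than what pandas does internally: the entry times are made up, we assume they are recorded as minutes after midnight, and the bins here always cover the whole day, whereas the `hist` function spreads its bins across the range of the data it is given.

```python
# Made-up entry times, assumed to be in minutes after midnight
entry_times = [485, 490, 500, 512, 1020, 1030]

bin_width = 15                           # minutes in each bin
number_of_bins = 24 * 60 // bin_width    # 1440 minutes / 15 = 96 bins

# Count how many entry times fall into each 15-minute bin
counts = [0] * number_of_bins
for t in entry_times:
    counts[t // bin_width] += 1

# Density: divide each count by (total count * bin width)
# so that the areas of all the bars add up to 1
density = [c / (len(entry_times) * bin_width) for c in counts]
total_area = sum(d * bin_width for d in density)

print(number_of_bins)
print(counts[32], counts[33], counts[34], counts[68])
print(round(total_area, 6))
```

Bin 32 covers 08:00 to 08:15, so the first two made-up times both land there.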

So in the end we have a "formula" which looks like this: 

`oyster.hist(column='EntTime', by='FinalProduct', bins=96, sharex=True, density=True)`

So if you copy/paste that into the box below (if it's not already there), run it and wait about 10 seconds, you should see some output!

In [None]:
oyster.hist(column='EntTime', by='FinalProduct', bins=96, sharex=True, density=True)

And there you have it - you've just processed 2.6 million rows of data into some interesting charts in just a few minutes!

If you have any questions please have a chat with anyone attending the PyR desk at the Analysts' Conference today, or use the [PyR Slack Group](https://tfl-pyr.slack.com/signup) to have a wider conversation - we'd love to have you along!