# Using functions

## Lecture objectives

1. Demonstrate how to create a function that can query an API

In the previous lecture, we experimented with the BART API. Now, let's combine all this code into a function.

If you haven't used functions before, [check out this tutorial](https://swcarpentry.github.io/python-novice-inflammation/08-func/index.html) first.

Why might we use a function?
* If we will use the same code repeatedly (e.g. requesting departures from different BART stations), a function makes our code more concise
* Functions can make the code more readable, by making it modular

Here is the code from the previous lecture that prints departures from 12th St station.

In [None]:
import requests
import json
import pandas as pd
from IPython.display import display

APIkey = 'XXXX'  # replace XXXX with the API key you used in the previous lecture
requestString = 'http://api.bart.gov/api/etd.aspx?cmd=etd&orig=12TH&json=y&key='+APIkey

r = requests.get(requestString)
d = json.loads(r.text)
etd = d['root']['station'][0]['etd']

for e in etd:
    print('\nTrains to {}'.format(e['destination']))
    display(pd.DataFrame(e['estimate']))

We can create a function by indenting the code and adding a `def` statement at the start to define the function.

In [None]:
def getArrivalTimes():
    requestString = 'http://api.bart.gov/api/etd.aspx?cmd=etd&orig=12TH&json=y&key='+APIkey

    r = requests.get(requestString)
    d = json.loads(r.text)
    etd = d['root']['station'][0]['etd']

    for e in etd:
        print('\nTrains to {}'.format(e['destination']))
        display(pd.DataFrame(e['estimate']))

We can now call this function any time we want the latest departure times.

In [None]:
getArrivalTimes()

The advantages of creating a function will become much clearer if we generalize.

Our function above only prints departure times for 12th St station. What if we want to tell the function which station we are interested in? To do this, we need to add an *argument*. The argument becomes a variable that is local to the function (i.e., it disappears once the function finishes).

Let's call this argument `station`. We then insert the station into the string that we pass to the API, `requestString`. Note the use of `.format()` to put the station and API key into the string.
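
To see what `.format()` does on its own, here is a minimal sketch that builds the request string without calling the API. The helper name `build_etd_url` and the placeholder key `'XXXX'` are assumptions for illustration; they are not part of the lecture code.

```python
def build_etd_url(station, api_key):
    """Return the etd request URL for one station (illustrative helper)."""
    # each {} is replaced, in order, by the arguments passed to .format()
    template = 'http://api.bart.gov/api/etd.aspx?cmd=etd&orig={}&json=y&key={}'
    return template.format(station, api_key)


url_12th = build_etd_url('12TH', 'XXXX')  # 'XXXX' is a placeholder, not a real key
url_warm = build_etd_url('WARM', 'XXXX')
print(url_12th)
print(url_warm)
```

Only the `orig=` part of the URL changes between the two calls, which is exactly what the `station` argument controls.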

In [None]:
def getArrivalTimes(station):
    requestString = 'http://api.bart.gov/api/etd.aspx?cmd=etd&orig={}&json=y&key={}'.format(station, APIkey)

    r = requests.get(requestString)
    d = json.loads(r.text)
    etd = d['root']['station'][0]['etd']

    for e in etd:
        print('\nTrains to {}'.format(e['destination']))
        display(pd.DataFrame(e['estimate']))

getArrivalTimes('12TH')
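
The argument exists only inside the function, and this can be checked directly. In the sketch below, `describe` and `station_code` are hypothetical names chosen for illustration, and no API call is made.

```python
def describe(station_code):
    # station_code exists only while the function is running
    return 'Requesting departures for {}'.format(station_code)


message = describe('12TH')
print(message)

try:
    station_code  # this name was never defined outside the function
    leaked = True
except NameError:
    leaked = False

print('station_code visible outside the function?', leaked)
```

The value we pass in (`'12TH'`) is available inside `describe`, but the name `station_code` does not exist once the call has finished.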

According to the [documentation](https://api.bart.gov/docs/stn/stns.aspx), the `stns` command returns a list of stations.

Let's fetch that list, then try another station using our handy function.

In [None]:
requestString = 'http://api.bart.gov/api/stn.aspx?cmd=stns&key={}&json=y'.format(APIkey)
r = requests.get(requestString)
r.text
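
The raw text is hard to scan, so it may help to pull out just the station abbreviations. The sketch below assumes the response nests the stations under `root`, then `stations`, then `station`, with each entry carrying `name` and `abbr` fields; check this against your own output. The function name `stationAbbreviations` is hypothetical, and the sample text is hand-written rather than real API output.

```python
import json

# Hand-written sample in the ASSUMED shape of the stns response (not real output)
sample_text = '''
{"root": {"stations": {"station": [
    {"name": "12th St. Oakland City Center", "abbr": "12TH"},
    {"name": "Warm Springs/South Fremont", "abbr": "WARM"}
]}}}
'''


def stationAbbreviations(text):
    """Map each station abbreviation to its full name."""
    d = json.loads(text)
    stations = d['root']['stations']['station']  # assumed nesting
    return {s['abbr']: s['name'] for s in stations}


abbrs = stationAbbreviations(sample_text)
for abbr, name in abbrs.items():
    print(abbr, name)
```

With a real response, `stationAbbreviations(r.text)` would play the same role, provided the structure matches the assumption above.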

In [None]:
getArrivalTimes('WARM')

So far, our function only prints the output. We might want it to return the dataframe rather than (or as well as) printing it.

Let's do this by creating a list of dataframes, one per destination. To each one, we add a `destination` column (since the destination isn't a column in the `estimate` data). Then, we can use `pd.concat()` to combine them into a single dataframe.

In [None]:
def getArrivalTimes(station):
    requestString = 'http://api.bart.gov/api/etd.aspx?cmd=etd&orig={}&json=y&key={}'.format(station, APIkey)

    r = requests.get(requestString)
    d = json.loads(r.text)
    etd = d['root']['station'][0]['etd']

    df_list = []
    for e in etd:
        # create the dataframe
        df = pd.DataFrame(e['estimate'])
        # add the column with the destination
        df['destination'] = e['destination']
        # add it to the end of the list
        df_list.append(df)  # df_list += [df] would also have worked

    # now that the loop has finished, return the concatenated dataframes
    bigDf = pd.concat(df_list)
    return bigDf

getArrivalTimes('12TH')
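
The list-then-concatenate pattern can also be tried without calling the API. This sketch uses a hand-written `etd` list in the shape the function expects; the values are made up. The `ignore_index=True` option is an optional extra, not used in the lecture function, that renumbers the rows of the combined dataframe from 0.

```python
import pandas as pd

# made-up data in the same shape as d['root']['station'][0]['etd']
etd = [
    {'destination': 'Richmond',
     'estimate': [{'minutes': '3', 'platform': '1'},
                  {'minutes': '18', 'platform': '1'}]},
    {'destination': 'Millbrae',
     'estimate': [{'minutes': '7', 'platform': '2'}]},
]

df_list = []
for e in etd:
    df = pd.DataFrame(e['estimate'])      # one row per estimated departure
    df['destination'] = e['destination']  # same value repeated on every row
    df_list.append(df)

# stack the per-destination dataframes and renumber the rows
combined = pd.concat(df_list, ignore_index=True)
print(combined)
```

Without `ignore_index=True`, each piece keeps its own row labels, so the combined dataframe would contain repeated index values (0, 1, 0 in this example).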

<div class="alert alert-block alert-info">
<strong>Exercise:</strong> Adapt the function to add a column with the origin station too.
</div>

<div class="alert alert-block alert-info">
<h3>Key Takeaways</h3>
<ul>
  <li>Functions are simple to create, and help to organize your code in a logical way.</li>
</ul>
</div>