# Saving our Work With Functions

### Introduction

Ok, at this point we have learned almost everything we need to really go forth and code.  And we did some really good work in the process.

But if we're going to put this code to use, and build some nice projects, we'll need to store some of our procedures in functions.

### Saving our Work

Now we've already seen how to save our work with variables.

In [1]:
cities = ['nyc', 'los angeles', 'chicago']

The code above **is something**.  It is a list, and we stored it as such.

But how do we save code that **does something**?  For example, our code below goes to Wikipedia, gathers the information, and then converts it to a list of dictionaries.  We may want to save that process so we can easily do it again and again.

In [2]:
import pandas as pd
url = 'https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population'
tables = pd.read_html(url)
cities_table = tables[4]
cities = cities_table.to_dict('records')
cities[:2]

[{'2022 rank': 1,
  'City': 'New York[d]',
  'State[c]': 'New York',
  '2022 estimate': 8335897,
  '2020 census': 8804190,
  'Change': '−5.32%',
  '2020 land area': '300.5\xa0sq\xa0mi',
  '2020 land area.1': '778.3\xa0km2',
  '2020 population density': '29,298/sq\xa0mi',
  '2020 population density.1': '11,312/km2',
  'Location': '.mw-parser-output .geo-default,.mw-parser-output .geo-dms,.mw-parser-output .geo-dec{display:inline}.mw-parser-output .geo-nondefault,.mw-parser-output .geo-multi-punct,.mw-parser-output .geo-inline-hidden{display:none}.mw-parser-output .longitude,.mw-parser-output .latitude{white-space:nowrap}40°40′N 73°56′W\ufeff / \ufeff40.66°N 73.94°W'},
 {'2022 rank': 2,
  'City': 'Los Angeles',
  'State[c]': 'California',
  '2022 estimate': 3822238,
  '2020 census': 3898747,
  'Change': '−1.96%',
  '2020 land area': '469.5\xa0sq\xa0mi',
  '2020 land area.1': '1,216.0\xa0km2',
  '2020 population density': '8,304/sq\xa0mi',
  '2020 population density.1': '3,206/km2',
  'Lo

If we want to save code that does something, we can wrap it in a function.

> Let's just do it.  We'll explain this code later.

In [3]:
def gather_cities():
    url = 'https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population'
    tables = pd.read_html(url)
    cities_table = tables[4]
    cities = cities_table.to_dict('records')
    return cities

Now that it's in a function, we can execute this code whenever we like.  We do so by typing the name of the function followed by parentheses: `function_name()`.

> Want to go to Wikipedia and scrape the webpage?  Coming right up.

In [4]:
collected_cities = gather_cities()
collected_cities[:1]

[{'2022 rank': 1,
  'City': 'New York[d]',
  'State[c]': 'New York',
  '2022 estimate': 8335897,
  '2020 census': 8804190,
  'Change': '−5.32%',
  '2020 land area': '300.5\xa0sq\xa0mi',
  '2020 land area.1': '778.3\xa0km2',
  '2020 population density': '29,298/sq\xa0mi',
  '2020 population density.1': '11,312/km2',
  'Location': '.mw-parser-output .geo-default,.mw-parser-output .geo-dms,.mw-parser-output .geo-dec{display:inline}.mw-parser-output .geo-nondefault,.mw-parser-output .geo-multi-punct,.mw-parser-output .geo-inline-hidden{display:none}.mw-parser-output .longitude,.mw-parser-output .latitude{white-space:nowrap}40°40′N 73°56′W\ufeff / \ufeff40.66°N 73.94°W'}]

So you can see that we were able to store our procedure of collecting city information in our `gather_cities` function.

This is very useful, because it allows us to think of our programs as tasks.  For example, first gather the list of cities, then select the names and populations, and then plot our data.  

> Remember we said a lot of coding is breaking things into steps?  Functions are a great way to do that.
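Here is a small sketch of that idea.  So that it runs without visiting Wikipedia, it uses a made-up list of two cities, and the names `gather_sample_cities` and `select_sample_names` are just ones we chose for illustration.

```python
# Task 1: gather the cities (hypothetical sample data, not the real scrape)
def gather_sample_cities():
    return [
        {'City': 'New York', '2020 census': 8804190},
        {'City': 'Los Angeles', '2020 census': 3898747},
    ]

# Task 2: select the names from the gathered cities
def select_sample_names():
    names = []
    for city in sample_cities:
        names.append(city['City'])
    return names

# The program now reads like a list of tasks
sample_cities = gather_sample_cities()
sample_names = select_sample_names()
print(sample_names)  # ['New York', 'Los Angeles']
```

Each task has a name, so the last three lines read almost like the plain-English steps.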

Ok, let's learn how to write a function.

### Function mechanics

Working with functions involves two steps:
1. Defining our function
2. Then executing our function.

* Defining our function

We define our function with the following pattern.

In [5]:
def function_name():
    return 'data'

> Press shift + enter on the cell above.  And then the cell below.

Notice that when we define a function, we do not see an output.  This is similar to how we do not see an output when we assign a variable.  We need to execute the function to see an output.  

> Press shift + enter on the cell below.

In [6]:
function_name()

'data'

Let's focus in on the first line where we defined our function: `def function_name():`.

* `def` is how we tell Python we are about to define a function.
* The `function_name` is how we'll refer to the function.
* And then we end our first line with parentheses and a colon `():`.
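Putting those three pieces together, here is a tiny sketch using a made-up function name, `say_hello`.

```python
# `def`, then the name, then `():` to end the first line
def say_hello():
    return 'hello'

# Typing the name followed by parentheses executes the function
greeting_text = say_hello()
print(greeting_text)  # hello
```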

Now it's your turn.  Define a function called `collect_data`.  We wrote the second line `return 'data'` for you.

In [7]:
# write code here
def collect_data():
    return 'data'

> **You can check** that you did it correctly by pressing `shift + return` on the cell above, **and then** on the cell below.  If you did it correctly, you will see the word data.  

In [8]:
collect_data()
# 'data'

'data'

Ok, so we just saw how to write the first line of a function.  Now let's talk about that second line `return 'data'`.

In [9]:
def function_name():
    return 'data'

The indented lines below the first line are called the body of the function.  In the cells that follow, the lines that assign `greeting` and `name` are in the body of the function.  The body of the function can be as long as we like.  But it's best to keep our functions short, ideally under five lines (not counting the `def` line).

### Functions are a dungeon

There is something interesting about functions. Functions trap everything inside of them, like the walls of a dungeon.

In [10]:
def function_name():
    greeting = 'hello'
    name = 'susan'

> Press shift + enter on the cells above and below.  We'll see that the cell below results in an error.

In [11]:
greeting

NameError: name 'greeting' is not defined

So you'll see that even though we defined the variable greeting above, it is not available.  This is because it is only available inside the walls of the function.

For a value to be released from the function, we must catapult it over the walls with the word `return` followed by what we want returned.

In [12]:
def function_name():
    greeting = 'hello'
    name = 'susan'
    return name

In [13]:
function_name()

'susan'

So now, `susan` was thrown over the walls of the function.
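One way to see why `return` matters: a function with no `return` still runs, but nothing comes back over the walls (Python hands back the special value `None`).  The sketch below compares the two; the function names are made up for this example.

```python
def no_catapult():
    trapped = 'hello'      # stays inside the walls

def with_catapult():
    trapped = 'hello'
    return trapped         # thrown over the walls

nothing = no_catapult()
something = with_catapult()

print(nothing)    # None
print(something)  # hello
```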

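> Notice that the code inside the function must be indented.  We can press tab or type the spaces ourselves (four spaces is common, and two also works), as long as we are consistent within the function.  The indentation is how we indicate that something is inside of the function.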
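As a quick illustration (the names here are hypothetical), the indented lines below belong to the function, while the line that starts at the left edge does not.

```python
def indented_example():
    inside = 'inside the function'    # indented: part of the function
    return inside                     # indented: part of the function

outside = 'outside the function'      # not indented: ordinary code

print(indented_example())  # inside the function
print(outside)             # outside the function
```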

In [14]:
def function_name():
    trapped_inside = 'hello'
    catapulted_over = 'susan'
    return catapulted_over

So that is our pattern for a function.

In [15]:
def function_name():
    body_of_function = 'hello'
    return body_of_function + ' world'

In [16]:
function_name()

'hello world'

Now, in the cell below, write a function called `catapult` that returns the word `slime`.

In [17]:
# write function here
def catapult():
  return 'slime'

In [18]:
catapult()
# slime

'slime'

## Back to our project

Ok, so how can this help us?  Well, functions allow us to store an entire procedure, and then name that procedure.  Once written, we can largely forget about how the function works.

All we care about is what the function does, which is execute a procedure and then throw something over the walls.

So we can think of a function like our cellphone: we only need to know how the wires underneath work when something breaks.  Otherwise, we can just call the function and get an output.

Ok, so let's wrap some more code in functions, so that we can move further into that push-a-button, get-an-output mode.

To do so, we wrap our ordinary code with the beginning line `def name_of_function():`.  And we end our function by returning an output.

> Here is our original code.

In [19]:
url = 'https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population'
tables = pd.read_html(url)
cities_table = tables[4]
cities = cities_table.to_dict('records')

> And here is that code wrapped in a function.

In [20]:
def gather_cities():
    url = 'https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population'
    tables = pd.read_html(url)
    cities_table = tables[4]
    cities = cities_table.to_dict('records')
    return cities

In [21]:
cities = gather_cities()
cities[:1]

[{'2022 rank': 1,
  'City': 'New York[d]',
  'State[c]': 'New York',
  '2022 estimate': 8335897,
  '2020 census': 8804190,
  'Change': '−5.32%',
  '2020 land area': '300.5\xa0sq\xa0mi',
  '2020 land area.1': '778.3\xa0km2',
  '2020 population density': '29,298/sq\xa0mi',
  '2020 population density.1': '11,312/km2',
  'Location': '.mw-parser-output .geo-default,.mw-parser-output .geo-dms,.mw-parser-output .geo-dec{display:inline}.mw-parser-output .geo-nondefault,.mw-parser-output .geo-multi-punct,.mw-parser-output .geo-inline-hidden{display:none}.mw-parser-output .longitude,.mw-parser-output .latitude{white-space:nowrap}40°40′N 73°56′W\ufeff / \ufeff40.66°N 73.94°W'}]

Your turn.

This time we'll work with the `for` loop that turns our list of dictionaries into a list of populations.  Below we'll create a new function called `get_populations` that returns the list of `populations`.

Do so in the following steps:  

1. Click at the top of the cell and drag your cursor down to the bottom of the cell, so that the entire cell turns purple.  Then press `tab` to indent the code.

2. Now we need another line at the top to define our function.  Place your cursor just before the `p` in `populations = []` and press enter.

3. In the new line that we created above the statement `populations = []`, write the name of the function beginning with `def` and ending with `():` and named `get_populations`.  Remember that our first line **should not** be tabbed in.  In other words, the `d` of `def` should be touching the border of our gray cell.

4. Then end your function with the return value.
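If it helps, here are the same four steps applied to a smaller, made-up loop.  The list `sample_words` and the function `get_lengths` are hypothetical names for this sketch.

```python
sample_words = ['nyc', 'chicago']

# Before: ordinary code at the left edge
lengths = []
for word in sample_words:
    lengths.append(len(word))

# After: the same code indented, with a `def` line on top
# and a `return` at the end
def get_lengths():
    lengths = []
    for word in sample_words:
        lengths.append(len(word))
    return lengths

print(get_lengths())  # [3, 7]
```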

In [25]:
def get_populations():
    populations = []

    for city in cities:
        city_pop = city['2020 census']
        populations.append(city_pop)
    return populations



In [26]:
pops = get_populations()
# pops

Now do the same thing with the next block of code.  Write a function called `get_names` that returns the list of `city_names`.

In [27]:
def get_names():
    city_names = []

    for city in cities:
        city_name = city['City']
        city_names.append(city_name)
    return city_names

In [28]:
names = get_names()
names[:2]

['New York[d]', 'Los Angeles']

## Wrapping Up

When we're finished with our code, our function definitions will look like the following.

In [32]:
import pandas as pd
def gather_cities():
    url = 'https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population'
    tables = pd.read_html(url)
    cities_table = tables[4]
    cities = cities_table.to_dict('records')
    return cities

def get_populations():
    populations = []

    for city in cities:
        city_pop = city['2020 census']
        populations.append(city_pop)
    return populations

def get_names():
    city_names = []

    for city in cities:
        city_name = city['City']
        city_names.append(city_name)
    return city_names

And we can call all of our code in just a few lines.

In [33]:
cities = gather_cities()
pops = get_populations()
city_names = get_names()

In [34]:
pops[:2]

[8804190, 3898747]

In [35]:
city_names[:2]

['New York[d]', 'Los Angeles']

## Summary

In this lesson, we learned about functions.  We saw that functions allow us to save a procedure inside the walls of a function.  We do so with the following pattern.

```python
def function_name():
    body_of_function = 'hello'
    return body_of_function + ' world'
```

Once we define the function, we can execute it with `function_name()`, and we are given the return value of the function.

Functions give names to our complicated code, and allow us to summarize complicated code in just a few steps.

```python
cities = gather_cities()
pops = get_populations()
city_names = get_names()
```

### References

Credit to [John Resig](https://johnresig.com/) for the catapult analogy, and for a bunch of other amazing things.


### Answers

In [None]:
def collect_data():
    return 'data'

In [None]:
def catapult():
    return 'slime'

In [None]:
def get_populations():
    populations = []

    for each_city in cities:
        city_pop = each_city['2020 census']
        populations.append(city_pop)
    return populations

In [None]:
get_populations()[:2]

[8804190, 3898747]

In [None]:
def get_names():
    city_names = []

    for each_city in cities:
        city_name = each_city['City']
        city_names.append(city_name)
    return city_names

In [None]:
get_names()[:2]

['New York[d]', 'Los Angeles']