# **Lab 4 — Dictionaries and functions**
---

## Introduction

This week we're going to learn about dictionaries and functions in Python. However, before we delve into this topic, we're going to do a refresher exercise on lists and iteration.

Your deliverable for this lab will be this notebook, with **"deliverables" completed as requested below**. The "exercises" are exploratory and not graded. Please rename the notebook from `lab_04.ipynb` to `<last_name>_lab_04.ipynb` prior to submission. Download the file using **File $\rightarrow$ Download**. Submit it to Canvas under the Lab 4 assignment.

Please note that Deliverable 4 is only required for students enrolled in 636. Students in 436 are welcome to complete the deliverable and will be given feedback on their work if they do, but it will not factor into their grade.

## Resources

[Dictionaries](https://www.w3schools.com/python/python_dictionaries.asp)  
[Functions](https://www.w3schools.com/python/python_functions.asp)

## Exercise I: Review

We're going to combine everything that we've learned so far and add something extra. We will use the [`open()`](https://docs.python.org/3/library/functions.html#open) function to open a data file consisting of longitudes, latitudes, and station names. `open()` returns a "file object", which provides several methods for working with the file. We will use the `read()` method, which returns the entire file contents as a single string (if called without an argument). String objects provide another [set of methods](https://docs.python.org/3/library/stdtypes.html#string-methods). Here, we'll use the `split()` method to split the single string that contains all of the file contents into a list of strings, using any whitespace (spaces, tabs, newlines) as a separator.
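As a small illustration of what `split()` does to the string returned by `read()`, the sketch below applies it to a short string that stands in for file contents. The string and its values are made up for illustration and are not taken from the data file.

```python
# Hypothetical stand-in for the string that f.read() would return
text = "10.5 20.25 STA1\n-30.0\t40.75 STA2\n"

# split() with no argument splits on any run of whitespace
# (spaces, tabs, newlines) and drops the whitespace itself
tokens = text.split()
print(tokens)
```

Note that every element of `tokens` is a string, including the ones that look like numbers.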

First, download the text file `PBO_coords.txt` by running the code cell below.  
(Note: `curl` is a command-line tool, not Python! We will learn more about this later.)

In [None]:
#We're again making sure that all the output is displayed 
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

#this downloads the data
!curl -O http://www.grapenthin.org/teaching/geop501/download/PBO_coords.txt

Then open the file using `open()`:

In [None]:
with open('PBO_coords.txt') as f:
    data = f.read().split()
    # At this point, we have the whole contents of "PBO_coords.txt" in the variable "data" and
    # we can close the file, by leaving the "with" statement

data[:15]  # Only print the first entries since this is a large file!

We've wrapped this in the (maybe) odd-looking `with` statement. This ensures that the file is closed when the `with` block ends. Why is this important? Apart from being good practice, an open file ties up system resources, and on some systems other programs cannot write to a file while it is still open. So if you (or someone else) wanted to add something to a file, but your program didn't close it when it was done and kept running, nobody else might be able to write to that file.
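To check that a file really is closed once the `with` block ends, you can inspect the file object's `closed` attribute. The following is a minimal sketch; the file name `with_demo.txt` is a hypothetical throwaway file created only for this example.

```python
import os
import tempfile

# Hypothetical throwaway file in the system's temporary directory
path = os.path.join(tempfile.gettempdir(), "with_demo.txt")

with open(path, "w") as f:
    f.write("hello")
    inside = f.closed   # False: the file is still open inside the block

outside = f.closed      # True: leaving the block closed the file
print(inside, outside)

os.remove(path)  # clean up the throwaway file
```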

## Deliverable 1 <font color='red'>(30 points)</font>

For this deliverable, you are going to organize the data and change it from a normal list into a nested list. To do so, you will have to iterate over all items in `data`, organize longitude, latitude, and station name into a sub-list, and add this sublist to `newlist` such that:

```python
print(newlist[0])
```

Should output:

```
['-174.204754770', '52.209504660', 'AB01']
```

Instead of:

```
-174.204754770
```

Why is this useful? You can access all the information for one station at once, rather than always having to keep in mind how many fields belong to a station and what offset you need to calculate to get to those data.
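The difference in access can be sketched with a tiny, made-up example. The values below are placeholders rather than station data, and both lists are written out by hand.

```python
# Made-up values for illustration only
flat = ['1.0', '2.0', 'P1', '3.0', '4.0', 'P2']
nested = [['1.0', '2.0', 'P1'], ['3.0', '4.0', 'P2']]

# Flat list: you have to know that the second record starts at offset 3
second_from_flat = flat[3:6]

# Nested list: the second record is simply element 1
second_from_nested = nested[1]

print(second_from_flat == second_from_nested)
```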

Add a **new code cell** below with your code that generates `newlist`.

## Exercise II: Dictionaries

Now we're going to introduce another very useful datatype: the [dictionary](https://docs.python.org/3/tutorial/datastructures.html#dictionaries). Dictionaries are similar to lists, but there are a few differences. First, lists are ordered sequences whose items are accessed by position, while dictionaries are mappings from keys to values (since Python 3.7 they remember insertion order, but you still cannot access entries by position). Second, lists are indexed by integers and dictionaries are indexed by keys. Dictionary keys can be strings, numbers, or other hashable (immutable) objects such as tuples or datetime objects. Try this code (remember, **copy/pasting each line works best!**) to see how it works:

```python
x = {'a': 1, 'b': 100, 3: 4}
x['a']
x[3]
x[1]  # This line raises a KeyError: there is no key 1
y = {'numbers': [0, 1, 2, 3, 4, 5, 6, 7, 8], 'letters': ['a', 'b', 'c', 'd']}
y['numbers']
y['letters']
y['symbols'] = ['!', '@', '#']
y
```

Make sure that you understand how this code works and why `x[1]` doesn't work before you move on.
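If you want to check whether a key exists before using it, the `in` operator and the dictionary's `get()` method are two common options. This is a small sketch reusing the dictionary `x` from the example; the fallback value `'missing'` is an arbitrary choice for illustration.

```python
x = {'a': 1, 'b': 100, 3: 4}

has_one = 1 in x                  # False: 1 is not a key of x
has_three = 3 in x                # True: 3 is a key of x
fallback = x.get(1, 'missing')    # returns the default instead of raising

try:
    x[1]
except KeyError as err:
    error_key = err.args[0]       # the key that could not be found
    print("KeyError for key:", error_key)
```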

In [None]:
# Try out the above here!


## Deliverable 2 <font color='red'>(35 points)</font>

We can use dictionaries to better organize our coordinate data from the first deliverable. Instead of searching for or having to remember indices of station names to access their coordinates, you can organize your data into a dictionary so that:

```python
station_dict['AB01']
```

Will return:

```
['-174.204754770', '52.209504660']
```

This means you can use the station name directly as a key to look up information in your data structure - a much more intuitive way to look up data, don't you think? 

For your second deliverable, turn all of the data from `PBO_coords.txt` into a dictionary that works like the `station_dict` in the example above. You can start with the list `data` from Exercise I if you'd like.

Add a **new code cell** below with your code that generates `station_dict`.

## Exercise III: Functions

A function is a block of code that can be reused to perform a specific task. This may sound similar to iteration, but functions differ in that they may be called at any time during the program, with different parameters, and instead of running multiple times like a loop, they only run once for every time that you call them.

There are built-in functions (some of which we have already used) and user-defined functions. Here are some of the built-in functions that we have already used:

```python
int()
float()
str()
type()
range()
```

Built-in functions, like user-defined functions, take in an input and produce an output. The difference is that we do not have to separately define these functions, since they were already written by the creators of Python to solve simple problems.

As you develop larger programs, it is useful to divide a program into user-defined functions. There are several reasons why you should do this:

* Sectioning your program into functions allows you to name groups of commands, which makes your program easier to read, understand, and debug.
* Functions can make a program smaller by eliminating repetitive code. If you make future changes, you only have to make them in one spot.
* Well-designed functions are often useful outside of the program they're written for. Once you write and debug one, you can reuse it in future programs.
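As a brief preview of the second point, the sketch below wraps a repeated conversion in one small function. The name `to_floats` is hypothetical and chosen only for this illustration.

```python
def to_floats(lon_text, lat_text):
    """Convert two coordinate strings to a (longitude, latitude) tuple of floats."""
    return float(lon_text), float(lat_text)

# The conversion logic lives in one place and is reused for each pair
first = to_floats('-174.204754770', '52.209504660')
second = to_floats('-121.9156', '37.8786')
print(first, second)
```

If the conversion ever needs to change (for example, to round the values), only the body of `to_floats` has to be edited.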

Let's take a look at an example user-defined function:

In [None]:
def larger(x, y):
    if x > y:
        return x
    elif y > x:
        return y
    else:
        return None

larger(5, 34)

Feel free to try out different input values to `larger` in the code cell above! The `return` statement defines the output of the function. This function looks for the larger of the two numbers and returns it as the output (or returns `None` if they are equal). Now, try using `larger(5)` in the cell above. It will give you an error because the function is defined to take exactly two arguments. It is important to get used to reading and understanding such error messages. Python is actually quite helpful here and tells us that `y` is missing.
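If you would like to look at the error message without stopping the cell, one option is to catch the `TypeError` and print it. This is only a sketch for exploring the message; `larger` is repeated here so that the snippet runs on its own.

```python
def larger(x, y):
    if x > y:
        return x
    elif y > x:
        return y
    else:
        return None

try:
    larger(5)            # only one argument: y is missing
except TypeError as err:
    message = str(err)
    print(message)
```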

## Deliverable 3 <font color='red'>(35 points)</font>

Turn the code from Deliverable 2 into a function (if you solved Deliverable 2 using the data construct from Deliverable 1, you need to use both solutions here). The function should be defined as:

```python
def file_to_dict(filename):
```

The input for the function should be the `filename` and the output should be the dictionary of station names and coordinates. Define it such that, when run for a file named `BARD_coords2.txt`, the code

```python
station_dict = file_to_dict('BARD_coords2.txt')
print(station_dict['DIAB'])
```

prints:

```
['-121.9156', '37.8786']
```

First, run the below code cell to download `BARD_coords2.txt`.

In [None]:
!curl -O grapenthin.org/teaching/geop501/download/BARD_coords2.txt

Then, **define your function in the cell provided below**. Note that you won't have to write much new code here: you can use the code we've provided above to read the `filename` using `open()`, and then you may adapt the code you wrote for Deliverable 2, or for Deliverables 1 and 2.

In [None]:
def file_to_dict(filename):
    pass  # Replace this line with your lines of code!

Test your function by running the following. **Note that if you make changes to the function above, you need to re-run the code cell containing the function code for the function to be re-defined before you run the below again!**

In [None]:
#RERUN YOUR file_to_dict FUNCTION DEFINITION CELL IF YOU MAKE CHANGES TO IT!

station_dict = file_to_dict('BARD_coords2.txt')
print(station_dict['DIAB'])

`DIAB` is a GPS station on Mount Diablo in the San Francisco Bay Area in California: https://seismo.berkeley.edu/station_book/diab.html

## Exercise IV: Multiple functions

It is possible to call other functions within a function. This can be very useful for complex programs. Here's an example program which combines an old function that we used with another function:

In [None]:
def larger(x, y):
    if x > y:
        return x
    else:
        return y

def largest_comparison(numlist):
    """Iterate through a list of numbers (numlist) and print the larger of numlist[i] and numlist[i + 1]."""
    for i in range(len(numlist) - 1):
        print(larger(numlist[i], numlist[i + 1]))  # Here we call the function larger() we defined above

# Call the function
numlist = [4, 1, 8, 3, 4, 6]
largest_comparison(numlist)

Experiment with different `numlist` inputs above to make sure things make sense.

## Deliverable 4 - <font color='red'>Only Required for 636 Students (20 points)</font>

For this deliverable, reuse the code from Deliverable 3 by writing a function that counts the number of stations. The function should be defined as:

```python
def number_of_stations(file):
```

Your function should call `file_to_dict()` internally. The idea is to count the number of entries (stations) in the dictionary. You can read the dictionary documentation [here](https://www.tutorialspoint.com/python/python_dictionary.htm) to find an easy way to do this. Again, this function should be **defined in the code cell provided below** such that

```python
print(number_of_stations('BARD_coords2.txt'))
```

prints `31`.

In [None]:
def number_of_stations(file):
    pass  # Replace this line with your lines of code!

print(number_of_stations('BARD_coords2.txt'))