<a href="https://www.hydroffice.org/epom/"><img src="images/000_000_epom_logo.png" alt="ePOM" title="Open ePOM home page" align="center" width="12%"></a>

<a href="https://piazza.com/e-learning_python_for_ocean_mapping/summer2019/om000/home"><img src="images/help.png" alt="Piazza.com" title="Ask questions on Piazza.com" align="right" width="10%"></a>
# Dictionaries

It is time to add a new, useful container to your toolkit: the `dict`.

Each item in a `dict` is a pair made of a key and a value. For example, a `dict` can map a [chemical symbol](https://en.wikipedia.org/wiki/Symbol_(chemistry)) to the name of the chemical element (e.g., `H` to `Hydrogen`).

<img align="left" width="6%" style="padding-right:10px;" src="images/key.png">

A `dict` maps a set of indices, called **keys**, to a set of values.

## How to create and populate a `dict`

To create a `dict`, you can call the `dict()` constructor. Then, you can start to add items to the `dict` by using square brackets as in the code below: 

In [None]:
chem_dict = dict()
chem_dict["H"] = "Hydrogen"
chem_dict["He"] = "Helium"
chem_dict["Li"] = "Lithium"
chem_dict["Be"] = "Beryllium"
chem_dict["B"] = "Boron"

print(chem_dict)

Printing a `dict` will show that:

- The items are between curly brackets (i.e., `{`, `}`). 
- The items are separated by a comma (e.g., `"Li": "Lithium"` is an item). 
- For each item, there are two parts separated by a `:`: the key on the left (e.g., `"Li"`) and the value on the right (e.g., `"Lithium"`).
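
The curly-bracket notation shown by `print()` can also be typed directly to create a `dict` in one step. The snippet below is a minimal sketch (the name `chem_dict_literal` is only illustrative) that builds the same content both ways and reads a value back by its key:

```python
# Create and populate a dict in one step with the curly-bracket notation.
chem_dict_literal = {"H": "Hydrogen", "He": "Helium", "Li": "Lithium"}

# Retrieve a value by passing its key in square brackets.
print(chem_dict_literal["Li"])

# Adding the items one by one gives an equal dict.
chem_dict = dict()
chem_dict["H"] = "Hydrogen"
chem_dict["He"] = "Helium"
chem_dict["Li"] = "Lithium"
print(chem_dict == chem_dict_literal)
```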


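## The order of items in a `dict`

When you print the content of a dictionary, you may see the items presented in an order that differs from the one that you used to populate the `dict`. This can happen with Python versions older than 3.7 and it is **not** an error: in those versions, the order of the items in a `dict` is not guaranteed.

<img align="left" width="6%" style="padding-right:10px;" src="images/key.png">

Before Python 3.7, a `dict` is an **unordered** container: the order of items is not guaranteed to be preserved. Since Python 3.7, a `dict` preserves the insertion order, but its items are still accessed by key, not by position.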

<img align="left" width="6%" style="padding-right:10px;" src="images/info.png">

If you need to preserve the order of the items, Python provides a specialized dictionary called [`OrderedDict`](https://docs.python.org/3.6/library/collections.html?highlight=ordereddict#ordereddict-objects).
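
As a minimal sketch of how an `OrderedDict` can be used (the variable name `ordered_chem_dict` is only illustrative), the items are added exactly as with a plain `dict`, and the keys come back in insertion order:

```python
from collections import OrderedDict

# An OrderedDict is populated like a plain dict.
ordered_chem_dict = OrderedDict()
ordered_chem_dict["H"] = "Hydrogen"
ordered_chem_dict["He"] = "Helium"
ordered_chem_dict["Li"] = "Lithium"

# The keys are returned in the same order in which they were inserted.
print(list(ordered_chem_dict.keys()))
```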

<img align="left" width="6%" style="padding-right:10px;" src="images/test.png">

Populate a dictionary that maps symbols to sediment type based on the [International Scale](https://en.wikipedia.org/wiki/Grain_size#International_scale) (e.g., `"Cl"` for `"Clay"`).

In [None]:
int_scale_dict = dict()
int_scale_dict["LBo"] = "Large boulder"
int_scale_dict["Bo"] = "Boulder"
int_scale_dict["Co"] = "Cobble"
int_scale_dict["CGr"] = "Coarse gravel"
int_scale_dict["MGr"] = "Medium gravel"
int_scale_dict["FGr"] = "Fine gravel"
int_scale_dict["CSa"] = "Coarse sand"
int_scale_dict["MSa"] = "Medium sand"
int_scale_dict["FSa"] = "Fine sand"
int_scale_dict["CSi"] = "Coarse silt"
int_scale_dict["MSi"] = "Medium silt"
int_scale_dict["FSi"] = "Fine silt"
int_scale_dict["Cl"] = "Clay"

print(int_scale_dict)

In [None]:
int_scale_dict = dict()

print(int_scale_dict)

***

## Comparison between `dict` and `list`

A dictionary differs from a list in several respects:

| Topic | List  | Dictionary |
| :-----| :---- | :--------- |
| Brackets | Square brackets: `[`, `]` | Curly brackets: `{`, `}` |
| Empty Constructor | `list()`, `[]` | `dict()`, `{}` |
| Indexing | Indices are only integers (`int`) | The indices can be of (almost) any type  |
| Ordered | Yes (The order of items is fixed.) | Insertion order since Python 3.7 (Before that, the order of items is unpredictable.) |
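
The indexing difference in the table can be sketched as follows. The names and values (`depths_list`, `depths_dict`, the station labels) are hypothetical and only serve as an illustration:

```python
# A list is indexed by integer position.
depths_list = [10.5, 12.1, 9.8]
print(depths_list[0])

# A dict is indexed by key (here, a str).
depths_dict = {"station_a": 10.5, "station_b": 12.1}
print(depths_dict["station_a"])

# Both containers offer two equivalent ways to create an empty instance.
print(list() == [])
print(dict() == {})
```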


<img align="left" width="6%" style="padding-right:10px;" src="images/info.png">

In the table above, `dict` indexing accepts '(almost) any type' because the key type must be [hashable](https://docs.python.org/3.6/glossary.html). That is, it must be possible to calculate from the key a [hash value](https://en.wikipedia.org/wiki/Hash_function) that never changes during the key's lifetime, and the key must be comparable to other keys.
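
A minimal sketch of what this means in practice (the name `keys_demo` and the stored values are only illustrative): strings and numbers are hashable and work as keys, while a `list` can change its content and is therefore rejected with a `TypeError`.

```python
keys_demo = dict()
keys_demo["Cl"] = "a str key"
keys_demo[42] = "an int key"
keys_demo[43.5] = "a float key"

# A list is mutable, so it is not hashable and cannot be used as a key.
list_key_rejected = False
try:
    keys_demo[["Cl", "Si"]] = "a list key"
except TypeError as err:
    list_key_rejected = True
    print("A list cannot be used as a key:", err)

# Only the three valid items were stored.
print(len(keys_demo))
```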

***

## What is Metadata?

In Ocean Mapping you will often encounter the concept of [metadata](https://en.wikipedia.org/wiki/Metadata). One of the most common uses of metadata is to help users discover and identify data resources.

There are different [metadata standards](https://en.wikipedia.org/wiki/Metadata#Standards) for each different field of study, and ocean mapping uses many of these standards. However, for the task of this notebook, we will not explore them. You will get familiar with these standards during your future Ocean Mapping courses. 

***

## A `dict` as a Metadata Container

We will now explore the use of a `dict` as a [metadata](https://en.wikipedia.org/wiki/Metadata) container.  

Following our previous examples of experiments collecting water salinity and temperature values, we will use a `dict` to store metadata such as:

- The author of the measurements (`"first_name"` and `"last_name"`).
- The location where the measurements took place (`"latitude"` and `"longitude"`).
- The time frame when the measurements happened (`"start_timestamp"` and `"end_timestamp"`).

Thus, a complete set of metadata will be represented by a `dict` containing the following six keys (with the corresponding value type):

- `"first_name"` &#x279C; `str` type
- `"last_name"` &#x279C; `str` type
- `"latitude"` &#x279C; `float` type
- `"longitude"` &#x279C; `float` type
- `"start_timestamp"` &#x279C; `datetime` type
- `"end_timestamp"` &#x279C; `datetime` type

This is the first time that we use the [`datetime`](https://docs.python.org/3.6/library/datetime.html?#module-datetime) type! 

<img align="left" width="6%" style="padding-right:10px;" src="images/key.png">

A variable of `datetime` type contains all the information from both a date and a time. This is of key importance in Ocean Mapping as integration of data is usually done on a time basis. 

An entire Ocean Mapping course will be focused on the integration of data from various sensors used to map the seafloor.

As you can read from the [Python documentation](https://docs.python.org/3.6/library/datetime.html?#datetime-objects), the `datetime` constructor is part of the `datetime` module (yes, they both have the same name!) and takes several parameters: 

- `datetime(year, month, day, hour=0, minute=0, second=0, microsecond=0, tzinfo=None, *, fold=0)` 

For the aims of this notebook, you can just ignore all the parameters after the first six. In fact, we will call the `datetime` constructor with only 6 values (from `year` to `second`).

In [None]:
from datetime import datetime

begin_timestamp = datetime(2019, 2, 22, 12, 32, 40)
print(str(begin_timestamp))

**How can the above code actually work?** It works because the parameters after the first 3 (e.g., `hour=0`) have a **default value** assigned to them (the `=0` in this specific example). This implies that, if you do *not* pass values for those parameters, Python will assign them those defined default values. 
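
The effect of the default values can be checked with a short sketch (the variable names `midnight`, `explicit`, and `noon` are only illustrative):

```python
from datetime import datetime

# Only year, month, and day are passed: the time parameters use their defaults.
midnight = datetime(2019, 2, 22)

# Passing the default values explicitly creates an equal datetime.
explicit = datetime(2019, 2, 22, 0, 0, 0)
print(midnight == explicit)

# A parameter with a default value can also be passed by name.
noon = datetime(2019, 2, 22, hour=12)
print(str(noon))
```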

We can now write our `metadata`:

In [None]:
metadata = dict()
metadata["first_name"] = "John"
metadata["last_name"] = "Doe"
metadata["latitude"] = 43.135555
metadata["longitude"] = -70.939534
metadata["start_timestamp"] = datetime(2019, 2, 22, 12, 32, 40)
metadata["end_timestamp"] = datetime(2019, 2, 22, 12, 34, 14)

print(metadata)
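
Once populated, the metadata can be read back by key. The following is a minimal sketch that uses a separate, illustrative `sample_metadata` variable. The `in` check and the subtraction of two `datetime` values (which yields a `timedelta`) go slightly beyond what this notebook covers and are shown only as a possible use:

```python
from datetime import datetime

sample_metadata = dict()
sample_metadata["first_name"] = "John"
sample_metadata["latitude"] = 43.135555
sample_metadata["start_timestamp"] = datetime(2019, 2, 22, 12, 32, 40)
sample_metadata["end_timestamp"] = datetime(2019, 2, 22, 12, 34, 14)

# Read a value back by its key.
print(sample_metadata["first_name"])

# Check whether a key is present before using it.
print("latitude" in sample_metadata)
print("depth" in sample_metadata)

# Subtracting two datetime values gives the elapsed time.
duration = sample_metadata["end_timestamp"] - sample_metadata["start_timestamp"]
print(duration.total_seconds())
```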

<img align="left" width="6%" style="padding-right:10px;" src="images/test.png">

Populate and print a `metadata` dictionary containing the following three keys: your `"username"`, the `"begin_time"` and the `"end_time"` for the execution of this exercise.

In [None]:
metadata = dict()
metadata["username"] = "jdoe"
metadata["begin_time"] = datetime(2019, 2, 22, 12, 34, 20)
metadata["end_time"] = datetime(2019, 2, 22, 12, 34, 21)

print(metadata)

***

# More on String Formatting

In this last section of this notebook, we will explore different mechanisms that Python provides for printing (**string formatting**) a value.

At this moment, you know how to print a value with `str` type:

In [28]:
metadata = dict()
metadata["first_name"] = "John"

print("The first name is: " + metadata["first_name"])

The first name is: John


You also know that you can convert values of other types to a string using `str()`:

In [29]:
metadata = dict()
metadata["latitude"] = 43.135555
metadata["longitude"] = -70.939534
metadata["start_timestamp"] = datetime(2019, 2, 22, 12, 32, 40)

print("The position is: " + str(metadata["latitude"]) + ", " + str(metadata["longitude"]))
print("Start time: " + str(metadata["start_timestamp"]))

The position is: 43.135555, -70.939534
Start time: 2019-02-22 12:32:40


It is possible to achieve the same results by using the `%` modulo operator, as in the following examples:

In [33]:
metadata = dict()
metadata["first_name"] = "John"

print("The first name is: %s" % (metadata["first_name"], ))

The first name is: John


In [37]:
metadata = dict()
metadata["latitude"] = 43.135558
metadata["longitude"] = -70.939534
metadata["start_timestamp"] = datetime(2019, 2, 22, 12, 32, 40)

print("The position is: %s, %s" % (metadata["latitude"], metadata["longitude"]))
print("Start time: %s" % (metadata["start_timestamp"], ))

The position is: 43.135558, -70.939534
Start time: 2019-02-22 12:32:40


If you look at the above examples, you will notice the presence of `%s` as placeholders in the string. The string is followed by a `%` operator, then by one or more variables enclosed in parentheses.

<img align="left" width="6%" style="padding-right:10px;" src="images/info.png">

In the above code, the values after the `%` operator that are inside the parentheses create a [`tuple`](https://docs.python.org/3.6/library/stdtypes.html?#tuples). <br>
A `tuple` is a Python container that represents an immutable sequence (thus, you cannot change the content of a `tuple`).
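
A minimal sketch of these `tuple` properties (the variable names are only illustrative). It also shows why the examples above write a trailing comma when there is a single value: without the comma, the parentheses do not create a `tuple`.

```python
# A tuple with two items, read by position like a list.
position = (43.135558, -70.939534)
print(position[0])

# A tuple is immutable: assigning to one of its items raises a TypeError.
try:
    position[0] = 0.0
except TypeError as err:
    print("A tuple cannot be modified:", err)

# The trailing comma is what makes a single-item tuple.
single = ("John", )
not_a_tuple = ("John")
print(type(single))
print(type(not_a_tuple))
```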

String formatting using the `%` operator provides [additional printing options](https://docs.python.org/3.6/library/stdtypes.html#printf-style-string-formatting). Among them, you can decide how many decimal digits will be printed for a `float` value. 

For instance, by using `%.4f` as a placeholder, Python will print the value rounded to four decimal digits: 

In [41]:
metadata = dict()
metadata["latitude"] = 43.135558
metadata["longitude"] = -70.939534

print("The position is: %.4f, %.4f" % (metadata["latitude"], metadata["longitude"]))

The position is: 43.1356, -70.9395
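
As a further sketch of the precision option (the variable names are only illustrative), the same value can be formatted with a different number of decimal digits and compared with the plain `%s` placeholder:

```python
latitude = 43.135558

# The number after the dot sets how many decimal digits are printed.
two_digits = "%.2f" % (latitude, )
four_digits = "%.4f" % (latitude, )

# The %s placeholder converts the value as str() would.
as_string = "%s" % (latitude, )

print(two_digits)
print(four_digits)
print(as_string)
```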


***

<img align="left" width="6%" style="padding-right:10px; padding-top:10px;" src="images/refs.png">

## Useful References

* [The official Python 3.6 documentation](https://docs.python.org/3.6/index.html)
  * [Glossary](https://docs.python.org/3.6/glossary.html)
  * [Mapping Types - dict](https://docs.python.org/3.6/library/stdtypes.html#mapping-types-dict)
  * [Collections - OrderedDict](https://docs.python.org/3.6/library/collections.html?highlight=ordereddict#ordereddict-objects)
  * [`datetime`](https://docs.python.org/3.6/library/datetime.html?#module-datetime) 
  * [`tuple`](https://docs.python.org/3.6/library/stdtypes.html?#tuples)
* [Hash function](https://en.wikipedia.org/wiki/Hash_function)
* [Metadata](https://en.wikipedia.org/wiki/Metadata)

<img align="left" width="5%" style="padding-right:10px;" src="images/email.png">

*For issues or suggestions related to this notebook, write to: gmasetti@ccom.unh.edu*

<!--NAVIGATION-->
[< Write Your Own Functions](005_Write_Your_Own_Functions.ipynb) | [Contents](index.ipynb) | [Read and Write Text Files >](007_Read_and_Write_Text_Files.ipynb)