<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#More-about-Dictionary-objects" data-toc-modified-id="More-about-Dictionary-objects-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>More about <code>Dictionary</code> objects</a></span></li><li><span><a href="#➜-Challenge-yourself:-working-with-dictionaries" data-toc-modified-id="➜-Challenge-yourself:-working-with-dictionaries-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>➜ Challenge yourself: working with dictionaries</a></span></li><li><span><a href="#List-comprehensions" data-toc-modified-id="List-comprehensions-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>List comprehensions</a></span></li><li><span><a href="#➜-Challenge-yourself:-random-walks" data-toc-modified-id="➜-Challenge-yourself:-random-walks-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>➜ Challenge yourself: random walks</a></span></li><li><span><a href="#➜-Challenge-yourself:-moving-average" data-toc-modified-id="➜-Challenge-yourself:-moving-average-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>➜ Challenge yourself: moving average</a></span></li><li><span><a href="#Further-tips" data-toc-modified-id="Further-tips-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Further tips</a></span></li></ul></div>

> All content here is under a Creative Commons Attribution [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/) and all source code is released under a [BSD-2 clause license](https://en.wikipedia.org/wiki/BSD_licenses). 
>
>Please reuse, remix, revise, and [reshare this content](https://github.com/kgdunn/python-basic-notebooks) in any way, keeping this notice.

# Module 8: Overview 

In the prior [module 7](https://yint.org/pybasic07) you had an intensive introduction to the main Pandas objects: `Series` and `DataFrame`. You were also introduced to dictionaries. In this worksheet, we look at dictionaries in a bit more detail, and apply Pandas to solving practical problems you have seen in prior modules.

<img src="images/general/Crystal_Clear_action_db_commit.png" style="width: 100px ; float:left"/> <br><br> Check out this repo using Git. Use your favourite Git user-interface, or
```
git clone git@github.com:kgdunn/python-basic-notebooks.git

# If you already have the repo cloned:
git pull
```
to update it to the latest version.

<hr>

### Preparing for this module

You should have completed [worksheet 7](https://yint.org/pybasic07).


## More about ``Dictionary`` objects

A dictionary is a Python ***object*** that is a flexible data container for other objects. It contains objects using what are called ***key*** - ***value*** pairs. You create a dictionary like this:

```python
random_objects = {'key1': 45,
                  2:      'Yes, keys can even be integers!',
                  3.0:    'Or floating point objects',
                  (4,5):  'Or tuples!',
                 }
print(random_objects)
```
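Integers, floats, strings and tuples all work as keys because they are *hashable* (immutable) objects. As a minimal sketch, using a made-up dictionary called `lookup`, the code below shows two consequences: `3` and `3.0` refer to the same entry, and a mutable object such as a list cannot be used as a key.

```python
# Hypothetical example data, not part of the worksheet.
lookup = {3.0: 'stored under a float key',
          (4, 5): 'stored under a tuple key'}

# 3 == 3.0 and both hash to the same value, so they find the same entry.
same_entry = lookup[3]
print(same_entry)

# A list is mutable, and therefore unhashable: it cannot be a key.
try:
    lookup[[4, 5]] = 'this assignment fails'
    list_key_allowed = True
except TypeError as err:
    list_key_allowed = False
    print('Cannot use a list as a key:', err)
```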

### Ordered vs Unordered dictionaries
Dictionaries were historically an ***unordered*** container. From Python 3.7 onwards, however, they are guaranteed to keep their key-value pairs in the order in which you added them.

That means, in Python 3.7 and later, the above dictionary keeps its entries in the order shown in the code (in older versions the order was arbitrary), and any key-values you add afterwards are appended in sequence. This means if you create an empty dictionary, and add pairs ...

```python
testing_order = {}
testing_order['key1'] = 45
testing_order[2] = 'Yes, keys can even be integers!'
testing_order[3.0] = 'Or floating point objects'
testing_order.keys()
```

... they will retain the order in which you added them. Because people do not always upgrade their Python version quickly, you should not count on this behaviour if your code must also run on versions older than 3.7.
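As a minimal sketch, assuming Python 3.7 or later, the code below shows what "insertion order" means when a key is overwritten, or deleted and added again. The dictionary `order_demo` and its keys are made up for illustration.

```python
order_demo = {}
order_demo['first'] = 1
order_demo['second'] = 2
order_demo['third'] = 3

# Overwriting an existing key keeps its original position.
order_demo['first'] = 100
keys_after_overwrite = list(order_demo.keys())
print(keys_after_overwrite)

# Deleting a key and adding it back moves it to the end.
del order_demo['first']
order_demo['first'] = 1
keys_after_readd = list(order_demo.keys())
print(keys_after_readd)
```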

If you need to test the Python version in the code, use the ``sys.version_info`` attribute:
```python
import sys

# Compare against a tuple: testing major and minor separately would fail for, e.g., a version 4.0
if sys.version_info >= (3, 7):
    print('I can rely on ordered dictionaries!')
    testing_order = dict()
else:
    print('Use the OrderedDict class from "import collections".')
    from collections import OrderedDict
    testing_order = OrderedDict()
    
testing_order['key1'] = 45
testing_order[2] = 'Yes, keys can even be integers!'
testing_order[3.0] = 'Or floating point objects'

# Guaranteed to be in order, no matter which version of Python you use!
testing_order.keys()
```

### Iterating over the keys-values of a dictionary

Once you have a dictionary, it is common to operate on the keys, or values, or both - in an iterative loop:

```python
random_objects = {'key1': 45,
                  2:      'Yes, keys can even be integers!',
                  3.0:    'Or floating point objects',
                  (4,5):  'Or tuples!'
                 }
for key, value in random_objects.items():
    print('The key is "{}" and the value is: {}'.format(key, value))
    random_objects[key] = value * 2
```
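Overwriting the value of an existing key inside the loop, as done above, is safe because the dictionary does not change size. Adding or removing keys while iterating is not allowed. The sketch below, using a made-up `inventory` dictionary, shows the error and one common work-around: iterate over a snapshot of the keys.

```python
inventory = {'bolts': 10, 'nuts': 4}   # hypothetical example data

# Safe: overwriting values of existing keys does not change the size.
for key, value in inventory.items():
    inventory[key] = value * 2

# Not safe: adding a key inside the loop changes the dictionary's size.
try:
    for key in inventory:
        inventory[key + '_spare'] = 0
    size_change_allowed = True
except RuntimeError as err:
    size_change_allowed = False
    print('Iteration stopped:', err)

# Work-around: loop over a list (a snapshot) of the keys instead.
for key in list(inventory):
    inventory[key + '_copy'] = inventory[key]
print(inventory)
```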

If you need only the values, and not the keys:
```python
for value in random_objects.values():
    # Do something here 
    pass
```

or, if you need only the keys, and not the values:
```python
for key in random_objects.keys():
    # Do something here 
    pass
```

### Setting and getting key-values

We already saw how to set a new key or overwrite an existing key:
```python
random_objects['key1'] = 'will now be replaced'
random_objects['key2'] = 'is newly added'
```

You can get a value, from a given key, using the square bracket notation, and then immediately use it for further calculation or processing:
```python
uppercase_value = random_objects['key2'].upper()

# but this will fail:
random_objects['key3']
```

The last line fails with a ``KeyError``, because you are trying to access a non-existent key. If you are not sure whether the key exists, but you need your code to continue, there are two solutions:

```python

# Option 1: try-except
try:
    value = random_objects['key3']
except KeyError:
    value = float('nan')
    
# Now "value" is guaranteed to exist after these 4 lines.
# Or, option 2, in a single line of code:
value = random_objects.get('key3', float('nan'))
```
You will probably prefer the second option, since it is compact and provides the same functionality as the first.
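A common use of `.get()` with a default value is counting how often something occurs. The sketch below uses a made-up string, `sequence`, and also shows the `in` operator, which tests whether a key exists without raising a ``KeyError``.

```python
sequence = 'AAGCTTAG'   # hypothetical example string
counts = {}
for letter in sequence:
    # Use 0 the first time a letter is seen, then add 1.
    counts[letter] = counts.get(letter, 0) + 1
print(counts)

# The "in" operator checks for a key without raising a KeyError.
has_c = 'C' in counts
has_x = 'X' in counts
print(has_c, has_x)
```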



##  ➜ Challenge yourself: working with dictionaries

Create a dictionary containing the molar mass of pure species:

* ``C``: carbon = 12.0107
* ``O``: oxygen = 15.999
* ``N``: nitrogen = 14.0067
* ``H``: hydrogen = 1.00784
* ``S``: sulfur = 32.065
* ``P``: phosphorus = 30.973762

Now write a function ``molar_mass`` which accepts 1 input, a chemical formula as a string, and returns the molar mass.

For example:
```python
methionine = 'C5H11N1O2S1'  # make life easier: explicitly add the '1' parts
met_mm = molar_mass(methionine)
```

Your function should return about 149.21 g/mol for that amino acid. Try your function on some other amino acids, such as lysine, $\text{C}_6\text{H}_{14}\text{N}_2\text{O}_2$, which has a molar mass of about 146.19 g/mol.

*Suggested solution approach:*
* The input string will always start with an alphabetical letter, not a number.
* Iterate over the string until you encounter a number (use `.isnumeric()` on each character).
* Keep the preceding character(s): in this example, it will be `C`.
* Keep iterating over the string until the characters switch back from numeric to alphabetic (use `.isalpha()` on each character).
* Then you have the value(s). In this example, `5`.
* Store that letter as the key in a dictionary, with the numeric part as its value.
* Keep going, until you have built up a dictionary:
```python
formula = {'C': 5, 'H': 11, 'N': 1, 'O': 2, 'S': 1}
```
* Now iterate over the dictionary, looking up the molar mass in a second dictionary, and add up the molecular weight.
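
If you get stuck, the steps above can be sketched as follows. Try it yourself first: this is only one possible approach, not the definitive solution. The helper name `parse_formula` and the constant `MOLAR_MASSES` are made up for illustration, and the code assumes every element symbol is followed by an explicit count.

```python
# Molar masses in g/mol, as listed in this challenge.
MOLAR_MASSES = {'C': 12.0107, 'O': 15.999, 'N': 14.0067,
                'H': 1.00784, 'S': 32.065, 'P': 30.973762}

def parse_formula(formula_string):
    """Split e.g. 'C5H11N1O2S1' into {'C': 5, 'H': 11, ...}.

    Assumes each element symbol is followed by an explicit count.
    """
    formula = {}
    symbol, digits = '', ''
    for character in formula_string:
        if character.isalpha():
            if digits:
                # A letter after digits: the previous pair is complete.
                formula[symbol] = formula.get(symbol, 0) + int(digits)
                symbol, digits = '', ''
            symbol += character
        elif character.isnumeric():
            digits += character
    if symbol and digits:
        # Store the final pair once the end of the string is reached.
        formula[symbol] = formula.get(symbol, 0) + int(digits)
    return formula

def molar_mass(formula_string):
    """Return the molar mass, in g/mol, of the given formula."""
    total = 0.0
    for symbol, count in parse_formula(formula_string).items():
        total += MOLAR_MASSES[symbol] * count
    return total

met_mm = molar_mass('C5H11N1O2S1')
print('Methionine: {:.2f} g/mol'.format(met_mm))
```

An element that is not in `MOLAR_MASSES` raises a ``KeyError`` here, which is arguably better than silently returning a wrong mass.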

Challenge yourself even more: adjust the code so that it can work with *natural* formulas, where the `'1'` parts are not given. E.g. your function should be able to handle `methionine = 'C5H11NO2S'` instead of `'C5H11N1O2S1'`.
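
One possible way to handle natural formulas, shown here only as a sketch, swaps the character-by-character loop for a regular expression from Python's built-in `re` module. The function name `molar_mass_natural` is made up, and the pattern assumes single-letter element symbols, which holds for the six species listed above but not for elements such as `Na` or `Cl`.

```python
import re

# Molar masses in g/mol, as listed in this challenge.
MOLAR_MASSES = {'C': 12.0107, 'O': 15.999, 'N': 14.0067,
                'H': 1.00784, 'S': 32.065, 'P': 30.973762}

def molar_mass_natural(formula_string):
    """Return the molar mass in g/mol; a missing count is taken as 1.

    Assumes single-letter, uppercase element symbols only.
    """
    total = 0.0
    # Each match is one uppercase letter followed by zero or more digits.
    for symbol, digits in re.findall(r'([A-Z])(\d*)', formula_string):
        count = int(digits) if digits else 1
        total += MOLAR_MASSES[symbol] * count
    return total

natural = molar_mass_natural('C5H11NO2S')
explicit = molar_mass_natural('C5H11N1O2S1')
print('{:.2f} g/mol and {:.2f} g/mol'.format(natural, explicit))
```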

## ➜ Challenge yourself: random walks

* Does the average of the dice thrown tend to be normally distributed?
* Random walk exercise

## ➜ Challenge yourself: moving average

## Further tips

1. Download a *cheatsheet* of Pandas tips: https://www.kdnuggets.com/2017/01/pandas-cheat-sheet.html 
2. Learn more about Indexing Pandas arrays: https://www.kdnuggets.com/2019/04/pandas-dataframe-indexing.html (to be covered in the Advanced course)

<img src="images/general/Crystal_Clear_action_db_commit.png" style="width: 100px ; float:left"/> <br><br>Wrap up this section by committing all your work. Have you used a good commit message? Push your work, so that you can refer to it later, and also as a backup.

>***Feedback and comments about this worksheet?***
> Please provide any anonymous [comments, feedback and tips](https://docs.google.com/forms/d/1Fpo0q7uGLcM6xcLRyp4qw1mZ0_igSUEnJV6ZGbpG4C4/edit).
