# Section B - Dictionaries and Loops

Feedback: https://forms.gle/Le3RAsMEcYqEyswEA

**Topics**: Dictionaries and advanced control flow - for and while loops - and list comprehensions.

This is a very exciting week.  Dictionaries are so useful for organizing things, and with the introduction of loops, we become real programmers who can perform large tasks with concise bits of code.  The exercises below for the loops are hopefully kind of challenging, requiring you to use conditionals, variables, lists, and if/else branching!

# Dictionaries
Dictionaries are for key: value collections.

This could be useful for many reasons!  Here's a simple example, and below a more complex example.  Some exercises will follow, breaking this stuff down.

## Defining dictionaries with content ready to reference:
We follow the {key: value, key2: value2, ... } pattern for this.

In [4]:
colors = {
    "white": "#FFFFFF",
    "black": "#000000",
    "red": "#FF0000"
}

Note that keys can be any immutable object!
* strings
* numbers
* tuples

You **cannot use lists** as dictionary keys, and you cannot use dictionaries as dictionary keys.
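
Here is a minimal sketch of that rule, using a made-up `grid` dictionary that is not part of the examples above. A tuple works fine as a key, while trying to use a list raises a `TypeError`:

```python
# Tuples are immutable, so they work as dictionary keys.
grid = {(0, 0): "origin", (1, 0): "one step right"}
print(grid[(0, 0)])

# Lists are mutable, so Python refuses to use one as a key.
try:
    grid[[2, 0]] = "two steps right"
except TypeError as error:
    list_key_error = str(error)
    print("Could not use a list as a key:", list_key_error)
```

If you ever need a list-like key, converting it with `tuple(my_list)` first is one common workaround.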

## Using/Accessing dictionary data
We have a few ways to use dictionaries:
* Look values up by key
* See what keys are in the dictionary
* See what values are in the dictionary
* See all key:value pairs in the dictionary
* And some other more obscure stuff to read about later.

In [11]:
print('The color code for white is', colors['white'])
print('The colors in this dictionary are:', colors.keys())
print('The values in this dictionary are:', colors.values())
print('And the k:v pairs, aka items, in the dictionary are:', colors.items(), '\n')

print('It can be helpful to iterate over the items in a dictionary like this:')
for color, code in colors.items():
    print(f'    The color "{color}" has the code "{code}"')

The color code for white is #FFFFFF
The colors in this dictionary are: dict_keys(['white', 'black', 'red'])
The values in this dictionary are: dict_values(['#FFFFFF', '#000000', '#FF0000'])
And the k:v pairs, aka items, in the dictionary are: dict_items([('white', '#FFFFFF'), ('black', '#000000'), ('red', '#FF0000')]) 

It can be helpful to iterate over the items in a dictionary like this:
    The color "white" has the code "#FFFFFF"
    The color "black" has the code "#000000"
    The color "red" has the code "#FF0000"


## Overwriting keys or adding new data to a dictionary:
Assignment is the same whether or not the key is already in the dictionary.  If the key exists already, its value will be overwritten with the new value:

In [8]:
print('Current colors:', colors.items())
colors["blue"] = "#0000FF"
colors["white"] = "#ffffff"
print('Colors now:', colors.items())

Current colors: dict_items([('white', '#FFFFFF'), ('black', '#000000'), ('red', '#FF0000')])
Colors now: dict_items([('white', '#ffffff'), ('black', '#000000'), ('red', '#FF0000'), ('blue', '#0000FF')])
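
Before looking a key up or overwriting it, it can help to check whether it is there at all. The sketch below uses its own small `color_codes` dictionary (a made-up name, so it doesn't disturb `colors`) to show the `in` test and the `.get()` method, two of the tools not covered above:

```python
color_codes = {"white": "#FFFFFF", "black": "#000000"}

# "in" checks whether a key exists without raising an error.
has_black = "black" in color_codes
has_purple = "purple" in color_codes
print(has_black, has_purple)

# Looking up a missing key with [] raises a KeyError...
try:
    color_codes["purple"]
except KeyError:
    print("No entry for purple")

# ...while .get() returns a fallback value instead (None if we don't give one).
purple_code = color_codes.get("purple", "unknown")
print(purple_code)
```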


## Accessing or calling values in dictionaries
When we use a dictionary to access an object, it is the same as having a separate variable pointing to the object.  Let's explain with an example:

In [9]:
cars = []
bicycles = []

vehicles = {
    "motorcars": cars,
    "pedalbikes": bicycles
}

cars.append('toyota')
vehicles["pedalbikes"].append('schwinn')

print('Cars:', cars, vehicles['motorcars'])
print('Bicycles:', bicycles, vehicles['pedalbikes'])

Cars: ['toyota'] ['toyota']
Bicycles: ['schwinn'] ['schwinn']


There are a few notable things happening here:
* We create cars and bicycles variables pointing to lists
* We create a dictionary with string keys and associate the keys with whatever the variables are associated with.  In this case, the very same lists the variables refer to.
* We add something to the cars list using the variable
* We add something to the bicycles list using its key in the dictionary
* We demonstrate that when we print using the variables and the dictionary, we see the same updated list in both cases.

This concept can be expanded in lots of ways:
* lists of lists
* nested dictionaries

And it raises the question of what is passed by reference, what is passed by value, and when we need to think about this.  But that's too much for now!
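
Without going deep, here is a small sketch of the one thing worth remembering for now: storing a list in a dictionary does not copy it, but calling `.copy()` does. The names `scores` and `report` are made up for this illustration:

```python
scores = [90, 85]
report = {
    "original": scores,         # the very same list object as scores
    "snapshot": scores.copy(),  # an independent copy made right now
}

# Appending through the variable changes the shared list...
scores.append(70)

print(report["original"])   # sees the new value
print(report["snapshot"])   # ...but the copy is unaffected
print(report["original"] is scores)
```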

Let's play with all of this!




And we can use variables and loops (explanation of loops below) to be more dynamic:


**Mapping human readable values to hex codes**

In this example, we pre-define a bunch of colors and the hex values they represent.

    colors = {
        "green": "#008000",
        "blue": "#0000FF",
        "yellow": "#FFFF00",
        "cyan": "#00FFFF",
        "magenta": "#FF00FF"
    }

    text_color_code = colors["blue"]

**Grouping files for processing by analysis type**
This jumps ahead a little by using a for loop, but this is a really helpful use case for dictionaries.  Imagine you have a bunch of data files that use similar processing overall, but with small differences depending on the particular analyte being looked at.  We can group common data files in a dictionary and then use the same code on them all later. We'll expand on this later, but try to internalize what's happening here:

    import os

    all_files = {'chlorophyll': [], 'nitrogen': [], 'salinity': []}
    for file_name in os.listdir():
        if 'chlorophyll' in file_name:
            all_files['chlorophyll'].append(file_name)
        elif 'nitrogen' in file_name:
            all_files['nitrogen'].append(file_name)
        elif 'salinity' in file_name:
            all_files['salinity'].append(file_name)
        else:
            print('Warning, unknown file:', file_name)

* all_files is a dictionary with three key: value pairs.
* Each key is the analyte name that we're looking for in the file names.
* Each value is a list that we will append each matching file name to.
* os.listdir() returns a list of the names of everything in the current working directory.  You could also pass it a path to another directory if needed.

We'll get a dictionary that looks something like this afterward:

    >>> all_files
    {'chlorophyll': ['chlorophyll_20240202.xlsx', 'chlorophyll_20240202.xlsx', 'chlorophyll_20240202.xlsx'],
    'nitrogen': ['nitrogen_20240202.xlsx', 'nitrogen20240202.xlsx'],
    'salinity': ['salinity_20240202.xlsx', 'salinity_20240202.xlsx'] }
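
To give a feel for the "use the same code on them all later" step, here is a hedged sketch that walks through an already-grouped dictionary. The file names are invented for the illustration, and the "processing" is just counting and printing:

```python
# Pretend this was built by the grouping loop; the file names are made up.
grouped_files = {
    'nitrogen': ['nitrogen_a.xlsx', 'nitrogen_b.xlsx'],
    'salinity': ['salinity_a.xlsx'],
}

file_counts = {}
for analyte, file_names in grouped_files.items():
    # The same code runs for every group, whatever the analyte is.
    file_counts[analyte] = len(file_names)
    print(f'{analyte}: {len(file_names)} file(s) to process')

print(file_counts)
```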

# For Loops
This is really where things start to get interesting.  Anything that we did previously that was repetitive can be wrapped into a loop to make a single chunk of code do things over and over.

## basic structure:

    for each_thing in many_things:
        # do something with each thing
        print(each_thing)

* "many_things" is any iterable - a list, a dictionary, a tuple, a function that yields multiple things, etc.
* "each_thing" is the name that we use to refer to each item from many_things, one at a time. 
* The indented do something block contains all of the code we want to run for each_thing.
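
As a quick sketch of the "any iterable" point, the same loop structure works on a string, a range, and a dictionary. The variable names here are only for illustration:

```python
letters = []
for letter in 'abc':              # a string gives one character at a time
    letters.append(letter)

total = 0
for number in range(1, 4):        # range(1, 4) gives 1, 2, 3
    total += number

keys_seen = []
for key in {'x': 1, 'y': 2}:      # a dictionary gives its keys
    keys_seen.append(key)

print(letters, total, keys_seen)
```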

## control commands - continue and break
* Inside the loop, we can call "break" to exit the loop, even if there are more things in many_things.
* And we can call "continue" to skip to the next item without running any more code on the current thing.

Let's see an example where we are trying to **find five animals with names of three or fewer letters** in a list, and then stop when done:

In [None]:
all_animals = ('dog', 'mouse', 'rat', 'squirrel', 'cat', 'rabbit', 
               'hamster', 'gerbil', 'guinea pig', 'pig', 'cow', 'horse',
               'chinchilla', 'ferret', 'hedgehog', 'sugar glider', 'bat')
short_animals = []
for animal in all_animals:
    if len(animal) > 3:
        # skip this one
        continue
    print('found one!', animal)
    short_animals.append(animal)
    
    if len(short_animals) == 5:
        print('all done, found enough')
        break
print('These are the first five short animals:', short_animals)

It's worth noting that we could have done this without continue and break, but using them reduces the need for indentation and helps with concise, readable code.
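
For comparison, here is one possible version without continue and break, using a shortened animal list. It is only a sketch, but notice the extra nesting, and that the loop keeps visiting animals even after five have been found:

```python
all_animals = ('dog', 'mouse', 'rat', 'squirrel', 'cat', 'rabbit', 'pig', 'cow', 'bat')
short_animals = []
for animal in all_animals:
    # Without break, we have to check on every pass whether we still need more.
    if len(short_animals) < 5:
        # Without continue, the real work moves inside another if block.
        if len(animal) <= 3:
            print('found one!', animal)
            short_animals.append(animal)
            if len(short_animals) == 5:
                print('all done, found enough')
print('These are the first five short animals:', short_animals)
```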

#### *Exercise*:

Write a for loop that will try three times to prompt the user for a four letter word.  It should break when a valid word is entered. And after the loop, it should print out the given word.  Something to consider:
* What do we do if the user doesn't give a valid word for any of the three tries?  How do we avoid an error in the print statement?  
* What are three different ways to make the for loop do the thing three times?
* Remember that we can use "input" to prompt the user:  word = input('tell me a word')

In [None]:
# Note that _ is a valid variable name in Python, but it is used to indicate that the variable is not used in the loop.
# if you wanted to print an error message and give a count of tries each time, using something like "count" in place of "_"
# would be more appropriate.

for _ in ...:
    ...
print(...)

## for else
One last thing to mention about for loops is that we can have an else clause that gets called only if break is not called from within the for loop.  This is like our contingency code for what to do if what we expect doesn't happen in the for loop, like when we don't find something we're looking for:

    for widget in suitable_widgets:
        supplier_stock_qty = check_supplier_stock(widget)
        if supplier_stock_qty > 2:
            print('Great, we can order', widget)
            order_widget = widget
            break
    else:
        print("Supplier didn't have any suitable widgets in stock!")
        order_widget = None
        notify_supplier(suitable_widgets)

You could implement this without the for else functionality, but using it reduces the number of variables needed and helps to show the intent of your code.

The check_supplier_stock and notify_supplier functions might make REST calls to the supplier web site to check their stock or place an order automatically.
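
Since those supplier functions don't actually exist here, the sketch below swaps in a hypothetical stand-in (a plain dictionary called `fake_stock`) so the for else pattern can be run and poked at. Everything named in it is an assumption made for illustration:

```python
# Hypothetical stand-in for a real supplier lookup.
fake_stock = {'sprocket': 0, 'gear': 1, 'cog': 5}

def check_supplier_stock(widget):
    """Pretend to ask the supplier how many of this widget they have."""
    return fake_stock.get(widget, 0)

def find_widget_to_order(candidate_widgets):
    for widget in candidate_widgets:
        if check_supplier_stock(widget) > 2:
            break              # found one, so the else block is skipped
    else:
        widget = None          # the loop finished without hitting break
    return widget

print(find_widget_to_order(['sprocket', 'gear', 'cog']))
print(find_widget_to_order(['sprocket', 'gear']))
```

The first call finds a widget with enough stock and breaks out; the second runs out of candidates, so the else block runs.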

#### *Exercise*:

Modify your user prompt for loop above so that if the user doesn't give a valid response and break is never called, the else block prints an error message.

In [None]:
for _ in ...:
    ...
    print(...)  
    break
else:
    print(...)  # error message here

# While Loops
Much like for loops, we use while loops to do things over and over, but instead of doing it once for each item in a list of objects passed to the loop, we do it until a condition is met.  

## General Structure

    while condition:
        # do something

* Like with for loops, we can call continue and break.
* while True: will loop forever because True is never False.  We'd have to use break to exit the loop in this case.
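
A tiny sketch of both styles, counting down from three. The variable names are just for illustration:

```python
# Style 1: the condition on the while line decides when to stop.
countdown = 3
ticks = []
while countdown > 0:
    ticks.append(countdown)
    countdown -= 1      # without this line the condition would never become False
print(ticks)

# Style 2: loop "forever" and use break to get out.
countdown = 3
ticks_again = []
while True:
    if countdown == 0:
        break
    ticks_again.append(countdown)
    countdown -= 1
print(ticks_again)
```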

Let's try our animal example again:

In [2]:

all_animals = ['dog', 'mouse', 'rat', 'squirrel', 'cat', 'rabbit', 
               'hamster', 'gerbil', 'guinea pig', 'pig', 'cow', 'horse',
               'chinchilla', 'ferret', 'hedgehog', 'sugar glider', 'bat']
short_animals = []
while len(short_animals) < 5:
    if len(all_animals) == 0:
        print('no more animals to check')
        break
    animal = all_animals.pop()  # take the last item and remove it from the list
    if len(animal) > 3:
        # skip this one
        continue
    print('found one!', animal)
    short_animals.append(animal)

print('These are some short animals:', short_animals)

found one! bat
found one! cow
found one! pig
found one! cat
found one! rat
These are some short animals: ['bat', 'cow', 'pig', 'cat', 'rat']


What would happen if we didn't check the length of all_animals before calling pop?  
What would happen if there weren't five short animals in the list?

#### *Exercise*:

Let's use a while loop to make a guessing game!  We'll generate a random number from 1 to 100 and, in the loop, prompt the user for a guess.  Tell the user if the number is higher or lower than the mystery number. Use a condition on the loop to have it exit automatically when the user has guessed correctly and print a congratulations after the loop!

In [None]:
import random
mystery_number = random.randint(1, 100)
user_guess = -1  # start with an invalid guess
while ...:
    ...
print('You got it!')

# List Comprehensions
You can get by without these, but they're a nice tool for doing simple operations on lists of things.  They reduce code bloat and improve readability if you don't get too crazy.

Consider this very standard for loop:

In [12]:
people = ['joan', 'maude', 'henrietta']
tmp_list = []
for name in people:
    tmp_list.append(name.capitalize())
people = tmp_list
print(people)

['Joan', 'Maude', 'Henrietta']


We can replace it with this simple structure:

**[expression for item in iterable]**

* expression: This is the value that will be included in the new list.  It uses the item to do something.
* item: This is a variable that takes the value of each element in the iterable.
* iterable: This is any Python object capable of returning its members one at a time, such as a list, range, string, etc.

In [16]:
people = ['joan', 'maude', 'henrietta']
people = [name.capitalize() for name in people]
print(people)

['Joan', 'Maude', 'Henrietta']


## With A filter:
Use a filter to select specific items from the source list:

**[expression for item in iterable if condition]**

* condition lets us select specific items from the source list

In [15]:
# In practice, you would list files from a directory like this:
# all_files = os.listdir('optional/path/to/directory')
# But for this example, we'll just use a pretend listing:
all_files = ['readme.md', 'data.csv', 'event.log', 'config.json', 'config.yaml', 'hangman.py', 'args.py', 'writefile.py']
python_files = [file_name for file_name in all_files if file_name.endswith('.py')]
print(python_files)

['hangman.py', 'args.py', 'writefile.py']


Let's try this for ourselves!

#### *Exercise*:

Write list comprehensions in the code cell to modify each source list per the instructions in the comment.  Replace each '...'.

In [None]:
# String manipulation - convert the list of signs to all upper case:
street_signs = ['stop', 'yield', 'one way', 'speed limit', 'wrong way']
street_signs = [...] # use a list comprehension to convert the signs to upper case
print(street_signs)

# String slicing - truncate the names to the first 4 characters:
names = ['Clarice', 'Fernando', 'Xavier', 'Mildred']
# names = ['Clarice', 'Fernando', 'Ed', 'Tim']  # Challenge mode, make it not error on short names
names = [...] # use a list comprehension to truncate the names to the first 4 characters

# Conditional - use the if condition to select which values will be kept
actual_berries = ('blueberry', 'raspberry', 'strawberry', 'blackberry')
found_fruit = ['strawberry', 'banana', 'kiwi', 'blueberry', 'raspberry', 'blackberry', 'yuzu']
found_berries = [...] # use a list comprehension to create a list of berries from the found_fruit list

## For loops with multiple variables

A very common use case for this is when we want to iterate over key:value pairs of a dictionary. An example:

In [None]:
animal_sounds = {
    'dog': 'bark',  'cat': 'meow',   'cow': 'moo',
}
for animal, sound in animal_sounds.items():
    print(f'The {animal} says "{sound}"')
    
# An alternative way to do this is:
for animal in animal_sounds:
    print(f'Again, the {animal} says "{animal_sounds[animal]}"')
    
# Or even:
for animal_and_sound in animal_sounds.items():
    animal, sound = animal_and_sound
    print(f'And again, the {animal} says "{sound}"')
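
Dictionaries aren't the only place this shows up. As a hedged sketch, the built-in functions `enumerate` and `zip` also hand the loop more than one value at a time (the lists here are made up for the example):

```python
animals = ['dog', 'cat', 'cow']
sounds = ['bark', 'meow', 'moo']

# enumerate gives (position, item) pairs; start=1 makes the count start at 1.
numbered = []
for position, animal in enumerate(animals, start=1):
    numbered.append(f'{position}. {animal}')

# zip pairs up items from two lists, one pair per pass through the loop.
paired = []
for animal, sound in zip(animals, sounds):
    paired.append(f'{animal} says {sound}')

print(numbered)
print(paired)
```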

## Nested loops

Nesting loops can make code complicated very quickly, but it's very useful to do.  Here's an example where we have a dictionary with the values being lists of things.  We can iterate through the dictionary and, for each key, iterate through the items in its list before going on to the next key in the dictionary.

In [21]:
wine_pairings = {
    'chardonnay': ['fish', 'chicken', 'pork'],
    'merlot': ['beef', 'lamb', 'pasta'],
    'sauvignon blanc': ['salad', 'veggies', 'pasta'],
}
for wine, pairings in wine_pairings.items():
    print(f'{wine.title()} pairs well with:')
    for pairing in pairings:
        print(f'    {pairing}')

Chardonnay pairs well with:
    fish
    chicken
    pork
Merlot pairs well with:
    beef
    lamb
    pasta
Sauvignon Blanc pairs well with:
    salad
    veggies
    pasta


Now we see how loops can be both useful and delicious!

#### *Exercise*:

Let's make a tool that collects students' grades in several subjects, stores them in a dictionary, and then summarizes them for us. To keep this simple, we can start by storing just a list of grades for each student, but to make things more complicated, we could:
* Use nested dictionaries to store each student's grades for each subject separately.
* Report which student did the best in each subject
* Improve input validation

In [1]:
students = {'Stanley': [], 'Casey': [], 'Taylor': []}
subjects = ['math', 'science', 'history']

# First iterate through the students, getting their name and a reference to the list to store their grades in:
for student, grades in ...:
    # Then iterate through the subjects and prompt for a grade for the student for each subject:
    for subject in ...:
        grade = input(...)
        # convert it to float
        # Store the grade in the list of grades for the student
        
# And now summarize the grades for each student:
for ...:
    average = ...
    best = ...
    worst = ...
    # Print stats for the student:
    print(...)



# Writing files
Let's start with writing a file so we have something to read afterward. We can read/write a few types of data:
* Text data can be read/written directly.  String data is text.  And a list of dictionaries could be converted to text if each dict has the same keys in it. 
* Structured data, like a dictionary, can be written by converting it to json or another format.
* Most Python objects can be serialized with "pickle" and written as binary data.
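
Pickle isn't used in the examples below, so here is a minimal in-memory sketch of the idea. The `inventory` dictionary is made up; note that it holds a tuple and a set, which json could not store as-is:

```python
import pickle

inventory = {'widgets': (3, 4.5), 'tags': {'red', 'blue'}}

# pickle.dumps returns bytes, which is why a pickle file is opened with 'wb' / 'rb'.
binary_data = pickle.dumps(inventory)
restored = pickle.loads(binary_data)

print(type(binary_data))
print(restored == inventory)
```

One caution: only unpickle data that you trust, since loading a pickle can run code.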

## Simple string/text data
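First simple example!  A note on newline characters:
* Linux and Mac (any unix flavor OS) use '\n' as a newline.
* Windows uses '\r\n' as a newline.
* When a file is opened in text mode (the default), Python translates '\n' to the newline appropriate for the OS when writing, and translates it back to '\n' when reading.  So you can just use '\n' in your code.
* You can "import os" and use os.linesep to see the newline appropriate for whatever OS your code is running on, but don't write it to a file opened in text mode, since on Windows the '\n' inside it would get translated a second time.
* Python has writelines and readlines functions that work with lists of lines, but writelines does not add newlines for you; each string needs to end with its own '\n'.
* I'll use '\n' in these examples.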
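
Here is a small sketch of `writelines` and `readlines` in practice. One detail worth seeing for yourself: `writelines` writes the strings exactly as given, so each one needs its own '\n'. The example writes into a temporary folder (via the standard `tempfile` module) so it doesn't leave files behind:

```python
import os
import tempfile

lines = ['first line', 'second line']

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'lines.txt')
    with open(path, 'w') as f:
        # writelines does not add line endings, so we add '\n' ourselves.
        f.writelines(line + '\n' for line in lines)
    with open(path, 'r') as f:
        lines_read = f.readlines()   # each item keeps its trailing '\n'

cleaned = [line.rstrip('\n') for line in lines_read]
print(lines_read)
print(cleaned)
```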

In [6]:
some_text = 'The quick brown fox jumps over the lazy dog!'
output_file = './simple_text.txt'
with open(output_file, 'w') as f:
    # f is a file handle that we can write to.  f could be any variable name.
    f.write(some_text)
    f.write('\n')  # it's good form to include a newline at the end of the file
# The file is automatically closed when the with indented block ends

### Using "with"
**with** helps us make sure things get cleaned up when we're done with them. In the example above:
* "f" is the return value of open(output_file, 'w')
* We can use "f", short for file handle, for as many lines as we need, indented below the with line. 

The general format is as follows.  

    with some_function() as variable_name:
        do stuff with variable_name
        ...
    variable_name.close() is called automatically

A few examples of things that support using with:
* opening zip files, like with the zipfile library.
* some database connections, like with the sqlite3 library.
* temporary files with the tempfile library
* network calls with requests.get
* ... lots more
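
As a sketch of the clean-up behavior, the example below nests two with blocks: one from the `tempfile` library and one from `open`. The file handle's `closed` attribute lets us check what happened; the names `open_inside` and `closed_after` are just for illustration:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo.txt')
    with open(path, 'w') as f:
        f.write('hello\n')
        open_inside = not f.closed   # still open inside the indented block
    closed_after = f.closed          # closed as soon as the block ends
    folder_existed = os.path.isdir(folder)

# The temporary folder is removed when its with block ends.
print(open_inside, closed_after, folder_existed, os.path.isdir(folder))
```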

Below is an alternate way to do the same thing without using **with**.  You should always use with if you can, but there are occasions where you need to keep a file handle open for a longer time, like if your program opens a mutex file to ensure that it doesn't get run twice at the same time and step on its own feet.  It would only close the mutex when it is done running, and then another instance can open the mutex. 

We don't want to forget to close files because:
* If our program opens a lot of files and never closes them, it can cause performance or stability problems on the computer.
* While we have the file open, other tools may not be able to access it.  Excel, in particular, will complain if you still have an xlsx open.

In [7]:
f = open(output_file, 'w')
f.write(some_text)
f.write('\n')
f.close()
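
One weakness of the manual approach is that if a write raises an error, the close line is never reached. A hedged sketch of one way to guard against that is try/finally, which is roughly what with does for us behind the scenes. The scratch folder and file name below are made up for the demo:

```python
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()   # scratch folder so the demo doesn't clutter anything
path = os.path.join(folder, 'manual_close.txt')

f = open(path, 'w')
try:
    f.write('The quick brown fox jumps over the lazy dog!\n')
finally:
    # This runs whether or not the write raised an error.
    f.close()

print('File closed?', f.closed)
shutil.rmtree(folder)         # clean up the scratch folder
```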

## Writing structured data
You can convert data to json, yaml, or another format to write to file and easily read it back in later.  

**Json** is a format you'll use a lot for web queries.  You'll notice that json looks very similar to python code.  It's more strict though - no comments allowed, less flexible about quoting, etc. 
**Yaml** is a format that's friendly for editing with a text editor, like for a config file.
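
To see how json differs from python code, here is a short sketch with a made-up `settings` dictionary. Watch what happens to the quotes, to True and None, and to the tuple:

```python
import json

settings = {'name': 'demo', 'enabled': True, 'limit': None, 'sizes': (1, 2)}

text = json.dumps(settings)
print(text)

# Reading it back gives python objects again, but the tuple comes back as a list.
round_trip = json.loads(text)
print(round_trip)
```

The json text uses double quotes only, writes True as true and None as null, and has no tuple type, so a round trip does not always give back exactly what went in.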

In [10]:

import json

data = [{'name': 'Stanley', 'math': '85', 'science': '90', 'history': '92'},
        {'name': 'Casey', 'math': '75', 'science': '80', 'history': '85'},
        {'name': 'Taylor', 'math': '95', 'science': '100', 'history': '100'}]

text_data = json.dumps(data)  # dumps is "dump string"
output_file = './stundent_grades.json'
with open(output_file, 'w') as f:
    f.write(text_data)

We can actually skip the intermediary step of calling json.dumps and let json work directly with the file handle.  Either way is okay, but the above is nice to see since it looks like the first simple text example and you can use that dumps function for web stuff too. 

In [None]:
with open(output_file, 'w') as f:
    json.dump(data, f)
    # notice we're using "data", not "text_data" here

# Reading Files
Reading file data works almost exactly like writing the data.

Note that pandas has its own functions for reading in .csv, .dat, .xlsx files that we'll look at next week. 

In [8]:
input_file = './simple_text.txt'
with open(input_file, 'r') as f:
    text = f.read()
# The file is automatically closed when the with indented block ends
print("We read this from the file:")
print(text)

We read this from the file:
The quick brown fox jumps over the lazy dog!
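
For files with many lines, a common alternative to reading everything at once is to loop over the file handle, which gives one line at a time. A minimal sketch, using a made-up three-line file in a temporary folder:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'notes.txt')
    with open(path, 'w') as f:
        f.write('first\nsecond\nthird\n')

    stripped_lines = []
    with open(path, 'r') as f:
        for line in f:                            # one line per pass, newline included
            stripped_lines.append(line.strip())   # strip() drops the trailing newline

print(stripped_lines)
```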



In [13]:
input_file = './stundent_grades.json'
with open(input_file, 'r') as f:
    new_data = json.load(f)
print("We read this from the file:")
for student_dict in new_data:
    print(student_dict)

We read this from the file:
{'name': 'Stanley', 'math': '85', 'science': '90', 'history': '92'}
{'name': 'Casey', 'math': '75', 'science': '80', 'history': '85'}
{'name': 'Taylor', 'math': '95', 'science': '100', 'history': '100'}


#### *Exercise*:

Let's make a journaling tool that we can use to record things we've learned each day in class. We'll use two cells for this.

In the first cell, we can create new journal entries.  Use open(journal_file, 'a') with an **a** to append to the file so that if you are making multiple entries on the same date, you don't overwrite one (opening with 'w' or 'w+' would erase the existing contents).   An entry is a single line of text from the input function.

In the second cell, get a list of journal files using os.listdir and open them one at a time to print out the journal entries. 

For extra credit, store the journal entries in json format - you might need to open the file, read the data, add to the data, and rewrite the entire file to store an entry... O_O 
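
If the file modes are hard to keep straight, this sketch shows the difference between writing and appending, using a throwaway file in a temporary folder (the file name and entries are made up):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'modes.txt')

    with open(path, 'w') as f:
        f.write('first entry\n')
    with open(path, 'a') as f:      # 'a' keeps what is there and adds to the end
        f.write('second entry\n')
    with open(path, 'r') as f:
        after_append = f.read()

    with open(path, 'w') as f:      # 'w' (and 'w+') empty the file before writing
        f.write('only entry\n')
    with open(path, 'r') as f:
        after_write = f.read()

print(repr(after_append))
print(repr(after_write))
```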

In [None]:
# ADDING NEW JOURNAL ENTRIES
from datetime import datetime

# We use a file with today's date in the name
todays_date = datetime.now().strftime("%Y-%m-%d")
journal_file = f'./journal_{todays_date}.txt'
# We write some text to the file

# open the file for writing and write the text to it. 

In [None]:
# READING JOURNAL ENTRIES
import os

all_files = os.listdir('.')

# use a list comprehension to filter the files to only the journal files
journal_files = ...

# iterate over each of the files, open them, print the date from the filename, and print the contents of the file

# Week 2 Turtle Challenge
Note - you can find example code for running "turtle" in the A-Getting_Started notebook. 

This time, let's use the power of loops to make shapes of arbitrary numbers of sides with only a few lines of code!

#### *Exercise*:
**Level 1**
* Create variables for number of sides and size.  
* Calculate the turn angle based on the number of sides.
* Use a loop to draw the shape described by the variables

**Level 2**
* Create a dictionary that describes a variety of shapes, like {'big square': (4, 100), 'small triangle': (3, 50), ...}
* Create a list that holds a few of the shapes listed ['big square', 'small square', ...]
* Use a loop to go through the list of shapes to make, look up the values needed to make each shape, and draw the shapes one at a time. 

**Level 3**
* Create a config file for storing different shapes.  Read it in to get the shapes data. 
* Use the list of shapes as above to say which to make from the config file.

In [None]:
# Copy the base turtle code from last week's notebook to start.