# Control Structures

## Recall

Last unit we learned about **operators** and data-**types**. **Operators** like ```+```, ```/``` or ```<``` allow us to work with **variables** containing different **values**. With the help of **lists** we can group **ints**, **floats**, **strs** and other data-**types** together.

The combination of these allowed us to formulate most of the building blocks for our final algorithm.

1. [ ] For every file we do the following:
    1. [x] We open the file ```csv_file = open("./data/Day_1_dish_1_zoom_3.csv")```
    2. [x] We figure out what day and dish it is ```_, day, _ , dish_number, _, zoom_factor = csv_file_name.split("_") ```
    3. [x] We create a counter for the number of cells ```cell_counter = 0``` 
    4. [x] We create a counter for the area covered by the cells ```cell_area_counter = 0```
    5. [ ] We ignore the first line 
    6. [ ] For every line we do the following:
        1. [x] We increase the cell counter ```cell_counter += 1```
        2. [x] We add the cell area to the cell-area counter ```cell_area_counter += cell_area```
    7. [x] We save the cell-counter ```cell_counter_dish_1_list.append(cell_counter)```
    8. [x] We save the area counter ```cell_area_counter_dish_1_list.append(cell_area_counter)```

To fill in the missing gaps and solve Bob's cell-counting problem we also have to recall the first unit. In the first unit we used ```if_less``` and ```goto``` to jump around in our pseudo-assembly-code. We now want to learn how this is done in Python.

## If-statement
The if-statement is rather simple. It consists of the keyword ```if``` followed by something that is or can be converted into a **bool**, a ```:``` and then an indented block of further instructions. These instructions are executed if the condition between the keyword and the ```:``` is ```True```.
Here is a code snippet illustrating the use of the if-statement.
```Python
if True:
	print("Hello")
if False:
	print("world!")
```

Please predict what this snippet will print and then try it in the next block.

In [None]:
# Copy the code here
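
What "can be converted into a **bool**" means is easiest to see with a few values. The following sketch uses the built-in ```bool``` **function** to show the conversion Python performs for the condition of an if-statement; the variable names are only chosen for illustration.

```Python
# Empty lists, empty strs, 0 and None convert to False,
# most other values convert to True
empty_list_as_bool = bool([])
filled_list_as_bool = bool([1, 2, 3])
zero_as_bool = bool(0)
text_as_bool = bool("Hello")

print(empty_list_as_bool, filled_list_as_bool, zero_as_bool, text_as_bool)

# The same conversion happens if we put a list directly behind the if
cell_areas = []
if cell_areas:
    message = "We have at least one cell area"
else:
    message = "The list is empty"
print(message)
```

This is why a condition does not have to be a comparison; any **value** can stand behind the ```if```.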

The if-statement can be expanded with two optional statements: ```else``` and ```elif```, short for “else-if”. The elif-code is executed if none of the conditions before it was ```True``` and its own condition is ```True```. The else-code is executed if no if- or elif-code was executed.
Here is a short snippet to demonstrate the use of an if-statement with ```elif``` and ```else```:
```Python
a = 5
b = 6
if a < b:
	print("a is smaller than b")
elif a == b:
	print("a is equal to b")
else:
	print("a is bigger than b")
```

Copy the code into the next box and change ```a``` and ```b``` until all three messages have been printed.


In [None]:
# Copy the code here and then modify it

## While loop
The next statement is the while-loop. It consists of the keyword ```while``` followed by something that can be converted into a **bool**, a ```:``` and then an indented block of instructions. These instructions are executed again and again while the condition between the keyword and the ```:``` remains ```True```.
Here is a short example of a while loop:
```Python
counter = 0
while counter < 20:
	print(counter)
	counter += 1
```

Please predict what this code will print before you copy and execute it.

In [None]:
# Copy the code here
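
A while-loop is most useful when we do not know in advance how often the loop has to run. The following sketch counts how many doublings a hypothetical cell population needs to pass a threshold; the numbers are made up for illustration.

```Python
cell_count = 1
threshold = 1000
doublings = 0
# Keep doubling until we have reached the threshold
while cell_count < threshold:
    cell_count = cell_count * 2
    doublings += 1
print(doublings)
print(cell_count)
```

Note that the condition is checked before every cycle, so the loop stops as soon as ```cell_count``` is no longer smaller than ```threshold```.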

## For loop
The last control-structure is the for-loop. It works similarly to the while-loop, but it iterates over a sequence like a **list** or **tuple**. It consists of the keyword ```for```, the variable name the current element will have, the keyword ```in```, a sequence (e.g. a **list**), a ```:``` and an indented block of instructions.
Here is an example of a for-loop:
```Python
elements = ["Hello", "", "world", "", "!", 42, 3.0, True]
for element in elements:
	print(element)
```
Please predict the output of this code snippet before executing it in the next block.

In [None]:
# Copy the code here
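
To see how the for-loop relates to the while-loop, here is a sketch of the same iteration written by hand with an index. It uses the **function** ```len```, which gives us the number of elements in a **list**; the **list** ```visited``` only exists so we can check afterwards what the while-loop has seen.

```Python
elements = ["Hello", "world", 42]

# The for-loop hands us one element after the other
for element in elements:
    print(element)

# Roughly the same with a while-loop and an index we manage ourselves
visited = []
index = 0
while index < len(elements):
    element = elements[index]
    print(element)
    visited.append(element)
    index += 1
```

The for-loop does the bookkeeping of the index for us, which is why it is usually the more readable choice for sequences.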

## Break and continue
Within loops two special statements can be used: the ```break```-statement, which breaks out of the loop, and the ```continue```-statement, which jumps to the beginning of the next loop cycle. Since they change the flow of the loop, they are almost always encountered within an if-statement.
Here is an example of a for-loop with ```break``` and ```continue```.
```Python
elements = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for element in elements:
	if element in [2, 3, 5, 7]:
		continue
	if element > 8:
		print("Breaking")
		break
	print(element)
```

Please predict what this code prints before you execute it and compare your prediction to the actual results.

In [None]:
# Copy the code here
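
A typical use of ```break``` is to stop searching once we have found what we were looking for. Here is a small sketch with made-up cell areas:

```Python
cell_areas = [12.5, 8.0, 30.2, 4.1, 55.0]
first_large_area = None
for cell_area in cell_areas:
    if cell_area < 20:
        # Too small, look at the next one
        continue
    first_large_area = cell_area
    # We found one, so the rest of the list does not matter
    break
print(first_large_area)
```

The value ```55.0``` is never looked at, because the loop has already been left.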

## Range and len
We often want to iterate over a range of numbers. We can use a for-loop and ```range``` for this. ```range``` is a **function** taking 3 **arguments**: ```start```, ```end``` and ```step```. So if we write ```small_numbers = range(0, 10, 1)``` we get all numbers from ```0``` to ```9```, meaning ```end``` is not included in the range.
This may seem weird, but it can be attributed to the way **lists** work. Since **lists** start at ```0```, a list with 10 elements is indexed by ```0```, ```1```, ```2```, ```3```, ```4```, ```5```, ```6```, ```7```, ```8``` and ```9``` or ```range(0, 10, 1)```.
This becomes more convenient if we use ```len```. ```len``` is a **function** that returns the length of its **argument**.
Now let us see both of them in action:
```Python
names = ["John Doe", "Erika Musterfrau", "Max Mustermann", "Karl Dosenkohl", "Hein Janmaat", "Juan Pérez", "Kalle Svensson", "Fred Bloggs"]
for index in range(0, len(names), 1):
	print(names[index])
```

Please copy and execute the code, then adapt it so that only every second name is printed.

In [None]:
# Copy the code here

<details>
  <summary>Click to reveal solution</summary>

```Python
names = ["John Doe", "Erika Musterfrau", "Max Mustermann", "Karl Dosenkohl", "Hein Janmaat", "Juan Pérez", "Kalle Svensson", "Fred Bloggs"]
for index in range(0, len(names), 2):
    print(names[index])
```

</details>
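
To look at the numbers a ```range``` produces we can convert it into a **list**. The sketch below also shows that ```start``` and ```step``` may be left out, in which case Python uses ```0``` and ```1```.

```Python
small_numbers = list(range(0, 10, 1))
print(small_numbers)

# start and step are optional
same_numbers = list(range(10))
print(same_numbers)

# A step of 2 gives us every second number
even_numbers = list(range(0, 10, 2))
print(even_numbers)
```

Writing out all three **arguments**, as we do in this unit, is longer but leaves no doubt about what is meant.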

## Functions
Having used ```range``` and ```len```, let us talk more about what they are: **functions**. From a language perspective **functions** are similar to **operators**: they take a number of **values** and often become/**return** a **value**. So instead of ```sum = a + b``` we might write ```sum = add(a, b)```.
From a code-structure perspective they are organizational units or **abstractions** that combine multiple lines of code into a single thing. They are therefore constructed from other **functions** and **operators**. Let us write an ```add``` **function** so we can investigate its parts.
```Python
def add(a, b):
	sum = a + b
	return sum
```
As you can see, a **function** looks quite similar to the other control-structures. It starts with the keyword ```def``` (like define) followed by the name of the **function**, a ```(```, the **arguments** of the **function**, a ```)```, a ```:``` and an indented block of instructions.

The name of the function is used to call it later, so our example **function** is called ```add``` and can be called like ```c = add(2, 4)```. The **arguments** are what goes into a **function**, like food goes into your mouth or raw material into a factory. Often they are processed into a final product that is **returned**, but some functions only modify their **mutable** inputs, like adding **values** to a **list**.
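
The difference between returning a **value** and modifying a **mutable** input can be sketched as follows. The **function** ```save_measurement``` and the numbers are made up for this illustration.

```Python
def add(a, b):
    # Processes the inputs into a new value and returns it
    return a + b

def save_measurement(measurements, value):
    # Modifies the list we put in and returns nothing
    measurements.append(value)

c = add(2, 4)
areas = []
save_measurement(areas, 12.5)
save_measurement(areas, 20.0)
print(c)
print(areas)
```

After the two calls the **list** ```areas``` outside of the **function** contains both **values**, although we never assigned anything to it.
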
Let us practice this by creating a new **function**. It will be the [fizz-buzz](https://de.wikipedia.org/wiki/Fizz_buzz)-**function**. It is supposed to print “Fizz” if the number we put in is divisible by 3, “Buzz” if it is divisible by 5 and “Fizz Buzz” if both is the case. Otherwise it prints the number itself. This is a common test in programming interviews and a nice example. To test if something is divisible we use ```%```, which gives us the rest (remainder) of a division.
So let us begin by defining our **function**. Its name should obviously be “fizzBuzz” and its argument a number. So we write:
```Python
def fizzBuzz(number):
```
Now we have to do something in it. Let us first get the rest of the division by ```3``` and print it out to test our function.
We are interested in the rest of the division, because it is zero if the number is divisible by ```3```.

```Python
def fizzBuzz(number):
	rest_division_by_three = number % 3
	print(rest_division_by_three)
# We also have to call the function so it gets executed
fizzBuzz(5)
```
Now copy the code and predict what it prints. Afterwards get the rest for a division by 5.

In [None]:
# Your code goes here

<details>
  <summary>Click to reveal solution</summary>

```Python
def fizzBuzz(number):
    rest_division_by_three = number % 3
    rest_division_by_five = number % 5
# We also have to call the function so it gets executed
fizzBuzz(5)
```

</details>

Now that we have the rest we should try to print “Fizz Buzz” if the number is divisible by ```3``` and ```5```, “Fizz” if it is divisible by ```3```, “Buzz” if it is divisible by ```5```, else we just print the number.

Please recall the comparison-**operators** from the last unit and what we learned so far to adapt the function, so it prints what was described above:

In [None]:
# Your code goes here

<details>
  <summary>Click to reveal solution</summary>

```Python
def fizzBuzz(number):
    rest_division_by_three = number % 3
    rest_division_by_five = number % 5
    divisible_by_three = rest_division_by_three == 0
    divisible_by_five = rest_division_by_five == 0

    if divisible_by_three and divisible_by_five:
        print("Fizz Buzz")
    elif divisible_by_three:
        print("Fizz")
    elif divisible_by_five:
        print("Buzz")
    else:
        print(number)
# We also have to call the function so it gets executed
fizzBuzz(5)
```

</details>

Now that we have a working function we should test it by running it on a larger set of numbers. Please build a loop around your **function** so it runs on all integers from 0 to 100.

In [None]:
# Your code goes here

<details>
  <summary>Click to reveal solution</summary>

```Python
def fizzBuzz(number):
    rest_division_by_three = number % 3
    rest_division_by_five = number % 5
    divisible_by_three = rest_division_by_three == 0
    divisible_by_five = rest_division_by_five == 0

    if divisible_by_three and divisible_by_five:
        print("Fizz Buzz")
    elif divisible_by_three:
        print("Fizz")
    elif divisible_by_five:
        print("Buzz")
    else:
        print(number)

# Create a for loop from 0 until 100
for number in range(0, 101, 1):
    fizzBuzz(number)
```

</details>

The last step before we can fully utilize **functions** is the keyword ```return```. Similar to ```break``` in loops, ```return``` signals that the function should be left, with a little twist: the **function** becomes or returns the value behind the ```return```. If there is nothing behind it, the **function** returns ```None```.
To better illustrate this we will create a **function** that checks if a number is even and returns the result as a **bool**.

```Python
def is_even(number):
	divisible_by_two = number % 2 == 0
	return divisible_by_two
```

Now we will use a for-loop to print out all even numbers between 0 and 20:

```Python
for number in range(0, 21, 1):
	if is_even(number):
		print(number)
```

Please copy both samples in the next cell and execute them.


In [None]:
# Combine the code here

<details>
  <summary>Click to reveal solution</summary>

```Python
def is_even(number):
    divisible_by_two = number % 2 == 0
    return divisible_by_two

# Print all even numbers between 0 and 20
for number in range(0, 21, 1):
    if is_even(number):
        print(number)
```

</details>
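
Two details about ```return``` are worth a quick sketch: a **function** without ```return``` gives us ```None```, and ```return``` leaves the **function** immediately, so the lines after it are skipped. The function names are only illustrative.

```Python
def print_only(number):
    # No return, so the function returns None
    print(number)

def describe(number):
    if number % 2 == 0:
        return "even"
    # Only reached if we did not return above
    return "odd"

result = print_only(3)
print(result)
print(describe(4))
print(describe(7))
```

This is also why ```describe``` does not need an ```else```: if the number is even the **function** has already been left.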

Congratulations, you know all the necessary building blocks to write simple programs. 

## Writing a program

Writing a program consists of a few steps:
1. **Design**: Here we figure out what our program needs and how it should run. We did this from unit one until now.
2. **Skeleton/smallest parts**: You should always begin with a very small part that is easy to understand and write.
3. **Test regularly**: As soon as you can run your code you should do so. This helps you to find mistakes while it is still small. Later you may wish to write [automated tests](https://en.wikipedia.org/wiki/Test_automation) and expand into [test-driven-development](https://en.wikipedia.org/wiki/Test-driven_development).
4. **Get feedback**: After you have written something ask another competent person to look at your solution, they might find mistakes you did not see.
5. **Incremental improvement**: Do not try to solve your problem as a whole. Work **function** by **function**, **line** by **line**, otherwise you will be overwhelmed and confused by your own work.

### Skeleton

Now that all the building blocks are combined together we can return to our algorithm and fill out the missing steps:
1. [x] For every file we do the following: ```for csv_file in csv_files:```
    1. [x] We open the file ```csv_file = open("./data/Day_1_dish_1_zoom_3.csv")```
    2. [x] We figure out what day and dish it is ```_, day, _ , dish_number, _, zoom_factor = csv_file_name.split("_") ```
    3. [x] We create a counter for the number of cells ```cell_counter = 0``` 
    4. [x] We create a counter for the area covered by the cells ```cell_area_counter = 0```
    5. [x] For every line we do the following: ```for line in csv_file_handle:```
        1. [x] We ignore the first line ```if line_counter != 0:```
        2. [x] We increase the cell counter ```cell_counter += 1```
        3. [x] We add the cell area to the cell-area counter ```cell_area_counter += cell_area```
    6. [x] We save the cell-counter ```cell_counter_dish_1_list.append(cell_counter)```
    7. [x] We save the area counter ```cell_area_counter_dish_1_list.append(cell_area_counter)```
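
Step 1.2 relies on ```split```. The following sketch shows what it yields for our file name; note that the path and the file ending stay attached to the first and the last piece. Removing ```.csv``` with ```replace``` is one possible way to clean up the zoom factor and not part of the plan above.

```Python
csv_file_name = "./data/Day_1_dish_1_zoom_3.csv"
parts = csv_file_name.split("_")
print(parts)

# All pieces are strs, the pieces we do not need go into _
_, day, _, dish_number, _, zoom_factor = parts
# zoom_factor still contains the file ending
zoom_factor = zoom_factor.replace(".csv", "")
print(day, dish_number, zoom_factor)
```

Since ```day``` and ```dish_number``` are **strs**, we have to convert them with ```int``` if we want to calculate with them.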

Now these are a lot of steps, so let us try to split them up into some simple components:

```Python
csv_files = [
    "./data/Day_1_dish_1_zoom_3.csv"
]

# Create something to save the dishes
dishes = {}

def process_csv(csv_file, dishes):
    # Here we have to do the hard work,
    # but first we print out the arguments to see if our function is called correctly
    print(csv_file)
    print(dishes)
    return

# Go through all files
for csv_file in csv_files:
    process_csv(csv_file, dishes)
```

This is what is often called a skeleton implementation. The rough structures are here but the details are not fleshed out. 

### Test

Please run the code above to see if there are any errors in it.

In [None]:
# Copy code to test here

### Flesh out the skeleton

After testing it, we can begin to flesh out our ```process_csv```-**function**. First we write down our plan as comments, before we implement it line by line, incrementally approaching the final function.

```Python
def process_csv(csv_file, dishes):
    # 1. Open the file
    # 2. Figure out what day and dish it is
    # 3. Create a counter for the cells
    # 4. Create a counter for the area covered by the cells
    # 5. For every line we do:
    #   5.1 Ignore the first line
    #   5.2 Increase cell counter
    #   5.3 Add area to area-counter
    # 6. Save cell counter
    # 7. Save area cell counter
    return
```

Since we already did our research regarding what we need to use we can now begin to replace the comments with code.

```Python
def process_csv(csv_file, dishes):
    # 1. Open the file
    with open(csv_file, "r") as csv_file_handle:
        # 2. Figure out what day and dish it is
        # 3. Create a counter for the cells
        # 4. Create a counter for the area covered by the cells
        # 5. For every line we do:
        #   5.1 Ignore the first line
        #   5.2 Increase cell counter
        #   5.3 Add area to area-counter
        # 6. Save cell counter
        # 7. Save area cell counter
        # A block must not be empty, so we use pass until we add real code
        pass
    return
```

Please use the cell below to complete our little program. Remember the steps above and work line by line. Use ```print``` to check your results. In the end use ```print(dishes)``` to check if you were successful.

In [None]:
# Write your code here

<details>
  <summary>Click to reveal solution</summary>

```Python
csv_files = [
    "./data/Day_1_dish_1_zoom_3.csv"
]

# Create something to save the dishes
dishes = {}

def process_csv(csv_file, dishes):
    with open(csv_file, "r") as csv_file_handle:
        _, day, _ , dish_number, _, zoom_factor = csv_file.split("_") 
        cell_counter = 0
        cell_area_counter = 0
        line_counter = 0
        for line in csv_file_handle:
            if line_counter != 0:
                cell_counter += 1
                cell_id, nucleus_x, nucleus_y, nucleus_area, cell_area, center_of_area_x, center_of_area_y = line.split(",")
                cell_area_counter += float(cell_area)
            line_counter += 1
        if dish_number not in dishes.keys():
            dishes[dish_number] = {}
        dishes[dish_number][day] = {
            "cell_count": cell_counter,
            "area": cell_area_counter
        } 
    return

# Go through all files
for csv_file in csv_files:
    process_csv(csv_file, dishes)

print(dishes)
```

</details>
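
Counting lines is one way to ignore the first line. Another option, sketched below, is to take the header out with the built-in ```next``` before the loop starts. To keep the sketch self-contained it reads from an ```io.StringIO``` object instead of a real file; the header names and the numbers in it are invented for illustration and need not match Bob's files.

```Python
import io

# Invented example content standing in for an opened csv-file
csv_file_handle = io.StringIO(
    "id,nucleus_x,nucleus_y,nucleus_area,cell_area,center_x,center_y\n"
    "1,10,12,3.5,20.0,11,13\n"
    "2,40,42,4.0,30.5,41,43\n"
)

# Read the header line once, so the loop starts at the second line
header = next(csv_file_handle)

cell_counter = 0
cell_area_counter = 0
for line in csv_file_handle:
    fields = line.split(",")
    cell_counter += 1
    # The 5th field has the index 4
    cell_area_counter += float(fields[4])

print(cell_counter, cell_area_counter)
```

Both approaches give the same result; which one you prefer is a matter of readability.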

## Readable code

Now we have a working script analyzing one file. If you recall Bob's cell-counting problem from the first unit, we expect him to have multiple files and arrange them into **lists**. First we should try to build the **lists** from the **dict** we already have. How we do this is a matter of style and taste. This means there are multiple ways, each with costs and benefits; the most relevant criterion for you is **readability**.

**Readability** refers to the ability of a reader to understand the text/code. As a general rule, if the reader needs more than half of the time to read your text that you needed to write it, you goofed. If they need as long to read it as you needed to write it, you should seriously consider professionalizing your writing style.

I mention this because academia has a relevant fraction of fully self-taught “programmers” who believe that hard work is required to understand code. They usually conclude that whoever fails to understand their incoherent excuse for code “just cannot program”. Industrial and trained wisdom usually attributes a failure to understand code to incomplete documentation or a lack of “domain-knowledge”, instead of a lack of programming skill. “Domain-knowledge” refers to knowledge about the subject, like cancer-cells, astrophysics or neuroscience. In other words, if someone fails to understand your code the following explanations are seen as legitimate:

1.	They do not understand the area the code is applied in (e.g. they do not know what cancer-cells are)
2.	They do not know basic language structures (e.g. for-loop)
3.	The code is not written well enough (It is usually this)

Please remember this if people ask you how your code works. It usually means you failed to explain your goal and methods well enough. 
So how do you make your code more readable:

- Use clear **variable** and **function** names. So your cell-area should be called ```cell_area``` and not ```car```.
- Try to write simple lines. One line should ideally do one to three things; if it does more, split it.
- Use intermediate values. So if you add a few numbers before another step, introduce a **variable** storing the sum.
- Keep your functions short, they should fit on a small laptop screen.
- Use comments or [docstrings](https://peps.python.org/pep-0257/) for functions to explain what goes in, what goes out and what the function should achieve.
- Use comments in the code to explain why things are done a certain way.
- Use comments in the code to reference papers or online sources you read to understand the code.
- Use comments in the code to introduce concepts that might be novel to another programmer.
- Use a [styleguide](https://google.github.io/styleguide/pyguide.html) once you start a bigger project.
- Avoid writing [unmaintainable code](https://github.com/Droogans/unmaintainable-code).
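
To make some of these points concrete, here is a sketch of the same calculation written twice. Both **functions** are made up for this comparison, and dividing by the squared zoom factor is only an assumption for the sake of the example (it treats the zoom factor as a linear magnification), not something Bob's data requires.

```Python
# Hard to read: unclear names, everything in one line
def f(l, z):
    return sum(l) / len(l) / z / z

# Easier to read: clear names, intermediate values and a docstring
def mean_cell_area(cell_areas, zoom_factor):
    """Return the mean of cell_areas divided by the squared zoom_factor."""
    total_area = sum(cell_areas)
    mean_area = total_area / len(cell_areas)
    # Assumption: an area grows with the square of the magnification
    area_scale = zoom_factor * zoom_factor
    return mean_area / area_scale

print(f([18.0, 36.0], 3))
print(mean_cell_area([18.0, 36.0], 3))
```

Both versions calculate the same number, but only the second one tells the reader what the number means and why it is calculated that way.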

If your code starts reading like a paper with references to different web-resources and additional explanations you are doing things mostly right.
Jupyter Notebook is designed to support this approach, known as [literate programming](https://en.wikipedia.org/wiki/Literate_programming).

Please take your solution and add comments to it. Then let someone else read the code, so they can tell you what is not fully clear to them. If they claim to fully understand your code they are most probably too shy to criticize, so keep asking until they find a flaw.

In [None]:
# Your code with comments should be here

<details>
  <summary>Click to reveal solution</summary>

```Python
# A list with all files that should be processed
csv_files = [
    "./data/Day_1_dish_1_zoom_3.csv"
]

# Create something to save the dishes
dishes = {}

def process_csv(csv_file, dishes):
    """!
        @brief This function reads in a specific csv-file and adds its contents to the dict dishes
        @details We assume that we get a csv-file with a header line and 7 fields per row.
            The name and contents of the csv-file should be defined as given in param, so we can extract the day and dish number.
            The contents are then stored in a dict passed as dishes defined in param.
        @param csv_file the path to the csv-file as a str, it should follow the form Day_[day]_dish_[dish]_zoom_[zoom].csv,
            where all [] denote numbers extracted as meta data.
            The 5th field should contain the area of the cells.
        @param dishes a dict that will be filled with the contents of the csv-file.
            The first layer of keys will be the dish-numbers, the values belonging to them are dicts.
            These dicts contain the days as keys and dicts as values. The lowest layer dicts contain
            the "cell_count" and "area" as keys. Their values are the total cell-count for that day and dish and the
            total area for that day and dish respectively.
        @return None
    """
    with open(csv_file, "r") as csv_file_handle:
        _, day, _ , dish_number, _, zoom_factor = csv_file.split("_")
        cell_counter = 0
        cell_area_counter = 0
        line_counter = 0
        for line in csv_file_handle:
            if line_counter != 0:
                cell_counter += 1
                cell_id, nucleus_x, nucleus_y, nucleus_area, cell_area, center_of_area_x, center_of_area_y = line.split(",")
                cell_area_counter += float(cell_area)
            line_counter += 1
        if dish_number not in dishes.keys():
            dishes[dish_number] = {}
        dishes[dish_number][day] = {
            "cell_count": cell_counter,
            "area": cell_area_counter
        } 
    return

# Go through all files
for csv_file in csv_files:
    process_csv(csv_file, dishes)

print(dishes)
```

</details>

## Looking at data

Now we know how to make the code easy to read. This is good for us as programmers, but we are also scientists working with data, so we should discuss the data too. What are data? Usually they are quantifiable observations or measurements, like the body-temperature of an animal or the number of times it pressed a button. So they are numbers. We have to present these numbers in a way that they can be used in the program and understood by us and our peers.

This means we have to learn how to look at our data and how to write them down. Usually the numbers belong together, like multiple temperature measurements of the same animal; in this case we could call this a [dimension](https://en.wikipedia.org/wiki/Dimension_(vector_space)). Dimension usually refers to something you can measure along, like time, temperature or button-presses. So dimensions correspond to variables. Imagine a dimension as a [number line](https://en.wikipedia.org/wiki/Number_line). On it goes everything an instrument produces, so if you have two thermometers you have two lines or two dimensions. If you have a clock this is another number line.

Every time you take readings from your instruments you write down all these numbers, getting a data point. So if you read off a clock and a thermometer you get a time and a temperature on two axes, a time axis and a temperature axis. If you read fast enough your points start to form patterns; if you connect them you get lines or plots.

If you want to compare two experiments or two things you lay the axes over each other. So you put the time and temperature axes of experiment 1 over those of experiment 2, giving you two overlapping plots. The question of when two points truly belong on the same axis and are in the same dimension depends on the question you ask, but the principle stays the same.
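
In code a data point can be sketched as a **tuple** with one **value** per dimension. The readings below are invented purely for illustration.

```Python
# Each data point is (time in minutes, temperature in °C)
data_points = [(0, 36.5), (10, 36.8), (20, 37.1)]

# Split the points into one list per axis
times = []
temperatures = []
for time, temperature in data_points:
    times.append(time)
    temperatures.append(temperature)

print(times)
print(temperatures)
```

The two **lists** ```times``` and ```temperatures``` are the two axes; the position within the **lists** tells us which **values** belong to the same data point.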

In general we have to distinguish between continuous data like temperature and discrete data like the number of button presses. Discrete means that there are steps or boxes, so in our case we have dish-1, dish-2, dish-3 but no dish-2.23, and the dishes are therefore discrete. The same is true for the number of cells, as there can be no half or quarter cell: either there is a cell or there is none. Of course such a distinction becomes tricky if the cells start to divide, but these kinds of problems you have to analyse yourself.

As an exercise please think about the cancer-cell-dishes Bob wants to analyze. How many dimensions do you find there, and are they discrete or continuous? Please note your results in the next cell.

<details>
  <summary>Click to reveal possible solution</summary>

Remember looking at data is interpretation and while there are many wrong interpretations, there are also many right ones. You will learn with time which perspective or interpretation is beneficial to your work.

For each day and dish we can measure a cell-count and cell-area, so we have four dimensions:
- Days
- Dishes
- Cell-count
- Cell-area

Dishes and cell-count are discrete by nature as there are no half dishes or half cells. The latter is a conscious choice we made.
By nature time is continuous, but Bob decided to measure once a day, so for us the time measured in days is discrete.
The cell-area itself is by nature continuous, but the pictures contain pixels which are discrete, therefore the area became discrete.
We however have a zoom which is according to our knowledge not discrete, so the area we have should be treated as continuous.

This last point is rather subtle and you may have rightfully arrived at a different conclusion.

</details>

## Structuring data

Now that we have identified our dimensions, we can begin to store them. Usually you can split your dimensions/variables into controlled and resulting ones. Controlled variables are the ones you or the experiment designer decide on, like the number of days or the number of dishes. Resulting variables are usually what we are interested in, like total cell-area or cell-count. These are usually hypothesized to [depend](https://en.wikipedia.org/wiki/Dependent_and_independent_variables) on the controlled ones.

If you store your data you can rely on the fact that there is a [countable](https://en.wikipedia.org/wiki/Countable_set) and practically finite set of measurements. Countable means we can assign every measurement a number, so we can store them in a **list**. So the simplest way to store measurements is in **lists** sorted by the precise time of recording. This is usually not really helpful, so we return to the controlled variables and think about them.

We should store our data in a way that lets us access it by the controlled variables. So we store by day and dish number. Since both are contiguous, meaning there are no gaps in our discrete numbers, we can store both of them in **lists**. So we store our data in two dimensions, day and dish.

Lastly we have to distinguish between the area and the cell-count. We currently do this by storing them in a **dict** with two entries. The fact that there are two entries means that we can understand the dict-entries as another dimension, so our data can be stored in a three dimensional structure, which might look similar to the next image from [wikipedia](https://upload.wikimedia.org/wikipedia/commons/thumb/7/71/Epsilontensor.svg/640px-Epsilontensor.svg.png).

![Epsilon-tensor](https://upload.wikimedia.org/wikipedia/commons/thumb/7/71/Epsilontensor.svg/640px-Epsilontensor.svg.png)

In our case it would be just two tables stacked behind each other, corresponding to the area and the cell-count. We now have to ask ourselves how we should store this three dimensional data to keep it easily accessible. The obvious choice is to distinguish between cell-area and cell-count first; not so obvious is how we wish to distinguish them. The simple approach is to use a **list**.
```Python
cells = [cell_area, cell_count]
```
The problem is obviously that ```cells[0]``` and ```cells[1]``` do not really tell us what we get, so we should use a **dict**.

```Python
cells = {"area": cell_area, "count": cell_count}
```
Now ```cells["area"]``` and ```cells["count"]``` are in my opinion much more readable than just numbers, so I think we should choose this option.

Next we should think about how we should store the contents of the tables. Here I would advocate for **lists** so we can iterate through them. The question is whether we should first select the dish and then the day or first the day and then the dish.
The consideration here is what we will more likely access together: all data of one dish or all data of one day. The dishes are physically separated and Bob wants to investigate the growth, so we may wish to see changes over time, meaning that the contents of a dish should stay together. This means we will have a **list** that contains the dishes, and the dishes themselves are **lists** containing the values for the days. In code this would look like this:

```Python
# Creation of the lists
cells = {
	"area": [
		# Dish 1
		[
			# Day 1
			12,
			# Day 2
			20,
			# Day 3
			43,
			# ...
		],
		# Dish 2
		# ...
	],
	"count": [
		# ...
	]
}

# Accessing the cell-count of dish 2 on day 9
# Remember lists start at 0
# (this only works once all dishes and days are filled in)
cell_count = cells["count"][1][8]
```
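
Because all days of a dish sit in one **list**, following the growth of a dish is a simple loop. The sketch below uses invented numbers for two dishes and three days.

```Python
cells = {
    "area": [
        [12, 20, 43],  # Dish 1, days 1 to 3
        [10, 18, 30],  # Dish 2, days 1 to 3
    ],
    "count": [
        [3, 5, 9],  # Dish 1, days 1 to 3
        [2, 4, 7],  # Dish 2, days 1 to 3
    ],
}

# Cell-count of dish 2 on day 3, remember lists start at 0
cell_count = cells["count"][1][2]
print(cell_count)

# Growth in area of every dish from the first to the last day
growth_per_dish = []
for dish_areas in cells["area"]:
    last_index = len(dish_areas) - 1
    growth = dish_areas[last_index] - dish_areas[0]
    growth_per_dish.append(growth)
print(growth_per_dish)
```

Had we chosen to select the day first, the values of one dish would be spread over several **lists** and this loop would be harder to write.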

Please grab a piece of paper if you have one and write down some example data "(dish, day):(area, count)". Fill them in with numbers for dishes 1-3 and days 1-3. Now copy them once for area "(dish, day):(area)" and once for count "(dish, day):(count)". Lastly transfer them into tables with the dish as the major or vertical and the day as the minor or horizontal coordinate. Using glue and scissors is encouraged. Now put both tables behind each other to visualize the data-structure we want to create.

## Data transformation

Now how do we get from the ```dishes``` we created earlier to the ```cells``` we want?

Remembering the last exercise and the example, we need two entries in a **dict**, which will contain **lists** representing dishes, which in turn contain **lists** representing days with entries.
So we first create the **dict** with the **lists**.
```Python
area = []
count = []
cells = {"area": area, "count": count}
```

Then iterate over the dishes to get their contents:
```Python
for dish_number in range(1, len(dishes) + 1, 1):
    dish = dishes[str(dish_number)]
```

We need a new **list** for every dish, which will hold the values of its days, so we create them:
```Python
    dish_area = []
    dish_count = []
```
Lastly we iterate over the days and ```append``` the values. Please do this and complete the transformation. Use ```print(cells)``` in the end to confirm you achieved your goal.

In [None]:
# Write your code here

<details>
  <summary>Click to reveal solution</summary>

```Python
area = []
count = []
cells = {"area": area, "count": count}
# We know that the dishes are numbered so we iterate over them
for dish_number in range(1, len(dishes) + 1, 1):
    dish = dishes[str(dish_number)]
    dish_area = []
    dish_count = []
    # We know that the days in the dishes are numbered
    for day_number in range(1, len(dish) + 1, 1):
        value_pair = dish[str(day_number)]
        day_area = value_pair["area"]
        day_count = value_pair["cell_count"]
        dish_area.append(day_area)
        dish_count.append(day_count)
    area.append(dish_area)
    count.append(dish_count)
print(cells)
```

</details>
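
To check a transformation like this it helps to run it on a small hand-made input whose result we can predict. The sketch below wraps the solution in a hypothetical **function** ```dishes_to_cells``` and feeds it an invented ```dishes``` **dict** with two dishes and two days.

```Python
def dishes_to_cells(dishes):
    area = []
    count = []
    # We assume dishes and days are numbered from 1 without gaps
    for dish_number in range(1, len(dishes) + 1, 1):
        dish = dishes[str(dish_number)]
        dish_area = []
        dish_count = []
        for day_number in range(1, len(dish) + 1, 1):
            value_pair = dish[str(day_number)]
            dish_area.append(value_pair["area"])
            dish_count.append(value_pair["cell_count"])
        area.append(dish_area)
        count.append(dish_count)
    return {"area": area, "count": count}

# Invented example data: dish -> day -> values
example_dishes = {
    "1": {"1": {"cell_count": 3, "area": 12.0}, "2": {"cell_count": 5, "area": 20.0}},
    "2": {"1": {"cell_count": 2, "area": 10.0}, "2": {"cell_count": 4, "area": 18.0}},
}
example_cells = dishes_to_cells(example_dishes)
print(example_cells)
```

If the printed structure matches what you wrote down on paper, the transformation most likely does what we want.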

## Reminder readability

Of course this solution is rather long, so let me present a shorter version to you:

```Python
count = [[dishes[dish_number][day]["cell_count"] for day in sorted(dishes[dish_number].keys(), key=int)] for dish_number in sorted(dishes.keys(), key=int)]
area = [[dishes[dish_number][day]["area"] for day in  sorted(dishes[dish_number].keys(), key=int)] for dish_number in sorted(dishes.keys(), key=int)]
cells = {"area": area, "count": count}
print(cells)
```

If you feel confused now, remember that this is how unreadable code feels. Please do not write it. If you return here after working with Python for a year these lines will be clear to you, because you will be used to writing like that. Whenever you write, please remember to ask yourself what your audience does or does not know and adapt your code accordingly.

I assume you have some questions. Please ask them now, so you have understood all relevant concepts before we move on to classes and using other people's code in the next unit.