## Python Review: Storing Data in the **List** and **Dict** Data Structures

Let's warm up our Python skills by practicing data organization with Lists and Dicts!



### Lists

| Code | Description |
| :-- | :-- |
| **`data = []`** | Makes an empty List | 
| **`data = [10, 11, 12]`** | Makes a List with 3 items: `10`, `11`, and `12` |
| **`data = ['a', 'b', 'c']`** | Makes a List with 3 items: `'a'`, `'b'`, and `'c'` |
| **`data[0]`** | Accesses the first element in the list |
| **`data[-1]`** | Accesses the last element in the list |
| **`data[1:]`** | Accesses the second through last elements in the list |
| **`data[1] = 7`** | Replaces the second element in the list with `7` |
| **`data.append(100)`** | Appends a new element `100` onto the end of the list |

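The operations in the table can be tried together in one short run. This is a minimal sketch using made-up values (the names `first`, `last`, `rest`, and `removed` are only for illustration); it also includes `.pop()`, which is used in the exercises below but is not listed in the table.

```python
# Hypothetical example data, not taken from the exercises.
data = [10, 11, 12]

first = data[0]        # 10, indexing starts at 0
last = data[-1]        # 12, negative indices count from the end
rest = data[1:]        # [11, 12], a new list; `data` itself is unchanged

data[1] = 7            # data is now [10, 7, 12]
data.append(100)       # data is now [10, 7, 12, 100]

removed = data.pop(0)  # removes and returns the first element (10)
print(data)            # [7, 12, 100]
```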


**Exercises**

**Example**: Make a list of three names of European countries.

In [1]:
countries = ['Germany', 'France', 'England']
countries

['Germany', 'France', 'England']

Make a list of three scientists.

In [2]:
scientists = ["Rose", "Einstein", "Newton"]

Display the second scientist in your list

In [3]:
scientists[1]

'Einstein'

Replace the third scientist with `"Wilhelm Wundt"`

In [4]:
scientists[2] = "Wilhelm Wundt"

Use the list's `.append()` method to put Max Planck onto the end of the list.

In [5]:
scientists.append("Max Planck")

Remove the first scientist from the list using the list's `.pop()` method

In [7]:
scientists.pop(0)
scientists

['Einstein', 'Wilhelm Wundt', 'Max Planck']

**Extra**: Use the string's `.count()` method to count how many W letters there are in the second string in the list of scientists.

In [9]:
scientists[1].count("W")

2


### Dicts

| Code | Description |
| :-- | :-- |
| **`data = {}`** | Makes an empty Dict | 
| **`data = {'a': 3, 'b': 5}`** | Makes a Dict with two key-value pairs: `'a': 3` and `'b': 5` |
| **`data['a']`** | Accesses the value associated with key 'a' |
| **`data['c'] = 7`** | Adds a new key-value pair 'c': 7 to the Dict |
| **`list(data.keys())`** | Retrieves a list of all keys in the Dict |
| **`list(data.values())`** | Retrieves a list of all values in the Dict |

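The table's operations can be combined in a short sketch like the one below. It uses made-up values, and it adds three things the table does not cover: the `in` operator, the `.get()` method, and the `.items()` method. All three are standard parts of Python's `dict`, shown here as optional extras.

```python
# Hypothetical example data, not taken from the exercises.
data = {'a': 3, 'b': 5}
data['c'] = 7                  # adds a new key-value pair

keys = list(data.keys())       # ['a', 'b', 'c']
values = list(data.values())   # [3, 5, 7]

has_d = 'd' in data            # False: `in` checks keys, not values
missing = data.get('d')        # None instead of raising a KeyError
fallback = data.get('d', 0)    # 0, the default supplied by the caller

# .items() yields (key, value) pairs in insertion order (Python 3.7+).
for key, value in data.items():
    print(key, value)
```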
**Exercises**

The `image` dict describes how researcher Tom's recording is formatted:

In [10]:
image = {'height': 1920, 'width': 1080, 'format': 'RGB', 'order': 'F'}
image

{'height': 1920, 'width': 1080, 'format': 'RGB', 'order': 'F'}

**Example**: Write the code to print out the width of the image, by accessing the `"width"` key:

In [11]:
image['width']

1080

What is the height of the image?

In [13]:
image['height']

1920

How are the pixel data in the image formatted?

In [14]:
image['format']

'RGB'

What does the error message say if you use the same syntax to find out which key has the value `1080`? What does this tell you about what key-value maps like Dictionaries are designed for?

In [15]:
image[1080]

KeyError: 1080

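A Dict looks things up by key, so going from a value back to its key means scanning every pair. One possible way to do that, sketched here with the same `image` dict (the name `matching_keys` is just for illustration), is a list comprehension over `.items()`:

```python
image = {'height': 1920, 'width': 1080, 'format': 'RGB', 'order': 'F'}

# Several keys could share the same value, so collect all matches in a list.
matching_keys = [key for key, value in image.items() if value == 1080]
print(matching_keys)  # ['width']
```

This works, but it checks every entry in turn, which is a hint that a Dict is meant for key-to-value lookups rather than the reverse.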
## Building a Schema for a Data Model: Mixing Dicts and Lists

Dictionaries and Lists are quite flexible: any data type can be stored alongside any other. Without an organization plan, though, the code for analyzing the data they store can get quite complex. Building a **"schema"**, a model for your data that is applied consistently to every record in your project, makes the data easy to browse and analyze.

For example, weather data could be stored like:

```python
# Each record is a dict with the same four keys (the schema).
weather = [
    {'date': '2024-11-20',
     'morning_condition': 'sunny',
     'afternoon_condition': 'rainy',
     'hourly_temperatures': [20, 21, 20, 18, 16, 14, 13],
    },
    {'date': '2024-11-21',
     'morning_condition': 'sunny',
     'afternoon_condition': 'cloudy',
     'hourly_temperatures': [18, 16, 20, 18, 16, 14, 13],
    },
]
```

Getting the third hourly temperature for the second record would then be done like this:

```python
weather[1]['hourly_temperatures'][2]
```

Because the data is stored consistently, getting all the morning conditions for analysis can be done like this:
```python
morning_conditions = [day['morning_condition'] for day in weather]
```
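The same consistency makes per-record summaries short to write. The sketch below repeats the weather records so it runs on its own, then builds two Dicts keyed by date; the names `daily_max` and `daily_mean` are assumptions chosen for this illustration.

```python
weather = [
    {'date': '2024-11-20',
     'morning_condition': 'sunny',
     'afternoon_condition': 'rainy',
     'hourly_temperatures': [20, 21, 20, 18, 16, 14, 13]},
    {'date': '2024-11-21',
     'morning_condition': 'sunny',
     'afternoon_condition': 'cloudy',
     'hourly_temperatures': [18, 16, 20, 18, 16, 14, 13]},
]

# Highest hourly temperature for each date.
daily_max = {day['date']: max(day['hourly_temperatures']) for day in weather}

# Mean hourly temperature for each date, rounded to one decimal place.
daily_mean = {}
for day in weather:
    temps = day['hourly_temperatures']
    daily_mean[day['date']] = round(sum(temps) / len(temps), 1)

print(daily_max)   # {'2024-11-20': 21, '2024-11-21': 20}
print(daily_mean)  # {'2024-11-20': 17.4, '2024-11-21': 16.4}
```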



**Exercises**: Using any schema you'd like, organize the information from these two sessions into a single data structure:
  - Nov 13, 2024: Subject Jeff did 3 trials: first with a red circle stimulus on the left side, then a green square on the right, then a green circle on the right.
  - Nov 14, 2024: Subject Jane did 2 trials: first with a green square stimulus on the right side, then a red circle on the left.

**Note**: There is no one right answer here; every data structure has its advantages and disadvantages.  Feel free to organize the data as you see fit.

In [17]:
my_data = [
    {'date': '2024-11-13', 'subject': 'Jeff',
     'trials': [
         {'trial': 1, 'shape': 'circle', 'color': 'red', 'side': 'left'},
         {'trial': 2, 'shape': 'square', 'color': 'green', 'side': 'right'},
         {'trial': 3, 'shape': 'circle', 'color': 'green', 'side': 'right'},
     ]},
    {'date': '2024-11-14', 'subject': 'Jane',
     'trials': [
         {'trial': 1, 'shape': 'square', 'color': 'green', 'side': 'right'},
         {'trial': 2, 'shape': 'circle', 'color': 'red', 'side': 'left'},
     ]},
]


In [18]:
my_data[1]["subject"]

'Jane'

Using your data, get the 2nd trial's stimulus color from the first session.

In [19]:
my_data[0]['trials'][1]['color']

'green'
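
**Extra**: With a nested schema like this one, a question that spans all sessions can be answered by first flattening the data into one record per trial. The sketch below is one possible approach, not the only one: `trial_rows` and `color_counts` are names made up for this illustration, and `Counter` comes from Python's standard `collections` module.

```python
from collections import Counter

my_data = [
    {'date': '2024-11-13', 'subject': 'Jeff',
     'trials': [
         {'trial': 1, 'shape': 'circle', 'color': 'red', 'side': 'left'},
         {'trial': 2, 'shape': 'square', 'color': 'green', 'side': 'right'},
         {'trial': 3, 'shape': 'circle', 'color': 'green', 'side': 'right'},
     ]},
    {'date': '2024-11-14', 'subject': 'Jane',
     'trials': [
         {'trial': 1, 'shape': 'square', 'color': 'green', 'side': 'right'},
         {'trial': 2, 'shape': 'circle', 'color': 'red', 'side': 'left'},
     ]},
]

# One flat dict per trial, with the session's date and subject copied in.
trial_rows = [
    {'date': session['date'], 'subject': session['subject'], **trial}
    for session in my_data
    for trial in session['trials']
]

# Count how often each stimulus color appeared across all sessions.
color_counts = Counter(row['color'] for row in trial_rows)

print(len(trial_rows))      # 5
print(color_counts['green'])  # 3
print(color_counts['red'])    # 2
```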