<img src="../Images/DSC_Logo.png" style="width: 400px;">

## 3. Data Structures

In this notebook, you will learn about Python’s most common data structures: how they work, what you can and can’t do with them, and how to choose the right one for your task.

<img src="../Images/PythonMindmap.png" style="width: 1000px;">

In the previous notebooks, we worked with Python’s basic data types, each of which stores a single piece of data. Most real-world problems, however, involve collections of data rather than one number or one name. To handle these, we need ways to group and organize multiple values. That’s where data structures come in.

**Data structures are objects that group multiple items into one container**. Unlike a single integer or string, a data structure can hold several values at once, sometimes of different types. 

Python's four core built-in data structures are `list`, `tuple`, `set`, and `dict` (dictionary).

In addition, the `range` type is a specialized, immutable sequence that represents a series of numbers, typically used for looping or iteration (Notebook 4).
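
As a small sketch of how `range` behaves (the variable names `steps` and `even_steps` are illustrative and not part of the notebook): a `range` only describes a series of numbers, and converting it to a list shows which numbers it represents.

```python
# A range from 0 up to (but not including) 5
steps = range(5)
print(steps)          # range(0, 5)
print(list(steps))    # [0, 1, 2, 3, 4]

# Like other sequences, a range supports indexing and len()
print(steps[1])       # 1
print(len(steps))     # 5

# range(start, stop, step): every second number from 2 up to (not including) 11
even_steps = range(2, 11, 2)
print(list(even_steps))  # [2, 4, 6, 8, 10]
```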

## 3.1 List: A Mutable Sequence

In Python, lists are defined using square brackets `[]`. A list holds its items in an ordered sequence, the items can be of any type, and the contents can be changed after creation (lists are mutable).

This list contains an `int`, a `float`, two `str`, and a `bool`:

In [None]:
materials = [3, 2.5, "wood", "paint", True]
print(materials)

You can access individual list items using their **index**. Let's access the first, third and last item of the list:

In [None]:
print(materials[0])
print(materials[2]) 
print(materials[-1])

---
### **Exercise 1:** 

How can you access only the two string elements in the list using their indexes? Check out Notebook 2 again for help.

In [None]:
# Solution:

print(materials[2:4])

---

Let's modify the list by adding another item to its end with the `append` method:

In [None]:
materials.append("roof")
print(materials)
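
Appending is only one way to change a list. The following sketch shows a few other common modifications, assuming a list with the same values as `materials` after the `append` call; the item `"nails"` is made up for illustration.

```python
# Same values as the notebook's list after appending "roof"
materials = [3, 2.5, "wood", "paint", True, "roof"]

materials[0] = 4               # replace the item at index 0
materials.insert(2, "nails")   # insert "nails" at index 2; later items shift right
materials.remove("paint")      # remove the first occurrence of "paint"

print(materials)       # [4, 2.5, 'nails', 'wood', True, 'roof']
print(len(materials))  # 6
```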

## 3.2 Tuple: An Immutable Sequence

A tuple is similar to a list, but its contents cannot be changed (immutable). Use tuples when you want to protect data from being modified accidentally. Tuples are defined using parentheses `()`:

In [None]:
materials = (3, 2.5, "wood", "paint", True)
print(materials)

Items in a tuple are accessed by index, just as with lists:

In [None]:
print(materials[0])

Adding another item to the tuple with `append` won't work because tuples do not support methods that change their contents. The call below raises an `AttributeError`:

In [None]:
materials.append("glue")
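
A minimal sketch of what happens when you try to change a tuple, assuming the same `materials` tuple as above. The `try`/`except` blocks are used here only so that the errors are printed instead of stopping the cell; the name `extended_materials` is illustrative.

```python
materials = (3, 2.5, "wood", "paint", True)

# Tuples have no append method, so this raises an AttributeError
try:
    materials.append("glue")
except AttributeError as error:
    print("AttributeError:", error)

# Assigning to an index is not allowed either and raises a TypeError
try:
    materials[0] = 4
except TypeError as error:
    print("TypeError:", error)

# To "add" an item, build a new tuple; the original stays unchanged
extended_materials = materials + ("glue",)
print(extended_materials)  # (3, 2.5, 'wood', 'paint', True, 'glue')
print(materials)           # (3, 2.5, 'wood', 'paint', True)
```

Note the trailing comma in `("glue",)`: it is what makes a one-item tuple rather than a string in parentheses.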

## 3.3 Set: No Duplicates Allowed

A set is an unordered collection of unique elements. Sets are mutable, and they discard duplicate values automatically, which makes them useful for deduplicating data. Sets are written with curly braces `{}`. Because a set is unordered, the elements may print in a different order than you wrote them.

In [None]:
crew = {"Alice", "Bob", "Alice", "Linus"}
print(crew) 
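
The sketch below illustrates a few typical set operations: deduplicating a list, testing membership, and adding elements. The names `crew_list` and `empty_set` and the crew member `"Mona"` are made up for illustration.

```python
# Deduplicate a list by converting it to a set
crew_list = ["Alice", "Bob", "Alice", "Linus"]
crew = set(crew_list)
print(len(crew_list))  # 4
print(len(crew))       # 3

# Check whether an element is in the set
print("Bob" in crew)   # True

# Adding a new element grows the set; adding an existing one changes nothing
crew.add("Mona")
crew.add("Bob")
print(len(crew))       # 4

# sorted() returns a list in a predictable order
print(sorted(crew))    # ['Alice', 'Bob', 'Linus', 'Mona']

# Empty braces create a dictionary, so an empty set is written set()
empty_set = set()
print(type(empty_set))  # <class 'set'>
print(type({}))         # <class 'dict'>
```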

## 3.4 Dictionary: Key-Value Pairs

Dictionaries store key-value pairs and are mutable. They use curly braces `{}` with a colon `:` to separate **keys** and **values**.

Creating a dictionary to store different information about the shed:

In [None]:
shed = {
    "length": 3.0,
    "width": 2.5,
    "color": "red",
    "windows_installed": True
}
print(shed)

Unlike lists, which rely on position (index), dictionaries are accessed via keys. To access a value, provide its key in square brackets:

In [None]:
print(shed["length"])  # value for "length" key
print(shed["color"])   # value for "color" key
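
Because dictionaries are mutable, values can be updated and new keys added after creation. The following sketch assumes the same `shed` dictionary as above; the keys `"door_count"` and `"height"` are hypothetical and only serve to show what happens with new and missing keys.

```python
shed = {"length": 3.0, "width": 2.5, "color": "red", "windows_installed": True}

shed["color"] = "blue"   # update the value of an existing key
shed["door_count"] = 1   # assigning to a new key adds it

print(shed["color"])     # blue
print(len(shed))         # 5

# Asking for a missing key with square brackets raises a KeyError
try:
    shed["height"]
except KeyError as error:
    print("KeyError:", error)

# Safer alternatives: check with `in`, or use .get() with a default value
print("height" in shed)               # False
print(shed.get("height"))             # None
print(shed.get("height", "unknown"))  # unknown
```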

Dictionaries make data more explicit and self-describing. Suppose we have a list like this:

In [None]:
name_list = ["Alice", "Painter"]

We can only assume index 0 is the first name of the person and index 1 is the role of the person. But with a dictionary, each piece of data has a name (a key), so we can directly ask for it using that key instead of remembering positions:

In [None]:
person = {"first_name": "Alice", "role": "Painter"}
print(person["first_name"]) # value for "first_name" key

---
### **Exercise 2:** 

Think of something in your research where one item maps to another (e.g., abbreviation → full term, sample ID → site, code → category). Create a small dictionary (3–5 entries) and do one lookup.

In [None]:
# Solution (example - map participant IDs to the main theme they talked about in an interview)
participant_to_theme = {
    "P001": "Emotional Support",
    "P002": "Trust in Community",
    "P003": "Barriers to Services",
    "P004": "Program Feedback"
}

# Example lookup: what theme did participant P003 discuss?
print(participant_to_theme["P003"])

---

We can also combine individual dictionaries into a multi-level (nested) dictionary, or include lists inside dictionaries. Such structures are useful for organizing datasets.

In [None]:
# Example of a codebook represented as a nested dictionary in Python
codebook = {
    "EMO_01": {
        "category": "Emotional Support",
        "definition": "Expressions of empathy, caring, encouragement, or comfort offered by others.",
        "inclusion": [
            "Statements of comfort or reassurance (e.g., 'It'll be okay').",
            "Descriptions of someone listening or showing empathy."
        ],
        "exclusion": [
            "Instrumental help (e.g., 'they gave me money').",
            "General positive statements without interpersonal context."
        ],
        "example": "\"When I felt overwhelmed, my neighbor sat with me and listened.\"",
        "notes": "Often co-occurs with 'Trust in Community' (TRU_02)."
    },
    "TRU_02": {
        "category": "Trust in Community",
        "definition": "Expressions that indicate confidence in the intentions, reliability, or goodwill of community members.",
        "inclusion": [
            "Explicit claims of trust ('I trust the community leaders').",
            "Examples where people rely on others without fear of betrayal."
        ],
        "exclusion": [
            "Trust in institutions (e.g., 'I trust the hospital') — consider separate code."
        ],
        "example": "\"We know our neighbors will show up when something goes wrong.\"",
        "notes": "Useful to flag whether trust is personal or institutional."
    }
}

To retrieve information, for example the definition of the code "EMO_01", write:

In [None]:
print(codebook["EMO_01"]["definition"])
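
The same chaining works when a value is a list: add an index after the keys. The sketch below uses `mini_codebook`, a shortened stand-in for the codebook above, so that it runs on its own; the `"reviewed"` key is a hypothetical addition.

```python
# Shortened, illustrative version of the codebook
mini_codebook = {
    "EMO_01": {
        "category": "Emotional Support",
        "inclusion": [
            "Statements of comfort or reassurance.",
            "Descriptions of someone listening or showing empathy.",
        ],
    },
}

# Step by step: outer key -> inner dictionary -> inner key
entry = mini_codebook["EMO_01"]
print(entry["category"])                 # Emotional Support

# Key, key, then list index: the first inclusion criterion
print(mini_codebook["EMO_01"]["inclusion"][0])    # Statements of comfort or reassurance.
print(len(mini_codebook["EMO_01"]["inclusion"]))  # 2

# Nested dictionaries are mutable too: add a new key to the inner dictionary
mini_codebook["EMO_01"]["reviewed"] = True
print(sorted(mini_codebook["EMO_01"]))   # ['category', 'inclusion', 'reviewed']
```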

Dictionaries are often used to store structured data, such as information retrieved from web APIs. Many APIs return results in JSON format, which the `.json()` method of the `requests` library parses into Python objects (typically a dictionary) that you can explore and analyze:

In [None]:
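import requests

# Send a request to the Agify API for the name "friedel" (check out: https://agify.io/)
agify_request = requests.get("https://api.agify.io/?name=friedel", timeout=10)
agify_request_dict = agify_request.json()  # parse the JSON response body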

print(agify_request_dict)
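
To see how JSON maps onto Python data structures without an internet connection, here is a minimal sketch using Python's built-in `json` module. The JSON text in `response_text` is entirely made up for illustration and is not the actual response of any API.

```python
import json

# Hypothetical JSON text, shaped like a typical API response (values are made up)
response_text = '{"name": "example", "count": 10, "tags": ["a", "b"], "active": true, "score": null}'

# json.loads turns JSON text into Python objects
data = json.loads(response_text)

print(type(data))       # <class 'dict'>
print(data["name"])     # example
print(data["tags"][1])  # b
print(data["active"])   # True
print(data["score"])    # None
```

JSON objects become dictionaries, JSON arrays become lists, `true`/`false` become `True`/`False`, and `null` becomes `None`.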