# Python Data Structures

In Week 1, we introduced Python lists, a way to store multiple values in a single variable. This week, we'll expand on lists and explore other important data structures in Python: tuples, sets, and dictionaries. Each of these structures has its own properties and use cases.

Data structures are different ways of organising data. For example, if you are writing a shopping list you might just jot down each item one after another. The order doesn’t matter much — you only need to remember what to buy. But if you wanted to keep your friends’ addresses, that approach wouldn’t be very helpful; you’d want to be able to quickly find the address of a particular friend. To do that, you might label each address with the friend’s name, and then you can look it up directly when you need it. Similarly, Python provides different data structures to organise and work with different kinds of information.

Later in the module, we'll spend a lot of time working with pandas DataFrames, which are powerful tools for handling tabular data. It is still important to understand the basic Python data structures (lists, tuples, sets and dictionaries), because there will be times when you need to use them directly. Knowing them will also make it easier to understand how pandas works and to write clean, efficient code for data analysis.

## Lists (Review and Advanced Features)

Let's quickly review what we learned about lists in Week 1:

- Lists are created using square brackets `[]`
- They can contain items of different data types
- Items can be accessed by index (starting from 0)
- Lists are mutable (this just means they can be changed after creation)

In [None]:
# Creating a list
movies = ["The Matrix", "Inception", "Interstellar", "The Dark Knight", "Pulp Fiction"]
print("Movies list:", movies)

Now, let's explore some more advanced features of lists.

### List Slicing

List slicing allows you to access a specific range of elements from a list. The syntax is `list[start:end]`, where `start` is the index of the first element to include and `end` is the index of the first element to exclude.

In [None]:
# List slicing examples
print("Original list:", movies)

# Get elements from index 1 to 3 (not including 3)
print("movies[1:3]:", movies[1:3])  # Output: ['Inception', 'Interstellar']

# Get elements from the beginning to index 2 (not including 2)
print("movies[:2]:", movies[:2])   # Output: ['The Matrix', 'Inception']

# Get elements from index 2 to the end
print("movies[2:]:", movies[2:])   # Output: ['Interstellar', 'The Dark Knight', 'Pulp Fiction']

In [None]:
# Negative indices count from the end of the list
print("Original list:", movies)

# Get the last two elements
print("movies[-2:]:", movies[-2:])  # Output: ['The Dark Knight', 'Pulp Fiction']

# Get the last element
print("movies[-1]:", movies[-1])    # Output: Pulp Fiction

# Get all elements except the last two
print("movies[:-2]:", movies[:-2])  # Output: ['The Matrix', 'Inception', 'Interstellar']

### List Methods

Python provides several built-in methods for working with lists. We've already seen `append()`, `insert()`, `remove()`, and `pop()`. Here are some more useful ones, along with the built-in `len()` function (a function rather than a list method):

In [None]:
# Finding the length of a list
print("Movies list:", movies)
print("Number of movies:", len(movies))  # Output: 5

In [None]:
# Sorting a list
print("Original list:", movies)

# Sort the list alphabetically
movies.sort()
print("Sorted list:", movies)  # Output: ['Inception', 'Interstellar', 'Pulp Fiction', 'The Dark Knight', 'The Matrix']

In [None]:
# Reversing a list
print("Current list:", movies)

# Reverse the order of items
movies.reverse()
print("Reversed list:", movies)  # Output: ['The Matrix', 'The Dark Knight', 'Pulp Fiction', 'Interstellar', 'Inception']
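
One detail worth noting: `sort()` and `reverse()` change the list in place and return `None`. If the original order needs to be kept, the built-in `sorted()` function returns a new sorted list instead. A minimal sketch, using a hypothetical `scores` list that is not part of the examples above:

```python
# Hypothetical list used only for this illustration
scores = [72, 95, 88, 60]

# sorted() returns a new list and leaves the original untouched
ordered = sorted(scores)
print("Original:", scores)      # Output: [72, 95, 88, 60]
print("Sorted copy:", ordered)  # Output: [60, 72, 88, 95]

# sort() changes the list itself and returns None
result = scores.sort()
print("After sort():", scores)            # Output: [60, 72, 88, 95]
print("Return value of sort():", result)  # Output: None
```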

In [None]:
# Counting occurrences and finding indices
# Let's create a list with duplicate items
fruits = ["apple", "banana", "apple", "orange", "banana", "apple"]
print("Fruits list:", fruits)

# Count how many times "apple" appears
apple_count = fruits.count("apple")
print("Number of apples:", apple_count)  # Output: 3

# Find the index of the first occurrence of "banana"
banana_index = fruits.index("banana")
print("First banana is at index:", banana_index)  # Output: 1

## Tuples

Tuples are similar to lists, but they are immutable, meaning they cannot be changed after creation. Tuples are created using parentheses `()`.

In [None]:
# Creating tuples
# A tuple of coordinates (x, y)
coordinates = (10, 20)

# A tuple representing an RGB color (red, green, blue)
rgb_color = (255, 0, 128)

# A tuple with mixed data types
person = ("Alice", 30, "Engineer")

print("Coordinates:", coordinates)  # Output: (10, 20)
print("RGB Color:", rgb_color)      # Output: (255, 0, 128)
print("Person:", person)            # Output: ('Alice', 30, 'Engineer')

### Accessing Tuple Elements

Like lists, tuple elements can be accessed using indices:

In [None]:
# Accessing tuple elements
print("X coordinate:", coordinates[0])  # Output: 10
print("Person's age:", person[1])       # Output: 30

One of the useful features of tuples is "tuple unpacking," which allows you to assign the elements of a tuple to multiple variables at once:

In [None]:
# Tuple unpacking - remember we have our tuple from earlier: person = ("Alice", 30, "Engineer")
name, age, profession = person
print("Name:", name)             # Output: Alice
print("Age:", age)               # Output: 30
print("Profession:", profession) # Output: Engineer

### Why Use Tuples?

1. **Immutability**: Tuples cannot be changed after creation, making them useful for data that shouldn't be modified.
2. **Efficiency**: Tuples use slightly less memory than lists and can be marginally faster to create, although the difference rarely matters in practice.
3. **Dictionary Keys**: Tuples can be used as dictionary keys (lists cannot).
4. **Multiple Return Values**: Functions can return multiple values as a tuple.

In [None]:
# Attempting to modify a tuple will result in an error
try:
    coordinates[0] = 15  # This will raise a TypeError
except TypeError as e:
    print("Error:", e)  # Output: 'tuple' object does not support item assignment
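
Points 3 and 4 in the list above can be sketched as follows. The `grid_labels` dictionary and the `min_and_max()` function are hypothetical names made up for this illustration:

```python
# A tuple of numbers works as a dictionary key because it is hashable
grid_labels = {(0, 0): "start", (2, 3): "finish"}
print("Label at (2, 3):", grid_labels[(2, 3)])  # Output: finish

# A list cannot be used as a key: this raises a TypeError
try:
    bad = {[0, 0]: "start"}
except TypeError as e:
    print("Error:", e)

# A function that returns two values packs them into a tuple
def min_and_max(values):
    """Return the smallest and largest value as a tuple."""
    return min(values), max(values)

# Tuple unpacking splits the returned tuple into two variables
lowest, highest = min_and_max([4, 9, 1, 7])
print("Lowest:", lowest)    # Output: 1
print("Highest:", highest)  # Output: 9
```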

## Sets

Sets are unordered collections of unique elements. They are useful for membership testing, removing duplicates, and mathematical operations like union, intersection, and difference.

Sets are created using curly braces `{}` or the `set()` function:

In [None]:
# Creating sets
fruits_set = {"apple", "banana", "cherry"}
print("Fruits set:", fruits_set)

# Creating a set from a list (removes duplicates)
numbers_list = [1, 2, 2, 3, 4, 4, 5]
numbers_set = set(numbers_list)
print("Original list with duplicates:", numbers_list)
print("Set from list (duplicates removed):", numbers_set)
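
One pitfall with the curly-brace syntax: empty braces `{}` create an empty dictionary, not an empty set. To start with an empty set, use `set()`. A short check, with illustrative variable names:

```python
empty_braces = {}   # this is a dictionary
empty_set = set()   # this is a set

print(type(empty_braces))  # Output: <class 'dict'>
print(type(empty_set))     # Output: <class 'set'>
```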

### Set Operations

Sets support various operations like adding elements, removing elements, and checking membership:

In [None]:
# Basic set operations
fruits_set = {"apple", "banana", "cherry"}
print("Original set:", fruits_set)

# Adding an element
fruits_set.add("orange")
print("After adding 'orange':", fruits_set)

# Removing an element
fruits_set.remove("banana")
print("After removing 'banana':", fruits_set)

# Checking membership
print("Is 'apple' in the set?", "apple" in fruits_set)  # Output: True
print("Is 'banana' in the set?", "banana" in fruits_set)  # Output: False
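
Note that `remove()` raises a `KeyError` if the element is not in the set. When it is unclear whether the element is present, `discard()` does the same job but does nothing if the element is missing. A minimal sketch with a hypothetical `colours` set:

```python
colours = {"red", "green", "blue"}

# discard() ignores elements that are not in the set
colours.discard("purple")
print("Size after discard:", len(colours))  # Output: 3

# remove() raises a KeyError for a missing element
try:
    colours.remove("purple")
except KeyError as e:
    print("KeyError for missing element:", e)  # Output: 'purple'
```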

### Mathematical Set Operations

Sets support mathematical operations like union, intersection, and difference:

In [None]:
# Mathematical set operations
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}
print("Set A:", set_a)
print("Set B:", set_b)

# Union (all elements from both sets)
union_set = set_a.union(set_b)  # or set_a | set_b
print("Union (A ∪ B):", union_set)  # Output: {1, 2, 3, 4, 5, 6, 7, 8}

# Intersection (elements common to both sets)
intersection_set = set_a.intersection(set_b)  # or set_a & set_b
print("Intersection (A ∩ B):", intersection_set)  # Output: {4, 5}

# Difference (elements in set_a but not in set_b)
difference_set = set_a.difference(set_b)  # or set_a - set_b
print("Difference (A - B):", difference_set)  # Output: {1, 2, 3}
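
The comments in the example above mention that each method has an operator equivalent. As a quick check, the sketch below redefines `set_a` and `set_b` with the same values and compares the two forms. `sorted()` is used in the last line only to print the result in a predictable order:

```python
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}

# Each operator gives the same result as the matching method
print("Union matches:", (set_a | set_b) == set_a.union(set_b))                # Output: True
print("Intersection matches:", (set_a & set_b) == set_a.intersection(set_b))  # Output: True
print("Difference matches:", (set_a - set_b) == set_a.difference(set_b))      # Output: True

# Difference depends on the order of the operands
print("B - A:", sorted(set_b - set_a))  # Output: [6, 7, 8]
```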

### Real-World Example: Using Sets

Let's see a practical example of using sets to find common elements between two lists:

In [None]:
# Finding common elements between two lists
alice_movies = ["Inception", "The Matrix", "Interstellar", "The Dark Knight", "Pulp Fiction"]
bob_movies = ["The Matrix", "The Lord of the Rings", "Pulp Fiction", "Star Wars", "The Godfather"]

# Convert lists to sets
alice_set = set(alice_movies)
bob_set = set(bob_movies)

# Find common movies (intersection)
common_movies = alice_set.intersection(bob_set)
print("Alice's movies:", alice_movies)
print("Bob's movies:", bob_movies)
print("Movies both Alice and Bob like:", common_movies)

## Dictionaries

Dictionaries are collections of key-value pairs. They are mutable and indexed by keys, and since Python 3.7 they preserve the order in which keys were inserted. Dictionaries are created using curly braces `{}` with each key separated from its value by a colon `:`.

In [None]:
# Creating dictionaries
# A dictionary of student scores
student_scores = {"Alice": 85, "Bob": 92, "Charlie": 78}
print("Student scores:", student_scores)

# A dictionary with mixed data types
person_info = {
    "name": "Alice",
    "age": 30,
    "is_student": False,
    "courses": ["Math", "Physics", "Computer Science"]
}
print("Person information:", person_info)

### Accessing Dictionary Values

You can access dictionary values using their keys:

In [None]:
# Accessing dictionary values
print("Bob's score:", student_scores["Bob"])  # Output: 92
print("Person's name:", person_info["name"])  # Output: Alice
print("Person's courses:", person_info["courses"])  # Output: ['Math', 'Physics', 'Computer Science']

If you try to access a key that doesn't exist, you'll get a KeyError. To avoid this, you can use the `get()` method, which returns `None` (or a default value) if the key doesn't exist:

In [None]:
# Using get() to safely access dictionary values
print("David's score:", student_scores.get("David"))  # Output: None
print("David's score (with default):", student_scores.get("David", 0))  # Output: 0
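
To make the difference concrete, the sketch below shows the `KeyError` raised by square-bracket access, alongside a membership check with the `in` operator. It redefines `student_scores` with the same values so that it runs on its own:

```python
student_scores = {"Alice": 85, "Bob": 92, "Charlie": 78}

# Square brackets raise a KeyError for a missing key
try:
    print(student_scores["David"])
except KeyError as e:
    print("KeyError:", e)  # Output: KeyError: 'David'

# The in operator checks whether a key exists
print("Is David in the dictionary?", "David" in student_scores)  # Output: False
print("Is Alice in the dictionary?", "Alice" in student_scores)  # Output: True
```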

### Modifying Dictionaries

Dictionaries are mutable, so you can add, update, or remove key-value pairs:

In [None]:
# Adding and updating dictionary items
print("Original student scores:", student_scores)

# Adding a new student
student_scores["David"] = 88
print("After adding David:", student_scores)

# Updating an existing student's score
student_scores["Charlie"] = 82
print("After updating Charlie's score:", student_scores)

In [None]:
# Removing dictionary items
print("Current student scores:", student_scores)

# Remove a student using del
del student_scores["Bob"]
print("After removing Bob:", student_scores)

# Remove a student using pop() (returns the removed value)
charlie_score = student_scores.pop("Charlie")
print(f"Charlie's score was: {charlie_score}")
print("After removing Charlie:", student_scores)

### Dictionary Methods

Dictionaries provide several useful methods:

In [None]:
# Dictionary methods
student_scores = {"Alice": 85, "Bob": 92, "Charlie": 78, "David": 88}
print("Student scores:", student_scores)

# Get all keys
keys = student_scores.keys()
print("Keys:", keys)  # Output: dict_keys(['Alice', 'Bob', 'Charlie', 'David'])

# Get all values
values = student_scores.values()
print("Values:", values)  # Output: dict_values([85, 92, 78, 88])

# Get all key-value pairs as tuples
items = student_scores.items()
print("Items:", items)  # Output: dict_items([('Alice', 85), ('Bob', 92), ('Charlie', 78), ('David', 88)])
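
These methods are most often used together with a `for` loop or a built-in function. A minimal sketch, assuming `for` loops are already familiar, that walks through the pairs from `items()` and unpacks each tuple into two variables:

```python
student_scores = {"Alice": 85, "Bob": 92, "Charlie": 78, "David": 88}

# Each item is a (key, value) tuple, unpacked here into name and score
for name, score in student_scores.items():
    print(name, "scored", score)

# values() can be passed straight to built-in functions
highest = max(student_scores.values())
average = sum(student_scores.values()) / len(student_scores)
print("Highest score:", highest)  # Output: 92
print("Average score:", average)  # Output: 85.75
```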

## Choosing the Right Data Structure

Each data structure has its strengths and weaknesses. Here's a quick guide to help you choose the right one for your needs:

- **Lists**: Use when you need an ordered collection of items that might change.
- **Tuples**: Use when you need an ordered collection of items that should not change.
- **Sets**: Use when you need to ensure uniqueness or perform set operations.
- **Dictionaries**: Use when you need to associate values with keys for fast lookup.
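
As an illustration of this guide, the sketch below stores small pieces of made-up data in each of the four structures. All names and values are hypothetical:

```python
# List: an ordered watchlist that will change over time
watchlist = ["Inception", "The Matrix"]
watchlist.append("Interstellar")

# Tuple: a fixed pair of values that should not change
screen_size = (1920, 1080)

# Set: people who attended, with the duplicate ignored
attendees = {"Alice", "Bob", "Alice"}

# Dictionary: look up a rating by film title
ratings = {"Inception": 9, "The Matrix": 8}

print("Watchlist length:", len(watchlist))        # Output: 3
print("Screen width:", screen_size[0])            # Output: 1920
print("Unique attendees:", len(attendees))        # Output: 2
print("Inception rating:", ratings["Inception"])  # Output: 9
```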

## Nested Data Structures

A powerful feature of Python's data structures is that they can be **nested**, meaning you can have data structures inside other data structures. For example, a list can contain dictionaries, or a dictionary can have lists as values. This allows you to model more complex, real-world data.

This is a concept we will use in the final set of exercises.

In [None]:
# Example: A list of dictionaries
# Each dictionary represents a person with their name and age
people = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": "Charlie", "age": 35}
]

# Accessing the first person's information
print("First person's name:", people[0]["name"]) # Output: Alice

# Example: A list of tuples
# Each tuple represents a point with (x, y) coordinates
points = [(1, 2), (3, 4), (5, 6)]

# Accessing the x-coordinate of the second point
print("Second point's x-coordinate:", points[1][0]) # Output: 3
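
The text above also mentions dictionaries that have lists as values. A minimal sketch, using a hypothetical `class_courses` dictionary:

```python
# Each key is a student name; each value is a list of courses
class_courses = {
    "Alice": ["Math", "Physics"],
    "Bob": ["Computer Science"]
}

# Access Alice's first course: key first, then list index
print("Alice's first course:", class_courses["Alice"][0])  # Output: Math

# The inner list is still mutable, so it can be extended
class_courses["Bob"].append("Statistics")
print("Bob's courses:", class_courses["Bob"])  # Output: ['Computer Science', 'Statistics']
```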

## Exercises

### Exercise 1: List Slicing

Create a list of numbers from 1 to 10, then use slicing to extract:
1. The first three numbers
2. The last three numbers

In [None]:
# Your solution here


### Exercise 2: Tuple Operations

Create a tuple containing the names of three cities. Then, use tuple unpacking to assign each city to a separate variable and print them.

In [None]:
# Your solution here


### Exercise 3: Set Operations

Create two sets: one containing even numbers from 1 to 10, and another containing multiples of 3 from 1 to 10. Then find:
1. The union of the two sets
2. The intersection of the two sets
3. The numbers that are even but not multiples of 3

In [None]:
# Your solution here


### Exercise 4: Dictionary Operations

Create a dictionary of five countries and their capitals. Then:
1. Look up the capital of one country by using its name as a key.
2. Add a new country and its capital.
3. Change the capital of one country.
4. Remove one country from the dictionary.
5. Print the entire dictionary.

In [None]:
# Your solution here


## Open-Ended Exercises

The following exercises are more open-ended. For each problem, you'll need to decide which data structure is the most appropriate and then implement a solution.

### Exercise 5: Storing User Profile Information

**Problem:**
You need to store a user's profile information for a social media app. This information includes their username (e.g., "DataWizard"), their age (e.g., 28), and a list of their favorite hobbies (e.g., "reading", "hiking", "coding"). You need to be able to easily access all of this information for a specific user.

**Task:**
Choose the best data structure to store this user profile information and create an example for a user.

**Guidance for Students:**
*   Think about how you would look up a user's information. Would you search through a list, or is there a more direct way?
*   Does the order of the hobbies matter? Should you be able to add or remove hobbies?
*   What data types would you use for the username, age, and hobbies?

In [None]:
# Your solution here


### Exercise 6: Representing a Deck of Playing Cards

**Problem:**
You want to represent a standard deck of 52 playing cards. Each card has a suit (e.g., 'Hearts', 'Diamonds', 'Clubs', 'Spades') and a rank (e.g., '2', '3', ..., '10', 'Jack', 'Queen', 'King', 'Ace'). The order of the cards in the deck is important, and you should be able to shuffle the deck (which means you'll need to be able to change the order of the cards).

**Task:**
Choose the best data structure to represent a deck of cards. You don't need to create the full deck, just a few cards to show the structure.

**Guidance for Students:**
*   How can you group the rank and suit of a single card together?
*   Once you have a representation for a single card, how can you create a collection of 52 of them?
*   Should the collection of cards be changeable (mutable) or unchangeable (immutable)?

In [None]:
# Your solution here


### Exercise 7: Tracking Unique Website Visitors

**Problem:**
You are tracking the IP addresses of visitors to your website. Over the course of a day, you collect a list of IP addresses, but some visitors come to the site multiple times, so their IP address appears more than once in your list. You need to determine the number of *unique* visitors.

**Data:**
Here is the list of visitor IP addresses:
`visitor_ips = ['192.168.1.1', '10.0.0.1', '192.168.1.1', '172.16.0.1', '10.0.0.1', '10.0.0.1']`

**Task:**
You have a list of IP addresses with duplicates. Choose the best data structure to find the unique IP addresses and the total count of unique visitors.

**Guidance for Students:**
*   Which data structure is specifically designed to hold only unique items?
*   How can you convert your list of IP addresses into this data structure?

In [None]:
# Your solution here


### Exercise 8: Storing Fixed Configuration Settings

**Problem:**
Your application has some important configuration settings that should be defined once and never changed while the application is running. These settings include the application's name, its version number, and a flag to indicate whether it's in debug mode.

**Task:**
Choose the best data structure to store these fixed configuration settings.

**Guidance for Students:**
*   Which data structure is designed to be unchangeable (immutable)?
*   How can you store different data types (string, float, boolean) together in this structure?

In [None]:
# Your solution here


### Exercise 9: Managing a To-Do List

**Problem:**
You are creating a simple to-do list application. You need to store a list of tasks. For each task, you want to store its description (e.g., "Buy groceries") and its status (e.g., "incomplete" or "complete"). You should be able to add new tasks and change the status of existing tasks.

**Task:**
Choose the best data structure to store the to-do list and create an example with a few tasks.

**Guidance for Students:**
*   How can you represent a single task with its description and status?
*   How can you store a collection of these tasks?
*   Should the collection be ordered? Should you be able to modify it?

In [None]:
# Your solution here


### Exercise 10: Building a Simple Contact Book

**Problem:**
You want to create a simple contact book to store information about your friends. For each friend, you need to store their phone number and email address. You need to be able to quickly find a friend's contact information by their name.

**Task:**
Choose the best data structure to build this contact book and add a few contacts.

**Guidance for Students:**
*   How can you associate a person's name with their contact details?
*   For each person, how can you store both a phone number and an email address?
*   Think about the most efficient way to look up a contact.

In [None]:
# Your solution here


### Exercise 11: Tracking Product Inventory

**Problem:**
You are building a system to track the inventory of a small shop. You need to store the name of each product and the quantity in stock. You should be able to easily update the quantity of a product when it's sold or restocked.

**Task:**
Choose the best data structure to manage the product inventory and create an example with a few products.

**Guidance for Students:**
*   How can you link a product's name to its stock quantity?
*   Is the order of products important?
*   How will you update the stock count?

In [None]:
# Your solution here
