# <font color = firebrick>Tutorial 4: Lists, Tuples, and Dicts</font><a id='home'></a>

So far we have worked with simple types such as float, int, and str, each of which holds a single value. Often, though, we need structures that hold many values at once.

Python offers data structures like list, tuple, and dict that let us group related data, such as user interactions or transaction records. In this course, we will rely heavily on lists, tuples, and dictionaries for organizing and analyzing data:
    
1. [Lists](#list) 
2. [Tuples](#tuple)
3. [Dictionaries](#dict)

# 1. Lists<a id="list"></a> ([top](#home))
In economics, we often handle collections of data such as user activities, transactions, or platform metrics. Python lists let us store and manage these data points in a flexible, ordered way.

A list is an ordered and modifiable (or mutable) collection of data points. Let’s explore lists with examples:

In [None]:
# A list of numerical data (like user ratings or purchase amounts)
transaction_list = [5, 20, 15, 10]

# A list of strings (e.g., platform names)
platform_list = ['Facebook', 'Amazon', 'Google']

print('Transaction List:', transaction_list)
print('Platform List:', platform_list)

Python lists can even hold mixed types, such as user IDs and activity types. Many statically typed languages are stricter and require every element of an array to share one type:

In [None]:
# A mixed list combining numbers and strings (useful for user profiles)
mixed_list = [101, 'search', 205, 'purchase']

print('Mixed List:', mixed_list)
print('First element type:', type(mixed_list[0]))
print('Second element type:', type(mixed_list[1]))

**Key concept:** Lists in Python are indexed starting from 0, meaning the first element is accessed with index 0. This is crucial when dealing with datasets, as the position of data often holds meaning (e.g., transaction history).

In [None]:
# Accessing specific elements
print('First element:', mixed_list[0])  # 101
print('Second element:', mixed_list[1]) # 'search'
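
Indexing also works from the end of a list, and a range of positions can be selected with a slice. The sketch below uses a hypothetical `ratings` list (not one of the lists defined above) purely for illustration:

```python
# Hypothetical list of user ratings, used only for illustration
ratings = [4, 5, 3, 2, 5]

# Negative indices count from the end of the list
last_rating = ratings[-1]

# A slice [start:stop] includes start and excludes stop
first_two = ratings[:2]

print('Last element:', last_rating)
print('First two elements:', first_two)
print('Number of elements:', len(ratings))
```

Because counting starts at 0, the last valid positive index is always `len(ratings) - 1`.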

#### Manipulating Lists:
Lists can be combined and modified easily, making them perfect for tasks like aggregating user data from different sources:

In [None]:
# Concatenate lists (e.g., merging datasets)
combined_list = mixed_list + ['Instagram', 30]

print('Combined list:', combined_list)

# Append a new element (e.g., adding a new data point)
mixed_list.append('completed')

print('Updated mixed list:', mixed_list)

You can even have lists within lists, useful when working with complex datasets (like structured user profiles):

In [None]:
# A list within a list (e.g., a user profile with subcategories)
user_data = [101, 'John', ['active', 50]]

print('Full user data:', user_data)
print('Activity type:', user_data[2][0])  # 'active'

The examples above show two Python features:
- how to **concatenate** lists using the `+` operator.
- how to create a list inline, inside the same expression that uses it (as in `mixed_list + ['Instagram', 30]`).

Like the `print()` function, the `+` operator 'knows' which object types it is working with (lists, ints, strings) and acts accordingly: it adds numbers, joins strings, and concatenates lists. There are limits, however: both operands must be of compatible types, so adding a list to an int raises a `TypeError`.
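
To make the behavior of `+` concrete, here is a minimal sketch. The values are arbitrary, and the `try`/`except` block is only there so the cell keeps running after the error:

```python
# '+' adds numbers, joins strings, and concatenates lists
number_sum = 1 + 2
joined_text = 'data' + 'set'
joined_list = [1, 2] + ['a', 'b']

print(number_sum)
print(joined_text)
print(joined_list)

# Mixing a list with an int is not allowed and raises a TypeError
try:
    broken = [1, 2] + 3
except TypeError as err:
    broken = None
    print('TypeError:', err)
```

To add a single value to a list, either wrap it in a list first (`[1, 2] + [3]`) or use `.append(3)`.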

## <font color='firebrick'> Practice</font>
Let's do a few exercises to practice list manipulations commonly used in economic data analysis:

1. Create two lists
- One list with integers 1, 2, 3, called `my_int_list`.
- Another list with strings '1', '2', '3', called `my_string_list`.
- Print the types of both lists.

2. Concatenate the lists
- Merge `my_int_list` and `my_string_list` into a new list `my_super_list`.
- Print `my_super_list`.

3. Modify the list
- In `my_super_list`, change the integer 2 to your favorite number.
- Replace the string '3' with your least favorite number.
- Print the modified `my_super_list`.

4. Data cleaning exercise

You have raw data in the form of a string: `raw_data = '13839'`. What is the data type of `raw_data`? Convert it into a list of its characters so it's ready for analysis.

Then convert the elements of that list to integers. Consider using a loop, which also works for much longer inputs.

## List comprehensions<a id="listcomp"></a>
List comprehensions offer a compact way to build a new list by iterating over a list (or another collection). Anything a list comprehension does can also be done with a for loop, but the comprehension is usually shorter and easier to read, and it is widely used in Python code for data handling.

### Use case: Standardizing data
In real-world digital datasets, especially with social media or platform data, consistency is crucial. Imagine analyzing online user types across platforms where user categories might appear in different formats (e.g., 'Subscriber', 'subscriber', 'free_user'). Cleaning these values ensures accurate comparisons. Here’s how we can clean user categories by converting them to lowercase:

Using a for loop: 

In [None]:
# User categories data
user_types = ['Subscriber', 'subscriber', 'Free_User', 'guest', 'subscriber']

# Empty list for cleaned data
user_types_cleaned = []
for user in user_types:
    user_types_cleaned.append(user.lower())

print(user_types_cleaned)

This works, but Python offers a more elegant solution with list comprehensions:

Using a list comprehension:

In [None]:
# User categories data
user_types = ['Subscriber', 'subscriber', 'Free_User', 'guest', 'subscriber']

user_types_cleaned_lc = [user.lower() for user in user_types]

print(user_types_cleaned_lc)

### Example: Calculating metrics
You might want to calculate engagement metrics for users based on activity levels. Consider a simple case where you square user interaction counts:

In [None]:
interaction_counts = [2, 3, 5]
squared_interactions = [count**2 for count in interaction_counts]

print(squared_interactions)

List comprehensions express transformations like these in a single readable line, which is convenient when the same step has to be applied to every value in a large dataset.

### Conditional comprehensions: Filtering high-value users
You can also add conditions to filter specific user categories or behaviors, like prioritizing high-value users:

In [None]:
high_value_users = [user.upper() for user in user_types_cleaned if user == 'subscriber']

print(high_value_users)
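
The same pattern works with numeric conditions. The sketch below assumes a hypothetical list of purchase amounts and an arbitrary threshold of 15, neither of which comes from the data above:

```python
# Hypothetical purchase amounts, for illustration only
purchase_amounts = [5, 20, 15, 10, 40]

# Keep only purchases of at least 15 (the threshold is an arbitrary choice)
large_purchases = [amount for amount in purchase_amounts if amount >= 15]

print(large_purchases)
```

The general shape is `[expression for item in collection if condition]`: the condition decides which items are kept, and the expression decides what is stored for each kept item.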

## <font color='firebrick'> Practice</font>
Take a few minutes and practice with these exercises related to data handling.

1. Here's a list of interest rates: `r = [0.01, 0.01, 0.015, 0.02, 0.022]`. Use a list comprehension to multiply each rate by 100 and express it as a percentage.

2. You’re given `raw_data = '37318'`. Use a list comprehension to turn this string into a list of integers.

3.  From the list `data_list = [1, 2, 3, 4, 5, 6, 7, 8]`, create two new lists: one with odd numbers and another with even numbers. Use the modulo operator `%`. 

# 2. Tuples<a id="tuple"></a> ([top](#home))
Tuples are the right choice when data must remain unchanged. Unlike lists, tuples are immutable: once created, they cannot be altered. Lists are more common, but tuples are useful when you want to ensure data integrity, for example by preserving critical parameters in a digital economy simulation.

You can create a tuple using round brackets:

In [None]:
# A tuple of numeric data
number_tuple = (2, 3, 5, 8) 

# A tuple of strings
string_tuple = ('price', 'elasticity', 'demand')

print('number_tuple type:', type(number_tuple))
print('number_tuple:', number_tuple, '\n')

print('string_tuple type:', type(string_tuple))
print('string_tuple:', string_tuple)

Notice the difference in brackets between tuples (round) and lists (square). Tuples are often useful when you're working with data that shouldn't be modified, like constant parameters in a model of digital markets.

Let's see how immutability works by comparing lists and tuples:

In [None]:
# Change the second element of the list to 1000
number_list = [2, 3, 5, 8]
number_list[1] = 1000    
print(number_list)

# Now try that with a tuple (this raises a TypeError, because tuples are immutable)
number_tuple[1] = 1000

Tuples are useful to protect essential variables, such as market parameters, from accidental changes when analyzing the impact of policies, prices, or competition shifts.
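
If you want a notebook to keep running after such an error, you can catch it. This is a minimal sketch with a hypothetical `model_params` tuple whose values are made up for illustration:

```python
# Hypothetical model parameters stored as a tuple (values are made up)
model_params = (0.05, 0.8, 100)

try:
    model_params[0] = 0.10   # item assignment is not allowed on tuples
except TypeError as err:
    print('Could not modify tuple:', err)

# The tuple still holds its original values
print(model_params)
```

Reading from a tuple works exactly as with lists (`model_params[0]`); only modification is blocked.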

# 3. Dictionaries <a id="dict"></a> ([top](#home))
Dictionaries (or dicts) are incredibly useful for handling datasets as key-value pairs, especially when you want to map categories (like product types or consumer segments) to specific values (such as prices, quantities, or preferences).

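A dict is a collection in which each element is a key-value pair, looked up by key rather than by position. Since Python 3.7, dicts preserve the order in which keys were inserted. The keys must be unique, but values can repeat. For example, let’s map user subscription tiers to price levels: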


In [None]:
# Subscription tiers mapped to prices
subscription_prices = {'Basic': 5.99, 'Pro': 15.99, 'Premium': 29.99}

print(type(subscription_prices))
print(subscription_prices)

In this dict, 'Basic', 'Pro', and 'Premium' are keys, while their associated values are prices. You reference values by their keys:

In [None]:
print(subscription_prices['Pro'])  # Returns 15.99

In [None]:
# What happens here? Will this return 'Basic'?
print(subscription_prices[5.99])

### Common mistake
Attempting to look up an entry by its value instead of its key will lead to an error:

In [None]:
# This will throw an error because dicts are referenced by keys, not values
print(subscription_prices[15.99])

You’ll get a `KeyError` because Python looks for a key equal to `15.99`, not a value.
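
When you are not sure whether a key exists, you can check first or ask for a default. The sketch below uses a separate, hypothetical `example_prices` dict and a made-up `'Student'` tier so it does not touch the dictionary above:

```python
# Hypothetical copy of the tiers, so this cell runs on its own
example_prices = {'Basic': 5.99, 'Pro': 15.99, 'Premium': 29.99}

# `in` tests keys, not values
has_pro_key = 'Pro' in example_prices
has_price_key = 15.99 in example_prices
print(has_pro_key)
print(has_price_key)

# .get() returns a default instead of raising a KeyError
student_price = example_prices.get('Student', 'not offered')
print(student_price)
```

Using `.get()` is a design choice: it is convenient when a missing key is expected, while square brackets are better when a missing key signals a real problem you want to see.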

### Adding / updating a dictionary
Dictionaries are mutable, meaning you can add new elements or change existing values. For instance, if a new tier is introduced or prices are updated:

In [None]:
# Add a new subscription tier
subscription_prices['Enterprise'] = 49.99
print(subscription_prices)

# Update the price of the 'Basic' tier
subscription_prices['Basic'] = 6.99
print(subscription_prices)

Dictionaries are excellent for managing categories of products, services, or consumer segments, because any entry can be accessed or updated directly by its key.
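
It is also common to go through every entry of a dictionary. A minimal sketch, again with a hypothetical `tier_prices` dict defined only for this example:

```python
# Hypothetical tiers and prices, for illustration only
tier_prices = {'Basic': 6.99, 'Pro': 15.99, 'Premium': 29.99}

# .items() yields one (key, value) pair per entry
for tier, price in tier_prices.items():
    print(tier, 'costs', price)

# Keys and values can also be pulled out as lists
tier_names = list(tier_prices.keys())
price_values = list(tier_prices.values())
print(tier_names)
print(price_values)
```

Each pair produced by `.items()` is a tuple, which the `for` loop unpacks into `tier` and `price`.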

## <font color='firebrick'> Practice</font>
Try the following exercises related to digital economics:

1. Create a dictionary in the cell below representing the average time spent online (in hours) by users on various platforms for 2022.
    - Keys: `'YouTube'`, `'Instagram'`, `'TikTok'`, `'Facebook'`
    - Assign values to reflect your own estimates (e.g., YouTube: 3, Instagram: 2).
    - Print your dictionary.

2. Can you give two platforms the same time value?


3. Assume a new report shows increased usage for TikTok but decreased usage for Facebook.
    - Update TikTok's and Facebook's values to new estimates.
    - Print the updated dictionary.

**Here’s a more advanced example:**

Consider the dictionary below representing revenue trends for major digital companies over three years:

In [None]:
digital_revenue = {
    'Year': [2018, 2019, 2020],
    'Google': [136.8, 161.9, 182.5],
    'Amazon': [232.9, 280.5, 386.1],
    'Facebook': [55.8, 70.7, 85.9],
}

- What are the keys and values?
- What do you think this data represents?
- How might this dictionary help us understand digital companies' growth and dominance?

Lastly, print the years and revenue for Amazon from the dictionary:

In [None]:
print(digital_revenue['Year'])
print(digital_revenue['Amazon'])

**Answer:**