# Python for Economists - Data Structures
## Working with Economic Data Collections

In this notebook, we'll learn how to work with collections of economic data. We'll cover:
* Lists (for time series data)
* Dictionaries (for organizing economic indicators)
* Basic data operations
* Introduction to working with economic datasets

### Why Data Structures Matter in Economics
* Store and analyze time series data
* Organize cross-sectional economic data
* Handle panel data efficiently
* Process multiple economic indicators

## 1. Lists in Python

Lists are ordered, mutable sequences, which makes them a natural fit for time series data or any other sequence of economic values. Think of a list as a single column in your economic data table.

In [None]:
# Creating lists of economic data
gdp_values = [20.94, 21.43, 22.99, 23.32]  # GDP in trillion USD
years = [2019, 2020, 2021, 2022]
unemployment_rates = [3.7, 8.1, 5.4, 3.6]  # in percentage

# Accessing list elements
print(f"GDP in 2020: ${gdp_values[1]} trillion")
print(f"Most recent unemployment rate: {unemployment_rates[-1]}%")

# List length
print(f"Number of years in the dataset: {len(years)}")
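
Beyond indexing single elements, lists support slicing and can be paired element by element with the built-in `zip`. The following is a minimal sketch that reuses the values above; the names `recent_gdp` and `gdp_by_year` are introduced here purely for illustration.

```python
# Same illustrative values as in the cell above
gdp_values = [20.94, 21.43, 22.99, 23.32]  # GDP in trillion USD
years = [2019, 2020, 2021, 2022]

# Slicing: take the last two observations
recent_gdp = gdp_values[-2:]
print("Two most recent GDP values:", recent_gdp)

# zip pairs each year with the GDP value at the same position
gdp_by_year = list(zip(years, gdp_values))
for year, gdp in gdp_by_year:
    print(f"{year}: ${gdp} trillion")
```

Keeping `years` and `gdp_values` the same length matters here: `zip` silently stops at the shorter list.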

### Basic List Operations for Economic Analysis

In [None]:
# Calculate year-over-year GDP growth rates
gdp_growth_rates = []
for i in range(1, len(gdp_values)):
    growth_rate = ((gdp_values[i] - gdp_values[i-1]) / gdp_values[i-1]) * 100
    gdp_growth_rates.append(round(growth_rate, 2))

print("GDP Growth Rates (%):", gdp_growth_rates)

# Basic statistical operations
average_unemployment = sum(unemployment_rates) / len(unemployment_rates)
max_unemployment = max(unemployment_rates)
min_unemployment = min(unemployment_rates)

print("\nUnemployment Statistics:")
print(f"Average: {average_unemployment:.1f}%")
print(f"Maximum: {max_unemployment}%")
print(f"Minimum: {min_unemployment}%")
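
The index-based growth loop above can also be written as a list comprehension that walks over consecutive pairs of values. This is one possible alternative, not a replacement for the loop; it assumes the same `gdp_values` list and uses the hypothetical loop names `previous` and `current`.

```python
gdp_values = [20.94, 21.43, 22.99, 23.32]  # GDP in trillion USD

# zip(gdp_values, gdp_values[1:]) yields (previous, current) pairs:
# (20.94, 21.43), (21.43, 22.99), (22.99, 23.32)
gdp_growth_rates = [
    round((current - previous) / previous * 100, 2)
    for previous, current in zip(gdp_values, gdp_values[1:])
]

print("GDP Growth Rates (%):", gdp_growth_rates)
```

Pairing values this way avoids manual index arithmetic, which is a common source of off-by-one errors.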

## 2. Dictionaries in Python

Dictionaries are ideal for storing related economic indicators or cross-sectional data across countries/regions.

In [None]:
# Economic indicators for different countries
country_indicators = {
    "USA": {
        "gdp": 23.32,
        "unemployment": 3.6,
        "inflation": 8.0
    },
    "Japan": {
        "gdp": 4.94,
        "unemployment": 2.6,
        "inflation": 3.8
    },
    "Germany": {
        "gdp": 4.26,
        "unemployment": 3.0,
        "inflation": 7.5
    }
}

# Accessing dictionary data
print(f"US GDP: ${country_indicators['USA']['gdp']} trillion")
print(f"Japan Unemployment: {country_indicators['Japan']['unemployment']}%")

# Comparing inflation rates
for country, data in country_indicators.items():
    print(f"{country} inflation rate: {data['inflation']}%")
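
Cross-sectional comparisons often come down to finding or ranking countries by one indicator. A short sketch, assuming the same `country_indicators` structure as above: `max` and `sorted` both accept a `key` function that selects the indicator to compare. The names `highest_inflation` and `by_unemployment` are illustrative.

```python
country_indicators = {
    "USA": {"gdp": 23.32, "unemployment": 3.6, "inflation": 8.0},
    "Japan": {"gdp": 4.94, "unemployment": 2.6, "inflation": 3.8},
    "Germany": {"gdp": 4.26, "unemployment": 3.0, "inflation": 7.5},
}

# max() iterates over the country names; the key function picks the value to compare
highest_inflation = max(
    country_indicators, key=lambda c: country_indicators[c]["inflation"]
)
print(f"Highest inflation: {highest_inflation}")

# sorted() with a key ranks countries by unemployment, lowest first
by_unemployment = sorted(
    country_indicators, key=lambda c: country_indicators[c]["unemployment"]
)
print("Ranked by unemployment (lowest first):", by_unemployment)
```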

### Working with Economic Time Series Data

In [None]:
# Creating a time series of quarterly GDP data
quarterly_gdp = {
    "2022-Q1": 24.386,
    "2022-Q2": 24.899,
    "2022-Q3": 25.725,
    "2022-Q4": 25.462
}

# Calculate quarterly growth rates
quarters = list(quarterly_gdp.keys())
values = list(quarterly_gdp.values())

print("Quarterly GDP Growth Rates:")
for i in range(1, len(quarters)):
    growth = ((values[i] - values[i-1]) / values[i-1]) * 100
    print(f"{quarters[i]}: {growth:.1f}%")
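
The loop above relies on the dictionary keeping its quarters in insertion order, which Python guarantees from version 3.7 onward. If the growth rates are needed later, they can be stored in their own dictionary instead of only being printed. A minimal sketch, where `quarterly_growth` is a name introduced for this example:

```python
quarterly_gdp = {
    "2022-Q1": 24.386,
    "2022-Q2": 24.899,
    "2022-Q3": 25.725,
    "2022-Q4": 25.462,
}

quarters = list(quarterly_gdp)  # keys in insertion order

# Map each quarter (from the second onward) to its growth over the previous quarter
quarterly_growth = {
    current: (quarterly_gdp[current] - quarterly_gdp[previous])
    / quarterly_gdp[previous]
    * 100
    for previous, current in zip(quarters, quarters[1:])
}

for quarter, growth in quarterly_growth.items():
    print(f"{quarter}: {growth:.1f}%")
```

The first quarter has no entry because there is no earlier observation to compare it with.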

## 3. Combining Lists and Dictionaries

Let's create a more complex economic dataset combining both data structures.

In [None]:
# Economic data for multiple countries over time
economic_data = {
    "USA": {
        "gdp": [20.94, 21.43, 22.99, 23.32],
        "unemployment": [3.7, 8.1, 5.4, 3.6],
        "years": [2019, 2020, 2021, 2022]
    },
    "Japan": {
        "gdp": [5.08, 4.92, 4.94, 4.23],
        "unemployment": [2.4, 2.8, 2.8, 2.6],
        "years": [2019, 2020, 2021, 2022]
    }
}

# Analysis example: Compare GDP growth patterns
for country, data in economic_data.items():
    print(f"\n{country} GDP Growth Rates:")
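    # Use distinct names so the gdp_values and years lists from Section 1 are not overwritten
    country_gdp = data['gdp']
    country_years = data['years']
    
    for i in range(1, len(country_gdp)):
        growth = ((country_gdp[i] - country_gdp[i-1]) / country_gdp[i-1]) * 100
        print(f"{country_years[i]}: {growth:.1f}%")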
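
The same nested structure supports other per-country summaries. As a sketch under the assumption that each country's lists are aligned by position (the value at index `i` belongs to the year at index `i`), the code below computes the average unemployment rate and the year in which unemployment peaked. The `summary` dictionary and its keys are hypothetical names used only here.

```python
economic_data = {
    "USA": {
        "gdp": [20.94, 21.43, 22.99, 23.32],
        "unemployment": [3.7, 8.1, 5.4, 3.6],
        "years": [2019, 2020, 2021, 2022],
    },
    "Japan": {
        "gdp": [5.08, 4.92, 4.94, 4.23],
        "unemployment": [2.4, 2.8, 2.8, 2.6],
        "years": [2019, 2020, 2021, 2022],
    },
}

summary = {}
for country, data in economic_data.items():
    rates = data["unemployment"]
    average_rate = sum(rates) / len(rates)
    # index() returns the position of the first occurrence of the maximum,
    # which is then used to look up the matching year
    peak_index = rates.index(max(rates))
    summary[country] = {
        "average": average_rate,
        "peak_year": data["years"][peak_index],
    }
    print(
        f"{country}: average {average_rate:.2f}%, "
        f"peak in {summary[country]['peak_year']}"
    )
```

When the maximum occurs more than once, as in Japan's series, this reports only the earliest year.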

## 4. Basic Data Analysis Functions

In [None]:
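def calculate_basic_statistics(data_list):
    """Calculate basic statistical measures for economic data.

    Returns the mean, median, sample standard deviation, minimum, and maximum.
    """
    n = len(data_list)
    # The sample variance divides by n - 1, so at least two observations are required
    if n < 2:
        raise ValueError("data_list must contain at least two observations")
    mean = sum(data_list) / n
    sorted_data = sorted(data_list)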
    
    # Calculate median
    if n % 2 == 0:
        median = (sorted_data[n//2 - 1] + sorted_data[n//2]) / 2
    else:
        median = sorted_data[n//2]
    
    # Calculate the sample variance (n - 1 denominator) and standard deviation
    variance = sum((x - mean) ** 2 for x in data_list) / (n - 1)
    std_dev = variance ** 0.5
    
    return {
        "mean": mean,
        "median": median,
        "std_dev": std_dev,
        "min": min(data_list),
        "max": max(data_list)
    }

# Example: Analyze unemployment rates
us_unemployment = economic_data["USA"]["unemployment"]
stats = calculate_basic_statistics(us_unemployment)

print("US Unemployment Statistics:")
for metric, value in stats.items():
    print(f"{metric}: {value:.2f}")
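
Python's standard library also ships a `statistics` module, which offers a convenient way to cross-check hand-written calculations like the ones above. The sketch below uses `statistics.mean`, `statistics.median`, and `statistics.stdev`; like the function above, `stdev` computes the sample standard deviation with an n - 1 denominator. The variable names are illustrative.

```python
import statistics

us_unemployment = [3.7, 8.1, 5.4, 3.6]  # same values as the USA series above

mean_rate = statistics.mean(us_unemployment)
median_rate = statistics.median(us_unemployment)
# stdev() is the sample standard deviation (divides by n - 1)
std_dev_rate = statistics.stdev(us_unemployment)

print(f"mean: {mean_rate:.2f}")
print(f"median: {median_rate:.2f}")
print(f"std_dev: {std_dev_rate:.2f}")
```

Writing the calculation by hand first is still useful for learning; the module is mainly a shortcut once the formulas are understood.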

## Practice Exercises

1. Create a list of monthly inflation rates for 2022 and calculate:
   * Average inflation rate
   * Number of months with inflation above 5%
   * The month with highest inflation

2. Create a dictionary of economic indicators for three countries and:
   * Compare their GDP per capita
   * Find the country with lowest unemployment
   * Calculate average inflation across all countries

3. Using the provided `calculate_basic_statistics` function:
   * Analyze GDP growth rates for both USA and Japan
   * Compare the volatility (standard deviation) of their growth rates

## Key Takeaways

* Lists are excellent for time series economic data
* Dictionaries help organize complex economic indicators
* Combining both allows for sophisticated data organization
* Built-in functions such as `sum`, `len`, `min`, `max`, and `sorted` are enough for basic descriptive statistics

## Next Steps

In the next notebook, we'll cover:
* Introduction to Pandas for economic data analysis
* Data visualization with Matplotlib
* Importing and working with real economic datasets