# Introduction to Economic Data Analysis with Python

This notebook demonstrates basic economic data analysis in Python using pandas, NumPy, Matplotlib, seaborn, and SciPy.

## Topics Covered
1. Loading economic data
2. Exploratory data analysis
3. Time series visualization
4. Basic statistical analysis
5. Correlation analysis

## Setup

First, let's install and import the necessary libraries.

In [None]:
# Install required packages (uncomment if running in Google Colab)
# !pip install pandas numpy matplotlib seaborn scipy statsmodels -q

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

# Set style for better-looking plots
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)

# Display settings
pd.set_option('display.max_columns', None)
pd.set_option('display.precision', 2)

print("Libraries imported successfully!")

## 1. Loading Economic Data

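We'll load a sample CSV with a `date` column and several indicators, including GDP, the unemployment rate, the inflation rate, the interest rate, and consumer confidence.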

In [None]:
# Load from a local file if available; otherwise fall back to the copy on GitHub (e.g. in Colab)
try:
    # Try loading from local file
    df = pd.read_csv('../data/sample_economic_data.csv')
except FileNotFoundError:
    # If in Colab, load from GitHub
    url = 'https://raw.githubusercontent.com/koiti-yano/colab_and_economics/main/data/sample_economic_data.csv'
    df = pd.read_csv(url)

# Convert date to datetime
df['date'] = pd.to_datetime(df['date'])
df.set_index('date', inplace=True)

print("Data loaded successfully!")
print(f"Shape: {df.shape}")
print(f"\nColumns: {df.columns.tolist()}")
print(f"\nDate range: {df.index.min()} to {df.index.max()}")

In [None]:
# Display first few rows
df.head(10)
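
Before going further, it can help to confirm that the date index is sorted, free of duplicates, and regularly spaced, since the 12-period growth rate and the moving averages in Section 4 rely on a fixed frequency. The sketch below is one way to check this; it runs on a small hypothetical frame `toy` with made-up values rather than on the sample dataset.

```python
import pandas as pd

# Hypothetical monthly frame (illustrative values, not the sample dataset)
idx = pd.date_range("2020-01-01", periods=6, freq="MS")
toy = pd.DataFrame({"gdp_billions": [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]}, index=idx)

# Basic sanity checks on a datetime index
is_sorted = toy.index.is_monotonic_increasing
has_duplicates = toy.index.has_duplicates

# infer_freq returns a frequency string such as "MS" (month start),
# or None when the spacing is irregular
freq = pd.infer_freq(toy.index)

print(f"sorted={is_sorted}, duplicates={has_duplicates}, freq={freq}")
```

The same three checks can be applied to `df.index` once the `date` column has been set as the index.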

## 2. Exploratory Data Analysis

Let's examine the basic statistics of our economic indicators.

In [None]:
# Summary statistics
df.describe()

In [None]:
# Check for missing values
print("Missing values:")
print(df.isnull().sum())
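
If the check reports gaps, one common option for time series is linear interpolation between the neighboring observations. This is a minimal sketch on a hypothetical series with two missing months; whether interpolation is appropriate for a given indicator is a judgment call, and the values here are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series with two gaps (illustrative values only)
idx = pd.date_range("2020-01-01", periods=6, freq="MS")
toy = pd.DataFrame(
    {"unemployment_rate": [4.0, np.nan, 5.0, 5.5, np.nan, 6.5]}, index=idx
)

# Count missing values per column
missing_before = toy.isnull().sum()

# Fill interior gaps with the straight line between known neighbors
filled = toy.interpolate(method="linear")
missing_after = filled.isnull().sum()

print(filled)
```

Note that `interpolate` leaves leading missing values untouched, because there is no earlier observation to draw the line from.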

## 3. Time Series Visualization

Visualizing economic time series helps identify trends, cycles, and patterns.

In [None]:
# Plot GDP over time
fig, ax = plt.subplots(figsize=(14, 6))
ax.plot(df.index, df['gdp_billions'], linewidth=2, color='blue')
ax.set_title('GDP Over Time', fontsize=16, fontweight='bold')
ax.set_xlabel('Date', fontsize=12)
ax.set_ylabel('GDP (Billions USD)', fontsize=12)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

In [None]:
# Plot unemployment and inflation rates
fig, axes = plt.subplots(2, 1, figsize=(14, 10))

# Unemployment rate
axes[0].plot(df.index, df['unemployment_rate'], linewidth=2, color='red')
axes[0].set_title('Unemployment Rate', fontsize=14, fontweight='bold')
axes[0].set_ylabel('Rate (%)', fontsize=12)
axes[0].grid(True, alpha=0.3)

# Inflation rate
axes[1].plot(df.index, df['inflation_rate'], linewidth=2, color='green')
axes[1].set_title('Inflation Rate', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Date', fontsize=12)
axes[1].set_ylabel('Rate (%)', fontsize=12)
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

In [None]:
# Plot all indicators together (normalized)
fig, ax = plt.subplots(figsize=(14, 7))

# Normalize each series to start at 100
for col in df.columns:
    normalized = (df[col] / df[col].iloc[0]) * 100
    ax.plot(df.index, normalized, linewidth=2, label=col)

ax.set_title('All Economic Indicators (Normalized, Base=100)', fontsize=16, fontweight='bold')
ax.set_xlabel('Date', fontsize=12)
ax.set_ylabel('Index (Base = 100)', fontsize=12)
ax.legend(loc='best')
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
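
Rebasing divides each observation by the first one and multiplies by 100, so every series starts at 100 and later values read as a percentage of the starting level. The sketch below works through this on made-up numbers; `rebase_to_100` is a hypothetical helper introduced for illustration, not part of the notebook.

```python
import numpy as np
import pandas as pd

def rebase_to_100(series: pd.Series) -> pd.Series:
    """Rescale a series so that its first non-missing value equals 100."""
    base = series.dropna().iloc[0]
    if base == 0:
        # Division by zero would yield inf/NaN for the whole series
        raise ValueError("cannot rebase a series whose first value is zero")
    return series / base * 100

# Hypothetical GDP levels (illustrative values only)
idx = pd.date_range("2020-01-01", periods=3, freq="MS")
gdp = pd.Series([200.0, 250.0, 300.0], index=idx, name="gdp_billions")

indexed = rebase_to_100(gdp)
print(indexed)
```

Rates that start near zero, such as a low inflation rate, produce very large or unstable index values, so this kind of chart is usually most informative for level series.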

## 4. Basic Statistical Analysis

This section computes the year-over-year GDP growth rate and smooths the unemployment rate with 6- and 12-month moving averages.

In [None]:
# Calculate year-over-year GDP growth rate (periods=12 assumes monthly observations)
df['gdp_growth_yoy'] = df['gdp_billions'].pct_change(periods=12) * 100

# Plot GDP growth rate
fig, ax = plt.subplots(figsize=(14, 6))
ax.plot(df.index, df['gdp_growth_yoy'], linewidth=2, color='purple')
ax.axhline(y=0, color='black', linestyle='--', linewidth=1)
ax.set_title('Year-over-Year GDP Growth Rate', fontsize=16, fontweight='bold')
ax.set_xlabel('Date', fontsize=12)
ax.set_ylabel('Growth Rate (%)', fontsize=12)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
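
With monthly data, `pct_change(periods=12)` compares each observation with the one 12 rows earlier, i.e. `(x_t / x_{t-12} - 1) * 100`, so the first 12 values are missing. A small sketch on a hypothetical series, constructed so that the growth rate is about 5% in every month of the second year:

```python
import numpy as np
import pandas as pd

# Hypothetical monthly level: 100 throughout year one, 105 throughout year two
idx = pd.date_range("2020-01-01", periods=24, freq="MS")
level = pd.Series(np.concatenate([np.full(12, 100.0), np.full(12, 105.0)]), index=idx)

# Year-over-year growth via pct_change
yoy = level.pct_change(periods=12) * 100

# The same quantity written out with an explicit 12-row shift
manual = (level / level.shift(12) - 1) * 100

print(yoy.dropna().round(2))
```

If the data were quarterly instead, the equivalent call would use `periods=4`.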

In [None]:
# Calculate moving averages for unemployment rate
df['unemployment_ma_6'] = df['unemployment_rate'].rolling(window=6).mean()
df['unemployment_ma_12'] = df['unemployment_rate'].rolling(window=12).mean()

# Plot unemployment with moving averages
fig, ax = plt.subplots(figsize=(14, 6))
ax.plot(df.index, df['unemployment_rate'], linewidth=1, alpha=0.5, label='Actual')
ax.plot(df.index, df['unemployment_ma_6'], linewidth=2, label='6-Month MA')
ax.plot(df.index, df['unemployment_ma_12'], linewidth=2, label='12-Month MA')
ax.set_title('Unemployment Rate with Moving Averages', fontsize=16, fontweight='bold')
ax.set_xlabel('Date', fontsize=12)
ax.set_ylabel('Rate (%)', fontsize=12)
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
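
A rolling mean with `window=k` averages the current value and the previous `k - 1` values, which is why the first `k - 1` entries are missing by default. The sketch below shows this on a short hypothetical series with a 3-period window, along with the `min_periods` option that allows partial windows at the start.

```python
import pandas as pd

# Hypothetical monthly rates (illustrative values only)
idx = pd.date_range("2020-01-01", periods=5, freq="MS")
rate = pd.Series([4.0, 5.0, 6.0, 7.0, 8.0], index=idx)

# Full windows only: the first two entries are NaN
ma3 = rate.rolling(window=3).mean()

# Partial windows allowed: early entries average whatever is available
ma3_partial = rate.rolling(window=3, min_periods=1).mean()

print(pd.DataFrame({"rate": rate, "ma3": ma3, "ma3_partial": ma3_partial}))
```

Partial windows avoid the gap at the start of the chart, at the cost of early values being averaged over fewer observations.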

## 5. Correlation Analysis

Examine relationships between different economic indicators.

In [None]:
# Calculate correlation matrix
correlation_matrix = df[['gdp_billions', 'unemployment_rate', 'inflation_rate', 
                         'interest_rate', 'consumer_confidence']].corr()

print("Correlation Matrix:")
print(correlation_matrix)
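
`DataFrame.corr()` reports Pearson coefficients but not how much evidence there is against zero correlation. One way to get a p-value for a single pair is `scipy.stats.pearsonr`. The sketch below uses simulated data with hypothetical column names `a` and `b`, not the sample dataset.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Simulated pair with a built-in negative relationship (not real data)
rng = np.random.default_rng(0)
n = 120
x = rng.normal(size=n)
toy = pd.DataFrame({"a": x, "b": -0.8 * x + rng.normal(scale=0.5, size=n)})

# Pearson correlation and two-sided p-value from SciPy
r, p_value = stats.pearsonr(toy["a"], toy["b"])

# The coefficient matches what pandas computes
r_pandas = toy["a"].corr(toy["b"])

print(f"r = {r:.3f}, p-value = {p_value:.3g}")
```

The p-value assumes independent observations, which rarely holds for economic time series, so it is best read as a rough guide rather than a formal test.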

In [None]:
# Visualize correlation matrix as heatmap
fig, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, fmt='.2f', cmap='coolwarm', 
           center=0, square=True, linewidths=1, cbar_kws={"shrink": 0.8})
ax.set_title('Correlation Matrix of Economic Indicators', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()
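
One caveat when reading a correlation matrix of levels: two series that both trend upward can be highly correlated even when their period-to-period movements are unrelated. The sketch below illustrates this with two independently simulated random walks with drift (hypothetical data), comparing the correlation of levels with the correlation of first differences.

```python
import numpy as np
import pandas as pd

# Two independent random walks sharing only an upward drift (simulated)
rng = np.random.default_rng(42)
n = 200
a = pd.Series(np.cumsum(1.0 + rng.normal(size=n)))
b = pd.Series(np.cumsum(1.0 + rng.normal(size=n)))

# Correlation of the trending levels
corr_levels = a.corr(b)

# Correlation of period-to-period changes
corr_changes = a.diff().corr(b.diff())

print(f"levels: {corr_levels:.2f}, changes: {corr_changes:.2f}")
```

For trending indicators such as GDP, comparing growth rates or differences is often a more informative check than comparing levels.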

In [None]:
# Scatter plot: GDP vs Unemployment
fig, ax = plt.subplots(figsize=(10, 6))
ax.scatter(df['gdp_billions'], df['unemployment_rate'], alpha=0.5)
ax.set_xlabel('GDP (Billions USD)', fontsize=12)
ax.set_ylabel('Unemployment Rate (%)', fontsize=12)
ax.set_title('GDP vs Unemployment Rate', fontsize=16, fontweight='bold')
ax.grid(True, alpha=0.3)

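# Add trend line, fitted only on rows where both series are observed
pairs = df[['gdp_billions', 'unemployment_rate']].dropna().sort_values('gdp_billions')
z = np.polyfit(pairs['gdp_billions'], pairs['unemployment_rate'], 1)
p = np.poly1d(z)
ax.plot(pairs['gdp_billions'], p(pairs['gdp_billions']), "r--", linewidth=2, label='Trend')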
ax.legend()

plt.tight_layout()
plt.show()
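
A fitted trend line needs both variables to come from the same rows. Dropping missing values from each column separately can leave the two arrays misaligned, sometimes with equal lengths so that no error is raised. The sketch below uses a hypothetical frame `toy` with made-up values to show row-wise alignment, with `scipy.stats.linregress` as one alternative to `np.polyfit` that also reports the correlation and a p-value.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical data with missing values in different rows
toy = pd.DataFrame({
    "gdp_billions": [100.0, 110.0, np.nan, 130.0, 140.0],
    "unemployment_rate": [8.0, 7.5, 7.0, np.nan, 6.3],
})

# Dropping NaNs per column keeps four values each, but from different rows
naive_x = toy["gdp_billions"].dropna()
naive_y = toy["unemployment_rate"].dropna()
same_rows = naive_x.index.equals(naive_y.index)

# Dropping NaNs row-wise keeps only complete pairs
pairs = toy[["gdp_billions", "unemployment_rate"]].dropna()

# Ordinary least-squares line through the complete pairs
fit = stats.linregress(pairs["gdp_billions"], pairs["unemployment_rate"])

print(f"rows used: {len(pairs)}, slope: {fit.slope:.4f}, intercept: {fit.intercept:.2f}")
```

The slope and intercept agree with `np.polyfit(..., 1)` on the same pairs; `linregress` additionally exposes `rvalue`, `pvalue`, and `stderr`.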

## Conclusion

This notebook demonstrated:
- Loading and exploring economic data
- Visualizing time series data
- Computing growth rates and moving averages
- Analyzing correlations between economic indicators

## Next Steps

- Try with real economic data from APIs (FRED, World Bank)
- Perform more advanced time series analysis (ARIMA, VAR models)
- Conduct econometric analysis (regression, causality tests)
- Build forecasting models