# 🌍 Global Warming Data Analysis Project

## Integrating Python Day 1 & Day 2 Concepts

This comprehensive project demonstrates all concepts learned in:
- **Day 1**: Python basics, data types, data structures (lists, tuples, sets, ranges)
- **Day 2**: Functions, packages, NumPy arrays, vectorization, lambda functions

### Project Overview
We'll analyze 145 years (1880-2024) of synthetic global temperature data to:
1. Understand warming trends
2. Analyze acceleration of climate change
3. Assess city-specific impacts
4. Generate future projections

---

## 📦 Import Required Packages

First, we'll import the packages we need. These demonstrate Day 2's concept of leveraging Python's ecosystem.

In [None]:
# Day 2 Concept: Importing packages
import numpy as np  # For numerical computing
import random      # For generating random variations
from datetime import datetime, timedelta  # For date operations
import math        # For mathematical functions

# Set random seed for reproducibility
random.seed(42)
np.random.seed(42)

print("✅ Packages imported successfully!")
print(f"NumPy version: {np.__version__}")

## Section 1: Data Generation & Structures

### Day 1 Concepts Applied:
- **Lists**: Ordered, mutable collections for temperature data
- **Tuples**: Immutable records for (year, temperature, anomaly)
- **Sets**: Unique decades
- **Ranges**: Efficient year generation
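
As a warm-up, here is a minimal sketch of the four structures on a tiny sample. The names `sample_years`, `sample_temps`, `sample_records`, and `sample_decades`, and all the values, are made up for illustration and are not part of the project data.

```python
# Illustrative values only; not taken from the project data.
sample_years = list(range(1995, 2005))   # range -> list of 10 consecutive years
sample_temps = [14.1, 14.3, 14.2, 14.5, 14.4, 14.6, 14.8, 14.7, 14.9, 15.0]

# Tuples bundle one fixed record: (year, temperature)
sample_records = list(zip(sample_years, sample_temps))

# A set keeps each decade once, however many years fall into it
sample_decades = {year // 10 * 10 for year in sample_years}

print(sample_records[0])       # first record as a tuple
print(sorted(sample_decades))  # unique decades in ascending order
```

Ten years collapse to just two decades in the set, which is why a set is a natural fit for "which decades do we cover?".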

In [None]:
def generate_temperature_data():
    """
    Generate synthetic global temperature data using Day 1 concepts.
    Returns multiple data structures for analysis.
    """
    print("🌍 GLOBAL WARMING DATA ANALYSIS PROJECT")
    print("=" * 50)
    print("\n📊 Generating Temperature Data...")
    
    # Day 1 Concepts: Lists for storing sequences
    years = list(range(1880, 2025))  # Historical years
    cities = ["New York", "London", "Tokyo", "Sydney", "Mumbai", "Cairo", "São Paulo", "Moscow"]
    
    # Generate base temperatures with warming trend
    base_temp = 14.0  # Global average in Celsius (1880)
    warming_rate = 0.01  # Degrees per year
    
    # Create temperature records as list of tuples (Day 1: tuples for fixed records)
    temperature_records = []
    
    for i, year in enumerate(years):
        # Add warming trend + random variation
        annual_temp = base_temp + (warming_rate * i) + random.uniform(-0.5, 0.5)
        
        # Create tuple: (year, temperature, anomaly)
        anomaly = annual_temp - base_temp
        record = (year, round(annual_temp, 2), round(anomaly, 2))
        temperature_records.append(record)
    
    # Day 1 Concept: Sets for unique values
    decade_set = set(year // 10 * 10 for year in years)
    
    print(f"✓ Generated {len(temperature_records)} years of data")
    print(f"✓ Tracking {len(cities)} cities")
    print(f"✓ Covering {len(decade_set)} decades")
    
    return temperature_records, cities, decade_set

# Generate the data
records, cities, decades = generate_temperature_data()

# Display sample data
print("\n📌 Sample Records (first 3 and last 3):")
print("Year | Temperature | Anomaly")
print("-" * 30)
for record in records[:3]:
    print(f"{record[0]} | {record[1]:.2f}°C | {record[2]:+.2f}°C")
print("...")
for record in records[-3:]:
    print(f"{record[0]} | {record[1]:.2f}°C | {record[2]:+.2f}°C")

## Section 2: Functions for Data Processing

### Day 2 Concepts:
- **Functions with clear purpose**: Modular, reusable code
- **Parameters and return values**: Input/output handling
- **Docstrings**: Clear documentation
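
The three ideas can be seen together in one small function. This is only a sketch: `moving_average` is a hypothetical helper that is not used elsewhere in the project, and the input values are made up.

```python
def moving_average(values, window=3):
    """Return the simple moving average of `values` over `window` points.

    Hypothetical helper for illustration; not part of the project code.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

demo_values = [14.0, 14.2, 14.1, 14.4, 14.6]
smoothed = moving_average(demo_values)            # uses the default window=3
pairwise = moving_average(demo_values, window=2)  # overrides the default
print(len(smoothed), len(pairwise))
```

The default value lets callers skip `window` in the common case while still allowing an override, and the docstring is what `help(moving_average)` displays.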

In [None]:
def calculate_decade_averages(records):
    """
    Day 2 Concept: Function with clear purpose
    Calculate average temperatures by decade.
    """
    decade_temps = {}
    
    for year, temp, anomaly in records:
        decade = year // 10 * 10
        if decade not in decade_temps:
            decade_temps[decade] = []
        decade_temps[decade].append(temp)
    
    # Calculate averages
    decade_averages = {}
    for decade, temps in decade_temps.items():
        decade_averages[decade] = round(sum(temps) / len(temps), 2)
    
    return decade_averages

# Calculate and display decade averages
decade_avgs = calculate_decade_averages(records)

print("📊 Average Temperature by Decade:")
print("=" * 35)
for decade in sorted(decade_avgs.keys()):
    avg_temp = decade_avgs[decade]
    bar = "█" * int((avg_temp - 13) * 10)  # Simple bar chart
    print(f"{decade}s: {avg_temp:.2f}°C {bar}")

In [None]:
def analyze_warming_acceleration(records, window_size=30):
    """
    Day 2 Concept: Function with parameters
    Analyze the rate of warming over time using moving windows.
    """
    years = [r[0] for r in records]
    temps = [r[1] for r in records]
    
    warming_rates = []
    
    for i in range(len(records) - window_size + 1):  # +1 so the final full window is included
        window_years = years[i:i+window_size]
        window_temps = temps[i:i+window_size]
        
        # Simple linear regression (slope)
        n = len(window_years)
        sum_x = sum(window_years)
        sum_y = sum(window_temps)
        sum_xy = sum(x*y for x, y in zip(window_years, window_temps))
        sum_x2 = sum(x**2 for x in window_years)
        
        slope = (n*sum_xy - sum_x*sum_y) / (n*sum_x2 - sum_x**2)
        warming_rates.append((window_years[window_size//2], round(slope * 10, 3)))  # Per decade
    
    return warming_rates

# Analyze warming acceleration
warming_rates = analyze_warming_acceleration(records)

print("📈 Warming Acceleration Analysis")
print("Sample warming rates (°C per decade):")
print("-" * 40)
# Show rates from different periods
sample_indices = [0, len(warming_rates)//3, 2*len(warming_rates)//3, -1]
for idx in sample_indices:
    year, rate = warming_rates[idx]
    print(f"Around {year}: {rate:.3f}°C/decade")
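
To build confidence in the slope formula used in the cell above, the sketch below compares it with `np.polyfit` on a made-up, perfectly linear series. The helper name `ols_slope` and the demo data are assumptions for illustration only.

```python
import numpy as np

def ols_slope(xs, ys):
    """Least-squares slope using the same closed-form sums as the cell above."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# Made-up series: exactly 0.02 °C per year, starting at 14.0 °C
demo_years = list(range(1990, 2020))
demo_temps = [14.0 + 0.02 * (y - 1990) for y in demo_years]

manual = ols_slope(demo_years, demo_temps)
fitted = np.polyfit(demo_years, demo_temps, 1)[0]  # degree-1 fit; first coefficient is the slope

print(f"manual: {manual:.4f} °C/year, polyfit: {fitted:.4f} °C/year")
```

Both values should agree up to floating-point rounding; multiplying by 10 converts them to the per-decade rate reported above.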

## Section 3: NumPy Operations

### Day 2 Advanced Concepts:
- **NumPy arrays**: Efficient numerical computing
- **Vectorization**: Operations without loops
- **Boolean indexing**: Powerful filtering
- **Statistical functions**: Built-in analysis tools
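
Before the full analysis, here is a minimal sketch of vectorization and boolean indexing on five made-up values. The array names and numbers are illustrative assumptions.

```python
import numpy as np

# Illustrative values only
demo_years = np.array([2018, 2019, 2020, 2021, 2022])
demo_temps = np.array([14.8, 15.1, 15.3, 14.9, 15.2])

# Vectorization: one expression converts every element, no explicit loop
demo_fahrenheit = demo_temps * 9 / 5 + 32

# Boolean indexing: the comparison yields a mask that selects matching years
above_15 = demo_temps > 15.0
print(demo_years[above_15])
print(demo_fahrenheit.round(1))
```

The mask `above_15` is itself an array of `True`/`False` values, so it can be reused to filter any array of the same length.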

In [None]:
def numpy_climate_analysis(records):
    """
    Day 2 Concept: NumPy for efficient numerical computing
    Perform vectorized operations on temperature data.
    """
    print("🔬 NumPy Analysis (Vectorized Operations)")
    print("=" * 50)
    
    # Convert to NumPy arrays
    years = np.array([r[0] for r in records])
    temps = np.array([r[1] for r in records])
    anomalies = np.array([r[2] for r in records])
    
    # Vectorized operations (Day 2: No loops needed!)
    print(f"\n📊 Temperature Statistics:")
    print(f"  Mean: {np.mean(temps):.2f}°C")
    print(f"  Std Dev: {np.std(temps):.2f}°C")
    print(f"  Min: {np.min(temps):.2f}°C (Year {years[np.argmin(temps)]})")
    print(f"  Max: {np.max(temps):.2f}°C (Year {years[np.argmax(temps)]})")
    print(f"  Median: {np.median(temps):.2f}°C")
    print(f"  25th percentile: {np.percentile(temps, 25):.2f}°C")
    print(f"  75th percentile: {np.percentile(temps, 75):.2f}°C")
    
    # Boolean indexing (Day 2: NumPy's superpower)
    warm_years = years[temps > np.mean(temps)]
    very_warm_years = years[temps > np.percentile(temps, 90)]
    
    print(f"\n🔥 Warming Trends:")
    print(f"  Years above average: {len(warm_years)} ({len(warm_years)/len(years)*100:.1f}%)")
    print(f"  Years in top 10%: {len(very_warm_years)}")
    hottest_years = years[np.argsort(temps)[-5:]]  # years with the 5 highest temperatures
    print(f"  Hottest years: {sorted(hottest_years.tolist())}")
    
    # Recent vs historical comparison
    historical = temps[:50]  # First 50 years
    recent = temps[-50:]      # Last 50 years
    
    print(f"\n📈 Historical vs Recent (50-year periods):")
    print(f"  Historical avg (1880-1929): {np.mean(historical):.2f}°C")
    print(f"  Recent avg (1975-2024): {np.mean(recent):.2f}°C")
    print(f"  Warming: {np.mean(recent) - np.mean(historical):+.2f}°C")
    print(f"  Volatility change: {np.std(recent) - np.std(historical):+.3f}°C")
    
    # Cumulative warming
    cumulative_anomaly = np.cumsum(anomalies)
    
    return years, temps, anomalies, cumulative_anomaly

# Perform NumPy analysis
numpy_results = numpy_climate_analysis(records)

## Section 4: Lambda Functions & Filtering

### Day 2 Concepts:
- **Lambda functions**: Quick one-line functions
- **Filter function**: Applying conditions to data
- **List comprehensions**: Pythonic data filtering
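
A lambda passed to `filter` and a list comprehension with the same condition select the same items. The sketch below shows this on three made-up `(year, temperature, anomaly)` records; the names are illustrative assumptions.

```python
# Illustrative (year, temperature, anomaly) records
demo_records = [(2020, 15.3, 1.3), (2021, 14.9, 0.9), (2022, 15.2, 1.2)]

# Lambda: a one-line function that tests the anomaly field
is_extreme = lambda record: abs(record[2]) > 1.0

via_filter = list(filter(is_extreme, demo_records))
via_comprehension = [r for r in demo_records if abs(r[2]) > 1.0]

print(via_filter == via_comprehension)
print([year for year, _, _ in via_filter])
```

Which form to use is mostly a matter of taste: `filter` reads well when the condition already has a name, while a comprehension keeps the condition visible inline.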

In [None]:
def apply_filters_and_transformations(records):
    """
    Day 2 Concept: Lambda functions for quick operations
    """
    print("🔍 Data Filtering & Transformations")
    print("=" * 50)
    
    # Lambda functions for conversions
    celsius_to_fahrenheit = lambda c: c * 9/5 + 32
    classify_temp = lambda t: "🔥 Hot" if t > 15.5 else ("🌡️ Warm" if t > 14.5 else "❄️ Cool")
    
    # Filter extreme years
    extreme_filter = lambda r: abs(r[2]) > 1.0  # |anomaly| > 1°C
    extreme_years = list(filter(extreme_filter, records))
    
    print(f"\n⚠️ Extreme anomaly years (|anomaly| > 1°C): {len(extreme_years)}")
    print(f"Percentage of extreme years: {len(extreme_years)/len(records)*100:.1f}%")
    
    # Transform recent data
    recent_records = records[-10:]
    print("\n📅 Last 10 Years Analysis:")
    print("Year | Temp (°C/°F) | Classification | Anomaly")
    print("-" * 60)
    
    for year, temp, anomaly in recent_records:
        temp_f = celsius_to_fahrenheit(temp)
        classification = classify_temp(temp)
        print(f"{year} | {temp:.1f}°C ({temp_f:.1f}°F) | {classification} | {anomaly:+.2f}°C")
    
    # Using list comprehension (Pythonic filtering)
    recent_hot_years = [year for year, temp, _ in recent_records if temp > 15.5]
    print(f"\nHot years in last decade: {recent_hot_years}")
    
    return extreme_years

# Apply filters and transformations
extreme_years = apply_filters_and_transformations(records)

## Section 5: City-Specific Impact Analysis

### Combining Day 1 & Day 2 Concepts:
- Multiple data structures working together
- Functions processing complex data
- List comprehensions for filtering
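
The pattern in this section is: tuples in, dictionaries out, then a comprehension to filter. The sketch below shows it on two made-up cities. `classify_risk` is a hypothetical helper that reuses this section's temperature thresholds (30, 25, 20 °C) without the emoji labels; the city names and values are assumptions.

```python
def classify_risk(projected_temp):
    """Map a projected temperature (°C) to a plain-text risk label."""
    if projected_temp > 30:
        return "CRITICAL"
    if projected_temp > 25:
        return "HIGH"
    if projected_temp > 20:
        return "MODERATE"
    return "LOW"

# Two made-up cities: (name, base_temp, vulnerability)
demo_cities = [("Alpha", 27.0, 0.9), ("Beta", 10.0, 0.5)]
warming = 2.0

demo_impacts = []
for name, base, vulnerability in demo_cities:
    projected = base + warming * vulnerability
    demo_impacts.append({"city": name, "projected": projected, "risk": classify_risk(projected)})

# List comprehension: keep only the cities at HIGH risk or worse
high_or_worse = [d["city"] for d in demo_impacts if d["risk"] in ("HIGH", "CRITICAL")]
print(high_or_worse)
```

Pulling the threshold logic into its own function keeps the loop short and makes the boundaries easy to check in isolation.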

In [None]:
def simulate_city_impacts(cities, global_warming_increase):
    """
    Simulate warming impacts on different cities.
    Combines multiple data structures (Day 1).
    """
    print("\n🏙️ City-Specific Impact Analysis")
    print("=" * 50)
    
    # City data: (city, base_temp, vulnerability_score)
    city_data = [
        ("New York", 12.5, 0.7),
        ("London", 10.5, 0.6),
        ("Tokyo", 16.0, 0.8),
        ("Sydney", 18.0, 0.7),
        ("Mumbai", 27.0, 0.9),
        ("Cairo", 22.0, 0.85),
        ("São Paulo", 19.0, 0.75),
        ("Moscow", 5.5, 0.5)
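    ]
    # Keep only the cities requested by the caller
    city_data = [row for row in city_data if row[0] in cities]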
    
    impacts = []
    
    for city, base_temp, vulnerability in city_data:
        # Calculate projected temperature
        projected_temp = base_temp + (global_warming_increase * vulnerability)
        
        # Risk assessment
        if projected_temp > 30:
            risk = "🔴 CRITICAL"
        elif projected_temp > 25:
            risk = "🟠 HIGH"
        elif projected_temp > 20:
            risk = "🟡 MODERATE"
        else:
            risk = "🟢 LOW"
        
        impacts.append({
            "city": city,
            "current": base_temp,
            "projected": round(projected_temp, 1),
            "increase": round(projected_temp - base_temp, 1),
            "risk": risk
        })
    
    # Display results
    print(f"\nProjected warming: +{global_warming_increase}°C globally")
    print("\n" + "City".ljust(12) + "Current  Projected  Increase  Risk")
    print("-" * 55)
    
    for impact in impacts:
        city_str = impact['city'].ljust(12)
        current_str = f"{impact['current']:.1f}°C".ljust(9)
        projected_str = f"{impact['projected']:.1f}°C".ljust(11)
        increase_str = f"+{impact['increase']:.1f}°C".ljust(10)
        print(f"{city_str}{current_str}{projected_str}{increase_str}{impact['risk']}")
    
    # Summary statistics
    critical_cities = [i for i in impacts if "CRITICAL" in i["risk"]]
    high_risk_cities = [i for i in impacts if "HIGH" in i["risk"]]
    
    print(f"\n📊 Risk Summary:")
    print(f"  Cities at CRITICAL risk: {len(critical_cities)}")
    print(f"  Cities at HIGH risk: {len(high_risk_cities)}")
    
    return impacts

# Simulate city impacts with 2.5°C warming
projected_warming = 2.5
city_impacts = simulate_city_impacts(cities, projected_warming)

## Section 6: Comprehensive Climate Report

### Bringing It All Together:
This final section combines all our analyses into a comprehensive report, demonstrating how all Day 1 and Day 2 concepts work together in a real-world application.

In [None]:
def generate_climate_report(records, numpy_results, city_impacts):
    """
    Generate a comprehensive climate report combining all analyses.
    """
    print("\n" + "="*60)
    print("🌡️ COMPREHENSIVE CLIMATE REPORT")
    print("="*60)
    
    years, temps, anomalies, cumulative = numpy_results
    
    # Key findings
    print("\n📊 KEY FINDINGS:")
    print("-" * 40)
    
    # 1. Overall warming trend
    # Endpoint difference: simple, but sensitive to noise in the first and last year
    total_warming = temps[-1] - temps[0]
    years_span = years[-1] - years[0]
    rate_per_decade = (total_warming / years_span) * 10
    
    print(f"\n1. Total Warming ({years[0]}-{years[-1]}): {total_warming:+.2f}°C")
    print(f"   Rate: {rate_per_decade:.3f}°C per decade")
    
    # 2. Volatility change (first 30 years vs last 30 years)
    early_period = temps[:30]
    late_period = temps[-30:]
    volatility_change = np.std(late_period) - np.std(early_period)
    
    print(f"\n2. Volatility Change:")
    print(f"   Early period std: {np.std(early_period):.3f}°C")
    print(f"   Recent period std: {np.std(late_period):.3f}°C")
    print(f"   Change in volatility: {volatility_change:+.3f}°C")
    
    # 3. Threshold breaches
    threshold_15 = np.sum(temps > 15.0)
    threshold_16 = np.sum(temps > 16.0)
    
    print(f"\n3. Temperature Thresholds:")
    print(f"   Years above 15°C: {threshold_15} ({threshold_15/len(temps)*100:.1f}%)")
    print(f"   Years above 16°C: {threshold_16} ({threshold_16/len(temps)*100:.1f}%)")
    
    # 4. Projections
    if rate_per_decade > 0:
        years_to_2c = (2.0 - total_warming) / (rate_per_decade / 10)
        year_2c = int(years[-1] + years_to_2c) if years_to_2c > 0 else "Already exceeded"
    else:
        year_2c = "N/A"
    
    print(f"\n4. Future Projections:")
    print(f"   Expected year to reach +2°C: {year_2c}")
    print(f"   2050 projected increase: {rate_per_decade * (2050 - years[-1]) / 10:+.2f}°C from {years[-1]}")
    
    # 5. City impacts summary
    print(f"\n5. Urban Impact Summary:")
    critical_count = sum(1 for c in city_impacts if "CRITICAL" in c["risk"])
    high_count = sum(1 for c in city_impacts if "HIGH" in c["risk"])
    
    print(f"   Cities analyzed: {len(city_impacts)}")
    print(f"   Critical risk: {critical_count}")
    print(f"   High risk: {high_count}")
    
    # Most vulnerable cities
    sorted_cities = sorted(city_impacts, key=lambda x: x["increase"], reverse=True)
    print(f"\n   Most affected cities:")
    for city in sorted_cities[:3]:
        print(f"   • {city['city']}: +{city['increase']:.1f}°C (Risk: {city['risk']})")
    
    print("\n" + "="*60)
    print("📌 RECOMMENDATIONS:")
    print("-" * 40)
    print("1. Immediate action required for cities at CRITICAL risk")
    print("2. Implement adaptation strategies for +2°C warming")
    print(f"3. Focus on reducing warming rate (currently {rate_per_decade:.3f}°C/decade)")
    print("4. Prepare for increased climate volatility")
    print("="*60)

# Generate the comprehensive report
generate_climate_report(records, numpy_results, city_impacts)
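
The "+2°C" projection in the report is a straight-line extrapolation. The sketch below isolates that arithmetic; `years_until` is a hypothetical helper and the inputs are made up, so treat it as an illustration of the calculation rather than a forecast.

```python
def years_until(threshold, current_warming, rate_per_decade):
    """Years until `threshold` °C of warming under a constant linear rate.

    Returns 0.0 if the threshold is already met and None if the rate is not positive.
    """
    if current_warming >= threshold:
        return 0.0
    if rate_per_decade <= 0:
        return None
    return (threshold - current_warming) / (rate_per_decade / 10)

# Made-up inputs: 1.4 °C of warming so far, rising at 0.1 °C per decade
remaining = years_until(2.0, 1.4, 0.1)
print(f"About {remaining:.0f} years to reach +2.0°C")
```

Because the rate is assumed constant, any acceleration or slowdown in warming would change the answer.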

## 🎓 Learning Summary

### Concepts Successfully Demonstrated:

#### **Day 1 Concepts:**
- ✅ **Data Types**: Integers (years), Floats (temperatures), Strings (city names), Booleans (conditions)
- ✅ **Lists**: Temperature records, cities, years
- ✅ **Tuples**: Immutable records (year, temp, anomaly)
- ✅ **Sets**: Unique decades
- ✅ **Ranges**: Efficient year generation
- ✅ **F-strings**: Professional output formatting

#### **Day 2 Concepts:**
- ✅ **Functions**: Modular, reusable code with clear purposes
- ✅ **Parameters**: Input handling and default values
- ✅ **NumPy Arrays**: Efficient numerical computing
- ✅ **Vectorization**: Operations without loops
- ✅ **Boolean Indexing**: Powerful data filtering
- ✅ **Lambda Functions**: Quick one-line operations
- ✅ **Package Imports**: Leveraging Python's ecosystem

### Real-World Skills Applied:
1. **Time Series Analysis**: 145 years of synthetic temperature data
2. **Statistical Analysis**: Mean, std dev, percentiles
3. **Trend Analysis**: Warming rates and acceleration
4. **Risk Assessment**: City vulnerability scoring
5. **Data Visualization**: Text-based bar charts and formatted tables
6. **Report Generation**: Actionable insights and recommendations

## 🚀 Next Steps & Enhancements

### You Can Extend This Project With:

1. **Data Visualization** (Coming in your course):
   - Add matplotlib graphs for temperature trends
   - Create heatmaps for city impacts
   - Plot warming acceleration over time

2. **Real Data Integration**:
   - Load actual climate data from CSV files
   - Connect to climate APIs
   - Use pandas DataFrames (Day 4+)

3. **Advanced Analysis**:
   - Machine learning predictions
   - Seasonal decomposition
   - Regional clustering

4. **Interactive Features**:
   - User input for city selection
   - Custom date ranges
   - Scenario modeling

### Key Takeaway:
**With just 2 days of Python knowledge, you've built a meaningful tool to analyze one of humanity's most pressing challenges!** 🌍

In [None]:
# Final celebration!
print("\n" + "🎉" * 30)
print("\n✅ PROJECT COMPLETE!")
print("\nYou've successfully integrated:")
print("  • Day 1: All basic Python concepts")
print("  • Day 2: Functions, NumPy, and advanced features")
print("  • Real-world application: Climate change analysis")
print("\nGreat job! You're ready for Day 3 and beyond! 🚀")
print("\n" + "🎉" * 30)