# Week 3: IN-CLASS LECTURE - Styling and Temporal Analysis

## **Learning Goals**
Building on pre-class foundations, those who actively engage with this lecture will be able to:
- Wrangle Data
- Build line and area mark visualizations
- Style charts


## Dataset Description

For the next two weeks we will use the [**Our World in Data (OWID) Energy Dataset**](https://ourworldindata.org/electricity-mix), which provides comprehensive global energy statistics from 1954 to 2024. The dataset includes metrics on energy production, consumption, and technology adoption across countries worldwide.

Here is a description of the key columns you'll be working with:

| Column                        | Description                                                            | Unit                |
|-------------------------------|------------------------------------------------------------------------|---------------------|
| `country`                     | Country name                                                           | Geographic location |
| `year`                        | Year of observation                                                    | Date                |
| `nuclear_electricity`         | Electricity generation from nuclear power                              | TWh                 |
| `hydro_electricity`           | Electricity generation from hydropower                                 | TWh                 |
| `population`                  | Country population                                                     | Count               |
| `fossil_fuel_consumption`     | Primary energy consumption from fossil fuels                           | TWh                 |
| `coal_production`             | Coal production volume                                                 | TWh                 |
| `coal_consumption`            | Coal consumption volume                                                | TWh                 |
| `nuclear_share_energy`        | Share of primary energy consumption that comes from nuclear power      | %                   |
| `fossil_share_energy`         | Share of primary energy consumption that comes from fossil fuels       | %                   |
| `hydro_share_energy`          | Share of primary energy consumption that comes from hydropower         | %                   |
| `renewables_share_energy`     | Share of primary energy consumption that comes from renewables         | %                   |
| `biofuel_consumption`         | Primary energy consumption from biofuels                               | TWh                 |
| `gas_consumption`             | Primary energy consumption from gas                                    | TWh                 |
| `oil_consumption`             | Primary energy consumption from oil                                    | TWh                 |
| `hydro_consumption`           | Primary energy consumption from hydropower                             | TWh                 |
| `other_renewable_consumption` | Primary energy consumption from other renewables                       | TWh                 |
| `wind_electricity`            | Electricity generation from wind power                                 | TWh                 |
| `solar_electricity`           | Electricity generation from solar power                                | TWh                 |

**Note:** TWh = Terawatt-hours (1 TWh = 1 billion kilowatt-hours)

## Data and Environment Setup

In [None]:
import pandas as pd
import altair as alt

# If on PL, use this path
filepath = 'data/owid_dataset.csv'

# If running locally on your machine, use this URL instead.
# Note: the later assignment wins, so comment out the line you do not need.
filepath = 'https://raw.githubusercontent.com/kemiolamudzengi/dsci-320-datasets/main/owid_dataset.csv'

# Load the OWID energy dataset, parsing the year column as dates
owid_data = pd.read_csv(filepath, parse_dates=['year'])

print(f"Dataset shape: {owid_data.shape}")
print(f"Years covered: {owid_data['year'].dt.year.min()} to {owid_data['year'].dt.year.max()}")
print(f"Countries: {owid_data['country'].nunique()}")

## **Task 1**
**Exploration Task:** *How has wind electricity generation grown over time for selected countries?*


<div class="alert alert-info" style="color:black; padding: 15px; border-radius: 8px; background-color:#eaf4ff;">
<h2> Data Task </h2>
Create a dataframe called `wind_data` that contains only the countries listed in `wind_countries`.
You can use either `.query()`, or `.loc` together with `.isin()`.
</div>
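
As a minimal sketch of the two filtering approaches mentioned above, the example below uses a small made-up table rather than the OWID data. The names `toy`, `city`, `rainfall`, and `keep` are hypothetical and exist only for this illustration. Note that the values in the list must match the spelling used in the column exactly.

```python
import pandas as pd

# Toy data: column names and values are invented for illustration only.
toy = pd.DataFrame({
    "city": ["Oslo", "Lima", "Cairo", "Oslo", "Lima", "Cairo"],
    "year": [2020, 2020, 2020, 2021, 2021, 2021],
    "rainfall": [80.0, 1.5, 2.0, 75.0, 1.0, 2.5],
})
keep = ["Oslo", "Cairo"]

# Option 1: build a boolean mask with .isin and pass it to .loc
via_loc = toy.loc[toy["city"].isin(keep)]

# Option 2: use .query, referring to the Python list with the @ prefix
via_query = toy.query("city in @keep")

# Both approaches should select the same rows
print(via_loc.equals(via_query))
```

Either form works; `.query` tends to read more like a sentence, while `.loc` with `.isin` keeps everything in plain Python expressions.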

In [None]:
# List of countries to explore
wind_countries = ['China', 'USA', 'Germany', 'India', 'Brazil', 'UK']
wind_data = ...

<div class="alert alert-info" style="color:black; padding: 15px; border-radius: 8px; background-color:#eaf4ff;">
  <h2>VIZ TASK: Multi-line Chart for Wind Electricity</h2>
  <p>Create a multi-line chart comparing wind electricity generation across countries.</p>

  <p>Using the <code>wind_data</code> dataset, create a visualization with:</p>
  <ul>
    <li>Use <code>mark_line</code> to represent wind generation trends</li>
    <li>Encode <code>year</code> on the <strong>x channel</strong></li>
    <li>Encode <code>wind_electricity</code> on the <strong>y channel</strong></li>
    <li>Encode <code>country</code> on the <strong>color channel</strong></li>
    <li>Include a tooltip that shows <code>Country</code>, <code>Year</code>, and <code>Wind Generation (TWh)</code> formatted to 1 decimal place (<code>.1f</code>)</li>
  </ul>

  <p><strong>Styling Specifications:</strong></p>
  <ul>
    <li><strong>Chart Properties:</strong> Width = 500px, Height = 300px, Title = "Wind Electricity Growth: Technology Adoption Leaders"</li>
    <li><strong>Mark Styling:</strong> <code>strokeWidth=1</code>, Points = size 10, filled = True</li>
    <li><strong>X Channel:</strong> Title = "Year", Format = "%Y"</li>
    <li><strong>Y Channel:</strong> Title = "Wind Electricity Generation (TWh)", Format = ".0f"</li>
    <li><strong>Legend:</strong> Title = "Country"</li>
    <li><strong>Color Scheme:</strong> Use <code>category10</code></li>
  </ul>
</div>


In [None]:
# Professional multi-line chart with all styling elements
wind_comparison = ...

# show chart
wind_comparison

What do you observe about each of the countries?
...

## Task 2: Low-Carbon Electricity Composition
Analyze how the composition of low-carbon electricity sources has evolved over time, and identify which technologies have driven the largest changes in the energy mix.

<div class="alert alert-info" style="color:black; padding: 15px; border-radius: 8px; background-color:#eaf4ff;">
  <h2>Data Task: Low-Carbon Electricity Composition</h2>

  <p><strong>TASK:</strong> Prepare a global low-carbon dataset that shows each technology's contribution over time.</p>

  <p><strong>Step-by-step instructions (follow in order):</strong></p>
  <ul>
    <li><strong>STEP 0: Select the low-carbon sources you will plot.</strong>
      <br>Create a list with the four columns you’ll examine: <code>['hydro_electricity', 'nuclear_electricity', 'solar_electricity', 'wind_electricity']</code>.
    </li>
    <li><strong>STEP 1: Filter the dataset to global totals.</strong>
      <br>From the full OWID table, keep only rows where <code>country == "World"</code>. Use <code>.copy()</code> to avoid chained-assignment warnings.
    </li>
    <li><strong>STEP 2: Select the columns of interest.</strong>
      <br>Keep <code>year</code> together with the four columns listed in STEP 0.
    </li>
    <li><strong>STEP 3: Reshape from wide → long.</strong>
      <br>Use <code>.melt()</code> so each row represents one (year, technology, generation) triple. This long format is required for stacked area encoding.
    </li>
    <li><strong>STEP 4: Clean the technology labels for presentation.</strong>
      <br>Map raw column names to nicer labels (e.g. <code>hydro_electricity → Hydropower</code>, <code>solar_electricity → Solar PV</code>, etc.) so the legend reads professionally.
      <br><em>Hint:</em> use a dictionary and <code>.map()</code> to replace names.
    </li>
  </ul>
</div>
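
The filter, select, reshape, and relabel pattern described in the steps above can be sketched on a small made-up table. All names here (`wide`, `region`, `apples_kg`, `pears_kg`, `fruit`, `weight`) are hypothetical and chosen only to show the mechanics of `.copy()`, `.melt()`, and `.map()`.

```python
import pandas as pd

# Toy wide table: column names and values are invented for illustration only.
wide = pd.DataFrame({
    "region": ["Total", "Total", "North", "North"],
    "year": [2020, 2021, 2020, 2021],
    "apples_kg": [10.0, 12.0, 6.0, 7.0],
    "pears_kg": [4.0, 5.0, 1.0, 2.0],
})
fruit_cols = ["apples_kg", "pears_kg"]

# Filter to one group; .copy() gives an independent dataframe
totals = wide[wide["region"] == "Total"].copy()

# Select the identifier column plus the value columns
selected = totals[["year"] + fruit_cols]

# Wide -> long: one row per (year, fruit, weight) triple
long = selected.melt(
    id_vars="year",
    value_vars=fruit_cols,
    var_name="fruit",
    value_name="weight",
)

# Replace raw column names with presentation labels via a dictionary
labels = {"apples_kg": "Apples", "pears_kg": "Pears"}
long["fruit"] = long["fruit"].map(labels)

print(long)
```

One thing to watch with `.map()`: any value missing from the dictionary becomes `NaN`, so it is worth checking that every raw name has a label.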


In [None]:
# STEP 0: SELECT Low-carbon Sources
low_carbon_sources = ['hydro_electricity', 'nuclear_electricity', 'solar_electricity', 'wind_electricity']

# STEP 1: FILTER to world data
world_data = ...

# STEP 2: SELECT the columns of interest
selected = ...

# STEP 3: RESHAPE the data for stacking
world_low_carbon_long = ...

# STEP 4: CLEAN and REPLACE technology names
tech_names = {
    'hydro_electricity': 'Hydropower',
    'nuclear_electricity': 'Nuclear',
    'solar_electricity': 'Solar PV',
    'wind_electricity': 'Wind'
}

world_low_carbon_long['technology'] = ...



<div class="alert alert-info" style="color:black; padding: 15px; border-radius: 8px; background-color:#eaf4ff;">
  <h2>VIZ TASK: Low-Carbon Electricity Composition</h2>

  <p>Create a stacked area chart showing the contribution of each low-carbon technology to global electricity generation.</p>

  <p>Using the <code>world_low_carbon_long</code> dataset, create a visualization with:</p>
  <ul>
    <li>Use <code>mark_area</code>, with <code>stack='zero'</code> set on the y channel</li>
    <li>Encode <code>year</code> on the <strong>x channel</strong></li>
    <li>Encode <code>generation</code> on the <strong>y channel</strong></li>
    <li>Encode <code>technology</code> on the <strong>color channel</strong></li>
    <li>Include a tooltip that shows <code>Year</code>, <code>Technology</code>, and <code>Generation (TWh)</code></li>
  </ul>

  <p><strong>Styling Specifications:</strong></p>
  <ul>
    <li><strong>Chart Properties:</strong> Width = 500px, Height = 300px, Title = "Global Low-Carbon Electricity Sources: Technology Composition"</li>
    <li><strong>Mark Styling:</strong> Default area mark (no extra opacity or stroke settings)</li>
    <li><strong>X Channel:</strong> Title = "Year"</li>
    <li><strong>Y Channel:</strong> Title = "Low-Carbon Electricity Generation (TWh)", Format = ".0f"</li>
    <li><strong>Legend:</strong> Title = "Technology"</li>
    <li><strong>Color Scheme:</strong> Custom palette = <code>#1f77b4</code> (blue), <code>#ff7f0e</code> (orange), <code>#2ca02c</code> (green), <code>#d62728</code> (red)</li>
  </ul>

**Color Strategy for Low-Carbon Technologies:**
- **Hydropower**: Blue (#1f77b4) - water association
- **Nuclear**: Orange (#ff7f0e) - energy/power association
- **Solar PV**: Green (#2ca02c) - natural/renewable association
- **Wind**: Red (#d62728) - dynamic/movement association
</div>


In [None]:

# Stacked area with thoughtful color scheme
low_carbon_stack = ...

low_carbon_stack

<div class="alert alert-success" style="color:black; padding: 15px; border-radius: 8px; background-color:#e8f7e4;">
  <h2>Next Steps</h2>
  <ul>
    <li>Work through the entire lecture again for reinforcement.</li>
    <li>Ask: What other questions can we explore with this dataset?</li>
    <li>Brainstorm: What other visualizations could help answer these questions?</li>
    <li>Come up with at least <strong>3 distinct tasks</strong> that require:
      <ul>
        <li>Wrangling the data appropriately</li>
        <li>Designing and creating visualizations for the task</li>
      </ul>
    </li>
  </ul>
</div>
