# **(ADD THE NOTEBOOK NAME HERE)**

## Objectives

* Write your notebook objective here, for example, "Fetch data from Kaggle and save as raw data", or "engineer features for modelling"

## Inputs

* Write down which data or information you need to run the notebook 

## Outputs

* Write here which files, code or artefacts you generate by the end of the notebook 

## Additional Comments

* If you have any additional comments that don't fit in the previous bullets, please state them here. 



---

# Change working directory

* We assume the notebooks are stored in a subfolder, so when you run a notebook in the editor you need to change the working directory

We need to change the working directory from its current folder to its parent folder
* We access the current directory with os.getcwd()

In [None]:
import os
current_dir = os.getcwd()
current_dir

'c:\\Users\\Ewa\\Documents\\vscode-projects\\GlobalEcoInsights2000-2024'

We want to make the parent of the current directory the new current directory
* os.path.dirname() gets the parent directory
* os.chdir() sets the new current directory
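
The two calls above can be tried safely in a throwaway folder first. This is a minimal sketch, assuming a temporary directory and a hypothetical `notebooks` subfolder rather than the real project path:

```python
import os
import tempfile

# Remember where we started so the demo leaves the session unchanged
start_dir = os.getcwd()

# Hypothetical layout: <project>/notebooks, created inside a temporary folder
project_dir = os.path.realpath(tempfile.mkdtemp())
notebooks_dir = os.path.join(project_dir, "notebooks")
os.makedirs(notebooks_dir)

# Pretend the notebook is running from the subfolder
os.chdir(notebooks_dir)
current_dir = os.getcwd()

# os.path.dirname() returns the parent folder; os.chdir() makes it the working directory
os.chdir(os.path.dirname(current_dir))
new_dir = os.getcwd()

# Restore the original working directory
os.chdir(start_dir)
```

After the second `os.chdir()` call, `new_dir` points at the project folder instead of the `notebooks` subfolder.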

In [1]:
import os  # re-imported so this cell also works right after a kernel restart
current_dir = os.getcwd()
os.chdir(os.path.dirname(current_dir))
print("New working directory:", os.getcwd())

Confirm the new current directory

In [None]:
current_dir = os.getcwd()
current_dir

'c:\\Users\\Ewa\\Documents\\vscode-projects'

# Basic visualization

Libraries

In [2]:
!pip install numpy






In [None]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.io as pio
import plotly.graph_objects as go

---

In [None]:
# Load the dataset into a pandas dataframe
df = pd.read_csv("/Users/Ewa/Documents/vscode-projects/GlobalEcoInsights2000-2024/temperature.csv")

Hypothesis: "Average temperature has increased over the years." Visualization: Line chart showing temperature trends per country

In [None]:
import plotly.express as px

# Create a color-blind-friendly line chart for temperature trends over time per country
fig = px.line(df, x="Year", y="Avg_Temperature_degC", color="Country",
              title="Temperature Trends Over Time by Country",
              labels={"Avg_Temperature_degC": "Average Temperature (°C)"},
              line_group="Country",
              color_discrete_sequence=px.colors.sequential.Cividis)  # Cividis is color-blind friendly

# Improve interactivity
fig.update_layout(
    hovermode="x unified",  # Show all values for a given year on hover
    legend_title="Country",
    template="plotly_white",  # Clean background for better visibility
    xaxis=dict(title="Year", showgrid=True),
    yaxis=dict(title="Average Temperature (°C)", showgrid=True),
)

# Draw small markers on each line (the legend is interactive by default: click an entry to hide/show a country)
fig.update_traces(mode="lines+markers", marker=dict(size=4))

# Show the interactive plot
fig.show()
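
A line chart shows the trend visually; a fitted slope can put a number on it. The sketch below is one possible check, not part of the original analysis. It assumes made-up toy data that reuses the dataset's column names, and `yearly_slope` is a hypothetical helper:

```python
import numpy as np
import pandas as pd

# Toy data (made up for illustration), using the dataset's column names
toy = pd.DataFrame({
    "Year": [2000, 2001, 2002, 2000, 2001, 2002],
    "Country": ["A", "A", "A", "B", "B", "B"],
    "Avg_Temperature_degC": [10.0, 10.5, 11.0, 20.0, 19.0, 18.0],
})

def yearly_slope(group):
    # Degree-1 least-squares fit; the first coefficient is the slope in °C per year
    return float(np.polyfit(group["Year"], group["Avg_Temperature_degC"], 1)[0])

# One slope per country: positive means warming, negative means cooling
slopes = {country: yearly_slope(g) for country, g in toy.groupby("Country")}
print(slopes)
```

A positive slope for most countries would be consistent with the hypothesis, although a linear fit is only a rough summary of the trend.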

Hypothesis: "Higher CO2 emissions correlate with increased temperature." Visualization: Scatter plot, with bubble size showing population.

In [None]:
# Create scatter plot for CO2 emissions vs. temperature
fig = px.scatter(df, x="CO2_Emissions_tons_per_capita", y="Avg_Temperature_degC",
                 color="Country",  # Different colors for each country
                 size="Population",  # Bubble size represents population
                 hover_name="Country",
                 title="CO2 Emissions vs. Average Temperature",
                 labels={"CO2_Emissions_tons_per_capita": "CO2 Emissions (tons per capita)",
                         "Avg_Temperature_degC": "Average Temperature (°C)"},
                 color_discrete_sequence=px.colors.sequential.Cividis)  # Color-blind-friendly

# Improve interactivity
fig.update_layout(
    hovermode="closest",  # Show closest point details on hover
    template="plotly_white",
    xaxis=dict(title="CO2 Emissions (tons per capita)", showgrid=True),
    yaxis=dict(title="Average Temperature (°C)", showgrid=True),
)

# Show the plot
fig.show()
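
The hypothesis is about correlation, so a correlation coefficient is a natural companion to the scatter plot. This is a minimal sketch on made-up toy data with the dataset's column names. It uses pandas' `Series.corr`, which computes the Pearson coefficient by default:

```python
import pandas as pd

# Toy data (made up for illustration), using the dataset's column names
toy = pd.DataFrame({
    "CO2_Emissions_tons_per_capita": [1.0, 2.0, 3.0, 4.0],
    "Avg_Temperature_degC": [10.0, 12.0, 13.0, 17.0],
})

# Pearson correlation: +1 is a perfect positive linear relationship, 0 is none
r = toy["CO2_Emissions_tons_per_capita"].corr(toy["Avg_Temperature_degC"])
print(f"Pearson r = {r:.3f}")
```

A value near zero on the real data would suggest the two columns are not linearly related, whatever the scatter plot appears to show.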

Hypothesis: "Countries with a higher percentage of renewable energy have lower CO2 emissions." Visualization: Bar chart comparing countries.

In [None]:
# Create a bar chart for Renewable Energy vs. CO2 Emissions
fig = px.bar(df, x="Country", y="CO2_Emissions_tons_per_capita",
             color="Renewable_Energy_pct",
             title="Impact of Renewable Energy on CO2 Emissions",
             labels={"CO2_Emissions_tons_per_capita": "CO2 Emissions (tons per capita)",
                     "Renewable_Energy_pct": "Renewable Energy (%)"},
             color_continuous_scale=px.colors.sequential.Cividis)  # Color-blind friendly

# Improve interactivity
fig.update_layout(
    xaxis=dict(title="Country", tickangle=-45),  # Rotate country names for readability
    yaxis=dict(title="CO2 Emissions (tons per capita)"),
    template="plotly_white"
)

# Show the interactive bar chart
fig.show()
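
If the dataset holds one row per country and year (an assumption based on its `Year` and `Country` columns), the bar chart above likely stacks one segment per row. Averaging per country first gives one bar per country. A sketch on made-up toy data:

```python
import pandas as pd

# Toy data (made up for illustration): two years for each of two countries
toy = pd.DataFrame({
    "Country": ["A", "A", "B", "B"],
    "Year": [2000, 2001, 2000, 2001],
    "CO2_Emissions_tons_per_capita": [4.0, 6.0, 1.0, 3.0],
    "Renewable_Energy_pct": [10.0, 20.0, 50.0, 70.0],
})

# One row per country: mean emissions and mean renewable share over all years
country_means = toy.groupby("Country", as_index=False)[
    ["CO2_Emissions_tons_per_capita", "Renewable_Energy_pct"]
].mean()
print(country_means)
```

The resulting frame could be passed to `px.bar` in place of `df`, with the same `x`, `y`, and `color` arguments.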

Hypothesis: "A decrease in forest area percentage leads to an increase in extreme weather events." Visualization: Dual-axis line chart showing trends.

In [None]:
# Aggregate by year: average forest area and extreme weather events across countries
df_grouped = df.groupby("Year").agg({
    'Forest_Area_pct': 'mean',          # yearly mean forest area (%)
    'Extreme_Weather_Events': 'mean',   # yearly mean number of extreme weather events
}).reset_index()

# Create figure
fig = go.Figure()

# Add line for Forest Area Percentage
fig.add_trace(go.Scatter(x=df_grouped["Year"], y=df_grouped["Forest_Area_pct"],
                         mode='lines+markers', name='Forest Area (%)',
                         line=dict(color='green')))

# Add line for Extreme Weather Events
fig.add_trace(go.Scatter(x=df_grouped["Year"], y=df_grouped["Extreme_Weather_Events"],
                         mode='lines+markers', name='Extreme Weather Events',
                         line=dict(color='red'), yaxis="y2"))

# Update layout for dual-axis
fig.update_layout(
    title="Extreme Weather Events vs. Deforestation",
    xaxis=dict(title="Year"),
    yaxis=dict(title="Forest Area (%)", side="left", showgrid=False),
    yaxis2=dict(title="Extreme Weather Events", side="right", overlaying="y", showgrid=False),
    template="plotly_white"
)

# Show the plot
fig.show()

Hypothesis: "Higher population growth contributes to increased CO2 emissions and rising sea levels." Visualization: Separate line charts for population, CO2 emissions, and sea level rise.

In [None]:
# Check the grouped data
df_grouped = df.groupby("Year")[['Population', 'CO2_Emissions_tons_per_capita', 'Sea_Level_Rise_mm']].sum()
print(df_grouped.columns)  # Ensure 'Population' is present

Index(['Population', 'CO2_Emissions_tons_per_capita', 'Sea_Level_Rise_mm'], dtype='object')
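
Summing a per-capita column across countries does not produce a per-capita figure. One alternative is a population-weighted average, sketched below on made-up toy data. The `CO2_total_tons` and `CO2_tons_per_capita_weighted` column names are hypothetical:

```python
import pandas as pd

# Toy data (made up for illustration), using the dataset's column names
toy = pd.DataFrame({
    "Year": [2000, 2000, 2001, 2001],
    "Population": [100, 300, 100, 300],
    "CO2_Emissions_tons_per_capita": [2.0, 6.0, 4.0, 8.0],
})

# Convert per-capita values to totals so they can be summed meaningfully
toy["CO2_total_tons"] = toy["Population"] * toy["CO2_Emissions_tons_per_capita"]

# Sum totals and population per year, then divide to get a weighted per-capita value
yearly = toy.groupby("Year")[["Population", "CO2_total_tons"]].sum()
yearly["CO2_tons_per_capita_weighted"] = yearly["CO2_total_tons"] / yearly["Population"]

# For comparison: the plain sum of per-capita values, which grows with the number of countries
naive_sum = toy.groupby("Year")["CO2_Emissions_tons_per_capita"].sum()
print(yearly)
print(naive_sum)
```

Whether a weighted or a simple mean is the better choice depends on the question being asked; the simple mean treats every country equally regardless of size.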


In [None]:
# 1. Check the original column names
print(df.columns)

# 2. Perform the grouping operation
df_grouped = df.groupby("Year")[['Population', 'CO2_Emissions_tons_per_capita', 'Sea_Level_Rise_mm']].sum()

# 3. Check the columns after grouping
print(df_grouped.columns)


Index(['Year', 'Country', 'Avg_Temperature_degC',
       'CO2_Emissions_tons_per_capita', 'Sea_Level_Rise_mm', 'Rainfall_mm',
       'Population', 'Renewable_Energy_pct', 'Extreme_Weather_Events',
       'Forest_Area_pct'],
      dtype='object')
Index(['Population', 'CO2_Emissions_tons_per_capita', 'Sea_Level_Rise_mm'], dtype='object')


---

In [None]:
# Check the original DataFrame before grouping
print(df.columns)

# Group by 'Year' and aggregate the necessary columns
df_grouped = df.groupby("Year", as_index=False).agg({
    'Population': 'sum',  # Make sure to aggregate the 'Population' column
    'CO2_Emissions_tons_per_capita': 'mean',
    'Sea_Level_Rise_mm': 'mean'
})

# Note: as_index=False already keeps 'Year' as a column, so this reset_index()
# is redundant and only adds an extra 'index' column to the result
df_grouped = df_grouped.reset_index()

# Check if 'Year' and 'Population' are now available as columns
print(df_grouped.columns)

# Plot for Population
fig_population = go.Figure()
fig_population.add_trace(go.Scatter(x=df_grouped["Year"], y=df_grouped["Population"],
                                    mode='lines+markers', name='Population',
                                    line=dict(color='blue')))

fig_population.update_layout(
    title="Population Over Time",
    xaxis_title="Year",
    yaxis_title="Population",
    template="plotly_dark"
)

# Show the Population plot
fig_population.show()

# Plot for CO2 Emissions per capita
fig_co2 = go.Figure()
fig_co2.add_trace(go.Scatter(x=df_grouped["Year"], y=df_grouped["CO2_Emissions_tons_per_capita"],
                             mode='lines+markers', name='CO2 Emissions per capita',
                             line=dict(color='green')))

fig_co2.update_layout(
    title="CO2 Emissions per capita Over Time",
    xaxis_title="Year",
    yaxis_title="CO2 Emissions (tons per capita)",
    template="plotly_dark"
)

# Show the CO2 Emissions plot
fig_co2.show()

# Plot for Sea Level Rise
fig_sea_level = go.Figure()
fig_sea_level.add_trace(go.Scatter(x=df_grouped["Year"], y=df_grouped["Sea_Level_Rise_mm"],
                                  mode='lines+markers', name='Sea Level Rise (mm)',
                                  line=dict(color='red')))

fig_sea_level.update_layout(
    title="Sea Level Rise Over Time",
    xaxis_title="Year",
    yaxis_title="Sea Level Rise (mm)",
    template="plotly_dark"
)

# Show the Sea Level Rise plot
fig_sea_level.show()

Index(['Year', 'Country', 'Avg_Temperature_degC',
       'CO2_Emissions_tons_per_capita', 'Sea_Level_Rise_mm', 'Rainfall_mm',
       'Population', 'Renewable_Energy_pct', 'Extreme_Weather_Events',
       'Forest_Area_pct'],
      dtype='object')
Index(['index', 'Year', 'Population', 'CO2_Emissions_tons_per_capita',
       'Sea_Level_Rise_mm'],
      dtype='object')
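
The `index` column in the output above comes from combining `as_index=False` with `reset_index()`. A minimal sketch on made-up toy data shows the difference:

```python
import pandas as pd

# Toy data (made up for illustration), using the dataset's column names
toy = pd.DataFrame({
    "Year": [2000, 2000, 2001, 2001],
    "Population": [100, 200, 110, 210],
    "Sea_Level_Rise_mm": [1.0, 3.0, 2.0, 4.0],
})

# With as_index=False, 'Year' stays a regular column after grouping
grouped = toy.groupby("Year", as_index=False).agg(
    {"Population": "sum", "Sea_Level_Rise_mm": "mean"}
)
cols_grouped = list(grouped.columns)

# A further reset_index() moves the default 0..n-1 index into a column named 'index'
cols_after_reset = list(grouped.reset_index().columns)

print(cols_grouped)
print(cols_after_reset)
```

Using either `as_index=False` or `reset_index()`, but not both, keeps the grouped frame free of the extra column.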


NOTE

* You may add as many sections as you want, as long as they support your project workflow.
* All notebook cells should run top-down: do not create a workflow where, at a given point, you need to go back to a previous cell to run a task (for example, to refresh a variable's content).

---

# Push files to Repo

* In cases where you don't need to push files to Repo, you may replace this section with "Conclusions and Next Steps" and state your conclusions and next steps.

In [None]:
import os
try:
  # create your folder here
  # os.makedirs(name='')
  pass  # placeholder so the try block stays valid until a folder name is filled in
except Exception as e:
  print(e)
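
The commented-out `os.makedirs` call could be filled in along the following lines. This is only a sketch: the `outputs/figures` folder name is hypothetical, and a temporary base directory stands in for the project folder:

```python
import os
import tempfile

# A temporary base directory stands in for the project folder
base_dir = tempfile.mkdtemp()
output_dir = os.path.join(base_dir, "outputs", "figures")  # hypothetical folder name

try:
    # exist_ok=True avoids an error when the folder already exists,
    # so the cell can be re-run safely
    os.makedirs(output_dir, exist_ok=True)
    os.makedirs(output_dir, exist_ok=True)  # second call does nothing
except OSError as e:
    print(e)
```

Catching `OSError` rather than a bare `Exception` keeps unrelated bugs from being silently printed and ignored.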