#DSE 6000 Final Project: Eugene Lowe, Katrina Skiotys, Adanna Smith, Mitchell Vermet
You've diligently refined your analytical abilities using the Python programming language throughout this course. Now, for the semester project, you'll apply what you have learned to craft a compelling `Data Story` using the Plotly Dash framework. Collaborating with a team (up to 5 members of your choosing), you'll construct a Data Story akin to the example provided here: https://ourworldindata.org/electric-car-sales

The raw data can be downloaded from the link given below:

https://www.iea.org/data-and-statistics/data-tools/global-ev-data-explorer

Click on the Historical and Projected tabs to download the two separate datasets.

You'll conduct Exploratory Data Analysis (EDA) on the provided datasets. Although the provided article already includes 8 diagrams for you to replicate initially, I'm also interested in seeing how each team will expand upon the initial analysis and generate additional insights and charts.

#DELIVERABLES
1. Requirement - 1 (1 pt)

You should show at least 2 steps you adopt to clean and/or transform the dataset.

2. Requirement - 2 (10 pt)

For the EDA part, in addition to the 8 diagrams (4 pt) already given in the article (close enough dynamic diagrams will be accepted), you need to plot 6 (6 pt) more diagrams to show correlations, frequencies, and/or relationships between various variables with plots of 5 different types (bar, line, heatmap, facet, etc.). Every plot should have a title and the x/y axis should have legible labels without any label overlaps for full credit. Provide a summary of your interpretations from the plots after each one.

https://ourworldindata.org/electric-car-sales

There is no need to show each figure's three tabs (table/map/chart). Show only the charts (default) that load when you view the page for the first time.

3. Requirement - 3 (5 pt)

By this phase, you have a pretty good understanding of your data. Now, you will apply predictive analytics by building suitable ML models to make some predictions on oil displacement, EV sales, etc., by selecting suitable independent/dependent variables from the dataset. You may compare your prediction to the projected prediction given here: https://www.iea.org/data-and-statistics/data-tools/global-ev-data-explorer
Evaluate at least 3 different ML algorithms and choose the one that best matches the projected prediction.

4. Requirement - 4 (1 pt)

You should have a conclusion, highlighting the main insights you were able to derive from your analysis. I will be particularly interested to know what else you can find beyond what the author of the article discovered.

In [None]:
!gdown --id '1Bh6EMX7Dc4FArX6bgNl6AlJH7PEXoxDK' --output historical.csv
!gdown --id '1gAfuzTg3jrPN_VXR2Nztx0ugHnJ-YJ9D' --output projected.csv

Downloading...
From: https://drive.google.com/uc?id=1Bh6EMX7Dc4FArX6bgNl6AlJH7PEXoxDK
To: /content/historical.csv
100% 239k/239k [00:00<00:00, 64.2MB/s]
Downloading...
From: https://drive.google.com/uc?id=1gAfuzTg3jrPN_VXR2Nztx0ugHnJ-YJ9D
To: /content/projected.csv
100% 33.1k/33.1k [00:00<00:00, 35.3MB/s]


In [None]:
#Import Packages and Load Data
import pandas as pd
historical_df = pd.read_csv('historical.csv')
projected_df = pd.read_csv('projected.csv')

In [None]:
!pip install dash==2.16.1


#Data Cleaning

In [None]:
# Check for Duplicates- should be no duplicates (NOT CONSIDERED A STEP, JUST MAKING SURE WE HAVE CLEAN DATA)
hist_duplicates = historical_df[historical_df.duplicated(keep=False)]
print(hist_duplicates)
if len(hist_duplicates) > 0:
    print("There are duplicate rows in the dataset.")
else:
    print("There are no duplicate rows in the dataset.")

proj_duplicates = projected_df[projected_df.duplicated(keep=False)]
print(proj_duplicates)
if len(proj_duplicates) > 0:
    print("There are duplicate rows in the dataset.")
else:
    print("There are no duplicate rows in the dataset.")

Empty DataFrame
Columns: [region, category, parameter, mode, powertrain, year, unit, value]
Index: []
There are no duplicate rows in the dataset.
Empty DataFrame
Columns: [region, category, parameter, mode, powertrain, year, unit, value]
Index: []
There are no duplicate rows in the dataset.


In [None]:
#Checking for nulls (Step 1)- replacing possible nulls
hist_nulls = historical_df.isnull().sum()
proj_nulls = projected_df.isnull().sum()
print(hist_nulls)
print(proj_nulls)
historical_df.fillna(0, inplace=True)
projected_df.fillna(0, inplace=True)

region        0
category      0
parameter     0
mode          0
powertrain    0
year          0
unit          0
value         0
dtype: int64
region        0
category      0
parameter     0
mode          0
powertrain    0
year          0
unit          0
value         0
dtype: int64


In [None]:
#Changing 'Year' from int64 to datetime (Step 2)
historical_df['year'] = pd.to_datetime(historical_df['year'], format='%Y')
projected_df['year'] = pd.to_datetime(projected_df['year'], format='%Y')

print(historical_df.dtypes)
print(projected_df.dtypes)
historical_df.head()

region                object
category              object
parameter             object
mode                  object
powertrain            object
year          datetime64[ns]
unit                  object
value                float64
dtype: object
region                object
category              object
parameter             object
mode                  object
powertrain            object
year          datetime64[ns]
unit                  object
value                float64
dtype: object


Unnamed: 0,region,category,parameter,mode,powertrain,year,unit,value
0,Australia,Historical,EV sales share,Cars,EV,2011-01-01,percent,0.0065
1,Australia,Historical,EV stock share,Cars,EV,2011-01-01,percent,0.00039
2,Australia,Historical,EV sales,Cars,BEV,2011-01-01,Vehicles,49.0
3,Australia,Historical,EV stock,Cars,BEV,2011-01-01,Vehicles,49.0
4,Australia,Historical,EV stock,Cars,BEV,2012-01-01,Vehicles,220.0
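
The duplicate check, null handling, and year conversion above could be wrapped in one helper so that both datasets pass through identical steps. This is a minimal sketch rather than the notebook's own code: the function name `clean_ev_frame` and the small sample frame are assumptions for illustration, and unlike the cell above it fills nulls only in the `value` column.

```python
import pandas as pd


def clean_ev_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: no duplicate rows, no nulls in 'value', datetime 'year'."""
    cleaned = df.drop_duplicates().copy()
    cleaned['value'] = cleaned['value'].fillna(0)
    # Integer years such as 2011 become Timestamp('2011-01-01')
    cleaned['year'] = pd.to_datetime(cleaned['year'].astype(str), format='%Y')
    return cleaned


# Hypothetical sample with one duplicate row and one missing value
sample = pd.DataFrame({
    'region': ['Australia', 'Australia', 'Australia'],
    'parameter': ['EV sales', 'EV sales', 'EV stock'],
    'year': [2011, 2011, 2012],
    'value': [49.0, 49.0, None],
})
cleaned_sample = clean_ev_frame(sample)
print(cleaned_sample.dtypes)
```

Because the helper returns a new frame, the raw data stays available for comparison, for example `historical_clean = clean_ev_frame(pd.read_csv('historical.csv'))`.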


#EDA

##Recreated diagrams

###CHART 1- Share of New Cars Sold That Are Electric (2010-2023)

In [None]:

#CHART 1- Share of New Cars Sold That Are Electric (2010-2023)
import plotly.express as px
from dash import Dash, dcc, html
from dash.dependencies import Input, Output

selected_regions = ['World', 'Norway', 'United Kingdom', 'EU27', 'China', 'USA']
filtered_data = historical_df[
    (historical_df['parameter'] == 'EV sales share') & (historical_df['region'].isin(selected_regions))
]

app = Dash(__name__)

region_order = ['World', 'Norway', 'United Kingdom', 'EU27', 'China', 'USA']

# Create the Plotly Express figure with custom category order
app.layout = html.Div([
    html.H2("Share of New Cars Sold That Are Electric (2010-2023)"),
    dcc.Graph(
        id='facet-plot',
        figure=px.line(
            filtered_data,
            x='year',
            y='value',
            color='region',
            facet_col='region',
            facet_col_wrap=3,
            title='Share of New Cars Sold That Are Electric (2010-2023)',
            labels={'value': 'Share of New Cars Sold (Electric)', 'year': 'Year'},
            markers=True,
            category_orders={'region': region_order}
        ).update_layout(
            xaxis_title='Year',
            yaxis_title='Share of New Cars Sold (Electric)',
            legend_title_text='Region'
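        ).for_each_xaxis(lambda xaxis: xaxis.update(
            tickangle=45,
            # 'year' is datetime after the cleaning step, so tick positions must be dates
            tickvals=['2010-01-01', '2014-01-01', '2016-01-01', '2018-01-01', '2020-01-01', '2023-01-01'],
            ticktext=['2010', '2014', '2016', '2018', '2020', '2023']
        ))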
    )
])
if __name__ == '__main__':
    app.run_server(debug=True)


###CHART 2- Share of New Cars Sold that are Electric, 2023

In [None]:
#CHART 2- Share of New Cars Sold that are Electric, 2023
import pandas as pd
import plotly.express as px
from dash import Dash, dcc, html

selected_regions = ['Norway', 'Sweden', 'China', 'United Kingdom', 'Germany', 'EU27', 'World', 'USA', 'India', 'South Africa']
filtered_data = historical_df[
    (historical_df['parameter'] == 'EV sales share') & (historical_df['region'].isin(selected_regions))
].copy()  # work on a copy so the assignment below does not raise SettingWithCopyWarning

filtered_data['year'] = pd.to_datetime(filtered_data['year'], format='%Y')
filtered_data_2023 = filtered_data[filtered_data['year'].dt.year == 2023]
filtered_data_2023_sorted = filtered_data_2023.sort_values(by='value', ascending=True)

fig = px.bar(
    filtered_data_2023_sorted,
    x='value',
    y='region',
    orientation='h',
    title='Share of New Cars Sold That Are Electric, 2023',
    labels={'value': 'Share of New Cars Sold (Electric)', 'region': 'Region'},
    text='value'
)

fig.update_traces(marker_color='teal', texttemplate='%{text:.1f}%', textposition='outside')
fig.update_layout(
    xaxis_title='Share of New Cars Sold (Electric) (%)',
    yaxis_title='Region',
    template='plotly_white'
)

app = Dash(__name__)

app.layout = html.Div([
    html.H2('Share of New Cars Sold That Are Electric, 2023', style={'textAlign': 'center'}),
    dcc.Graph(figure=fig)
])


if __name__ == '__main__':
    app.run_server(debug=True)
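
Assigning a column on a filtered DataFrame, as `filtered_data['year'] = ...` does in this cell, can trigger pandas' SettingWithCopyWarning. One common way to avoid it is to take an explicit `.copy()` of the filtered rows before modifying them. A minimal sketch with a made-up frame (the values are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({
    'region': ['Norway', 'Norway', 'China'],
    'year': [2022, 2023, 2023],
    'value': [88.0, 93.0, 38.0],  # made-up values
})

# Boolean filtering followed by .copy() gives an independent frame,
# so adding or changing columns on it is unambiguous
subset = df[df['region'] == 'Norway'].copy()
subset['year'] = pd.to_datetime(subset['year'].astype(str), format='%Y')

print(subset)
print(df['year'].dtype)  # the original frame is left untouched
```

The same pattern applies to the other chart cells that convert `year` on a filtered frame.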




###CHART 3- Share of New Cars Sold That Are Battery-Electric & Plug-In Hybrid 2010 to 2023

In [None]:
import dash
from dash import dcc, html
import plotly.express as px
import pandas as pd

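# Data setup (first draft): the `* 6` below repeats one series for every region, so all six
# facets show identical bars; the next cell supplies per-region values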
data = {
    "Year": [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023] * 6,
    "Region": ["World"] * 14 + ["Norway"] * 14 + ["EU27"] * 14 +
              ["United Kingdom"] * 14 + ["China"] * 14 + ["United States"] * 14,
    "Plug-in Hybrid": [0, 0, 0, 0, 1, 5, 14, 18, 18, 13, 21, 22, 9, 8] * 6,
    "Battery Electric": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 3, 6, 12] * 6
}

df = pd.DataFrame(data)

# Reshape data for Plotly
melted_data = df.melt(
    id_vars=["Year", "Region"],
    value_vars=["Plug-in Hybrid", "Battery Electric"],
    var_name="Powertrain",
    value_name="Percentage"
)

# Plotly Express figure
fig = px.bar(
    melted_data,
    x="Year",
    y="Percentage",
    color="Powertrain",
    facet_col="Region",
    facet_col_wrap=3,
    title="Share of New Cars Sold (Battery-Electric & Plug-In Hybrid, 2010-2023)",
    labels={"Year": "Year", "Percentage": "Share (%)"},
    color_discrete_map={"Plug-in Hybrid": "maroon", "Battery Electric": "darkblue"}
)

# Update layout
fig.update_layout(
    barmode="stack",
    height=800,
    title=dict(font=dict(size=18)),
    yaxis=dict(ticksuffix='%'),
    xaxis=dict(showline=True, linecolor="black"),
    showlegend=True,
    template="plotly_white"
).for_each_annotation(lambda a: a.update(text=a.text.split("=")[-1]))

# Dash App
app = dash.Dash(__name__)

app.layout = html.Div([
    html.H2("Battery-Electric & Plug-In Hybrid Sales (2010-2023)", style={"textAlign": "center"}),
    dcc.Graph(figure=fig)
])

if __name__ == "__main__":
    app.run_server(debug=True)



In [None]:
import dash
from dash import dcc, html
import plotly.express as px
import pandas as pd

# Data provided
data = {
    "Year": [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023] * 6,
    "Country/area": ["World"] * 14 + ["Norway"] * 14 + ["European Union (27)"] * 14 +
                    ["United Kingdom"] * 14 + ["China"] * 14 + ["United States"] * 14,
    "Plug-in Hybrid": [
        0, 0, 0, 0, 1, 5, 14, 18, 18, 13, 21, 22, 9, 8,  # World
        0, 0, 1, 1, 1, 5, 14, 18, 18, 13, 21, 22, 9, 8,  # Norway
        0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 4, 7, 6, 7,  # European Union (27)
        0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 1, 1, 6, 7,  # United Kingdom
        0, 0, 0, 0, 0, 1, 1, 2, 4, 4, 5, 13, 22, 25,  # China
        0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2  # United States
    ],
    "Battery Electric": [
        0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 3, 6, 12,  # World
        0, 0, 3, 6, 14, 17, 15, 21, 31, 43, 54, 64, 80, 85,  # Norway
        0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 5, 9, 12, 14,  # European Union (27)
        0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 4, 12, 17,  # United Kingdom
        0, 0, 0, 0, 0, 0, 1, 2, 4, 4, 5, 13, 22, 25,  # China
        0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 4, 6, 8  # United States
    ]
}

# Convert to DataFrame
df = pd.DataFrame(data)

# Ensure proper country/region order
country_order = ["World", "Norway", "European Union (27)", "United Kingdom", "China", "United States"]
df["Country/area"] = pd.Categorical(df["Country/area"], categories=country_order, ordered=True)

# Melt the DataFrame for Plotly
melted_data = df.melt(
    id_vars=["Year", "Country/area"],
    value_vars=["Plug-in Hybrid", "Battery Electric"],
    var_name="Powertrain",
    value_name="Percentage"
)

# Create the stacked bar chart
fig = px.bar(
    melted_data,
    x="Year",
    y="Percentage",
    color="Powertrain",
    facet_col="Country/area",
    facet_col_wrap=3,
    title="Share of New Cars Sold That Are Battery-Electric & Plug-In Hybrid (2010 to 2023)",
    labels={"Year": "Year", "Percentage": "Share (%)"},
    color_discrete_map={"Plug-in Hybrid": "maroon", "Battery Electric": "darkblue"}
)

# Update layout for stacked bars and axis ticks
fig.update_layout(
    barmode="stack",
    height=800,
    showlegend=True,
    title=dict(font=dict(size=18)),
    yaxis=dict(ticksuffix='%')
)

# Remove "Country/area=" from facet titles
fig.for_each_annotation(lambda a: a.update(text=a.text.split("=")[-1]))

# Ensure the years are displayed at the bottom of all subplots
fig.update_xaxes(
    tickmode="array",
    tickvals=list(range(2010, 2024)),
    ticktext=[str(year) for year in range(2010, 2024)],
    showline=True,
    linecolor="black",
    matches="x"
)


fig.update_yaxes(
    tickvals=[0, 20, 40, 60, 80],
    range=[0, 100],
    title=None,
    tickformat=".0f",
)

# Build the Dash App
app = dash.Dash(__name__)

app.layout = html.Div([
    html.H2("Share of New Cars Sold That Are Battery-Electric & Plug-In Hybrid (2010-2023)", style={"textAlign": "center"}),
    dcc.Graph(
        id="ev-sales-share-chart",
        figure=fig
    )
])

if __name__ == "__main__":
    app.run_server(debug=True)
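
The two cells above type the percentages in by hand. As an alternative, the split could be estimated from the dataset itself by dividing the overall `EV sales share` between powertrains in proportion to their `EV sales` counts. This is a sketch under assumptions: it relies on the `powertrain` labels `BEV`, `PHEV`, and `EV` seen in the data preview, and the helper name `powertrain_shares` and the sample numbers are hypothetical.

```python
import pandas as pd


def powertrain_shares(df: pd.DataFrame) -> pd.DataFrame:
    """Split each region/year 'EV sales share' into BEV and PHEV parts."""
    sales = df[(df['parameter'] == 'EV sales') & df['powertrain'].isin(['BEV', 'PHEV'])]
    counts = sales.pivot_table(index=['region', 'year'], columns='powertrain',
                               values='value', aggfunc='sum', fill_value=0)
    counts = counts.reindex(columns=['BEV', 'PHEV'], fill_value=0)
    total_share = (df[df['parameter'] == 'EV sales share']
                   .groupby(['region', 'year'])['value'].sum())
    # Fraction of EV sales belonging to each powertrain, scaled by the overall share
    fractions = counts.div(counts.sum(axis=1), axis=0)
    shares = fractions.mul(total_share, axis=0).dropna().reset_index()
    return shares.melt(id_vars=['region', 'year'], var_name='Powertrain',
                       value_name='Percentage')


# Made-up numbers purely to exercise the function
sample = pd.DataFrame({
    'region': ['Norway'] * 3,
    'parameter': ['EV sales', 'EV sales', 'EV sales share'],
    'powertrain': ['BEV', 'PHEV', 'EV'],
    'year': [2023, 2023, 2023],
    'value': [90.0, 10.0, 50.0],
})
result = powertrain_shares(sample)
print(result)
```

The result is in long format with `Powertrain` and `Percentage` columns, so it could be passed to `px.bar` with `color="Powertrain"` in much the same way as `melted_data`.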


###CHART 4- Share of New Electric Cars that are Fully Battery-Electric 2012-2023

In [None]:
import pandas as pd
import plotly.graph_objects as go
import dash
from dash import dcc, html

# Define selected regions and their colors
selected_regions = ['Norway', 'United Kingdom', 'World', 'China', 'Sweden']
region_colors = {
    'Norway': 'orange',
    'United Kingdom': 'green',
    'World': 'purple',
    'China': 'blue',
    'Sweden': 'brown'
}

# Filter and process the data
ev_sales_data = historical_df[
    (historical_df['parameter'] == 'EV sales') &
    historical_df['region'].isin(selected_regions)
].copy()  # copy so the 'year' assignment below does not raise SettingWithCopyWarning
ev_sales_data['year'] = pd.to_datetime(ev_sales_data['year'], errors='coerce').dt.year
ev_sales_data = ev_sales_data[ev_sales_data['year'].between(2012, 2023)]  # Range starts at 2012

# Split sales into BEV and all other EV powertrains (e.g. PHEV)
bev_sales_data = ev_sales_data[ev_sales_data['powertrain'] == 'BEV']
other_ev_sales_data = ev_sales_data[ev_sales_data['powertrain'] != 'BEV']

bev_sales_by_year_region = bev_sales_data.groupby(['year', 'region'])['value'].sum().reset_index()
other_ev_sales_by_year_region = other_ev_sales_data.groupby(['year', 'region'])['value'].sum().reset_index()

merged_data = pd.merge(
    bev_sales_by_year_region,
    other_ev_sales_by_year_region,
    on=['year', 'region'],
    suffixes=('_bev', '_other')
)
# BEV share of all EV sales = BEV / (BEV + other EVs)
merged_data['share_bev'] = merged_data['value_bev'] / (merged_data['value_bev'] + merged_data['value_other']) * 100

# Initialize the Dash app
app = dash.Dash(__name__)

# Create the figure
fig = go.Figure()

# Add traces for each region
for region in selected_regions:
    region_data = merged_data[merged_data['region'] == region]
    fig.add_trace(go.Scatter(
        x=region_data['year'],
        y=region_data['share_bev'],
        mode='lines+markers',
        name=region,
        line=dict(color=region_colors[region]),
        marker=dict(symbol='circle')
    ))

# Add annotations for the last data points
end_points = {}
for region in selected_regions:
    region_data = merged_data[merged_data['region'] == region]
    if not region_data.empty:  # Check if region_data is empty
        last_year = region_data['year'].iloc[-1]
        last_share = region_data['share_bev'].iloc[-1]
        end_points[region] = (last_year, last_share)

annotations = []
for region, (x, y) in end_points.items():
    annotations.append(
        go.layout.Annotation(
            x=x,
            y=y,
            text=region,
            showarrow=False,
            xanchor='left',
            yanchor='middle',
            font=dict(size=10)
        )
    )

# Update layout
fig.update_layout(
    title='Share of New Electric Cars That Are Fully Battery-Electric 2012 to 2023',
    xaxis_title='Year',
    yaxis_title='Share of BEVs (%)',
    legend_title='Region',
    template='plotly_white',
    xaxis=dict(
        tickangle=45,
        tickvals=list(range(2012, 2024)),
        ticktext=[str(year) for year in range(2012, 2024)]
    ),
    hovermode='closest',
    annotations=annotations
)

# Define the app layout
app.layout = html.Div([
    html.H2('Electric Vehicle Sales Analysis (2012-2023)', style={'textAlign': 'center'}),
    dcc.Graph(figure=fig)
])

# Run the app
if __name__ == '__main__':
    app.run_server(debug=True)




###CHART 5- Number of New Electric Cars Sold 2023

In [None]:
#CHART 5- Number of New Electric Cars Sold 2023
import dash
from dash import dcc, html
import plotly.express as px
import pandas as pd

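# Reuse the cleaned historical_df from the Data Cleaning section. Re-reading the CSV here
# would reset 'year' to int64, so the datetime comparison below would match no rows.
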
# Filter data for 2023 and EV sales
filtered_data = historical_df[
    (historical_df['year'] == pd.to_datetime('2023')) &
    (historical_df['parameter'] == 'EV sales')
]

# Aggregate sales by region
sales_by_region = filtered_data.groupby('region')['value'].sum().reset_index()

# Create the choropleth map
fig = px.choropleth(
    sales_by_region,
    locations='region',
    locationmode='country names',
    color='value',
    color_continuous_scale='redor',
    title='Number of New Electric Cars Sold in 2023',
    labels={'value': 'Number of Electric Cars Sold'},
    range_color=[0, 10000000]
)

fig.update_geos(
    visible=False,
    showcoastlines=True, coastlinecolor="Black",
    showland=True, landcolor="lightgray"
)

fig.update_layout(
    title_x=0.5,
    geo=dict(showframe=False, showcoastlines=True, projection_type='equirectangular')
)

# Initialize Dash app
app = dash.Dash(__name__)

# Layout of the app
app.layout = html.Div([
    html.H2("Number of New Electric Cars Sold in 2023", style={'textAlign': 'center'}),
    dcc.Graph(figure=fig)
])

# Run the Dash app
if __name__ == '__main__':
    app.run_server(debug=True)
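
`locationmode='country names'` can only colour rows whose `region` is a country name the map recognises, so aggregates such as `World` or `EU27` do not appear. One option is to map regions to ISO-3 codes explicitly and drop everything else before plotting. The mapping below is a small illustrative subset and the helper name `add_iso3` is hypothetical; it would need extending to the full region list.

```python
import pandas as pd

# Illustrative subset only; extend to cover every country in the dataset
ISO3_BY_REGION = {
    'USA': 'USA',
    'China': 'CHN',
    'Norway': 'NOR',
    'United Kingdom': 'GBR',
    'Germany': 'DEU',
}


def add_iso3(df: pd.DataFrame) -> pd.DataFrame:
    """Attach an 'iso3' column and drop regions without a code (e.g. aggregates)."""
    out = df.copy()
    out['iso3'] = out['region'].map(ISO3_BY_REGION)
    return out.dropna(subset=['iso3']).reset_index(drop=True)


sample = pd.DataFrame({
    'region': ['World', 'China', 'EU27', 'USA'],
    'value': [100.0, 60.0, 20.0, 10.0],  # made-up values
})
countries_only = add_iso3(sample)
print(countries_only)
```

The result could then be plotted with `px.choropleth(countries_only, locations='iso3', locationmode='ISO-3', color='value')`.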



###CHART 6- Number of New Cars Sold by Type

In [None]:
import dash
from dash import dcc, html
import plotly.graph_objects as go
import pandas as pd

# Provided data
data = {
    "year": list(range(2010, 2024)),
    "electric_cars": [
        7450, 49000, 120000, 201000, 330000, 550000, 760000, 1180000, 2060000,
        2080000, 2980000, 6600000, 10200000, 13800000
    ],
    "non_electric_cars": [
        67719820, 72009816, 74880000, 74243440, 80157810, 80332350, 83684450,
        83105710, 83773330, 77920000, 67972380, 67557304, 62657144, 62866668
    ]
}

# Create a DataFrame
df = pd.DataFrame(data)

# Create a stacked bar chart
fig = go.Figure()

# Add non-electric cars trace first, so it forms the bottom segment of each stacked bar
fig.add_trace(go.Bar(
    x=df['year'],
    y=df['non_electric_cars'],
    name='Non-Electric Cars',
    marker_color='#e66101'
))

# Add electric cars trace second, so it is stacked on top of the non-electric segment
fig.add_trace(go.Bar(
    x=df['year'],
    y=df['electric_cars'],
    name='Electric Cars',
    marker_color='#377eb8'
))

# Update layout
fig.update_layout(
    title='Number of New Cars Sold by Type, World (2010–2023)',
    xaxis_title='Year',
    yaxis_title='Number of Cars Sold',
    barmode='stack',
    xaxis=dict(
        tickmode='array',
        tickvals=df['year'],
        ticktext=[str(year) for year in df['year']]
    ),
    yaxis=dict(
        tickformat=".1s",
        title='Cars Sold (millions)'
    ),
    legend_title='Car Type',
    template='plotly_white',
    title_x=0.5,
    plot_bgcolor='white'
)

# Dash App Layout
app = dash.Dash(__name__)

app.layout = html.Div([
    html.H2("Number of New Cars Sold by Type, World (2010–2023)", style={'textAlign': 'center'}),
    dcc.Graph(figure=fig)
])

# Run the Dash app
if __name__ == '__main__':
    app.run_server(debug=True)


###CHART 7- Share of Cars Currently In Use That Are Electric 2010-2023

In [None]:
import dash
from dash import dcc, html
import plotly.graph_objects as go
import pandas as pd

selected_regions = ['Norway', 'Sweden', 'China', 'World', 'USA']

region_colors = {
    'Norway': 'orange',
    'Sweden': 'brown',
    'China': 'blue',
    'World': 'purple',
    'USA': 'green'
}

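filtered_data = historical_df[
    (historical_df['powertrain'] == 'EV') &
    (historical_df['parameter'] == 'EV stock share') &  # share of cars in use, not share of new sales
    historical_df['region'].isin(selected_regions)
].copy()  # copy so the 'year' assignment below does not raise SettingWithCopyWarning

filtered_data['year'] = pd.to_datetime(filtered_data['year'], errors='coerce').dt.year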
filtered_data = filtered_data[filtered_data['year'].between(2010, 2023)]

ev_stock_by_year_region = filtered_data.groupby(['year', 'region'])['value'].sum().reset_index()

app = dash.Dash(__name__)

fig = go.Figure()

for region in selected_regions:
    region_data = ev_stock_by_year_region[ev_stock_by_year_region['region'] == region]
    fig.add_trace(go.Scatter(
        x=region_data['year'],
        y=region_data['value'],
        mode='lines+markers',
        name=region,
        line=dict(color=region_colors[region]),
        marker=dict(symbol='circle')
    ))
end_points = {}
for region in selected_regions:
    region_data = ev_stock_by_year_region[ev_stock_by_year_region['region'] == region]
    if not region_data.empty:  # Skip empty DataFrames
        last_year = region_data['year'].iloc[-1]
        last_share = region_data['value'].iloc[-1]
        end_points[region] = (last_year, last_share)

annotations = []
for region, (x, y) in end_points.items():
    annotations.append(
        go.layout.Annotation(
            x=x,
            y=y,
            text=region,
            showarrow=False,
            xanchor='left',
            yanchor='middle',
            font=dict(size=10)
        )
    )
fig.update_layout(
    title='Share of Cars in Use That Are Electric (2010 to 2023)',
    xaxis_title='Year',
    yaxis_title='Percentage',
    legend_title='Region',
    xaxis=dict(
        tickangle=45,
        tickvals=[2010, 2012, 2014, 2016, 2018, 2020, 2022, 2023],
        ticktext=['2010', '2012', '2014', '2016', '2018', '2020', '2022', '2023']
    ),
    template='plotly',
    showlegend=True,
    plot_bgcolor='white',
    margin=dict(l=50, r=50, t=50, b=50),
    annotations=annotations,
)

app.layout = html.Div([
    html.H2('Share of Cars in Use That Are Electric (2010-2023)', style={'textAlign': 'center'}),
    dcc.Graph(
        id='ev-share-plot',
        figure=fig
    )
])

if __name__ == '__main__':
    app.run_server(debug=True)





###CHART 8- Electric Car Stocks 2010-2023

In [None]:
#CHART 8- Electric Car Stocks 2010-2023
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go  # needed for go.layout.Annotation below
from dash import Dash, dcc, html

selected_regions = ['World', 'China', 'EU27', 'USA']
filtered_data = historical_df[
    historical_df['powertrain'].isin(['BEV', 'PHEV']) &
    (historical_df['parameter'] == 'EV stock') &
    historical_df['region'].isin(selected_regions)
].copy()  # copy so the 'year' assignment below does not raise SettingWithCopyWarning

filtered_data['year'] = pd.to_datetime(filtered_data['year'], errors='coerce').dt.year

ev_stock_by_year_region = filtered_data.groupby(['year', 'region'])['value'].sum().reset_index()
ev_stock_by_year_region['value'] = ev_stock_by_year_region['value'] / 1_000_000

fig = px.line(
    ev_stock_by_year_region,
    x='year',
    y='value',
    color='region',
    title='Electric Car Stocks, 2010 to 2023',
    labels={
        'year': 'Year',
        'value': 'Electric Car Stocks (Millions)',
        'region': 'Region'
    }
)
end_points = {}
for region in selected_regions:
    region_data = ev_stock_by_year_region[ev_stock_by_year_region['region'] == region]
    last_year = region_data['year'].iloc[-1]
    last_share = region_data['value'].iloc[-1]
    end_points[region] = (last_year, last_share)

annotations = []
for region, (x, y) in end_points.items():
    annotations.append(
        go.layout.Annotation(
            x=x,
            y=y,
            text=region,
            showarrow=False,
            xanchor='left',
            yanchor='middle',
            font=dict(size=10)
        )
    )
fig.update_layout(
    xaxis=dict(title='Year', tickangle=45,
               tickvals=[2010, 2012, 2014, 2016, 2018, 2020, 2022, 2023],
               ticktext=['2010', '2012', '2014', '2016', '2018', '2020', '2022', '2023']),
    yaxis=dict(title='Electric Car Stocks (Millions)', tickformat=".1f"),
    legend_title='Region',
    template='plotly_white',
    annotations=annotations,
)

app = Dash(__name__)

app.layout = html.Div([
    html.H2('Electric Car Stocks Dashboard', style={'textAlign': 'center'}),
    dcc.Graph(figure=fig)
])

if __name__ == '__main__':
    app.run_server(debug=True)
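
Charts 4, 7, and 8 each repeat the same loop to label the last point of every line. That logic could live in one helper. This is a sketch, assuming a long-format frame with one row per group and x value; the function name `end_label_annotations` is hypothetical, and it returns plain dicts, which `fig.update_layout(annotations=...)` also accepts.

```python
import pandas as pd


def end_label_annotations(df, group_col, x_col, y_col, font_size=10):
    """Build one annotation dict per group, placed at that group's last x value."""
    annotations = []
    # idxmax on x finds the row with the latest x within each group
    last_rows = df.loc[df.groupby(group_col)[x_col].idxmax()]
    for _, row in last_rows.iterrows():
        annotations.append(dict(
            x=row[x_col], y=row[y_col], text=row[group_col],
            showarrow=False, xanchor='left', yanchor='middle',
            font=dict(size=font_size),
        ))
    return annotations


sample = pd.DataFrame({
    'region': ['World', 'World', 'China', 'China'],
    'year': [2022, 2023, 2022, 2023],
    'value': [26.0, 40.0, 14.0, 22.0],  # made-up values
})
labels = end_label_annotations(sample, 'region', 'year', 'value')
print(labels)
```

Regions with no rows simply produce no label, which covers the empty-frame check used in Charts 4 and 7.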




##Additional diagrams

###DIAGRAM 1 & 2 (Line)- Chinese Cars Sold By Year for Each EV Type

In [None]:
import dash
from dash import dcc, html
import plotly.graph_objects as go
import pandas as pd

# Historical sales in China by powertrain (copy so the 'year' edit does not raise SettingWithCopyWarning)
chinese_sales = historical_df[historical_df['region'] == 'China'].copy()
chinese_sales['year'] = chinese_sales['year'].dt.year
filtered_chinese_sales = chinese_sales[chinese_sales['parameter'] == 'EV sales']
sales_by_powertrain = filtered_chinese_sales.groupby(['year', 'powertrain'])['value'].sum().reset_index()
# Convert to millions so the values match the "(In Millions)" axis title
sales_by_powertrain_pivot = sales_by_powertrain.pivot(index='year', columns='powertrain', values='value') / 1e6

# Projected sales in China by powertrain
# Note: this sums every row per year and powertrain, so if projected.csv holds
# several scenarios in 'category', their values are added together
chinese_sales_proj = projected_df[projected_df['region'] == 'China'].copy()
chinese_sales_proj['year'] = chinese_sales_proj['year'].dt.year
filtered_chinese_sales_proj = chinese_sales_proj[chinese_sales_proj['parameter'] == 'EV sales']
sales_by_powertrain_proj = filtered_chinese_sales_proj.groupby(['year', 'powertrain'])['value'].sum().reset_index()
sales_by_powertrain_pivot_proj = sales_by_powertrain_proj.pivot(index='year', columns='powertrain', values='value') / 1e6

app = dash.Dash(__name__)

fig1 = go.Figure()

for powertrain in sales_by_powertrain_pivot.columns:
    fig1.add_trace(go.Scatter(
        x=sales_by_powertrain_pivot.index,
        y=sales_by_powertrain_pivot[powertrain],
        mode='lines+markers',
        name=powertrain
    ))

fig1.update_layout(
    title='Historical Chinese EV Sales',
    xaxis_title='Year',
    yaxis_title='EV Sales By Powertrain (In Millions)',
    yaxis_tickformat='.1f',
    template='plotly',
    showlegend=True
)

fig2 = go.Figure()

for powertrain in sales_by_powertrain_pivot_proj.columns:
    fig2.add_trace(go.Scatter(
        x=sales_by_powertrain_pivot_proj.index,
        y=sales_by_powertrain_pivot_proj[powertrain],
        mode='lines+markers',
        name=powertrain
    ))

fig2.update_layout(
    title='Projected Chinese EV Sales',
    xaxis_title='Year',
    yaxis_title='EV Sales By Powertrain (In Millions)',
    yaxis_tickformat='.1f',
    template='plotly',
    showlegend=True
)

app.layout = html.Div([
    html.H3('Chinese EV Sales by Powertrain', style={'textAlign': 'center'}),

    html.Div([
        dcc.Graph(
            id='historical-ev-sales',
            figure=fig1
        ),
        dcc.Graph(
            id='projected-ev-sales',
            figure=fig2
        )
    ])
])

if __name__ == '__main__':
    app.run_server(debug=True)
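
The projected chart sums `value` over all rows for a given year and powertrain. If `projected.csv` contains more than one scenario in its `category` column, those scenarios would be added together. Below is a sketch of keeping them separate, assuming such a column layout; the helper name `sales_by_scenario` is hypothetical and the scenario labels in the sample (`Scenario A`, `Scenario B`) are placeholders, not values taken from the file.

```python
import pandas as pd


def sales_by_scenario(df: pd.DataFrame, region: str) -> pd.DataFrame:
    """Sum EV sales per scenario, year and powertrain for one region."""
    mask = (df['region'] == region) & (df['parameter'] == 'EV sales')
    return (df[mask]
            .groupby(['category', 'year', 'powertrain'], as_index=False)['value']
            .sum())


sample = pd.DataFrame({
    'region': ['China'] * 4,
    'category': ['Scenario A', 'Scenario A', 'Scenario B', 'Scenario B'],
    'parameter': ['EV sales'] * 4,
    'powertrain': ['BEV', 'PHEV', 'BEV', 'PHEV'],
    'year': [2030] * 4,
    'value': [10.0, 4.0, 12.0, 5.0],  # made-up values
})
per_scenario = sales_by_scenario(sample, 'China')
print(per_scenario)
```

Passing `line_dash='category'` or `facet_col='category'` to `px.line` would then draw each scenario separately.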


###DIAGRAM 3 (Facet)- Powertrain Distribution in the U.S., UK, China, India, and Europe in 2023

In [None]:
#Diagram 3 (Facet)- Powertrain Distribution in the U.S., UK, China, India and Europe in 2023
import dash
from dash import dcc, html  # the standalone dash_core_components / dash_html_components packages are deprecated
import plotly.express as px
import pandas as pd


historical_df['year'] = pd.to_datetime(historical_df['year'], format='%Y')
selected_regions = ['USA', 'United Kingdom', 'China', 'Europe', 'India']
# Keep only EV sales so that stock, share, and other parameters are not summed into "Sales"
filtered_data = historical_df[
    (historical_df['region'].isin(selected_regions)) &
    (historical_df['parameter'] == 'EV sales') &
    (historical_df['year'] == pd.to_datetime('2023'))
]

powertrain_distribution = filtered_data.groupby(['region', 'powertrain'])['value'].sum().reset_index()

app = dash.Dash(__name__)

fig = px.bar(
    powertrain_distribution,
    x='powertrain',
    y='value',
    color='powertrain',
    facet_col='region',
    category_orders={'region': selected_regions},
    labels={'powertrain': 'Powertrain Type', 'value': 'Sales'},
    title="Powertrain Distribution in Selected Regions (2023)"
)

fig.update_layout(
    title_text="Powertrain Distribution in Selected Regions (2023)",
    title_x=0.5,
    title_y=0.95,
    xaxis_title="Powertrain Type",
    yaxis_title="Sales",
    xaxis_tickangle=-45,
    showlegend=False
)

app.layout = html.Div([
    dcc.Graph(figure=fig)
])

if __name__ == '__main__':
    app.run_server(debug=True)


###DIAGRAM 4 (HeatMap)- Correlation Between EV Sales and the U.S., UK, China, Europe & India Over Time


In [None]:
#Diagram 4 (HeatMap)- Correlation Between EV Sales and the U.S., UK, China, Europe & India Over Time
import plotly.graph_objs as go
from dash import Dash, dcc, html
selected_regions = ['USA', 'United Kingdom', 'China', 'Europe', 'India']
ev_sales_data = historical_df[
    (historical_df['parameter'] == 'EV sales') & (historical_df['region'].isin(selected_regions))
]

heatmap_data = ev_sales_data.pivot_table(
    index='region', columns='year', values='value', aggfunc='sum'
)

app = dash.Dash(__name__)

app.layout = html.Div([
    html.H3("Correlation Between EV Sales and Regions Over Time"),

    dcc.Graph(
        id='ev-sales-heatmap',
        figure={
            'data': [
                go.Heatmap(
                    z=heatmap_data.values,
                    x=heatmap_data.columns,
                    y=heatmap_data.index,
                    colorscale='Viridis',
                    colorbar=dict(title='EV Sales'),
                    hoverongaps=False
                )
            ],
            'layout': go.Layout(
                title='Correlation between EV Sales and Regions Over Time',
                xaxis=dict(title='Year'),
                yaxis=dict(title='Region'),
                height=600,
                width=800
            )
        }
    )
])

if __name__ == '__main__':
    app.run_server(debug=True)
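
The heatmap above plots raw EV sales by region and year. If the goal is to show how closely the regional sales series move together, a correlation matrix can be computed from the same pivoted layout. The block below is a minimal sketch under stated assumptions: the `toy` frame and all of its values are invented stand-ins for `historical_df`, and Pearson correlation is one possible choice of measure.

```python
import pandas as pd

# Hypothetical toy data in the same long format as historical_df (values are invented)
toy = pd.DataFrame({
    "region": ["China"] * 4 + ["USA"] * 4 + ["India"] * 4,
    "year": [2020, 2021, 2022, 2023] * 3,
    "value": [1.0, 3.0, 6.0, 8.0, 0.3, 0.6, 1.0, 1.4, 0.01, 0.02, 0.05, 0.08],
})

# One row per year, one column per region
wide = toy.pivot_table(index="year", columns="region", values="value", aggfunc="sum")

# Pearson correlation between each pair of regional sales series
corr_matrix = wide.corr()
print(corr_matrix.round(3))
```

The resulting square matrix could then be passed to `go.Heatmap` through `z=corr_matrix.values`, with `corr_matrix.columns` and `corr_matrix.index` as the axes.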

###DIAGRAM 5 (Geoplot) - Global Distribution of EV Sales Share 2023

In [None]:
#Diagram 5 (Geoplot)-Global Distribution of EV Sales Share 2023
import dash
from dash import dcc, html
import plotly.express as px
import pandas as pd

#Filter the data for EV sales share in 2023
filtered_data = historical_df[
    (historical_df['year'] == pd.to_datetime('2023')) &
    (historical_df['parameter'] == 'EV sales share')
]
sales_by_region = filtered_data.groupby('region')['value'].sum().reset_index()

#Create the choropleth map using Plotly Express
fig = px.choropleth(
    sales_by_region,
    locations='region',
    locationmode='country names',
    color='value',
    color_continuous_scale='Viridis',
    title='Global Distribution of EV Sales Share in 2023',
    labels={'value': 'EV Sales Share (%)'},
    range_color=[0, sales_by_region['value'].max()],
)

#Update geographical layout
fig.update_geos(
    visible=False,
    showcoastlines=True, coastlinecolor="Black",
    showland=True, landcolor="lightgray"
)

fig.update_layout(
    title_x=0.5,
    geo=dict(showframe=False, showcoastlines=True, projection_type='equirectangular')
)

#Create Dash app
app = dash.Dash(__name__)

#Dash layout
app.layout = html.Div([
    html.H3('Global Distribution of EV Sales Share in 2023', style={'textAlign': 'center'}),

    dcc.Graph(
        id='ev-sales-map',
        figure=fig
    )
])

#Run the app
if __name__ == '__main__':
    app.run_server(debug=True)
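
With `locationmode='country names'`, Plotly matches each `locations` value against country names, so aggregate labels such as `World` or `EU27` are unlikely to be drawn. The sketch below shows one way to separate aggregates from countries before mapping. The `aggregate_regions` set, the `name_fixes` mapping, and all values are assumptions for illustration and should be checked against the regions actually present in the dataset.

```python
import pandas as pd

# Hypothetical toy data; values are invented
sales_by_region = pd.DataFrame({
    "region": ["World", "EU27", "Europe", "China", "Norway", "USA"],
    "value": [18.0, 22.0, 21.0, 38.0, 93.0, 9.5],
})

# Assumed list of aggregate labels that do not correspond to a single country
aggregate_regions = {"World", "EU27", "Europe", "Rest of the world"}

# Assumed mapping for labels that may differ from the names the map expects
name_fixes = {"USA": "United States"}

# Keep only single countries, then harmonize their names
countries_only = sales_by_region[~sales_by_region["region"].isin(aggregate_regions)].copy()
countries_only["region"] = countries_only["region"].replace(name_fixes)
print(countries_only)
```

Passing `countries_only` instead of the unfiltered frame also keeps aggregate totals from stretching the color range.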


###DIAGRAM 6 (Bubble Graph)- EV Sales vs Stock in 2023 in the U.S., UK, Europe, China & India

In [None]:
#Diagram 6 (Bubble Graph)- EV Sales vs Stock in 2023 in the U.S., UK, Europe, China & India
import plotly.express as px

filtered_df = historical_df[
    (historical_df['year'] == pd.to_datetime('2023')) &
    (historical_df['parameter'].isin(['EV sales', 'EV stock'])) &
    (historical_df['region'].isin(['USA', 'United Kingdom', 'China', 'India', 'Europe']))
]

aggregated_df = filtered_df.groupby(['region', 'parameter'])['value'].sum().reset_index()

fig = px.scatter(
    aggregated_df,
    x='region',
    y='value',
    size='value',
    color='parameter',
    title='EV Sales vs. Stock in 2023: Focus on Key Regions',
    labels={'region': 'Region', 'value': 'Value'},
    hover_data=['region', 'parameter', 'value'], #displays data on hover
    size_max=60,
)

fig.update_layout(
    xaxis_title='Region',
    yaxis_title='Value',
    legend_title='Parameter',
)

fig.show()

# Machine Learning: EV Stock Share Prediction

## Random Forest Regressor and XGBoost Regressor

In [None]:
import pandas as pd
from xgboost import XGBRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt

#Build the target series: cumulative sum of 'value' within each region (named EV_stock_share below)
for df in [historical_df, projected_df]:
    df['EV_stock_share'] = df.groupby('region')['value'].cumsum()

#Ensure 'year' is a plain integer year in both frames before combining
#(an earlier cell converts historical_df['year'] to datetime, which pd.to_numeric does not turn back into calendar years)
for df in [historical_df, projected_df]:
    if pd.api.types.is_datetime64_any_dtype(df['year']):
        df['year'] = df['year'].dt.year
    else:
        df['year'] = pd.to_numeric(df['year'], errors='coerce')

#Combine historical and projected data
combined_data = pd.concat([historical_df, projected_df], ignore_index=True)

#Create lagged feature and drop rows with missing lagged values
combined_data = combined_data.sort_values(by=['region', 'year'])
combined_data['lagged_share'] = combined_data.groupby('region')['EV_stock_share'].shift(1)
combined_data.dropna(subset=['lagged_share'], inplace=True)

#One-hot encode categorical variables
combined_data = pd.get_dummies(combined_data, columns=['region', 'powertrain'], drop_first=True)

#Drop any remaining non-numeric columns
non_numeric_columns = combined_data.select_dtypes(exclude=['number']).columns
if not non_numeric_columns.empty:
    combined_data.drop(columns=non_numeric_columns, inplace=True)

#Prepare features and target
X = combined_data.drop(columns=['EV_stock_share'], errors='ignore')
y = combined_data['EV_stock_share']

#Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=302)

#Train models
rf_model = RandomForestRegressor(random_state=30, n_estimators=200, max_depth=10)
rf_model.fit(X_train, y_train)
rf_pred = rf_model.predict(X_test)

xgb_model = XGBRegressor(random_state=42, n_estimators=200, learning_rate=0.1, max_depth=5)
xgb_model.fit(X_train, y_train)
xgb_pred = xgb_model.predict(X_test)

#Prepare plot summaries
projected_summary = projected_df.groupby('year')['EV_stock_share'].sum().reset_index(name='Projected_Stock_Share')
actual_summary = historical_df.groupby('year')['EV_stock_share'].sum().reset_index(name='Actual_Stock_Share')

predicted_summary = pd.DataFrame({
    'year': combined_data.loc[X_test.index, 'year'],
    'Predicted_XGB': xgb_pred,
    'Predicted_RF': rf_pred
}).groupby('year').sum().reset_index()

#Ensure 'year' is of consistent type for merging
projected_summary['year'] = pd.to_numeric(projected_summary['year'], errors='coerce')
actual_summary['year'] = pd.to_numeric(actual_summary['year'], errors='coerce')
predicted_summary['year'] = pd.to_numeric(predicted_summary['year'], errors='coerce')

#Merge and normalize data
comparison_df = (
    projected_summary
    .merge(predicted_summary, on='year', how='outer')
    .merge(actual_summary, on='year', how='outer')
)
comparison_df.iloc[:, 1:] *= 100

#Plot
plt.figure(figsize=(12, 6))
plt.plot(comparison_df['year'], comparison_df['Actual_Stock_Share'], label='Actual (Historical)', marker='o')
plt.plot(comparison_df['year'], comparison_df['Predicted_XGB'], label='Predicted (XGBoost)', linestyle='--', marker='x')
plt.plot(comparison_df['year'], comparison_df['Predicted_RF'], label='Predicted (Random Forest)', linestyle='-.', marker='s')
plt.plot(comparison_df['year'], comparison_df['Projected_Stock_Share'], label='Projected (IEA)', linestyle=':')
plt.xlabel('Year')
plt.ylabel('EV Stock Share (%)')
plt.title('Comparison of Actual, Predicted, and Projected EV Stock Share (2010-2035)')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

#Evaluate models
print(f"Random Forest - MSE: {mean_squared_error(y_test, rf_pred):.2f}, R²: {r2_score(y_test, rf_pred):.4f}")
print(f"XGBoost - MSE: {mean_squared_error(y_test, xgb_pred):.2f}, R²: {r2_score(y_test, xgb_pred):.4f}")
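
Because the models above use a lagged feature, a random train/test split can place later years in the training set and earlier years in the test set. A chronological split is one alternative. The block below is a minimal sketch rather than the method used in this notebook: the `toy` frame, its values, and `cutoff_year` are all invented for illustration.

```python
import pandas as pd

# Hypothetical toy frame standing in for combined_data (values are invented)
toy = pd.DataFrame({
    "region": ["China"] * 5 + ["USA"] * 5,
    "year": [2019, 2020, 2021, 2022, 2023] * 2,
    "value": [1.0, 1.2, 3.3, 6.0, 8.1, 0.3, 0.3, 0.6, 1.0, 1.4],
})

# Lag feature: the previous year's value within each region
toy = toy.sort_values(["region", "year"])
toy["lagged_value"] = toy.groupby("region")["value"].shift(1)
toy = toy.dropna(subset=["lagged_value"])

# Chronological split: train on years up to the cutoff, test on later years
cutoff_year = 2021  # assumed cutoff, chosen only for illustration
train = toy[toy["year"] <= cutoff_year]
test = toy[toy["year"] > cutoff_year]

X_train, y_train = train[["year", "lagged_value"]], train["value"]
X_test, y_test = test[["year", "lagged_value"]], test["value"]
print(len(X_train), len(X_test))
```

Scores from a split like this describe how well a model extrapolates forward in time, which is closer to the forecasting task than a shuffled split.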

In [None]:
!pip install dash
import dash
from dash import dcc, html
from dash.dependencies import Input, Output
import pandas as pd
import plotly.graph_objects as go

#Load and prepare data
historical_df = pd.read_csv('historical.csv')
projected_df = pd.read_csv('projected.csv')

#Add EV Stock Share Percentage
for df in [historical_df, projected_df]:
    df['EV_stock_share_percent'] = (df.groupby('year')['value'].cumsum() / df['value'].sum()) * 100

#Combine data
historical_summary = historical_df.groupby(['region', 'year'])['EV_stock_share_percent'].sum().reset_index(name='Historical')
projected_summary = projected_df.groupby(['region', 'year'])['EV_stock_share_percent'].sum().reset_index(name='Projected')

#Simulated predicted data (hard-coded placeholder values, not output from the Random Forest/XGBoost models above)
predicted_rf = pd.DataFrame({'year': [2025, 2030, 2035], 'Predicted_RF': [10, 20, 30], 'region': ['World'] * 3})
predicted_xgb = pd.DataFrame({'year': [2025, 2030, 2035], 'Predicted_XGB': [15, 25, 35], 'region': ['World'] * 3})

#Merge data
comparison_df = historical_summary.merge(projected_summary, on=['region', 'year'], how='outer')
comparison_df = comparison_df.merge(predicted_rf, on=['region', 'year'], how='outer')
comparison_df = comparison_df.merge(predicted_xgb, on=['region', 'year'], how='outer')

# Initialize Dash app
app = dash.Dash(__name__)

# Layout
app.layout = html.Div([
    html.H4("EV Stock Share Dashboard (2010-2035)"),
    dcc.Dropdown(
        id='region-dropdown',
        options=[{'label': region, 'value': region} for region in comparison_df['region'].unique()],
        value='World',
        placeholder="Select a Region",
        clearable=False
    ),
    dcc.Graph(id='ev-stock-graph')
])

#Callback
@app.callback(
    Output('ev-stock-graph', 'figure'),
    [Input('region-dropdown', 'value')]
)
def update_graph(selected_region):
    #Filter data for selected region
    filtered_data = comparison_df[comparison_df['region'] == selected_region]

    #Create figure
    fig = go.Figure()

    #Add Historical line
    fig.add_trace(go.Scatter(
        x=filtered_data['year'],
        y=filtered_data['Historical'],
        mode='lines+markers',
        name='Historical',
        line=dict(color='blue', dash='solid')
    ))

    # Add Projected line
    fig.add_trace(go.Scatter(
        x=filtered_data['year'],
        y=filtered_data['Projected'],
        mode='lines+markers',
        name='Projected',
        line=dict(color='red', dash='dot')
    ))

    #Add Predicted Random Forest line
    fig.add_trace(go.Scatter(
        x=filtered_data['year'],
        y=filtered_data['Predicted_RF'],
        mode='lines+markers',
        name='Predicted (Random Forest)',
        line=dict(color='green', dash='dash')
    ))

    # Add Predicted XGBoost line
    fig.add_trace(go.Scatter(
        x=filtered_data['year'],
        y=filtered_data['Predicted_XGB'],
        mode='lines+markers',
        name='Predicted (XGBoost)',
        line=dict(color='orange', dash='dashdot')
    ))

    #Update layout
    fig.update_layout(
        title=f"EV Stock Share in {selected_region} (2010-2035)",
        xaxis_title="Year",
        yaxis_title="EV Stock Share (%)",
        xaxis=dict(range=[2010, 2035]),
        template='plotly_white',
        legend_title="Data Type"
    )
    return fig

#Run app
if __name__ == '__main__':
    app.run_server(debug=True, port=8060)

## TensorFlow

In [None]:
# Filter the data for EV stock share
historical_ev_stock_share = historical_df[historical_df['parameter'] == 'EV stock share']
projected_ev_stock_share = projected_df[projected_df['parameter'] == 'EV stock share']
projected_ev_stock_share.head(15)

In [None]:
# Select features and target
X_historical = historical_ev_stock_share[['year']]
y_historical = historical_ev_stock_share['value']
X_projected = projected_ev_stock_share[['year']]


X_historical.head(10)
y_historical.head(10)
X_projected.head(10)

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Split the historical data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_historical, y_historical, test_size=0.15, random_state=5)

# Scale the data
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
X_projected_scaled = scaler.transform(X_projected)

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping

# Define the model
model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(X_train_scaled.shape[1],)))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(1, activation='linear'))

# Compile the model with a custom optimizer
learning_rate = 0.001
optimizer = Adam(learning_rate=learning_rate)
model.compile(optimizer=optimizer, loss='mean_squared_error')

# Implement early stopping
early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)

# Train the model with early stopping (the test split doubles as the validation set)
history = model.fit(X_train_scaled, y_train, epochs=50, validation_data=(X_test_scaled, y_test), callbacks=[early_stopping])

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

# Define the font sizes and figure dimensions
title_fontsize = 18
label_fontsize = 14
tick_fontsize = 12
figure_size = (12, 8)

# Plot 1: Training & Validation Loss
plt.figure(figsize=figure_size)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss', fontsize=title_fontsize)
plt.xlabel('Epoch', fontsize=label_fontsize)
plt.ylabel('Loss', fontsize=label_fontsize)
plt.legend(loc='upper right', fontsize=tick_fontsize)
plt.grid(True)
plt.xticks(fontsize=tick_fontsize)
plt.yticks(fontsize=tick_fontsize)
plt.tight_layout()
plt.show()

print("The graph illustrates a steady reduction in training loss over epochs, highlighting the model's ability to learn effectively. "
      "The validation loss stabilizes and fluctuates slightly, signaling the importance of early stopping to mitigate overfitting.")

# Plot 2: Scatter Plot for Actual vs Predicted Values
plt.figure(figsize=figure_size)
plt.scatter(y_test, model.predict(X_test_scaled), alpha=0.7)
plt.plot([min(y_test), max(y_test)], [min(y_test), max(y_test)], color='red', linestyle='--')
plt.title('Actual vs. Predicted Values', fontsize=title_fontsize)
plt.xlabel('Actual Values', fontsize=label_fontsize)
plt.ylabel('Predicted Values', fontsize=label_fontsize)
plt.grid(True)
plt.xticks(fontsize=tick_fontsize)
plt.yticks(fontsize=tick_fontsize)
plt.tight_layout()
plt.show()

print("This scatter plot reveals that the model predicts lower values with reasonable accuracy. However, as actual values increase, "
      "predictions deviate further from the ideal diagonal (red dashed line), indicating challenges in capturing higher-range variability.")

# Plot 3: Residual Plot
residuals = y_test - model.predict(X_test_scaled).flatten()
plt.figure(figsize=figure_size)
plt.scatter(y_test, residuals, alpha=0.7)
plt.axhline(y=0, color='red', linestyle='--')
plt.title('Residual Plot', fontsize=title_fontsize)
plt.xlabel('Actual Values', fontsize=label_fontsize)
plt.ylabel('Residuals', fontsize=label_fontsize)
plt.grid(True)
plt.xticks(fontsize=tick_fontsize)
plt.yticks(fontsize=tick_fontsize)
plt.tight_layout()
plt.show()

print("The residual plot shows that most errors are close to zero, indicating reliable predictions. "
      "The red dashed line marks zero error. The wider spread of residuals at larger actual values highlights inconsistencies "
      "in the model's ability to generalize across the full range.")

# Plot 4: Distribution of Residuals
plt.figure(figsize=figure_size)
sns.histplot(residuals, kde=True, bins=30)
plt.title('Distribution of Residuals', fontsize=title_fontsize)
plt.xlabel('Residuals', fontsize=label_fontsize)
plt.grid(True)
plt.xticks(fontsize=tick_fontsize)
plt.yticks(fontsize=tick_fontsize)
plt.tight_layout()
plt.show()

print("The distribution of residuals demonstrates that the model performs well for the majority of cases, with a concentrated peak around zero. "
      "However, the presence of outliers suggests opportunities for refinement to address higher-error instances.")
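
The interpretations printed above are qualitative. A few summary statistics over the residuals can back them with numbers. The sketch below uses invented `y_true` and `y_pred` arrays as stand-ins for `y_test` and the model's predictions. The three metrics are common choices, not ones this notebook prescribes.

```python
import numpy as np

# Hypothetical actual and predicted values (invented for illustration)
y_true = np.array([1.0, 2.0, 4.0, 8.0])
y_pred = np.array([1.5, 2.0, 3.0, 6.0])

residuals = y_true - y_pred

# Mean absolute error: typical size of an error, ignoring sign
mae = np.mean(np.abs(residuals))

# Root mean squared error: penalizes large errors more heavily
rmse = np.sqrt(np.mean(residuals ** 2))

# Mean residual: positive values mean the model under-predicts on average
bias = np.mean(residuals)

print(f"MAE: {mae:.3f}, RMSE: {rmse:.3f}, mean residual: {bias:.3f}")
```

An RMSE noticeably larger than the MAE is consistent with a few large errors dominating, which is the pattern the residual plots describe for higher actual values.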

In [None]:
import dash
from dash import dcc, html
import plotly.graph_objs as go
from dash.dependencies import Input, Output
import numpy as np

#Create Dash app
app = dash.Dash(__name__)

#Define layout
app.layout = html.Div([
    html.H4("EV Stock Share Analysis"),

    #Dropdown to select graph type
    dcc.Dropdown(
        id='graph-type',
        options=[
            {'label': 'Training vs. Validation Loss', 'value': 'loss'},
            {'label': 'Actual vs. Predicted Values', 'value': 'actual_vs_predicted'},
            {'label': 'Residual Plot', 'value': 'residuals'},
            {'label': 'Residual Distribution', 'value': 'residual_dist'}
        ],
        value='loss',
        placeholder="Select a graph to display"
    ),

    #Graph to display selected visualization
    dcc.Graph(id='graph-output'),
])

#Callback to update graph
@app.callback(
    Output('graph-output', 'figure'),
    [Input('graph-type', 'value')]
)
def update_graph(graph_type):
    if graph_type == 'loss':
        #Training vs. Validation Loss
        fig = go.Figure()
        fig.add_trace(go.Scatter(
            x=np.arange(len(history.history['loss'])),
            y=history.history['loss'],
            mode='lines',
            name='Training Loss'
        ))
        fig.add_trace(go.Scatter(
            x=np.arange(len(history.history['val_loss'])),
            y=history.history['val_loss'],
            mode='lines',
            name='Validation Loss'
        ))
        fig.update_layout(
            title='Training vs. Validation Loss',
            xaxis_title='Epoch',
            yaxis_title='Loss',
            template='plotly_white'
        )
        return fig

    elif graph_type == 'actual_vs_predicted':
        #Actual vs. Predicted Values
        predicted_values = model.predict(X_test_scaled).flatten()
        fig = go.Figure()
        fig.add_trace(go.Scatter(
            x=y_test,
            y=predicted_values,
            mode='markers',
            name='Predicted'
        ))
        fig.add_trace(go.Scatter(
            x=[min(y_test), max(y_test)],
            y=[min(y_test), max(y_test)],
            mode='lines',
            name='Perfect Prediction',
            line=dict(color='red', dash='dash')
        ))
        fig.update_layout(
            title='Actual vs. Predicted Values',
            xaxis_title='Actual Values',
            yaxis_title='Predicted Values',
            template='plotly_white'
        )
        return fig

    elif graph_type == 'residuals':
        #Residual Plot
        residuals = y_test - model.predict(X_test_scaled).flatten()
        fig = go.Figure()
        fig.add_trace(go.Scatter(
            x=y_test,
            y=residuals,
            mode='markers',
            name='Residuals'
        ))
        fig.add_trace(go.Scatter(
            x=[min(y_test), max(y_test)],
            y=[0, 0],
            mode='lines',
            name='Zero Line',
            line=dict(color='red', dash='dash')
        ))
        fig.update_layout(
            title='Residual Plot',
            xaxis_title='Actual Values',
            yaxis_title='Residuals',
            template='plotly_white'
        )
        return fig

    elif graph_type == 'residual_dist':
        #Residual Distribution
        residuals = y_test - model.predict(X_test_scaled).flatten()
        fig = go.Figure()
        fig.add_trace(go.Histogram(
            x=residuals,
            nbinsx=30,
            name='Residuals'
        ))
        fig.update_layout(
            title='Residual Distribution',
            xaxis_title='Residuals',
            yaxis_title='Frequency',
            template='plotly_white'
        )
        return fig

#Run app
if __name__ == '__main__':
    app.run_server(debug=True)


# Conclusion

Between 2025 and 2030, global electricity demand is expected to surge by 225%, with China accounting for 33% of the total demand, Europe for 23%, and the U.S. for 32%. From 2030 to 2035, growth in electricity demand is expected to decelerate to 100%, with China's share decreasing to 29%, Europe holding at 23%, and the U.S. rising to 33%. Demand keeps climbing, but at a slower pace. Notably, on these shares the U.S. is projected to overtake China in electricity demand in the 2030-2035 period, signaling a shift in regional energy dynamics.

In parallel, the electric vehicle (EV) stock share is projected to grow by 176% between 2025 and 2030. During this period, China will lead with 31% EV adoption, followed by Europe at 18%, the U.S. at 17%, and the global average at 16%. From 2030 to 2035, EV stock share growth will moderate to 100%, with China’s adoption skyrocketing to 52%, Europe reaching 38%, the U.S. climbing to 36%, and the global average at 31%.

The trends in electricity demand and EV stock share reveal striking similarities. China is the dominant player in both sectors, driving rapid growth through 2030 before the pace slows. By 2035, the U.S. is expected to close the gap with Europe and China in EV stock share, a significant improvement from its lagging position in 2024.
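
The growth figures quoted above are percentage changes between two points in time. The helper below is a small sketch of that arithmetic. `percent_growth` is a hypothetical name, and the input values are invented to reproduce the quoted rates, not taken from the IEA data.

```python
def percent_growth(start_value: float, end_value: float) -> float:
    """Return the percentage change from start_value to end_value."""
    if start_value == 0:
        raise ValueError("start_value must be non-zero")
    return (end_value - start_value) / start_value * 100


# Hypothetical values chosen only to illustrate the arithmetic
print(percent_growth(4.0, 13.0))   # 225.0
print(percent_growth(13.0, 26.0))  # 100.0
```

Growth of 225% means the end value is 3.25 times the starting value, while growth of 100% means it has doubled. A lower growth rate in the second period can therefore still be a larger absolute increase.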

# Dash Layout

##Listview of Graphs 1-8 with Descriptions

In [None]:
import dash
from dash import dcc, html
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

# Load and preprocess data
historical_df = pd.read_csv('historical.csv')
historical_df['year'] = pd.to_datetime(historical_df['year'], format='%Y', errors='coerce').dt.year


def graph_1():
    selected_regions = ['World', 'Norway', 'United Kingdom', 'EU27', 'China', 'USA']
    filtered_data = historical_df[
        (historical_df['parameter'] == 'EV sales share') & (historical_df['region'].isin(selected_regions))
    ]

    fig = px.line(
        filtered_data,
        x='year',
        y='value',
        color='region',
        facet_col='region',
        facet_col_wrap=3,
        title="Graph 1: Share of New Cars Sold That Are Electric (2010-2023)",
        labels={'value': 'Share of New Cars Sold (Electric)', 'year': 'Year'},
        markers=True,
        category_orders={'region': ['World', 'Norway', 'United Kingdom', 'EU27', 'China', 'USA']}
    )

    # Update layout for customizations
    fig.update_layout(
        xaxis_title='Year',
        yaxis_title='Share of New Cars Sold (Electric)',
        legend_title_text='Region',
        height=800,
    )

    # Ensure years are formatted and facet titles are clean
    fig.for_each_xaxis(lambda xaxis: xaxis.update(
        tickangle=45,
        tickvals=[2010, 2014, 2016, 2018, 2020, 2023],
        ticktext=['2010', '2014', '2016', '2018', '2020', '2023']
    ))

    description = (
        "This chart illustrates the share of new cars sold that are electric in various regions from 2010 to 2023. "
        "Norway consistently leads the adoption, while the USA lags behind."
    )
    return fig, description

# Graph 2: EV Sales Share (2023)
def graph_2():
    selected_regions = ['Norway', 'Sweden', 'China', 'United Kingdom', 'Germany', 'EU27', 'World', 'USA', 'India', 'South Africa']
    filtered_data = historical_df[
        (historical_df['parameter'] == 'EV sales share') & (historical_df['region'].isin(selected_regions))
    ]
    filtered_data_2023 = filtered_data[filtered_data['year'] == 2023]
    filtered_data_sorted = filtered_data_2023.sort_values(by='value', ascending=True)
    fig = px.bar(
        filtered_data_sorted,
        x='value',
        y='region',
        orientation='h',
        title='Graph 2: EV Sales Share (2023)',
        labels={'value': 'Share of New Cars Sold (Electric)', 'region': 'Region'},
        text='value'
    )
    description = (
        "This graph thoroughly proves the point shown in Graph 1. Norway is the clear dominant player, "
        "closely followed by Sweden, while the USA is still trailing behind the World average even in 2023."
    )
    return fig, description


# Graph 3: BEV & PHEV Sales Share (2010-2023)
def graph_3():
    selected_regions = ['World', 'Norway', 'EU27', 'United Kingdom', 'China', 'USA']
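    # Restrict to EV sales so the shares are not mixed with stock or share parameters
    filtered_data = historical_df[
        (historical_df['parameter'] == 'EV sales') & historical_df['powertrain'].isin(['BEV', 'PHEV'])
    ]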
    pivot_df = filtered_data.pivot_table(index=['region', 'year'], columns='powertrain', values='value', aggfunc='sum')
    pivot_df = pivot_df.div(pivot_df.sum(axis=1), axis=0).reset_index()
    pivot_df = pivot_df[pivot_df['region'].isin(selected_regions)].melt(
        id_vars=['region', 'year'], value_vars=['BEV', 'PHEV'], var_name='powertrain', value_name='share'
    )
    fig = px.bar(
        pivot_df,
        x='year',
        y='share',
        color='powertrain',
        facet_col='region',
        title="Graph 3: BEV & PHEV Sales Share (2010-2023)",
        labels={"year": "Year", "share": "Share of EV Sales"},
    )
    description = (
        "This graph presents the adoption trends of BEV and PHEV across regions from 2010 to 2023."
    )
    return fig, description

def graph_4():
    selected_regions = ['Norway', 'United Kingdom', 'World', 'China', 'Sweden']
    filtered_data = historical_df[
        (historical_df['parameter'] == 'EV sales') & historical_df['region'].isin(selected_regions)
    ]
    bev_data = filtered_data[filtered_data['powertrain'] == 'BEV']
    total_ev_data = filtered_data.groupby(['year', 'region'])['value'].sum().reset_index()
    bev_share = bev_data.groupby(['year', 'region'])['value'].sum().reset_index()
    bev_share = pd.merge(bev_share, total_ev_data, on=['year', 'region'], suffixes=('_bev', '_total'))
    bev_share['share'] = bev_share['value_bev'] / bev_share['value_total'] * 100
    fig = px.line(
        bev_share,
        x='year',
        y='share',
        color='region',
        title="Graph 4: Share of BEVs (2012-2023)",
        labels={'year': 'Year', 'share': 'Percentage of BEVs'}
    )
    description = (
        "This graph focuses on the share of fully battery-electric vehicles (BEVs) as opposed to hybrids in select regions."
    )
    return fig, description

def graph_5():
    # Filter data for 2023 and EV sales
    filtered_data = historical_df[
        (historical_df['year'] == 2023) & (historical_df['parameter'] == 'EV sales')
    ]

    # Group data by region and aggregate sales
    sales_by_region = filtered_data.groupby('region')['value'].sum().reset_index()

    # Create a choropleth map
    fig = px.choropleth(
        sales_by_region,
        locations='region',
        locationmode='country names',
        color='value',
        color_continuous_scale='reds',
        title="Graph 5: Number of EVs Sold (2023)",
        labels={'value': 'Number of EVs Sold'}
    )

    # Update layout for better presentation
    fig.update_layout(
        title_x=0.5,
        geo=dict(
            showframe=False,
            showcoastlines=True,
            projection_type='equirectangular'
        ),
        coloraxis_colorbar=dict(
            title="EVs Sold",
            ticksuffix=' units',
            len=0.7
        )
    )

    description = (
        "This chart shows the total number of electric vehicles sold in 2023 across various regions. "
        "China leads the world in EV sales, driven by its large population and strong government incentives."
    )
    return fig, description

def graph_6():
    # Data provided
    data = {
        "year": list(range(2010, 2024)),
        "electric_cars": [
            7450, 49000, 120000, 201000, 330000, 550000, 760000, 1180000, 2060000,
            2080000, 2980000, 6600000, 10200000, 13800000
        ],
        "non_electric_cars": [
            67719820, 72009816, 74880000, 74243440, 80157810, 80332350, 83684450,
            83105710, 83773330, 77920000, 67972380, 67557304, 62657144, 62866668
        ]
    }

    # Create a DataFrame
    df = pd.DataFrame(data)

    # Create a stacked bar chart
    fig = go.Figure()

    # Add non-electric cars trace first so it forms the base of each stacked bar
    fig.add_trace(go.Bar(
        x=df['year'],
        y=df['non_electric_cars'],
        name='Non-Electric Cars',
        marker_color='#e66101'
    ))

    # Add electric cars trace (stacked on top of non-electric cars)
    fig.add_trace(go.Bar(
        x=df['year'],
        y=df['electric_cars'],
        name='Electric Cars',
        marker_color='#377eb8'
    ))

    # Update layout
    fig.update_layout(
        title="Graph 6: Number of New Cars Sold by Type (World, 2010–2023)",
        xaxis_title="Year",
        yaxis_title="Number of Cars Sold",
        barmode="stack",
        xaxis=dict(
            tickmode="array",
            tickvals=df["year"],
            ticktext=[str(year) for year in df["year"]]
        ),
        yaxis=dict(
            tickformat=".1s",
            title="Cars Sold (millions)"
        ),
        legend_title="Car Type",
        template="plotly_white",
        title_x=0.5,
        plot_bgcolor="white"
    )

    # Description
    description = (
        "This stacked bar chart displays the total number of new cars sold globally from 2010 to 2023, "
        "categorized into electric cars and non-electric cars. While the number of electric cars sold "
        "has grown substantially, they remain a fraction of the total number of cars sold globally."
    )

    return fig, description

# Graph 7: Share of Cars in Use (2010-2023)
def graph_7():
    selected_regions = ['Norway', 'Sweden', 'China', 'World', 'USA']
    filtered_data = historical_df[
        (historical_df['parameter'] == 'EV stock share') & historical_df['region'].isin(selected_regions)
    ]
    fig = px.line(
        filtered_data,
        x='year',
        y='value',
        color='region',
        title="Graph 7: Share of Cars in Use (2010-2023)",
        labels={'value': 'Share of Cars in Use (Electric)', 'year': 'Year'}
    )
    description = (
        "This graph shows the growing share of EVs in use across different regions, with Nordic countries leading."
    )
    return fig, description

# Graph 8: EV Stocks (2010-2023)
def graph_8():
    selected_regions = ['World', 'China', 'EU27', 'USA']
    filtered_data = historical_df[
        ((historical_df['powertrain'] == 'BEV') | (historical_df['powertrain'] == 'PHEV')) &
        (historical_df['parameter'] == 'EV stock') & historical_df['region'].isin(selected_regions)
    ]
    ev_stock_by_year_region = filtered_data.groupby(['year', 'region'])['value'].sum().reset_index()
    ev_stock_by_year_region['value'] = ev_stock_by_year_region['value'] / 1_000_000
    fig = px.line(
        ev_stock_by_year_region,
        x='year',
        y='value',
        color='region',
        title="Graph 8: EV Stocks (2010-2023)",
        labels={'year': 'Year', 'value': 'Electric Car Stocks (Millions)', 'region': 'Region'}
    )
    description = (
        "The graph depicts the growth of electric car stocks from 2010 to 2023 across major regions, with the 'World' category demonstrating "
        "the most substantial increase, surpassing 40 million vehicles. China and the EU27 exhibit steady growth, while the USA follows at a slower pace."
    )
    return fig, description

# Initialize Dash layout with list view
app = dash.Dash(__name__)
app.layout = html.Div([
    html.H1("Electric Vehicle Data Story", style={'textAlign': 'center'}),
    html.Div([
        html.Div([
            html.H3(f"Graph {i+1}"),
            dcc.Graph(figure=fig),
            html.P(description)
        ]) for i, (fig, description) in enumerate([
            graph_1(), graph_2(), graph_3(), graph_4(), graph_5(), graph_6(), graph_7(), graph_8()
        ])
    ])
])

# Run the app
if __name__ == '__main__':
    app.run_server(debug=True)
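
The filter, group, and rescale steps used in `graph_8` can be tried on a small frame before running them against the full dataset. The sketch below is illustrative only: `toy_df` and all of its numbers are made up, and it assumes the same long-format columns (`region`, `parameter`, `powertrain`, `year`, `value`) that the functions above rely on.

```python
import pandas as pd

# Made-up rows in the same long format as historical.csv (illustrative values only)
toy_df = pd.DataFrame({
    'region': ['World', 'World', 'World', 'World', 'USA', 'USA'],
    'parameter': ['EV stock'] * 6,
    'powertrain': ['BEV', 'PHEV', 'BEV', 'PHEV', 'BEV', 'PHEV'],
    'year': [2022, 2022, 2023, 2023, 2023, 2023],
    'value': [18_000_000, 8_000_000, 28_000_000, 12_000_000, 3_500_000, 1_300_000],
})

# Same three conditions as graph_8: powertrain, parameter, region
mask = (
    toy_df['powertrain'].isin(['BEV', 'PHEV'])
    & (toy_df['parameter'] == 'EV stock')
    & toy_df['region'].isin(['World', 'USA'])
)

# Sum BEV and PHEV rows so each (year, region) pair yields a single value
stock = toy_df[mask].groupby(['year', 'region'], as_index=False)['value'].sum()

# Rescale to millions for a readable axis
stock['value_millions'] = stock['value'] / 1_000_000
print(stock)
```

Summing the BEV and PHEV rows per year and region gives one point per line in the chart; without the `groupby`, `px.line` would receive two rows per year for each region.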


##Listview of Graphs 9-14 with Descriptions

In [None]:
import dash
from dash import dcc, html
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

# Load and preprocess data
historical_df = pd.read_csv('historical.csv')
projected_df = pd.read_csv('projected.csv')
historical_df['year'] = pd.to_datetime(historical_df['year'], format='%Y', errors='coerce').dt.year
projected_df['year'] = pd.to_datetime(projected_df['year'], format='%Y', errors='coerce').dt.year

# Graph 9: Historical Chinese EV Sales
def graph_9():
    chinese_sales = historical_df[historical_df['region'] == 'China']
    filtered_sales = chinese_sales[chinese_sales['parameter'] == 'EV sales']
    sales_by_powertrain = filtered_sales.groupby(['year', 'powertrain'])['value'].sum().reset_index()
    fig = px.line(
        sales_by_powertrain,
        x='year',
        y='value',
        color='powertrain',
        title='Graph 9: Historical Chinese EV Sales',
        labels={'value': 'Sales', 'year': 'Year', 'powertrain': 'Powertrain Type'}
    )
    description = (
        "The historical data reveals substantial growth in Battery Electric Vehicle (BEV) sales from 2010 to 2023, "
        "which significantly outpace those of Plug-in Hybrid Electric Vehicles (PHEVs) and Fuel Cell Electric Vehicles (FCEVs). "
        "Notably, BEV sales surged after 2020, underscoring China's strategic focus on fully electric powertrains."
    )
    return fig, description

# Graph 10: Projected Chinese EV Sales
def graph_10():
    chinese_sales_proj = projected_df[projected_df['region'] == 'China']
    filtered_sales_proj = chinese_sales_proj[chinese_sales_proj['parameter'] == 'EV sales']
    sales_by_powertrain_proj = filtered_sales_proj.groupby(['year', 'powertrain'])['value'].sum().reset_index()
    fig = px.line(
        sales_by_powertrain_proj,
        x='year',
        y='value',
        color='powertrain',
        title='Graph 10: Projected Chinese EV Sales',
        labels={'value': 'Sales', 'year': 'Year', 'powertrain': 'Powertrain Type'}
    )
    description = (
        "Projections for 2024 through 2035 indicate a continued emphasis on BEVs, with annual sales projected to exceed 20 million units by 2035. "
        "PHEVs are expected to peak around 2025 before gradually declining. FCEVs are anticipated to maintain a minimal presence, "
        "suggesting China's long-term electric vehicle strategy will remain centered on BEVs."
    )
    return fig, description

# Graph 11: Powertrain Distribution (2023)
def graph_11():
    selected_regions = ['USA', 'United Kingdom', 'China', 'Europe', 'India']
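    # Restrict to 'EV sales' rows so the bars match the 'Sales' axis label
    filtered_data = historical_df[
        (historical_df['region'].isin(selected_regions)) & (historical_df['year'] == 2023) &
        (historical_df['parameter'] == 'EV sales')
    ]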
    powertrain_distribution = filtered_data.groupby(['region', 'powertrain'])['value'].sum().reset_index()
    fig = px.bar(
        powertrain_distribution,
        x='powertrain',
        y='value',
        color='powertrain',
        facet_col='region',
        title="Graph 11: Powertrain Distribution in Selected Regions (2023)",
        labels={'powertrain': 'Powertrain Type', 'value': 'Sales'}
    )
    description = (
        "The graph highlights the dominance of Battery Electric Vehicles (BEVs) across selected regions in 2023, with China leading significantly in total sales volume. "
        "Plug-in Hybrid Electric Vehicles (PHEVs) demonstrate notable sales in Europe and India, while Fuel Cell Electric Vehicles (FCEVs) show minimal presence globally."
    )
    return fig, description

# Graph 12: Correlation Heatmap
def graph_12():
    selected_regions = ['USA', 'United Kingdom', 'China', 'Europe', 'India']
    ev_sales_data = historical_df[
        (historical_df['parameter'] == 'EV sales') & (historical_df['region'].isin(selected_regions))
    ]
    heatmap_data = ev_sales_data.pivot_table(index='region', columns='year', values='value', aggfunc='sum')
    fig = go.Figure(data=go.Heatmap(
        z=heatmap_data.values,
        x=heatmap_data.columns,
        y=heatmap_data.index,
        colorscale='Viridis',
        colorbar=dict(title='EV Sales')
    ))
    fig.update_layout(
        title='Graph 12: EV Sales by Region and Year',
        xaxis_title='Year',
        yaxis_title='Region'
    )
    description = (
        "The heatmap shows total EV sales for each region and year; color encodes sales volume rather than a correlation coefficient. "
        "China's sales grow far faster than those of the other regions, making it the dominant market, while the USA and Europe show steady "
        "but comparatively moderate increases."
    )
    return fig, description

# Graph 13: Global Distribution of EV Sales Share (2023)
def graph_13():
    filtered_data = historical_df[
        (historical_df['year'] == 2023) & (historical_df['parameter'] == 'EV sales share')
    ]
    sales_by_region = filtered_data.groupby('region')['value'].sum().reset_index()
    fig = px.choropleth(
        sales_by_region,
        locations='region',
        locationmode='country names',
        color='value',
        color_continuous_scale='Viridis',
        title='Graph 13: Global Distribution of EV Sales Share (2023)',
        labels={'value': 'EV Sales Share (%)'}
    )
    description = (
        "The map illustrates the global distribution of EV sales share in 2023, with Northern Europe, including countries like Norway and Sweden, "
        "showing the highest adoption percentages. China and parts of Europe exhibit substantial sales penetration, while regions such as South America "
        "and Africa reflect relatively lower EV adoption rates."
    )
    return fig, description

# Graph 14: EV Sales vs Stock (2023)
def graph_14():
    filtered_df = historical_df[
        (historical_df['year'] == 2023) & (historical_df['parameter'].isin(['EV sales', 'EV stock'])) &
        (historical_df['region'].isin(['USA', 'United Kingdom', 'China', 'India', 'Europe']))
    ]
    aggregated_df = filtered_df.groupby(['region', 'parameter'])['value'].sum().reset_index()
    fig = px.scatter(
        aggregated_df,
        x='region',
        y='value',
        size='value',
        color='parameter',
        title='Graph 14: EV Sales vs. Stock (2023)',
        labels={'region': 'Region', 'value': 'Value'}
    )
    description = (
        "The graph provides a comparative analysis of EV sales and stock in 2023 across key regions, with China demonstrating "
        "the highest values in both metrics. Europe and the USA also show significant contributions, albeit at a lower scale."
    )
    return fig, description

# Initialize Dash layout with list view
app = dash.Dash(__name__)
app.layout = html.Div([
    html.H1("Electric Vehicle Data Story - List View (Graphs 9-14)", style={'textAlign': 'center'}),
    html.Div([
        html.Div([
            html.H3(f"Graph {i+9}"),
            dcc.Graph(figure=fig),
            html.P(description)
        ]) for i, (fig, description) in enumerate([
            graph_9(), graph_10(), graph_11(), graph_12(), graph_13(), graph_14()
        ])
    ])
])

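# Run the app on its own port so a page left open from an earlier cell's app
# cannot send its callback requests to this one
if __name__ == '__main__':
    app.run_server(debug=True, port=8051)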

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/dash/dash.py in dispatch(self=<dash.dash.Dash object>)
   1241         try:
-> 1242             cb = self.callback_map[output]
        cb = undefined
        self.callback_map = {}
        output = 'graph-output.figure'
   1243             func = cb["callback"]

KeyError: 'graph-output.figure'

The above exception was the direct cause of the following exception:

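The `KeyError: 'graph-output.figure'` above reports `callback_map = {}`, meaning the app that received the request registers no callbacks, and none of the layouts shown here define a `graph-output` component. One plausible cause, offered as an assumption rather than a confirmed diagnosis, is that a page from a previously run app was still open on the same port and kept sending its callback requests to the new app. A minimal sketch of giving each cell's app its own port, using only the standard library (`find_free_port` is a hypothetical helper, not part of Dash):

```python
import socket

def find_free_port(host="127.0.0.1"):
    """Ask the operating system for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # getsockname() returns (host, port); the OS has filled in the port
        return sock.getsockname()[1]

port = find_free_port()
print(f"Serving this cell's app on port {port}")

# Hypothetical usage in the cell above:
# app.run_server(debug=True, port=port)
```

The port is released before the server starts, so another process could in principle claim it in between; for a notebook on a single machine that window is usually negligible.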

##Machine Learning: Random Forest, XGBoost, and TensorFlow

In [None]:
import dash
from dash import dcc, html
import pandas as pd
import plotly.graph_objects as go
import numpy as np
from dash.dependencies import Input, Output

# Load datasets
historical_df = pd.read_csv('historical.csv')
projected_df = pd.read_csv('projected.csv')

# Build EV stock share (%) summaries from the 'EV stock share' rows of each dataset.
# Shares are percentages, so duplicate region/year rows (if any) are averaged
# rather than summed.
def stock_share_summary(df, name):
    share_rows = df[df['parameter'] == 'EV stock share']
    return share_rows.groupby(['region', 'year'])['value'].mean().reset_index(name=name)

# Prepare data for combined dashboard
historical_summary = stock_share_summary(historical_df, 'Historical')
projected_summary = stock_share_summary(projected_df, 'Projected')

# Placeholder predictions: hard-coded values, not output from fitted Random Forest/XGBoost models
predicted_rf = pd.DataFrame({'year': [2025, 2030, 2035], 'Predicted_RF': [10, 20, 30], 'region': ['World'] * 3})
predicted_xgb = pd.DataFrame({'year': [2025, 2030, 2035], 'Predicted_XGB': [15, 25, 35], 'region': ['World'] * 3})

comparison_df = historical_summary.merge(projected_summary, on=['region', 'year'], how='outer')
comparison_df = comparison_df.merge(predicted_rf, on=['region', 'year'], how='outer')
comparison_df = comparison_df.merge(predicted_xgb, on=['region', 'year'], how='outer')

# Simulated ML model results for TensorFlow
history = {
    'loss': np.random.rand(20),  # Simulated training loss
    'val_loss': np.random.rand(20) * 0.9  # Simulated validation loss
}

y_test = np.random.rand(50)
X_test_scaled = np.random.rand(50, 1)  # Simulated scaled test set
model = lambda x: x * 0.8 + 0.1  # Simulated model

# Initialize Dash app
app = dash.Dash(__name__)

# Layout
app.layout = html.Div([
    html.H1("Machine Learning Model Visualizations", style={'textAlign': 'center'}),

    # Random Forest and XGBoost: Combined EV Stock Share Chart
    html.Div([
        html.H3("Chart 1: Random Forest and XGBoost Predictions for EV Stock Share (2010-2035)", style={'textAlign': 'center'}),
        dcc.Dropdown(
            id='region-dropdown',
            options=[{'label': region, 'value': region} for region in comparison_df['region'].unique()],
            value='World',
            placeholder="Select a Region",
            clearable=False
        ),
        dcc.Graph(id='ev-stock-graph')
    ]),

    html.Hr(),

    # TensorFlow Chart 1: Training vs Validation Loss
    html.Div([
        html.H3("Chart 2: TensorFlow - Training vs Validation Loss", style={'textAlign': 'center'}),
        dcc.Graph(
            figure=go.Figure()
                .add_trace(go.Scatter(
                    x=np.arange(len(history['loss'])),
                    y=history['loss'],
                    mode='lines',
                    name='Training Loss'
                ))
                .add_trace(go.Scatter(
                    x=np.arange(len(history['val_loss'])),
                    y=history['val_loss'],
                    mode='lines',
                    name='Validation Loss'
                ))
                .update_layout(
                    title='Training vs Validation Loss',
                    xaxis_title='Epoch',
                    yaxis_title='Loss',
                    template='plotly_white'
                )
        )
    ]),

    html.Hr(),

    # TensorFlow Chart 2: Actual vs Predicted Values
    html.Div([
        html.H3("Chart 3: TensorFlow - Actual vs Predicted Values", style={'textAlign': 'center'}),
        dcc.Graph(
            figure=go.Figure()
                .add_trace(go.Scatter(
                    x=y_test,
                    y=model(X_test_scaled).flatten(),
                    mode='markers',
                    name='Predicted'
                ))
                .add_trace(go.Scatter(
                    x=[min(y_test), max(y_test)],
                    y=[min(y_test), max(y_test)],
                    mode='lines',
                    name='Perfect Prediction',
                    line=dict(color='red', dash='dash')
                ))
                .update_layout(
                    title='Actual vs Predicted Values',
                    xaxis_title='Actual Values',
                    yaxis_title='Predicted Values',
                    template='plotly_white'
                )
        )
    ]),

    html.Hr(),

    # TensorFlow Chart 3: Residual Distribution
    html.Div([
        html.H3("Chart 4: TensorFlow - Residual Distribution", style={'textAlign': 'center'}),
        dcc.Graph(
            figure=go.Figure()
                .add_trace(go.Histogram(
                    x=y_test - model(X_test_scaled).flatten(),
                    nbinsx=30,
                    name='Residuals'
                ))
                .update_layout(
                    title='Residual Distribution',
                    xaxis_title='Residuals',
                    yaxis_title='Frequency',
                    template='plotly_white'
                )
        )
    ])
])

# Callbacks
@app.callback(
    Output('ev-stock-graph', 'figure'),
    [Input('region-dropdown', 'value')]
)
def update_ev_stock_graph(selected_region):
    # Filter data for selected region
    filtered_data = comparison_df[comparison_df['region'] == selected_region]
    fig = go.Figure()
    fig.add_trace(go.Scatter(
        x=filtered_data['year'],
        y=filtered_data['Historical'],
        mode='lines+markers',
        name='Historical',
        line=dict(color='blue', dash='solid')
    ))
    fig.add_trace(go.Scatter(
        x=filtered_data['year'],
        y=filtered_data['Projected'],
        mode='lines+markers',
        name='Projected',
        line=dict(color='red', dash='dot')
    ))
    fig.add_trace(go.Scatter(
        x=filtered_data['year'],
        y=filtered_data['Predicted_RF'],
        mode='lines+markers',
        name='Predicted (Random Forest)',
        line=dict(color='green', dash='dash')
    ))
    fig.add_trace(go.Scatter(
        x=filtered_data['year'],
        y=filtered_data['Predicted_XGB'],
        mode='lines+markers',
        name='Predicted (XGBoost)',
        line=dict(color='orange', dash='dashdot')
    ))
    fig.update_layout(
        title=f"EV Stock Share in {selected_region} (2010-2035)",
        xaxis_title="Year",
        yaxis_title="EV Stock Share (%)",
        xaxis=dict(range=[2010, 2035]),
        template='plotly_white',
        legend_title="Data Type"
    )
    return fig

# Run app
if __name__ == '__main__':
    app.run_server(debug=True, port=8060)
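
Charts 3 and 4 compare actual and predicted values visually. Once real model output replaces the simulated arrays, the same comparison can also be summarized numerically. A minimal sketch in plain Python, assuming two equal-length lists of actual and predicted values; `regression_metrics` is a hypothetical helper and the numbers passed to it are made up for illustration:

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE, RMSE and R^2 for two equal-length lists of numbers."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and the same length")

    # Residuals use the same sign convention as Chart 4: actual minus predicted
    residuals = [a - p for a, p in zip(actual, predicted)]
    n = len(residuals)

    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)

    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(r * r for r in residuals)
    # R^2 is undefined when all actual values are identical
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")

    return {"mae": mae, "rmse": rmse, "r2": r2}

metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
print(metrics)
```

With NumPy arrays such as `y_test`, converting with `.tolist()` first keeps the helper dependency-free.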


##All Code - Plotly Dash

In [None]:
import dash
from dash import dcc, html
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
from dash.dependencies import Input, Output

app = dash.Dash(__name__)
server = app.server

# Load and preprocess data
historical_df = pd.read_csv('historical.csv')
projected_df = pd.read_csv('projected.csv')
historical_df['year'] = pd.to_datetime(historical_df['year'], format='%Y', errors='coerce').dt.year
projected_df['year'] = pd.to_datetime(projected_df['year'], format='%Y', errors='coerce').dt.year

# Simulated ML Model Data
history = {'loss': np.random.rand(20), 'val_loss': np.random.rand(20) * 0.9}
y_test = np.random.rand(50)
X_test_scaled = np.random.rand(50, 1)
model = lambda x: x * 0.8 + 0.1
predicted_rf = pd.DataFrame({'year': [2025, 2030, 2035], 'value': [10, 20, 30], 'region': ['World'] * 3})
predicted_xgb = pd.DataFrame({'year': [2025, 2030, 2035], 'value': [15, 25, 35], 'region': ['World'] * 3})

#Graph 1: Share of New Cars Sold That Are Electric (2010-2023)
def graph_1():
    selected_regions = ['World', 'Norway', 'United Kingdom', 'EU27', 'China', 'USA']
    filtered_data = historical_df[
        (historical_df['parameter'] == 'EV sales share') & (historical_df['region'].isin(selected_regions))
    ]

    fig = px.line(
        filtered_data,
        x='year',
        y='value',
        color='region',
        facet_col='region',
        facet_col_wrap=3,
        title="Graph 1: Share of New Cars Sold That Are Electric (2010-2023)",
        labels={'value': 'Share of New Cars Sold (Electric)', 'year': 'Year'},
        markers=True,
        category_orders={'region': ['World', 'Norway', 'United Kingdom', 'EU27', 'China', 'USA']}
    )

    # Update layout for customizations
    fig.update_layout(
        xaxis_title='Year',
        yaxis_title='Share of New Cars Sold (Electric)',
        legend_title_text='Region',
        height=800,
    )

    # Ensure years are formatted and facet titles are clean
    fig.for_each_xaxis(lambda xaxis: xaxis.update(
        tickangle=45,
        tickvals=[2010, 2014, 2016, 2018, 2020, 2023],
        ticktext=['2010', '2014', '2016', '2018', '2020', '2023']
    ))

    description = (
        "This chart illustrates the share of new cars sold that are electric in various regions from 2010 to 2023. "
        "Norway consistently leads the adoption, while the USA lags behind."
    )
    return fig, description

# Graph 2: EV Sales Share (2023)
def graph_2():
    selected_regions = ['Norway', 'Sweden', 'China', 'United Kingdom', 'Germany', 'EU27', 'World', 'USA', 'India', 'South Africa']
    filtered_data = historical_df[
        (historical_df['parameter'] == 'EV sales share') & (historical_df['region'].isin(selected_regions))
    ]
    filtered_data_2023 = filtered_data[filtered_data['year'] == 2023]
    filtered_data_sorted = filtered_data_2023.sort_values(by='value', ascending=True)
    fig = px.bar(
        filtered_data_sorted,
        x='value',
        y='region',
        orientation='h',
        title='Graph 2: EV Sales Share (2023)',
        labels={'value': 'Share of New Cars Sold (Electric)', 'region': 'Region'},
        text='value'
    )
    description = (
        "This graph reinforces the pattern seen in Graph 1: Norway has the highest share of electric car sales, "
        "followed by Sweden, while the USA still trails the World average in 2023."
    )
    return fig, description


# Graph 3: BEV & PHEV Sales Share (2010-2023)
def graph_3():
    selected_regions = ['World', 'Norway', 'EU27', 'United Kingdom', 'China', 'USA']
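    # Use only 'EV sales' rows so the shares reflect sales, not stock or other parameters
    filtered_data = historical_df[
        (historical_df['parameter'] == 'EV sales') & historical_df['powertrain'].isin(['BEV', 'PHEV'])
    ]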
    pivot_df = filtered_data.pivot_table(index=['region', 'year'], columns='powertrain', values='value', aggfunc='sum')
    pivot_df = pivot_df.div(pivot_df.sum(axis=1), axis=0).reset_index()
    pivot_df = pivot_df[pivot_df['region'].isin(selected_regions)].melt(
        id_vars=['region', 'year'], value_vars=['BEV', 'PHEV'], var_name='powertrain', value_name='share'
    )
    fig = px.bar(
        pivot_df,
        x='year',
        y='share',
        color='powertrain',
        facet_col='region',
        title="Graph 3: BEV & PHEV Sales Share (2010-2023)",
        labels={"year": "Year", "share": "Share of EV Sales"},
    )
    description = "The world has had a slow and gradual adoption of EVs, while the UK has shown rapid adoption leading up to 2023."
    return fig, description

#Graph 4: Share of BEVs (2012-2023)
def graph_4():
    selected_regions = ['Norway', 'United Kingdom', 'World', 'China', 'Sweden']

    # Filter data for selected regions and EV sales parameter
    filtered_data = historical_df[
        (historical_df['parameter'] == 'EV sales') & historical_df['region'].isin(selected_regions)
    ]

    # Split data into BEV and total EV sales
    bev_data = filtered_data[filtered_data['powertrain'] == 'BEV']
    total_ev_data = filtered_data.groupby(['year', 'region'])['value'].sum().reset_index()

    # Calculate BEV share
    bev_share = bev_data.groupby(['year', 'region'])['value'].sum().reset_index()
    bev_share = pd.merge(bev_share, total_ev_data, on=['year', 'region'], suffixes=('_bev', '_total'))
    bev_share['share'] = bev_share['value_bev'] / bev_share['value_total'] * 100

    # Create line plot
    fig = px.line(
        bev_share,
        x='year',
        y='share',
        color='region',
        title="Graph 4: Share of BEVs (2012-2023)",
        labels={'year': 'Year', 'share': 'Percentage of BEVs'}
    )

    # Add description
    description = (
        "This graph focuses on the share of fully battery-electric vehicles (BEVs) as opposed to hybrids in select regions. "
        "It highlights the increasing adoption of BEVs over time."
    )

    return fig, description

#Graph 5: Number of EVs Sold (2023)
def graph_5():
    # Filter data for 2023 and EV sales
    filtered_data = historical_df[
        (historical_df['year'] == 2023) & (historical_df['parameter'] == 'EV sales')
    ]

    # Group data by region and aggregate sales
    sales_by_region = filtered_data.groupby('region')['value'].sum().reset_index()

    # Create a choropleth map
    fig = px.choropleth(
        sales_by_region,
        locations='region',
        locationmode='country names',
        color='value',
        color_continuous_scale='reds',
        title="Graph 5: Number of EVs Sold (2023)",
        labels={'value': 'Number of EVs Sold'}
    )

    # Update layout for better presentation
    fig.update_layout(
        title_x=0.5,
        geo=dict(
            showframe=False,
            showcoastlines=True,
            projection_type='equirectangular'
        ),
        coloraxis_colorbar=dict(
            title="EVs Sold",
            ticksuffix=' units',
            len=0.7
        )
    )

    description = (
        "This chart shows the total number of electric vehicles sold in 2023 across various regions. "
        "China leads the world in EV sales, which likely reflects its large population and strong government incentives."
    )
    return fig, description

#Graph 6: Number of New Cars Sold by Type (World, 2010–2023)
def graph_6():
    # Data provided
    data = {
        "year": list(range(2010, 2024)),
        "electric_cars": [
            7450, 49000, 120000, 201000, 330000, 550000, 760000, 1180000, 2060000,
            2080000, 2980000, 6600000, 10200000, 13800000
        ],
        "non_electric_cars": [
            67719820, 72009816, 74880000, 74243440, 80157810, 80332350, 83684450,
            83105710, 83773330, 77920000, 67972380, 67557304, 62657144, 62866668
        ]
    }

    # Create a DataFrame
    df = pd.DataFrame(data)

    # Create a stacked bar chart
    fig = go.Figure()

    # Add non-electric cars trace first so it forms the bottom of each stacked bar
    fig.add_trace(go.Bar(
        x=df['year'],
        y=df['non_electric_cars'],
        name='Non-Electric Cars',
        marker_color='#e66101'
    ))

    # Add electric cars trace second so it stacks on top of the non-electric segment
    fig.add_trace(go.Bar(
        x=df['year'],
        y=df['electric_cars'],
        name='Electric Cars',
        marker_color='#377eb8'
    ))

    # Update layout (each axis title is set once, inside its axis dict)
    fig.update_layout(
        title="Graph 6: Number of New Cars Sold by Type (World, 2010–2023)",
        barmode="stack",
        xaxis=dict(
            title="Year",
            tickmode="array",
            tickvals=df["year"],
            ticktext=[str(year) for year in df["year"]]
        ),
        yaxis=dict(
            title="Number of Cars Sold",
            tickformat=".1s"  # SI-suffixed tick labels, e.g. 80M
        ),
        legend_title="Car Type",
        template="plotly_white",
        title_x=0.5,
        plot_bgcolor="white"
    )

    # Description
    description = (
        "This stacked bar chart displays the total number of new cars sold globally from 2010 to 2023, "
        "categorized into electric cars and non-electric cars. While the number of electric cars sold "
        "has grown substantially, they remain a fraction of the total number of cars sold globally."
    )

    return fig, description

# Graph 7: Share of Cars in Use (2010-2023)
def graph_7():
    selected_regions = ['Norway', 'Sweden', 'China', 'World', 'USA']
    filtered_data = historical_df[
        (historical_df['parameter'] == 'EV stock share') & historical_df['region'].isin(selected_regions)
    ]
    fig = px.line(
        filtered_data,
        x='year',
        y='value',
        color='region',
        title="Graph 7: Share of Cars in Use (2010-2023)",
        labels={'value': 'Share of Cars in Use (Electric)', 'year': 'Year'}
    )
    description = (
        "This graph shows the growing share of EVs in use across different regions, with Nordic countries leading."
    )
    return fig, description

# Graph 8: EV Stocks (2010-2023)
def graph_8():
    selected_regions = ['World', 'China', 'EU27', 'USA']
    filtered_data = historical_df[
        ((historical_df['powertrain'] == 'BEV') | (historical_df['powertrain'] == 'PHEV')) &
        (historical_df['parameter'] == 'EV stock') & historical_df['region'].isin(selected_regions)
    ]
    ev_stock_by_year_region = filtered_data.groupby(['year', 'region'])['value'].sum().reset_index()
    ev_stock_by_year_region['value'] = ev_stock_by_year_region['value'] / 1_000_000
    fig = px.line(
        ev_stock_by_year_region,
        x='year',
        y='value',
        color='region',
        title="Graph 8: EV Stocks (2010-2023)",
        labels={'year': 'Year', 'value': 'Electric Car Stocks (Millions)', 'region': 'Region'}
    )
    description = (
        "The graph depicts the growth of electric car stocks from 2010 to 2023 across major regions, with the 'World' category demonstrating "
        "the most substantial increase, surpassing 40 million vehicles. China and the EU27 exhibit steady growth, while the USA follows at a slower pace."
    )
    return fig, description

# Graph 9: Historical Chinese EV Sales
def graph_9():
    chinese_sales = historical_df[historical_df['region'] == 'China']
    filtered_sales = chinese_sales[chinese_sales['parameter'] == 'EV sales']
    sales_by_powertrain = filtered_sales.groupby(['year', 'powertrain'])['value'].sum().reset_index()
    fig = px.line(
        sales_by_powertrain,
        x='year',
        y='value',
        color='powertrain',
        title='Graph 9: Historical Chinese EV Sales',
        labels={'value': 'Sales', 'year': 'Year', 'powertrain': 'Powertrain Type'}
    )
    description = (
        "The historical data reveals substantial growth in Battery Electric Vehicle (BEV) sales from 2010 to 2023, "
        "which significantly outpace those of Plug-in Hybrid Electric Vehicles (PHEVs) and Fuel Cell Electric Vehicles (FCEVs). "
        "Notably, BEV sales surged after 2020, underscoring China's strategic focus on fully electric powertrains."
    )
    return fig, description

# Graph 10: Projected Chinese EV Sales
def graph_10():
    chinese_sales_proj = projected_df[projected_df['region'] == 'China']
    filtered_sales_proj = chinese_sales_proj[chinese_sales_proj['parameter'] == 'EV sales']
    sales_by_powertrain_proj = filtered_sales_proj.groupby(['year', 'powertrain'])['value'].sum().reset_index()
    fig = px.line(
        sales_by_powertrain_proj,
        x='year',
        y='value',
        color='powertrain',
        title='Graph 10: Projected Chinese EV Sales',
        labels={'value': 'Sales', 'year': 'Year', 'powertrain': 'Powertrain Type'}
    )
    description = (
        "Projections for 2024 through 2035 indicate a continued emphasis on BEVs, with annual sales projected to exceed 20 million units by 2035. "
        "PHEVs are expected to peak around 2025 before gradually declining. FCEVs are anticipated to maintain a minimal presence, "
        "suggesting China's long-term electric vehicle strategy will remain centered on BEVs."
    )
    return fig, description

# Graph 11: Powertrain Distribution (2023)
def graph_11():
    selected_regions = ['USA', 'United Kingdom', 'China', 'Europe', 'India']
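    # Restrict to 'EV sales' rows so the bars match the 'Sales' axis label
    filtered_data = historical_df[
        (historical_df['region'].isin(selected_regions)) & (historical_df['year'] == 2023) &
        (historical_df['parameter'] == 'EV sales')
    ]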
    powertrain_distribution = filtered_data.groupby(['region', 'powertrain'])['value'].sum().reset_index()
    fig = px.bar(
        powertrain_distribution,
        x='powertrain',
        y='value',
        color='powertrain',
        facet_col='region',
        title="Graph 11: Powertrain Distribution in Selected Regions (2023)",
        labels={'powertrain': 'Powertrain Type', 'value': 'Sales'}
    )
    description = (
        "The graph highlights the dominance of Battery Electric Vehicles (BEVs) across selected regions in 2023, with China leading significantly in total sales volume. "
        "Plug-in Hybrid Electric Vehicles (PHEVs) demonstrate notable sales in Europe and India, while Fuel Cell Electric Vehicles (FCEVs) show minimal presence globally."
    )
    return fig, description

# Graph 12: Correlation Heatmap
def graph_12():
    selected_regions = ['USA', 'United Kingdom', 'China', 'Europe', 'India']
    ev_sales_data = historical_df[
        (historical_df['parameter'] == 'EV sales') & (historical_df['region'].isin(selected_regions))
    ]
    heatmap_data = ev_sales_data.pivot_table(index='region', columns='year', values='value', aggfunc='sum')
    fig = go.Figure(data=go.Heatmap(
        z=heatmap_data.values,
        x=heatmap_data.columns,
        y=heatmap_data.index,
        colorscale='Viridis',
        colorbar=dict(title='EV Sales')
    ))
    fig.update_layout(
        title='Graph 12: EV Sales by Region and Year',
        xaxis_title='Year',
        yaxis_title='Region'
    )
    description = (
        "The heatmap shows total EV sales for each region and year; color encodes sales volume rather than a correlation coefficient. "
        "China's sales grow far faster than those of the other regions, making it the dominant market, while the USA and Europe show steady "
        "but comparatively moderate increases."
    )
    return fig, description

# Graph 13: Global Distribution of EV Sales Share (2023)
def graph_13():
    filtered_data = historical_df[
        (historical_df['year'] == 2023) & (historical_df['parameter'] == 'EV sales share')
    ]
    sales_by_region = filtered_data.groupby('region')['value'].sum().reset_index()
    fig = px.choropleth(
        sales_by_region,
        locations='region',
        locationmode='country names',
        color='value',
        color_continuous_scale='Viridis',
        title='Graph 13: Global Distribution of EV Sales Share (2023)',
        labels={'value': 'EV Sales Share (%)'}
    )
    description = (
        "The map illustrates the global distribution of EV sales share in 2023, with Northern Europe, including countries like Norway and Sweden, "
        "showing the highest adoption percentages. China and parts of Europe exhibit substantial sales penetration, while regions such as South America "
        "and Africa reflect relatively lower EV adoption rates."
    )
    return fig, description

# Graph 14: EV Sales vs Stock (2023)
def graph_14():
    filtered_df = historical_df[
        (historical_df['year'] == 2023) & (historical_df['parameter'].isin(['EV sales', 'EV stock'])) &
        (historical_df['region'].isin(['USA', 'United Kingdom', 'China', 'India', 'Europe']))
    ]
    aggregated_df = filtered_df.groupby(['region', 'parameter'])['value'].sum().reset_index()
    fig = px.scatter(
        aggregated_df,
        x='region',
        y='value',
        size='value',
        color='parameter',
        title='Graph 14: EV Sales vs. Stock (2023)',
        labels={'region': 'Region', 'value': 'Value'}
    )
    description = (
        "The graph provides a comparative analysis of EV sales and stock in 2023 across key regions, with China demonstrating "
        "the highest values in both metrics. Europe and the USA also show significant contributions, albeit at a lower scale."
    )
    return fig, description

# Dash App Layout
app = dash.Dash(__name__)
app.layout = html.Div([
    html.H1("Electric Vehicle Data Story (2010-2035)", style={'textAlign': 'center'}),

    # Graphs Section: one titled block per figure, numbered in order
    html.Div([
        html.Div([
            html.H3(f"Graph {i+1}"),
            dcc.Graph(figure=fig),
            html.P(description)
        ]) for i, (fig, description) in enumerate([
            graph_1(), graph_2(), graph_3(), graph_4(), graph_5(), graph_6(), graph_7(),
            graph_8(), graph_9(), graph_10(), graph_11(), graph_12(), graph_13(), graph_14()
        ])
    ]),

    html.Hr(),

    # Combined Historical, Projected, Random Forest, XGBoost, and TensorFlow Predictions Chart
    html.Div([
        html.H3("Chart 1: EV Stock Share - Historical, Projected, Random Forest, XGBoost, and TensorFlow Predictions (2010-2035)", style={'textAlign': 'center'}),
        dcc.Dropdown(
            id='region-dropdown',
            options=[{'label': region, 'value': region} for region in historical_df['region'].unique()],
            value='World',
            placeholder="Select a Region",
            clearable=False
        ),
        dcc.Graph(id='ev-stock-graph')
    ]),

    html.Hr(),

    # Chart 2: TensorFlow training vs validation loss per epoch
    html.Div([
        html.H3("Chart 2: TensorFlow - Training vs Validation Loss", style={'textAlign': 'center'}),
        dcc.Graph(
            figure=go.Figure()
                .add_trace(go.Scatter(
                    x=np.arange(len(history['loss'])),
                    y=history['loss'],
                    mode='lines',
                    name='Training Loss'
                ))
                .add_trace(go.Scatter(
                    x=np.arange(len(history['val_loss'])),
                    y=history['val_loss'],
                    mode='lines',
                    name='Validation Loss'
                ))
                .update_layout(
                    title='Training vs Validation Loss',
                    xaxis_title='Epoch',
                    yaxis_title='Loss',
                    template='plotly_white'
                )
        )
    ]),

    html.Hr(),

    # Chart 3: TensorFlow actual vs predicted values
    html.Div([
        html.H3("Chart 3: TensorFlow - Actual vs Predicted Values", style={'textAlign': 'center'}),
        dcc.Graph(
            figure=go.Figure()
                .add_trace(go.Scatter(
                    x=y_test,
                    # np.asarray converts the model output (e.g. a tf.Tensor) to a NumPy array before flattening
                    y=np.asarray(model(X_test_scaled)).flatten(),
                    mode='markers',
                    name='Predicted'
                ))
                .add_trace(go.Scatter(
                    x=[min(y_test), max(y_test)],
                    y=[min(y_test), max(y_test)],
                    mode='lines',
                    name='Perfect Prediction',
                    line=dict(color='red', dash='dash')
                ))
                .update_layout(
                    title='Actual vs Predicted Values',
                    xaxis_title='Actual Values',
                    yaxis_title='Predicted Values',
                    template='plotly_white'
                )
        )
    ]),

    html.Hr(),

    # Chart 4: TensorFlow residual distribution (actual minus predicted)
    html.Div([
        html.H3("Chart 4: TensorFlow - Residual Distribution", style={'textAlign': 'center'}),
        dcc.Graph(
            figure=go.Figure()
                .add_trace(go.Histogram(
                    # Convert both sides to NumPy arrays so the subtraction is positional
                    x=np.asarray(y_test) - np.asarray(model(X_test_scaled)).flatten(),
                    nbinsx=30,
                    name='Residuals'
                ))
                .update_layout(
                    title='Residual Distribution',
                    xaxis_title='Residuals',
                    yaxis_title='Frequency',
                    template='plotly_white'
                )
        )
    ]),

    html.Hr(),

    # Conclusion Section
    html.Div([
        html.H3("Conclusion", style={'textAlign': 'center'}),
        html.P("""
            Between 2025 and 2030, global electricity demand is expected to surge by 225%, with China accounting for
            33% of the total demand, Europe for 23%, and the U.S. for 32%. However, from 2030 to 2035, growth in
            electricity demand will decelerate to 100%, with China's share decreasing slightly to 29%, Europe maintaining
            its 23%, and the U.S. increasing to 33%."""),

        html.P("""
            In parallel, the electric vehicle (EV) stock share is projected to grow by 176% between 2025 and 2030.
            During this period, China will lead with 31% EV adoption, followed by Europe at 18%, the U.S. at 17%, and the
            global average at 16%. From 2030 to 2035, EV stock share growth will moderate to 100%."""),

        html.P("""
            The trends in electricity demand and EV stock share reveal striking similarities. China is the dominant player
            in both sectors, driving rapid growth through 2030 before the pace slows. By 2035, the U.S. is expected to close
            the gap with Europe and China in EV stock share, a significant improvement from its lagging position in 2024.""")
    ])
])

@app.callback(
    Output('ev-stock-graph', 'figure'),
    [Input('region-dropdown', 'value')]
)
def update_ev_stock_graph(selected_region):
    # Filter historical data for the selected region.
    # .copy() returns an independent frame, so adding the 'source' column below
    # does not trigger pandas' SettingWithCopyWarning.
    historical_data = historical_df[(historical_df['parameter'] == 'EV stock share') &
                                     (historical_df['region'] == selected_region)].copy()

    # Filter projected data for the selected region (from IEA projections)
    projected_data = projected_df[(projected_df['region'] == selected_region) &
                                  (projected_df['parameter'] == 'EV stock share')].copy()

    # Select the Random Forest and XGBoost predictions for the selected region
    rf_predictions = predicted_rf[predicted_rf['region'] == selected_region].copy()
    xgb_predictions = predicted_xgb[predicted_xgb['region'] == selected_region].copy()

    # Label each table with its source so the plot can color lines by it
    historical_data['source'] = 'Historical'
    projected_data['source'] = 'Projected (IEA)'
    rf_predictions['source'] = 'Random Forest'
    xgb_predictions['source'] = 'XGBoost'

    # Combine all data into a single DataFrame for plotting
    combined_data = pd.concat([historical_data[['year', 'value', 'source']],
                               projected_data[['year', 'value', 'source']],
                               rf_predictions[['year', 'value', 'source']],
                               xgb_predictions[['year', 'value', 'source']]])

    # Create the line plot using Plotly
    fig = px.line(
        combined_data,
        x='year',
        y='value',
        color='source',
        title="EV Stock Share - Historical, Projected, and Model Predictions (2010-2035)",
        labels={'value': 'EV Stock Share (%)', 'year': 'Year'},
        line_shape='linear'
    )

    # Customize the layout
    fig.update_layout(
        xaxis_title="Year",
        yaxis_title="EV Stock Share (%)",
        template="plotly_white"
    )

    return fig


# Run the App
if __name__ == '__main__':
    app.run(debug=True)  # app.run replaces the deprecated app.run_server in recent Dash releases

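The callback above labels each table with a `source` column and then stacks the tables into one long-form DataFrame for `px.line`. A minimal sketch of that pattern on made-up numbers follows. The helper name `tag_and_combine` and the toy values are assumptions for illustration only and are not part of the project code.

```python
import pandas as pd


def tag_and_combine(frames):
    """Label each frame with its source name and stack them into one long table.

    `frames` maps a source label (str) to a DataFrame with 'year' and 'value' columns.
    """
    tagged = [
        # .assign returns a new frame, so the input tables are left unchanged
        df[['year', 'value']].assign(source=name)
        for name, df in frames.items()
    ]
    combined = pd.concat(tagged, ignore_index=True)
    # Sort so each source's line is drawn in chronological order
    return combined.sort_values(['source', 'year']).reset_index(drop=True)


# Hypothetical toy values, not taken from the IEA datasets
historical = pd.DataFrame({'year': [2023, 2022], 'value': [2.0, 1.0]})
projected = pd.DataFrame({'year': [2025, 2030], 'value': [4.0, 9.0]})

combined = tag_and_combine({'Historical': historical, 'Projected (IEA)': projected})
print(combined)
```

Because the labels are added with `.assign` on a column selection, the original tables never gain a `source` column, which sidesteps the copy-versus-view ambiguity that direct assignment on a filtered slice can raise.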