# Analyze the historical trends in automobile sales during recession periods

## Report components

1. Yearly Automobile Sales Statistics
  - Yearly average automobile sales for the whole period, as a line chart.
  - For the chosen year:
    - Total monthly automobile sales, as a line chart.
    - Average monthly automobile sales of each vehicle type, as a bar chart.
    - Total advertising expenditure for each vehicle type, as a pie chart.

2. Recession Period Statistics
  - Average automobile sales during recession periods, as a line chart.
  - Average number of vehicles sold by vehicle type, as a bar chart.
  - Total advertising expenditure share by vehicle type during recessions, as a pie chart.
  - Effect of unemployment rate on vehicle type and sales, as a bar chart.
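
Most of the report items above reduce to a filter followed by a `groupby` aggregation. The following is a minimal sketch on a made-up toy frame (the values are invented for illustration and are not taken from the real dataset); it shows how the "average sales by vehicle type" and "expenditure share by vehicle type" figures for recession periods could be computed.

```python
import pandas as pd

# Toy data standing in for the real dataset; all values are invented.
toy = pd.DataFrame({
    "Recession": [1, 1, 1, 0],
    "Vehicle_Type": ["Sports", "Sports", "Executivecar", "Sports"],
    "Automobile_Sales": [100.0, 300.0, 50.0, 900.0],
    "Advertising_Expenditure": [10, 30, 60, 999],
})

# Keep only the observations flagged as recession periods.
recession = toy[toy["Recession"] == 1]

# Average number of vehicles sold per vehicle type (bar chart input).
avg_sales = recession.groupby("Vehicle_Type")["Automobile_Sales"].mean()

# Share of total advertising expenditure per vehicle type (pie chart input).
ad_total = recession.groupby("Vehicle_Type")["Advertising_Expenditure"].sum()
ad_share = ad_total / ad_total.sum()

print(avg_sales)
print(ad_share)
```

The non-recession row is excluded before aggregating, so it does not influence either result.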

## Dataset variables


- *Date*: The date of the observation.
- *Recession*: A binary variable indicating a recession period; 1 means recession, 0 means normal.
- *Automobile_Sales*: The number of vehicles sold during the period.
- *GDP*: The per capita GDP value in USD.
- *Unemployment_Rate*: The monthly unemployment rate.
- *Consumer_Confidence*: A synthetic index representing consumer confidence, which can impact consumer spending and automobile purchases.
- *Seasonality_Weight*: The weight representing the seasonality effect on automobile sales during the period.
- *Price*: The average vehicle price during the period.
- *Advertising_Expenditure*: The advertising expenditure of the company.
- *Vehicle_Type*: The type of vehicles sold; Supperminicar, Smallfamiliycar, Mediumfamilycar, Executivecar, Sports.
- *Competition*: The measure of competition in the market, such as the number of competitors or market share of major manufacturers.
- *Month*: Month of the observation extracted from Date.
- *Year*: Year of the observation extracted from Date.
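
The dataset already ships with *Month* and *Year*, but if they had to be rebuilt from *Date*, pandas' `.dt` accessor would do it. This is a small sketch on invented dates; it derives the month as an integer, which is an assumption, since the description above does not say whether the real column stores numbers or month names.

```python
import pandas as pd

# Invented dates for illustration only.
toy = pd.DataFrame(
    {"Date": pd.to_datetime(["1980-01-31", "1980-02-29", "1981-03-31"])}
)

# Derive calendar fields from the parsed Date column.
toy["Year"] = toy["Date"].dt.year
toy["Month"] = toy["Date"].dt.month  # integer month; the real column may differ

print(toy)
```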

## Requirements to create the expected Dashboard

- Two dropdown menus: one for choosing the report type and one for choosing the year.
- Each dropdown is placed in its own division.
  - The second dropdown (for selecting the year) should be enabled only when the user selects “Yearly Statistics report” from the first dropdown; otherwise it should be disabled.


- Layout for adding graphs.
- Callback functions that return the graphs to the layout for display.
  - The first callback takes the report type as input and enables the year dropdown when “Yearly Statistics report” is selected; otherwise the dropdown is disabled.
  - The second callback fetches the report type and year and returns the appropriate graphs for each type of report.
- The four plots are displayed in a grid of 2 rows and 2 columns.
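
The two behaviours described above (enabling the year dropdown, and arranging four plots in a 2x2 grid) can be expressed as plain functions that are easy to check without starting a Dash server. This is only a sketch: `year_dropdown_disabled` and `arrange_2x2` are hypothetical helper names, and the `YEARLY` constant is assumed to match whatever value the report-type dropdown uses.

```python
# Assumed to equal the dropdown value used for the yearly report.
YEARLY = "Yearly Statistics"


def year_dropdown_disabled(report_type):
    """Return the value for the year dropdown's `disabled` property."""
    # The year dropdown is usable only for the yearly report.
    return report_type != YEARLY


def arrange_2x2(charts):
    """Split exactly four charts into two rows of two."""
    if len(charts) != 4:
        raise ValueError("expected exactly four charts")
    return [charts[0:2], charts[2:4]]


print(year_dropdown_disabled("Recession Period Statistics"))
print(arrange_2x2(["c1", "c2", "c3", "c4"]))
```

In a Dash callback, the first function's result would be returned for the `disabled` property, and each inner list from the second would become one row of the layout.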

In [None]:
# Solution skeleton
# !wget https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMSkillsNetwork-DV0101EN-Coursera/labs/v4/Final_Project/DV0101EN-Final_Assign_Part_2_Questions.py

## Dependencies

```bash
python --version
Python 3.10.12
```

In [None]:
!python --version

In [None]:
%pip install -q pandas plotly dash dash-bootstrap-components pyarrow
# %pip freeze

import warnings
warnings.filterwarnings('ignore', category=FutureWarning)

### Imports

In [None]:
import dash
from dash import dcc
from dash import html
from dash.dependencies import Input, Output
import pandas as pd
import plotly.graph_objs as go
import plotly.express as px
import dash_bootstrap_components as dbc

### Dataset

In [None]:
import pyarrow as pa

categorical_cols = ['Vehicle_Type', 'City']

data = pd.read_csv(
    "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DV0101EN-SkillsNetwork/Data%20Files/historical_automobile_sales.csv",
    parse_dates=["Date"],
    dtype={col: "category" for col in categorical_cols}
)

# Converts the numeric and datetime data types to their PyArrow counterparts
data_py = data.convert_dtypes(dtype_backend="pyarrow")

# To convert categorical columns to PyArrow dictionary type
for col in categorical_cols:
    data_py[col] = data_py[col].astype(pd.ArrowDtype(pa.dictionary(pa.int16(), pa.string())))

data_py.to_parquet('data.parquet', engine='pyarrow')

In [None]:
#  Year, Month
data.head()

## Data Quality Checks

In [None]:
data.info(memory_usage='deep')

### Number of missing values

In [None]:
_ = data.isna().sum()
missing = _[_ > 0]
missing


### Static values

In [None]:
_ = data.nunique()
static_vals = _[_ == 1]
static_vals

### Rows with missing cells


In [None]:
rows_with_missing = data.loc[data.isna().any(axis=1), :]
rows_with_missing.head()
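
The three checks above (missing values, static columns, rows with missing cells) can also be gathered into one per-column summary. This is a minimal sketch on an invented toy frame; `quality_summary` is a hypothetical helper, not part of pandas.

```python
import pandas as pd


def quality_summary(df):
    """Per-column count of missing values, distinct values, and a static flag."""
    unique = df.nunique()  # NaN is not counted as a distinct value
    return pd.DataFrame({
        "missing": df.isna().sum(),
        "unique": unique,
        "is_static": unique == 1,
    })


# Invented data: column "a" has one gap, column "b" never changes.
toy = pd.DataFrame({"a": [1, None, 3], "b": ["x", "x", "x"]})
summary = quality_summary(toy)
print(summary)
```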

## EDA

In [None]:
data.describe(include='all')

## Dash App

In [None]:
%pip install -q pyngrok

import getpass
from pyngrok import ngrok, conf
ngrok.kill()
print("Enter your authtoken, which can be copied from https://dashboard.ngrok.com/get-started/your-authtoken")
NGROK_AUTH_TOKEN = getpass.getpass()
ngrok.set_auth_token(NGROK_AUTH_TOKEN)
ngrok.connect(8050)

In [None]:
import time
# Initialize the Dash app
app = dash.Dash(name=__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])

# Set the title of the dashboard
app.title = "Automobile Statistics Dashboard"

#---------------------------------------------------------------------------------

# Create the dropdown menu options
dropdown_options = [
  {'label': 'Yearly Statistics', 'value': 'Yearly Statistics'},
  {'label': 'Recession Period Statistics', 'value': 'Recession Period Statistics'}
]
# List of years
year_list = list(range(1980, 2024))

#---------------------------------------------------------------------------------------

# Create the layout of the app
app.layout = dbc.Container([
  # TASK 2.1 Add title to the dashboard
  dbc.Row([
    html.H1(app.title, style={'textAlign': 'center', 'color': '#503D36', 'fontSize': 24}), # Dash style keys are camelCase
  ]),
  # TASK 2.2: Add two dropdown menus
  dbc.Row([
    dbc.Col([
      html.Label("Select Statistics:"),
      dcc.Dropdown(
          id='dropdown-statistics',
          options=dropdown_options,
          value=None,  # no initial selection, so the placeholder is shown
          placeholder='Select a report type',
          searchable=False
      )
    ], width=9)
  ], justify='center'),
  dbc.Row([
    dbc.Col([
      html.Label("Select Year:"),
      html.Div(dcc.Dropdown(
        id='select-year',
        options=[{'label': i, 'value': i} for i in year_list],
        placeholder='Select a year',
        searchable=False,
      )),
    ], width=9)
  ], justify='center'),
  # TASK 2.3: Add a division for output display
  dbc.Container(id='output-container', className='chart-grid')
], fluid=True)

#TASK 2.4: Creating Callbacks
# Define the callback function to update the input container based on the selected statistics

@app.callback(
  Output(component_id='select-year', component_property='disabled'),
  Input(component_id='dropdown-statistics',component_property='value'))
def update_input_container(selected_statistics: str):
  if selected_statistics == dropdown_options[0]['value']:   # 'Yearly Statistics'
    return False
  return True

# Callback for plotting
# Define the callback function to update the output container based on the selected statistics and year
@app.callback(
    Output(component_id='output-container', component_property='children'),
    Input(component_id='dropdown-statistics', component_property='value'), 
    Input(component_id='select-year', component_property='value'))
def update_output_container(selected_statistics: str, input_year: int):
  if selected_statistics == dropdown_options[1]['value']:   # 'Recession Period Statistics'
    # Filter the data for recession periods
    recession_data = data[data['Recession'] == 1]

    #TASK 2.5: Create and display graphs for Recession Report Statistics

    #Plot 1 Automobile sales fluctuate over Recession Period (year wise)
    # use groupby to create relevant data for plotting
    sales_rec_by_year=recession_data.groupby('Year')['Automobile_Sales'].mean().reset_index()
    R_chart1 = dcc.Graph(
      figure=px.line(
        sales_rec_by_year, 
        x='Year', 
        y='Automobile_Sales', 
        title="Average Automobile Sales fluctuation over Recession Period"))

    #Plot 2 Calculate the average number of vehicles sold by vehicle type
    # use groupby to create relevant data for plotting
    sales_rec_by_veh = recession_data.groupby('Vehicle_Type')['Automobile_Sales'].mean().reset_index()
    R_chart2  = dcc.Graph(
      figure=px.bar(
        data_frame=sales_rec_by_veh,
        x='Vehicle_Type',
        y='Automobile_Sales'))

    # Plot 3 Pie chart for total expenditure share by vehicle type during recessions
    # use groupby to create relevant data for plotting
    exp_rec_by_veh = recession_data.groupby('Vehicle_Type')['Advertising_Expenditure'].sum().reset_index()
    R_chart3 = dcc.Graph(
      figure=px.pie(
        data_frame=exp_rec_by_veh,
        names='Vehicle_Type',
        values='Advertising_Expenditure'))

    # Plot 4 bar chart for the effect of unemployment rate on vehicle type and sales
    sales_rec_by_uemp_veh = recession_data.groupby(['unemployment_rate', 'Vehicle_Type'])['Automobile_Sales'].sum().reset_index()
    R_chart4 = dcc.Graph(
      figure=px.bar(
        sales_rec_by_uemp_veh,
        x='unemployment_rate',
        y='Automobile_Sales',
        color='Vehicle_Type',
        title='Vehicle-wise Sales by Unemployment Rate during Recessions'))

    return [
      dbc.Row(className='chart-item', children=[dbc.Col(children=R_chart1),dbc.Col(children=R_chart2)]),
      dbc.Row(className='chart-item', children=[dbc.Col(children=R_chart3),dbc.Col(children=R_chart4)])]

  # TASK 2.6: Create and display graphs for Yearly Report Statistics
  # Yearly Statistic Report Plots
  elif (input_year and selected_statistics == dropdown_options[0]['value']): # 'Yearly Statistics'
    yearly_data = data[data['Year'] == input_year]

    # #TASK 2.5: Creating Graphs Yearly data

    #plot 1 Yearly Automobile sales using line chart for the whole period.
    yas= data.groupby('Year')['Automobile_Sales'].mean().reset_index()
    Y_chart1 = dcc.Graph(
      figure=px.line(
        data_frame=yas,
        x='Year',
        y='Automobile_Sales'))

    # Plot 2 Total Monthly Automobile sales using line chart.
    mas = yearly_data.groupby('Month')['Automobile_Sales'].sum().reset_index()
    Y_chart2 = dcc.Graph(
      figure=px.line(
        data_frame=mas,
        x='Month',
        y='Automobile_Sales'))

    # Plot 3 Bar chart for average number of vehicles sold by vehicle type during the given year
    avr_vdata=yearly_data.groupby('Vehicle_Type')['Automobile_Sales'].mean().reset_index()
    Y_chart3 = dcc.Graph(
      figure=px.bar(
        data_frame=avr_vdata,
        x='Vehicle_Type',
        y='Automobile_Sales',
        title='Average Vehicles Sold by Vehicle Type in the year {}'.format(input_year)))

    # Total Advertisement Expenditure for each vehicle using pie chart
    exp_data=yearly_data.groupby('Vehicle_Type')['Advertising_Expenditure'].sum().reset_index()
    Y_chart4 = dcc.Graph(
       figure=px.pie(
          data_frame=exp_data,
          names='Vehicle_Type',
          values='Advertising_Expenditure'))

    # TASK 2.6: Returning the graphs for displaying Yearly data
    # return [
    #   html.Div(className='.........', children=[html.Div(....,html.Div(....)],style={...}),
    #   html.Div(className='.........', children=[html.Div(....),html.Div(....)],style={...})]

    return [
      dbc.Row(className='chart-item', children=[dbc.Col(children=Y_chart1),dbc.Col(children=Y_chart2)]),
      dbc.Row(className='chart-item', children=[dbc.Col(children=Y_chart3),dbc.Col(children=Y_chart4)])]

  else:
      return None

# Run the Dash app
if __name__ == '__main__':
    app.run()  # run_server() is deprecated in favor of run() in recent Dash releases