# Analyze the historical trends in automobile sales during recession periods

## Report components

1. Yearly Automobile Sales Statistics
  - Yearly average automobile sales using a line chart for the whole period.
  - For the chosen year, provide:
    - Total monthly automobile sales using a line chart.
    - Average monthly automobile sales of each vehicle type using a bar chart.
    - Total advertisement expenditure for each vehicle type using a pie chart.

2. Recession Period Statistics
  - Average automobile sales during the recession period using a line chart.
  - Average number of vehicles sold by vehicle type using a bar chart.
  - Total expenditure share by vehicle type during recessions using a pie chart.
  - Effect of unemployment rate on vehicle type and sales using a bar chart.

## Dataset variables


- *Date*: The date of the observation.
- *Recession*: A binary variable indicating a recession period; 1 means recession, 0 means normal.
- *Automobile_Sales*: The number of vehicles sold during the period.
- *GDP*: The per capita GDP value in USD.
- *Unemployment_Rate*: The monthly unemployment rate.
- *Consumer_Confidence*: A synthetic index representing consumer confidence, which can impact consumer spending and automobile purchases.
- *Seasonality_Weight*: The weight representing the seasonality effect on automobile sales during the period.
- *Price*: The average vehicle price during the period.
- *Advertising_Expenditure*: The advertising expenditure of the company.
- *Vehicle_Type*: The type of vehicles sold; Supperminicar, Smallfamiliycar, Mediumfamilycar, Executivecar, Sports.
- *Competition*: The measure of competition in the market, such as the number of competitors or market share of major manufacturers.
- *Month*: Month of the observation extracted from Date.
- *Year*: Year of the observation extracted from Date.
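
The derived columns and the grouped averages behind the report charts can be sketched with pandas. This is a minimal sketch on a tiny made-up frame: only the column names come from the list above, the values are illustrative, and the integer month format is an assumption (the real dataset may store month names instead).

```python
import pandas as pd

# Tiny illustrative frame; the values are made up, only the column names
# follow the dataset description above.
sample = pd.DataFrame(
    {
        "Date": ["1980-01-31", "1980-02-29", "1981-01-31", "1981-02-28"],
        "Recession": [1, 1, 0, 0],
        "Automobile_Sales": [100.0, 200.0, 300.0, 500.0],
        "Vehicle_Type": ["Sports", "Executivecar", "Sports", "Executivecar"],
    }
)

# Derive Year and Month from Date (integer months are an assumption).
sample["Date"] = pd.to_datetime(sample["Date"])
sample["Year"] = sample["Date"].dt.year
sample["Month"] = sample["Date"].dt.month

# Yearly average sales over the whole period (input for a line chart).
yearly_avg = sample.groupby("Year")["Automobile_Sales"].mean()

# Average sales per vehicle type during recession rows only (input for a bar chart).
recession = sample[sample["Recession"] == 1]
recession_by_type = recession.groupby("Vehicle_Type")["Automobile_Sales"].mean()
```

The same `groupby` pattern extends to the other report items, for example summing `Advertising_Expenditure` per `Vehicle_Type` for the pie charts.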

## Requirements to create the expected Dashboard

- Two dropdown menus: one for choosing the report type and one for choosing the year.
- Each dropdown is placed in its own division.
  - The second dropdown (for selecting the year) should be enabled only when the user selects “Yearly Statistics report” in the first dropdown; otherwise it should be disabled.


- Layout for adding graphs.
- Callback functions that return the graphs to the layout for display.
  - The first callback takes the report type as input and enables the year dropdown when “Yearly Statistics report” is selected; otherwise it disables the dropdown.
  - The second callback reads the report type and the year and returns the graphs required for that report.
- The four plots are displayed in a grid of 2 rows and 2 columns.
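
The requirements above can be sketched as two small pure helpers plus a function that wires them into a Dash app. This is a minimal sketch, not the project's actual `create_app`: the component ids (`report-type`, `select-year`, `output-container`), the helper names, and the placeholder chart divisions are all hypothetical. Dash is imported lazily inside `build_app` so the helpers can be exercised without it.

```python
YEARLY_REPORT = "Yearly Statistics report"
RECESSION_REPORT = "Recession Period Statistics"


def year_dropdown_disabled(report_type):
    """Return True when the year dropdown should be disabled."""
    return report_type != YEARLY_REPORT


def two_by_two(items):
    """Arrange exactly four items as two rows of two (the 2x2 plot grid)."""
    items = list(items)
    if len(items) != 4:
        raise ValueError("expected exactly four items")
    return [items[:2], items[2:]]


def build_app(years):
    """Build a minimal Dash app; `years` is the list of selectable years."""
    # Imported lazily so the helpers above stay usable without Dash installed.
    from dash import Dash, Input, Output, dcc, html

    app = Dash(__name__)
    app.layout = html.Div(
        [
            # Each dropdown sits in its own division.
            html.Div(
                dcc.Dropdown(
                    id="report-type",  # hypothetical id
                    options=[
                        {"label": r, "value": r}
                        for r in (YEARLY_REPORT, RECESSION_REPORT)
                    ],
                    placeholder="Select a report type",
                )
            ),
            html.Div(
                dcc.Dropdown(
                    id="select-year",  # hypothetical id
                    options=[{"label": str(y), "value": y} for y in years],
                    disabled=True,  # stays disabled until a yearly report is chosen
                )
            ),
            html.Div(id="output-container"),  # hypothetical id
        ]
    )

    # First callback: enable or disable the year dropdown.
    @app.callback(Output("select-year", "disabled"), Input("report-type", "value"))
    def toggle_year(report_type):
        return year_dropdown_disabled(report_type)

    # Second callback: return four charts laid out as 2 rows x 2 columns.
    @app.callback(
        Output("output-container", "children"),
        Input("report-type", "value"),
        Input("select-year", "value"),
    )
    def update_output(report_type, year):
        # Placeholders stand in for the four dcc.Graph components.
        charts = [html.Div(f"chart {i}") for i in range(1, 5)]
        return [html.Div(row, style={"display": "flex"}) for row in two_by_two(charts)]

    return app
```

Keeping the enable/disable rule in a plain function means it can be checked directly, for example `year_dropdown_disabled(None)` returns `True`, without starting a server.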

In [None]:
# Solution skeleton
# !wget https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMSkillsNetwork-DV0101EN-Coursera/labs/v4/Final_Project/DV0101EN-Final_Assign_Part_2_Questions.py

## Dependencies

```bash
python --version
Python 3.9
```

In [None]:
!python --version

In [None]:
# %pip install -q pandas plotly dash dash-bootstrap-components pyarrow python-dotenv
# Store the requirements:
# %pip freeze > requirements.txt
# TODO: Create a shell script that generates a Unix requirements.txt

import warnings
warnings.filterwarnings('ignore', category=FutureWarning)

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

### Imports

In [None]:
import pyarrow as pa
import pandas as pd

### Dataset

#### Data processing and save to parquet

In [None]:
from app.data_preprocessing import (
    data_preprocessing,
    url_remote_csv,
    filename_parquet,
    load_parquet
)

data_preprocessing(url_remote_csv, filename_parquet)

#### Read from parquet

In [None]:
data = load_parquet(filename_parquet)

In [None]:
#  Year, Month
data.head()

## Data Quality Checks

In [None]:
data.info(memory_usage="deep")

### Number of missing values

In [None]:
# Count missing values per column and keep only columns that have any.
na_counts = data.isna().sum()
missing = na_counts[na_counts > 0]
missing

### Static values

In [None]:
# Columns with a single unique value carry no information.
unique_counts = data.nunique()
static_vals = unique_counts[unique_counts == 1]
static_vals

### Rows with missing cells


In [None]:
rows_with_missing = data.loc[data.isna().any(axis=1), :]
rows_with_missing.head()

## EDA

In [None]:
data.describe(include="all")

## Dash App

In [None]:
from app import create_app

def run_app():
    df = load_parquet(filename_parquet)
    app = create_app(df)
    app.run_server(debug=True, host="localhost", port=8050)

# Run the Dash app
run_app()