# Plotly Challenge
---

## Step #1
In a new Jupyter notebook, import pandas, matplotlib, plotly, and numpy.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# plotly.express is Plotly's high-level interface: one call builds a complete figure.
# -- Figures can be customized further with other Plotly modules (e.g. plotly.graph_objects).
import plotly.express as px

## Step #2
Use **pandas.read_csv** to access the provided dataset:
- Create a DataFrame by importing the “Fuel_Consumption_2000-2022.csv” file.
- *This source data already follows tidy data principles.*

In [2]:
# pull in the raw csv data
vehicle_data = pd.read_csv('raw/Fuel_Consumption_2000-2022.csv')

## Step #3
In a Markdown cell, indicate:
- How many rows of data are in the original source?

In [9]:
# read the 'shape' (or rows and columns) of the data
vehicle_data.shape

(22556, 12)

<div class="alert alert-block alert-info">
There are 22556 rows!  (and 12 columns)
</div>

## Step #4
To better sort and group the data, we’ll want to change some data types to ‘category’.

Using the .astype() method, change the following columns of your working data frame to ‘category’:
- ‘YEAR’, ‘VEHICLE CLASS’, & ‘MAKE’

In [10]:
# change the type of data in a few columns
# (start from vehicle_data, the DataFrame loaded in Step #2; 'data' is the working copy)
data = vehicle_data.astype({
    'YEAR': 'category',
    'VEHICLE CLASS': 'category',
    'MAKE': 'category'
})
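
As a minimal sketch of what the conversion above does, the example below applies the same `.astype()` call to a small made-up frame. The name `toy` and all of its values are illustrative assumptions, not rows from the real CSV.

```python
import pandas as pd

# Toy stand-in for the fuel consumption data (values are made up for illustration)
toy = pd.DataFrame({
    'YEAR': [2000, 2000, 2001, 2001],
    'MAKE': ['ACURA', 'BMW', 'ACURA', 'BMW'],
    'VEHICLE CLASS': ['COMPACT', 'SUV', 'COMPACT', 'SUV'],
    'COMB (mpg)': [29, 21, 30, 22],
})

# Passing a dict converts only the listed columns; other columns keep their dtype
toy = toy.astype({
    'YEAR': 'category',
    'VEHICLE CLASS': 'category',
    'MAKE': 'category'
})

# Inspect the resulting dtypes and the distinct categories found in 'YEAR'
print(toy.dtypes)
print(toy['YEAR'].cat.categories.tolist())
```

Only the three listed columns become categorical, so numeric columns such as 'COMB (mpg)' remain available for statistics like the mean.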

## Step #5
Store the output of a .groupby() followed by .describe() in its own variable, using the updated data frame from [Step #4](#Step-#4).
- Use ‘YEAR’ and ‘VEHICLE CLASS’ as your group by criteria in the above command.

In [13]:
# group the data by year and class (new variable)
y_vc_group = data.groupby(['YEAR', 'VEHICLE CLASS'])

# describe the grouped data (new variable, so the grouped data is still available in case we need it later)
y_vc_group_desc = y_vc_group.describe()
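
The shape of a grouped `.describe()` result can be surprising, so here is a small sketch on made-up data. The names `toy`, `toy_desc`, and `mean_mpg` are hypothetical. The `observed=True` argument is an assumption added for this sketch and is not used in the cell above; with categorical keys it keeps only the year/class pairs that actually occur.

```python
import pandas as pd

# Toy stand-in for the working data frame (values are made up for illustration)
toy = pd.DataFrame({
    'YEAR': [2000, 2000, 2000, 2001, 2001, 2001],
    'VEHICLE CLASS': ['COMPACT', 'COMPACT', 'SUV', 'COMPACT', 'SUV', 'SUV'],
    'COMB (mpg)': [28, 30, 21, 31, 20, 24],
}).astype({'YEAR': 'category', 'VEHICLE CLASS': 'category'})

# Group by both keys; observed=True drops category pairs with no rows
toy_desc = toy.groupby(['YEAR', 'VEHICLE CLASS'], observed=True).describe()

# The rows are indexed by (YEAR, VEHICLE CLASS); the columns are a two-level
# MultiIndex of (original column, statistic), so one statistic is picked with a tuple
mean_mpg = toy_desc[('COMB (mpg)', 'mean')]
print(mean_mpg)
```

Selecting with the tuple `('COMB (mpg)', 'mean')` returns the same Series as the chained form `toy_desc['COMB (mpg)']['mean']`.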

## Step #6
With the variable created in [Step #5](#Step-#5), in a new code cell, use Plotly to create a line graph showing the mean of ‘COMB (mpg)’ for each ‘VEHICLE CLASS’ across ‘YEAR’.
- Label the axes ‘YEAR’ (x-axis) and ‘Combined MPG Average’ (y-axis)
- Set the markers value as ‘True’ (line markers)
- Title the figure ‘Average Combined MPG per Vehicle Class per Year’

In [15]:
# list comprehensions: each index entry is a (YEAR, VEHICLE CLASS) tuple, so pull each part into its own list
Year = [x[0] for x in y_vc_group_desc.index]
VehicleClass = [x[1] for x in y_vc_group_desc.index]

In [17]:
# convert the series to a list (even though express will allow series data to be read)
CombineMPG = y_vc_group_desc['COMB (mpg)']['mean'].values.tolist()
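
As a possible alternative to building three parallel lists, the sketch below flattens the grouped means into one tidy frame on made-up data. The names `toy`, `toy_desc`, and `plot_df` are hypothetical, and `observed=True` is an assumption of this sketch. It also shows that `get_level_values` yields the same values as the list comprehension.

```python
import pandas as pd

# Toy stand-in for the working data frame (values are made up for illustration)
toy = pd.DataFrame({
    'YEAR': [2000, 2000, 2000, 2001, 2001, 2001],
    'VEHICLE CLASS': ['COMPACT', 'COMPACT', 'SUV', 'COMPACT', 'SUV', 'SUV'],
    'COMB (mpg)': [28, 30, 21, 31, 20, 24],
}).astype({'YEAR': 'category', 'VEHICLE CLASS': 'category'})

toy_desc = toy.groupby(['YEAR', 'VEHICLE CLASS'], observed=True).describe()

# Same result as the list comprehension: one YEAR value per index entry
years_loop = [x[0] for x in toy_desc.index]
years_level = toy_desc.index.get_level_values('YEAR').tolist()

# reset_index turns both index levels into ordinary columns,
# giving one row per (YEAR, VEHICLE CLASS) pair with its mean MPG
plot_df = toy_desc['COMB (mpg)']['mean'].reset_index(name='Combined MPG Average')
print(plot_df)
```

A frame like `plot_df` can be passed to Plotly Express directly, with column names given for `x`, `y`, and `color` instead of separate lists.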

In [39]:
# create a plotly line chart figure
fig = px.line(
    x = Year,
    y = CombineMPG,
    color = VehicleClass,
    markers = True
)

# update the title, axes and legend title
fig.update_layout(
    title = 'Average Combined MPG per Vehicle Class per Year',
    xaxis_title = 'Year',
    yaxis_title = 'Combined MPG Average',
    legend_title = 'Vehicle Class'
)

# customize the hover text...
# %{color} was not accepted as a template variable here, so it probably has to be
# referenced some other way, but I couldn't find the documentation on how.
fig.update_traces(
    hovertemplate="<br>".join([
        "<b>Year</b>: %{x}",
        "<b>MPG Average</b>: %{y}"
    ])
)

# show the figure
fig.show()

## Step #7
In a Markdown cell, talk about which interactive features in Plotly you find most appealing and why.

<div class="alert alert-block alert-info">
Plotly is a really interesting framework, and it gave me a lot of ideas for my final project. Right now I'm leaning toward tracking impact data related to my field (well, my company) across the country. Plotly's native hover and interaction features (toggling legend entries on and off, etc.) make it easy to drill down into larger chunks of data, which will help us better represent the story we're trying to tell. In my current role I work on the web, where dropdowns and sliders for moving into and out of a data set are not only commonplace but preferred. Combined with the number of mapping visualizations Plotly offers, that could make it a good choice for what I'm trying to accomplish. It appeals to me not only for its simplicity, but also for its real-world application.
</div>