# Storytelling with Custom Visualizations

In this activity, you’ll enhance a basic plot by using the visualization options that the hvPlot library makes available.

Instructions:

1. Review the code to import the required libraries and to generate the DataFrame that you’ll use for your story and visualization. You’ll work with the `average_sales_per_date_2019` DataFrame, which consists of the **average** home sales per day for the first three months of the year.

2. Use the `hvplot` function to generate a default line plot.

3. To observe the data from a different perspective, plot a bar chart by using the `hvplot` function. Assign values to the x- and y-axes by using the `hvplot.bar(x='saleDate', y='saleAmt')` syntax.

4. Using the code from the previous plot, add the `rot` parameter to rotate the x-axis labels by 90 degrees. Then create appropriate labels for the x- and y-axes: `xlabel="Sale Date"` and `ylabel="Average Sale Amount"`.

5. Using the code from the previous plot, call `opts` with the `yformatter` option to redisplay the y-axis labels as whole numbers (with zero decimal places). Given the values of the average home sales, you don’t need decimal places for this visualization.

6. Using the code from the previous plot, add the `title` parameter to give the visualization a descriptive title.

7. Using the code from the previous plot, add the `invert_axes` option to invert the x- and y-axes.

    >**Hint** You need to adjust the values of the x- and y-axes labels and change `yformatter` to `xformatter` to accommodate the axes inversion.

8. Using the code from the previous plot, add a dynamic visual element to the plot by incorporating the `hover_color` parameter and assigning it a value of `"orange"`.

9. Based on your enhanced visualization, compose your version of the data story for the trend of average home sale prices over the first three months of 2019.

References:

[hvPlot Customization page](https://hvplot.holoviz.org/user_guide/Customization.html)

[HoloViews Styling Mapping page](http://holoviews.org/user_guide/Style_Mapping.html)


## Step 1: Review the code to import the required libraries and to generate the DataFrame that you’ll use for your story and visualization. You’ll work with the `average_sales_per_date_2019` DataFrame, which consists of the **average** home sales per day for the first three months of the year.

In [1]:
# Import the required libraries and dependencies
import pandas as pd
from pathlib import Path
import hvplot.pandas

In [2]:
# Using the read_csv function and the Path module, read in the "housing_sale_data.csv" file
# and create a Pandas DataFrame
home_sale_prices_df = pd.read_csv(
    Path("../Resources/housing_sale_data.csv"), 
    index_col="salesHistoryKey"
)

# Review the first and last five rows of the DataFrame
display(home_sale_prices_df.head())
display(home_sale_prices_df.tail())

| salesHistoryKey | propertyKey | streetAddress | salesTypeDsc | saleAmt | saleDate |
|---|---|---|---|---|---|
| 117701 | 64736 | 1121 ARLINGTON BLVD 831 | O-Assignment of Lease | 275000 | 2019-06-24 |
| 270452 | 64190 | 989 S BUCHANAN ST 418 | B-Not Previously Assessed | 565000 | 2019-06-21 |
| 117663 | 23057 | 1121 ARLINGTON BLVD 820 | O-Assignment of Lease | 165000 | 2019-06-20 |
| 117485 | 23019 | 1121 ARLINGTON BLVD 717 | O-Assignment of Lease | 171900 | 2019-06-20 |
| 86768 | 53495 | 3800 FAIRFAX DR 1-83 | C-Condo Parking Space | 34000 | 2019-06-17 |


| salesHistoryKey | propertyKey | streetAddress | salesTypeDsc | saleAmt | saleDate |
|---|---|---|---|---|---|
| 163323 | 52647 | 3817 16th ST S | 1-Foreclosure, Auction, Bankru | 165000 | 1999-10-07 |
| 115652 | 22633 | 1200 N NASH ST 1130 | C-Condo Parking Space | 5100 | 1999-10-07 |
| 194989 | 67748 | 3429 22nd ST S | 1-Foreclosure, Auction, Bankru | 178598 | 1999-10-07 |
| 88529 | 17508 | 900 N STAFFORD ST 1611 | 4-Multiple RPCs, Not A Coded S | 111350 | 1999-10-02 |
| 89926 | 17779 | 901 N STUART ST 4-202 | 4-Multiple RPCs, Not A Coded S | 111350 | 1999-10-02 |


In [3]:
# Slice the data to capture only the information from January to March 2019.
# The query call below is equivalent to the commented-out loc version,
# which uses conditional and logical operators.
# home_sales_2019 = home_sale_prices_df.loc[
#     (home_sale_prices_df["saleDate"] >= "2019-01-01")
#     & (home_sale_prices_df["saleDate"] <= "2019-03-31")
# ]
home_sales_2019 = home_sale_prices_df.query('saleDate >= "2019-01-01" and saleDate <= "2019-03-31"')

# Review the first and last five rows of the resulting DataFrame
display(home_sales_2019.head())
display(home_sales_2019.tail())

| salesHistoryKey | propertyKey | streetAddress | salesTypeDsc | saleAmt | saleDate |
|---|---|---|---|---|---|
| 63678 | 12688 | 6356 11th RD N | 5-Not Market Sale | 779950 | 2019-03-29 |
| 119124 | 23368 | 1111 ARLINGTON BLVD 822 | O-Assignment of Lease | 130000 | 2019-03-29 |
| 122208 | 24013 | 1021 ARLINGTON BLVD 1030 | O-Assignment of Lease | 250000 | 2019-03-29 |
| 56731 | 11285 | 1424 MCKINLEY RD | 5-Not Market Sale | 735000 | 2019-03-29 |
| 69377 | 13878 | 4810 9th ST N | E-Estate Sale | 1200000 | 2019-03-29 |


| salesHistoryKey | propertyKey | streetAddress | salesTypeDsc | saleAmt | saleDate |
|---|---|---|---|---|---|
| 269811 | 51531 | 6713 19th ST N | G-New Construction | 1628000 | 2019-01-08 |
| 86693 | 17183 | 3800 FAIRFAX DR 1-35 | 4-Multiple RPCs, Not A Coded S | 371000 | 2019-01-07 |
| 86100 | 17074 | 3800 FAIRFAX DR 1112 | 4-Multiple RPCs, Not A Coded S | 371000 | 2019-01-07 |
| 121653 | 23899 | 1021 ARLINGTON BLVD 722 | O-Assignment of Lease | 149500 | 2019-01-04 |
| 141340 | 57891 | 5608 1st ST S | 5-Not Market Sale | 729750 | 2019-01-04 |
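
The date filter above compares strings. That works here because `saleDate` is read from the CSV as text in ISO `YYYY-MM-DD` form, where alphabetical order matches chronological order. The following is a minimal sketch, using three rows copied from the outputs above, showing that `query` and `loc` select the same rows. The last part, which parses the dates with `pd.to_datetime`, is an optional alternative and not something the activity requires.

```python
import pandas as pd

# Three rows copied from the notebook outputs (only the two relevant columns)
sample = pd.DataFrame(
    {
        "saleAmt": [275000, 779950, 729750],
        "saleDate": ["2019-06-24", "2019-03-29", "2019-01-04"],
    }
)

# Approach used in the notebook: a query expression on the string column
with_query = sample.query('saleDate >= "2019-01-01" and saleDate <= "2019-03-31"')

# Equivalent boolean-mask version using loc
with_loc = sample.loc[
    (sample["saleDate"] >= "2019-01-01") & (sample["saleDate"] <= "2019-03-31")
]

# Optional alternative: parse the strings into datetimes before filtering
parsed = sample.assign(saleDate=pd.to_datetime(sample["saleDate"]))
with_between = parsed.loc[parsed["saleDate"].between("2019-01-01", "2019-03-31")]

print(with_query.equals(with_loc))
print(list(with_between["saleAmt"]))
```

Parsing the dates is mainly useful when the source file does not use a sortable date format.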


In [4]:
# Create a DataFrame containing the average home sale per day for the
# first three months of 2019 using the "saleAmt" and "saleDate" columns 
# Group by "saleDate" and then take the mean and sort by "saleDate" 
average_sales_per_date_2019 = (
    home_sales_2019[["saleAmt", "saleDate"]]
    .groupby("saleDate")
    .mean()
    .sort_values("saleDate")
)

# Review the resulting DataFrame
display(average_sales_per_date_2019.head())
display(average_sales_per_date_2019.tail())

| saleDate | saleAmt |
|---|---|
| 2019-01-04 | 439625.0 |
| 2019-01-07 | 371000.0 |
| 2019-01-08 | 1628000.0 |
| 2019-01-09 | 96000.0 |
| 2019-01-15 | 623500.0 |


| saleDate | saleAmt |
|---|---|
| 2019-03-22 | 556000.0 |
| 2019-03-25 | 990000.0 |
| 2019-03-26 | 167000.0 |
| 2019-03-28 | 840000.0 |
| 2019-03-29 | 618990.0 |
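
To see what the group-by step does, the sketch below repeats it on the five rows shown at the tail of `home_sales_2019`. It assumes those five rows are the only sales on their dates, which the matching averages in the output above suggest but do not prove.

```python
import pandas as pd

# The last five rows of home_sales_2019, copied from the output in Step 1
tail_rows = pd.DataFrame(
    {
        "saleAmt": [1628000, 371000, 371000, 149500, 729750],
        "saleDate": [
            "2019-01-08",
            "2019-01-07",
            "2019-01-07",
            "2019-01-04",
            "2019-01-04",
        ],
    }
)

# Same chain as the notebook: group by date, average, then sort by date
daily_mean = tail_rows.groupby("saleDate").mean().sort_values("saleDate")

print(daily_mean)
```

For 2019-01-04, the two sales of 149,500 and 729,750 average to 439,625, which matches the first row of `average_sales_per_date_2019`.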


## Step 2: Use the `hvplot` function to generate a default line plot.
  

In [5]:
# Use the hvplot function to generate a line plot of the sales data
# for January through March 2019. The labels, rotation, formatter, and
# title are optional additions to the default plot.
average_sales_per_date_2019.hvplot(
    xlabel='Days',
    rot=45,
    ylabel='Avg. sale price ($)',
    yformatter='%.0f',
    title='The avg. sale price varied wildly between $100k and $1.5m in Jan and Feb 2019'
)

## Step 3: To observe the data from a different perspective, plot a bar chart by using the `hvplot` function. Assign values to the x- and y-axes by using the `hvplot.bar(x='saleDate', y='saleAmt')` syntax.

In [6]:
# Plot bar chart of the sales data for the first 3 months of 2019 
# Specify the variables for the x- and y-axes using the syntax (x='saleDate', y='saleAmt')
bar_plot = average_sales_per_date_2019.hvplot.bar(x='saleDate',y='saleAmt')
bar_plot

## Step 4: Using the code from the previous plot, add the `rot` parameter to rotate the x-axis labels by 90 degrees. Then create appropriate labels for the x- and y-axes: `xlabel="Sale Date"` and `ylabel="Average Sale Amount"`.

In [7]:
# Using the code from the existing plot, include rotation of x-axis labels as well as labels for the x- and y-axes
bar_plot = average_sales_per_date_2019.hvplot.bar(
    x='saleDate',
    y='saleAmt',
    rot=90,
    xlabel='Sale Date',
    ylabel='Average Sale Amount'
)
bar_plot

## Step 5: Using the code from the previous plot, call `opts` with the `yformatter` option to redisplay the y-axis labels as whole numbers (with zero decimal places). Given the values of the average home sales, you don’t need decimal places for this visualization.

In [8]:
# Using the code from the existing plot, include a y-formatter that rounds the y-axis labels to whole numbers
bar_plot.opts(yformatter='%.0f')
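
The `'%.0f'` string is a printf-style format. Python's `%` operator follows the same convention, so it offers a quick way to preview how a tick label would read. This is only an illustration of the format string, not a reproduction of how the plotting backend renders ticks.

```python
# Daily averages copied from the output in Step 1
samples = [439625.0, 96000.0, 1628000.0]

# '%.0f' keeps zero digits after the decimal point
whole_numbers = ["%.0f" % value for value in samples]

# '%.2f' is shown for contrast; it keeps two digits after the decimal point
two_decimals = ["%.2f" % value for value in samples]

print(whole_numbers)
print(two_decimals)
```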

## Step 6: Using the code from the previous plot, add the `title` parameter to give the visualization a descriptive title.

In [9]:
# Using the code from the existing plot, add an informative title that helps define the visualization
bar_plot.opts(title='The avg. sale price varied wildly between $100k and $1.5m in Jan and Feb 2019')

## Step 7: Using the code from the previous plot, add the `invert_axes` option to invert the x- and y-axes.

In [11]:
# Using the code from the existing plot, invert the axes for dramatic effect
# Be sure to adjust the yformatter to the xformatter as well as the xlabel and ylabel values.
bar_plot.opts(
    invert_axes=True,
    ylabel='Sale Date',
    xlabel='Average Sale Amount',
    yformatter=None,
    xformatter='%.0f'
)
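
The bookkeeping described in the hint can be sketched as a small helper that exchanges the x- and y-prefixed options before they are passed to `opts`. The function `swap_axis_options` and the `SWAP` table are hypothetical names introduced for this illustration; they are not part of hvPlot or HoloViews.

```python
# Hypothetical lookup table: each option name maps to its counterpart on the other axis
SWAP = {
    "xlabel": "ylabel",
    "ylabel": "xlabel",
    "xformatter": "yformatter",
    "yformatter": "xformatter",
}


def swap_axis_options(options):
    """Return a copy of options with axis-specific keys exchanged and invert_axes set."""
    swapped = {SWAP.get(key, key): value for key, value in options.items()}
    swapped["invert_axes"] = True
    return swapped


# Options for the vertical bar chart from the earlier steps
vertical = {
    "xlabel": "Sale Date",
    "ylabel": "Average Sale Amount",
    "yformatter": "%.0f",
}

horizontal = swap_axis_options(vertical)
print(horizontal)
```

The result could then be applied with `bar_plot.opts(**horizontal)`. Note that the cell above also passes `yformatter=None` to clear the formatter set in Step 5, which this helper does not do.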

## Step 8: Using the code from the previous plot, add a dynamic visual element to the plot by incorporating the `hover_color` parameter and assigning it a value of `"orange"`.

In [12]:
# Using the code from the existing plot, add the parameter hover_color assigning it a value of "orange"
bar_plot.opts(hover_color='orange')

## Step 9: Based on your enhanced visualization, compose your version of the data story for the trend of average home sale prices over the first three months of 2019.

**Answer:** Average sale prices are very sporadic in the first two months of 2019 and appear to stabilize in March, with fewer outliers.
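
One way to back a claim like this with numbers is to compare the spread of daily averages per month. The sketch below uses only the ten daily averages displayed in Step 1 (five from early January, five from late March), so it is a partial check under that assumption. Running the same calculation on the full `average_sales_per_date_2019` DataFrame would cover every date, including February.

```python
import pandas as pd

# The ten daily averages shown in the head and tail outputs of Step 1
daily_avg = pd.DataFrame(
    {
        "saleDate": [
            "2019-01-04", "2019-01-07", "2019-01-08", "2019-01-09", "2019-01-15",
            "2019-03-22", "2019-03-25", "2019-03-26", "2019-03-28", "2019-03-29",
        ],
        "saleAmt": [
            439625.0, 371000.0, 1628000.0, 96000.0, 623500.0,
            556000.0, 990000.0, 167000.0, 840000.0, 618990.0,
        ],
    }
)

# The dates are ISO strings, so the first seven characters give the year and month
daily_avg["month"] = daily_avg["saleDate"].str[:7]

# Lowest and highest daily average per month, plus the gap between them
spread = daily_avg.groupby("month")["saleAmt"].agg(["min", "max"])
spread["range"] = spread["max"] - spread["min"]

print(spread)
```

On these ten rows, the January range is 1,532,000 and the March range is 823,000, which is consistent with the narrower swings described in the answer.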