# Crafting Composite Plots

In this activity, you’ll use composite plots to visualize the average cost of diabetes treatment and care by state.

Instructions:

1. Using the `read_csv` function and the `Path` class from the `pathlib` module, read the `hospital_claims.csv` file into a Pandas DataFrame named `hospital_claims_df`. Review the resulting DataFrame.

2. Using the `hospital_claims_df` DataFrame, create a new DataFrame named `procedure_638_charges_df` that consists of all the rows where the value of the DRG Definition column equals "638 - DIABETES W CC". Review the resulting DataFrame.

3. From the `procedure_638_charges_df` DataFrame, slice out the “Average Total Payments” and “Provider State” columns. Name this new DataFrame `average_total_payments_df`, and review it.

4. Create a DataFrame named `average_total_payments_by_state`. To do this, group the data in the `average_total_payments_df` DataFrame by “Provider State” and use the `mean` function to calculate the results. Review the resulting DataFrame.

5. Create a bar chart for the `average_total_payments_by_state` DataFrame. Include a `title` for the plot, rotate the x-axis labels, and add a `yformatter` that rounds the values to the nearest whole number.

6. Create a new DataFrame named `average_total_payments_by_state_sorted` that holds the sorted values of the `average_total_payments_by_state` DataFrame based on the “Average Total Payments” column. Plot the sorted DataFrame in a bar chart.

    > **Hint** Be sure to create a variable named `plot_average_total_payments_by_state_sorted` that holds the plot. You'll need this variable when creating your composite plots.

7. Repeat Steps 3 through 6, slicing out the “Average Medicare Payments” and “Provider State” columns.

8. Create a side-by-side composite plot that includes `plot_average_total_payments_by_state_sorted` and `plot_average_medicare_payments_by_state_sorted`.

9. Compose an overlay plot that includes `plot_average_total_payments_by_state_sorted` and `plot_average_medicare_payments_by_state_sorted`. Be sure to set your overlay plot equal to a variable so that you can appropriately style the overlay plot.

10. Review your visualizations, and then answer the following questions. You can use the widgets to help craft your responses:

    * In which states do you recommend launching the diabetes assistance program, and why?

    * Which states would you avoid, and why?

    * What additional data can be used to help make this decision?


References:

[hvPlot Composing Plots](https://hvplot.holoviz.org/user_guide/Plotting.html)

[hvPlot Customization page](https://hvplot.holoviz.org/user_guide/Customization.html)

[HoloViews Styling Mapping page](http://holoviews.org/user_guide/Style_Mapping.html)

In [5]:
# Import the required libraries and dependencies
from pathlib import Path
import pandas as pd
import hvplot.pandas


## Step 1: Using the `read_csv` function and the `Path` class from the `pathlib` module, read the `hospital_claims.csv` file into a Pandas DataFrame named `hospital_claims_df`. Review the resulting DataFrame.


In [6]:
# Using the `read_csv` function and the `Path` class, read the `hospital_claims.csv` file
# into a Pandas DataFrame.
hospital_claims_df = pd.read_csv(Path("../Resources/hospital_claims.csv"))

# Review the first and last five rows of the DataFrame
display(hospital_claims_df.head())
display(hospital_claims_df.tail())
    

Unnamed: 0,DRG Definition,Provider Id,Provider Name,Provider Street Address,Provider City,Provider State,Provider Zip Code,Hospital Referral Region Description,Total Discharges,Average Covered Charges,Average Total Payments,Average Medicare Payments
0,039 - EXTRACRANIAL PROCEDURES W/O CC/MCC,10001,SOUTHEAST ALABAMA MEDICAL CENTER,1108 ROSS CLARK CIRCLE,DOTHAN,AL,36301,AL - Dothan,91,32963.07,5777.24,4763.73
1,039 - EXTRACRANIAL PROCEDURES W/O CC/MCC,10005,MARSHALL MEDICAL CENTER SOUTH,2505 U S HIGHWAY 431 NORTH,BOAZ,AL,35957,AL - Birmingham,14,15131.85,5787.57,4976.71
2,039 - EXTRACRANIAL PROCEDURES W/O CC/MCC,10006,ELIZA COFFEE MEMORIAL HOSPITAL,205 MARENGO STREET,FLORENCE,AL,35631,AL - Birmingham,24,37560.37,5434.95,4453.79
3,039 - EXTRACRANIAL PROCEDURES W/O CC/MCC,10011,ST VINCENT'S EAST,50 MEDICAL PARK EAST DRIVE,BIRMINGHAM,AL,35235,AL - Birmingham,25,13998.28,5417.56,4129.16
4,039 - EXTRACRANIAL PROCEDURES W/O CC/MCC,10016,SHELBY BAPTIST MEDICAL CENTER,1000 FIRST STREET NORTH,ALABASTER,AL,35007,AL - Birmingham,18,31633.27,5658.33,4851.44


Unnamed: 0,DRG Definition,Provider Id,Provider Name,Provider Street Address,Provider City,Provider State,Provider Zip Code,Hospital Referral Region Description,Total Discharges,Average Covered Charges,Average Total Payments,Average Medicare Payments
163060,948 - SIGNS & SYMPTOMS W/O MCC,670041,SETON MEDICAL CENTER WILLIAMSON,201 SETON PARKWAY,ROUND ROCK,TX,78664,TX - Austin,23,26314.39,3806.86,3071.39
163061,948 - SIGNS & SYMPTOMS W/O MCC,670055,METHODIST STONE OAK HOSPITAL,1139 E SONTERRA BLVD,SAN ANTONIO,TX,78258,TX - San Antonio,11,21704.72,4027.36,2649.72
163062,948 - SIGNS & SYMPTOMS W/O MCC,670056,SETON MEDICAL CENTER HAYS,6001 KYLE PKWY,KYLE,TX,78640,TX - Austin,19,39121.73,5704.36,4058.36
163063,948 - SIGNS & SYMPTOMS W/O MCC,670060,TEXAS REGIONAL MEDICAL CENTER AT SUNNYVALE,231 SOUTH COLLINS ROAD,SUNNYVALE,TX,75182,TX - Dallas,11,28873.09,7663.09,6848.54
163064,948 - SIGNS & SYMPTOMS W/O MCC,670068,TEXAS HEALTH PRESBYTERIAN HOSPITAL FLOWER MOUND,4400 LONG PRAIRIE ROAD,FLOWER MOUND,TX,75028,TX - Dallas,12,15042.0,3539.75,2887.41
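The relative path in the cell above is resolved against the notebook's working directory. As a minimal sketch of the same `Path` plus `read_csv` pattern, the following writes a tiny throwaway CSV and reads it back. The file name `sample_claims.csv` and its rows are invented for illustration and are not part of the real dataset.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical miniature of hospital_claims.csv; these rows are made up.
sample_csv = (
    "DRG Definition,Provider State,Average Total Payments\n"
    "638 - DIABETES W CC,NY,6000.0\n"
    "638 - DIABETES W CC,AL,4500.0\n"
)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Join path parts with "/" so the path works on any operating system
    csv_path = Path(tmp_dir) / "sample_claims.csv"
    csv_path.write_text(sample_csv)

    # read_csv accepts a Path object directly
    sample_claims_df = pd.read_csv(csv_path)

# The data stays in memory after the temporary file is removed
print(sample_claims_df.shape)
```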


## Step 2: Using the `hospital_claims_df` DataFrame, create a new DataFrame named `procedure_638_charges_df` that consists of all the rows where the value of the DRG Definition column equals "638 - DIABETES W CC". Review the resulting DataFrame.

In [10]:
# Create a new DataFrame where the column DRG Definition equals "638 - DIABETES W CC"
procedure_638_charges_df = hospital_claims_df.loc[hospital_claims_df["DRG Definition"] == "638 - DIABETES W CC"]

# Review the first five rows of the DataFrame
procedure_638_charges_df.head()


Unnamed: 0,DRG Definition,Provider Id,Provider Name,Provider Street Address,Provider City,Provider State,Provider Zip Code,Hospital Referral Region Description,Total Discharges,Average Covered Charges,Average Total Payments,Average Medicare Payments
125166,638 - DIABETES W CC,330004,KINGSTON HOSPITAL,396 BROADWAY,KINGSTON,NY,12401,NY - Albany,21,20006.57,6048.85,5177.85
127575,638 - DIABETES W CC,10001,SOUTHEAST ALABAMA MEDICAL CENTER,1108 ROSS CLARK CIRCLE,DOTHAN,AL,36301,AL - Dothan,32,21175.81,4678.43,4047.68
127576,638 - DIABETES W CC,10005,MARSHALL MEDICAL CENTER SOUTH,2505 U S HIGHWAY 431 NORTH,BOAZ,AL,35957,AL - Birmingham,12,9719.16,4863.75,4203.41
127577,638 - DIABETES W CC,10006,ELIZA COFFEE MEMORIAL HOSPITAL,205 MARENGO STREET,FLORENCE,AL,35631,AL - Birmingham,35,17021.54,4434.57,3537.2
127578,638 - DIABETES W CC,10011,ST VINCENT'S EAST,50 MEDICAL PARK EAST DRIVE,BIRMINGHAM,AL,35235,AL - Birmingham,14,15875.5,5176.07,3394.64
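The filter in this step relies on a boolean mask. The sketch below, built on a few invented rows, shows the mask on its own and why the filtered DataFrame keeps the original row labels (which is why the output above starts at index 125166 rather than 0). The names `claims_df`, `is_diabetes`, and `diabetes_df` are illustrative.

```python
import pandas as pd

# Invented rows that mimic the shape of the claims data
claims_df = pd.DataFrame({
    "DRG Definition": [
        "638 - DIABETES W CC",
        "948 - SIGNS & SYMPTOMS W/O MCC",
        "638 - DIABETES W CC",
    ],
    "Provider State": ["NY", "TX", "AL"],
    "Average Total Payments": [6000.0, 3800.0, 4500.0],
})

# The comparison yields a boolean Series with one True/False per row
is_diabetes = claims_df["DRG Definition"] == "638 - DIABETES W CC"

# .loc keeps only the rows where the mask is True; index labels are preserved
diabetes_df = claims_df.loc[is_diabetes]

print(diabetes_df)
```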


## Step 3: From the `procedure_638_charges_df` DataFrame, slice out the “Average Total Payments” and “Provider State” columns. Name this new DataFrame `average_total_payments_df`, and review it.

In [14]:
# Using the `procedure_638_charges_df` DataFrame,
# slice out the "Average Total Payments" and "Provider State" columns
average_total_payments_df = procedure_638_charges_df[["Average Total Payments", "Provider State"]]

# Review the first five rows of the resulting DataFrame
average_total_payments_df.head()

Unnamed: 0,Average Total Payments,Provider State
125166,6048.85,NY
127575,4678.43,AL
127576,4863.75,AL
127577,4434.57,AL
127578,5176.07,AL


## Step 4: Create a DataFrame named `average_total_payments_by_state`. To do this, group the data in the `average_total_payments_df` DataFrame by “Provider State” and use the `mean` function to calculate the results. Review the resulting DataFrame.

In [18]:
# Using the `average_total_payments_df` DataFrame,
# group the information by "Provider State" and average the data
average_total_payments_by_state = average_total_payments_df.groupby("Provider State").mean()

# Review the first five rows of the resulting DataFrame
average_total_payments_by_state.head()


Provider State,Average Total Payments
AK,7762.52
AL,4744.762439
AR,4858.523043
AZ,6088.806923
CA,7507.761812
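After `groupby(...).mean()`, the grouping column becomes the index of the result, which is why “Provider State” appears as the row label above. A minimal sketch with invented numbers (the name `payments_df` is illustrative):

```python
import pandas as pd

# Invented payments: two hospitals in AL, one in NY
payments_df = pd.DataFrame({
    "Average Total Payments": [6000.0, 4000.0, 5000.0],
    "Provider State": ["NY", "AL", "AL"],
})

# Each state's rows are collapsed into a single mean value
by_state = payments_df.groupby("Provider State").mean()

# The group keys form the index, sorted alphabetically by default
print(by_state.index.name)
print(by_state)
```

Because the states live in the index, a later bar chart can use them as x-axis categories without any extra arguments.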


## Step 5: Create a bar chart for the `average_total_payments_by_state` DataFrame. Include a `title` for the plot, rotate the x-axis labels, and add a `yformatter` that rounds the values to the nearest whole number.

In [20]:
# Create a bar chart of the `average_total_payments_by_state` DataFrame.
# `rot=90` rotates the x-axis labels; the `yformatter` rounds values to whole numbers.
average_total_payments_by_state_plot = average_total_payments_by_state.hvplot.bar(
    title="Average Total Payments by State",
    rot=90
).opts(yformatter="%.0f")
average_total_payments_by_state_plot

## Step 6: Create a new Dataframe named `average_total_payments_by_state_sorted` that holds the sorted values of the `average_total_payments_by_state` DataFrame based on the “Average Total Payments” column. Plot the sorted DataFrame in a bar chart.

In [21]:
# Sort the average_total_payments_by_state DataFrame by "Average Total Payments"
average_total_payments_by_state_sorted = average_total_payments_by_state.sort_values("Average Total Payments")

# Create a bar chart to visualize the average_total_payments_by_state_sorted DataFrame
plot_average_total_payments_by_state_sorted = average_total_payments_by_state_sorted.hvplot.bar(
    title="Average Total Payments by State",
    rot=90
).opts(yformatter="%.0f")

# Visualize the bar chart
plot_average_total_payments_by_state_sorted

## Step 7: Repeat Steps 3 through 6, slicing out the “Average Medicare Payments” and “Provider State” columns.

In [24]:
# Using the `procedure_638_charges_df` DataFrame,
# slice out the "Average Medicare Payments" and "Provider State" columns.
average_medicare_payments_df = procedure_638_charges_df[["Average Medicare Payments", "Provider State"]]

# Review the first five rows of the resulting DataFrame
average_medicare_payments_df.head()

Unnamed: 0,Average Medicare Payments,Provider State
125166,5177.85,NY
127575,4047.68,AL
127576,4203.41,AL
127577,3537.2,AL
127578,3394.64,AL


In [26]:
# Using the average_medicare_payments_df DataFrame, 
# group the information by "Provider State" and average the data.
average_medicare_payments_by_state = average_medicare_payments_df.groupby("Provider State").mean()

# Review the first five rows of the resulting DataFrame
average_medicare_payments_by_state.head()


Provider State,Average Medicare Payments
AK,6852.92
AL,3742.141463
AR,3915.584348
AZ,4934.449615
CA,6622.250875


In [28]:
# Create a bar chart of the `average_medicare_payments_by_state` DataFrame,
# using the same label rotation and y-axis formatting as in Step 5
average_medicare_payments_by_state_plot = average_medicare_payments_by_state.hvplot.bar(
    title="Average Medicare Payments by State",
    rot=90
).opts(yformatter="%.0f")
average_medicare_payments_by_state_plot

In [29]:
# Sort the average_medicare_payments_by_state DataFrame by "Average Medicare Payments"
average_medicare_payments_by_state_sorted = average_medicare_payments_by_state.sort_values("Average Medicare Payments")

# Create a bar chart to visualize the `average_medicare_payments_by_state_sorted` DataFrame
plot_average_medicare_payments_by_state_sorted = average_medicare_payments_by_state_sorted.hvplot.bar(
    title="Average Medicare Payments by State",
    rot=90
).opts(yformatter="%.0f")

# Visualize the bar chart
plot_average_medicare_payments_by_state_sorted
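Step 7 repeats the slice, group, and sort sequence for a second column. One way to avoid duplicating those cells is a small helper function. The sketch below is an optional alternative rather than part of the activity; the function name `average_by_state_sorted` and the sample values are hypothetical.

```python
import pandas as pd


def average_by_state_sorted(df, value_column, state_column="Provider State"):
    """Slice two columns, average by state, and sort ascending by the average."""
    sliced = df[[value_column, state_column]]
    averaged = sliced.groupby(state_column).mean()
    return averaged.sort_values(value_column)


# Invented rows standing in for procedure_638_charges_df
sample_df = pd.DataFrame({
    "Provider State": ["NY", "AL", "AL", "CA"],
    "Average Total Payments": [6000.0, 4000.0, 5000.0, 7500.0],
    "Average Medicare Payments": [5000.0, 3000.0, 4000.0, 6500.0],
})

# The same helper serves both payment columns
total_sorted = average_by_state_sorted(sample_df, "Average Total Payments")
medicare_sorted = average_by_state_sorted(sample_df, "Average Medicare Payments")

print(total_sorted)
print(medicare_sorted)
```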

## Step 8: Create a side-by-side composite plot that includes `plot_average_total_payments_by_state_sorted` and `plot_average_medicare_payments_by_state_sorted`.

In [30]:
# Create a side-by-side composite plot of the sorted average total payments per state
# and the sorted average Medicare payments per state
plot_average_total_payments_by_state_sorted + plot_average_medicare_payments_by_state_sorted


## Step 9: Compose an overlay plot that includes `plot_average_total_payments_by_state_sorted` and `plot_average_medicare_payments_by_state_sorted`. Be sure to set your overlay plot equal to a variable so that you can appropriately style the overlay plot.

In [32]:
# Compose an overlay plot of the average total payments per state
# and the average Medicare payments per state
overlay_plot = plot_average_total_payments_by_state_sorted * plot_average_medicare_payments_by_state_sorted

# Style the overlay plot
overlay_plot.opts(
    title="Average Total Payments versus Average Medicare Payments for Diabetes", 
    ylabel="Average Payment Amount", 
    width=1000, 
    height=500
)


## Step 10: Review your visualizations, and then answer the following questions. You can use the widgets to help craft your responses:

**Question:** In which states would you recommend launching the diabetes assistance program? Why? 

**Answer:** # YOUR ANSWER HERE

**Question:** Which states would you avoid? Why? 

**Answer:** # YOUR ANSWER HERE

**Question:** What additional data could be used to help make this decision? 


**Answer:** # YOUR ANSWER HERE