# D210 Data Dashboard and Storytelling Assessment — Task 1
### NAM Task 1: Data Dashboard And Storytelling
#### Representation and Reporting — D210
#### PRFA — NAM2
> André Davis
> StudentID: 010630641
> MSDA
>
> Competencies
> 4033.2.1 : Storytelling with Data
>   The graduate communicates data insights to technical and nontechnical audiences.
>
> 4033.2.2 : Data Visualizations and Representations
>   The graduate creates data representations to offer insight into an organizational problem.
>
> 4033.2.3 : Dashboards
>   The graduate designs interactive dashboards to support executive decision-making.

#### Table of Contents
<ul>
    <li><a href="#data-cleaning">Pre-work: Data Cleaning</a></li>
    <li><a href="#data-sets">A1: Data Sets</a></li>
    <li><a href="#installation-instructions">A2: Installation Instructions</a></li>
    <li><a href="#navigation-instructions">A3: Navigation Instructions</a></li>
    <li><a href="#panopto-storying-telling-with-data">B: Panopto Storytelling With Data</a></li>
    <li><a href="#dashboard-alignment">C1: Dashboard Alignment</a></li>
    <li><a href="#additional-data-set-insights">C2: Additional Data Set Insights</a></li>
    <li><a href="#decision-making-support">C3: Decision-Making Support</a></li>
    <li><a href="#interactice-controls">C4: Interactive Controls</a></li>
    <li><a href="#colorblindness">C5: Colorblindness</a></li>
    <li><a href="#data-representation">C6: Data Representation</a></li>
    <li><a href="#audience-analysis">C7: Audience Analysis</a></li>
    <li><a href="#universal-access">C8: Universal Access</a></li>
    <li><a href="#effective-storytelling">C9: Effective Storytelling</a></li>
    <li><a href="#sources">D: Sources</a></li>
    <li><a href="#professional-communication">E: Professional Communication</a></li>
</ul>

<a id="data-sets"></a>
# A1: Data Sets

#### Data Sets & Cleaning

 * Apply basic cleaning to the [WGU](https://www.wgu.edu/)-supplied medical data so that it stays consistent with the data used in `D208` & `D209`
 * Clean an additional, readmission-related data set from [`Kaggle`](https://www.kaggle.com/) called [`U.S. Hospital Overall Star Ratings 2016-2020`](https://www.kaggle.com/datasets/abrambeyer/us-hospital-overall-star-ratings-20162020)

In [1]:
import warnings
import pandas as pd
from pandas.api.types import CategoricalDtype
import numpy as np

warnings.filterwarnings('ignore')

medical_data = pd.read_csv('./Data/medical-data/medical_clean.csv', index_col=0)
any_missing_values = medical_data.isna().values.any()
if not any_missing_values:
    print('Medical data does NOT contain any missing values\n')
else:
    print('Medical data CONTAINS missing values.\n')

medical_data['Zip'] = medical_data['Zip'].astype('str').str.zfill(5)

column_renames = {
    'Item1': 'Timely_Admission'
    ,'Item2': 'Timely_Treatment'
    ,'Item3': 'Timely_Visits'
    ,'Item4': 'Reliability'
    ,'Item5': 'Options'
    ,'Item6': 'Hours_Of_Treatment'
    ,'Item7': 'Courteous_Staff'
    ,'Item8': 'Listening' #Evidence of active listening from Doctor
}
medical_data.rename(columns=column_renames, inplace=True)

category_dtype = 'category'
convert_to_category = {
    'Gender': category_dtype,
    'ReAdmis': category_dtype,
    'Soft_drink': category_dtype,
    'Initial_admin': category_dtype,
    'HighBlood': category_dtype,
    'Stroke': category_dtype,
    'Complication_risk': category_dtype,
    'Overweight': category_dtype,
    'Arthritis': category_dtype,
    'Diabetes': category_dtype,
    'Hyperlipidemia': category_dtype,
    'BackPain': category_dtype,
    'Anxiety': category_dtype,
    'Allergic_rhinitis': category_dtype,
    'Reflux_esophagitis': category_dtype,
    'Asthma': category_dtype,
    'Services': category_dtype,
    'Timely_Admission': category_dtype,
    'Timely_Treatment': category_dtype,
    'Timely_Visits': category_dtype,
    'Reliability': category_dtype,
    'Options': category_dtype,
    'Hours_Of_Treatment': category_dtype,
    'Courteous_Staff': category_dtype,
    'Listening': category_dtype
}

medical_data = medical_data.astype(convert_to_category)

#Convert Yes/No's to True and False for charting in Tableau
columns_to_reexpress = ['ReAdmis', 'Soft_drink', 'HighBlood', 'Stroke',
                        'Overweight', 'Arthritis', 'Diabetes', 'Hyperlipidemia',
                        'BackPain', 'Anxiety', 'Allergic_rhinitis', 'Reflux_esophagitis',
                        'Asthma']
for column in columns_to_reexpress:
    medical_data[column] = medical_data[column].map({'Yes': True, 'No': False }).astype(np.bool_)

#tableau_visualizations = ['Zip', 'Children', 'Age', 'VitD_levels', 'HighBlood', 'Overweight', 'Arthritis', 'Diabetes', 'BackPain', 'Asthma', 'Initial_days', 'ReAdmis', 'Complication_risk', 'Initial_admin', 'Gender']

#prepared_medical_data = medical_data[tableau_visualizations]

medical_data.to_csv('./tableau-wgu-dataset.csv')
print(f'WGU Medical Data Row Count {len(medical_data)}')
#print(prepared_medical_data.info())

#Additional Data source
#wgu_dataset_zip_codes = medical_data['Zip'].unique()
overall_hospital_ratings = pd.read_csv('./Data/Additional/Us Hospital Overall Rating/Hospital_General_Information_2016_2020.csv', index_col=0)

overall_hospital_ratings['ZIP Code'] = overall_hospital_ratings['ZIP Code'].astype('str').str.zfill(5)
#match_overall_hospital_ratings = overall_hospital_ratings[overall_hospital_ratings['ZIP Code'].isin(wgu_dataset_zip_codes)]

#print(match_overall_hospital_ratings.info())

#TODO: Clean matched data

#column that flags whether a hospital meets the EHR interoperability criteria
ehr_column = 'Meets criteria for promoting interoperability of EHRs'

#remove footnote columns
remove_footnote_columns = [column_name for column_name in overall_hospital_ratings.columns if 'footnote' in column_name]
print(remove_footnote_columns)
overall_hospital_ratings.drop(columns=remove_footnote_columns, inplace=True)

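#assign the result back; calling fillna(inplace=True) on a selected column is unreliable in recent pandas
overall_hospital_ratings[ehr_column] = overall_hospital_ratings[ehr_column].fillna('N')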

overall_hospital_ratings[ehr_column] = overall_hospital_ratings[ehr_column].map({'Y': True, 'N': False }).astype(np.bool_)
overall_hospital_ratings['Emergency Services'] = overall_hospital_ratings['Emergency Services'].map({'Yes': True, 'No': False}).astype(np.bool_)


final_column_removals = ['Facility ID']

overall_hospital_ratings.to_csv('./tableau-additional-dataset.csv')
print(f'Additional Data Row Count: {len(overall_hospital_ratings)}')


Medical data does NOT contain any missing values

WGU Medical Data Row Count 10000
['Hospital overall rating footnote', 'Mortality national comparison footnote', 'Safety of care national comparison footnote', 'Readmission national comparison footnote', 'Patient experience national comparison footnote', 'Effectiveness of care national comparison footnote', 'Timeliness of care national comparison footnote', 'Efficient use of medical imaging national comparison footnote']
Additional Data Row Count: 25082
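
One caveat about the Yes/No conversion in the cell above: with `.map(...).astype(np.bool_)`, any value that is missing from the mapping (including a blank cell) becomes `NaN` first and then typically ends up as `True`. The WGU data reports no missing values, so this may never occur here, but the additional data set has not been checked the same way. The helper below is a hypothetical sketch (the name `yes_no_to_bool` and the sample values are assumptions, not part of the original notebook) that refuses to convert a column containing unmapped values.

```python
import pandas as pd


def yes_no_to_bool(series, mapping=None):
    """Map Yes/No style labels to booleans, refusing anything unmapped."""
    mapping = {'Yes': True, 'No': False} if mapping is None else mapping
    # isin() is False for NaN, so missing values are reported as well.
    unexpected = series[~series.isin(list(mapping))]
    if not unexpected.empty:
        labels = sorted(unexpected.astype(str).unique())
        raise ValueError(f'Unmapped values in {series.name!r}: {labels}')
    return series.map(mapping).astype(bool)


# Made-up sample values, not rows from either data set.
clean = pd.Series(['Yes', 'No', 'Yes'], name='ReAdmis')
converted = yes_no_to_bool(clean)
print(converted.tolist())

dirty = pd.Series(['Yes', None, 'No'], name='Emergency Services')
try:
    yes_no_to_bool(dirty)
except ValueError as error:
    print(error)
```

Failing loudly keeps a blank cell from silently showing up as `True` in a Tableau chart.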


<a id="installation-instructions"></a>
# A2: Installation Instructions

Because the dashboard is published on [Tableau Public](https://public.tableau.com/app/discover), there is nothing to install. It is hosted by Tableau and can be opened here: [`Hospital Medical Data Breakdown - Tableau Dashboard`](https://public.tableau.com/views/WGU-D210-PA-NAM2-Task1-Data-Dashboard-And-Storyingtelling/HospitalMedicalDataBreakdown?:language=en-US&:display_count=n&:origin=viz_share_link).

All that is needed is a modern browser such as:
 * [Safari](https://support.apple.com/safari)
 * [Edge](https://www.microsoft.com/edge)
 * [Firefox](https://www.mozilla.org/en-US/firefox/)
 * [Chrome](https://www.google.com/chrome/)
 * [Brave](https://brave.com/download/)

<a id="navigation-instructions"></a>
# A3: Navigation Instructions

To navigate the dashboard for `D210 - Representation and Reporting`, simply open it with this [link](https://public.tableau.com/views/WGU-D210-PA-NAM2-Task1-Data-Dashboard-And-Storyingtelling/HospitalMedicalDataBreakdown?:language=en-US&:display_count=n&:origin=viz_share_link).

The Tableau dashboard has 5 navigation boxes along the top: `Introduction`, `Who Are Our Patients`, `Hospital Over All Ratings`, `KPIs based on Medical Conditions`, and `Assessment Analysis`.

##### *Introduction*
This section of the dashboard introduces me and the two datasets used within the dashboard.

##### *Who Are Our Patients*
This interactive dashboard lets users explore our patient data by gender, number of children, and marital status. Five filters provide the interactivity: 'Income Range', 'Gender', 'Patient Age Range', 'Marital Status', and 'By State'. With these filters, users can drill deeper into the data and ask more specific questions about hospital patients.

For instance, utilizing these controls, you can pose a question such as "Please provide me with details about female patients with an income range of 40,000–50,000, who are married and reside in Washington or Oregon, within the age group of 18–25." To accomplish this, you begin by adjusting the 'Income Range' filter, dragging the left side of the dual slider to 40,000 and the right side to 50,000. Then, you proceed to the 'Gender' filter, selecting only the Female option and clicking Apply. Moving on to the 'Patient Age Range' filter, you set the left slider to 18 and the right slider to 25. Next, you use the 'Marital Status' filter to choose only the Married option and click Apply. Finally, you utilize the 'By State' filter, selecting the checkboxes corresponding to WA and OR, and once again click Apply. Consequently, the data will be filtered to match the criteria specified in the question.

This interactive approach enables users to explore the data more precisely, unraveling valuable insights about our hospital's patient demographics.
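
For readers who prefer code, the filter walkthrough above can be approximated in pandas. This is only a sketch on a tiny made-up sample: the column names `Income`, `Marital`, and `State` are assumed to match the WGU data, and `filter_patients` is a hypothetical helper, not something Tableau exposes.

```python
import pandas as pd

# Tiny made-up sample; the real dashboard reads the WGU medical data.
patients = pd.DataFrame({
    'Gender': ['Female', 'Female', 'Male', 'Female', 'Female'],
    'Income': [45000.0, 62000.0, 41000.0, 48500.0, 43000.0],
    'Age': [22, 24, 19, 25, 40],
    'Marital': ['Married', 'Married', 'Married', 'Married', 'Married'],
    'State': ['WA', 'OR', 'WA', 'OR', 'WA'],
})


def filter_patients(frame, income_range, gender, age_range, marital, states):
    """Apply the five dashboard filters as one boolean mask."""
    mask = (
        frame['Income'].between(*income_range)   # inclusive on both ends
        & (frame['Gender'] == gender)
        & frame['Age'].between(*age_range)       # inclusive on both ends
        & (frame['Marital'] == marital)
        & frame['State'].isin(states)
    )
    return frame[mask]


selected = filter_patients(
    patients,
    income_range=(40_000, 50_000),
    gender='Female',
    age_range=(18, 25),
    marital='Married',
    states=['WA', 'OR'],
)
print(selected)
```

Each line of the mask corresponds to one filter control, so narrowing the question is a matter of changing one argument.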

##### *Hospital Over All Ratings*

This interactive dashboard lets users explore hospital ratings, categorized and color-coded by `Readmission national comparison` and year.

Within the dashboard, a legend is included, allowing users to highlight specific categories by clicking on the corresponding colors. Additionally, a 'By State' filter is available, enabling the selection of multiple values.

This dashboard facilitates answering queries such as: "How many patients in Kansas, within the 'Above the national average' group, rated hospitals with a score of 2 in 2019?"

To achieve this, the user would simply adjust the 'By State' filter to KS and click Apply. Next, by selecting the 'Above the national average' label in the legend, the associated values would be highlighted. Lastly, the user can locate the row corresponding to 2019 and the relevant Rating number to obtain the desired information.
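
The same lookup can be sketched in pandas against the additional data set. The sample rows below are invented, `count_matching_rows` is a hypothetical helper, and the `State`, `Year`, and `Hospital overall rating` column names are assumed from the cleaning output earlier in this document. Note that the function counts matching rows of the ratings table, under the assumption that each row is one hospital in one year.

```python
import pandas as pd

# Invented rows that mimic the shape of the ratings data set.
ratings = pd.DataFrame({
    'State': ['KS', 'KS', 'KS', 'MO', 'KS'],
    'Year': [2019, 2019, 2018, 2019, 2019],
    'Hospital overall rating': ['2', '2', '2', '2', 'Not Available'],
    'Readmission national comparison': [
        'Above the national average',
        'Below the national average',
        'Above the national average',
        'Above the national average',
        'Above the national average',
    ],
})


def count_matching_rows(frame, state, comparison, year, rating):
    """Count rows matching the state filter, legend group, year, and rating."""
    # Ratings may be stored as text; non-numeric entries become NaN.
    numeric_rating = pd.to_numeric(frame['Hospital overall rating'], errors='coerce')
    mask = (
        (frame['State'] == state)
        & (frame['Readmission national comparison'] == comparison)
        & (frame['Year'] == year)
        & (numeric_rating == rating)
    )
    return int(mask.sum())


kansas_count = count_matching_rows(
    ratings, 'KS', 'Above the national average', 2019, 2
)
print(kansas_count)
```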

##### *KPIs based on Medical Conditions*

We have successfully developed an interactive dashboard to comprehensively analyze Key Performance Indicators (KPIs) pertaining to Hospital Readmission. Our previous coursework, including D207, D208, and D209, primarily focused on the [WGU](https://www.wgu.edu/) medical dataset with an emphasis on readmission rates. However, due to the challenges associated with finding external datasets that align closely with our project, we decided to broaden our perspective beyond readmission alone. This expanded approach enables us to explore additional features and KPIs that have the potential to contribute to reducing a hospital's overall readmission rate.

In addition to the detailed examination of readmission rates by age, our analysis now encompasses the following metrics: 'Readmission National Ratings', which provides insight into the national ratings related to readmission; 'Patient Experience National Average', which offers an overview of the average patient experience across the nation; 'High Blood Pressure by Age', which focuses on the prevalence of high blood pressure within different age groups; 'Readmission by Age', which allows us to understand readmission patterns based on age; and 'Hyperlipidemia By Age', which shows hyperlipidemia occurrences across different age groups.

By incorporating these additional metrics into our analysis, we can gain a more comprehensive understanding of the factors impacting readmission rates and potentially identify areas for improvement.

##### *Assessment Analysis*

This is a non-interactive tab within the Tableau Story Dashboard. Here you'll find the findings of the overall dashboard.

<a id="panopto-storying-telling-with-data"></a>
# B: Panopto Storytelling With Data

<a id="dashboard-alignment"></a>
# C1: Dashboard Alignment

In the [WGU](https://www.wgu.edu/) Master's in Data Analytics degree program, the Medical dataset focuses on the readmission of patients who had previously gone to the hospital with an ailment. This Tableau dashboard stays on the topic of readmission while bringing in a few additional elements, giving the governing bodies better information for decisions that help reduce readmission rates. Reducing readmission rates is good for both the hospital and the patient: it lowers costs for both parties and helps hospitals stay under the government thresholds for readmission rates, avoiding the fines associated with exceeding them in the case of [Medicare](https://hospitalmedicaldirector.com/understanding-the-2023-medicare-hospital-readmission-penalty/).

This Tableau dashboard has 3 interactive sections to help administration make decisions on the reduction of readmission rates. These dashboards are *'Who Are Our Patients'*, *'Hospital Over All Ratings'*, and *'KPIs based on Medical Conditions'*.

The initial dashboard serves as the foundation of the entire analysis. Patient demographics play a critical role, regardless of the specific inquiry posed to the data. It is crucial to understand who will be impacted by the decisions made based on the data and dashboards. This particular dashboard enables us to explore various aspects of the patient population using filters such as income range, patient age range, gender, and state. By leveraging these filters, we can pose different questions and gain insights into the characteristics of the patients involved.

The second dashboard integrates an additional dataset, the `U.S. Hospital Overall Star Ratings`, with the [WGU](https://www.wgu.edu/) medical data, thereby introducing a new dimension to the analysis. This expanded dataset includes ratings such as `Readmission national comparison` and `Patient experience national comparison`. These ratings provide valuable insights into how patients perceive hospitals in relation to readmission rates. Specifically, the dashboard focuses on the `Readmission national comparison` for the years 2016-2019.

The correlation between hospital ratings and readmission rates becomes a significant point of interest. Hospital administrators can derive valuable insights by examining whether low ratings potentially align with higher readmission rates or vice versa. Understanding these dynamics can inform policy changes aimed at reducing readmission rates, considering the relationship between patient satisfaction and readmission rates. These factors should be carefully considered when utilizing this data to make informed decisions regarding hospital policies and strategies to mitigate readmission rates.

The third interactive Tableau dashboard incorporates several key performance indicators (KPIs). While this is an introductory course, I've carefully selected a subset of KPIs that focus on readmission rates. These chosen indicators represent common reasons for readmission and include ratings to provide a comprehensive view that combines hospital medical data with the public perception of hospitals.

The purpose of this dashboard is to enable executive leadership to use filters and dig deeper into the data before making policy decisions. The included data encompasses "Readmission national ratings," "Patient experience national average," "Hypertension by Age," "Readmissions by Age," and "Hyperlipidemia by Age." These categories are further segmented into age ranges, typically spanning four years (e.g., 16-19), and filters for gender and age range facilitate a more detailed breakdown of the KPIs.

By leveraging this dashboard, executive leadership can gain insights into the specific factors contributing to readmission rates and make informed decisions based on the analyzed data.
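
To make the four-year age ranges concrete, here is a minimal pandas sketch of how such a KPI could be computed outside Tableau. The sample ages and outcomes are made up, the band start of 16 follows the "16-19" example above, and `readmission_rate_by_age_band` is a hypothetical helper rather than the calculation Tableau runs internally.

```python
import pandas as pd

# Made-up patients; ReAdmis is already boolean, as in the cleaned WGU export.
patients = pd.DataFrame({
    'Age': [16, 17, 18, 19, 20, 23, 24, 27],
    'ReAdmis': [True, False, False, False, True, True, False, False],
})


def readmission_rate_by_age_band(frame, start=16, stop=90, width=4):
    """Share of readmitted patients in each fixed-width age band."""
    # The final edge lands above `stop`, so the oldest age still gets a band.
    edges = list(range(start, stop + width + 1, width))
    labels = [f'{low}-{low + width - 1}' for low in edges[:-1]]
    # right=False makes each band include its lower edge, e.g. 16 <= age < 20.
    bands = pd.cut(frame['Age'], bins=edges, right=False, labels=labels)
    # The mean of a boolean column is the fraction of True values.
    return frame.groupby(bands, observed=True)['ReAdmis'].mean()


rates = readmission_rate_by_age_band(patients)
print(rates)
```

The same function works for other boolean columns, such as `HighBlood` or `Hyperlipidemia`, by swapping the column name inside the helper.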


<a id="additional-data-set-insights"></a>
# C2: Additional Data Set Insights

Throughout my participation in the [WGU MSDA program](https://www.wgu.edu/online-it-degrees/data-analytics-masters-program.html), I have been using the [`Medical Dataset`](https://access.wgu.edu/ASP3/aap/content/g9rke9s0rlc9ejd92md0.html). The dataset has some good features, but it lacks large pieces of context. For example, the [WGU](https://www.wgu.edu/) medical dataset gives no indication of which years the data covers, which hospital each patient was admitted to, or which state that hospital is in (*we have the patient's state and address of residence, not the hospital they were admitted into*).

This brings us to the justification for the additional dataset. Because this is an educational setting, and the additional dataset comes from [Kaggle](https://www.kaggle.com/), a site for data analytics learning, we are artificially blending the two datasets for practice within Tableau. Blending [`U.S. Hospital Overall Star Ratings 2016-2020`](https://www.kaggle.com/datasets/abrambeyer/us-hospital-overall-star-ratings-20162020) with the [WGU](https://www.wgu.edu) medical dataset adds some context about hospitals and their various ratings between 2016 and 2020, specifically the ratings related to readmission and patient experience. It is beneficial for the hospital to figure out which factors lead to higher readmission rates, and it is also good to be aware of how these readmission factors may affect a hospital's perception within the patient community. This is why this "educational" blend of datasets was chosen to simulate a Tableau dashboard.

Variables from [`U.S. Hospital Overall Star Ratings 2016-2020`](https://www.kaggle.com/datasets/abrambeyer/us-hospital-overall-star-ratings-20162020) used to add context to the [WGU](https://www.wgu.edu/) [`Medical Dataset`](https://access.wgu.edu/ASP3/aap/content/g9rke9s0rlc9ejd92md0.html):


| Variable                               | Reason                                                                          |
|----------------------------------------|---------------------------------------------------------------------------------|
| State                                  | Key used to join the WGU and Kaggle datasets                                    |
| Year                                   | Used to divide the ratings over time for deeper analysis                        |
| Readmission national comparison        | A rating scale added to the WGU readmission data for executive decision-making  |
| Patient experience national comparison | A rating scale added to the WGU readmission data to capture patient perception  |
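
As a rough illustration of the `State` link in the table above, the sketch below blends two invented frames in pandas. It first aggregates the WGU-style rows to one value per state and only then merges; this is an assumption about how one might reproduce the blend in code, not a description of what Tableau does internally. The helper name `blend_on_state` and the `wgu_readmission_rate` column are hypothetical.

```python
import pandas as pd

# Invented stand-ins for the two data sets.
wgu = pd.DataFrame({
    'State': ['KS', 'KS', 'WA', 'WA'],
    'ReAdmis': [True, False, False, False],
})
ratings = pd.DataFrame({
    'State': ['KS', 'KS', 'WA'],
    'Year': [2018, 2019, 2019],
    'Readmission national comparison': [
        'Above the national average',
        'Same as the national average',
        'Below the national average',
    ],
})


def blend_on_state(wgu_frame, ratings_frame):
    """Attach a per-state WGU readmission rate to every ratings row."""
    state_rates = (
        wgu_frame.groupby('State', as_index=False)['ReAdmis']
        .mean()
        .rename(columns={'ReAdmis': 'wgu_readmission_rate'})
    )
    # validate= raises if a state appears more than once on the right side.
    return ratings_frame.merge(
        state_rates, on='State', how='left', validate='many_to_one'
    )


blended = blend_on_state(wgu, ratings)
print(blended)
```

Aggregating before merging matters because a row-level join on `State` alone would pair every patient with every hospital-year record in that state and inflate the counts.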





<a id="decision-making-support"></a>
# C3: Decision-Making Support

<a id="interactice-controls"></a>
# C4: Interactive Controls

<a id="colorblindness"></a>
# C5: Colorblindness

"Color blindness occurs when you are unable to see colors in a normal way. It is also known as color deficiency. Color blindness often happens when someone cannot distinguish between certain colors. This usually happens between greens and reds, and occasionally blues." (American Academy of Ophthalmology, n.d.).

To make sure that the Tableau dashboards are accessible to people with color blindness, I checked the Tableau reference for [Color Palettes with RGB Values](https://public.tableau.com/views/TableauColors/ColorPaletteswithRGBValues?:embed=y&:showVizHome=no&:display_count=y&:display_static_image=y#1). Conveniently, Tableau provides a built-in color-blind palette, which I applied to the visualizations within Tableau to be inclusive and accessible.

<a id="data-representation"></a>
# C6: Data Representation

<a id="audience-analysis"></a>
# C7: Audience Analysis

<a id="universal-access"></a>
# C8: Universal Access

To ensure universal access with accessibility in mind, several important steps were taken:

1. Color blindness considerations: Visualizations were designed using the Tableau Color-Blind palette, accommodating users with color vision deficiencies.

2. Hosting on Tableau Public: The dashboard was created and hosted on Tableau Public, a free platform hosted by Tableau. This approach simplifies data access, as it only requires a computer, a modern browser, and a link to the Tableau dashboard.
    - Eliminates the need for technical expertise in software installation.
    - Removes the cost associated with owning the software.
    - Facilitates seamless sharing of data among stakeholders interested in this type of information.

3. Enhancements for the visually impaired: Additional features were implemented to improve accessibility for individuals with visual impairments. Alt text descriptions were provided for images, enabling screen readers and assistive technologies to convey the content effectively.

4. Panopto Video summary: To address other accessibility issues, a concise summary of the data and dashboard was presented via a Panopto Video. This format caters to individuals with different access requirements and ensures that the information is accessible to a wider audience.

By implementing these measures, we have made significant strides towards achieving universal accessibility, allowing diverse users to engage with the data and dashboard in a meaningful way.

<a id="effective-storytelling"></a>
# C9: Effective Storytelling

<a id="sources"></a>
# D: Sources

 * American Academy of Ophthalmology. (n.d.). What is color blindness? Retrieved from https://www.aao.org/eye-health/diseases/what-is-color-blindness

<a id="professional-communication"></a>
# E: Professional Communication