
# Part I: Data Question & Sources


## 1. Data Question:

Since 2020 the use of telemedicine has skyrocketed and continues to be a key method for healthcare delivery. It has become a subject of interest to the broader healthcare industry, as evidenced by increased investments from healthcare companies as well as federal and state governments. For example, the Biden administration recently announced it will invest over 770 million for “telemedicine pods” in rural areas across the U.S. - 746,000 of this will be used for TN counties. A state task force dedicated to rural health outcomes recommended an additional 66,900 should be earmarked for expanding telemedicine capabilities in rural communities.

The potential applications for telemedicine are broad and emergent. Many doctors and therapists now offer live video calls for routine issues like medication management, mental health services, or follow-up care after a hospital stay. Other applications of telehealth, such as remote patient monitoring and mobile health applications, showcase the potential of telehealth to radically transform the healthcare landscape. 

In particular, there is much opportunity in rural applications of telehealth to address lack of accessibility. TN has a high rural population, with 82% of its counties classified as rural. According to the TN Rural Health Care Task Force, rural residents are at particular risk for “poor health outcomes, including mental health challenges, obesity, and substance misuse”. 

This project focuses on using data exploration and analysis to address the question: **``How can telehealth best be used to target health challenges in TN?``**


## 2. Data Sources:

The data sources include two publicly available datasets on telehealth trends among Medicare and Medicaid recipients, respectively. 

There is also a robust dataset of various health factors and health outcomes, published yearly by County Health Rankings & Roadmaps (CHR&R).

##### 1. [Medicare Telehealth Trends](https://catalog.data.gov/dataset/medicare-telemedicine-snapshot) (medicare_trends_df)

The Medicare Telehealth Trends dataset provides information about people with Medicare who used telehealth services between January 1, 2020 and September 30, 2023. 

The data was also used to generate the [Medicare Telehealth Trends Report](https://data.cms.gov/sites/default/files/2024-03/Medicare%20Telehealth%20Trends%20Snapshot%2020240307_508.pdf).

[Data Dictionary](https://data.cms.gov/sites/default/files/2023-05/c22a72d5-1b0c-47db-9d19-5fa75df96f4e/Medicare%20Telehealth%20Trends%20Report%20Data%20Dictionary_20220906_508.pdf) 

##### 2. [Medicaid & Chip Telehealth Trends](https://data.medicaid.gov/dataset/651fa253-4dd4-4867-8725-2b5ae1dd5ce9?conditions[0][property]=state&conditions[0][value]=Tennessee&conditions[0][operator]=%3D&conditions[1][property]=dataquality&conditions[1][value]=DQ&conditions[1][operator]=%3C%3E#data-table) (medicaid_trends_df)

This data set includes monthly counts and rates (per 1,000 beneficiaries) of services provided via telehealth, including live audio video, remote patient monitoring, store and forward, and other telehealth, to Medicaid and CHIP beneficiaries, by state. Data was collected between January 2018 and December 2022. 

Note: Some states have serious data quality issues for one or more months, making the data unusable for calculating telehealth services measures...Cells with a value of “DQ” (in the DataQuality column) indicate that data were suppressed due to unusable data. Additionally, some cells have a value of “DS”. This indicates that data were suppressed for confidentiality reasons because the group included fewer than 11 beneficiaries.

There is no data dictionary for this data set.

##### 3. [County Health Rankings and Roadmaps - TN](https://www.countyhealthrankings.org/health-data/tennessee?year=2023&measure=Mental+Health+Providers) (CHRR_df)

County Health Rankings & Roadmaps (CHR&R), a program of the University of Wisconsin Population Health Institute, is dedicated to understanding why there are differences in health within and across communities...CHR&R provides a snapshot of the health of nearly every county in the nation...

This analysis looks only at the 2023 data (since the 2024 data is incomplete).

What Impacts Health? - See the [CHRR Health Model](https://www.countyhealthrankings.org/what-impacts-health/county-health-rankings-model)

[Data Dictionary](https://docs.google.com/spreadsheets/d/18rWeCagA0EANH2OibUtBEG1RMDMgdEL4drCRTT9ekWg/edit?gid=1203469383#gid=1203469383)


# Part II: Data Import & Cleaning


import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import squarify  # pip install squarify (algorithm for treemap)

## 1. Medicare Data

#Read in csv for medicare telehealth trends
medicare_trends_df = pd.read_csv('capstone_data/Medicare/US_medicare_telehealth_data_2020_to_2023.csv')
medicare_trends_df.head()

#Renamed some of the columns based on their definitions so I can remember what they mean while I'm working with them
medicare_trends_df = medicare_trends_df.rename(columns={"Total_Bene_TH_Elig": "nunique_eligible_for_telehealth", "Total_PartB_Enrl": "total_enrollees_Part_B", 'Total_Bene_Telehealth':'nunique_users_of_telehealth', 'Pct_Telehealth':'pct_actual_users_out_of_eligible_users'})

#Multiply the pct column by 100 and round it, for easier readability
medicare_trends_df['pct_actual_users_out_of_eligible_users'] = round((medicare_trends_df['pct_actual_users_out_of_eligible_users']*100),2)

#Add a new column called "year-quarter" to help break out changes by quarter per year
medicare_trends_df['year-quarter'] = medicare_trends_df['Year'].astype(str) + "-" + medicare_trends_df['quarter'].astype(str) 
medicare_trends_df.head()



## 2. Medicaid & CHIP Data

medicaid_trends_df = pd.read_csv('capstone_data/Medicaid/US_Telehealth-Services-Provided-to-the-MedicaidCHIP-Population.csv')

medicaid_trends_df.head()

#First we need to filter out "unusable" data
medicaid_trends_df = medicaid_trends_df[medicaid_trends_df['DataQuality']!='Unusable']
medicaid_trends_df = medicaid_trends_df[medicaid_trends_df['ServiceCount']!=' DS ']

#Then we need to change the service counts to summable data types

#Let's remove commas
medicaid_trends_df = medicaid_trends_df.replace(',','', regex=True)

#Then convert to numeric data type, coerce non-numbers to NaN
medicaid_trends_df['ServiceCount'] = pd.to_numeric(medicaid_trends_df['ServiceCount'], errors='coerce')
medicaid_trends_df['RatePer1000Beneficiaries'] = pd.to_numeric(medicaid_trends_df['RatePer1000Beneficiaries'], errors='coerce')
medicaid_trends_df.head()
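
The cleaning steps above can also be wrapped in a small helper. This is only a sketch on made-up rows: `clean_counts` and the `toy` frame are hypothetical names, and unlike the cell above it keeps the suppressed "DS" rows as NaN instead of dropping them, so the number of suppressed cells can still be counted.

```python
import pandas as pd

# Toy rows standing in for the Medicaid/CHIP file (all values are made up)
toy = pd.DataFrame({
    "ServiceCount": ["1,234", " DS ", "56", "7,890"],
    "DataQuality": ["", "", "Unusable", ""],
})

def clean_counts(df, count_col="ServiceCount", quality_col="DataQuality"):
    """Drop unusable rows, then convert count strings to numbers (suppressed -> NaN)."""
    out = df[df[quality_col] != "Unusable"].copy()
    # Remove thousands separators and surrounding spaces before converting
    stripped = out[count_col].astype(str).str.replace(",", "", regex=False).str.strip()
    out[count_col] = pd.to_numeric(stripped, errors="coerce")
    return out

cleaned = clean_counts(toy)
print(cleaned)
print("Suppressed cells:", cleaned["ServiceCount"].isna().sum())
```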

## 3. TN County Health Rankings & Roadmaps

In order to get workable data, I downloaded the CHRR "2023 Tennessee Data" [available here](https://www.countyhealthrankings.org/health-data/tennessee/data-and-resources) 

- I chose the 2023 data set because I want the most recent (but also complete) data set. 
- I then [opened the data in google sheets](https://docs.google.com/spreadsheets/d/1KXcrV9TsnZVKNOwzPdFz0CMRN-HucbbreOU3uATFW34/edit?gid=1441039258#gid=1441039258) 
- I separated out and saved the first sheet called "Introduction", which is basically the [data dictionary](https://docs.google.com/spreadsheets/d/18rWeCagA0EANH2OibUtBEG1RMDMgdEL4drCRTT9ekWg/edit?gid=1203469383#gid=1203469383)
- I combined the two sheets called "Ranked Measure Data" and "Additional Measure Data" into a new workbook
- I removed the top row because it contained merged columns
- I removed all the confidence interval columns since I won't be using them
- I also renamed the z-score columns so none have an identical name
- I kept only columns that may be related to my analysis 
- Finally I [downloaded the result](https://docs.google.com/spreadsheets/d/1HbkNcmNVu_r4-1v-McU214BBaMfUkkywQ251ns_FDjY/edit?gid=1115464274#gid=1115464274) into an excel sheet which I read in below (the download and the original are both also in the capstone data folder)

I could have also merged the two worksheets on FIPS, and dropped all the non-important columns (but using google sheets was faster in this instance)
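
For reference, the pandas alternative mentioned above might look roughly like the sketch below. The two toy frames stand in for the "Ranked Measure Data" and "Additional Measure Data" sheets; the numeric values and the `Unused Column` name are made up, and `keep` is a hypothetical list of the columns to retain.

```python
import pandas as pd

# Toy stand-ins for the two CHRR worksheets (measure values are made up)
ranked = pd.DataFrame({
    "FIPS": [47001, 47003],
    "County": ["Anderson", "Bedford"],
    "Mental Health Provider Rate": [150.0, 80.0],
    "Unused Column": [1, 2],
})
additional = pd.DataFrame({
    "FIPS": [47001, 47003],
    "County": ["Anderson", "Bedford"],
    "% Rural": [34.7, 55.5],
})

# Columns to keep after the merge (hypothetical selection)
keep = ["FIPS", "County", "Mental Health Provider Rate", "% Rural"]

# validate="one_to_one" raises an error if a county appears twice in either sheet
merged = ranked.merge(
    additional, on=["FIPS", "County"], how="inner", validate="one_to_one"
)[keep]
print(merged)
```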

CHRR_df_all = pd.read_excel('capstone_data/CHRR/2023 TN CHRR Data.xlsx')

#There are a lot of columns so I set the display to show all of them
pd.set_option('display.max_columns', None)

#Will also remove the "All" County since this will create outliers in the analysis
CHRR_df = CHRR_df_all[CHRR_df_all['County']!='All']
CHRR_df

# Part III: Data Exploration & Visualizations

## 1. Initial Exploration

### A. Basic Aggregation Stats

#Use .describe() to get some basic aggregated stats for the data set. 
medicare_stats = round(medicare_trends_df.describe(),2)
medicaid_stats = round(medicaid_trends_df.describe(),2)

#Medicare Stats - telehealth benefit recipients 2020-2023:
medicare_stats.drop(['Year'], axis=1)

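#Medicaid Stats - Telehealth services delivered from 2018-2022:
#errors='ignore' skips any listed column that .describe() left out because it is non-numeric
medicaid_stats.drop(['Year','Month','DataQuality'], axis=1, errors='ignore')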

***Note:*** *The dates of the two datasets do not completely align. Only Jan 2020 - Dec 2022 are overlapping*
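
If the two datasets are ever compared side by side, one option is to restrict both to the overlapping years first. A minimal sketch, assuming the `Year` column is numeric in both files; `overlap_years` and the toy frames are hypothetical.

```python
import pandas as pd

# Toy frames with the year ranges of the two datasets (values are made up)
medicare_toy = pd.DataFrame({"Year": [2020, 2021, 2022, 2023], "value": [1, 2, 3, 4]})
medicaid_toy = pd.DataFrame({"Year": [2018, 2019, 2020, 2021, 2022], "value": [5, 6, 7, 8, 9]})

def overlap_years(df, first=2020, last=2022, year_col="Year"):
    """Keep only rows whose year falls inside the shared window (inclusive)."""
    return df[df[year_col].between(first, last)]

medicare_overlap = overlap_years(medicare_toy)
medicaid_overlap = overlap_years(medicaid_toy)
print(medicare_overlap["Year"].tolist(), medicaid_overlap["Year"].tolist())
```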

### B. Creating Scalable Filter Functions

In order to chart some of the string variables, we need to create a way to filter them at scale

#Define a function that filters any df by specific value(s) in a column (the variable)
#Note: cannot filter this way using operators, more useful for str variables 
def quick_filter(df, variable, *values):
    return df[df[variable].isin(values)]

#test:
#quick_filter(medicare_trends_df,'Bene_Mdcd_Mdcr_Enrl_Stus', 'Medicare Only')
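
Since `quick_filter` only matches exact values, a companion helper could cover comparisons such as `!=` or `>`. This is a sketch, not part of the original analysis: `filter_by_operator` is a hypothetical name and the toy data is made up. It relies on the standard-library `operator` functions, which work element-wise on a pandas Series.

```python
import operator
import pandas as pd

def filter_by_operator(df, variable, op, value):
    """Keep rows where op(df[variable], value) is True, e.g. op=operator.ne for '!='."""
    return df[op(df[variable], value)]

# Toy data (made up) to show the two most common cases
toy = pd.DataFrame({"quarter": ["1", "2", "Overall"], "pct": [10.0, 30.0, 20.0]})

no_overall = filter_by_operator(toy, "quarter", operator.ne, "Overall")  # quarter != 'Overall'
high = filter_by_operator(toy, "pct", operator.gt, 15)                   # pct > 15
print(no_overall)
print(high)
```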

##### Variables Defined Here: 

#List and define the str variables:

#Medicare variables
year = 'Year'
quarter = 'quarter'
state = 'Bene_Geo_Desc'
enrollment_status = 'Bene_Mdcd_Mdcr_Enrl_Stus'
race = 'Bene_Race_Desc'
sex = 'Bene_Sex_Desc'
entitlement = 'Bene_Mdcr_Entlmt_Stus'
age = 'Bene_Age_Desc'
RUCA = 'Bene_RUCA_Desc'


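#Medicaid variables:
#(prefixed so they don't overwrite the Medicare 'state' and 'year' variables defined above)
medicaid_state = 'State'
medicaid_year = 'Year'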
month = 'Month'
Type = 'TelehealthType'
Count = 'ServiceCount'
rate = 'RatePer1000Beneficiaries'

#For easier reference when choosing the variables, here are the unique values in each column:
for series_name, series in medicare_trends_df.items():
    print(series_name)
    print(series.unique())
    
#Also the base dataframe info:
#medicaid_trends_df.info()

#quick ref

##### Filters Defined Here:

#Define the value(s) to filter on
#For example, if we want to filter by a specific state, change this value to the state you want 
#Then use the 'state' variable in the quick_filter function 

value = 'Tennessee'
filtered_data = quick_filter(medicare_trends_df,state,value)

#Copy/paste as needed in the visualization section below
#Note: this doesn't work when needing to filter by operators (like !=, <, >, etc.)

#Can optionally use this for loop to filter most variables by "All" in the medicare data
all_var = filtered_data
for column in filtered_data.iloc[:,3:9]:
    all_var = quick_filter(all_var,column,'All') 

filtered_data = all_var

### C. Year over Year National Trends (Medicare and Medicaid)

#### Medicare Telehealth Usage Year over Year

#Quick visualization of overall telehealth trends across the data set (National) from 2020-2023
#Filter df to all states 
value = 'National'
filtered_data = quick_filter(medicare_trends_df,state,value)

#Remove the 'Overall' yearly data and look at each quarter
data_by_quarter = filtered_data[filtered_data['quarter']!='Overall']
data_by_quarter

#Plot the chart in sns
# set plot style: grey grid in the background:
sns.set_theme(style="whitegrid")

# Set the figure size
plt.figure(figsize=(10, 7))

# plot a line chart
sns.lineplot(
    x="year-quarter",
    y="pct_actual_users_out_of_eligible_users",
    data=data_by_quarter, 
    color='tab:blue', 
    ci=None);

#Add markers for year
plt.axvline(linewidth=2, linestyle='-.', color='black', x='2020-1')
plt.axvline(linewidth=2, linestyle='-.', color='black', x='2021-1')
plt.axvline(linewidth=2, linestyle='-.', color='black', x='2022-1')
plt.axvline(linewidth=2, linestyle='-.', color='black', x='2023-1')


#labels
plt.xticks(rotation=45)
plt.ylabel('% Telehealth Usage')
plt.xlabel('Year-Quarter')
plt.title('Medicare Part B - National Telehealth Trends by Year & Quarter')

#Observation: Overall telehealth usage is declining year over year; why?

#### Medicaid Service Type Usage Year over Year

#Looking at number of users by telehealth type (year over year trends)
#Note these are national trends

service_counts_by_type_by_year = medicaid_trends_df.groupby(['TelehealthType','Year','Month'])['ServiceCount'].sum().reset_index(name='sum').sort_values(by='TelehealthType', ascending=True)
service_counts_by_type_by_year
#I saved this data as a csv so I can make a good chart in Excel, 
#service_counts_by_type_by_year.to_csv('service_counts_by_type_by_year.csv')
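
As an alternative to charting in Excel, the same comparison could be set up in pandas by pivoting the counts so each telehealth type becomes its own column. The sketch below uses made-up yearly totals and assumes a type label of "Live audio/video"; the actual labels in `TelehealthType` may differ.

```python
import pandas as pd

# Toy yearly totals (values and type labels are made up for illustration)
toy = pd.DataFrame({
    "TelehealthType": ["Live audio/video", "Live audio/video",
                       "Store and forward", "Store and forward"],
    "Year": [2020, 2021, 2020, 2021],
    "ServiceCount": [100.0, 150.0, 5.0, 9.0],
})

# One row per year, one column per telehealth type
yearly = toy.pivot_table(index="Year", columns="TelehealthType",
                         values="ServiceCount", aggfunc="sum")

# Drop the dominant type so the smaller ones are visible on the same scale
without_live = yearly.drop(columns="Live audio/video")
print(yearly)
print(without_live)
```

Calling `yearly.plot()` or `without_live.plot()` would then draw one line per type.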


#the excel chart reveals that live audio/video is by far the most used telehealth type
#by filtering that value out we can also see upward trends for store & forward and remote patient monitoring


## 2. TN Medicare Demographic Trends
Telehealth trends are broken down by:
- Race
- Age
- Gender
- Eligibility
- Rural vs. Urban

#### Note: There are no demographic data about Medicaid telehealth users, so I will only be looking at Medicare for now

### A. Race

#### Enrollees per race

medicare_trends_df

#Filter by TN, remove "all" 
value = 'Tennessee'
filtered_data = quick_filter(medicare_trends_df,state,value)
filtered_data = filtered_data[filtered_data['Bene_Race_Desc']!='All']
filtered_data = filtered_data[filtered_data['Bene_Race_Desc']!='Other/Unknown']

#Find total enrollee percentage by race
medicare_enrollees_by_race = filtered_data.groupby(['Bene_Race_Desc'])['total_enrollees_Part_B'].sum().reset_index(name='Total Enrollees').sort_values(by='Bene_Race_Desc', ascending=False)
medicare_enrollees_by_race['Total Enrollees'] = round(medicare_enrollees_by_race['Total Enrollees'],2)
medicare_enrollees_by_race['Percentage'] = round(((medicare_enrollees_by_race['Total Enrollees'] / 11657578.25001404)*100),2)
medicare_enrollees_by_race
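
The percentage above divides by a hard-coded total. One alternative is to derive the denominator from the grouped totals themselves, sketched below on made-up numbers with placeholder group labels. Note this gives each group's share of the rows that remain after filtering, so the results would differ from the hard-coded version if that total includes categories that were filtered out.

```python
import pandas as pd

# Toy enrollee rows (group labels and counts are made up)
toy = pd.DataFrame({
    "Bene_Race_Desc": ["Group A", "Group A", "Group B", "Group C"],
    "total_enrollees_Part_B": [600.0, 200.0, 150.0, 50.0],
})

totals = (
    toy.groupby("Bene_Race_Desc")["total_enrollees_Part_B"]
    .sum()
    .reset_index(name="Total Enrollees")
)

# Share of each group out of the summed totals, instead of a fixed number
totals["Percentage"] = (totals["Total Enrollees"] / totals["Total Enrollees"].sum() * 100).round(2)
print(totals)
```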

#medicare_enrollees_by_race = medicare_enrollees_by_race[medicare_enrollees_by_race['Bene_Race_Desc']!='All']

# plot it
ax = squarify.plot(sizes=medicare_enrollees_by_race['Total Enrollees'], color = ['#91DCEA', '#5FBB68',
          '#F9D23C', '#FD6F30'], alpha=.8, ec = 'black')
plt.axis('off')
plt.legend(labels=medicare_enrollees_by_race['Bene_Race_Desc'])
plt.show()

medicare_enrollees_by_race = medicare_enrollees_by_race[medicare_enrollees_by_race['Bene_Race_Desc']!='All']

#define data
height = medicare_enrollees_by_race['Percentage']
bars = medicare_enrollees_by_race['Bene_Race_Desc']
y_pos = np.arange(len(bars))

# Create bars
#plt.bar(bars, height)
plt.barh(bars, width=height)
#plt.xticks(rotation=20)


# Create names on the x-axis and labels
plt.xlabel('% Enrollees')
plt.ylabel('Race')
plt.title('TN Enrollees by Race')

# Show graphic
plt.show()

# doughnut chart example
names = medicare_enrollees_by_race['Bene_Race_Desc']
size = medicare_enrollees_by_race['Total Enrollees']

# Create a circle at the center of the plot
my_circle = plt.Circle( (0,0), 0.7, color='white')

# Give color names
#plt.pie(size,autopct='%1.00f%%',colors=['tab:blue','tab:orange','tab:green','tab:red','tab:purple']) #with values
plt.pie(size,colors=['tab:blue','tab:orange','tab:green','tab:red','tab:purple']) #without values
p = plt.gcf()
p.gca().add_artist(my_circle)


# Show the graph
plt.title('Enrollees by Race')
plt.legend(title = 'Race', labels=names, bbox_to_anchor=(-.05, 1))
plt.show()

#### Percentage telehealth users per race

#Filter by TN, remove "all" and 'other' (there are no values for "other" recorded)
value = 'Tennessee'
filtered_data = quick_filter(medicare_trends_df,state,value)
filtered_data = filtered_data[filtered_data['Bene_Race_Desc']!='All']
#.copy() so that adding a column below doesn't trigger a SettingWithCopyWarning
filtered_data = filtered_data[filtered_data['Bene_Race_Desc']!='Other/Unknown'].copy()
filtered_data['pct_enrollees'] = round(((filtered_data['total_enrollees_Part_B'] / 11657578.25001404)*100),2)

#Group telehealth usage trends by race
medicare_trends_by_race = filtered_data.groupby(['Bene_Race_Desc'])['pct_actual_users_out_of_eligible_users'].mean().reset_index(name='avg_pct_users').sort_values(by='Bene_Race_Desc', ascending=False)
medicare_trends_by_race['avg_pct_users'] = round(medicare_trends_by_race['avg_pct_users'],2)
medicare_trends_by_race

#define data
height = medicare_trends_by_race['avg_pct_users']
bars = medicare_trends_by_race['Bene_Race_Desc']
y_pos = np.arange(len(bars))

# Create bars
#plt.bar(y_pos, height)
plt.barh(bars, width=height, color=['tab:blue','tab:orange','tab:green','tab:red','tab:purple'])
#plt.gca().set_yticklabels([])

# Create names on the x-axis and labels
plt.xlabel('Avg % Utilization')
#plt.ylabel('Race')
plt.title('Average Telehealth Usage Rates per Race')

# Show graphic
plt.show()

#Filter by TN, remove "all" and 'other' (there are no values for "other" recorded)
value = 'Tennessee'
filtered_data = quick_filter(medicare_trends_df,state,value)
filtered_data = filtered_data[filtered_data['Bene_Race_Desc']!='All']
filtered_data = filtered_data[filtered_data['Bene_Race_Desc']!='Other/Unknown']

#Compare: total_enrollees_Part_B, nunique_eligible_for_telehealth, nunique_users_of_telehealth

#Group by race
medicare_trends_by_race = filtered_data.groupby(['Bene_Race_Desc'])[['total_enrollees_Part_B', 'nunique_eligible_for_telehealth', 'nunique_users_of_telehealth']].sum().reset_index().sort_values(by='Bene_Race_Desc', ascending=False)

#round(filtered_data['pct_actual_users_out_of_eligible_users'],2)
#df1.merge(df2, how='inner', on='a')
medicare_trends_by_race.head()

### B. Age

#Filter by TN, remove "all" 
value = 'Tennessee'
filtered_data = quick_filter(medicare_trends_df,state,value)
filtered_data = filtered_data[filtered_data['Bene_Age_Desc']!='All']

#Find total enrollee percentage by age
medicare_enrollees_by_age = filtered_data.groupby(['Bene_Age_Desc'])['total_enrollees_Part_B'].sum().reset_index(name='Total Enrollees').sort_values(by='Bene_Age_Desc', ascending=True)
medicare_enrollees_by_age['Total Enrollees'] = round(medicare_enrollees_by_age['Total Enrollees'],2)
medicare_enrollees_by_age['Percentage'] = round(((medicare_enrollees_by_age['Total Enrollees'] / 11657578.25)*100),2)
medicare_enrollees_by_age

# donut chart
names = medicare_enrollees_by_age['Bene_Age_Desc']
size = medicare_enrollees_by_age['Total Enrollees']
 
# Create a circle at the center of the plot
my_circle = plt.Circle( (0,0), 0.7, color='white')

# Give color names
plt.pie(size,autopct='%1.00f%%',colors=['tab:blue','tab:orange','tab:green','tab:red','tab:purple']) #with values
#plt.pie(size) #without values
p = plt.gcf()
p.gca().add_artist(my_circle)

# Show the graph
plt.title('Enrollees by Age')
plt.legend(title = 'Age', labels=names, bbox_to_anchor=(-.05, 1))
plt.show()

#Filter by TN, remove "all" 
value = 'Tennessee'
filtered_data = quick_filter(medicare_trends_df,state,value)
filtered_data = filtered_data[filtered_data['Bene_Age_Desc']!='All']

#Group percentage of users by age
medicare_trends_by_age = filtered_data.groupby(['Bene_Age_Desc'])['pct_actual_users_out_of_eligible_users'].mean().reset_index(name='average_pct').sort_values(by='Bene_Age_Desc', ascending=True)
medicare_trends_by_age['average_pct'] = round(medicare_trends_by_age['average_pct'],2)
medicare_trends_by_age

#Note that even though most Medicare recipients are elderly/retired, the highest average telehealth usage rate is in the under-65 age group

#define data
height = medicare_trends_by_age['average_pct']
bars = medicare_trends_by_age['Bene_Age_Desc']
y_pos = np.arange(len(bars))

# Create bars
#plt.bar(y_pos, height)
plt.barh(bars, width=height, color=['tab:blue','tab:orange','tab:green','tab:red','tab:purple'])
#plt.gca().set_yticklabels([])

# Create names on the x-axis and labels
plt.xlabel('Avg % Utilization')
#plt.ylabel('Race')
plt.title('Average Telehealth Usage Rates per Age')

# Show graphic
plt.show()


### C. Gender

#Filter by TN, remove "all" 
value = 'Tennessee'
filtered_data = quick_filter(medicare_trends_df,state,value)
filtered_data = filtered_data[filtered_data['Bene_Sex_Desc']!='All']

#Find total enrollee percentage by gender
medicare_enrollees_by_gender = filtered_data.groupby(['Bene_Sex_Desc'])['total_enrollees_Part_B'].sum().reset_index(name='Total Enrollees').sort_values(by='Bene_Sex_Desc', ascending=True)
medicare_enrollees_by_gender['Total Enrollees'] = round(medicare_enrollees_by_gender['Total Enrollees'],2)
medicare_enrollees_by_gender['Percentage'] = round(((medicare_enrollees_by_gender['Total Enrollees'] / 11657578.25)*100),2)
medicare_enrollees_by_gender

# doughnut chart example
names = medicare_enrollees_by_gender['Bene_Sex_Desc']
size = medicare_enrollees_by_gender['Total Enrollees']

# Create a circle at the center of the plot
my_circle = plt.Circle( (0,0), 0.7, color='white')

# Give color names
plt.pie(size,autopct='%1.00f%%',colors=['tab:pink','tab:blue']) #with values
#plt.pie(size) #without values
p = plt.gcf()
p.gca().add_artist(my_circle)

# Show the graph
plt.title('Enrollees by Gender')
plt.legend(title = 'Gender', labels=names, bbox_to_anchor=(-.05, 1))
plt.show()

#Filter by TN, remove "all"
value = 'Tennessee'
filtered_data = quick_filter(medicare_trends_df,state,value)
filtered_data = filtered_data[filtered_data['Bene_Sex_Desc']!='All']

#Group by gender
medicare_trends_by_gender = filtered_data.groupby(['Bene_Sex_Desc'])['pct_actual_users_out_of_eligible_users'].mean().reset_index(name='average_pct').sort_values(by='average_pct', ascending=False)
medicare_trends_by_gender['average_pct'] = round(medicare_trends_by_gender['average_pct'],2)
medicare_trends_by_gender

#slightly more participation from female medicare recipients

plt.bar(medicare_trends_by_gender['Bene_Sex_Desc'], medicare_trends_by_gender['average_pct'],color=['tab:pink','tab:blue'])
# Create names on the x-axis and labels
plt.ylabel('Avg % Utilization')
#plt.xlabel('Gender')
plt.title('Average Telehealth Usage Rates per Gender')

# Show graphic
plt.show()


### D. Eligibility

medicare_trends_df.Bene_Mdcr_Entlmt_Stus.unique()

#Filter by TN, remove "All" ("Unknown" has no recorded values, so it is not filtered here)
value = 'Tennessee'
filtered_data = quick_filter(medicare_trends_df,state,value)
filtered_data = filtered_data[filtered_data['Bene_Mdcr_Entlmt_Stus']!='All']

#Find total enrollee percentage by entitlement
medicare_enrollees_by_entitlement = filtered_data.groupby(['Bene_Mdcr_Entlmt_Stus'])['total_enrollees_Part_B'].sum().reset_index(name='Total Enrollees').sort_values(by='Bene_Mdcr_Entlmt_Stus', ascending=True)
medicare_enrollees_by_entitlement['Total Enrollees'] = round(medicare_enrollees_by_entitlement['Total Enrollees'],2)
medicare_enrollees_by_entitlement['Percentage'] = round(((medicare_enrollees_by_entitlement['Total Enrollees'] / 11657578.25)*100),2)
medicare_enrollees_by_entitlement

# doughnut chart example
names = medicare_enrollees_by_entitlement['Bene_Mdcr_Entlmt_Stus']
size = medicare_enrollees_by_entitlement['Total Enrollees']

# Create a circle at the center of the plot
my_circle = plt.Circle( (0,0), 0.7, color='white')

# Give color names
plt.pie(size,autopct='%1.00f%%',pctdistance=.55, colors=['tab:green','tab:blue','tab:orange']) #with values
#plt.pie(size) #without values
p = plt.gcf()
p.gca().add_artist(my_circle)

# Show the graph
plt.title('Enrollees by Entitlement')
plt.legend(title = 'Entitlement', labels=names, bbox_to_anchor=(-.05, 1))
plt.show()

#Find percentage telehealth users by Entitlement
medicare_trends_by_entitlement = filtered_data.groupby(['Bene_Mdcr_Entlmt_Stus'])['pct_actual_users_out_of_eligible_users'].mean().reset_index(name='average_pct').sort_values(by='Bene_Mdcr_Entlmt_Stus', ascending=True)
medicare_trends_by_entitlement['average_pct'] = round(medicare_trends_by_entitlement['average_pct'],2)
medicare_trends_by_entitlement

plt.bar(medicare_trends_by_entitlement['Bene_Mdcr_Entlmt_Stus'], medicare_trends_by_entitlement['average_pct'],color=['tab:green','tab:blue','tab:orange'])
# Create names on the x-axis and labels
plt.ylabel('Avg % Utilization')
#plt.xlabel('Gender')
plt.title('Average Telehealth Usage Rates per Entitlement')

# Show graphic
plt.show()

### E. Rural vs. Urban

metric = 'Bene_RUCA_Desc'
name = 'RUCA Designation'

#Filter by TN, remove "All" "Other/Unknown" or "Unknown" values as needed
value = 'Tennessee'
filtered_data = quick_filter(medicare_trends_df,state,value)
filtered_data = filtered_data[filtered_data[metric]!='All']
filtered_data = filtered_data[filtered_data[metric]!='Unknown']


#Find total enrollee percentage by RUCA Designation
medicare_enrollees_by_RUCA_Designation = filtered_data.groupby([metric])['total_enrollees_Part_B'].sum().reset_index(name='Total Enrollees').sort_values(by=metric, ascending=True)
medicare_enrollees_by_RUCA_Designation['Total Enrollees'] = round(medicare_enrollees_by_RUCA_Designation['Total Enrollees'],2)
medicare_enrollees_by_RUCA_Designation['Percentage'] = round(((medicare_enrollees_by_RUCA_Designation['Total Enrollees'] / 11657578.25)*100),2)
medicare_enrollees_by_RUCA_Designation

#donut chart 
names = medicare_enrollees_by_RUCA_Designation['Bene_RUCA_Desc']
size = medicare_enrollees_by_RUCA_Designation['Total Enrollees']

# Create a circle at the center of the plot
my_circle = plt.Circle( (0,0), 0.7, color='white')

# Give color names
plt.pie(size,autopct='%1.00f%%',colors=['tab:red','tab:purple']) #with values
#plt.pie(size) #without values
p = plt.gcf()
p.gca().add_artist(my_circle)

# Show the graph
plt.title('Enrollees by RUCA Designation')
plt.legend(title = 'RUCA Designation', labels=names, bbox_to_anchor=(-.05, 1))
plt.show()

#Find percentage telehealth users by RUCA Designation
medicare_trends_by_RUCA_Designation = filtered_data.groupby(['Bene_RUCA_Desc'])['pct_actual_users_out_of_eligible_users'].mean().reset_index(name='average_pct').sort_values(by='Bene_RUCA_Desc', ascending=True)
medicare_trends_by_RUCA_Designation['average_pct'] = round(medicare_trends_by_RUCA_Designation['average_pct'],2)
medicare_trends_by_RUCA_Designation

plt.bar(medicare_trends_by_RUCA_Designation['Bene_RUCA_Desc'], medicare_trends_by_RUCA_Designation['average_pct'],color=['tab:red','tab:purple'])
# Create names on the x-axis and labels
plt.ylabel('Avg % Utilization')
#plt.xlabel('Gender')
plt.title('Average Telehealth Usage Rates per RUCA Designation')

# Show graphic
plt.show()



#### Medicare Demographic trends in violin chart form: 

#Change these according to the column you want to chart:
metric = 'Bene_RUCA_Desc'
name = 'RUCA Designation'

#Filter by TN, remove "all" and "Unknown" (there are no recorded values for "unknown")
value = 'Tennessee'
filtered_data = quick_filter(medicare_trends_df,state,value)
filtered_data = filtered_data[filtered_data[metric]!='All']
filtered_data = filtered_data[filtered_data[metric]!='Unknown']
filtered_data = filtered_data.sort_values(by=metric,ascending=True)

#Set colors and theme 
sns.set_theme(style="whitegrid")
colors=['tab:red','tab:purple','tab:orange']

#Plot violin chart
sns.violinplot(x=filtered_data['pct_actual_users_out_of_eligible_users'],y=filtered_data[metric], palette=colors)
plt.xlabel('% Users of Telehealth')
plt.ylabel('')
plt.title(f'Distribution of Telehealth Usage by {name}')
plt.show()
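
The demographic sections above all repeat the same filter, group, average, and round steps. A helper along these lines could reduce the copy/paste. This is only a sketch: `usage_by_group` is a hypothetical name, the toy rows are made up, and the default `exclude` tuple is an assumption (the sections above do not all drop the same labels).

```python
import pandas as pd

def usage_by_group(df, metric,
                   pct_col="pct_actual_users_out_of_eligible_users",
                   exclude=("All", "Unknown", "Other/Unknown")):
    """Average telehealth usage rate per category of `metric`, skipping excluded labels."""
    kept = df[~df[metric].isin(exclude)]
    return (
        kept.groupby(metric)[pct_col]
        .mean()
        .round(2)
        .reset_index(name="average_pct")
        .sort_values(by=metric)
    )

# Toy rows (made up) shaped like the TN-filtered Medicare data
toy = pd.DataFrame({
    "Bene_RUCA_Desc": ["Rural", "Rural", "Urban", "Urban", "All", "Unknown"],
    "pct_actual_users_out_of_eligible_users": [10.0, 20.0, 30.0, 40.0, 25.0, 0.0],
})

summary = usage_by_group(toy, "Bene_RUCA_Desc")
print(summary)
```

The same call with `'Bene_Age_Desc'` or `'Bene_Sex_Desc'` as `metric` would cover the other sections.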

## 3. TN County Health Trends

According to the TN Rural Health Care Task Force, rural residents are at particular risk for “poor health outcomes, including mental health challenges, obesity, and substance misuse”. 

78 of Tennessee's 95 counties are classified as rural (82%)

The focus of the analysis shall be on two risk factors: Mental Health and Obesity

Method: Analyze the correlation between healthcare accessibility factors and the various health factors within each topic (mental health and obesity). Where trends emerge, look for overlapping counties with multiple high risk variables
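
One way the "overlapping counties" step could be sketched is to flag each county that falls in the worst quartile for a variable and then count the flags per county. Everything below is an assumption for illustration: the county labels and values are made up, the quartile cutoffs are arbitrary, and which variables count as "higher is worse" is my own grouping.

```python
import pandas as pd

# Toy county data (made up); column names follow the CHRR file
toy = pd.DataFrame({
    "County": ["A", "B", "C", "D"],
    "% Frequent Mental Distress": [15.0, 20.0, 22.0, 16.0],
    "% Disconnected Youth": [5.0, 12.0, 14.0, 6.0],
    "Mental Health Provider Rate": [200.0, 40.0, 180.0, 30.0],
})

higher_is_worse = ["% Frequent Mental Distress", "% Disconnected Youth"]
lower_is_worse = ["Mental Health Provider Rate"]

flags = pd.DataFrame({"County": toy["County"]})
for col in higher_is_worse:
    # Flag counties at or above the 75th percentile
    flags[col] = toy[col] >= toy[col].quantile(0.75)
for col in lower_is_worse:
    # Flag counties at or below the 25th percentile
    flags[col] = toy[col] <= toy[col].quantile(0.25)

# Number of high-risk flags per county
flags["risk_count"] = flags[higher_is_worse + lower_is_worse].sum(axis=1)
overlapping = flags[flags["risk_count"] >= 2]
print(overlapping)
```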

### Correlations Between Mental Health Variables and Demographics

The following variables in the CHRR dataframe are related to mental health outcomes: 

Clinical Care: 
- Mental Health Providers

Quality of Life/Health Factors:
- Poor Mental Health Days
- Frequent Mental Distress*
- Insufficient Sleep*

Health Factors:
- Drug Overdose Deaths*
- Alcohol-Impaired Driving Deaths
- Excessive Drinking

Social & Economic Factors:
- Disconnected Youth*
- Unemployment
- Suicides*

CHRR_df.head()

#First I'll create a basic dataframe with all the demographics and other data, so I can reuse and reorder the variables:
a = 'FIPS'
b = 'County'
c = 'Life Expectancy'
d = 'Population'
e = '% Less than 18 Years of Age'
f = '% 65 and Over'
g = '% Black'
h = '% American Indian or Alaska Native'
i = '% Asian'
j = '% Native Hawaiian or Other Pacific Islander'
k = '% Hispanic'
l = '% Non-Hispanic White'
m = '% Not Proficient in English'
n = '% Female'
o = '% Rural'
p = '% Households with Broadband Access'
q = '% Children in Poverty'
r = '% Children in Single-Parent Households'
s = '% Uninsured Children'

demographics = CHRR_df[[a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s]]
#demographics.head()

#Then I want to create another df with the mental health variables, plus FIPS and County so I can merge later
#variable columns: 
var1 = 'Mental Health Provider Rate'
var2 = 'Average Number of Mentally Unhealthy Days'
var3 = '% Frequent Mental Distress'
var4 = '% Insufficient Sleep'
var5 = 'Drug Overdose Mortality Rate'
var6 = '% Driving Deaths with Alcohol Involvement'
var7 = '% Excessive Drinking'
var8 = '% Disconnected Youth'
var9 = '% Unemployed'
var10 = 'Suicide Rate (Age-Adjusted)'

mental_health_variables = CHRR_df[[a,b,var1,var2,var3,var4,var5,var6,var7,var8,var9,var10]]

#Merge demographics with mental health variables
mental_health_df = demographics.merge(mental_health_variables, on=[a, b])

#drop first two columns so that there are only numeric data for correlation analysis
mental_health_correlations = mental_health_df.drop([a, b], axis=1)

#use .corr() to examine if there are any relationships between the mental health variables and demographics
mental_health_correlations = mental_health_correlations.corr()
#mental_health_correlations

#I saved the results to a csv and used google sheets to apply conditional formatting and highlight correlations
#mental_health_correlations.to_csv('mental_health_correlations.csv')

mental_health_correlations

##### Strong Positive Correlations (over .7)
- Average Number of Mentally Unhealthy Days and % Frequent Mental Distress (0.7893453542)
- % Disconnected Youth and % Frequent Mental Distress (0.7420026506)
- Population and % Asian (0.7478482493)
- % Hispanic and % Not Proficient in English (0.7017949064)

##### Slight Positive Correlations (.6 to .69)
- Average Number of Mentally Unhealthy Days and % Disconnected Youth (0.6838628183)
- Life Expectancy and % Asian (0.6339640993)
- % Disconnected Youth and % Rural (0.6803646905)
- % Asian and % Households with Broadband Access (0.6027683762)

##### Strong Negative Correlations (under -.7)
- % Non-Hispanic White and % Black (-0.9629489514) 
- % Households with Broadband Access and % Disconnected Youth (-0.7732368548)
- % Asian and % Frequent Mental Distress (-0.7555997235)
- % Households with Broadband Access and % Frequent Mental Distress (-0.7064048201)
- Life Expectancy and % Frequent Mental Distress (-0.7062932683)


##### Slight Negative Correlations (-.6 to -.69)
- % Insufficient Sleep and % Excessive Drinking (-0.6999499361)
- Life Expectancy and % Disconnected Youth (-0.6950208498)
- % Rural and % Households with Broadband Access (-0.6641905053)
- Average Number of Mentally Unhealthy Days and % Asian (-0.6503108963)
- Life Expectancy and Average Number of Mentally Unhealthy Days (-0.6285612055)
- Average Number of Mentally Unhealthy Days and % Households with Broadband Access (-0.6251370564)
- % Disconnected Youth and % Asian (-0.6177181997)
- Mental Health Provider Rate and % Rural (-0.6038839708)

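*Moderate negative correlation (-0.60) between % Rural and Mental Health Provider Rate: counties with a higher percentage of rural population tend to have fewer mental health providers relative to population*

*Moderate positive correlation (0.68) between % Rural and % Disconnected Youth: more rural counties tend to have higher percentages of "disconnected youths"*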
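
The buckets above were built by exporting the matrix to a CSV and highlighting cells by hand in Google Sheets. As a rough alternative, a similar listing could be generated directly from the output of `.corr()`. The sketch below is not the method used in this notebook: `strong_pairs` and the `toy` frame are hypothetical names with made-up values, used only to show the idea, and the 0.6 cutoff mirrors the lowest bucket above.

```python
import numpy as np
import pandas as pd

def strong_pairs(corr, threshold=0.6):
    """Return variable pairs whose |r| meets the threshold, strongest first."""
    # Keep only the upper triangle so each pair appears once and the diagonal is dropped.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna().reset_index()
    pairs.columns = ['var_a', 'var_b', 'r']
    pairs = pairs[pairs['r'].abs() >= threshold]
    # Order by absolute strength so strong negative pairs are not buried.
    return pairs.sort_values('r', key=lambda s: s.abs(), ascending=False).reset_index(drop=True)

# Toy values for illustration only (not taken from the CHR&R data).
toy = pd.DataFrame({
    'x': [1, 2, 3, 4, 5],
    'y': [2, 4, 6, 8, 10],    # rises with x
    'z': [5, 3, 4, 1, 2],     # mostly falls as x rises
    'w': [1, -1, 1, -1, 1],   # unrelated to x
})

result = strong_pairs(toy.corr())
print(result)
```

In this notebook the equivalent call would be `strong_pairs(mental_health_correlations)`, since that variable already holds the correlation matrix.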



##### Strong Correlations by sub-topic:
Mental Health and Rural Disconnected Youths
- Mental Health Provider Rate
- Average Number of Mentally Unhealthy Days
- % Frequent Mental Distress
- % Disconnected Youth
- % Rural
- % Households with Broadband Access

[[var1,var2,var3,var8,o,p]]

Children Overall
- % Less than 18 Years of Age
- % Disconnected Youth
- % Children in Poverty
- % Children in Single-Parent Households
- % Uninsured Children

[[e,var8,q,r,s]]

Substance Abuse
- Life Expectancy
- Suicide Rate (Age-Adjusted)
- % Insufficient Sleep 
- % Excessive Drinking

[[c,var10,var4,var7]]

#Create a correlogram


#To show a correlogram, I need to break down the data into fewer variables 

#Is there a correlation between life expectancy and mental health providers? 
rural_youth_mental_health = mental_health_df[[var1,o,var8,var3,var2]]
children_mental_health = mental_health_df[[e,var8,q,r,s]]
substance_abuse = mental_health_df[[c,var10,var4,var7]]

#Is there a correlation between rural populations and rates of disconnected youth, unemployment

rural_youth_mental_health.corr()

mental_health_df.info()

# Create the correlogram
data = mental_health_df[[c,o,var2,var3,var8]]
sns.pairplot(data)
plt.tight_layout()
plt.show()

Average Number of Mentally Unhealthy Days
% Frequent Mental Distress
% Disconnected Youth

# Create the correlogram
data = rural_youth_mental_health
sns.pairplot(data)
plt.tight_layout()
plt.show()


#correlogram
# left
#sns.pairplot(life_expectancy_corr, kind="scatter", hue='Life Expectancy', markers=["o", "s", "D"], palette="Set2")
#plt.show()

#correlogram
# right: you can give other arguments with plot_kws.
#sns.pairplot(mental_health_matrix, kind="scatter", hue="% Not Proficient in English", plot_kws=dict(s=80, edgecolor="white", linewidth=2.5))
#plt.show()

mental_health_df[mental_health_df['Average Number of Mentally Unhealthy Days']<4.00]
#Williamson county is an outlier. They have much higher life expectancy and much lower mental health issues

mental_health_df['Mental Health Provider Rate'].mean()

#### Create individual graphs for

a = 'FIPS'
b = 'County'
c = 'Life Expectancy'
d = 'Population'
e = '% Less than 18 Years of Age'
f = '% 65 and Over'
g = '% Black'
h = '% American Indian or Alaska Native'
i = '% Asian'
j = '% Native Hawaiian or Other Pacific Islander'
k = '% Hispanic'
l = '% Non-Hispanic White'
m = '% Not Proficient in English'
n = '% Female'
o = '% Rural'
p = '% Households with Broadband Access'
q = '% Children in Poverty'
r = '% Children in Single-Parent Households'
s = '% Uninsured Children'

var1 = 'Mental Health Provider Rate'
var2 = 'Average Number of Mentally Unhealthy Days'
var3 = '% Frequent Mental Distress'
var4 = '% Insufficient Sleep'
var5 = 'Drug Overdose Mortality Rate'
var6 = '% Driving Deaths with Alcohol Involvement'
var7 = '% Excessive Drinking'
var8 = '% Disconnected Youth'
var9 = '% Unemployed'
var10 = 'Suicide Rate (Age-Adjusted)'



#define the dataframes to chart:
mhp_corr = mental_health_df[[var1,var2,var3,var10,o]] 
mhp_per_rural_youth = mental_health_df[[var1,o,var8,g]]

#percent rural vs. mental health providers
data = mhp_corr
sns.regplot(x=data['% Rural'],y=data['Mental Health Provider Rate'])

plt.yticks([0,50,100,150,200,250,300,350,400])
plt.xlabel('% Rural Population')
plt.ylabel('Mental Health Providers (per 100k)')
plt.title('% Rural Population vs. Mental Health Providers')

plt.show()

mental_health_df[mental_health_df['Mental Health Provider Rate']>280]

data = mhp_corr[[o,var1]]
grid = sns.pairplot(data) 

for ax in grid.axes.flat[:2]:
    ax.tick_params(axis='x', labelrotation=130)

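# Create the correlogram
data = mhp_corr[[o,var1,var2]]
grid = sns.pairplot(data)
#plt.xticks only rotates the ticks on the last subplot, so rotate every subplot instead
for ax in grid.axes.flat:
    ax.tick_params(axis='x', labelrotation=45)
plt.tight_layout()
plt.show()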

#scatterplot mental health vs providers
data = mhp_corr
sns.regplot(x=data['Average Number of Mentally Unhealthy Days'],y=data['Mental Health Provider Rate'])

plt.xlabel('Avg # Mentally Unhealthy Days')
plt.ylabel('Mental Health Providers (per 100k)')
plt.title('Mentally Unhealthy Days vs. Providers')

plt.show()
#lower rates of mental health providers are associated with more frequent mentally unhealthy days

#scatterplot for mental health days vs. rural
data = mhp_corr
sns.regplot(x=data['Average Number of Mentally Unhealthy Days'],y=data['% Rural'],ci=None,color='tab:blue')

plt.xlabel('Avg # Mentally Unhealthy Days')
plt.ylabel('% Rural')
plt.title('Mentally Unhealthy Days vs. % Rural')

plt.show()

mhp_corr

#scatterplot for mental health factor correlations
data = mhp_corr
sns.regplot(x=data['% Frequent Mental Distress'],y=data['Mental Health Provider Rate'],ci=None)

plt.yticks([0,50,100,150,200,250,300,350,400])
plt.xlabel('% Frequent Mental Distress')
plt.ylabel('Mental Health Providers (per 100k)')
plt.title('Frequency of Mental Distress vs. Providers')

plt.show()

#scatterplot for mental health factor correlations
data = mhp_corr
sns.regplot(x=data['% Frequent Mental Distress'],y=data['% Rural'])

plt.xlabel('% Frequent Mental Distress')
plt.ylabel('% Rural')
plt.title('Frequency of Mental Distress vs. % Rural')

plt.show()

#scatterplot for mental health factor correlations
data = mhp_corr
sns.regplot(x=data['Suicide Rate (Age-Adjusted)'],y=data['Mental Health Provider Rate'])

plt.yticks([0,50,100,150,200,250,300,350,400])
plt.xlabel('Suicide Rate (Age-Adjusted)')
plt.ylabel('Mental Health Providers (per 100k)')
plt.title('Suicide Rate vs. Providers')

plt.show()

#scatterplot for mental health factor correlations
data = mhp_corr
sns.regplot(x=data['Suicide Rate (Age-Adjusted)'],y=data['% Rural'])

plt.xlabel('Suicide Rate (Age-Adjusted)')
plt.ylabel('% Rural')
plt.title('Suicide Rate vs. % Rural')

plt.show()



#### Disconnected Youth

disc_youth = mental_health_df[[c,g,h,i,j,k,l,o,p,var1,var2,var3,var8]]
disc_youth

#scatterplot for disconnected youth
data = disc_youth
sns.regplot(x=data['% Disconnected Youth'],y=data['% Frequent Mental Distress'])

plt.xlabel('% Disconnected Youth')
plt.ylabel('% Frequent Mental Distress')
plt.title('% Disconnected Youth vs. % Frequent Mental Distress')

plt.show()


#scatterplot for disconnected youth
data = disc_youth
sns.regplot(x=data['% Disconnected Youth'],y=data['Average Number of Mentally Unhealthy Days'])

plt.xlabel('% Disconnected Youth')
plt.ylabel('Average Number of Mentally Unhealthy Days')
plt.title('% Disconnected Youth vs. Avg # Mentally Unhealthy Days')

plt.show()

- % Disconnected Youth and % Frequent Mental Distress (0.7420026506)
- Average Number of Mentally Unhealthy Days and % Disconnected Youth (0.6838628183)
- Life Expectancy and % Disconnected Youth (-0.6950208498)
- % Disconnected Youth and % Rural (0.6803646905) 
- [[c,g,h,i,j,k,l,o,p,var1,var2,var3,var8]]

#scatterplot for disconnected youth
data = disc_youth
sns.regplot(x=data['% Disconnected Youth'],y=data['Life Expectancy'])

plt.xlabel('% Disconnected Youth')
plt.ylabel('Life Expectancy')
plt.title('% Disconnected Youth vs. Life Expectancy')

plt.show()

#scatterplot for disconnected youth
data = disc_youth
sns.regplot(x=data['% Disconnected Youth'],y=data['% Rural'])

plt.xlabel('% Disconnected Youth')
plt.ylabel('% Rural')
plt.title('% Disconnected Youth vs. % Rural')

plt.show()

# mhp_per_rural_youth
data = mhp_per_rural_youth
sns.set_theme(style="whitegrid")
sns.scatterplot(
    data=data,
    x='% Disconnected Youth',
    y="% Rural",
    size="Mental Health Provider Rate",
    legend=False,
    sizes=(1,1000)
)
#plt.xlabel('% Users of Telehealth')
#plt.ylabel(f"{''}")
plt.title('Mental Health Providers per 100k people')

# show the graph
plt.show()

broadband_disc_rural = mental_health_df[[a,b,var1,p,var8,o]]
broadband_disc_rural = broadband_disc_rural.sort_values(by='% Households with Broadband Access')
broadband_disc_rural_low_broad_high_rural = broadband_disc_rural[broadband_disc_rural['% Households with Broadband Access']<75].sort_values(by='% Rural', ascending=False)


#scatterplot for broadband
data = broadband_disc_rural
sns.regplot(x=data['% Disconnected Youth'],y=data['% Households with Broadband Access'])

plt.xlabel('% Disconnected Youth')
plt.ylabel('% Households with Broadband Access')
plt.title('% Disconnected Youth vs. % Households with Broadband Access')

plt.show()

#scatterplot for broadband
data = broadband_disc_rural
sns.regplot(x=data['% Rural'],y=data['% Households with Broadband Access'])

plt.xlabel('% Rural')
plt.ylabel('% Households with Broadband Access')
plt.title('% Rural vs. % Households with Broadband Access')

plt.show()



# Create the correlogram
data = broadband_disc_rural[[p,o,var8]]
sns.pairplot(data)
#plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

#broadband_disc_rural
broadband_disc_rural = mental_health_df[[a,b,p,var8,o]]
data = broadband_disc_rural
sns.set_theme(style="whitegrid")
sns.scatterplot(
    data=data,
    x='% Rural',
    y="% Households with Broadband Access",
    size='% Disconnected Youth',
    legend=False,
    sizes=(1,100)
)
#plt.xlabel('% Users of Telehealth')
#plt.ylabel(f"{''}")
plt.title('Broadband Access Among Rural Disconnected Youth')

# show the graph
plt.show()

### B. Obesity 

Clinical Care: 
- Primary Care Physicians

Quality of Life/Health Factors:
- Diabetes Prevalence* 
- Adult Obesity
- Physical Inactivity

Health Factors:
- Food Environment Index
- Food Insecurity*
- Limited Access to Healthy Foods*







## 4. Telehealth Challenges 
- [Overview](https://www.countyhealthrankings.org/strategies-and-solutions/what-works-for-health/strategies/telemedicine)



- Broadband Access
- Uninsured (overall, adults, and children)
- Tech literacy (esp by age group) - no data set for this

# Part IV: Insights

Overall:
- Nationally, telehealth use remains higher than before COVID, although it has trended downward over the last 3 years. 

Telehealth Types: 
- Live video/audio is by far the most used method. With that category removed, however, upward trends appear in store-and-forward and remote patient monitoring. This suggests the focus should be not just on how patients use technology but also on how providers incorporate telehealth into their available services. Adoption strategies may benefit from a focus on providers' technical literacy (not just patients').
- Policy may also play a role. Many rural populations rely on Medicaid or Medicare for health care. A 2020 report names state Medicaid policies as one of the challenges in using telehealth to address opioid addiction. The report includes a state readiness assessment for implementing telehealth delivery methods. An assessment of TN’s readiness reveals a few gaps in coverage. For example, the CCHP (Center for Connected Health Policy) has a policy map comparison that shows TN is behind other states in allowing Medicaid reimbursement for remote patient monitoring as well as “Store and Forward” medical diagnostic processing. 
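
The point about removing the dominant category can be sketched in pandas. This is only an illustration under assumptions: `TelehealthType` and `ServiceCount` are column names used elsewhere in this notebook, but the `Year` column, the category labels, and every number in `toy_trends` are invented here and do not come from the Medicaid dataset.

```python
import pandas as pd

# Toy numbers for illustration only; they are NOT taken from the Medicaid dataset.
toy_trends = pd.DataFrame({
    'Year': [2020, 2020, 2020, 2021, 2021, 2021],
    'TelehealthType': ['Live audio/video', 'Store and forward',
                       'Remote patient monitoring'] * 2,
    'ServiceCount': [1000, 10, 20, 800, 15, 30],
})

# One row per year, one column per telehealth type.
by_type = toy_trends.pivot_table(index='Year', columns='TelehealthType',
                                 values='ServiceCount', aggfunc='sum')

# Drop the dominant category so the smaller series are readable on a shared axis.
minor_types = by_type.drop(columns='Live audio/video')

# Year-over-year change for each remaining type.
change = minor_types.diff().dropna()
print(change)
```

Calling `minor_types.plot()` would then chart only the smaller categories, which is where any trend hidden by the live video/audio volume becomes visible.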

Macon, Overton, Haywood, and Polk counties have a great need for mental health providers (see the data [here](https://www.countyhealthrankings.org/health-data/tennessee?year=2023&measure=Mental+Health+Providers&tab=0)). Haywood, Macon, Overton, Polk, and Morgan are the 5 lowest-ranked counties for Mental Health Provider Rate.
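
A ranking like this can be pulled with `nsmallest`. The sketch below uses placeholder counties and invented rates purely to show the call; only the column names `County` and `Mental Health Provider Rate` come from this notebook.

```python
import pandas as pd

# Placeholder counties and rates, invented for illustration.
toy_counties = pd.DataFrame({
    'County': ['County A', 'County B', 'County C', 'County D', 'County E', 'County F'],
    'Mental Health Provider Rate': [12.0, 250.0, 8.0, 95.0, 30.0, 140.0],
})

# Pick the rows with the smallest provider rate, lowest first.
lowest = toy_counties.nsmallest(3, 'Mental Health Provider Rate')[
    ['County', 'Mental Health Provider Rate']
]
print(lowest)
```

On the real data the analogous call would be `mental_health_df.nsmallest(5, 'Mental Health Provider Rate')`.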


*Out of all the demographic variables, % Rural has the strongest correlation with the rate of mental health providers (-0.60): counties with a higher rural share tend to have fewer providers relative to population.*

*% Rural is also positively correlated (0.68) with the percentage of "disconnected youths", young people between the ages of 16 and 24 who are neither working nor in school.*
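
One way to check a claim of the form "variable X is most strongly correlated with Y" is to rank every column's correlation with the target by absolute value. The helper below is a hypothetical sketch, not code from this analysis; `rank_correlations`, the toy column names, and the toy values are all assumptions, and `numeric_only=True` assumes a reasonably recent pandas.

```python
import pandas as pd

def rank_correlations(df, target):
    """Correlate every numeric column with `target`, ordered by absolute strength."""
    r = df.corr(numeric_only=True)[target].drop(target)
    # Reorder by |r| but keep the signed values so direction stays visible.
    return r.reindex(r.abs().sort_values(ascending=False).index)

# Toy values for illustration only.
toy = pd.DataFrame({
    'provider_rate': [10, 8, 6, 4, 2],
    'pct_rural':     [1, 2, 3, 4, 5],    # moves opposite to provider_rate
    'other':         [5, 3, 4, 1, 2],    # loosely follows provider_rate
    'noise':         [1, -1, 1, -1, 1],  # unrelated
})

ranked = rank_correlations(toy, 'provider_rate')
print(ranked)
```

Keeping the sign matters here: a strong correlation with % Rural is a negative one, which is easy to lose when only magnitudes are compared.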

Other factors to look at include:
- Children in Single Parent Household
- Children in Poverty
- Uninsured Children
    
##### Consider that (at least according to the Medicare data) younger people are much more likely to use telehealth services. This suggests that targeting high-risk rural youths for telehealth mental health services has the potential for high impact.  



However, it's important to put these findings into context. Telemedicine is not a replacement for in-person mental health treatment. Rather, it is a good candidate for adding flexibility and continuity and for shoring up existing support systems. 

For example:
- Leverage behavioral health apps and social media groups/sites. For example, SMART Recovery is an app geared towards people recovering from addiction. It provides self-help articles, a chat community, local support group information and scheduling (available both in person and online), a catalog of recovery tools and tips, and an interactive section that provides motivational quotes and directed tasks to distract from addictive urges. There are also apps like BetterHelp that provide counseling. Compiling a list of similar apps and distributing it through schools, community centers, libraries, trade schools, community colleges, websites, etc., could help raise awareness of the available resources and improve outcomes. 
- Help providers improve their technical literacy and integrate telehealth techniques into their practice. For example, one barrier to entry is scheduling an initial meeting with a provider. Many young people don't know how to set up their own appointments and may become frustrated with phone scheduling. Online scheduling is much easier and would let patients plan future visits around a provider's published availability. 
- Allow consultations that accommodate both live and store-and-forward techniques, so that those with limited internet can initiate treatment with a lower barrier.
- Promote health portals to facilitate provider and patient communication. This could include sending surveys, diagnostics, articles or resources related to mental health, prescription management, and case management (for patients with complex needs and multiple providers). 
- Offer the option to switch between in-person and telemedicine appointments as needed, since people sometimes can't get to the office in person. Telemedicine appointments shouldn't be pushed as the main method of treatment, however, as in-person treatment has distinct advantages. 

[Categories of Telemedicine](https://mdportal.com/education/telemedicine-categories/) 

- 'Mental health is another area that is rapidly adopting remote patient monitoring. Often, anomalies in patient’s physical movement patterns, and even shifts in patient’s mobile phone usage can act as a proxy for changes to their mental condition. With these methods, some suggest it is possible to signal early signs of depression and allow medical providers and family members to quickly respond with interventions.'

[Geographies of Opportunity](https://measureofamerica.org/congressional-districts-2015/) *Published APRIL 22, 2015*
- 'There are over 5.5 million disconnected youth in the United States—young people between the ages of 16 and 24 who are out of work and out of school. In 32 districts, at least one in every five youth are disconnected. These districts are concentrated in the South and Southwest.'

[Ensuring an Equitable Recovery: Addressing Covid-19’s Impact on Education](https://measureofamerica.org/youth-disconnection-2023/) *Published OCTOBER 3, 2023*

[Also see the interactive charts here](https://www.measureofamerica.org/DYinteractive/)

- '**Disconnected youth are young people between the ages of 16 and 24 who are not in school and not working.** The youth disconnection rate tells us a lot about the opportunities available to teens and young adults...Society pays a price in terms of reduced competitiveness, lower tax revenues, and higher health, social services, and criminal justice costs, to name just a few.'
- Rural counties have a youth disconnection rate of 17.3 percent, on average, compared to 11.2 percent in urban centers and 9.9 percent in suburbs

Note that TN is in the higher range nationally, with a 13.5% Disconnected Youth rate





[A Rural Youth Consumer Perspective of Technology to Enhance Face-to-Face Mental Health Services](https://link.springer.com/article/10.1007/s10826-016-0472-z)
*Published 18 June 2016*

 - 'The consumer-based perspectives and experiences reported in the current study are in line with a growing body of literature which advocates for the applicability of **a mix of on and offline mental health support** for some consumers. As such, the term **“blended care”** (i.e. a combination of online and offline components coordinated in a face-to-face mental health setting) has now entered the literature (Wentzel et al., 2016)'

 - 'applications that enhance or promote and don’t seek to replace the desired personal connection are more likely to be better received and utilised by youth mental health consumers...Low cost and limited internet dependent alternatives should be a focus for future research and design, **for example teleconsultations that allow for live and ‘store and forward’ modes** to accommodate those with limited or unreliable internet access (Gillis 2015).'

 - 'If designed sensitively and inclusively, **technology-based additions to care could offer welcome opportunities for young people to participate more meaningfully in their care.** **For example, flexibility and shared decision-making**, two attributes often linked to technology-based additions to care, have been linked to improved patient satisfaction and overall health outcomes (Clever et al. 2006; Swanson et al. 2007).'

[Telehealth for Social Interventions With Adolescents and Young Adults: Diverse Perspectives](https://www.tandfonline.com/doi/full/10.1080/0312407X.2022.2077120?casa_token=wO8mGMnA1S8AAAAA%3AgraGoshpJxA8i4r_3ulRYZvtLvfH_9Hp_HfAZgbnKNVoP6cHoCw4o0u6LTcjcck_YwIkkwy7UzSr) *Published 17 Jul 2022*
- 'Telehealth can be seen as an additional modality to be carefully considered, rather than as a substitute for care as usual.'

[Medicare & Medicaid Telehealth policies](https://telehealth.hhs.gov/providers/telehealth-policy/medicare-and-medicaid-policies)

# Further Research

Many factors can play a role in health outcomes. The TN Department of Health has a webpage that outlines many contributing factors to poor health outcomes in rural populations.  Any of these could be explored further in the context of this analysis: 
 - average proximity to hospitals
 - food scarcity 
 - physical activity trends 
 - internet or broadband access
 - health and nutrition literacy
 - social isolation levels
 - socioeconomic status by county

#### Additional Articles & Resources
- [Telehealth.HHS.gov](https://telehealth.hhs.gov/research-articles) - Additional research articles and papers on the impact and uses of telehealth, maintained by the HRSA Office for Advancement of Telehealth
- [CHRR Strategies for Telehealth](https://www.countyhealthrankings.org/strategies-and-solutions/what-works-for-health/strategies?keywords=telemedicine&sort_by=search_api_relevance) - Strategy guides with extensive research notes and citations. Includes evidence-informed strategies to create communities where everyone can thrive.
- [Additional CHRR Tennessee Data and Resources](https://www.countyhealthrankings.org/health-data/tennessee/data-and-resources) - These data sources provide information for communities looking for more local data. These sources provide either unique, local data; more information on demographic breakdowns (e.g., age, sex, race/ethnicity); or data for sub-county geographic units (e.g., cities, zipcodes or school districts).

[Reconnecting Youth](https://youth.gov/youth-topics/opportunity-youth/reconnecting-youth?q=)

## References (for me only)

#Treemap (not very useful but keeping it for reference)
service_counts_by_type = medicaid_trends_df.groupby(['TelehealthType'])['ServiceCount'].sum().reset_index(name='sum').sort_values(by='TelehealthType', ascending=True)

# Create a data frame with fake data
#df = pd.DataFrame({'nb_people':[8,3,4,2], 'group':["group A", "group B", "group C", "group D"] })

# plot it
squarify.plot(sizes=service_counts_by_type['sum'], label=service_counts_by_type['TelehealthType'], alpha=.8 )
plt.axis('off')
plt.show()

# doughnut chart example
names = medicare_trends_by_age['Bene_Age_Desc']
size = medicare_trends_by_age['average_pct']
 
# Create a circle at the center of the plot
my_circle = plt.Circle( (0,0), 0.7, color='white')

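# Draw the pie with the age-group labels
plt.pie(size, labels=names)
# Use `fig` rather than `p`, which already holds the broadband column name
fig = plt.gcf()
fig.gca().add_artist(my_circle)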

# Show the graph
plt.show()

Groupby Example:
#Count the number of times each country occurs in the data frame using a groupby
#First indicate the column to group by (country)
#Use size() to count the number of rows in each group
#(chaining value_counts() on the same column can make reset_index fail, because 'country' would be inserted twice)
#reset the index to turn the result into a dataframe and name the new column 'count'
#Finally sort values to get the list in order of count descending
top_countries = gourds.groupby('country').size().reset_index(name='count').sort_values(by='count', ascending=False)
top_countries
