<h2>Research Question:</h2>
<p>What are the prevalence rates of ADHD in children by age and ethnicity, and how does treatment vary across these groups? Can culturally modified treatments improve outcomes, and if so, how?</p>

<h2>Introduction</h2>
<h4><b>What is ADHD?</b></h4>

<p><b>Attention Deficit Hyperactivity Disorder (ADHD)</b> is a neurodevelopmental condition marked by inattention, hyperactivity, and impulsivity, affecting daily life and development. </p>

<h5><b>Data Sources:</b></h5>
<ul>
    <li>Journal Article (Figures & data): <a href="https://www.tandfonline.com/doi/figure/10.1080/15374416.2024.2335625?scroll=top&needAccess=true" target="_blank">ADHD Prevalence Among U.S. Children and Adolescents in 2022</a></li>
    <li>Clinical Trial: <a href="https://www.clinicaltrials.gov/study/NCT02317692?cond=ADHD&term=latino&limit=100&page=1&rank=1" target="_blank">ADHD Treatment for Latino Families</a></li>
</ul>

<h5><b>First, we will start with the Journal Article</b></h5>

Creating a DataFrame from table 1:

In [3]:
import pandas as pd

# Reference: https://stackoverflow.com/questions/45652772/pandas-read-csv-is-shifting-columns
df_table1 = pd.read_csv("table_1_prevalence_of_adhd.csv", sep=',', index_col=False)

# Reference: https://stackoverflow.com/questions/11346283/renaming-column-names-in-pandas
df_table1 = df_table1.rename(columns={
    'Unnamed: 0': 'Category',
    'Unnamed: 3': 'Ever PR (Prevalence Ratio)',
    'Unnamed: 5': 'Has PR (Prevalence Ratio)',
})
df_table1


Creating a DataFrame from table 2: 

In [None]:
df_table2 = pd.read_csv("table_2_distribution_of_adhd_severity.csv", sep=',', index_col=False)
df_table2.rename(columns={'Unnamed: 0': 'Category'}, inplace=True)
df_table2.rename(columns={'Severe ADHD.1': 'Severe ADHD PR'}, inplace=True)
df_table2

Creating a DataFrame from table 5:

In [None]:
df_table5 = pd.read_csv("table_5_adhd_treatment_types.csv", sep=',', index_col=False)

df_table5.rename(columns={'Unnamed: 0': 'Category'}, inplace=True)
df_table5.rename(columns={'Unnamed: 5': 'PR'}, inplace=True)
df_table5

<h5>Extracting Data from <b>Table 1</b> based on:</h5>

- Race
- Current ADHD Diagnosis

In [None]:
# Reference: https://www.geeksforgeeks.org/get-a-specific-row-in-a-given-pandas-dataframe/
# Reference: https://www.geeksforgeeks.org/pandas-dataframe-get_value/

import re 

race = []
percentages = []
prevalence_list = []

print("Current ADHD Diagnosis by Race:\n")
rows = [10, 11, 12, 13, 14, 15]

for row in rows: 
    category = df_table1.iat[row, 0].strip()  # Category label (first column, by position)
    percentage = df_table1.iat[row, 4].split(' ')[0]  # Leading token of the current-diagnosis cell
    # Reference: https://stackoverflow.com/questions/13682044/remove-unwanted-parts-from-strings-in-a-column
    # Reference: https://www.w3schools.com/python/python_regex.asp
    percentage = re.match(r'\d+(\.\d+)?', percentage).group()  # Keep only the leading number
    prevalence = df_table1.iat[row, 5].split(' ')[0].strip()  # Prevalence ratio, or 'Ref.' for the reference group
    race.append(category)
    percentages.append(float(percentage))
    if prevalence == 'Ref.':
        reference_group = category
        prevalence = 1
    prevalence_list.append(float(prevalence))
    print(f"{category}: {percentage}%, PR: {prevalence}")
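The extraction loop above relies on each table cell starting with a number (or the literal `Ref.`). A minimal sketch of that parsing step as a reusable helper follows. The function name `parse_leading_number` and the sample cells are hypothetical, made up for illustration; they are not values from the article.

```python
import re

# Matches an integer or decimal at the start of a cell, ignoring leading spaces
_LEADING_NUMBER = re.compile(r"^\s*(\d+(?:\.\d+)?)")

def parse_leading_number(cell, ref_value=1.0):
    """Return the leading number in a table cell, or ref_value for 'Ref.' cells."""
    text = str(cell).strip()
    if text.startswith("Ref"):
        return ref_value
    match = _LEADING_NUMBER.match(text)
    if match is None:
        raise ValueError(f"No leading number in cell: {cell!r}")
    return float(match.group(1))

# Made-up cells, for illustration only
print(parse_leading_number("12.3 (11.0-13.6)"))
print(parse_leading_number("Ref."))
```

Raising on an unparseable cell, rather than silently returning a default, makes a shifted column or an unexpected row easier to spot.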

Sorting the data: 

In [None]:
# Reference: https://stackoverflow.com/questions/36622977/sorting-one-list-based-on-the-results-of-python-in-descending-order
z = list(zip(race, percentages, prevalence_list))
z = sorted(z, key=lambda i: i[1], reverse=True)
race, percentages, prevalence_list = list(zip(*z))

print(f"Sorted: \n {race}, \n {percentages}, \n {prevalence_list}")
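The zip-and-sort step above keeps three parallel lists aligned. One alternative, sketched here with placeholder labels and numbers (not values from the article), is to hold the columns in a single DataFrame and sort it once, so the columns cannot drift out of sync.

```python
import pandas as pd

# Placeholder data, for illustration only
groups = ["Group A", "Group B", "Group C"]
pct = [8.0, 12.5, 10.1]
pr = [0.80, 1.25, 1.00]

summary = pd.DataFrame({"group": groups, "percent": pct, "pr": pr})

# Sort all columns together by percentage, highest first
summary_sorted = summary.sort_values("percent", ascending=False).reset_index(drop=True)
print(summary_sorted)
```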

<h5><b>Bar Chart:</b> ADHD Prevalence by Race</h5>

In [None]:
# Reference: https://matplotlib.org/stable/gallery/lines_bars_and_markers/bar_colors.html#sphx-glr-gallery-lines-bars-and-markers-bar-colors-py
# Reference: https://matplotlib.org/stable/gallery/color/named_colors.html
# Reference: https://stackoverflow.com/questions/10998621/rotate-axis-tick-labels
# Reference: https://www.geeksforgeeks.org/how-to-annotate-bars-in-barplot-with-matplotlib-in-python/

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 6))

ax.set_ylabel('Children with a current ADHD diagnosis (%)')
ax.set_title('ADHD Prevalence Among U.S. Children and Adolescents in 2022 by Race')
plt.xticks(rotation=45, ha='right')

# Reference: https://www.geeksforgeeks.org/how-to-annotate-bars-in-barplot-with-matplotlib-in-python/
bars = plt.bar(race, percentages, color='skyblue')

ax.margins(y=0.2)  # Add 20% margin above the tallest bar

# Annotate above bars (prevalence ratios)
for bar, pr in zip(bars, prevalence_list):
    plt.text(
        bar.get_x() + bar.get_width() / 2, 
        bar.get_height() + 0.5, 
        f'PR: {pr:.2f}', 
        ha='center', 
        va='bottom', 
        fontsize=10
    )

# Reference: https://stackoverflow.com/questions/63799934/how-to-annotate-a-bar-plot-and-add-a-custom-legend

# Add a legend
plt.legend(title=f'PR = Prevalence Ratio\nReference group = {reference_group}', loc='upper right', fontsize=12, frameon=True)

# Show the plot
plt.tight_layout()  # Ensure everything fits within the figure
plt.show()
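The annotation loop above places text manually with `plt.text`. Assuming Matplotlib 3.4 or newer, `Axes.bar_label` can produce the same kind of labels in one call. The sketch below uses placeholder groups and numbers, not values from the article.

```python
import matplotlib.pyplot as plt

# Placeholder data, for illustration only
labels = ["Group A", "Group B", "Group C"]
values = [12.5, 10.1, 8.0]
ratios = [1.25, 1.00, 0.80]

fig, ax = plt.subplots(figsize=(6, 4))
bars = ax.bar(labels, values, color="skyblue")

# One custom label per bar, placed 3 points above the bar top
annotations = ax.bar_label(bars, labels=[f"PR: {r:.2f}" for r in ratios], padding=3)

ax.set_ylabel("Percent (%)")
ax.margins(y=0.2)  # Leave headroom so the labels are not clipped
fig.tight_layout()
```

Calling `plt.show()` afterwards displays the figure in an interactive session.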

<h5><b>Extracting Data from Table 1 based on:</b></h5>

- Ethnicity
- Current ADHD Diagnosis

In [None]:
ethnicity = []
e_percentages = []
e_prevalence_list = []

print("Current ADHD Diagnosis by Ethnicity:\n")

e_rows = [17,18]

for row in e_rows: 
    e_category = df_table1.iat[row, 0].strip()  # Category label (first column, by position)
    e_percentage = df_table1.iat[row, 4].split(' ')[0]  # Leading token of the current-diagnosis cell
    e_percentage = re.match(r'\d+(\.\d+)?', e_percentage).group()  # Keep only the leading number
    e_prevalence = df_table1.iat[row, 5].split(' ')[0].strip()  # Prevalence ratio, or 'Ref.' for the reference group
    ethnicity.append(e_category)
    e_percentages.append(float(e_percentage))
    if e_prevalence == 'Ref.':
        reference_group2 = e_category
        e_prevalence = 1
    e_prevalence_list.append(float(e_prevalence))
    print(f"{e_category}: {e_percentage}%, PR: {e_prevalence}")

Sorting the data: 

In [None]:
z = list(zip(ethnicity, e_percentages, e_prevalence_list))
z = sorted(z, key=lambda i: i[1], reverse=True)
ethnicity, e_percentages, e_prevalence_list = list(zip(*z))

print(f"Sorted: \n {ethnicity}, \n {e_percentages}, \n {e_prevalence_list}")

<h5><b>Bar Chart:</b> ADHD Prevalence by Ethnicity</h5>

In [None]:
fig, ax = plt.subplots(figsize=(7, 5))

ax.set_ylabel('Children with a current ADHD diagnosis (%)')
ax.set_title('ADHD Prevalence Among U.S. Children and Adolescents in 2022 by Ethnicity')
ax.margins(y=0.12)  # Add 12% margin above the tallest bar

bars = plt.bar(ethnicity, e_percentages, color='skyblue', width = 0.5)

# Annotate above bars (prevalence ratios)
for bar, pr in zip(bars, e_prevalence_list):
    plt.text(
        bar.get_x() + bar.get_width() / 2, 
        bar.get_height() + 0.5, 
        f'PR: {pr:.2f}', 
        ha='center', 
        va='bottom', 
        fontsize=10
    )

# Add a legend
plt.legend(title=f'PR = Prevalence Ratio\nReference group = {reference_group2}', loc='upper right', fontsize=12, frameon=True)
# Show the plot
plt.tight_layout()  # Ensure everything fits within the figure
plt.show()
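Both charts above label each bar with a prevalence ratio (PR) relative to a reference group. As a rough illustration of what such a ratio expresses, the sketch below divides each group's percentage by the reference group's percentage. This is a crude ratio computed from placeholder numbers; the PRs reported in the article may come from a different (for example, model-based) calculation and need not match it. The function name `crude_prevalence_ratios` is hypothetical.

```python
def crude_prevalence_ratios(percent_by_group, reference):
    """Divide each group's prevalence by the reference group's prevalence."""
    ref = percent_by_group[reference]
    if ref == 0:
        raise ValueError("Reference prevalence must be non-zero")
    return {group: pct / ref for group, pct in percent_by_group.items()}

# Placeholder percentages, for illustration only
example = {"Group A": 12.0, "Group B": 9.0}
ratios = crude_prevalence_ratios(example, reference="Group A")
print(ratios)  # {'Group A': 1.0, 'Group B': 0.75}
```

A ratio below 1 means the group's prevalence is lower than the reference group's; a ratio above 1 means it is higher.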

<h5>Extracting Data from <b>Table 2</b> based on:</h5>

- ADHD Severity
- Ethnicity 

In [None]:
ethnicity = []
e2_percentages = []
e2_severe_prevalence_list = []

print("ADHD Severity by Ethnicity:\n")
e2_rows = [17, 18]

for row in e2_rows: 
    e_category = df_table2.iat[row, 0].strip()  # Extract category
    mild = float(df_table2.iat[row, 2].split(' ')[0])  # Extract mild percentage
    moderate = float(df_table2.iat[row, 3].split(' ')[0])  # Extract moderate percentage
    severe = float(df_table2.iat[row, 4].split(' ')[0])  # Extract severe percentage
    severe_adhd_prevalence = df_table2.iat[row, 5].split(' ')[0].strip()  # Extract severe ADHD prevalence ratio
    ethnicity.append(e_category)
    e2_percentages.append([mild, moderate, severe])
    if severe_adhd_prevalence == 'Ref.':
        severe_adhd_prevalence = 1
    e2_severe_prevalence_list.append(float(severe_adhd_prevalence))
    print(f"{e_category}: Mild: {mild}%, Moderate: {moderate}%, Severe: {severe}%, Severe PR: {severe_adhd_prevalence}")

Sorting the data:

In [None]:
# Sort data by 'mild' percentages 
z = list(zip(ethnicity, e2_percentages, e2_severe_prevalence_list))
z = sorted(z, key=lambda i: i[1][0], reverse=True)  
ethnicity, e2_percentages, e2_severe_prevalence_list = list(zip(*z))

print(f"Sorted: \n {ethnicity}, \n {e2_percentages}, \n {e2_severe_prevalence_list}")

<h5><b>Bar Chart:</b> ADHD Severity by Ethnicity</h5>


In [None]:
# Reference: https://www.geeksforgeeks.org/create-a-grouped-bar-plot-in-matplotlib/

import matplotlib.pyplot as plt 
import numpy as np 

fig, ax = plt.subplots(figsize=(8, 6))
ax.margins(y=0.30)  # Add 30% margin above the tallest bar
ax.set_title('ADHD Severity Among U.S. Children and Adolescents in 2022 by Ethnicity')
x = np.arange(len(ethnicity)) 

mild = e2_percentages[0][0], e2_percentages[1][0]
moderate = e2_percentages[0][1], e2_percentages[1][1]
severe = e2_percentages[0][2], e2_percentages[1][2]
width = 0.2

# Plot data in grouped manner of bar type 
bars_mild = plt.bar(x - 0.2, mild, width, color='cyan', label="Mild") 
bars_moderate = plt.bar(x, moderate, width, color='orange', label="Moderate") 
bars_severe = plt.bar(x + 0.2, severe, width, color='green', label="Severe") 
plt.xticks(x, ethnicity) 
plt.xlabel("Ethnicity") 
plt.ylabel("Percentage (%)") 
# Reference: https://www.geeksforgeeks.org/change-the-legend-position-in-matplotlib/
plt.legend(loc='upper right') 

all_bars = bars_mild, bars_moderate, bars_severe

# Reference: https://www.geeksforgeeks.org/how-to-annotate-bars-in-barplot-with-matplotlib-in-python/
# Annotate bars
for bars in all_bars:
    for bar in bars:
        plt.text(
            bar.get_x() + bar.get_width() / 2,  # x-coordinate
            bar.get_height(),                  # y-coordinate
            f'{bar.get_height():.2f}',         # Annotation text
            ha='center', va='bottom', fontsize=10  # Text alignment
        )

plt.tight_layout()  # Ensure everything fits within the figure
plt.show() 

<h5>Extracting Data from <b>Table 5</b> based on:</h5>

- ADHD Treatment Type
- Ethnicity 

In [None]:
ethnicity = []
e3_rows = [17, 18]
treatment_percentages = []

print("ADHD Treatment Type by Ethnicity: \n")

for row in e3_rows: 
    e_category = df_table5._get_value(row, 0, takeable=True).strip()  # Extract category
    both = float(df_table5.iat[row, 1].split(' ')[0])  # Extract "both treatments" percentage
    medication_only = float(df_table5.iat[row, 2].split(' ')[0])  # Extract medication-only percentage
    behavioral_treatment_only = float(df_table5.iat[row, 3].split(' ')[0])  # Extract behavioral-treatment-only percentage
    neither = float(df_table5.iat[row, 4].split(' ')[0])  # Extract "neither treatment" percentage
    ethnicity.append(e_category)
    treatment_percentages.append([both, medication_only, behavioral_treatment_only, neither])
    print(f"{e_category}: Both: {both}%, Medication only: {medication_only}%, Behavioral Treatment only: {behavioral_treatment_only}%, Neither: {neither}%")

Sorting the data: 

In [None]:
z = list(zip(ethnicity, treatment_percentages))
z = sorted(z, key=lambda i: i[1][0], reverse=True)  
ethnicity, treatment_percentages = list(zip(*z))

print(f"Sorted: \n {ethnicity},\n {treatment_percentages}")

<h5><b>Bar Chart:</b> ADHD Treatment Type by Ethnicity</h5>

In [None]:
import matplotlib.pyplot as plt 
import numpy as np 

fig, ax = plt.subplots(figsize=(10, 6))
ax.margins(y=0.50)  # Add 50% margin above the tallest bar
ax.set_title('ADHD Treatment Type Among U.S. Children and Adolescents in 2022 by Ethnicity')
x = np.arange(len(ethnicity)) 

both = treatment_percentages[0][0], treatment_percentages[1][0]
medication_only = treatment_percentages[0][1], treatment_percentages[1][1]
behavioral_treatment_only = treatment_percentages[0][2], treatment_percentages[1][2]
neither = treatment_percentages[0][3], treatment_percentages[1][3]
width = 0.2
  
# Plot data in grouped manner of bar type 
# Offsets are symmetric around zero so each group of four bars is centered on its tick
bars_both = plt.bar(x - 0.3, both, width, color='cyan', label="Both") 
bars_medication_only = plt.bar(x - 0.1, medication_only, width, color='orange', label="Medication only") 
bars_behavioral_treatment_only = plt.bar(x + 0.1, behavioral_treatment_only, width, color='green', label="Behavioral Treatment only") 
bars_neither = plt.bar(x + 0.3, neither, width, color='red', label="Neither") 

plt.xticks(x, ethnicity) 
plt.xlabel("Ethnicity") 
plt.ylabel("Percentage (%)") 
plt.legend(loc='upper right') 

all_bars_treatment = bars_both, bars_medication_only, bars_behavioral_treatment_only, bars_neither

for bars in all_bars_treatment:
    for bar in bars:
        plt.text(
            bar.get_x() + bar.get_width() / 2,  # x-coordinate
            bar.get_height(),                  # y-coordinate
            f'{bar.get_height():.2f}',         # Annotation text
            ha='center', va='bottom', fontsize=10  # Text alignment
        )

plt.tight_layout()  # Ensure everything fits within the figure
plt.show() 
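The grouped bar charts above position each series by adding a fixed offset to the tick positions. A minimal sketch of computing those offsets so that any number of series stays centered on each tick is shown below; the helper name `grouped_bar_offsets` is hypothetical.

```python
import numpy as np

def grouped_bar_offsets(n_series, width):
    """Return x-offsets that center n_series bars of the given width on a tick."""
    # Series indices 0..n-1 are shifted so their midpoint sits at zero
    return (np.arange(n_series) - (n_series - 1) / 2) * width

three = grouped_bar_offsets(3, 0.2)  # offsets for three series (mild, moderate, severe)
four = grouped_bar_offsets(4, 0.2)   # offsets for four series (the treatment types)
print(three)
print(four)
```

Each series `i` can then be drawn with `plt.bar(x + offsets[i], values, width)`.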

<h5><b>Second, we will focus on the Clinical Trial: ADHD Treatment for Latino Families</b></h5>


In [None]:
# Reference: https://www.clinicaltrials.gov/data-api/api
# Reference: https://stackoverflow.com/questions/59306252/importing-json-file-url-to-pandas-data-frame
import pandas as pd
import requests

url = "https://clinicaltrials.gov/api/v2/studies/NCT02317692"
response = requests.get(url, timeout=30)
response.raise_for_status()  # Fail early on an HTTP error instead of parsing an error body
clinical_trials = response.json()
clinical_trials

Accessing the Keys

In [None]:
for key in clinical_trials.keys():
    print(key)

Creating a DataFrame from the Clinical Trials Results Section

In [None]:
df_ct_results_section = pd.DataFrame(clinical_trials['resultsSection'])
df_ct_results_section

In [None]:
# Select the outcome-measure list by key rather than by row position
outcomeMeasures = clinical_trials["resultsSection"]["outcomeMeasuresModule"]["outcomeMeasures"]
outcomeMeasures
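Reaching into the nested study record depends on every intermediate key being present. A small sketch of a defensive lookup is given below, using a hand-written stand-in dictionary rather than the real API response. The helper name `dig` is hypothetical, and the innermost key `outcomeMeasures` is an assumption about the response layout.

```python
def dig(mapping, *keys, default=None):
    """Follow a chain of dictionary keys, returning default if any key is missing."""
    current = mapping
    for key in keys:
        if isinstance(current, dict) and key in current:
            current = current[key]
        else:
            return default
    return current

# Hand-written stand-in for the study record, for illustration only
sample = {
    "resultsSection": {
        "outcomeMeasuresModule": {"outcomeMeasures": [{"title": "Example"}]}
    }
}

measures = dig(sample, "resultsSection", "outcomeMeasuresModule", "outcomeMeasures", default=[])
missing = dig(sample, "resultsSection", "noSuchModule", default=[])
print(measures, missing)
```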

In [None]:
df_normalize_outcomes_measures = pd.json_normalize(outcomeMeasures)
df_normalize_outcomes_measures

Converting Outcome Measure Titles to a DataFrame

In [None]:
title = df_normalize_outcomes_measures["title"]
title

# Reference: https://stackoverflow.com/questions/26097916/convert-pandas-series-to-dataframe
title_df = pd.DataFrame({'Title':title.values})
title_df

Converting Outcome Measure Descriptions to a DataFrame

In [None]:
descriptions = df_normalize_outcomes_measures["description"]
descriptions

descriptions_df = pd.DataFrame({'Descriptions':descriptions.values})
descriptions_df

Extracting Groups from Outcome Measures as a List

In [None]:
groups = list(df_normalize_outcomes_measures["groups"])
groups

Normalizing and Converting Groups Data into a DataFrame

In [None]:
groups_df = pd.json_normalize(groups)
groups_df

Extracting and Normalizing Group-Specific Data into DataFrames

In [None]:
groups_df_standard = pd.json_normalize(groups_df[0])
groups_df_standard

groups_df_standard_title = groups_df_standard["title"][0]
groups_df_standard_title

groups_df_standard_description = groups_df_standard["description"][0]
groups_df_standard_description

groups_df_culturally = pd.json_normalize(groups_df[1])
groups_df_culturally_title = groups_df_culturally["title"][0]
groups_df_culturally_title

groups_df_culturally_description = groups_df_culturally["description"][0]
groups_df_culturally_description

Extracting Denominator Data from Outcome Measures as a List

In [None]:
denoms = list(df_normalize_outcomes_measures["denoms"])
denoms

Normalizing and Converting Denominator Data into a DataFrame

In [None]:
denoms_df = pd.json_normalize(denoms)
denoms_df

Normalizing and Converting Participant Denominator Data into a DataFrame

In [None]:
denoms_df_participants = pd.json_normalize(denoms_df[0])
denoms_df_participants

Normalizing and Converting Participant Counts Data into a DataFrame

In [None]:
denoms_df_participants_counts = pd.json_normalize(denoms_df_participants["counts"])
denoms_df_participants_counts

Processing and Refining Standard Participant Counts DataFrame

In [None]:
denoms_df_participants_counts_standard = pd.json_normalize(denoms_df_participants_counts[0])
denoms_df_participants_counts_standard.rename(columns={"value": 'Standard Participant Count'}, inplace=True)
denoms_df_participants_counts_standard  = denoms_df_participants_counts_standard.drop('groupId', axis=1)
denoms_df_participants_counts_standard = denoms_df_participants_counts_standard.astype(float)
denoms_df_participants_counts_standard

Processing and Refining Culturally Modified Participant Counts DataFrame

In [None]:
denoms_df_participants_counts_culturally = pd.json_normalize(denoms_df_participants_counts[1])
denoms_df_participants_counts_culturally.rename(columns={"value": 'Culturally Modified Participant Count'}, inplace=True)
denoms_df_participants_counts_culturally = denoms_df_participants_counts_culturally.drop('groupId', axis=1)
denoms_df_participants_counts_culturally = denoms_df_participants_counts_culturally.astype(float)
denoms_df_participants_counts_culturally

Normalizing and Converting Family Denominator Data into a DataFrame

In [None]:
denoms_df_families = (pd.json_normalize(pd.json_normalize(denoms_df[1])['counts']))
denoms_df_families

Processing and Refining Standard Families DataFrame

In [None]:
denoms_df_families_standard = pd.json_normalize(denoms_df_families[0])
denoms_df_families_standard.rename(columns={"value": 'Standard Families'}, inplace=True)
denoms_df_families_standard = denoms_df_families_standard.drop('groupId', axis=1)
denoms_df_families_standard = denoms_df_families_standard.astype(float)
denoms_df_families_standard

Processing and Refining Culturally Modified Families DataFrame

In [None]:
denoms_df_families_culturally = pd.json_normalize(denoms_df_families[1])
denoms_df_families_culturally.rename(columns={"value": 'Culturally Modified Families'}, inplace=True)
denoms_df_families_culturally = denoms_df_families_culturally.drop('groupId', axis=1)
denoms_df_families_culturally = denoms_df_families_culturally.astype(float)
denoms_df_families_culturally

Normalizing Classes Data for Outcome Measures

In [None]:
classes = df_normalize_outcomes_measures["classes"]
classes_normalized = pd.json_normalize(classes)
classes_normalized 

classes_normalized_2 = pd.json_normalize(classes_normalized[0])
classes_normalized_2

classes_normalized_3 = pd.json_normalize(classes_normalized_2["categories"])
classes_normalized_3

classes_normalized_4 = pd.json_normalize(classes_normalized_3[0])
classes_normalized_4

classes_normalized_5 = pd.json_normalize(classes_normalized_4["measurements"])
classes_normalized_5
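The chain of `pd.json_normalize` calls above unwraps one nesting level at a time. As a sketch of an alternative, `pd.json_normalize` also accepts a `record_path` that walks several nested lists in one call. The sample below is a hand-written stand-in whose layout (`classes`, `categories`, `measurements`, `groupId`, `value`) is assumed to mirror the structure explored above; the titles, group IDs, and values are invented.

```python
import pandas as pd

# Hand-written stand-in for the outcome-measure list; all values are invented
sample_outcome_measures = [
    {
        "title": "Measure A",
        "classes": [{"categories": [{"measurements": [
            {"groupId": "G0", "value": "1.5"},
            {"groupId": "G1", "value": "2.5"},
        ]}]}],
    },
    {
        "title": "Measure B",
        "classes": [{"categories": [{"measurements": [
            {"groupId": "G0", "value": "-0.5"},
            {"groupId": "G1", "value": "-1.0"},
        ]}]}],
    },
]

# One row per measurement, carrying the parent measure's title along
long_df = pd.json_normalize(
    sample_outcome_measures,
    record_path=["classes", "categories", "measurements"],
    meta=["title"],
)
long_df["value"] = long_df["value"].astype(float)

# One row per measure, one column per group
wide_df = long_df.pivot(index="title", columns="groupId", values="value")
print(wide_df)
```

The wide table has the same shape as the per-group result columns built step by step above, but it is keyed by title and group ID rather than by list position.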

Processing and Refining Standard ADHD Parent Training Results DataFrame

In [None]:
classes_normalized_standard = pd.json_normalize(classes_normalized_5[0])
classes_normalized_standard.rename(columns={"value": 'Standard ADHD Parent Training Results'}, inplace=True)
classes_normalized_standard = classes_normalized_standard.drop('groupId', axis=1)
classes_normalized_standard = classes_normalized_standard.drop('spread', axis=1)
classes_normalized_standard = classes_normalized_standard.astype(float)
classes_normalized_standard

Processing and Refining Culturally-Modified ADHD Parent Training Results DataFrame

In [None]:
classes_normalized_culturally = pd.json_normalize(classes_normalized_5[1])
classes_normalized_culturally.rename(columns={"value": 'Culturally-modified ADHD Parent Training Results'}, inplace=True)
classes_normalized_culturally = classes_normalized_culturally.drop('groupId', axis=1)
classes_normalized_culturally = classes_normalized_culturally.drop('spread', axis=1)
classes_normalized_culturally = classes_normalized_culturally.astype(float)
classes_normalized_culturally

Final Output: Summarizing Groups with Titles and Descriptions

In [None]:
print(" ---------- Groups ---------- \n")
print(f"Standard:\n\n {groups_df_standard_title} \n {groups_df_standard_description}\n")
print(f"Culturally:\n\n {groups_df_culturally_title}, \n {groups_df_culturally_description} ")

Final Output: Displaying Titles and Descriptions DataFrames

In [None]:
print(" ----------------- DataFrames for ----------------- \n")
print(f"Titles:\n\n {title_df}\n")
print(f"Descriptions:\n\n {descriptions_df} ")

Final Output: Displaying Denominator Data for Participants and Families

In [None]:
print(" ------------------------ Denoms ------------------------ \n")
print(f"Standard Participants:\n\n {denoms_df_participants_counts_standard} \n")
print(f"Culturally Participants:\n\n {denoms_df_participants_counts_culturally} \n ")

print(f"Standard Families:\n\n {denoms_df_families_standard} \n")
print(f"Culturally Families:\n\n {denoms_df_families_culturally} ")

Final Output: Displaying Results for Standard and Culturally-Modified Classes

In [None]:
print(" ----------- Classes (Results) ----------- \n ")
print(f"Standard:\n\n {classes_normalized_standard} \n")
print(f"Culturally:\n\n {classes_normalized_culturally} ")

Merging and Creating Final Clinical Trials DataFrame

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Reference: https://www.geeksforgeeks.org/python-pandas-merging-joining-and-concatenating/
frames = [title_df, descriptions_df, denoms_df_participants_counts_standard, 
          denoms_df_participants_counts_culturally, denoms_df_families_standard, 
          denoms_df_families_culturally, classes_normalized_standard, classes_normalized_culturally]

clinical_trials_final_df = pd.concat(frames, axis=1, join='inner')
clinical_trials_final_df = clinical_trials_final_df.fillna(0)
clinical_trials_final_df
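With `axis=1`, `pd.concat` aligns the frames on their row index, and `join='inner'` keeps only index labels present in every frame. The toy frames below (invented data) show the difference from an outer join, which is worth keeping in mind when the input frames have different numbers of rows.

```python
import pandas as pd

# Toy frames with default integer indexes, for illustration only
left = pd.DataFrame({"Title": ["a", "b", "c"]})   # index 0, 1, 2
right = pd.DataFrame({"Result": [1.0, 2.0]})      # index 0, 1

inner = pd.concat([left, right], axis=1, join="inner")  # rows 0 and 1 only
outer = pd.concat([left, right], axis=1, join="outer")  # rows 0, 1, 2 with NaN for the gap

print(inner)
print(outer)
```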

Creating Treatment DataFrame by Removing the Last N Rows

In [None]:
# Reference: https://www.geeksforgeeks.org/remove-last-n-rows-of-a-pandas-dataframe/

# Number of rows to drop
n = 3

# Removing last n rows
treatment_df = clinical_trials_final_df.iloc[:-n]
treatment_df

Creating Treatment Results DataFrame by Dropping Non-Results Columns

In [None]:
columns_to_drop = ['Standard Participant Count', 'Culturally Modified Participant Count', 'Standard Families', 'Culturally Modified Families']

treatment_results_only_df = treatment_df.drop(columns=columns_to_drop, axis=1)
treatment_results_only_df

<h5><b>Grouped Bar Chart:</b> Visualizing ADHD Treatment Data for Latino Families</h5>


In [None]:
# Reference: https://www.youtube.com/watch?v=1h0LvhDg9NA: Plot Grouped Bar Graph With Python and Pandas
# Reference: https://www.geeksforgeeks.org/create-a-grouped-bar-plot-in-matplotlib/

import matplotlib.pyplot as plt
import pandas as pd
import textwrap

ax = treatment_results_only_df.plot(x='Title', 
        kind='bar', 
        stacked=False, 
        title='ADHD Treatment for Latino Families', figsize=(10,8)) 

ax.margins(y=0.25)  # Add 25% margin above the tallest bar

# Reference: https://stackoverflow.com/questions/10998621/rotate-axis-tick-labels
plt.xticks(rotation='horizontal')
plt.legend(loc='upper right')

# Reference: https://stackoverflow.com/questions/57473450/split-tick-labels-or-wrap-tick-labels/57473887
wrapped_labels = [textwrap.fill(title, 10) for title in treatment_df['Title']]
plt.xticks(range(len(wrapped_labels)), wrapped_labels, fontsize=10, ha='center')

plt.tight_layout()
plt.show()

Creating Change DataFrame by Removing the First F Rows and the Last Row

In [None]:
# Number of leading rows to drop
f = 4
# Keep rows from position f up to (but not including) the last row
change_df = clinical_trials_final_df.iloc[f:-1]
change_df

Creating Change Results DataFrame by Dropping Non-Results Columns

In [None]:
columns_to_drop2 = ['Standard Participant Count', 'Culturally Modified Participant Count', 'Standard Families', 'Culturally Modified Families']

change_results_only_df = change_df.drop(columns=columns_to_drop2, axis=1)
change_results_only_df

Scatter Plot: Comparing ADHD Treatment Outcomes for Latino Families

In [None]:
# Reference: https://matplotlib.org/stable/gallery/lines_bars_and_markers/scatter_with_legend.html#sphx-glr-gallery-lines-bars-and-markers-scatter-with-legend-py
# Reference: https://stackoverflow.com/questions/39712767/how-to-set-size-for-scatter-plot

import matplotlib.pyplot as plt

plt.figure(figsize=(6, 6))

# Reference: https://www.geeksforgeeks.org/matplotlib-pyplot-scatter-in-python/
# dataset1
x1 = change_results_only_df["Title"]
y1 = change_results_only_df["Standard ADHD Parent Training Results"]	
 
# dataset2
x2 = change_results_only_df["Title"]
y2 = change_results_only_df["Culturally-modified ADHD Parent Training Results"]	
 
# Reference: https://matplotlib.org/stable/api/markers_api.html
plt.scatter(x1, y1, c ="blue", 
            linewidths = 2, 
            marker ="o", 
            edgecolor ="blue", 
            s = 50, label="Standard ADHD Parent Training Results")
 
plt.scatter(x2, y2, c ="orange",
            linewidths = 2,
            marker ="o", 
            edgecolor ="orange", 
            s = 50, label= "Culturally-modified ADHD Parent Training Results")

wrapped_labels2 = [textwrap.fill(title, 15) for title in change_results_only_df['Title']]
plt.xticks(
    range(len(wrapped_labels2)),  # Set positions
    wrapped_labels2,             # Wrapped labels
    fontsize=10, 
    ha='center', 
    rotation=0  
)

# Reference: https://stackoverflow.com/questions/42223587/how-to-add-title-and-xlabel-and-ylabel
plt.ylabel("Change in Scores (Closer to 0 = Worse)")
plt.legend(loc='lower left')
plt.title("ADHD Treatment for Latino Families")

# Reference: https://stackoverflow.com/questions/66446687/how-do-i-make-a-dashed-horizontal-line-with-matplotlib
plt.axhline(0, linestyle='--', color="black", linewidth=0.8)

# Adjust layout
plt.tight_layout()
plt.show()

Creating a DataFrame Without Descriptions Column

In [None]:
change_results_only_df_dropped_descriptions = change_results_only_df.drop('Descriptions', axis=1)
change_results_only_df_dropped_descriptions
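To compare the two arms numerically rather than only visually, a per-measure difference column can be added. The sketch below uses the same column names as the DataFrame above but invented titles and values; whether a positive or negative difference is favorable depends on how each outcome measure is scored.

```python
import pandas as pd

# Invented values, for illustration only
results = pd.DataFrame({
    "Title": ["Measure A", "Measure B"],
    "Standard ADHD Parent Training Results": [-1.0, 2.0],
    "Culturally-modified ADHD Parent Training Results": [-1.5, 2.5],
})

# Difference between arms for each outcome measure
results["Difference (culturally-modified minus standard)"] = (
    results["Culturally-modified ADHD Parent Training Results"]
    - results["Standard ADHD Parent Training Results"]
)
print(results)
```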

Creating and Displaying a DataFrame as a Table Using Matplotlib

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Reference: https://www.scaler.com/topics/matplotlib/matplotlib-table/
# Reference: https://stackoverflow.com/questions/32137396/how-do-i-plot-only-a-table-in-matplotlib

# Create a figure and axes
fig, ax = plt.subplots()

# Hide the axes
ax.axis('off')

# Create the table
table = ax.table(cellText=change_results_only_df_dropped_descriptions.values, colLabels=change_results_only_df_dropped_descriptions.columns, loc='center')

# Adjust table properties 
table.auto_set_font_size(False)
table.set_fontsize(12)
table.scale(5.5, 5.5) 

# Show the plot
plt.show()

## Findings: Effectiveness of Culturally Modified ADHD Treatments

* **Improved Outcomes**:

   - Culturally modified treatments lead to better outcomes for Latino families, including increased engagement, improved acceptability, and reduced stress levels.


* **Comparative Analysis**:

   - The scatter plot and grouped bar chart above suggest that the culturally adapted treatment outperforms the standard treatment on the plotted outcome measures.

* **Potential Applications**:

   - These findings support extending culturally modified treatments to diverse populations and exploring their effectiveness in other demographic groups.

## Key Insights

* <b> ADHD Prevalence: </b>

    - ADHD prevalence is lower among Hispanic children than among Non-Hispanic children, but a proportionally larger share of Hispanic children with ADHD have severe ADHD.

* <b>Treatment Disparities:</b>

    - Hispanic children are more likely to receive neither medication nor behavioral treatment (37.3%) than Non-Hispanic children (28.2%).

* <b>Culturally Modified Treatment: </b>

    - Showed significant benefits, including:
    
        - Improved engagement in treatment
        - Greater acceptability of treatment
        - Reduction in ADHD symptoms
        - Decrease in maternal parenting stress

These findings emphasize the need for culturally tailored interventions to reduce disparities and improve outcomes.
