# Introduction

Provide a brief introduction that will motivate your report for the reader. This is especially important if your topic, dataset, or the focus of your investigation has evolved or changed as you continued your project. Why is your topic significant or interesting? Why did you want to work on this?

Dataset and Guiding Questions

Describe your dataset, especially key features (columns) or things you thought were particularly interesting about your dataset. 

Reiterate your guiding questions, highlighting where they may have evolved or changed in response to feedback or your own discoveries in working with the dataset.

Analysis

Provide answers to those questions appropriately backed up by your exploratory analysis. You may include appropriate cells (both Markdown and code) to walk your reader through the different wrangling and visualization tasks you performed.

Discussion and Conclusion

Finally, take a moment to discuss your process and your results. What were your results, and did they align with what you expected to find? How did the presentations or other feedback you received affect the process? What were the major contributions of each team member? Finally, what would you want to do or learn next with this dataset? Are there new guiding questions you've thought of or something else you want to try?

References and AI Use

All references should be cited appropriately (using quotations and paraphrasing where it makes sense) in an appropriate and consistent format, and included with your report. Include any background information that contextualizes or supports the analysis of your dataset, technical resources that helped you solve a problem in your analysis, and any libraries that were not covered in the class (a link to the library's documentation is sufficient).

If you have used generative AI tools to support any part of this analysis, remember that you must present as an appendix an ethical position for your AI use, as well as a listing of all the prompts you used (and a description of what you used them for).

In [4]:
#Make sure all appropriate libraries loaded
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.formula.api import ols
import plotly.express as px
from pydataset import data

In [7]:
#Load dataset attached to submission
df=pd.read_csv('pumf_cchs.csv')

In [15]:
#This should filter all of the variables we want on the general level. You will need to determine which variables you will use
#IMPORTANT I did not include all the CCC values for chronic disease here. I can look and see what we need, and try to adjust shortly
#If you think anything is missing here, feel free to change
df_general=df.filter(items=[
    'ALC_015',
    'ALC_020',
    'GEN_010',
    'GEN_015',
    'GEN_020',
    'GEN_025',
    'CCC_195',
    'GEN_005',
    'HWTDGISW',
    'CAN_015',
    'SMK_005',
    'SMK_060',
    'PAADVACV',
    'PAA_030',
    'PAA_060',
    'PAA_095',
    'SBE_005',
    'SBE_010',
    'PAA_005',
    'DHHGAGE',
    'DHH_SEX',
    'GEOGPRV',
    'EHG2DVH3',
    'HWT_050',
    'PEX_005',
    'ADM_RNO1',
    'GENDVHDI'
    ])
print(df_general.head())

   ALC_015  ALC_020  GEN_010  GEN_015  GEN_020  GEN_025  CCC_195  GEN_005  \
0      5.0      3.0      9.0      3.0      2.0      2.0      2.0      3.0   
1      1.0      1.0      4.0      3.0      3.0      6.0      1.0      3.0   
2     96.0     96.0      7.0      3.0      3.0      6.0      2.0      2.0   
3     96.0     96.0      8.0      3.0      3.0      6.0      2.0      3.0   
4     96.0     96.0      0.0      5.0      4.0      6.0      2.0      5.0   

   HWTDGISW  CAN_015  ...  SBE_010  PAA_005  DHHGAGE  DHH_SEX  GEOGPRV  \
0       1.0      2.0  ...      6.0      2.0      3.0      2.0     47.0   
1       2.0      2.0  ...      6.0      2.0      5.0      1.0     47.0   
2       2.0      2.0  ...      1.0      6.0      5.0      2.0     59.0   
3       2.0      2.0  ...      6.0      6.0      5.0      1.0     13.0   
4       2.0      2.0  ...      6.0      6.0      4.0      1.0     46.0   

   EHG2DVH3  HWT_050  PEX_005  ADM_RNO1  GENDVHDI  
0       3.0      3.0     96.0      1000 

## Analysis

### Health Drivers Vs. Health Barriers

#### Question _ - How does alcohol consumption affect mental health among different age groups? 

#### Question _ - How does cannabis use impact stress levels?

In [None]:
#Start from the full dataset (the earlier copy of df_general was overwritten immediately, so it is dropped)
df_q4=df.copy()
df_q4=df_q4.filter(items=['GEN_020','GEN_025','CAN_015'])
print(df_q4.head())

In [None]:
df_q4.to_csv('Question 4.csv',index=False)

In [None]:
df_q4.info()

In [None]:
df_q4.describe()

In [None]:
df_q4.dtypes

In [None]:
df_q4.isnull().sum() #check for missing values

In [None]:
df_q4[['CAN_015','GEN_020','GEN_025']]=df_q4[['CAN_015','GEN_020','GEN_025']].astype(int) #Converts Cannabis Use CAN_015, GEN_020, GEN_025 into integer type.
df_q4.info()

In [None]:
df_q4=df_q4.drop(df_q4[df_q4['GEN_020']==7].index)
df_q4=df_q4.drop(df_q4[df_q4['GEN_020']==8].index)

In [None]:
df_q4=df_q4.drop(df_q4[df_q4['GEN_025']==6].index)
df_q4=df_q4.drop(df_q4[df_q4['GEN_025']==7].index)
df_q4=df_q4.drop(df_q4[df_q4['GEN_025']==8].index)
df_q4=df_q4.drop(df_q4[df_q4['GEN_025']==9].index)

In [None]:
df_q4=df_q4.drop(df_q4[df_q4['CAN_015']==7].index)
df_q4=df_q4.drop(df_q4[df_q4['CAN_015']==8].index)
df_q4=df_q4.drop(df_q4[df_q4['CAN_015']==9].index)

In [None]:
df_q4.describe()

In [None]:
df4_stress=df_q4.groupby('GEN_020', as_index=False)['CAN_015'].value_counts(normalize=True,sort=False)
display(df4_stress)

In [None]:
plt.figure(figsize=(12, 6))
plt.bar(df4_stress['GEN_020'].unique(),df4_stress['proportion'][::2],width=0.3,label='People Used Cannabis in Past 12 Months')
plt.bar(df4_stress['GEN_020'].unique()+0.3,df4_stress['proportion'][1::2],width=0.3, label="People Didn't Use Cannabis in Past 12 Months")
plt.xticks(df4_stress['GEN_020'].unique()+0.3/2,('Not at all stressful','Not very stressful','A bit stressful','Quite a bit stressful','Extremely stressful'))
plt.xlabel('Perceived Life Stress by Cannabis Use')
plt.ylabel('Proportion') #value_counts(normalize=True) returns proportions between 0 and 1
plt.legend()
plt.show()
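
The bar chart above pairs users and non-users by taking every other row of the grouped table (`[::2]` and `[1::2]`). That works only if every stress level contains both `CAN_015` answers, in the same order. A minimal sketch of a more defensive layout, using `pd.crosstab` on a small made-up frame (the `toy` data and the `shares` name are illustrative assumptions, not part of the CCHS file):

```python
import pandas as pd

# Made-up responses, not CCHS data.
# GEN_020: perceived life stress level; CAN_015: 1 = used cannabis, 2 = did not
toy = pd.DataFrame({
    'GEN_020': [1, 1, 2, 2, 2, 3],
    'CAN_015': [1, 2, 2, 2, 1, 2],
})

# One row per stress level, one column per cannabis answer.
# normalize='index' makes each row sum to 1, like value_counts(normalize=True).
shares = pd.crosstab(toy['GEN_020'], toy['CAN_015'], normalize='index')

# Guarantee both answer columns exist even if one answer never occurs
shares = shares.reindex(columns=[1, 2], fill_value=0.0)
print(shares)
```

`shares[1]` and `shares[2]` could then be passed to `plt.bar` in place of the two slices, and a stress level with no cannabis users simply shows a zero-height bar instead of shifting the remaining bars.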

In [None]:
df4_stress=df_q4.groupby('GEN_025', as_index=False)['CAN_015'].value_counts(normalize=True,sort=False)
display(df4_stress)

In [None]:
plt.figure(figsize=(12, 6))
plt.bar(df4_stress['GEN_025'].unique(),df4_stress['proportion'][::2],width=0.3,label='People Used Cannabis in Past 12 Months')
plt.bar(df4_stress['GEN_025'].unique()+0.3,df4_stress['proportion'][1::2],width=0.3, label="People Didn't Use Cannabis in Past 12 Months")
plt.xticks(df4_stress['GEN_025'].unique()+0.3/2,('Not at all stressful','Not very stressful','A bit stressful','Quite a bit stressful','Extremely stressful'))
plt.xlabel('Perceived Work Stress by Cannabis Use')
plt.ylabel('Proportion') #value_counts(normalize=True) returns proportions between 0 and 1
plt.legend()
plt.show()

#### Question _ - What is the relationship between smoking and physical health outcomes?

#### Question _ - Which demographic groups (age, sex, region) are most affected by health barriers? 

In [None]:
df_q5=df.copy()
df_q5=df_q5.filter(items=['CAN_015','SMK_005','SMK_060','ALC_015','ALC_020','DHHGAGE','DHH_SEX','GEOGPRV','EHG2DVH3']) #SMK_060 was misspelled as SMMK_060, so filter() silently skipped it
print(df_q5.head())

The columns for the variables of interest were extracted from the dataset to answer the guiding questions.
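
`DataFrame.filter(items=...)` keeps only the names it finds and does not raise an error for names that are absent, so a misspelled column name can go unnoticed. The sketch below shows one way to make the selection fail loudly. The helper `select_columns` and the `toy` frame are hypothetical and only illustrate the idea:

```python
import pandas as pd

def select_columns(frame, wanted):
    """Return the requested columns, raising if any name is absent."""
    missing = [name for name in wanted if name not in frame.columns]
    if missing:
        raise KeyError(f"Columns not found in the dataset: {missing}")
    return frame[wanted].copy()

# Made-up frame, not CCHS data
toy = pd.DataFrame({'CAN_015': [1, 2], 'SMK_005': [3, 1], 'DHH_SEX': [1, 2]})

# A valid request returns just those columns
subset = select_columns(toy, ['CAN_015', 'DHH_SEX'])

# A request with an unknown name raises instead of being skipped
try:
    select_columns(toy, ['CAN_015', 'NOT_A_COLUMN'])
    caught = False
except KeyError:
    caught = True
print(subset.columns.tolist(), caught)
```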

In [None]:
df_q5.to_csv('Question 5.csv',index=False)

In [None]:
df_q5.info()

In [None]:
df_q5.describe()

In [None]:
df_q5.dtypes

In [None]:
df_q5.isnull().sum() #check for missing values

## Data Cleaning
To clean the dataset, we removed response codes from the variables of interest that do not represent real answers and could skew the analysis, such as "don't know", "refusal" and "not stated".
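
The removal of these codes can also be sketched in one step with a mapping from column to excluded codes. The `SPECIAL_CODES` mapping and the `drop_special_codes` helper are hypothetical names. The codes listed mirror the ones dropped in the cells of this notebook and should be confirmed against the CCHS data dictionary:

```python
import pandas as pd

# Hypothetical mapping: column name -> special response codes to exclude.
# Assumed from the drop statements in this notebook, not from the data dictionary.
SPECIAL_CODES = {
    'CAN_015': [7, 8, 9],
    'SMK_005': [7, 8, 9],
    'ALC_015': [96, 97, 98, 99],
}

def drop_special_codes(frame, special_codes):
    """Return a copy of frame without rows holding any listed special code."""
    keep = pd.Series(True, index=frame.index)
    for column, codes in special_codes.items():
        if column in frame.columns:
            keep &= ~frame[column].isin(codes)
    return frame[keep].copy()

# Made-up rows, not CCHS data
toy = pd.DataFrame({
    'CAN_015': [1, 2, 7, 2],
    'SMK_005': [3, 9, 1, 2],
    'ALC_015': [5, 1, 2, 96],
})
cleaned = drop_special_codes(toy, SPECIAL_CODES)
print(cleaned)
```

Keeping the codes in one mapping makes it easier to review which values were treated as non-answers for each variable.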

In [None]:
df_q5=df_q5.drop(df_q5[df_q5['CAN_015']==7].index)
df_q5=df_q5.drop(df_q5[df_q5['CAN_015']==8].index)
df_q5=df_q5.drop(df_q5[df_q5['CAN_015']==9].index)

In [None]:
df_q5=df_q5.drop(df_q5[df_q5['SMK_005']==7].index)
df_q5=df_q5.drop(df_q5[df_q5['SMK_005']==8].index)
df_q5=df_q5.drop(df_q5[df_q5['SMK_005']==9].index)

In [None]:
df_q5=df_q5.drop(df_q5[df_q5['ALC_015']==96].index)
df_q5=df_q5.drop(df_q5[df_q5['ALC_015']==97].index)
df_q5=df_q5.drop(df_q5[df_q5['ALC_015']==98].index)
df_q5=df_q5.drop(df_q5[df_q5['ALC_015']==99].index)

In [None]:
df_q5=df_q5.drop(df_q5[df_q5['ALC_020']==96].index)
df_q5=df_q5.drop(df_q5[df_q5['ALC_020']==97].index)
df_q5=df_q5.drop(df_q5[df_q5['ALC_020']==98].index)
df_q5=df_q5.drop(df_q5[df_q5['ALC_020']==99].index)

In [None]:
df_q5=df_q5.drop(df_q5[df_q5['EHG2DVH3']==9].index)

In [None]:
df_q5['CAN_015'] = df_q5['CAN_015'].astype(int)
df_q5['CAN_015'].dtypes

In [None]:
df_q5['SMK_005'] = df_q5['SMK_005'].astype(int)
df_q5['SMK_005'].dtypes

In [None]:
df_q5['ALC_015'] = df_q5['ALC_015'].astype(int)
df_q5['ALC_015'].dtypes

In [None]:
df_q5['ALC_020'] = df_q5['ALC_020'].astype(int)
df_q5['ALC_020'].dtypes

In [None]:
df_q5['DHH_SEX'] = df_q5['DHH_SEX'].astype(int)
df_q5['DHH_SEX'].dtypes

In [None]:
df_q5['DHHGAGE'] = df_q5['DHHGAGE'].astype(int)
df_q5['DHHGAGE'].dtypes

In [None]:
df_q5['GEOGPRV'] = df_q5['GEOGPRV'].astype(int)
df_q5['GEOGPRV'].dtypes

In [None]:
df_q5['EHG2DVH3'] = df_q5['EHG2DVH3'].astype(int)
df_q5['EHG2DVH3'].dtypes
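
The per-column conversions above could also be written as a single call. The sketch below is an illustration on made-up data and uses pandas' nullable `Int64` type, which is an assumption on our part: unlike plain `int`, it tolerates missing values that may remain in a column.

```python
import pandas as pd

# Made-up frame, not CCHS data; CAN_015 contains one missing value
toy = pd.DataFrame({
    'CAN_015': [1.0, 2.0, None],
    'DHH_SEX': [1.0, 2.0, 2.0],
})

code_columns = ['CAN_015', 'DHH_SEX']

# Convert every listed column in one call; missing values become <NA>
toy = toy.astype({column: 'Int64' for column in code_columns})
print(toy.dtypes)
```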

In [None]:
df_q5.describe()

In [None]:
df5_cansex=df_q5.groupby('DHH_SEX', as_index=False)['CAN_015'].value_counts(normalize=True,sort=False)
display(df5_cansex)

In [None]:
plt.figure(figsize=(10,7))
plt.bar(df5_cansex['DHH_SEX'].unique(),df5_cansex['proportion'][::2],width=0.3,label='People Used Cannabis in Past 12 Months')
plt.bar(df5_cansex['DHH_SEX'].unique()+0.3,df5_cansex['proportion'][1::2],width=0.3, label="People Didn't Use Cannabis in Past 12 Months")
plt.xticks(df5_cansex['DHH_SEX'].unique()+0.3/2,('Male','Female'))
plt.xlabel('Cannabis Use by Sex')
plt.ylabel('Proportion') #value_counts(normalize=True) returns proportions between 0 and 1
plt.legend()
plt.show()

In [None]:
cannabis_sex_data=df_q5[['CAN_015', 'DHH_SEX']].dropna()
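#Count respondents who reported cannabis use (CAN_015==1) for each sex; summing the 1/2 response codes would mix users and non-users
cannabis_sex=cannabis_sex_data[cannabis_sex_data['CAN_015']==1].groupby('DHH_SEX').size()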
DHH_SEX_labels={
    1: "Male",
    2: "Female"
}
cannabis_sex.index=cannabis_sex.index.map(DHH_SEX_labels)

plt.figure(figsize=(5,5))
cannabis_sex.plot(kind='pie',autopct='%1.1f%%',colors=['blue','orange'],startangle=90,textprops={'fontsize':12})
plt.title('Cannabis Use by Sex')
plt.ylabel('')  
plt.show()

In [None]:
smoke_sex=df_q5[['SMK_005', 'DHH_SEX']].dropna()
smoke_sex=smoke_sex.groupby('DHH_SEX')['SMK_005'].mean()
DHH_SEX_labels={
    1: "Male",
    2: "Female"
}
smoke_sex.index=smoke_sex.index.map(DHH_SEX_labels)

plt.figure(figsize=(8,6))
plt.ylim(0,3)
smoke_sex.plot(kind='bar', color=['skyblue', 'pink'])
plt.title('Average Smoke Frequency by Sex')
plt.xlabel('Sex')
plt.ylabel('Average Smoke Frequency')
plt.xticks(rotation=0)  
plt.grid(axis='y',linestyle='--',alpha=0.7)
plt.show()

In [None]:
smoking_sex=df_q5[['SMK_005', 'DHH_SEX']].dropna()
smoking_bysex=smoking_sex.groupby('DHH_SEX')['SMK_005'].sum()
DHH_SEX_labels={
    1: "Male",
    2: "Female"
}
smoking_bysex.index=smoking_bysex.index.map(DHH_SEX_labels)

plt.figure(figsize=(5, 5))
smoking_bysex.plot(kind='pie', autopct='%1.1f%%', colors=['skyblue', 'pink'], startangle=90, textprops={'fontsize': 12})
plt.title('Smoke Frequency by Sex')
plt.ylabel('')  
plt.show()
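
The pie chart above is built from the sum of the `SMK_005` response codes. If the coding is 1 = daily, 2 = occasionally, 3 = not at all (an assumption to check against the data dictionary), a "not at all" answer adds three times as much to the sum as a "daily" answer, so the slices may not reflect how many people smoke. A minimal sketch of a count-based alternative on made-up data:

```python
import pandas as pd

# Made-up responses, not CCHS data.
# Assumed coding: SMK_005 1 = daily, 2 = occasionally, 3 = not at all
toy = pd.DataFrame({
    'DHH_SEX': [1, 1, 1, 2, 2, 2],
    'SMK_005': [1, 3, 2, 3, 3, 1],
})

# Keep only respondents who smoke daily or occasionally
is_smoker = toy['SMK_005'].isin([1, 2])

# Count smokers per sex, then express each count as a share of all smokers
smokers_by_sex = toy[is_smoker].groupby('DHH_SEX').size()
smoker_share = smokers_by_sex / smokers_by_sex.sum()
smoker_share.index = smoker_share.index.map({1: 'Male', 2: 'Female'})
print(smoker_share)
```

`smoker_share` could be passed to `.plot(kind='pie')` in the same way as the summed series.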

In [None]:
alcohol_sex=df_q5[['ALC_015', 'ALC_020', 'DHH_SEX']].dropna()
alc_015_bysex=alcohol_sex.groupby('DHH_SEX')['ALC_015'].mean()
alc_020_bysex=alcohol_sex.groupby('DHH_SEX')['ALC_020'].mean()
DHH_SEX_labels={
    1: "Male",
    2: "Female"
}
alc_015_bysex.index=alc_015_bysex.index.map(DHH_SEX_labels)
alc_020_bysex.index=alc_020_bysex.index.map(DHH_SEX_labels)

plt.figure(figsize=(14,6))

plt.subplot(1,2,1)  
alc_015_bysex.plot(kind='bar',color=['blue','orange'])
plt.title('Average Alcohol Consumption Frequency by Sex')
plt.xlabel('Sex')
plt.ylabel('Average Alcohol Consumption Frequency')
plt.xticks(rotation=0)  
plt.grid(axis='y',linestyle='--',alpha=0.7)

plt.subplot(1,2,2) 
alc_020_bysex.plot(kind='bar',color=['blue','orange'])
plt.title('Average Alcohol Consumption (Drink 4+/5+ One Occasion) by Sex')
plt.xlabel('Sex')
plt.ylabel('Average Alcohol Consumption (Drink 4+/5+ One Occasion)')
plt.xticks(rotation=0)  
plt.grid(axis='y',linestyle='--',alpha=0.7)
plt.tight_layout()
plt.show()

In [None]:
df5_smksex=df_q5.groupby('DHH_SEX', as_index=False)['SMK_005'].value_counts(normalize=True,sort=False)
display(df5_smksex)

In [None]:
data_filtered=df_q5[['CAN_015', 'DHHGAGE']].dropna()
cannabis_use_different_age=data_filtered.groupby('DHHGAGE')['CAN_015'].mean()
DHHGAGE_labels={
    1: '12 to 17 years',
    2: '18 to 34 years',
    3: '35 to 49 years',
    4: '50 to 64 years',
    5: '65 years and older'
}
cannabis_use_different_age.index=cannabis_use_different_age.index.map(DHHGAGE_labels)

plt.figure(figsize=(10,6))
cannabis_use_different_age.plot(kind='bar',color='olive')
plt.title('Average Cannabis Use by Different Age Group')
plt.xlabel('Age Group')
plt.ylabel('Average Cannabis Use')
plt.xticks(rotation=45, ha='right')
plt.grid(axis='y',linestyle='--',alpha=0.7)
plt.show()

In [None]:
smoke_data_filtered=df_q5[['SMK_005', 'DHHGAGE']].dropna()
smoke_byage=smoke_data_filtered.groupby('DHHGAGE')['SMK_005'].mean()
DHHGAGE_labels={
    1: "12 to 17 years",
    2: "18 to 34 years",
    3: "35 to 49 years",
    4: "50 to 64 years",
    5: "65 years and older"
}
smoke_byage.index=smoke_byage.index.map(DHHGAGE_labels)

plt.figure(figsize=(10,6))
smoke_byage.plot(kind='bar',color='purple')
plt.title('Average Smoke Frequency by Age Group')
plt.xlabel('Age Group')
plt.ylabel('Average Smoke Frequency')
plt.xticks(rotation=45,ha='right')
plt.grid(axis='y',linestyle='--',alpha=0.7)
plt.show()

In [None]:
alcohol_data_filtered=df_q5[['ALC_015', 'ALC_020', 'DHHGAGE']].dropna()
alc_015_by_age=alcohol_data_filtered.groupby('DHHGAGE')['ALC_015'].mean()
alc_020_by_age=alcohol_data_filtered.groupby('DHHGAGE')['ALC_020'].mean()
DHHGAGE_labels={
    1: "12 to 17 years",
    2: "18 to 34 years",
    3: "35 to 49 years",
    4: "50 to 64 years",
    5: "65 years and older"
}
alc_015_by_age.index=alc_015_by_age.index.map(DHHGAGE_labels)
alc_020_by_age.index=alc_020_by_age.index.map(DHHGAGE_labels)

plt.figure(figsize=(14,6))
plt.subplot(1,2,1)  
alc_015_by_age.plot(kind='bar',color='coral')
plt.title('Average Alcohol Consumption Frequency in Past 12 Months')
plt.xlabel('Age Group')
plt.ylabel('Average Alcohol Consumption Frequency')
plt.xticks(rotation=45,ha='right')
plt.grid(axis='y',linestyle='--',alpha=0.7)

plt.subplot(1,2,2) 
plt.ylim(0,2.5)
alc_020_by_age.plot(kind='bar',color='mediumblue')
plt.title('Average Alcohol Consumption (Drink 4+/5+ One Occasion)')
plt.xlabel('Age Group')
plt.ylabel('Average Alcohol Consumption (Drink 4+/5+ One Occasion)')
plt.xticks(rotation=45,ha='right')
plt.grid(axis='y',linestyle='--',alpha=0.7)
plt.tight_layout()
plt.show()

In [None]:
cannabis_region=df_q5[['CAN_015', 'GEOGPRV']].dropna()
GEOGPRV_labels={
    10: "NEWFOUNDLAND AND LABRADOR",
    11: "PRINCE EDWARD ISLAND",
    12: "NOVA SCOTIA",
    13: "NEW BRUNSWICK",
    24: "QUEBEC",
    35: "ONTARIO",
    46: "MANITOBA",
    47: "SASKATCHEWAN",
    48: "ALBERTA",
    59: "BRITISH COLUMBIA",
    60: "YUKON/NORTHWEST/NUNAVUT TERRITORIES"
}
cannabis_region['GEOGPRV']=cannabis_region['GEOGPRV'].map(GEOGPRV_labels)
cannabis_byregion=cannabis_region.groupby('GEOGPRV')['CAN_015'].mean()

plt.figure(figsize=(10,7))
plt.ylim(0,2)
cannabis_byregion.plot(kind='bar', color='olive')
plt.title('Average Cannabis Use by Region')
plt.xlabel('Region')
plt.ylabel('Average Cannabis Use')
plt.xticks(rotation=45, ha='right')
plt.grid(axis='y',linestyle='--',alpha=0.7)
plt.tight_layout()
plt.show()

In [None]:
smoke_region=df_q5[['SMK_005', 'GEOGPRV']].dropna()
GEOGPRV_labels={
    10: "NEWFOUNDLAND AND LABRADOR",
    11: "PRINCE EDWARD ISLAND",
    12: "NOVA SCOTIA",
    13: "NEW BRUNSWICK",
    24: "QUEBEC",
    35: "ONTARIO",
    46: "MANITOBA",
    47: "SASKATCHEWAN",
    48: "ALBERTA",
    59: "BRITISH COLUMBIA",
    60: "YUKON/NORTHWEST/NUNAVUT TERRITORIES"
}
smoke_region['GEOGPRV']=smoke_region['GEOGPRV'].map(GEOGPRV_labels)
smoke_byregion=smoke_region.groupby('GEOGPRV')['SMK_005'].mean()

plt.figure(figsize=(10,7))
smoke_byregion.plot(kind='bar',color='purple')
plt.title('Average Smoke Frequency by Region')
plt.xlabel('Region')
plt.ylabel('Average Smoke Frequency')
plt.xticks(rotation=45,ha='right')
plt.grid(axis='y',linestyle='--',alpha=0.7)
plt.tight_layout()
plt.show()

In [None]:
alcohol_data_filtered=df_q5[['ALC_015', 'ALC_020', 'GEOGPRV']].dropna()
alc_015_region=alcohol_data_filtered.groupby('GEOGPRV')['ALC_015'].mean()
alc_020_region=alcohol_data_filtered.groupby('GEOGPRV')['ALC_020'].mean()
GEOGPRV_labels={
    10: "NEWFOUNDLAND AND LABRADOR",
    11: "PRINCE EDWARD ISLAND",
    12: "NOVA SCOTIA",
    13: "NEW BRUNSWICK",
    24: "QUEBEC",
    35: "ONTARIO",
    46: "MANITOBA",
    47: "SASKATCHEWAN",
    48: "ALBERTA",
    59: "BRITISH COLUMBIA",
    60: "YUKON/NORTHWEST/NUNAVUT TERRITORIES"
}
alc_015_region.index=alc_015_region.index.map(GEOGPRV_labels)
alc_020_region.index=alc_020_region.index.map(GEOGPRV_labels)

plt.figure(figsize=(14,6))

plt.subplot(1,2,1)  
alc_015_region.plot(kind='bar',color='coral')
plt.title('Average Alcohol Consumption Frequency (Past 12 Months) By Region')
plt.xlabel('Region Group')
plt.ylabel('Average Alcohol Consumption Frequency')
plt.xticks(rotation=45,ha='right')
plt.grid(axis='y',linestyle='--',alpha=0.7)

plt.subplot(1,2,2)  
alc_020_region.plot(kind='bar',color='mediumblue')
plt.title('Average Alcohol Consumption (Drink 4+/5+ One Occasion) By Region')
plt.xlabel('Region Group')
plt.ylabel('Average Alcohol Consumption (Drink 4+/5+ One Occasion)')
plt.xticks(rotation=45,ha='right')
plt.grid(axis='y',linestyle='--',alpha=0.7)
plt.tight_layout()
plt.show()

In [None]:
cannabis_education=df_q5[['CAN_015', 'EHG2DVH3']].dropna()
EHG2DVH3_labels={
    1: "Less than secondary school graduation",
    2: "Secondary school graduation, no post-secondary education",
    3: "Post-secondary certificate/diploma/university degree"
}
cannabis_education['EHG2DVH3']=cannabis_education['EHG2DVH3'].map(EHG2DVH3_labels)
cannabis_education=cannabis_education.groupby('EHG2DVH3')['CAN_015'].mean()

plt.figure(figsize=(8.5,7))
cannabis_education.plot(kind='bar',color='olive',width=0.25)
plt.title('Average Cannabis Use by Education Level')
plt.xlabel('Education Level')
plt.ylabel('Average Cannabis Use')
plt.xticks(rotation=45, ha='right')
plt.grid(axis='y',linestyle='--',alpha=0.7)
plt.tight_layout()
plt.show()

In [None]:
smoke_education=df_q5[['SMK_005', 'EHG2DVH3']].dropna()
EHG2DVH3_labels={
    1: "Less than secondary school graduation",
    2: "Secondary school graduation, no post-secondary education",
    3: "Post-secondary certificate/diploma/university degree"
}
smoke_education['EHG2DVH3']=smoke_education['EHG2DVH3'].map(EHG2DVH3_labels)
smoke_education=smoke_education.groupby('EHG2DVH3')['SMK_005'].mean()

plt.figure(figsize=(8, 7))
plt.ylim(0,3)
smoke_education.plot(kind='bar', color='purple', width=0.3)
plt.title('Average Smoke Frequency by Education Level')
plt.xlabel('Education Level')
plt.ylabel('Average Smoke Frequency')
plt.xticks(rotation=45,ha='right')
plt.grid(axis='y',linestyle='--',alpha=0.7)
plt.tight_layout()
plt.show()

In [None]:
alcohol_education=df_q5[['ALC_015', 'ALC_020', 'EHG2DVH3']].dropna()
alc_015_education=alcohol_education.groupby('EHG2DVH3')['ALC_015'].mean()
alc_020_education=alcohol_education.groupby('EHG2DVH3')['ALC_020'].mean()

EHG2DVH3_labels={
    1: "Less than secondary school graduation",
    2: "Secondary school graduation, no post-secondary education",
    3: "Post-secondary certificate/diploma/university degree"
}
alc_015_education.index=alc_015_education.index.map(EHG2DVH3_labels)
alc_020_education.index=alc_020_education.index.map(EHG2DVH3_labels)

plt.figure(figsize=(14,8))

plt.subplot(1,2,1)
alc_015_education.plot(kind='bar',color='coral')
plt.title('Average Alcohol Consumption Frequency by Education Level')
plt.xlabel('Education Level')
plt.ylabel('Average Alcohol Consumption Frequency')
plt.xticks(rotation=45, ha='right')
plt.grid(axis='y',linestyle='--',alpha=0.7)

plt.subplot(1,2,2)
alc_020_education.plot(kind='bar',color='mediumblue')
plt.title('Average Alcohol Consumption (Drink 4+/5+ One Occasion) by Education Level')
plt.xlabel('Education Level')
plt.ylabel('Average Alcohol Consumption (Drink 4+/5+ One Occasion)')
plt.xticks(rotation=45,ha='right')
plt.grid(axis='y',linestyle='--',alpha=0.7)
plt.tight_layout()
plt.show()
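
The group-mean cells above repeat the same three steps: select two columns, average one within each group of the other, and relabel the index. A small helper can hold those steps in one place. The function name `mean_by_group` and the `toy` frame are hypothetical. The labels are copied from the cells above:

```python
import pandas as pd

EDUCATION_LABELS = {
    1: "Less than secondary school graduation",
    2: "Secondary school graduation, no post-secondary education",
    3: "Post-secondary certificate/diploma/university degree",
}

def mean_by_group(frame, group_col, value_col, labels):
    """Average value_col within each group_col level and relabel the index."""
    subset = frame[[group_col, value_col]].dropna()
    means = subset.groupby(group_col)[value_col].mean()
    means.index = means.index.map(labels)
    return means

# Made-up rows, not CCHS data
toy = pd.DataFrame({
    'EHG2DVH3': [1, 1, 2, 3],
    'CAN_015': [1, 2, 2, 2],
})
education_means = mean_by_group(toy, 'EHG2DVH3', 'CAN_015', EDUCATION_LABELS)
print(education_means)
```

The returned series can be plotted with `.plot(kind='bar')` as in the cells above. Because the values are response codes, the mean is a mean of codes and should be labelled as such on the axis.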

### Health Drivers Vs. Health Improvements

#### Question _ - What is the relationship between regular exercise and self-reported health status? 

#### Question _ - How do health improvement activities influence the maintenance or improvement of mental health? 