# 📊 U.S. Chronic Disease Indicators Analysis

## Objective:
This project analyzes data from the U.S. Chronic Disease Indicators dataset with a focus on alcohol-related metrics among **females in 2022**, using the **Behavioral Risk Factor Surveillance System (BRFSS)** data source.

The goal is to explore trends, missing data, and key statistics to understand how alcohol use is reported across states.


#import libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


#load dataset
df = pd.read_csv(r'C:\Users\pc\Downloads\U.S._Chronic_Disease_Indicators.csv')
df.head()

#Filter dataset for Alcohol topic, 2022, BRFSS source, Female
alcohol_females_2022 = df[
    (df['DataSource'] == 'BRFSS') &
    (df['Topic'] == 'Alcohol') &
    (df['YearStart'] == 2022) &
    (df['Stratification1'] == 'Female')
]
alcohol_females_2022.head()

#Data overview
alcohol_females_2022.info()

#data null checks
alcohol_females_2022.isnull().sum()
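Raw null counts are easier to judge next to the share of rows they represent. The sketch below is one possible way to do that; `missing_summary` and the `toy` frame are hypothetical names used only for illustration, and the values are made up rather than taken from the dataset.

```python
import pandas as pd


def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values per column, worst first."""
    counts = frame.isnull().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(frame) * 100).round(1),
    })
    return summary.sort_values("missing", ascending=False)


# Hypothetical toy data standing in for the filtered frame
toy = pd.DataFrame({
    "LocationDesc": ["Alabama", "Alaska", "Arizona", "Arkansas"],
    "DataValue": [5.1, None, 6.3, None],
})

summary = missing_summary(toy)
print(summary)
```

On the real data, the same helper could be called as `missing_summary(alcohol_females_2022)`.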

#keep only the essential columns
alcohol_females_2022=alcohol_females_2022[['YearStart', 'LocationDesc', 'Topic', 'Question', 'DataValueType', 'DataValue','Stratification1']]
alcohol_females_2022.head()

#drop rows with a missing DataValue (reassign instead of inplace=True to avoid a SettingWithCopyWarning on a filtered frame)
alcohol_females_2022 = alcohol_females_2022.dropna(subset=['DataValue'])
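If `DataValue` happens to load as text rather than as numbers (an assumption here; it depends on how the CSV was exported), the missing-value drop can be combined with a numeric conversion. This is a minimal sketch on a hypothetical `toy` frame with made-up values, not the notebook's actual data.

```python
import pandas as pd

# Hypothetical toy data: DataValue arrives as strings, with one
# non-numeric placeholder and one true missing value
toy = pd.DataFrame({
    "LocationDesc": ["State X", "State Y", "State Z"],
    "DataValue": ["5.2", "No data", None],
})

# Convert to numbers; anything that cannot be parsed becomes NaN
toy["DataValue"] = pd.to_numeric(toy["DataValue"], errors="coerce")

# Drop the rows that have no usable value, returning a new frame
clean = toy.dropna(subset=["DataValue"])
print(clean)
```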

#summary statistics for the DataValue column
alcohol_females_2022['DataValue'].describe()
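The filtered frame keeps both `Question` and `DataValueType`, so a single `describe()` may pool indicators that are measured differently. Assuming that more than one combination is present, a per-indicator summary could look like the sketch below. The `toy` frame, its indicator names, and its value-type labels are hypothetical.

```python
import pandas as pd

# Hypothetical toy data with two indicators on different scales
toy = pd.DataFrame({
    "Question": ["Indicator A", "Indicator A", "Indicator A",
                 "Indicator B", "Indicator B"],
    "DataValueType": ["Crude Prevalence", "Crude Prevalence", "Crude Prevalence",
                      "Mean", "Mean"],
    "DataValue": [10.0, 12.0, 14.0, 4.0, 6.0],
})

# One row of summary statistics per (Question, DataValueType) pair
per_indicator = toy.groupby(["Question", "DataValueType"])["DataValue"].describe()
print(per_indicator)
```

Reading the statistics per pair avoids averaging percentages together with values on another scale.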

# top 5 locations by mean DataValue
top_states=alcohol_females_2022.groupby('LocationDesc')['DataValue'].mean().nlargest(5)
top_states.head()
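The ranking above averages every alcohol row per location. As a possible variant, the ranking can be restricted to a single indicator first. In this sketch, `top_locations`, the `toy` frame, and the indicator labels are hypothetical and the numbers are made up.

```python
import pandas as pd


def top_locations(frame: pd.DataFrame, question: str, value_type: str, n: int = 5) -> pd.Series:
    """Rank locations by mean DataValue for one question and value type."""
    subset = frame[
        (frame["Question"] == question) &
        (frame["DataValueType"] == value_type)
    ]
    return subset.groupby("LocationDesc")["DataValue"].mean().nlargest(n)


# Hypothetical toy data: two indicators reported for overlapping locations
toy = pd.DataFrame({
    "LocationDesc": ["State X", "State Y", "State Z", "State X", "State Y"],
    "Question": ["Indicator A", "Indicator A", "Indicator A",
                 "Indicator B", "Indicator B"],
    "DataValueType": ["Crude Prevalence"] * 5,
    "DataValue": [18.0, 21.5, 16.0, 4.0, 3.0],
})

ranking = top_locations(toy, "Indicator A", "Crude Prevalence", n=2)
print(ranking)
```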

# visualization of the top 5 states by prevalence
plt.figure(figsize=(20,5))
sns.barplot(x=top_states.index, y=top_states.values, palette="Blues_d")
plt.title("Top 5 States: Alcohol Prevalence in Females")
plt.ylabel("Prevalence (%)")
plt.xticks(rotation=45)
plt.show()

#Average alcohol use reported by state
plt.figure(figsize=(20,5))
sns.barplot(x='LocationDesc', y='DataValue', data=alcohol_females_2022,errorbar=None)
plt.title("Alcohol Prevalence in Females by State")
plt.xticks(rotation=90)
plt.ylabel('DataValue (%)')
plt.xlabel('State')
plt.tight_layout()
plt.show()

## 📝 Conclusion

- The BRFSS records in the U.S. Chronic Disease Indicators dataset cover alcohol use across U.S. states.
- In 2022, reported alcohol use among females varied from state to state.
- The five locations with the highest average prevalence were District of Columbia (10.10), Montana (8.88), North Dakota (8.55), Minnesota (8.22), and Wisconsin (8.17). Each figure is the mean of `DataValue` across all alcohol rows for that location.
