# 📊 Exploratory Data Analysis on Drugs, Side Effects, and Medical Conditions

This project explores relationships between drugs, their side effects, medical conditions, and user reviews.

## Import Dependencies

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns


## Load Dataset

In [None]:
# Replace with your actual file path
df = pd.read_csv('path_to_your_dataset.csv')
df.head()

## 🧹 Data Cleaning

In [None]:
# Check for missing values
df.isnull().sum()
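
Raw counts are easier to judge next to the share of rows they represent. The sketch below builds a small per-column report with both figures. The `toy` frame and the `missing_report` name are hypothetical stand-ins for illustration; in the notebook the same expressions would be applied to `df`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real dataset
toy = pd.DataFrame({
    "drug_name": ["A", "B", None, "D"],
    "rating": [8.0, np.nan, np.nan, 6.0],
})

# Count of missing values and percentage of rows affected, per column
missing_report = pd.DataFrame({
    "missing": toy.isnull().sum(),
    "percent": toy.isnull().mean().mul(100).round(1),
}).sort_values("percent", ascending=False)

print(missing_report)
```

Columns with a high percentage are usually better handled by filling or by dropping the column than by dropping rows.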

In [None]:
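# Option 1: Drop every row that has at least one missing value
df_cleaned = df.dropna()

# Option 2: Fill missing values in non-numeric columns with a placeholder.
# Numeric columns such as 'rating' are left untouched so they keep their dtype.
text_cols = df.select_dtypes(exclude='number').columns
df_filled = df.copy()
df_filled[text_cols] = df_filled[text_cols].fillna('Unknown')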

## 🔍 Basic Data Exploration

In [None]:
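# df.info() prints its report directly; describe() is displayed only
# when it is the last expression in the cell, so it goes last
df.info()
df.describe(include='all')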

## 📈 Univariate Analysis

In [None]:
plt.figure(figsize=(10, 6))
sns.histplot(df['rating'], bins=10, kde=True)
plt.title('Distribution of Drug Ratings')
plt.xlabel('Rating')
plt.ylabel('Frequency')
plt.show()

In [None]:
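plt.figure(figsize=(10, 6))
# sort_values() puts the most frequent condition at the top of the horizontal bar chart
df['medical_condition'].value_counts().head(10).sort_values().plot(kind='barh')
plt.title('Top 10 Medical Conditions')
plt.xlabel('Count')
plt.ylabel('Medical Condition')
plt.show()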

## 🔗 Bivariate and Multivariate Analysis

In [None]:
# Top Drugs by Condition
top_drugs = df.groupby('medical_condition')['drug_name'].value_counts().groupby(level=0).head(1)
print(top_drugs)
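
The same result can be shaped as a flat table, which is easier to sort, filter, or export than a multi-indexed Series. This is a minimal sketch on hypothetical toy data; the names `counts`, `top_per_condition`, and `n_rows` are illustrative and not part of the dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for the real dataset
toy = pd.DataFrame({
    "medical_condition": ["acne", "acne", "acne", "pain", "pain"],
    "drug_name": ["x", "x", "y", "z", "z"],
})

# Number of rows for every (condition, drug) pair
counts = (
    toy.groupby(["medical_condition", "drug_name"])
    .size()
    .reset_index(name="n_rows")
)

# Keep the most frequent drug for each condition
top_per_condition = (
    counts.sort_values(["medical_condition", "n_rows"], ascending=[True, False])
    .drop_duplicates("medical_condition")
    .reset_index(drop=True)
)

print(top_per_condition)
```

When two drugs tie for a condition, only one of them is kept, so ties deserve a separate look if they matter for the analysis.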

In [None]:
# Ratings by Drug Class
plt.figure(figsize=(12, 6))
sns.boxplot(x='drug_classes', y='rating', data=df)
plt.xticks(rotation=90)
plt.title('Drug Ratings by Class')
plt.show()
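
If `drug_classes` has many distinct values, the box plot becomes hard to read even with rotated labels. One possible approach, sketched here on hypothetical toy data, is to keep only the most frequent classes and order them by median rating before plotting. The cutoff `top_n` and the names `subset` and `class_order` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for the real dataset
toy = pd.DataFrame({
    "drug_classes": ["a", "a", "a", "b", "b", "c"],
    "rating": [9.0, 7.0, 8.0, 4.0, 6.0, 10.0],
})

top_n = 2  # assumed cutoff; adjust to the number of classes that fits the plot

# Keep only rows belonging to the top_n most frequent classes
top_classes = toy["drug_classes"].value_counts().head(top_n).index
subset = toy[toy["drug_classes"].isin(top_classes)]

# Order the remaining classes from highest to lowest median rating
class_order = (
    subset.groupby("drug_classes")["rating"]
    .median()
    .sort_values(ascending=False)
    .index.tolist()
)

print(class_order)
```

The filtered frame and the ordering can then be passed to the plot, for example `sns.boxplot(x='drug_classes', y='rating', data=subset, order=class_order)`.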

In [None]:
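# Most Common Side Effects
# Note: value_counts() counts identical 'side_effects' entries as whole strings
plt.figure(figsize=(10, 6))
df['side_effects'].value_counts().head(10).plot(kind='bar')
plt.title('Top 10 Reported Side Effects')
plt.xlabel('Side Effects')
plt.ylabel('Count')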
plt.show()
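
Counting whole `side_effects` entries only works if each row holds a single effect. If a row lists several effects in one string, they can be split apart before counting. The sketch below assumes effects are separated by commas or semicolons, which may not match the real data; the toy frame and the name `effect_counts` are hypothetical.

```python
import pandas as pd

# Hypothetical toy data; the comma/semicolon delimiters are an assumption
toy = pd.DataFrame({
    "side_effects": ["nausea, headache", "Headache; dizziness", None, "nausea"],
})

effects = (
    toy["side_effects"]
    .dropna()                                   # skip rows with no entry
    .str.lower()                                # treat 'Headache' and 'headache' alike
    .str.split(r"\s*[,;]\s*", regex=True)       # one list of effects per row
    .explode()                                  # one effect per row
    .str.strip()
)

# Drop empty fragments, then count each individual effect
effect_counts = effects[effects != ""].value_counts()

print(effect_counts)
```

Free-text descriptions without clear delimiters would need a different approach, such as keyword matching, which is outside the scope of this sketch.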

## 🔬 Optional: Advanced Analysis

In [None]:
plt.figure(figsize=(10, 6))
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm')
plt.title("Correlation Matrix")
plt.show()

## 📌 Summary of Findings
- Key patterns observed.
- Conditions with most drugs.
- Common side effects.
- Drug class-wise rating differences.