# Exploratory Data Analysis

This notebook walks through exploratory data analysis (EDA) of a tabular dataset loaded from a CSV file. The goal of EDA is to summarize the main characteristics of the data, often with visual methods, before any modeling is attempted.

## Steps to Follow:
1. Load the dataset
2. Perform basic data exploration
3. Visualize data distributions
4. Analyze relationships between features
5. Identify missing values and outliers


In [None]:
# Step 1: Load the dataset

import pandas as pd

data = pd.read_csv('path/to/your/dataset.csv')  # Update with your dataset path
data.head()

In [None]:
# Step 2: Basic data exploration

data.info()  # info() prints its report directly and returns None, so no print() is needed
print(data.describe())
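
The `info()` and `describe()` reports can be complemented by a single per-column table. The cell below is a minimal sketch, not a fixed recipe: `summarize_columns` is a hypothetical helper name, and the small `sample` frame is made-up data standing in for `data` so the cell runs on its own.

```python
import pandas as pd


def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, non-null count, missing count, unique count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "non_null": df.notna().sum(),
        "missing": df.isna().sum(),
        "unique": df.nunique(),  # nunique() ignores missing values by default
    })


# Hypothetical sample data; replace `sample` with `data` in the notebook.
sample = pd.DataFrame({
    "age": [25, 32, None, 41],
    "city": ["Oslo", "Lima", "Oslo", None],
})

summary = summarize_columns(sample)
print(summary)
```

Columns with a low `unique` count are often categorical, and a high `missing` count is an early hint for Step 5.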

In [None]:
# Step 3: Visualize data distributions

import matplotlib.pyplot as plt
import seaborn as sns

sns.histplot(data['feature_name'], bins=30)  # Replace 'feature_name' with your feature
plt.title('Distribution of Feature Name')
plt.show()
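
A histogram shows the shape of one feature at a time. As a numeric complement, the sketch below compares mean, median, and skewness for every numeric column at once. `distribution_summary` is a hypothetical helper name and `sample` is made-up data used only so the cell is self-contained.

```python
import pandas as pd


def distribution_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return mean, median, and sample skewness for each numeric column."""
    numeric = df.select_dtypes(include="number")
    return pd.DataFrame({
        "mean": numeric.mean(),
        "median": numeric.median(),
        "skew": numeric.skew(),
    })


# Hypothetical sample data; replace `sample` with `data` in the notebook.
sample = pd.DataFrame({
    "symmetric": [1, 2, 3, 4, 5],
    "right_skewed": [1, 1, 2, 2, 50],
})

dist = distribution_summary(sample)
print(dist)
```

A mean well above the median together with a positive skew suggests a long right tail, which is worth confirming on the histogram.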

In [None]:
# Step 4: Analyze relationships between features

sns.pairplot(data)
plt.show()
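
A pair plot can become slow and hard to read when the dataset has many columns. One possible alternative is a correlation matrix over the numeric columns. The sketch below assumes Pearson correlation is appropriate for the data. `numeric_correlations` is a hypothetical helper name and `sample` is made-up data.

```python
import pandas as pd


def numeric_correlations(df: pd.DataFrame, method: str = "pearson") -> pd.DataFrame:
    """Return the pairwise correlation matrix of the numeric columns only."""
    numeric = df.select_dtypes(include="number")
    return numeric.corr(method=method)


# Hypothetical sample data; replace `sample` with `data` in the notebook.
sample = pd.DataFrame({
    "height_cm": [150, 160, 170, 180, 190],
    "weight_kg": [50, 58, 66, 74, 82],
    "label": list("abcde"),  # non-numeric column, excluded from the matrix
})

corr = numeric_correlations(sample)
print(corr.round(2))
```

The resulting matrix can be passed to `sns.heatmap(corr, annot=True)` for a visual overview. Correlation only captures linear (or, with `method="spearman"`, monotonic) relationships, so it complements the pair plot rather than replacing it.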

In [None]:
# Step 5: Identify missing values and outliers

missing_values = data.isnull().sum()
print('Missing values per feature:\n', missing_values)

sns.boxplot(x=data['feature_name'])  # Replace 'feature_name' with your feature; x= makes the orientation explicit
plt.title('Boxplot for Feature Name')
plt.show()
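
The box plot marks outliers visually. To list them, one common convention is the 1.5 × IQR rule, which matches the default whisker length of the box plot. The sketch below assumes that rule suits the data. `iqr_outliers` is a hypothetical helper name and `values` is made-up data.

```python
import pandas as pd


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying more than k * IQR outside the quartiles."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Comparisons with NaN are False, so missing values are never flagged.
    return series[(series < lower) | (series > upper)]


# Hypothetical sample data; replace `values` with data['feature_name'].
values = pd.Series([10, 12, 11, 13, 12, 11, 95])

outliers = iqr_outliers(values)
print(outliers)
```

Flagged values are candidates for inspection, not automatic removal: whether an extreme value is an error or a genuine observation depends on the domain.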