# 📝 Basic Data Exploration Report

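This notebook walks through a simple exploratory data analysis (EDA) of a five-row sample dataset with demographic attributes (age, gender, BMI, children, smoker, region) and a `charges` column. With so few rows, every pattern reported here is illustrative rather than statistically meaningful.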

## Table of Contents
1. Loading the Data
2. Data Overview
3. Missing Values
4. Descriptive Statistics
5. Univariate Analysis
6. Bivariate Analysis
7. Observations & Insights

## 1. Loading the Data

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Sample data
data = {
    'age': [19, 18, 28, 45, 34],
    'gender': ['female', 'male', 'male', 'female', 'male'],
    'bmi': [27.9, 33.77, 33.0, 28.5, 31.0],
    'children': [0, 1, 3, 2, 1],
    'smoker': ['yes', 'no', 'no', 'yes', 'no'],
    'region': ['west', 'south-east', 'south-west', 'north', 'east'],
    'charges': [16884.92, 1725.55, 4449.46, 21984.5, 12365.0]
}

df = pd.DataFrame(data)

## 2. Data Overview

In [None]:
print('Shape:', df.shape)
print('\nData types:\n', df.dtypes)
df.head()

## 3. Missing Values

In [None]:
df.isnull().sum()
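
On larger datasets, raw null counts are easier to interpret next to the share of rows affected. The cell below is a minimal sketch of that idea, not part of the original analysis: `missing_summary` is a hypothetical helper name, and the frame is rebuilt from a subset of the sample columns only so the cell runs on its own (in the notebook, the existing `df` can be passed instead).

```python
import pandas as pd

# Subset of the sample data, rebuilt here so the cell is self-contained.
df = pd.DataFrame({
    'age': [19, 18, 28, 45, 34],
    'bmi': [27.9, 33.77, 33.0, 28.5, 31.0],
    'smoker': ['yes', 'no', 'no', 'yes', 'no'],
    'charges': [16884.92, 1725.55, 4449.46, 21984.5, 12365.0],
})


def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the null count and null percentage for each column."""
    counts = frame.isnull().sum()
    # The mean of a boolean mask is the fraction of True values per column.
    percent = frame.isnull().mean() * 100
    return pd.DataFrame({'missing': counts, 'percent': percent})


summary = missing_summary(df)
print(summary)
```

Columns with a non-zero `percent` would be the candidates for imputation or removal before any further analysis.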

## 4. Descriptive Statistics

In [None]:
df.describe()
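
By default, `df.describe()` summarizes only the numeric columns, so `gender`, `smoker`, and `region` are left out. One possible way to cover them is to count category frequencies per column, as sketched below. The frame is rebuilt from the sample values so the cell runs independently, and the variable names `categorical` and `category_counts` are illustrative.

```python
import pandas as pd

# Non-numeric sample columns plus one numeric column, rebuilt so the cell
# is self-contained.
df = pd.DataFrame({
    'gender': ['female', 'male', 'male', 'female', 'male'],
    'smoker': ['yes', 'no', 'no', 'yes', 'no'],
    'region': ['west', 'south-east', 'south-west', 'north', 'east'],
    'charges': [16884.92, 1725.55, 4449.46, 21984.5, 12365.0],
})

# Keep only the non-numeric columns.
categorical = df.select_dtypes(exclude='number')

# Count how often each category appears in each of those columns.
category_counts = {
    col: categorical[col].value_counts() for col in categorical.columns
}

for col, counts in category_counts.items():
    print(f'\n{col}:')
    print(counts)
```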

## 5. Univariate Analysis

In [None]:
sns.histplot(df['charges'], kde=True)
plt.title('Distribution of Charges')
plt.show()

In [None]:
sns.countplot(x='smoker', data=df)
plt.title('Smoker Count')
plt.show()

## 6. Bivariate Analysis

In [None]:
sns.scatterplot(x='age', y='charges', hue='smoker', data=df)
plt.title('Age vs Charges by Smoker')
plt.show()

In [None]:
sns.boxplot(x='smoker', y='charges', data=df)
plt.title('Charges by Smoking Status')
plt.show()
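
The plots above cover `age` and `smoker` only. As a rough complement, a Pearson correlation of each numeric column with `charges` gives a quick numeric view of the remaining pairs. This is a sketch under the assumption that a linear correlation is an acceptable first look; with five rows the coefficients are highly unstable and should be read as descriptive only. The frame is rebuilt from the sample values so the cell runs on its own.

```python
import pandas as pd

# Numeric sample columns, rebuilt so the cell is self-contained.
df = pd.DataFrame({
    'age': [19, 18, 28, 45, 34],
    'bmi': [27.9, 33.77, 33.0, 28.5, 31.0],
    'children': [0, 1, 3, 2, 1],
    'charges': [16884.92, 1725.55, 4449.46, 21984.5, 12365.0],
})

# Pairwise Pearson correlation (the pandas default).
corr = df.corr()

# Correlation of every other numeric column with charges, lowest first.
charges_corr = corr['charges'].drop('charges').sort_values()
print(charges_corr.round(2))
```

Because `smoker` is categorical it does not appear in this table, so any coefficient here may still be confounded with smoking status.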

## 7. Observations & Insights
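- Smokers tend to have higher charges: both smokers in the sample are charged more than any of the three non-smokers.
- Charges loosely increase with age in the scatter plot. The number of children was not examined in this notebook, so its effect remains an open question.
- No missing values detected in any column.
- BMI was not plotted in this notebook. In this small sample the rows with higher BMI have lower charges, but both smokers also have the two lowest BMI values, so the pattern is confounded with smoking status and should not be generalized.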
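
The smoker-related observation can also be checked numerically instead of being read off the box plot. The cell below is a minimal sketch that summarizes `charges` and `bmi` per smoking status; the choice of `mean`, `min`, and `max` as summary statistics is an assumption made for illustration, and the frame is rebuilt from the sample values so the cell runs independently.

```python
import pandas as pd

# Relevant sample columns, rebuilt so the cell is self-contained.
df = pd.DataFrame({
    'smoker': ['yes', 'no', 'no', 'yes', 'no'],
    'bmi': [27.9, 33.77, 33.0, 28.5, 31.0],
    'charges': [16884.92, 1725.55, 4449.46, 21984.5, 12365.0],
})

# Per-group summary; the result has a (column, statistic) MultiIndex.
by_smoker = df.groupby('smoker')[['bmi', 'charges']].agg(['mean', 'min', 'max'])
print(by_smoker.round(2))
```

Comparing the `min` of one group with the `max` of the other shows whether the two groups overlap at all, which is more informative than the means alone on a sample this small.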