# 📊 Notebook 01: Basic Data Inspection

## Learning Objectives
By the end of this notebook, you will be able to:
1. Load data from a CSV file into a pandas DataFrame
2. Understand DataFrame structure and properties
3. Perform quick data profiling
4. Identify data quality issues
5. Get statistical summaries

## Real-World Scenario
You've just joined a company as a Data Analyst. Your first task is to analyze the employee dataset to understand the company's workforce composition.

---

## Setup

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

pd.set_option('display.max_columns', None)
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

print("✅ Setup complete!")

## 1. Loading Data

In [None]:
# Load employee data
df = pd.read_csv('../datasets/employees.csv')
print(f"✅ Loaded {len(df)} rows")
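To experiment with loading options without touching the real file, `read_csv` can read from an in-memory string. The sketch below uses a hypothetical three-row sample; the column names (`employee_id`, `name`, `department`, `salary`, `hire_date`) are assumptions and may not match `employees.csv`.

```python
import io
import pandas as pd

# Hypothetical stand-in for employees.csv; the real columns may differ.
csv_text = """employee_id,name,department,salary,hire_date
1,Ana,Engineering,85000,2021-03-15
2,Ben,Sales,62000,2019-07-01
3,Chloe,Engineering,,2022-11-20
"""

# read_csv accepts any file-like object, so StringIO stands in for a path.
# parse_dates asks pandas to convert the listed columns to datetimes.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["hire_date"])

print(f"Loaded {len(sample)} rows and {sample.shape[1]} columns")
print(sample.dtypes)
```

Note that the empty salary in the last row is read as `NaN`, which is why the `salary` column comes back as a float rather than an integer.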

In [None]:
# First look
df.head()

## 2. DataFrame Structure

In [None]:
# Basic info
df.info()

In [None]:
# Shape and size
print(f"Shape: {df.shape}")
print(f"Rows: {df.shape[0]}")
print(f"Columns: {df.shape[1]}")
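`info()` prints its report but does not return it. When the same facts are needed as values (for filtering or further checks), they can be read from separate attributes. A minimal sketch on hypothetical toy data:

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative only.
sample = pd.DataFrame({
    "name": ["Ana", "Ben", "Chloe"],
    "department": ["Engineering", "Sales", "Engineering"],
    "salary": [85000.0, 62000.0, None],
})

# Data type of each column
print(sample.dtypes)

# Non-null count per column (the same counts info() reports)
print(sample.count())

# Memory per column in bytes; deep=True also measures string contents
print(sample.memory_usage(deep=True))
```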

## 3. Statistical Summary with `describe()`

In [None]:
# Summary of numeric columns only (the default)
df.describe()

In [None]:
# All columns, including non-numeric ones
df.describe(include='all')
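`describe()` reports different statistics depending on the column type. The sketch below, on hypothetical toy data, requests custom percentiles for a numeric column and then summarizes a text column on its own.

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative only.
sample = pd.DataFrame({
    "department": ["Engineering", "Sales", "Engineering", "HR"],
    "salary": [85000, 62000, 91000, 58000],
})

# Numeric columns: choose which percentiles to report
numeric_summary = sample.describe(percentiles=[0.1, 0.5, 0.9])
print(numeric_summary)

# A text column reports count, unique, top (most frequent value), and freq
text_summary = sample["department"].describe()
print(text_summary)
```

For text columns, `top` and `freq` are often the quickest way to spot a dominant category.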

## 4. Data Quality Checks

In [None]:
# Missing values per column, then the overall total
print("Missing Values:")
print(df.isna().sum())
print(f"\nTotal: {df.isna().sum().sum()}")
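Raw counts are hard to judge without knowing the table size, so it can help to express missing values as a percentage of rows. A minimal sketch on hypothetical toy data:

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative only.
sample = pd.DataFrame({
    "name": ["Ana", "Ben", "Chloe", "Dev"],
    "salary": [85000.0, None, 91000.0, None],
    "department": ["Engineering", "Sales", None, "HR"],
})

# isna() gives booleans, so the column mean is the fraction missing
missing_pct = (sample.isna().mean() * 100).round(1)

# Keep only columns that have gaps, worst first
missing_pct = missing_pct[missing_pct > 0].sort_values(ascending=False)
print(missing_pct)
```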

In [None]:
# Fully duplicated rows (the first occurrence of each is not counted)
print(f"Duplicate rows: {df.duplicated().sum()}")
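A count says how many duplicates exist but not which rows they are. The sketch below, on hypothetical toy data, displays every copy of a duplicated row and then checks for repeats in a single key column; `employee_id` is an assumed column name.

```python
import pandas as pd

# Hypothetical toy data; employee_id is an assumed key column.
sample = pd.DataFrame({
    "employee_id": [1, 2, 2, 3],
    "name": ["Ana", "Ben", "Ben", "Chloe"],
})

# keep=False flags every copy, not just the later repeats
dupes = sample[sample.duplicated(keep=False)]
print(dupes)

# Duplicates judged on a key column only
id_dupes = sample.duplicated(subset=["employee_id"]).sum()
print(f"Repeated employee_id values: {id_dupes}")
```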

In [None]:
# Unique values per column (missing values are not counted)
for col in df.columns:
    print(f"{col}: {df[col].nunique()} unique values")
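For columns with only a few distinct values, looking at the values themselves can reveal issues that a count hides, such as inconsistent capitalization. A minimal sketch on hypothetical toy data:

```python
import pandas as pd

# Hypothetical toy data with a deliberate casing inconsistency.
sample = pd.DataFrame({
    "department": ["Engineering", "Sales", "Engineering", None, "engineering"],
})

# dropna=False includes missing values as their own category
counts = sample["department"].value_counts(dropna=False)
print(counts)
```

Here "Engineering" and "engineering" are counted separately, which is the kind of problem worth noting before cleaning.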

## 5. Practice Exercises

### Exercise 1: Load sales_data.csv and inspect it

In [None]:
# Your code here


### Exercise 2: Find columns with missing values in the customers dataset

In [None]:
# Your code here


### Exercise 3: Calculate salary statistics

In [None]:
# Calculate mean, median, min, max of salary


## Key Takeaways

- Always start with `head()`, `info()`, and `describe()`
- Check for missing values with `isna().sum()`
- Check for duplicates with `duplicated().sum()`
- Check each column's data type with `info()` or `dtypes` before analysis
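The checks above can be combined into one small helper. This is a sketch, not part of pandas: `quick_profile` is a hypothetical function name, and the sample data is invented for illustration.

```python
import pandas as pd

def quick_profile(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with its dtype, missing count, and unique count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "unique": df.nunique(),
    })

# Hypothetical toy data; column names are illustrative only.
sample = pd.DataFrame({
    "name": ["Ana", "Ben", "Chloe"],
    "department": ["Engineering", "Sales", "Engineering"],
    "salary": [85000.0, None, 91000.0],
})

profile = quick_profile(sample)
print(profile)
print(f"Duplicate rows: {sample.duplicated().sum()}")
```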

**Next**: Notebook 02 - Data Cleaning