# Simple Data Loading Tutorial

## What You'll Learn
In this tutorial, you'll learn how to read three common file formats:
- **CSV files** (.csv)
- **Excel files** (.xlsx)
- **JSON files** (.json)

## Why These Formats?
- **CSV**: Simple, widely used for data tables
- **Excel**: Popular in business, supports multiple sheets
- **JSON**: Common for web data and APIs

## Step 1: Import Required Libraries

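We use pandas to read all three formats, plus Python's built-in `json` module as a second way to read JSON. Reading and writing `.xlsx` files with pandas also requires the `openpyxl` package (`pip install openpyxl`).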

In [None]:
import pandas as pd
import json

print("Libraries loaded successfully!")
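Before going further, it can help to confirm that the optional Excel dependency is installed. The snippet below is a minimal sketch of such a check; the `optional_packages` list is an illustrative name introduced here and is not part of the original notebook.

```python
import importlib.util

# Illustrative list (an assumption for this sketch): packages that pandas
# relies on for the formats in this tutorial. openpyxl handles .xlsx files.
optional_packages = ["openpyxl"]

# find_spec returns None when a top-level package cannot be imported
missing = [name for name in optional_packages
           if importlib.util.find_spec(name) is None]

if missing:
    print("Missing packages:", ", ".join(missing))
    print("Install them with: pip install " + " ".join(missing))
else:
    print("All optional packages are available.")
```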

## Step 2: Create Sample Data Files

Let's create some sample files to practice with.

In [None]:
# Create sample data
sample_data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'Age': [25, 30, 35, 28],
    'City': ['New York', 'London', 'Tokyo', 'Paris'],
    'Salary': [50000, 60000, 70000, 55000]
}

# Create a DataFrame
df = pd.DataFrame(sample_data)
print("Sample data created:")
print(df)

## Step 3: Reading CSV Files

CSV (Comma-Separated Values) files are one of the most common formats for tabular data. Each line is a row, and commas separate the columns.

In [None]:
# First, let's create a CSV file
df.to_csv('sample_data.csv', index=False)
print("CSV file created: sample_data.csv")

# Now let's read it back
csv_data = pd.read_csv('sample_data.csv')
print("\nReading CSV file:")
print(csv_data)
print(f"\nFile shape: {csv_data.shape} (rows, columns)")
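Real-world CSV files often use a different delimiter or contain columns you don't need. The sketch below shows a few commonly used `pd.read_csv()` parameters; the semicolon-separated text is made-up data that stands in for a file on disk.

```python
import io
import pandas as pd

# Made-up semicolon-separated text standing in for a file on disk
raw_text = "Name;Age;City\nAlice;25;New York\nBob;30;London\n"

subset = pd.read_csv(
    io.StringIO(raw_text),
    sep=";",                   # delimiter other than the default comma
    usecols=["Name", "Age"],   # load only these columns
    dtype={"Age": "int64"},    # set the column type explicitly
)
print(subset)
```

The same parameters work when you pass a file path instead of a `StringIO` object.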

## Step 4: Reading Excel Files

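Excel files (.xlsx) can contain multiple sheets and formatting. pandas reads the cell values and ignores the formatting; the `sheet_name` argument selects which sheet to load.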

In [None]:
# Create an Excel file
df.to_excel('sample_data.xlsx', index=False, sheet_name='Employees')
print("Excel file created: sample_data.xlsx")

# Read the Excel file
excel_data = pd.read_excel('sample_data.xlsx', sheet_name='Employees')
print("\nReading Excel file:")
print(excel_data)
print(f"\nFile shape: {excel_data.shape} (rows, columns)")
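Since a workbook can hold several sheets, it is sometimes handy to load them all at once. As a sketch, passing `sheet_name=None` to `pd.read_excel()` returns a dictionary that maps each sheet name to a DataFrame. The sheet names and data here are made up, and the workbook is built in memory so the example does not touch your files.

```python
import importlib.util
import io
import pandas as pd

sheets = {}
if importlib.util.find_spec("openpyxl") is None:
    print("openpyxl is not installed; skipping the Excel example.")
else:
    # Build a small two-sheet workbook in memory (made-up data)
    buffer = io.BytesIO()
    with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
        pd.DataFrame({"Name": ["Alice", "Bob"]}).to_excel(
            writer, sheet_name="Employees", index=False)
        pd.DataFrame({"City": ["Paris"]}).to_excel(
            writer, sheet_name="Offices", index=False)
    buffer.seek(0)

    # sheet_name=None loads every sheet into a dict of DataFrames
    sheets = pd.read_excel(buffer, sheet_name=None, engine="openpyxl")
    for name, frame in sheets.items():
        print(name, frame.shape)
```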

## Step 5: Reading JSON Files

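JSON (JavaScript Object Notation) files store data as nested key-value pairs and lists. Here we save each row as one record (`orient='records'`), so the file contains a list of objects.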

In [None]:
# Create a JSON file
df.to_json('sample_data.json', orient='records', indent=2)
print("JSON file created: sample_data.json")

# Method 1: Read JSON with pandas
json_data = pd.read_json('sample_data.json')
print("\nReading JSON file with pandas:")
print(json_data)
print(f"\nFile shape: {json_data.shape} (rows, columns)")

In [None]:
# Method 2: Read JSON with built-in json library
with open('sample_data.json', 'r') as file:
    json_raw = json.load(file)

print("Reading JSON file with json library:")
print("Type:", type(json_raw))
print("First record:", json_raw[0])

# Convert to DataFrame
json_df = pd.DataFrame(json_raw)
print("\nConverted to DataFrame:")
print(json_df)
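JSON from web APIs is often nested, which `pd.DataFrame()` does not flatten on its own. One possible approach is `pd.json_normalize()`, which expands nested objects into dotted column names. The records below are made up for illustration.

```python
import json
import pandas as pd

# Made-up nested records, similar in shape to what an API might return
raw_text = """
[
  {"Name": "Alice", "Address": {"City": "New York", "Zip": "10001"}},
  {"Name": "Bob", "Address": {"City": "London", "Zip": "SW1A"}}
]
"""
records = json.loads(raw_text)

# Nested keys become columns such as "Address.City"
flat = pd.json_normalize(records)
print(flat.columns.tolist())
print(flat)
```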

## Step 6: Quick Comparison

Let's print the data loaded from each format one after another, then check the size of each file on disk.

In [None]:
import os

print("=== COMPARISON ===")
print("\n1. CSV Data:")
print(csv_data.head())

print("\n2. Excel Data:")
print(excel_data.head())

print("\n3. JSON Data:")
print(json_data.head())

print("\n=== FILE SIZES ===")
print(f"CSV file size: {os.path.getsize('sample_data.csv')} bytes")
print(f"Excel file size: {os.path.getsize('sample_data.xlsx')} bytes")
print(f"JSON file size: {os.path.getsize('sample_data.json')} bytes")
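Printing the tables lets you compare them by eye. To check programmatically that a format preserved the data, you could round-trip a DataFrame and compare the result with the original. This is a sketch under a few assumptions: `same_content` is a helper name introduced here, the data is made up, and text buffers are used in place of files.

```python
import io
import pandas as pd
from pandas.testing import assert_frame_equal

original = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

# Round-trip through CSV and JSON text instead of files on disk
from_csv = pd.read_csv(io.StringIO(original.to_csv(index=False)))
from_json = pd.read_json(io.StringIO(original.to_json(orient="records")))

def same_content(left, right):
    """Return True if both frames hold the same values (dtypes ignored)."""
    try:
        assert_frame_equal(left, right, check_dtype=False)
        return True
    except AssertionError:
        return False

print("CSV matches original:", same_content(original, from_csv))
print("JSON matches original:", same_content(original, from_json))
```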

## Step 7: Basic Data Information

Once your data is loaded, these commands help you explore it.

In [None]:
# Using the CSV data as an example
data = csv_data

print("=== BASIC DATA INFO ===")
print(f"Shape: {data.shape}")
print(f"Columns: {list(data.columns)}")
print(f"Data types:\n{data.dtypes}")

print("\n=== FIRST FEW ROWS ===")
print(data.head())

print("\n=== BASIC STATISTICS ===")
print(data.describe())
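The sample data has no gaps, but files you load yourself often do. As a small sketch with made-up data, the commands below count missing values per column and summarize a text column:

```python
import pandas as pd

# Made-up data with a few missing entries
data = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, None, 35],
    "City": ["New York", "London", None],
})

# Number of missing values in each column
missing_per_column = data.isna().sum()
print(missing_per_column)

# How often each value appears in a column (missing values are skipped)
print(data["City"].value_counts())

# Column names, non-null counts, and dtypes in one overview
data.info()
```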

## Step 8: Practice Exercise

Try this exercise to practice what you've learned!

In [None]:
# Exercise 1: Create your own data
my_data = {
    'Product': ['Laptop', 'Phone', 'Tablet', 'Watch'],
    'Price': [1000, 800, 500, 300],
    'Brand': ['Apple', 'Samsung', 'Apple', 'Apple']
}

# TODO: Convert this to a DataFrame and save as CSV
# Your code here:


# TODO: Read the CSV file back and display it
# Your code here:


## Summary

You've learned how to:

✅ **Read CSV files** with `pd.read_csv()`  
✅ **Read Excel files** with `pd.read_excel()`  
✅ **Read JSON files** with `pd.read_json()` or `json.load()`  
✅ **Compare different file formats**  
✅ **Get basic information about your data**  

### Key Commands to Remember:
```python
# Reading files
df = pd.read_csv('file.csv')
df = pd.read_excel('file.xlsx')
df = pd.read_json('file.json')

# Exploring data
df.head()      # First 5 rows
df.shape       # (rows, columns)
df.columns     # Column names
df.dtypes      # Data types
df.describe()  # Basic statistics
```

### Next Steps:
- Try loading your own data files
- Learn about data cleaning and preprocessing
- Explore data visualization with matplotlib or seaborn