# 📊 04 - Data Input and Output

Much of day-to-day data science work comes down to **loading data, exploring it, and saving results**.  
In this notebook you will learn:
- Reading CSV files with Pandas
- Inspecting DataFrames (`head`, `info`, `describe`)
- Writing data back to CSV
- Writing and reading Excel files (optional; requires `openpyxl`)


## 1. Reading CSV Files

In [6]:
import pandas as pd

# Create a small CSV for demo
data = """Name,Age,Score
Alice,23,90
Bob,25,85
Charlie,22,95
"""
with open("students.csv", "w") as f:
    f.write(data)

# Read CSV into DataFrame
df = pd.read_csv("students.csv")
df

      Name  Age  Score
0    Alice   23     90
1      Bob   25     85
2  Charlie   22     95
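
`pd.read_csv` also accepts options that control which columns are loaded and how they are typed. The following is a minimal sketch using the same demo data; the variable names `csv_text` and `scores` are illustrative, and the data is read from an in-memory string (via `io.StringIO`) rather than from `students.csv` so that the example does not depend on a file being present.

```python
import io

import pandas as pd

# Same demo data as above, held in memory so no file is needed.
csv_text = """Name,Age,Score
Alice,23,90
Bob,25,85
Charlie,22,95
"""

# usecols keeps only the listed columns, dtype forces Score to float,
# and index_col uses Name as the row labels instead of 0, 1, 2.
scores = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["Name", "Score"],
    dtype={"Score": "float64"},
    index_col="Name",
)
print(scores)
```

The same keyword arguments work when the first argument is a file path such as `"students.csv"`.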


✅ **Your Turn**: Create your own small CSV (3–5 rows) and read it into Pandas.

In [7]:
data = """Name,Age,Major
Ethan,21,Computer Science
Isaiah,21,Computer Science
Phoenix,22,Business
"""

with open("students2.csv", "w") as f:
    f.write(data)

df = pd.read_csv("students2.csv")
df

      Name  Age             Major
0    Ethan   21  Computer Science
1   Isaiah   21  Computer Science
2  Phoenix   22          Business


## 2. Exploring DataFrames

In [8]:
# Look at the first rows
df.head()

      Name  Age             Major
0    Ethan   21  Computer Science
1   Isaiah   21  Computer Science
2  Phoenix   22          Business


In [9]:
# Info about columns and data types
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Name    3 non-null      object
 1   Age     3 non-null      int64 
 2   Major   3 non-null      object
dtypes: int64(1), object(2)
memory usage: 204.0+ bytes
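
The `+` in `memory usage: 204.0+ bytes` indicates a lower bound: by default pandas does not measure the actual size of the strings in text columns. As a rough sketch, the two measurements can be compared with `DataFrame.memory_usage`; the DataFrame is rebuilt inline here so the example stands alone, and the names `shallow` and `deep` are illustrative. Exact byte counts vary by pandas version and platform, so none are shown.

```python
import pandas as pd

# Rebuild the students2 data so the example runs on its own.
df = pd.DataFrame({
    "Name": ["Ethan", "Isaiah", "Phoenix"],
    "Age": [21, 21, 22],
    "Major": ["Computer Science", "Computer Science", "Business"],
})

# One dtype per column: the same information as the Dtype column of info().
print(df.dtypes)

# Shallow usage skips the contents of string objects; deep usage counts them.
shallow = df.memory_usage().sum()
deep = df.memory_usage(deep=True).sum()
print(shallow, deep)
```

Calling `df.info(memory_usage="deep")` reports the deep figure directly.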


In [10]:
# Summary statistics
df.describe()

             Age
count   3.000000
mean   21.333333
std     0.577350
min    21.000000
25%    21.000000
50%    21.000000
75%    21.500000
max    22.000000
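
Only `Age` appears in this summary because `describe()` covers numeric columns by default. As a small sketch, passing `include="all"` adds the text columns as well, reporting `unique`, `top` (most frequent value), and `freq` for them. The DataFrame is rebuilt inline so the example is self-contained, and `summary` is an illustrative name.

```python
import pandas as pd

# Rebuild the students2 data so the example runs on its own.
df = pd.DataFrame({
    "Name": ["Ethan", "Isaiah", "Phoenix"],
    "Age": [21, 21, 22],
    "Major": ["Computer Science", "Computer Science", "Business"],
})

# include="all" summarises every column; statistics that do not apply
# to a column (e.g. mean of Name) are filled with NaN.
summary = df.describe(include="all")
print(summary)

# The most common major and how often it occurs.
print(summary.loc["top", "Major"], summary.loc["freq", "Major"])
```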


✅ **Your Turn**: Use `.head()` to preview your dataset and `.describe()` to see basic statistics.

In [11]:
df.head()


      Name  Age             Major
0    Ethan   21  Computer Science
1   Isaiah   21  Computer Science
2  Phoenix   22          Business


In [14]:
df.describe()

             Age
count   3.000000
mean   21.333333
std     0.577350
min    21.000000
25%    21.000000
50%    21.000000
75%    21.500000
max    22.000000


## 3. Writing CSV Files

In [15]:
# Save the DataFrame to a new CSV
df.to_csv("students_copy.csv", index=False)
%ls *.csv

students2.csv  students_copy.csv  students.csv
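
The `index=False` argument matters when the file is read back. The sketch below compares both settings without touching the disk: when `to_csv` is called with no path it returns the CSV text as a string, which is then re-read through `io.StringIO`. The variable names are illustrative.

```python
import io

import pandas as pd

df = pd.DataFrame({"Name": ["Ethan", "Isaiah"], "Age": [21, 21]})

# With no path, to_csv returns the CSV text instead of writing a file.
with_index = df.to_csv()                # row index saved as an unnamed first column
without_index = df.to_csv(index=False)  # only the data columns are saved

reloaded_with = pd.read_csv(io.StringIO(with_index))
reloaded_without = pd.read_csv(io.StringIO(without_index))

print(list(reloaded_with.columns))     # ['Unnamed: 0', 'Name', 'Age']
print(list(reloaded_without.columns))  # ['Name', 'Age']
```

If a file was already saved with its index, `pd.read_csv(path, index_col=0)` reads that first column back as the index instead of as data.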


✅ **Your Turn**: Save your DataFrame to a CSV file called `my_data.csv`.

In [16]:
df.to_csv("my_data.csv", index=False)
%ls *.csv

my_data.csv  students2.csv  students_copy.csv  students.csv


## 4. Reading Excel Files

In [17]:
# Optional: Requires 'openpyxl' installed
# Save to Excel
df.to_excel("students.xlsx", index=False)

# Read Excel file
pd.read_excel("students.xlsx")

      Name  Age             Major
0    Ethan   21  Computer Science
1   Isaiah   21  Computer Science
2  Phoenix   22          Business
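
Because Excel support depends on an optional package, a notebook can check for it before attempting the round trip. This is one possible sketch, not the only approach: it looks for `openpyxl` with `importlib.util.find_spec` and writes to a temporary directory so no files are left behind. The names `round_trip` and `path` are illustrative.

```python
import importlib.util
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    "Name": ["Ethan", "Isaiah", "Phoenix"],
    "Age": [21, 21, 22],
    "Major": ["Computer Science", "Computer Science", "Business"],
})

if importlib.util.find_spec("openpyxl") is None:
    # The engine pandas uses for .xlsx files is missing; skip gracefully.
    print("openpyxl is not installed; skipping the Excel round trip")
    round_trip = None
else:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "students.xlsx")
        df.to_excel(path, index=False)     # write without the row index
        round_trip = pd.read_excel(path)   # read the sheet back
    print(round_trip)
```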


✅ **Your Turn**: Try saving your dataset to Excel and reading it back (if supported in your environment).

In [18]:
df.to_excel("students2.xlsx", index=False)
pd.read_excel("students2.xlsx")

      Name  Age             Major
0    Ethan   21  Computer Science
1   Isaiah   21  Computer Science
2  Phoenix   22          Business


---
### Summary
- `pd.read_csv` loads data into a DataFrame.
- Use `.head()`, `.info()`, `.describe()` to explore quickly.
- `df.to_csv` and `df.to_excel` save results.
- Pass `index=False` when saving to keep the row index out of the file, so it reads back with the same columns.
