<a href="https://colab.research.google.com/github/TarisMajor/TarisMajor-DataScience-2025/blob/main/Completed/05-Foundations/04-data_input_output.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 📊 04 - Data Input and Output

Much of everyday data science work comes down to **loading data, exploring it, and saving results**.  
This notebook covers:
- Reading CSV files with Pandas
- Inspecting DataFrames (`head`, `info`, `describe`)
- Writing data back to CSV
- Writing and reading Excel files (if `openpyxl` is installed)


## 1. Reading CSV Files

In [1]:
import pandas as pd

# Create a small CSV for demo
data = """Name,Age,Score
Alice,23,90
Bob,25,85
Charlie,22,95
"""
with open("students.csv", "w") as f:
    f.write(data)

# Read CSV into DataFrame
df = pd.read_csv("students.csv")
df

|   | Name | Age | Score |
|---|------|-----|-------|
| 0 | Alice | 23 | 90 |
| 1 | Bob | 25 | 85 |
| 2 | Charlie | 22 | 95 |
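
`pd.read_csv` also accepts options that control what gets loaded. The sketch below is one possible illustration, not part of the original exercise: it reads the same sample data from an in-memory string (via `io.StringIO`, an assumption made so that no file is needed) and uses the `usecols` and `dtype` parameters to keep two columns and force `Score` to a float type. The variable names `csv_text` and `scores` are made up for this example.

```python
import io

import pandas as pd

# Same sample data as in the cell above, held in memory instead of on disk.
csv_text = """Name,Age,Score
Alice,23,90
Bob,25,85
Charlie,22,95
"""

# usecols keeps only the listed columns; dtype overrides the inferred type.
scores = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["Name", "Score"],
    dtype={"Score": "float64"},
)

print(scores.dtypes)
print(scores)
```

Loading only the columns you need can matter for large files, where skipping unused columns saves memory.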


✅ **Your Turn**: Create your own small CSV (3–5 rows) and read it into Pandas.

In [2]:
pets = """Name,Age,Pet
Candy,23,Dog
Mr. Whiskers,25,Cat
Crystal,22,Dog
"""
with open("pets.csv", "w") as f:
    f.write(pets)

frame = pd.read_csv("pets.csv")
frame

|   | Name | Age | Pet |
|---|------|-----|-----|
| 0 | Candy | 23 | Dog |
| 1 | Mr. Whiskers | 25 | Cat |
| 2 | Crystal | 22 | Dog |


## 2. Exploring DataFrames

In [3]:
# Look at the first rows
df.head()

|   | Name | Age | Score |
|---|------|-----|-------|
| 0 | Alice | 23 | 90 |
| 1 | Bob | 25 | 85 |
| 2 | Charlie | 22 | 95 |


In [4]:
# Info about columns and data types
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Name    3 non-null      object
 1   Age     3 non-null      int64 
 2   Score   3 non-null      int64 
dtypes: int64(2), object(1)
memory usage: 204.0+ bytes
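
The `Non-Null Count` column in the `.info()` output is most useful when some values are missing. As a rough sketch, the example below uses a hypothetical variant of the sample data with one blank `Age` (this gap is invented for illustration and does not appear in the original data) and checks it with `.shape`, `.dtypes`, and `.isna().sum()`.

```python
import io

import pandas as pd

# Hypothetical variant of the sample data with one missing Age value.
csv_text = """Name,Age,Score
Alice,23,90
Bob,,85
Charlie,22,95
"""
people = pd.read_csv(io.StringIO(csv_text))

print(people.shape)         # (rows, columns)
print(people.dtypes)        # Age is read as float64 because of the gap
print(people.isna().sum())  # number of missing values per column
```

Because of the blank cell, Pandas stores `Age` as a float column with `NaN` in the gap, which is a common surprise when a column was expected to hold integers.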


In [5]:
# Summary statistics
df.describe()

|   | Age | Score |
|---|-----|-------|
| count | 3.0 | 3.0 |
| mean | 23.333333 | 90.0 |
| std | 1.527525 | 5.0 |
| min | 22.0 | 85.0 |
| 25% | 22.5 | 87.5 |
| 50% | 23.0 | 90.0 |
| 75% | 24.0 | 92.5 |
| max | 25.0 | 95.0 |


✅ **Your Turn**: Use `.head()` to preview your dataset and `.describe()` to see basic statistics.

In [6]:
frame.head()

|   | Name | Age | Pet |
|---|------|-----|-----|
| 0 | Candy | 23 | Dog |
| 1 | Mr. Whiskers | 25 | Cat |
| 2 | Crystal | 22 | Dog |


In [7]:
frame.describe()

|   | Age |
|---|-----|
| count | 3.0 |
| mean | 23.333333 |
| std | 1.527525 |
| min | 22.0 |
| 25% | 22.5 |
| 50% | 23.0 |
| 75% | 24.0 |
| max | 25.0 |
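
By default, `.describe()` summarizes only numeric columns, which is why `Name` and `Pet` are missing from the output above. One way to include text columns, shown here as a minimal sketch, is the `include="all"` argument, with `.value_counts()` for a per-category tally. The DataFrame is rebuilt in memory under the made-up name `pets_df` so the example runs on its own.

```python
import pandas as pd

# Same values as the pets example above, built directly in memory.
pets_df = pd.DataFrame(
    {
        "Name": ["Candy", "Mr. Whiskers", "Crystal"],
        "Age": [23, 25, 22],
        "Pet": ["Dog", "Cat", "Dog"],
    }
)

# include="all" adds count/unique/top/freq rows for non-numeric columns.
summary = pets_df.describe(include="all")
print(summary)

# value_counts tallies how often each category appears.
pet_counts = pets_df["Pet"].value_counts()
print(pet_counts)
```

In the combined summary, statistics that do not apply to a column (for example `mean` for `Pet`) show up as `NaN`.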


## 3. Writing CSV Files

In [8]:
# Save the DataFrame to a new CSV
df.to_csv("students_copy.csv", index=False)
%ls *.csv

pets.csv  students_copy.csv  students.csv
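
The `index=False` argument in the cell above is worth a closer look. The following sketch (using a small made-up DataFrame called `df_demo`, and writing to strings rather than files) compares the output with and without it, and shows what happens when a CSV saved with the index is read back.

```python
import io

import pandas as pd

df_demo = pd.DataFrame({"Name": ["Alice", "Bob"], "Score": [90, 85]})

# Default: the row index is written as an extra, unnamed first column.
with_index = df_demo.to_csv()
# index=False writes only the data columns.
without_index = df_demo.to_csv(index=False)

print(with_index)
print(without_index)

# Reading the default output back turns the saved index into a data column.
reloaded = pd.read_csv(io.StringIO(with_index))
print(list(reloaded.columns))

# index_col=0 tells read_csv to treat that first column as the index again.
restored = pd.read_csv(io.StringIO(with_index), index_col=0)
print(list(restored.columns))
```

If a column named `Unnamed: 0` appears after loading a CSV, the file was most likely saved with its index; either re-save it with `index=False` or load it with `index_col=0`.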


✅ **Your Turn**: Save your DataFrame to a CSV file called `my_data.csv`.

In [10]:
frame.to_csv("my_data.csv", index=False)
%ls *.csv

my_data.csv  pets.csv  students_copy.csv  students.csv


## 4. Reading Excel Files

In [11]:
# Optional: writing and reading .xlsx files requires the 'openpyxl' package
# Save the DataFrame to Excel
df.to_excel("students.xlsx", index=False)

# Read Excel file
pd.read_excel("students.xlsx")

|   | Name | Age | Score |
|---|------|-----|-------|
| 0 | Alice | 23 | 90 |
| 1 | Bob | 25 | 85 |
| 2 | Charlie | 22 | 95 |
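
Since Excel support depends on `openpyxl`, it can help to check for the package before calling `to_excel`. The sketch below is one possible approach, not the only one: it uses the standard library's `importlib.util.find_spec` for the check, writes to a temporary folder, and names the sheet with the `sheet_name` parameter. The names `df_demo`, `demo.xlsx`, and `Scores` are placeholders chosen for this example.

```python
import importlib.util
import os
import tempfile

import pandas as pd

df_demo = pd.DataFrame({"Name": ["Alice", "Bob"], "Score": [90, 85]})

# find_spec returns None when the package is not installed.
has_openpyxl = importlib.util.find_spec("openpyxl") is not None

if has_openpyxl:
    # Write to a temporary folder so no stray files are left behind.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "demo.xlsx")
        df_demo.to_excel(path, index=False, sheet_name="Scores")
        round_trip = pd.read_excel(path, sheet_name="Scores")
    print(round_trip)
else:
    round_trip = None
    print("openpyxl is not installed; try `pip install openpyxl`.")
```

Unlike CSV, an Excel workbook can hold several sheets, so passing `sheet_name` explicitly makes it clear which one is being written and read.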


✅ **Your Turn**: Try saving your dataset to Excel and reading it back (if supported in your environment).

In [13]:
frame.to_excel("pets.xlsx", index=False)
pd.read_excel("pets.xlsx")

|   | Name | Age | Pet |
|---|------|-----|-----|
| 0 | Candy | 23 | Dog |
| 1 | Mr. Whiskers | 25 | Cat |
| 2 | Crystal | 22 | Dog |


---
### Summary
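- `pd.read_csv` loads a CSV file into a DataFrame.
- Use `.head()`, `.info()`, and `.describe()` to explore it quickly.
- `df.to_csv` and `df.to_excel` save results; pass `index=False` to leave out the row index.
- For both CSV and Excel, Pandas offers a matching reader and writer, so saving a DataFrame and loading it back takes one line each way.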
