# 📊 04 - Data Input and Output

Much of practical data science comes down to **loading data, exploring it, and saving results**.  
In this notebook you will learn:
- Reading CSV files with Pandas
- Inspecting DataFrames (`head`, `info`, `describe`)
- Writing data back to CSV
- Writing and reading Excel files (requires `openpyxl`)


## 1. Reading CSV Files

In [6]:
import pandas as pd

# Create a small CSV for demo
data = """Name,Age,Score
Alice,23,90
Bob,25,85
Charlie,22,95
"""
with open("students.csv", "w") as f:
    f.write(data)

# Read CSV into DataFrame
df = pd.read_csv("students.csv")
df

|   | Name | Age | Score |
|---|------|-----|-------|
| 0 | Alice | 23 | 90 |
| 1 | Bob | 25 | 85 |
| 2 | Charlie | 22 | 95 |
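
The call above relies on the defaults of `pd.read_csv`. As a minimal sketch of a few commonly used parameters (`usecols`, `index_col`, `dtype`), the example below reads the same data from an in-memory string through `io.StringIO`, so it does not touch the file system. The variable names `csv_text` and `subset` are illustrative only.

```python
import io

import pandas as pd

csv_text = """Name,Age,Score
Alice,23,90
Bob,25,85
Charlie,22,95
"""

# io.StringIO lets read_csv treat a string like an open text file
subset = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["Name", "Score"],   # load only these two columns
    index_col="Name",            # use Name as the row index
    dtype={"Score": "float64"},  # override the inferred integer type
)
print(subset)
```

Restricting columns and setting types at load time is mostly useful for larger files; for a three-row demo the defaults are fine.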


In [14]:
stats = """TeamName,GP,W,L
Inter,15,12,3
Milan,15,9,6
Juventus,15,10,5
Roma,15,14,1
"""
with open("stats.csv", "w") as f:
    f.write(stats)

df1 = pd.read_csv("stats.csv")
df1

|   | TeamName | GP | W | L |
|---|----------|----|----|----|
| 0 | Inter | 15 | 12 | 3 |
| 1 | Milan | 15 | 9 | 6 |
| 2 | Juventus | 15 | 10 | 5 |
| 3 | Roma | 15 | 14 | 1 |


✅ **Your Turn**: Create your own small CSV (3–5 rows) and read it into Pandas.

## 2. Exploring DataFrames

In [7]:
# Look at the first rows
df.head()

|   | Name | Age | Score |
|---|------|-----|-------|
| 0 | Alice | 23 | 90 |
| 1 | Bob | 25 | 85 |
| 2 | Charlie | 22 | 95 |


In [8]:
# Info about columns and data types
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Name    3 non-null      object
 1   Age     3 non-null      int64 
 2   Score   3 non-null      int64 
dtypes: int64(2), object(1)
memory usage: 204.0+ bytes
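
`df.info()` prints its report and returns `None`, so the values it shows cannot be reused directly. A small sketch of how the same facts might be read programmatically, using `shape`, `dtypes`, and `isna()`; the DataFrame is rebuilt inline here so the example stands alone.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob", "Charlie"],
        "Age": [23, 25, 22],
        "Score": [90, 85, 95],
    }
)

shape = df.shape           # (rows, columns)
dtypes = df.dtypes         # one dtype per column
missing = df.isna().sum()  # count of missing values per column

print(shape)
print(dtypes)
print(missing)
```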


In [9]:
# Summary statistics
df.describe()

|   | Age | Score |
|---|-----|-------|
| count | 3.0 | 3.0 |
| mean | 23.333333 | 90.0 |
| std | 1.527525 | 5.0 |
| min | 22.0 | 85.0 |
| 25% | 22.5 | 87.5 |
| 50% | 23.0 | 90.0 |
| 75% | 24.0 | 92.5 |
| max | 25.0 | 95.0 |
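
To see where the `Age` numbers come from, the sketch below recomputes them by hand. Note that `std` is the sample standard deviation (it divides by `n - 1`, not `n`), and the quartiles use linear interpolation between sorted values, which is the pandas default.

```python
import pandas as pd

ages = pd.Series([23, 25, 22], name="Age")

mean = ages.sum() / len(ages)

# Sample variance: sum of squared deviations divided by n - 1
variance = ((ages - mean) ** 2).sum() / (len(ages) - 1)
std = variance ** 0.5

# Quartiles, interpolated linearly between the sorted values 22, 23, 25
quartiles = ages.quantile([0.25, 0.5, 0.75])

print(round(mean, 6), round(std, 6))  # 23.333333 1.527525
print(quartiles.tolist())             # [22.5, 23.0, 24.0]
```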


✅ **Your Turn**: Use `.head()` to preview your dataset and `.describe()` to see basic statistics.

In [15]:
df1.head()

|   | TeamName | GP | W | L |
|---|----------|----|----|----|
| 0 | Inter | 15 | 12 | 3 |
| 1 | Milan | 15 | 9 | 6 |
| 2 | Juventus | 15 | 10 | 5 |
| 3 | Roma | 15 | 14 | 1 |


In [16]:
df1.describe()

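|   | GP | W | L |
|---|----|----|----|
| count | 4.0 | 4.0 | 4.0 |
| mean | 15.0 | 11.25 | 3.75 |
| std | 0.0 | 2.217356 | 2.217356 |
| min | 15.0 | 9.0 | 1.0 |
| 25% | 15.0 | 9.75 | 2.5 |
| 50% | 15.0 | 11.0 | 4.0 |
| 75% | 15.0 | 12.5 | 5.25 |
| max | 15.0 | 14.0 | 6.0 |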


## 3. Writing CSV Files

In [17]:
# Save the DataFrame to a new CSV
df.to_csv("students_copy.csv", index=False)
%ls *.csv

stats.csv  students_copy.csv  students.csv
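
The `index=False` argument matters for round trips. By default `to_csv` writes the row index as an extra, unnamed first column, and reading that file back produces a column called `Unnamed: 0`. The sketch below compares both cases; it writes to a temporary directory, and the file names are made up for the illustration.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [23, 25]})

with tempfile.TemporaryDirectory() as tmp:
    with_index = os.path.join(tmp, "with_index.csv")
    without_index = os.path.join(tmp, "without_index.csv")

    df.to_csv(with_index)                  # default: index written as first column
    df.to_csv(without_index, index=False)  # index left out

    cols_with = list(pd.read_csv(with_index).columns)
    cols_without = list(pd.read_csv(without_index).columns)

print(cols_with)     # ['Unnamed: 0', 'Name', 'Age']
print(cols_without)  # ['Name', 'Age']
```

If a file was already saved with its index, passing `index_col=0` to `pd.read_csv` restores it as the index instead of a data column.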


✅ **Your Turn**: Save your DataFrame to a CSV file called `my_data.csv`.

In [18]:
df1.to_csv("stats_copy.csv", index=False)
%ls *.csv

stats_copy.csv  stats.csv  students_copy.csv  students.csv


## 4. Reading Excel Files

In [19]:
# Optional: Requires 'openpyxl' installed
# Save to Excel
df.to_excel("students.xlsx", index=False)

# Read Excel file
pd.read_excel("students.xlsx")

|   | Name | Age | Score |
|---|------|-----|-------|
| 0 | Alice | 23 | 90 |
| 1 | Bob | 25 | 85 |
| 2 | Charlie | 22 | 95 |
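
Because Excel support depends on an optional package, it can help to check for it before calling `to_excel`. The following is a sketch under that assumption: it looks for `openpyxl` with `importlib.util.find_spec` and skips the round trip when the package is missing. The `has_openpyxl` and `round_trip` names are illustrative.

```python
import importlib.util
import os
import tempfile

import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob", "Charlie"],
        "Age": [23, 25, 22],
        "Score": [90, 85, 95],
    }
)

# find_spec returns None when the package cannot be imported
has_openpyxl = importlib.util.find_spec("openpyxl") is not None

round_trip = None
if has_openpyxl:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "students.xlsx")
        df.to_excel(path, index=False)    # write without the row index
        round_trip = pd.read_excel(path)  # read the sheet back
    print(round_trip)
else:
    print("openpyxl is not installed; skipping the Excel round trip")
```

If the package is missing, it can usually be installed with `pip install openpyxl`.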


✅ **Your Turn**: Try saving your dataset to Excel and reading it back (if supported in your environment).

---
### Summary
- `pd.read_csv` loads data into a DataFrame.
- Use `.head()`, `.info()`, `.describe()` to explore quickly.
- `df.to_csv` and `df.to_excel` save results.
- Each reader has a matching writer (`read_csv` and `to_csv`, `read_excel` and `to_excel`), so a round trip takes one line in each direction.
