# 📊 Day 2: Introduction to DataFrames in Pandas

Welcome to Day 2 of our 45-day Data Science with AI Challenge!

Today, we’ll learn about **DataFrames** — one of the most important data structures in Python for data analysis.

---

## 💡 What is a DataFrame?

A **DataFrame** is like a table — with **rows** and **columns**, just like an Excel sheet.

It comes from the **Pandas** library, which is a powerful tool for working with data in Python. It has functions for analysing, cleaning, exploring, and manipulating data.


In [40]:
# 🧪 Step 1: Importing pandas
import pandas as pd


---

## 🏗 Step 2: Create a DataFrame

There are many ways to create a DataFrame. Let's start with a simple and common way: using a **dictionary**.


In [54]:
# Creating a dictionary with sample data
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Miami']
}

# Creating a DataFrame from the dictionary
df = pd.DataFrame(data)

# Display the DataFrame
df


,Name,Age,City
0,Alice,25,New York
1,Bob,30,Los Angeles
2,Charlie,35,Chicago
3,David,40,Houston
4,Eve,45,Miami
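
Since there are many ways to create a DataFrame, here is a minimal sketch of one alternative: a list of dictionaries, one per row. It reuses the first three people from the sample data, and the name `df_rows` is made up for this illustration.

```python
import pandas as pd

# Same sample data as above, arranged as one dictionary per row
rows = [
    {'Name': 'Alice', 'Age': 25, 'City': 'New York'},
    {'Name': 'Bob', 'Age': 30, 'City': 'Los Angeles'},
    {'Name': 'Charlie', 'Age': 35, 'City': 'Chicago'},
]

# pandas uses the dictionary keys as column names
df_rows = pd.DataFrame(rows)
print(df_rows)
```

Both approaches should give the same kind of table. The dictionary-of-lists form is handy when your data is organized by column, and the list-of-dictionaries form when it arrives one record at a time.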


---

## 📋 What do we see here?

- Each **column** has a name (`Name`, `Age`, `City`)
- Each **row** represents one person
- Pandas automatically gives **index numbers** (0, 1, 2...) to each row
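
To make the automatic index numbers a bit more concrete, the sketch below inspects the index directly and then shows what happens when you supply your own labels. The names `df_demo` and `df_labeled` and the labels `'a'`, `'b'`, `'c'` are made up for this example.

```python
import pandas as pd

people = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}

# With no index given, pandas numbers the rows starting from 0
df_demo = pd.DataFrame(people)
print(df_demo.index)
print(list(df_demo.index))

# Passing index= replaces the default numbers with your own labels
df_labeled = pd.DataFrame(people, index=['a', 'b', 'c'])
print(list(df_labeled.index))
```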

---

## 🆚 `print(df)` vs `df`

When you just type `df` in a Jupyter Notebook, it shows a nice, clean table.

But if you use `print(df)`, it looks more like plain text. Try it:


In [57]:
print(df)

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston
4      Eve   45        Miami


In [59]:
df

,Name,Age,City
0,Alice,25,New York
1,Bob,30,Los Angeles
2,Charlie,35,Chicago
3,David,40,Houston
4,Eve,45,Miami
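
If you are curious why the two views differ, the sketch below looks at both representations directly. It assumes default pandas display settings. `_repr_html_` is the hook that Jupyter's rich display looks for, and the variable names are made up for this example.

```python
import pandas as pd

df_demo = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# print() works from the plain-text representation
text_view = str(df_demo)

# Jupyter asks the object for an HTML representation instead
html_view = df_demo._repr_html_()

print(text_view)
print('<table' in html_view)
```

You normally never call `_repr_html_` yourself. It is shown here only to illustrate where the nicer table comes from.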


In [61]:
df.info()  # summary of the index, column names, non-null counts, dtypes, and memory usage

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Name    5 non-null      object
 1   Age     5 non-null      int64 
 2   City    5 non-null      object
dtypes: int64(1), object(2)
memory usage: 248.0+ bytes
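
`df.info()` prints its summary but does not hand the numbers back to you. As a rough sketch, the same facts can be read from individual attributes and methods when you need them in code. `df_demo` is a fresh copy of the sample data.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Miami'],
})

# (number of rows, number of columns)
print(df_demo.shape)

# Data type of each column
print(df_demo.dtypes)

# Number of non-null values in each column
print(df_demo.count())
```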


In [63]:
df.tail(2)  # show the last 2 rows

,Name,Age,City
3,David,40,Houston
4,Eve,45,Miami
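
`tail` has a counterpart, `head`, that works the same way from the top of the table. A minimal sketch, using a fresh copy of the `Name` and `Age` columns:

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 45],
})

# First 2 rows
first_two = df_demo.head(2)
print(first_two)

# Last 2 rows
last_two = df_demo.tail(2)
print(last_two)
```

Called without a number, both methods return 5 rows.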


In [65]:
df.columns  # column labels as an Index; add .tolist() to get a plain Python list

Index(['Name', 'Age', 'City'], dtype='object')

In [73]:
df.columns[:3].tolist()  # first three column labels as a Python list

['Name', 'Age', 'City']

In [75]:
for column in df.columns:
    print(column)

Name
Age
City
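
Besides looping over the column names, you can count them or check whether a particular column exists. A small sketch, where `'Salary'` is a made-up column name that is not in the data:

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles'],
})

# How many columns are there?
n_columns = len(df_demo.columns)
print(n_columns)

# Is a given column present?
has_age = 'Age' in df_demo.columns
has_salary = 'Salary' in df_demo.columns
print(has_age, has_salary)
```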


In [77]:
df.Name  # attribute-style access to one column; same result as df['Name']

0      Alice
1        Bob
2    Charlie
3      David
4        Eve
Name: Name, dtype: object
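
Dot access is a shortcut, and it only works when the column name is a valid Python identifier. The sketch below compares it with bracket access. `'Home City'` is a made-up column name, chosen because it contains a space.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Home City': ['New York', 'Los Angeles'],
})

# Both styles return the same column
same = df_demo.Name.equals(df_demo['Name'])
print(same)

# 'Home City' contains a space, so only brackets can reach it
home_city = df_demo['Home City']
print(home_city)
```

Bracket access works for every column name, which makes it the safer habit.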

In [79]:
df[['Name', 'Age']]  # a list of column names selects several columns at once

,Name,Age
0,Alice,25
1,Bob,30
2,Charlie,35
3,David,40
4,Eve,45
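
Single and double brackets return different kinds of objects, which is easy to miss. A minimal sketch with made-up variable names:

```python
import pandas as pd

df_demo = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# Single brackets with one name: a Series (one column)
one_column = df_demo['Name']
print(type(one_column).__name__)

# Double brackets (a list of names): a DataFrame, even with one name
small_table = df_demo[['Name']]
print(type(small_table).__name__)
```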


In [81]:
del df['City']  # removes the column from df itself (in place)
df

,Name,Age
0,Alice,25
1,Bob,30
2,Charlie,35
3,David,40
4,Eve,45
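
`del` changes the DataFrame itself. If you would rather keep the original untouched, one option is `drop`, which returns a new DataFrame. This is a sketch on a fresh copy of the data, and `df_without_city` is a made-up name.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles'],
})

# drop returns a new DataFrame without the listed columns
df_without_city = df_demo.drop(columns=['City'])

print(list(df_without_city.columns))
print(list(df_demo.columns))  # the original still has City
```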


In [83]:
df["Gender"] = "Male"  # a single value is repeated in every row (placeholder data)
df

,Name,Age,Gender
0,Alice,25,Male
1,Bob,30,Male
2,Charlie,35,Male
3,David,40,Male
4,Eve,45,Male


In [85]:
df["Married"] = True  # a single boolean is repeated in every row too
df

,Name,Age,Gender,Married
0,Alice,25,Male,True
1,Bob,30,Male,True
2,Charlie,35,Male,True
3,David,40,Male,True
4,Eve,45,Male,True
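
A single value fills every row with the same thing. When each row needs its own value, one common approach is to compute the new column from an existing one. The column names `AgeNextYear` and `Over30` are made up for this sketch.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
})

# Arithmetic on a column is applied row by row
df_demo['AgeNextYear'] = df_demo['Age'] + 1

# A comparison produces a True/False value for each row
df_demo['Over30'] = df_demo['Age'] > 30

print(df_demo)
```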


In [87]:
# Values are matched to rows by index label; rows 0, 2, and 4 have no match and get NaN
df["College"] = pd.Series(["Harvard", "IIT"],
                           index=[1, 3])

df

,Name,Age,Gender,Married,College
0,Alice,25,Male,True,NaN
1,Bob,30,Male,True,Harvard
2,Charlie,35,Male,True,NaN
3,David,40,Male,True,IIT
4,Eve,45,Male,True,NaN
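
Rows that had no matching index label in the Series are left as missing values. A short sketch of how you might find them, using a fresh DataFrame with only the `Name` column:

```python
import pandas as pd

df_demo = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve']})

# Only index labels 1 and 3 receive a value
df_demo['College'] = pd.Series(['Harvard', 'IIT'], index=[1, 3])

# isna() marks the rows where College is missing
missing = df_demo['College'].isna()
print(missing.tolist())

# Count the missing values
print(missing.sum())
```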


---

## ✅ Summary

- A **DataFrame** is like an Excel table in Python
- We used a dictionary to create it
- It's part of the `pandas` library
- Use `df` (not `print(df)`) to see a nice table in Jupyter

---

🚀 That’s it for Day 2! You’ve just created your first DataFrame. Tomorrow, we’ll learn how to explore and analyze it!

👉 Don’t forget to share your progress and tag #45DaysOfDataScience
