# 🐼 Pandas DataFrame Notes with Examples

### 1️⃣ What is a DataFrame?
A **DataFrame** is a **2D tabular data structure** (like an Excel sheet or SQL table).

It consists of **rows and columns.**

Each column is a **Series.**
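As a quick check of the points above, the sketch below builds a tiny table and inspects it. The `demo` frame and its values are made up for illustration only.

```python
import pandas as pd

# Hypothetical two-row table, used only to illustrate the structure.
demo = pd.DataFrame({"Name": ["A", "B"], "Age": [20, 25]})

# Selecting a single column gives back a Series.
age = demo["Age"]

print(type(demo).__name__)  # class name of the table object
print(type(age).__name__)   # class name of one column
print(demo.shape)           # (number of rows, number of columns)
```

`shape` reports rows first, then columns, which matches the 2D description above.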

### 2️⃣ Creating DataFrames
#### From a Dictionary

In [None]:
import numpy as np
import pandas as pd

data = {
    "Name": ["Pratamesh", "Yashraj", "Shreyas", "Kaif", "Rushi"],
    "Age": [20, 25, 30, 23, 40],
    "Salary": [50000, 35000, 15000, 30000, 40000],
}
df = pd.DataFrame(data)
print(df)

        Name  Age  Salary
0  Pratamesh   20   50000
1    Yashraj   25   35000
2    Shreyas   30   15000
3       Kaif   23   30000
4      Rushi   40   40000


#### From a List of Dictionaries

In [None]:
data2 = [
    {"Name": "Pratham", "Age": 22, "Salary": 60000},
    {"Name": "Yash", "Age": 23, "Salary": 50000},
    {"Name": "Rushi", "Age": 24, "Salary": 40000},
    {"Name": "Shreyas", "Age": 25, "Salary": 30000},
]
df2 = pd.DataFrame(data2)
print(df2)

      Name  Age  Salary
0  Pratham   22   60000
1     Yash   23   50000
2    Rushi   24   40000
3  Shreyas   25   30000


#### From a List of Lists

In [None]:
data3 = [
    ["Amit", 21, 90],
    ["Amit", 21, 90],
    ["Amit", 21, 90],
]
df3 = pd.DataFrame(data3, columns=["Name", "Age", "Marks"])
print(df3)

   Name  Age  Marks
0  Amit   21     90
1  Amit   21     90
2  Amit   21     90


#### From a NumPy Array

In [None]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df4 = pd.DataFrame(arr, columns=["A", "B", "C"])
print(df4)

   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9


### 2️⃣ Selecting & Indexing Columns


In [None]:

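data4 = {
    "Name": ["Pratamesh", "Yashraj", "Shreyas", "Kaif", "Rushi"],
    "Age": [20, 25, 30, 23, 40],
    "Salary": [50000, 35000, 15000, 30000, 40000],
    "City": ["Pune", "Nagpur", "Mumbai", "Mumbai", "Jaipur"],
}
df5 = pd.DataFrame(data4)
print(df5)

# Single column: pass the column name in one pair of brackets.
print('\n', df5["Name"])

# Multiple columns: pass a list of column names.
print('\n', df5[["Age", "City"]])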



        Name  Age  Salary    City
0  Pratamesh   20   50000    Pune
1    Yashraj   25   35000  Nagpur
2    Shreyas   30   15000  Mumbai
3       Kaif   23   30000  Mumbai
4      Rushi   40   40000  Jaipur

 0    Pratamesh
1      Yashraj
2      Shreyas
3         Kaif
4        Rushi
Name: Name, dtype: object

    Age    City
0   20    Pune
1   25  Nagpur
2   30  Mumbai
3   23  Mumbai
4   40  Jaipur
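The outputs above look different because the two selections return different types. A minimal sketch, using a small made-up `people` frame rather than `df5`:

```python
import pandas as pd

# Hypothetical data, for illustration only.
people = pd.DataFrame(
    {"Name": ["Kaif", "Rushi"], "Age": [23, 40], "City": ["Mumbai", "Jaipur"]}
)

one_col = people["Age"]             # single brackets -> Series
one_col_df = people[["Age"]]        # list with one name -> DataFrame
two_cols = people[["Age", "City"]]  # list of names -> DataFrame

print(type(one_col).__name__)
print(type(one_col_df).__name__)
print(list(two_cols.columns))
```

Passing a list, even with one name in it, keeps the result two-dimensional.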


### 3️⃣ Creating a new column


In [None]:
df5

        Name  Age  Salary    City
0  Pratamesh   20   50000    Pune
1    Yashraj   25   35000  Nagpur
2    Shreyas   30   15000  Mumbai
3       Kaif   23   30000  Mumbai
4      Rushi   40   40000  Jaipur


In [None]:
# Create a new column by assigning a list with one value per row.
df5["Designation"] = ["Data Scientist", "Sr Developer", "Jr Developer", "Tester", "Cloud Administrator"]
print(df5)

# Create a derived column computed from the existing Salary column.
df5["Salary+1000"] = df5["Salary"] + 1000
print(df5)

        Name  Age  Salary    City          Designation
0  Pratamesh   20   50000    Pune       Data Scientist
1    Yashraj   25   35000  Nagpur         Sr Developer
2    Shreyas   30   15000  Mumbai         Jr Developer
3       Kaif   23   30000  Mumbai               Tester
4      Rushi   40   40000  Jaipur  Cloud Administrator
        Name  Age  Salary    City          Designation  Salary+1000
0  Pratamesh   20   50000    Pune       Data Scientist        51000
1    Yashraj   25   35000  Nagpur         Sr Developer        36000
2    Shreyas   30   15000  Mumbai         Jr Developer        16000
3       Kaif   23   30000  Mumbai               Tester        31000
4      Rushi   40   40000  Jaipur  Cloud Administrator        41000
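Bracket assignment, as used above, changes the DataFrame in place. As an alternative that is not used in these notes, `DataFrame.assign` returns a new DataFrame and leaves the original untouched. The sketch below uses a made-up `staff` frame and a hypothetical column name `SalaryPlus1000` (keyword arguments must be valid Python identifiers, so `Salary+1000` cannot be used directly).

```python
import pandas as pd

# Hypothetical data, for illustration only.
staff = pd.DataFrame({"Name": ["Pratamesh", "Yashraj"], "Salary": [50000, 35000]})

# assign() builds a new DataFrame with the extra column.
raised = staff.assign(SalaryPlus1000=staff["Salary"] + 1000)

print(list(staff.columns))   # original columns only
print(list(raised.columns))  # original columns plus the derived one
```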


### 4️⃣ Removing Column

In [None]:
df5


        Name  Age  Salary    City          Designation  Salary+1000
0  Pratamesh   20   50000    Pune       Data Scientist        51000
1    Yashraj   25   35000  Nagpur         Sr Developer        36000
2    Shreyas   30   15000  Mumbai         Jr Developer        16000
3       Kaif   23   30000  Mumbai               Tester        31000
4      Rushi   40   40000  Jaipur  Cloud Administrator        41000


In [None]:
# drop() can delete rows or columns.
# axis=1 -> column
# axis=0 -> row
print(df5)

df5.drop('Salary+1000', axis=1, inplace=True)
# inplace=True modifies df5 itself. Without it, drop() returns a new
# DataFrame with the column removed and leaves df5 unchanged.
print(df5)


        Name  Age  Salary    City          Designation  Salary+1000
0  Pratamesh   20   50000    Pune       Data Scientist        51000
1    Yashraj   25   35000  Nagpur         Sr Developer        36000
2    Shreyas   30   15000  Mumbai         Jr Developer        16000
3       Kaif   23   30000  Mumbai               Tester        31000
4      Rushi   40   40000  Jaipur  Cloud Administrator        41000
        Name  Age  Salary    City          Designation
0  Pratamesh   20   50000    Pune       Data Scientist
1    Yashraj   25   35000  Nagpur         Sr Developer
2    Shreyas   30   15000  Mumbai         Jr Developer
3       Kaif   23   30000  Mumbai               Tester
4      Rushi   40   40000  Jaipur  Cloud Administrator
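To see the difference `inplace` makes, the sketch below drops a column without it and compares the two frames. The `staff` frame is made up for illustration; the `columns=` keyword is an equivalent way to write `axis=1`.

```python
import pandas as pd

# Hypothetical data, for illustration only.
staff = pd.DataFrame(
    {"Name": ["Kaif", "Rushi"], "Salary": [30000, 40000], "Salary+1000": [31000, 41000]}
)

# Without inplace=True, drop() returns a new DataFrame.
trimmed = staff.drop(columns=["Salary+1000"])

print(list(staff.columns))    # still includes "Salary+1000"
print(list(trimmed.columns))  # column removed
```

Assigning the result back (`staff = staff.drop(...)`) has the same practical effect as `inplace=True`.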


In [None]:
df5

        Name  Age  Salary    City          Designation
0  Pratamesh   20   50000    Pune       Data Scientist
1    Yashraj   25   35000  Nagpur         Sr Developer
2    Shreyas   30   15000  Mumbai         Jr Developer
3       Kaif   23   30000  Mumbai               Tester
4      Rushi   40   40000  Jaipur  Cloud Administrator


### Removing a Row


In [None]:
df5

        Name  Age  Salary    City          Designation
0  Pratamesh   20   50000    Pune       Data Scientist
1    Yashraj   25   35000  Nagpur         Sr Developer
2    Shreyas   30   15000  Mumbai         Jr Developer
3       Kaif   23   30000  Mumbai               Tester
4      Rushi   40   40000  Jaipur  Cloud Administrator


In [None]:
df5.drop(2, axis=0)
# The row is not removed from df5 because inplace=True was not used.
# Printing df5 again will still show the row with index 2.

        Name  Age  Salary    City          Designation
0  Pratamesh   20   50000    Pune       Data Scientist
1    Yashraj   25   35000  Nagpur         Sr Developer
3       Kaif   23   30000  Mumbai               Tester
4      Rushi   40   40000  Jaipur  Cloud Administrator
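Note that the index in the result above skips from 1 to 3: dropping a row removes its label and does not renumber the rest. The sketch below, on a made-up `team` frame, keeps the result in a new variable and then renumbers it with `reset_index(drop=True)`, which is an optional extra step not used in these notes.

```python
import pandas as pd

# Hypothetical data, for illustration only.
team = pd.DataFrame({"Name": ["Yashraj", "Shreyas", "Kaif"], "Age": [25, 30, 23]})

# Drop the row labelled 1; team itself is left unchanged.
without_row = team.drop(1, axis=0)

# Optionally renumber the remaining rows from 0.
cleaned = without_row.reset_index(drop=True)

print(len(team))                 # original row count
print(list(without_row.index))   # labels after the drop
print(list(cleaned.index))       # labels after renumbering
```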
