## 1. Why Learn Pandas?
## What is it?
Pandas is a Python library for data manipulation and analysis. It provides two core data structures: Series (a 1D labeled array) and DataFrame (a 2D labeled table).

## Why do we need it?
1. Simplifies data cleaning, exploration, and manipulation.
2. Handles large in-memory datasets efficiently.
3. Is built on top of NumPy for fast, vectorized computations.
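
As a rough illustration of the NumPy point above, the sketch below applies one arithmetic operation to a whole column at once instead of looping in Python. The name `price` and the values are invented for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented for this example
prices = pd.Series([10.0, 20.0, 30.0], name="price")

# Vectorized arithmetic: the multiplication is applied to every element at once
with_tax = prices * 1.5

# The underlying values are exposed as a NumPy array
values = with_tax.to_numpy()

print(with_tax.tolist())   # [15.0, 30.0, 45.0]
print(type(values))        # <class 'numpy.ndarray'>
```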

## Where is it used?
1. Data analysis and reporting
2. Machine learning and deep learning (preprocessing datasets)
3. Finance, healthcare, and marketing data analysis
4. CSV/Excel/SQL data handling
5. Scientific computing and simulations

Related NumPy-based areas such as image processing (OpenCV and PIL use NumPy arrays) sit alongside pandas in the same ecosystem.
 
## 2. Importing Pandas
Pandas is conventionally imported under the alias pd:
 

In [2]:
import pandas as pd

## 3. Series & DataFrames

In [None]:

# Series 
s = pd.Series([10, 20, 30, 40]) 
print("Series:\n", s) 
# DataFrame from dict 
data = {'Name': ['Alice','Bob','Charlie'], 'Age':[25,30,35]} 
df = pd.DataFrame(data) 
print("\nDataFrame:\n", df)

Series:
 0    10
1    20
2    30
3    40
dtype: int64

DataFrame:
       Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35
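
To make the relationship between the two structures more concrete, here is a minimal sketch showing a Series with custom labels and a DataFrame column coming back as a Series. The labels `a`, `b`, `c` are assumptions chosen for illustration.

```python
import pandas as pd

# A Series can carry custom labels instead of the default 0..n-1 index
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
print(s["b"])                # 20

# A DataFrame is a collection of Series that share one index
df = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]})
ages = df["Age"]             # selecting one column returns a Series
print(type(ages).__name__)   # Series
print(list(df.columns))      # ['Name', 'Age']
print(df.shape)              # (3, 2)
```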


## 4. Reading & Exploring Data

In [None]:

# Read CSV (replace with your file path)
df = pd.read_csv("tested.csv")
 
# Example DataFrame
#data = {'Name':['Alice','Bob','Charlie'],'Age':[25,30,35],'Sex':['F','M','M']}
#df = pd.DataFrame(data)
 
# Explore
print(df.head())
print(df.tail())
df.info()            # info() prints its summary directly and returns None
print(df.describe())
print(df.shape)

   PassengerId  Survived  Pclass  \
0          892         0       3   
1          893         1       3   
2          894         0       2   
3          895         0       3   
4          896         1       3   

                                           Name     Sex   Age  SibSp  Parch  \
0                              Kelly, Mr. James    male  34.5      0      0   
1              Wilkes, Mrs. James (Ellen Needs)  female  47.0      1      0   
2                     Myles, Mr. Thomas Francis    male  62.0      0      0   
3                              Wirz, Mr. Albert    male  27.0      0      0   
4  Hirvonen, Mrs. Alexander (Helga E Lindqvist)  female  22.0      1      1   

    Ticket     Fare Cabin Embarked  
0   330911   7.8292   NaN        Q  
1   363272   7.0000   NaN        S  
2   240276   9.6875   NaN        Q  
3   315154   8.6625   NaN        S  
4  3101298  12.2875   NaN        S  
     PassengerId  Survived  Pclass                          Name     Sex  \
413       
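
If no CSV file is at hand, the same exploration steps can be tried on in-memory text. This is a sketch under the assumption that a tiny three-row CSV (invented here) stands in for a real file; `read_csv` accepts a file-like object as well as a path.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk
csv_text = """Name,Age,Sex
Alice,25,F
Bob,,M
Charlie,35,M
"""

# read_csv accepts a path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (3, 3)
print(df.dtypes)         # Age is float64 because of the missing value
print(df.isna().sum())   # count of missing values per column
```

Checking `dtypes` and the missing-value counts right after loading is one common way to spot parsing surprises early.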

## 5. Selecting Columns & Rows

In [None]:

# Select column
print(df['Name'])
 
# Select multiple columns
print(df[['Name','Age']])
 
# Select row by index
print(df.iloc[0])       # first row
print(df.loc[0])        # first row using label
 
# Select subset of rows and columns (loc slices by label and includes both endpoints)
print(df.loc[0:1, ['Name','Sex']])

0                                  Kelly, Mr. James
1                  Wilkes, Mrs. James (Ellen Needs)
2                         Myles, Mr. Thomas Francis
3                                  Wirz, Mr. Albert
4      Hirvonen, Mrs. Alexander (Helga E Lindqvist)
                           ...                     
413                              Spector, Mr. Woolf
414                    Oliva y Ocana, Dona. Fermina
415                    Saether, Mr. Simon Sivertsen
416                             Ware, Mr. Frederick
417                        Peter, Master. Michael J
Name: Name, Length: 418, dtype: object
                                             Name   Age
0                                Kelly, Mr. James  34.5
1                Wilkes, Mrs. James (Ellen Needs)  47.0
2                       Myles, Mr. Thomas Francis  62.0
3                                Wirz, Mr. Albert  27.0
4    Hirvonen, Mrs. Alexander (Helga E Lindqvist)  22.0
..                                            ...   .
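
With the default 0..n-1 index, `iloc` and `loc` look interchangeable. The sketch below uses a made-up DataFrame with index labels 10, 20, 30 (an assumption for illustration) to show where the two differ.

```python
import pandas as pd

# Hypothetical data with a non-default integer index
df = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]},
    index=[10, 20, 30],
)

first_by_position = df.iloc[0]   # position 0 -> Alice
row_by_label = df.loc[20]        # label 20  -> Bob

# Slices: iloc excludes the end position, loc includes the end label
by_position = df.iloc[0:1]       # 1 row  (Alice)
by_label = df.loc[10:20]         # 2 rows (Alice, Bob)

print(first_by_position["Name"], row_by_label["Name"])   # Alice Bob
print(len(by_position), len(by_label))                   # 1 2
```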

## 6. Filtering & Sorting

In [None]:

# Filter rows
print(df[df['Age'] > 28])
 
# Sort by Age
print(df.sort_values('Age', ascending=False))

     PassengerId  Survived  Pclass  \
0            892         0       3   
1            893         1       3   
2            894         0       2   
6            898         1       3   
11           903         0       1   
..           ...       ...     ...   
404         1296         0       1   
407         1299         0       1   
411         1303         1       1   
414         1306         1       1   
415         1307         0       3   

                                                Name     Sex   Age  SibSp  \
0                                   Kelly, Mr. James    male  34.5      0   
1                   Wilkes, Mrs. James (Ellen Needs)  female  47.0      1   
2                          Myles, Mr. Thomas Francis    male  62.0      0   
6                               Connolly, Miss. Kate  female  30.0      0   
11                        Jones, Mr. Charles Cresson    male  46.0      0   
..                                               ...     ...   ...    ...   
404 
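
Filters can also combine several conditions, and sorting can use more than one column. The following is a small sketch on invented data; the row for `Dana` is hypothetical and added only to create a tie on Age.

```python
import pandas as pd

# Hypothetical data, invented for this example
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Dana"],
    "Age": [25, 30, 35, 30],
    "Sex": ["F", "M", "M", "F"],
})

# Combine conditions with & (and) or | (or); each condition needs parentheses
older_men = df[(df["Age"] > 28) & (df["Sex"] == "M")]
print(older_men["Name"].tolist())   # ['Bob', 'Charlie']

# Sort by Age descending, breaking ties by Name ascending
ordered = df.sort_values(["Age", "Name"], ascending=[False, True])
print(ordered["Name"].tolist())     # ['Charlie', 'Bob', 'Dana', 'Alice']
```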

## 7. Adding & Removing Columns

In [21]:

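# Add new column
# Note: the output below appears to reflect the small Name/Age DataFrame from
# section 9 (this cell was run after it), not the CSV loaded in section 4.
df['Age_in_5yrs'] = df['Age'] + 5
print(df)
 
# Drop column (columns= is equivalent to axis=1 and reads more clearly)
df = df.drop(columns='Age_in_5yrs')
print(df)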
 
 

      Name        Age  Age_in_5yrs
0    Alice  25.000000    30.000000
1      Bob  33.333333    38.333333
2  Charlie  35.000000    40.000000
3    David  40.000000    45.000000
      Name        Age
0    Alice  25.000000
1      Bob  33.333333
2  Charlie  35.000000
3    David  40.000000
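
As an alternative to assigning into the DataFrame in place, `assign` returns a new DataFrame and leaves the original untouched. A minimal sketch on a two-row example (data invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

# assign returns a new DataFrame; df itself is not modified
with_future_age = df.assign(Age_in_5yrs=df["Age"] + 5)
print(list(with_future_age.columns))   # ['Name', 'Age', 'Age_in_5yrs']
print(list(df.columns))                # ['Name', 'Age']

# drop also returns a new DataFrame unless the result is reassigned
trimmed = with_future_age.drop(columns=["Age_in_5yrs"])
print(list(trimmed.columns))           # ['Name', 'Age']
```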


## 8. GroupBy & Aggregation

In [19]:

data = {'Name':['Alice','Bob','Charlie','Alice','Bob'],
        'Score':[85,90,95,80,70]}
df = pd.DataFrame(data)
 
# Group by Name and calculate mean score
grouped = df.groupby('Name').mean()
print(grouped)

         Score
Name          
Alice     82.5
Bob       80.0
Charlie   95.0
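
Several aggregations can be computed in one pass with `agg`. The sketch below reuses the same Name/Score data and selects the `Score` column explicitly, which keeps the result limited to the column of interest.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Alice", "Bob"],
    "Score": [85, 90, 95, 80, 70],
})

# One row per Name, one column per aggregation
summary = df.groupby("Name")["Score"].agg(["mean", "max", "count"])
print(summary)

print(summary.loc["Alice", "mean"])      # 82.5
print(summary.loc["Bob", "max"])         # 90
print(summary.loc["Charlie", "count"])   # 1
```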


## 9. Handling Missing Data

In [20]:

data = {'Name':['Alice','Bob','Charlie','David'],
        'Age':[25, None, 35, 40]}
df = pd.DataFrame(data)
 
# Fill missing value
df['Age'] = df['Age'].fillna(df['Age'].mean())
print(df)
 
# Drop rows with missing values
# df = df.dropna()
 

      Name        Age
0    Alice  25.000000
1      Bob  33.333333
2  Charlie  35.000000
3    David  40.000000
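
Before filling or dropping, it often helps to count what is missing. The sketch below reuses the same data and shows two other options, dropping rows and filling with the median; which one is appropriate depends on the dataset, so treat this as an illustration rather than a recommendation.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, None, 35, 40],
})

# Count missing values in the Age column
print(int(df["Age"].isna().sum()))   # 1

# Option 1: drop rows that contain any missing value
dropped = df.dropna()
print(len(dropped))                  # 3

# Option 2: fill with the median, which is less sensitive to outliers than the mean
filled = df["Age"].fillna(df["Age"].median())
print(filled.tolist())               # [25.0, 35.0, 35.0, 40.0]
```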
