# Pandas Operations
**`05-operations.ipynb`**

In this notebook, we learn how to perform **basic operations** on Pandas Series and DataFrames.  
Operations include **arithmetic**, **aggregations**, **element-wise operations**, and working with **rows and columns**.

---

## Step 1: Import Libraries

In [1]:
import pandas as pd
import numpy as np



---

## Step 2: Creating a Sample DataFrame

In [2]:
data = {
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Math": [85, 90, 78, 92],
    "Physics": [80, 95, 75, 88],
    "Chemistry": [82, 88, 79, 85]
}

df = pd.DataFrame(data)
print(df)

      Name  Math  Physics  Chemistry
0    Alice    85       80         82
1      Bob    90       95         88
2  Charlie    78       75         79
3    David    92       88         85



---

## Step 3: Operations on Columns

### Arithmetic Operations

In [3]:
# Add 5 marks to all students in Math
df['Math_plus_5'] = df['Math'] + 5
print(df)

# Average marks across subjects for each student
df['Average'] = (df['Math'] + df['Physics'] + df['Chemistry']) / 3
print(df)

      Name  Math  Physics  Chemistry  Math_plus_5
0    Alice    85       80         82           90
1      Bob    90       95         88           95
2  Charlie    78       75         79           83
3    David    92       88         85           97
      Name  Math  Physics  Chemistry  Math_plus_5    Average
0    Alice    85       80         82           90  82.333333
1      Bob    90       95         88           95  91.000000
2  Charlie    78       75         79           83  77.333333
3    David    92       88         85           97  88.333333


### Using Vectorized Operations

In [4]:
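# Multiply all subject marks by 1.1 (10% bonus) in a single vectorized step.
# Note: this overwrites the subject columns; Math_plus_5 and Average are not
# recomputed, so they still reflect the original marks.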
df[['Math', 'Physics', 'Chemistry']] = df[['Math', 'Physics', 'Chemistry']] * 1.1
print(df)

      Name   Math  Physics  Chemistry  Math_plus_5    Average
0    Alice   93.5     88.0       90.2           90  82.333333
1      Bob   99.0    104.5       96.8           95  91.000000
2  Charlie   85.8     82.5       86.9           83  77.333333
3    David  101.2     96.8       93.5           97  88.333333
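
Because the cell above overwrites the subject columns, the derived columns go stale. One possible alternative, sketched here with a small hypothetical `marks` frame and a hypothetical `boosted` result (neither is part of the notebook), keeps the original data untouched and recomputes the average from the scaled marks:

```python
import pandas as pd

# Hypothetical two-student frame used only for this sketch
marks = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Math": [85, 90],
    "Physics": [80, 95],
})
subjects = ["Math", "Physics"]

# .mul(1.1) is the method form of `* 1.1`; the result is a new DataFrame,
# so `marks` itself is left unchanged
boosted = marks[subjects].mul(1.1).round(1)

# Derive the average from the boosted marks so it stays consistent
boosted["Average"] = boosted[subjects].mean(axis=1)
print(boosted)
```

Rounding to one decimal is a choice made for this sketch to hide floating-point noise; drop it if full precision matters.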



---

## Step 4: Operations on Series

In [5]:
# Create a Series
s = pd.Series([10, 20, 30, 40])

print("Original Series:\n", s)
print("Add 5:\n", s + 5)
print("Multiply by 2:\n", s * 2)
print("Square:\n", s ** 2)
print("Square root (NumPy):\n", np.sqrt(s))

Original Series:
 0    10
1    20
2    30
3    40
dtype: int64
Add 5:
 0    15
1    25
2    35
3    45
dtype: int64
Multiply by 2:
 0    20
1    40
2    60
3    80
dtype: int64
Square:
 0     100
1     400
2     900
3    1600
dtype: int64
Square root (NumPy):
 0    3.162278
1    4.472136
2    5.477226
3    6.324555
dtype: float64


---

## Step 5: Row-wise and Column-wise Operations

In [6]:
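# Row-wise sum (axis=1): add the subject columns to get one total per student
df['Total'] = df[['Math', 'Physics', 'Chemistry']].sum(axis=1)
print(df)

# Column-wise sum (axis=0): add down the rows to get one total per subject
column_sums = df[['Math', 'Physics', 'Chemistry']].sum(axis=0)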
print("Sum of each column:\n", column_sums)


      Name   Math  Physics  Chemistry  Math_plus_5    Average  Total
0    Alice   93.5     88.0       90.2           90  82.333333  271.7
1      Bob   99.0    104.5       96.8           95  91.000000  300.3
2  Charlie   85.8     82.5       86.9           83  77.333333  255.2
3    David  101.2     96.8       93.5           97  88.333333  291.5
Sum of each column:
 Math         379.5
Physics      371.8
Chemistry    367.4
dtype: float64
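
The `axis` argument is easy to mix up, so a minimal sketch may help. It uses a hypothetical `scores` frame (not part of the notebook) and shows that `axis=1` yields one value per row while `axis=0` yields one value per column; the string aliases `"columns"` and `"index"` mean the same thing:

```python
import pandas as pd

# Hypothetical frame indexed by student name, used only for this sketch
scores = pd.DataFrame(
    {"Math": [85, 90], "Physics": [80, 95]},
    index=["Alice", "Bob"],
)

per_student = scores.sum(axis=1)  # collapses the columns: one value per row
per_subject = scores.sum(axis=0)  # collapses the rows: one value per column

print(per_student)
print(per_subject)

# The string aliases are equivalent to the integer axes
same_rowwise = per_student.equals(scores.sum(axis="columns"))
same_colwise = per_subject.equals(scores.sum(axis="index"))
```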



---


## Step 6: Applying Functions

In [7]:
# Apply a custom function to a column
def grade(marks):
    if marks >= 90:
        return "A"
    elif marks >= 80:
        return "B"
    elif marks >= 70:
        return "C"
    else:
        return "D"

df['Math_Grade'] = df['Math'].apply(grade)
print(df)

# Apply lambda function
df['Physics_Grade'] = df['Physics'].apply(lambda x: 'Pass' if x >= 80 else 'Fail')
print(df)


      Name   Math  Physics  Chemistry  Math_plus_5    Average  Total  \
0    Alice   93.5     88.0       90.2           90  82.333333  271.7   
1      Bob   99.0    104.5       96.8           95  91.000000  300.3   
2  Charlie   85.8     82.5       86.9           83  77.333333  255.2   
3    David  101.2     96.8       93.5           97  88.333333  291.5   

  Math_Grade  
0          A  
1          A  
2          B  
3          A  
      Name   Math  Physics  Chemistry  Math_plus_5    Average  Total  \
0    Alice   93.5     88.0       90.2           90  82.333333  271.7   
1      Bob   99.0    104.5       96.8           95  91.000000  300.3   
2  Charlie   85.8     82.5       86.9           83  77.333333  255.2   
3    David  101.2     96.8       93.5           97  88.333333  291.5   

  Math_Grade Physics_Grade  
0          A          Pass  
1          A          Pass  
2          B          Pass  
3          A          Pass  
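
`.apply()` calls a Python function once per element, which can be slow on large columns. As a rough sketch of a vectorized alternative (not the notebook's own method), the same thresholds can be expressed with `pd.cut` and `np.where`. The Series `math` and the names `letter` and `status` are assumptions made for this example:

```python
import numpy as np
import pandas as pd

# Post-bonus Math marks copied from the output above
math = pd.Series([93.5, 99.0, 85.8, 101.2], name="Math")

# Bin edges mirror the thresholds in grade(); right=False makes each
# bin half-open on the right, e.g. [80, 90), so 90 itself maps to "A"
letter = pd.cut(
    math,
    bins=[-np.inf, 70, 80, 90, np.inf],
    labels=["D", "C", "B", "A"],
    right=False,
)

# np.where evaluates the condition on the whole column at once
status = np.where(math >= 80, "Pass", "Fail")

print(letter)
print(status)
```

Note that `pd.cut` returns a categorical Series rather than plain strings, which may or may not be what a downstream step expects.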



---


## Step 7: Aggregation Functions

In [8]:
# Max, min, mean for each subject
print("Maximum marks:\n", df[['Math','Physics','Chemistry']].max())
print("Minimum marks:\n", df[['Math','Physics','Chemistry']].min())
print("Mean marks:\n", df[['Math','Physics','Chemistry']].mean())

# Row-wise aggregation
print("Row-wise mean:\n", df[['Math','Physics','Chemistry']].mean(axis=1))

Maximum marks:
 Math         101.2
Physics      104.5
Chemistry     96.8
dtype: float64
Minimum marks:
 Math         85.8
Physics      82.5
Chemistry    86.9
dtype: float64
Mean marks:
 Math         94.875
Physics      92.950
Chemistry    91.850
dtype: float64
Row-wise mean:
 0     90.566667
1    100.100000
2     85.066667
3     97.166667
dtype: float64
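
Several aggregations can also be computed in one call with `.agg()`. The sketch below rebuilds the post-bonus marks in a hypothetical `marks` frame (a stand-in for the notebook's `df`) and collects the maximum, minimum, and mean of each subject into a single table:

```python
import pandas as pd

# Post-bonus marks copied from the notebook output
marks = pd.DataFrame({
    "Math": [93.5, 99.0, 85.8, 101.2],
    "Physics": [88.0, 104.5, 82.5, 96.8],
    "Chemistry": [90.2, 96.8, 86.9, 93.5],
})

# One row per aggregation, one column per subject
summary = marks.agg(["max", "min", "mean"])
print(summary)
```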



---

## Step 8: Operations Between DataFrames

In [9]:
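# Create a bonus DataFrame with the same index and column labels as the marks
df_bonus = pd.DataFrame({
    "Math": [5, 5, 5, 5],
    "Physics": [3, 2, 4, 1],
    "Chemistry": [2, 3, 1, 2]
})

# Add bonus marks element-wise; values are matched by index and column label
df[['Math','Physics','Chemistry']] + df_bonus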

    Math  Physics  Chemistry
0   98.5     91.0       92.2
1  104.0    106.5       99.8
2   90.8     86.5       87.9
3  106.2     97.8       95.5
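
A DataFrame can be combined with a Series in a similar way. The following sketch uses a hypothetical `marks` frame and two hypothetical bonus Series to show both directions: by default the Series index is matched against the column labels, while `.add(..., axis=0)` matches it against the row index instead:

```python
import pandas as pd

# Hypothetical two-student frame used only for this sketch
marks = pd.DataFrame({
    "Math": [85, 90],
    "Physics": [80, 95],
})

# One bonus per subject: the Series index matches the column labels
subject_bonus = pd.Series({"Math": 5, "Physics": 3})
by_subject = marks + subject_bonus

# One bonus per student: the Series index matches the row index,
# so axis=0 is needed to align along the rows
student_bonus = pd.Series([1, 2], index=marks.index)
by_student = marks.add(student_bonus, axis=0)

print(by_subject)
print(by_student)
```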


---


## Step 9: Operations with Alignment

In [10]:
s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([4, 5, 6], index=['b', 'c', 'd'])

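# Addition aligns values by index label, not by position
print(s1 + s2)
# Note: labels 'a' and 'd' each appear in only one Series, so their results are NaN;
# the result is float64 because NaN cannot be stored in an integer Series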


a    NaN
b    6.0
c    8.0
d    NaN
dtype: float64
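
If missing labels should be treated as zero rather than producing `NaN`, the method form `.add()` accepts a `fill_value`. A minimal sketch, reusing the same two Series (the name `total` is introduced here for illustration):

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([4, 5, 6], index=["b", "c", "d"])

# A label missing from one Series is replaced by 0 before adding
total = s1.add(s2, fill_value=0)
print(total)
```

Whether zero is the right substitute depends on the data; for a sum it is usually a reasonable default, but for other operations a different fill value may be more appropriate.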



---


## Step 10: Real-World Example

In [11]:
# Sales DataFrame
sales = pd.DataFrame({
    "Product": ["A", "B", "C"],
    "Jan": [1000, 1200, 900],
    "Feb": [1100, 1300, 950],
    "Mar": [1050, 1250, 970]
})

# Total sales per product
sales['Total'] = sales[['Jan','Feb','Mar']].sum(axis=1)
print(sales)

# Average sales per month
monthly_avg = sales[['Jan','Feb','Mar']].mean(axis=0)
print("Monthly Average:\n", monthly_avg)

  Product   Jan   Feb   Mar  Total
0       A  1000  1100  1050   3150
1       B  1200  1300  1250   3750
2       C   900   950   970   2820
Monthly Average:
 Jan    1033.333333
Feb    1116.666667
Mar    1090.000000
dtype: float64
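
The same column arithmetic and row-wise methods extend to other questions about this data. As an illustrative sketch, the hypothetical columns `Growth_pct` and `Best_month` (not part of the notebook) give the January-to-March growth and the strongest month for each product:

```python
import pandas as pd

sales = pd.DataFrame({
    "Product": ["A", "B", "C"],
    "Jan": [1000, 1200, 900],
    "Feb": [1100, 1300, 950],
    "Mar": [1050, 1250, 970],
})
months = ["Jan", "Feb", "Mar"]

# Percentage growth from January to March, computed column against column
sales["Growth_pct"] = (sales["Mar"] - sales["Jan"]) / sales["Jan"] * 100

# Label of the month with the highest sales in each row
sales["Best_month"] = sales[months].idxmax(axis=1)

print(sales)
```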



---

## ✅ Summary

* Pandas supports **element-wise operations** on Series and DataFrames.
* Arithmetic can be applied to **columns, rows, or entire DataFrames**.
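* Aggregation methods such as `.sum()`, `.mean()`, `.max()`, and `.min()` work per column by default (`axis=0`) or per row with `axis=1`.
* Series and DataFrames align automatically during operations **based on index and column labels**; labels without a match produce `NaN`.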
* Lambda functions and custom functions can be applied with `.apply()` for flexible operations.


---
