# Week 5: Apply Functions in Pandas

---

##  Question 1: What are functions in Python?

Write a simple function to square a number and another to find the average of two numbers.

In [1]:

def my_sq(x):
    return x ** 2

def avg_2(x, y):
    return (x + y) / 2

print(my_sq(4))
print(avg_2(10, 20))

16
15.0


---

##  Question 2: Create a sample DataFrame and apply a custom function using `.apply()`

In [11]:

import pandas as pd

df = pd.DataFrame({"col_a": [10, 20, 30], "col_b": [20, 30, 40]})

def my_sq(x):
    return x ** 2

df['col_a_sq'] = df['col_a'].apply(my_sq)
df

   col_a  col_b  col_a_sq
0     10     20       100
1     20     30       400
2     30     40       900


---

##  Question 3: Apply a function with an additional parameter (e.g., exponent)

Fill in the missing code to cube column `col_a`. Extra keyword arguments given to `.apply()` are passed on to the function.

In [12]:

def my_exp(x, e):
    return x ** e

df['col_a_cube'] = df['col_a'].apply(my_exp, e=3)
df

   col_a  col_b  col_a_sq  col_a_cube
0     10     20       100        1000
1     20     30       400        8000
2     30     40       900       27000
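
`Series.apply` can also take extra arguments positionally, through its `args` tuple, instead of by keyword. The sketch below compares the two forms on a fresh copy of the sample data; the names `demo`, `by_keyword`, and `by_args` are illustrative only and are not used elsewhere in these exercises.

```python
import pandas as pd

def my_exp(x, e):
    return x ** e

# Illustrative copy of the sample column
demo = pd.DataFrame({"col_a": [10, 20, 30]})

# Keyword form: e=3 is forwarded to my_exp on every call
by_keyword = demo["col_a"].apply(my_exp, e=3)

# Positional form: extra positional arguments go in the args tuple
by_args = demo["col_a"].apply(my_exp, args=(3,))

print(by_keyword.tolist())
print(by_keyword.equals(by_args))
```

The keyword form is usually easier to read because it names the parameter being set.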


---

##  Question 4: Demonstrate applying a function column-wise and row-wise

Use `.apply()` with `axis=0` (the function receives each column) and `axis=1` (the function receives each row).

In [13]:

print("Column-wise mean:")
print(df.apply(lambda col: col.mean(), axis=0))

print("\nRow-wise mean:")
print(df.apply(lambda row: (row['col_a'] + row['col_b']) / 2, axis=1))

Column-wise mean:
col_a            20.000000
col_b            30.000000
col_a_sq        466.666667
col_a_cube    12000.000000
dtype: float64

Row-wise mean:
0    15.0
1    25.0
2    35.0
dtype: float64
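
For simple reductions such as the mean, pandas' built-in methods accept the same `axis` argument and avoid calling a Python lambda for every row or column. A minimal sketch on a two-column copy of the data (the `demo` frame and the result names are illustrative only):

```python
import pandas as pd

# Illustrative two-column copy of the sample data
demo = pd.DataFrame({"col_a": [10, 20, 30], "col_b": [20, 30, 40]})

# axis=0: the lambda receives one column (a Series) at a time
col_means = demo.apply(lambda col: col.mean(), axis=0)

# axis=1: the lambda receives one row (a Series indexed by column name) at a time
row_means = demo.apply(lambda row: row.mean(), axis=1)

# Built-in reductions take the same axis argument
builtin_col_means = demo.mean(axis=0)
builtin_row_means = demo.mean(axis=1)

print(col_means.tolist(), builtin_col_means.tolist())
print(row_means.tolist(), builtin_row_means.tolist())
```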


---

##  Question 5: What is vectorization and how is `np.vectorize()` used?

Write a function that returns `NaN` when `x == 20` and otherwise returns the average `(x + y) / 2`. Then vectorize it.

Vectorization means expressing an operation on whole arrays rather than on one element at a time. In the code below, `avg_2_mod_vec = np.vectorize(avg_2_mod)` wraps the ordinary Python function `avg_2_mod` in a new function, `avg_2_mod_vec`, that accepts entire NumPy arrays (or pandas Series) and calls `avg_2_mod` on each pair of elements. Without `np.vectorize`, you would need an explicit loop, and passing the Series directly to `avg_2_mod` would fail because `if x == 20` cannot be evaluated on a whole Series. Note that `np.vectorize` is a convenience, not a performance tool: internally it still loops over the elements in Python.

In [16]:

import numpy as np

def avg_2_mod(x, y):
    if x == 20:
        return np.nan
    else:
        return (x + y) / 2


print(df)

avg_2_mod_vec = np.vectorize(avg_2_mod)
print(avg_2_mod_vec(df['col_a'], df['col_b']))

   col_a  col_b  col_a_sq  col_a_cube
0     10     20       100        1000
1     20     30       400        8000
2     30     40       900       27000
[15. nan 35.]
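
The same conditional logic can be written with `np.where`, which evaluates the condition on whole arrays instead of calling a Python function per element. This is a sketch of an alternative approach, not part of the original exercise; `demo` and `result` are illustrative names.

```python
import numpy as np
import pandas as pd

# Illustrative copy of the two input columns
demo = pd.DataFrame({"col_a": [10, 20, 30], "col_b": [20, 30, 40]})

# np.where(condition, value_if_true, value_if_false), applied array-wide:
# NaN where col_a == 20, the average of the two columns everywhere else
result = np.where(
    demo["col_a"] == 20,
    np.nan,
    (demo["col_a"] + demo["col_b"]) / 2,
)

print(result)
```

Unlike the `if`/`else` version, `np.where` computes the average for every row and then selects between the two candidates, so it suits cases where both branches are safe to evaluate everywhere.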


---

##  Question 6: Create a new column using a lambda function to square values in column `col_a`

In [17]:

df['a_sq_lamb'] = df['col_a'].apply(lambda x: x ** 2)
df

   col_a  col_b  col_a_sq  col_a_cube  a_sq_lamb
0     10     20       100        1000        100
1     20     30       400        8000        400
2     30     40       900       27000        900


---

##  Question 7: Celsius to Fahrenheit

1. Create a DataFrame of Celsius temperatures.
2. Write a function to convert Celsius → Fahrenheit.
3. Apply it using both a function and a lambda.

In [18]:

temps = pd.DataFrame({"Celsius": [0, 20, 37, 100]})

def to_fahrenheit(c):
    return (c * 9/5) + 32

temps['F1'] = temps['Celsius'].apply(to_fahrenheit)
temps['F2'] = temps['Celsius'].apply(lambda c: (c * 9/5) + 32)
temps

   Celsius     F1     F2
0        0   32.0   32.0
1       20   68.0   68.0
2       37   98.6   98.6
3      100  212.0  212.0
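
Because the conversion is plain arithmetic, it can also be applied to the whole column at once, without `.apply()`. A minimal sketch; the column name `F3` is an illustrative choice and not part of the exercise.

```python
import pandas as pd

temps = pd.DataFrame({"Celsius": [0, 20, 37, 100]})

# Arithmetic operators on a Series work element-wise across the whole column
temps["F3"] = temps["Celsius"] * 9 / 5 + 32

print(temps)
```

For element-wise arithmetic like this, operating on the column directly is generally preferred over `.apply()`, which calls the Python function once per value.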
