Consider the following data frame
total_bill | tip | sex | smoker | day | time | size | |
---|---|---|---|---|---|---|---|
16.99 | 1.01 | Female | No | Sun | Dinner | 2 | 0 |
10.34 | 1.66 | Male | No | Sun | Dinner | 3 | 1 |
21.01 | 3.50 | Male | No | Sun | Dinner | 3 | 2 |
23.68 | 3.31 | Male | No | Sun | Dinner | 2 | 3 |
24.59 | 3.61 | Female | No | Sun | Dinner | 4 | 4 |
When we run the following code, all values of each column will be printed
df.apply(lambda x:print(x), axis=0)
0 16.99
1 10.34
2 21.01
3 23.68
4 24.59
...
Name: total_bill, Length: 244, dtype: float64
0 1.01
1 1.66
2 3.50
3 3.31
4 3.61
...
Name: tip, Length: 244, dtype: float64
0 Female
1 Male
2 Male
3 Male
4 Female
...
Name: sex, Length: 244, dtype: category
Categories (2, object): ['Male', 'Female']
0 No
1 No
2 No
3 No
4 No
...
Name: smoker, Length: 244, dtype: category
Categories (2, object): ['Yes', 'No']
0 Sun
1 Sun
2 Sun
3 Sun
4 Sun
...
Name: day, Length: 244, dtype: category
Categories (4, object): ['Thur', 'Fri', 'Sat', 'Sun']
0 Dinner
1 Dinner
2 Dinner
3 Dinner
4 Dinner
...
Name: time, Length: 244, dtype: category
Categories (2, object): ['Lunch', 'Dinner']
0 2
1 3
2 3
3 2
4 4
..
Name: size, Length: 244, dtype: int64
When we run the following code, all values of each row will be printed
df.apply(lambda x:print(x), axis=1)
total_bill 16.99
tip 1.01
sex Female
smoker No
day Sun
time Dinner
size 2
Name: 0, dtype: object
total_bill 10.34
tip 1.66
sex Male
smoker No
day Sun
time Dinner
size 3
Name: 1, dtype: object
...
Following are different ways to use apply
function to create new columns in a data frame
def multiply(d):
# print(d.shape) # (7,)
d["res"]=d["total_bill"] * d["size"]
d["res2"]=d["total_bill"] * d["size"]
return d
def sum(d):
return d["total_bill"] + d["size"]
df = df.apply(multiply, axis=1)
df["sum"]=df.apply(sum, axis=1)