### Notes on Operations between `DataFrame` and `Series`
File: `pd_02_dfops.ipynb` <br>
Xuhua Huang <br>
Last updated: July 8, 2022 <br>
Created on: July 8, 2022

In [1]:
import numpy as np
import pandas as pd

In [2]:
arr = np.arange(12.).reshape((3, 4))
arr

array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]])

In [3]:
arr[0]

array([0., 1., 2., 3.])

## Broadcast Mechanism
The next cell subtracts a 1D `numpy` array (`arr[0]`, the first row) from a 2D `numpy` array. NumPy broadcasts the 1D array, so the same subtraction is performed on every row of the 2D array.

In [4]:
arr - arr[0]

array([[0., 0., 0., 0.],
       [4., 4., 4., 4.],
       [8., 8., 8., 8.]])
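To make the shape rule behind the cell above concrete, here is a minimal sketch comparing a row subtraction with a column subtraction. The names `row_diff`, `col`, and `col_diff` are illustrative and not part of the original notebook.

```python
import numpy as np

arr = np.arange(12.).reshape((3, 4))

# Row case: shapes (3, 4) and (4,) match on the trailing axis,
# so arr[0] is reused for each of the 3 rows.
row_diff = arr - arr[0]

# Column case: index with a list to keep the column 2D, shape (3, 1),
# so it can be stretched across the 4 columns.
col = arr[:, [0]]
col_diff = arr - col

# Note: arr - arr[:, 0] would raise a ValueError, because the
# shapes (3, 4) and (3,) do not match on the trailing axis.
print(row_diff)
print(col_diff)
```

Keeping the column two-dimensional is one of several ways to do this; `arr[:, 0].reshape(-1, 1)` or `arr[:, 0][:, np.newaxis]` would give the same `(3, 1)` shape.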

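### Another small example

By default, arithmetic between a `pd.DataFrame` and a `pd.Series` matches the index of the `Series` against the columns of the `DataFrame` and broadcasts down the rows.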

In [5]:
df = pd.DataFrame(
    np.arange(12.).reshape((4, 3)),
    columns=list('bde'),
    index=['Utah', 'Ohio', 'Texas', 'Oregon']
)

_series = df.iloc[0]
df

          b     d     e
Utah    0.0   1.0   2.0
Ohio    3.0   4.0   5.0
Texas   6.0   7.0   8.0
Oregon  9.0  10.0  11.0


In [6]:
_series

b    0.0
d    1.0
e    2.0
Name: Utah, dtype: float64

In [7]:
df - _series

          b    d    e
Utah    0.0  0.0  0.0
Ohio    3.0  3.0  3.0
Texas   6.0  6.0  6.0
Oregon  9.0  9.0  9.0
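The subtraction above works label by label, not by position. As a hedged sketch of two related cases: a `Series` whose labels only partly overlap the columns, and broadcasting over the columns instead of the rows with `DataFrame.sub`. The names `partial`, `aligned`, `col_d`, and `by_rows` are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(12.).reshape((4, 3)),
    columns=list('bde'),
    index=['Utah', 'Ohio', 'Texas', 'Oregon']
)

# Labels are aligned before the arithmetic: the result has the union
# of the labels, and any label missing on one side becomes NaN.
partial = pd.Series([0., 1., 2.], index=['b', 'e', 'f'])
aligned = df + partial  # columns b, d, e, f; 'd' and 'f' are all NaN

# To broadcast over the columns instead, match on the row index
# by using the arithmetic method with axis='index'.
col_d = df['d']
by_rows = df.sub(col_d, axis='index')

print(aligned)
print(by_rows)
```

The arithmetic methods (`add`, `sub`, `mul`, `div`) are used here because the plain operators do not take an `axis` argument.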


## Applying a `lambda` to a `pd.DataFrame`

In [8]:
# the default is axis=0 (aliases: 'index' and 'rows'):
# the function is called once per column, reducing along the rows
# the axis argument is optional
_f = lambda x: x.max() - x.min()
df.apply(_f, axis='rows')

b    9.0
d    9.0
e    9.0
dtype: float64

In [9]:
df.apply(_f, axis='columns')

Utah      2.0
Ohio      2.0
Texas     2.0
Oregon    2.0
dtype: float64
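For a simple reduction like this one, `apply` is not strictly needed. The sketch below shows what should be an equivalent result using the built-in `max` and `min` methods, plus a function that returns a `pd.Series` so that `apply` produces a `DataFrame`. The names `col_range`, `row_range`, `min_max`, and `summary` are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(12.).reshape((4, 3)),
    columns=list('bde'),
    index=['Utah', 'Ohio', 'Texas', 'Oregon']
)

# Built-in reductions: one value per column (default axis) ...
col_range = df.max() - df.min()
# ... and one value per row.
row_range = df.max(axis='columns') - df.min(axis='columns')

# The applied function may also return a Series; apply then
# assembles the results into a DataFrame, one column per input column.
def min_max(x):
    return pd.Series([x.min(), x.max()], index=['min', 'max'])

summary = df.apply(min_max)

print(col_range)
print(row_range)
print(summary)
```

The built-in methods are usually the better choice when they exist; `apply` with a custom function is mainly useful when no such method covers the computation.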