With binary operations between pandas data structures, there are two key points of interest: broadcasting behavior between higher- (e.g. DataFrame) and lower-dimensional (e.g. Series) objects, and the handling of missing data in computations.

#Operations between pandas data structures

Statistics : 
Operations in general exclude missing data.

In [1]:
import numpy as np
import pandas as pd

In [2]:
dates = pd.date_range('20190101', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=list('PQRS'))

In [3]:
df.mean()

P   -0.368621
Q   -0.413545
R   -0.047216
S   -0.062290
dtype: float64

In [5]:
#Same operation on the other axis:
df.mean(axis=1)

2019-01-01   -0.101538
2019-01-02   -0.309329
2019-01-03   -0.114413
2019-01-04    0.228966
2019-01-05   -0.429535
2019-01-06   -0.519806
2019-01-07   -0.919160
2019-01-08    0.381469
Freq: D, dtype: float64
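
Because the frame above is filled with random numbers, it contains no missing values, so the claim that statistics exclude missing data is not visible in its output. The following is a minimal sketch on a small hand-made frame (the name `demo` and its values are assumptions chosen for illustration) showing how `mean` treats NaN on each axis, and how `skipna=False` turns that behavior off.

```python
import numpy as np
import pandas as pd

# Small hand-made frame (hypothetical data) with one missing value per column.
demo = pd.DataFrame({"P": [1.0, np.nan, 3.0], "Q": [np.nan, 4.0, 6.0]})

col_means = demo.mean()             # NaN is skipped: P -> 2.0, Q -> 5.0
row_means = demo.mean(axis=1)       # one mean per row: 1.0, 4.0, 4.5
strict = demo.mean(skipna=False)    # NaN propagates: both columns become NaN

print(col_means)
print(row_means)
print(strict)
```

With the default `skipna=True`, each mean is taken over the values that are present, so the divisor differs from column to column.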

In [6]:
#Operating with objects that have different dimensionality and need alignment :
s = pd.Series([1, 4, np.nan, 6, 8]).shift(2)

In [7]:
s

0    NaN
1    NaN
2    1.0
3    4.0
4    NaN
dtype: float64

In [8]:
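#s has a default integer index while df is indexed by dates, so no labels match and every entry is NaN.
#Building s with index=dates would subtract it row by row instead.
df.sub(s, axis='index')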

                       P    Q    R    S
2019-01-01 00:00:00  NaN  NaN  NaN  NaN
2019-01-02 00:00:00  NaN  NaN  NaN  NaN
2019-01-03 00:00:00  NaN  NaN  NaN  NaN
2019-01-04 00:00:00  NaN  NaN  NaN  NaN
2019-01-05 00:00:00  NaN  NaN  NaN  NaN
2019-01-06 00:00:00  NaN  NaN  NaN  NaN
2019-01-07 00:00:00  NaN  NaN  NaN  NaN
2019-01-08 00:00:00  NaN  NaN  NaN  NaN
0                    NaN  NaN  NaN  NaN
1                    NaN  NaN  NaN  NaN
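
The all-NaN result above comes from label alignment: pandas matches the Series labels against the row labels of the frame before subtracting. As a hedged sketch of the aligned case, the example below gives the Series the same date index as the frame. The names `frame`, `offsets`, and `mismatched`, and all of the values, are assumptions made up so that the output is reproducible.

```python
import pandas as pd

dates = pd.date_range("20190101", periods=4)
# Hypothetical fixed values instead of random numbers.
frame = pd.DataFrame({"P": [10.0, 20.0, 30.0, 40.0],
                      "Q": [1.0, 2.0, 3.0, 4.0]}, index=dates)

# Same date index as the frame; shift(2) leaves NaN in the first two rows.
offsets = pd.Series([1.0, 2.0, 3.0, 4.0], index=dates).shift(2)

# axis='index' matches Series labels to row labels and broadcasts each
# value across the columns of that row.
aligned = frame.sub(offsets, axis="index")

# A Series whose labels never occur in the frame aligns to nothing.
other_dates = pd.date_range("20200101", periods=4)
mismatched = frame.sub(pd.Series([1.0, 2.0, 3.0, 4.0], index=other_dates),
                       axis="index")

print(aligned)
print(mismatched)
```

In `aligned`, only the first two rows are NaN (from the shift). In `mismatched`, the result covers the union of both indexes and every entry is NaN.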


In [10]:
#Apply some functions to the data :
df.apply(np.cumsum)

                   P         Q         R         S
2019-01-01 -0.332803 -0.119423  0.340465 -0.294392
2019-01-02 -1.162739 -1.401559 -0.054956  0.975785
2019-01-03 -2.650936 -1.559883  0.703077  1.406622
2019-01-04 -1.931466 -1.287555  1.471350  0.562413
2019-01-05 -1.750759 -1.651759  0.023095  0.476026
2019-01-06 -2.461776 -2.325543 -0.025601 -0.169700
2019-01-07 -3.014803 -3.295127 -1.620813 -0.728517
2019-01-08 -2.948970 -3.308360 -0.377728 -0.498323


In [11]:
#Range (max minus min) of each column:
df.apply(lambda x: x.max() - x.min())

P    2.207666
Q    1.554464
R    2.838296
S    2.114385
dtype: float64
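
Both `apply` calls above work column by column, which is the default. The sketch below, on a small frame with made-up values (the name `frame` is an assumption), shows the same range function applied along rows with `axis=1`, and compares `apply(np.cumsum)` with the built-in `cumsum` method.

```python
import numpy as np
import pandas as pd

# Hypothetical fixed values so the results can be checked by hand.
frame = pd.DataFrame({"P": [1.0, 5.0, 2.0], "Q": [4.0, 0.0, 3.0]})

# Default axis=0: the function receives one column at a time.
col_range = frame.apply(lambda col: col.max() - col.min())

# axis=1: the function receives one row at a time.
row_range = frame.apply(lambda row: row.max() - row.min(), axis=1)

# Running totals down each column, via the built-in method and via apply.
running = frame.cumsum()
running_via_apply = frame.apply(np.cumsum)

print(col_range)
print(row_range)
print(running)
```

For common reductions and cumulative operations, the dedicated methods (`cumsum`, `max`, `min`) are usually the simpler choice; `apply` is most useful when no built-in covers the computation.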

In [12]:
#Histogramming :
s = pd.Series(np.random.randint(0, 5, size=8))

In [13]:
s

0    4
1    4
2    3
3    2
4    4
5    0
6    0
7    4
dtype: int32

In [14]:
s.value_counts()

4    4
0    2
3    1
2    1
dtype: int64
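
`value_counts` orders its result by frequency, which is why 4 comes first above. As a small sketch, the example below reuses the eight draws shown above as fixed values (the variable names are assumptions) and shows two common variations: ordering by the value itself and returning relative frequencies.

```python
import pandas as pd

values = pd.Series([4, 4, 3, 2, 4, 0, 0, 4])  # the draws shown above

counts = values.value_counts()                 # most frequent value first
by_value = counts.sort_index()                 # ordered by the value itself
shares = values.value_counts(normalize=True)   # fractions instead of counts

print(by_value)
print(shares[4])   # share of the value 4 (label lookup, not position)
```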

String Methods :

Series is equipped with a set of string processing methods in the str attribute that make it easy to operate on each element of the array. Missing values are passed through unchanged as NaN. Here is an example:

In [15]:
s = pd.Series(['C', 'D', 'Baca', np.nan, 'CABA', 'dog', 'boy'])

In [16]:
s.str.lower()

0       c
1       d
2    baca
3     NaN
4    caba
5     dog
6     boy
dtype: object

In [17]:
s.str.upper()

0       C
1       D
2    BACA
3     NaN
4    CABA
5     DOG
6     BOY
dtype: object
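
The `str` accessor offers more than case conversion. The following sketch reuses the same strings (the variable names are assumptions) to show element lengths and two pattern tests. The `na=False` argument is one possible choice for missing entries: it counts them as non-matches so that the result can be used directly as a boolean filter.

```python
import numpy as np
import pandas as pd

words = pd.Series(["C", "D", "Baca", np.nan, "CABA", "dog", "boy"])

# Length of each string; the missing entry stays missing.
lengths = words.str.len()

# Case-insensitive test for the letter "a": lowercase first, then match.
has_a = words.str.lower().str.contains("a", na=False)

# Prefix test; this one is case-sensitive, so "Baca" does not match "b".
starts_b = words.str.startswith("b", na=False)

print(lengths.tolist())
print(words[has_a].tolist())
print(words[starts_b].tolist())
```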