Example:

<table>
    <tr>
        <td>A</td>
        <td>B</td>
        <td>C</td>
        <td>D</td>
        <td>E</td>
        <td>F</td>
    </tr>
    <tr>
        <td>a0</td>
        <td>b0</td>
        <td>1</td>
        <td>0.1</td>
        <td>10</td>
        <td>f0</td>
    </tr>
    <tr>
        <td>a1</td>
        <td>b1</td>
        <td>2</td>
        <td>10.2</td>
        <td>19</td>
        <td>f1</td>
    </tr>
    <tr>
        <td>a1</td>
        <td>b2</td>
        <td></td>
        <td>11.4</td>
        <td>32</td>
        <td>g2</td>
    </tr>
    <tr>
        <td>a2</td>
        <td>b2</td>
        <td>3</td>
        <td>8.9</td>
        <td>25</td>
        <td>f3</td>
    </tr>
    <tr>
        <td>a3</td>
        <td>b3</td>
        <td>4</td>
        <td>9.1</td>
        <td>8</td>
        <td>f4</td>
    </tr>
    <tr>
        <td>a4</td>
        <td></td>
        <td>5</td>
        <td>12</td>
        <td></td>
        <td>f5</td>
    </tr>
</table>

In [1]:
import numpy as np
import pandas as pd

In [2]:
df = pd.DataFrame({'A': ['a0', 'a1', 'a1', 'a2', 'a3', 'a4'], 
                   'B': ['b0', 'b1', 'b2', 'b2', 'b3', None], 
                   'C': [1, 2, None, 3, 4, 5], 
                   'D': [0.1, 10.2, 11.4, 8.9, 9.1, 12], 
                   'E': [10, 19, 32, 25, 8, None], 
                   'F': ['f0', 'f1', 'g2', 'f3', 'f4', 'f5']})
df

Unnamed: 0,A,B,C,D,E,F
0,a0,b0,1.0,0.1,10.0,f0
1,a1,b1,2.0,10.2,19.0,f1
2,a1,b2,,11.4,32.0,g2
3,a2,b2,3.0,8.9,25.0,f3
4,a3,b3,4.0,9.1,8.0,f4
5,a4,,5.0,12.0,,f5


In [3]:
## flag missing values (True marks a null cell)
df.isnull()

Unnamed: 0,A,B,C,D,E,F
0,False,False,False,False,False,False
1,False,False,False,False,False,False
2,False,False,True,False,False,False
3,False,False,False,False,False,False
4,False,False,False,False,False,False
5,False,True,False,False,True,False
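The boolean mask is easier to read once it is summarized. A minimal sketch, assuming the same `df` as in `In [2]` (rebuilt here so the block runs on its own); the names `null_counts` and `rows_with_null` are only illustrative:

```python
import pandas as pd

# Same frame as In [2], rebuilt so this block is self-contained
df = pd.DataFrame({'A': ['a0', 'a1', 'a1', 'a2', 'a3', 'a4'],
                   'B': ['b0', 'b1', 'b2', 'b2', 'b3', None],
                   'C': [1, 2, None, 3, 4, 5],
                   'D': [0.1, 10.2, 11.4, 8.9, 9.1, 12],
                   'E': [10, 19, 32, 25, 8, None],
                   'F': ['f0', 'f1', 'g2', 'f3', 'f4', 'f5']})

null_counts = df.isnull().sum()            # number of nulls in each column
rows_with_null = df.isnull().any(axis=1)   # True for rows with at least one null

print(null_counts)
print(df[rows_with_null])
```

Columns B, C and E each contain one null, and the affected rows are 2 and 5.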


In [4]:
## drop every row that contains at least one null value
df.dropna()

Unnamed: 0,A,B,C,D,E,F
0,a0,b0,1.0,0.1,10.0,f0
1,a1,b1,2.0,10.2,19.0,f1
3,a2,b2,3.0,8.9,25.0,f3
4,a3,b3,4.0,9.1,8.0,f4


In [5]:
## drop rows that have a null in specific columns;
#  here, only rows with a null in column 'B' are dropped
df.dropna(subset=['B'])

Unnamed: 0,A,B,C,D,E,F
0,a0,b0,1.0,0.1,10.0,f0
1,a1,b1,2.0,10.2,19.0,f1
2,a1,b2,,11.4,32.0,g2
3,a2,b2,3.0,8.9,25.0,f3
4,a3,b3,4.0,9.1,8.0,f4
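Besides `subset`, `dropna()` also accepts `how` and `thresh` to control how strict the filter is. A small sketch, assuming the same `df` as in `In [2]` (rebuilt here); the variable names are illustrative only:

```python
import pandas as pd

# Same frame as In [2], rebuilt so this block is self-contained
df = pd.DataFrame({'A': ['a0', 'a1', 'a1', 'a2', 'a3', 'a4'],
                   'B': ['b0', 'b1', 'b2', 'b2', 'b3', None],
                   'C': [1, 2, None, 3, 4, 5],
                   'D': [0.1, 10.2, 11.4, 8.9, 9.1, 12],
                   'E': [10, 19, 32, 25, 8, None],
                   'F': ['f0', 'f1', 'g2', 'f3', 'f4', 'f5']})

# how='all' drops a row only if every column is null (no such row here)
only_empty_rows_dropped = df.dropna(how='all')

# thresh=5 keeps rows that have at least 5 non-null values;
# row 5 has only 4 (B and E are null), so it is dropped
mostly_complete = df.dropna(thresh=5)

print(len(only_empty_rows_dropped))
print(mostly_complete.index.tolist())
```

`how` and `thresh` are used in separate calls here, since recent pandas versions do not allow setting both at once.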


In [6]:
## flag rows whose value in column 'A' has already appeared in an earlier row
df.duplicated(['A'])

0    False
1    False
2     True
3    False
4    False
5    False
dtype: bool

In [7]:
df.duplicated(['A', 'B'])

0    False
1    False
2    False
3    False
4    False
5    False
dtype: bool
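By default `duplicated()` marks only the later copies, so the first row of each duplicate group stays `False`. To inspect every member of a group, `keep=False` can be used. A minimal sketch, assuming the same `df` as in `In [2]` (rebuilt here); `n_dup` and `dup_groups` are illustrative names:

```python
import pandas as pd

# Same frame as In [2], rebuilt so this block is self-contained
df = pd.DataFrame({'A': ['a0', 'a1', 'a1', 'a2', 'a3', 'a4'],
                   'B': ['b0', 'b1', 'b2', 'b2', 'b3', None],
                   'C': [1, 2, None, 3, 4, 5],
                   'D': [0.1, 10.2, 11.4, 8.9, 9.1, 12],
                   'E': [10, 19, 32, 25, 8, None],
                   'F': ['f0', 'f1', 'g2', 'f3', 'f4', 'f5']})

# Count the rows flagged as duplicates of an earlier row in column 'A'
n_dup = int(df.duplicated(['A']).sum())

# keep=False flags every row of a duplicate group, including the first one
dup_groups = df[df.duplicated(['A'], keep=False)]

print(n_dup)
print(dup_groups)
```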

To drop the duplicates, use `drop_duplicates()`; more details can be found [here](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop_duplicates.html).

In [9]:
## drop every row whose 'A' value is duplicated (keep=False keeps none of the copies)
df.drop_duplicates(['A'], keep=False)

Unnamed: 0,A,B,C,D,E,F
0,a0,b0,1.0,0.1,10.0,f0
3,a2,b2,3.0,8.9,25.0,f3
4,a3,b3,4.0,9.1,8.0,f4
5,a4,,5.0,12.0,,f5
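The `keep` parameter decides which copy survives. A short sketch comparing the options, assuming the same `df` as in `In [2]` (rebuilt here); the variable names are illustrative:

```python
import pandas as pd

# Same frame as In [2], rebuilt so this block is self-contained
df = pd.DataFrame({'A': ['a0', 'a1', 'a1', 'a2', 'a3', 'a4'],
                   'B': ['b0', 'b1', 'b2', 'b2', 'b3', None],
                   'C': [1, 2, None, 3, 4, 5],
                   'D': [0.1, 10.2, 11.4, 8.9, 9.1, 12],
                   'E': [10, 19, 32, 25, 8, None],
                   'F': ['f0', 'f1', 'g2', 'f3', 'f4', 'f5']})

keep_first = df.drop_duplicates(['A'])               # default keep='first': row 2 is dropped
keep_last = df.drop_duplicates(['A'], keep='last')   # row 1 is dropped instead
keep_none = df.drop_duplicates(['A'], keep=False)    # rows 1 and 2 are both dropped

print(keep_first.index.tolist())
print(keep_last.index.tolist())
print(keep_none.index.tolist())
```

The original index labels are preserved in all three results; `reset_index(drop=True)` can be chained if a fresh 0..n-1 index is wanted.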


In [11]:
## replace every null, in every column, with the same constant
df.fillna('b*')

Unnamed: 0,A,B,C,D,E,F
0,a0,b0,1,0.1,10,f0
1,a1,b1,2,10.2,19,f1
2,a1,b2,b*,11.4,32,g2
3,a2,b2,3,8.9,25,f3
4,a3,b3,4,9.1,8,f4
5,a4,b*,5,12.0,b*,f5


In [12]:
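## fill every null with the mean of column 'E' (18.8);
#  note that the nulls in columns 'B' and 'C' receive this value too
df.fillna(df['E'].mean())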

Unnamed: 0,A,B,C,D,E,F
0,a0,b0,1.0,0.1,10.0,f0
1,a1,b1,2.0,10.2,19.0,f1
2,a1,b2,18.8,11.4,32.0,g2
3,a2,b2,3.0,8.9,25.0,f3
4,a3,b3,4.0,9.1,8.0,f4
5,a4,18.8,5.0,12.0,18.8,f5
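Filling the whole frame with one value puts 18.8 into the text column B and into C as well. Passing a dict to `fillna()` lets each column get its own replacement. A minimal sketch, assuming the same `df` as in `In [2]` (rebuilt here); the choice of median for C and mean for E is only an illustration:

```python
import pandas as pd

# Same frame as In [2], rebuilt so this block is self-contained
df = pd.DataFrame({'A': ['a0', 'a1', 'a1', 'a2', 'a3', 'a4'],
                   'B': ['b0', 'b1', 'b2', 'b2', 'b3', None],
                   'C': [1, 2, None, 3, 4, 5],
                   'D': [0.1, 10.2, 11.4, 8.9, 9.1, 12],
                   'E': [10, 19, 32, 25, 8, None],
                   'F': ['f0', 'f1', 'g2', 'f3', 'f4', 'f5']})

# One fill value per column: a placeholder string for B,
# the column median for C, and the column mean for E
filled = df.fillna({'B': 'b*',
                    'C': df['C'].median(),
                    'E': df['E'].mean()})

print(filled)
```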


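The function `interpolate()` is available on both __Series__ and __DataFrame__, but it only makes sense for numeric data, so here it is applied to one numeric column (a __Series__) at a time.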

In [13]:
## linear interpolation (the default); the trailing null is filled with the last valid value
df['E'].interpolate()

0    10.0
1    19.0
2    32.0
3    25.0
4     8.0
5     8.0
Name: E, dtype: float64

In [14]:
pd.Series([1,None, 4, 5, 20]).interpolate()

0     1.0
1     2.5
2     4.0
3     5.0
4    20.0
dtype: float64
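The value 2.5 comes from drawing a straight line between the neighbours 1 and 4, treating the positions as equally spaced. The sketch below redoes that step by hand and also shows `limit_area='inside'`, which leaves leading and trailing nulls untouched instead of repeating the last valid value. The names `manual` and `inside_only` are illustrative:

```python
import pandas as pd

s = pd.Series([1, None, 4, 5, 20])
linear = s.interpolate()   # default method='linear'

# Linear interpolation by hand for position 1, between positions 0 and 2
x0, x1 = 0, 2
y0, y1 = s[x0], s[x1]
manual = y0 + (y1 - y0) * (1 - x0) / (x1 - x0)

# Column E from the example: the null is at the end, so there is no right neighbour
e = pd.Series([10, 19, 32, 25, 8, None], name='E')
default_fill = e.interpolate()                     # trailing null becomes 8.0
inside_only = e.interpolate(limit_area='inside')   # trailing null stays null

print(linear[1], manual)
print(default_fill[5], inside_only[5])
```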

More details on the `method` parameter can be found in the pandas documentation for `Series.interpolate`.

In [15]:
## cubic spline interpolation (requires SciPy); the last value is extrapolated by the spline
df['E'].interpolate(method='spline', order=3)

0    10.000000
1    19.000000
2    32.000000
3    25.000000
4     8.000000
5   -20.143603
Name: E, dtype: float64

In [18]:
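## check D for outliers with the interquartile range (IQR) rule
upper_q = df['D'].quantile(0.75)
lower_q = df['D'].quantile(0.25)

q_int = upper_q - lower_q   # interquartile range
k = 1.5

## keep the rows inside (lower_q - k*IQR, upper_q + k*IQR);
#  both conditions are combined into a single boolean mask
df[(df['D'] > lower_q - k*q_int) & (df['D'] < upper_q + k*q_int)]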

  


Unnamed: 0,A,B,C,D,E,F
1,a1,b1,2.0,10.2,19.0,f1
2,a1,b2,,11.4,32.0,g2
3,a2,b2,3.0,8.9,25.0,f3
4,a3,b3,4.0,9.1,8.0,f4
5,a4,,5.0,12.0,,f5
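The same rule can be wrapped in a small helper so it can be reused on other columns and so the rejected rows can be inspected. This is only a sketch: `iqr_bounds` is a hypothetical helper name, not a pandas function, and `df` is rebuilt from `In [2]`:

```python
import pandas as pd

# Same frame as In [2], rebuilt so this block is self-contained
df = pd.DataFrame({'A': ['a0', 'a1', 'a1', 'a2', 'a3', 'a4'],
                   'B': ['b0', 'b1', 'b2', 'b2', 'b3', None],
                   'C': [1, 2, None, 3, 4, 5],
                   'D': [0.1, 10.2, 11.4, 8.9, 9.1, 12],
                   'E': [10, 19, 32, 25, 8, None],
                   'F': ['f0', 'f1', 'g2', 'f3', 'f4', 'f5']})


def iqr_bounds(series, k=1.5):
    """Return (low, high) limits of the IQR rule for a numeric Series."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


low, high = iqr_bounds(df['D'])
mask = (df['D'] > low) & (df['D'] < high)

inliers = df[mask]     # rows kept by the rule
outliers = df[~mask]   # rows rejected by the rule

print(low, high)
print(outliers)
```

For column D the limits are roughly 5.725 and 14.325, so only row 0 (D = 0.1) falls outside.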


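Check F: the values are expected to start with `f`. Row 2 (`g2`) does not, so it can be dropped by its index label or filtered out with a string test.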

In [19]:
df.drop(2)

Unnamed: 0,A,B,C,D,E,F
0,a0,b0,1.0,0.1,10.0,f0
1,a1,b1,2.0,10.2,19.0,f1
3,a2,b2,3.0,8.9,25.0,f3
4,a3,b3,4.0,9.1,8.0,f4
5,a4,,5.0,12.0,,f5


In [21]:
## keep only the rows whose 'F' value starts with 'f'
df[df['F'].str.startswith('f')]

Unnamed: 0,A,B,C,D,E,F
0,a0,b0,1.0,0.1,10.0,f0
1,a1,b1,2.0,10.2,19.0,f1
3,a2,b2,3.0,8.9,25.0,f3
4,a3,b3,4.0,9.1,8.0,f4
5,a4,,5.0,12.0,,f5
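The vectorized `.str.startswith()` accessor can replace the explicit loop, and its `na` argument decides how nulls are treated. A minimal sketch, assuming the same `df` as in `In [2]` (rebuilt here); applying the check to column B is only an illustration of the `na` argument:

```python
import pandas as pd

# Same frame as In [2], rebuilt so this block is self-contained
df = pd.DataFrame({'A': ['a0', 'a1', 'a1', 'a2', 'a3', 'a4'],
                   'B': ['b0', 'b1', 'b2', 'b2', 'b3', None],
                   'C': [1, 2, None, 3, 4, 5],
                   'D': [0.1, 10.2, 11.4, 8.9, 9.1, 12],
                   'E': [10, 19, 32, 25, 8, None],
                   'F': ['f0', 'f1', 'g2', 'f3', 'f4', 'f5']})

# Column F has no nulls, so the mask is purely boolean
f_mask = df['F'].str.startswith('f')
unexpected = df[~f_mask]   # the rows that break the naming pattern

# Column B contains a null; na=False counts it as "does not match"
b_mask = df['B'].str.startswith('b', na=False)

print(unexpected)
print(b_mask.tolist())
```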
