## Data Manipulation using Pandas


#### Pandas is an open-source Python library that provides high-performance data manipulation and analysis tools built on its powerful data structures.

##### Note: 
1. First clean the environment (go to the "Kernel" menu --> "Restart & Clear Output").
2. To execute the code --> click on a cell and press Ctrl + Enter.


## Key Features of Pandas
- Fast and efficient DataFrame object with default and customized indexing.
- Tools for loading data into in-memory data objects from different file formats.
- Data alignment and integrated handling of missing data.
- Reshaping and pivoting of data sets.
- Label-based slicing, indexing and subsetting of large data sets.
- Columns from a data structure can be deleted or inserted.
- Group by data for aggregation and transformations.
- High performance merging and joining of data.
- Time Series functionality.


## Working with Pandas

## 1. Import pandas library

In [12]:
# Import the pandas library under its conventional alias, pd.

import pandas as pd


## 2 Let's start with Series

#### A Series is a one-dimensional labeled array.

### 2.1 Create a Series from a list of the integers 1 to 9

In [13]:
#import pandas as pd

a = [1, 3, 5, 7, 9, 2, 4, 6, 8]
a1 = pd.Series(a)

print(a1)


0    1
1    3
2    5
3    7
4    9
5    2
6    4
7    6
8    8
dtype: int64


### 2.2 Create a Series with data and an explicit index

In [14]:
import pandas as pd

a1 = [1,3,5,7,9,2,4,6,8]
a2 = ['a','arun','b','c','d','e','f','g','h']
a3 = pd.Series(a1,a2)

# print(a3)
#a3[0]

print(a3)
print(a3['g'])       # access by label
print(a3.iloc[-2])   # access by position (second-to-last element)




a       1
arun    3
b       5
c       7
d       9
e       2
f       4
g       6
h       8
dtype: int64
6
6
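
A labeled Series can be read either by label or by integer position. The sketch below reuses the data from this cell and makes the choice explicit with `.loc` and `.iloc`, which avoids the ambiguity of a bare integer inside `[]`. The names `s`, `by_label` and `by_position` are illustrative only.

```python
import pandas as pd

values = [1, 3, 5, 7, 9, 2, 4, 6, 8]
labels = ['a', 'arun', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
s = pd.Series(values, index=labels)

# .loc selects by index label
by_label = s.loc['g']
# .iloc selects by integer position (negative values count from the end)
by_position = s.iloc[-2]

print(by_label, by_position)
```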


### 2.3 Creating a Series from a dictionary

In [15]:
import pandas as pd

dict1 = {'Oranges':3, 'Apples':4, 'Mangoes':2, 'Banana':12}
dict2 = pd.Series(dict1)

print (dict2)
print (type(dict1))
print(dict2['Apples'])

Oranges     3
Apples      4
Mangoes     2
Banana     12
dtype: int64
<class 'dict'>
4


### 2.4 Creating a Series from a nested list

In [16]:
import pandas as pd

Array1 = [[1,3,5],[2,4,6]]
Array2 = pd.Series(Array1)

print (Array2)
type(Array2)


0    [1, 3, 5]
1    [2, 4, 6]
dtype: object


pandas.core.series.Series

## 2 DataFrames
#### A DataFrame is a two-dimensional pandas data structure with labeled rows and columns.

### 2.1 Creating a data frame with dictionary

In [17]:
import pandas as pd

Data = {'Age':[23,33,12],'Name':['Rahul','John','Robert']}

Data1 = pd.DataFrame(Data)

print(Data1)


   Age    Name
0   23   Rahul
1   33    John
2   12  Robert


### 2.2 Creating a data frame with lists

In [18]:
# import pandas as pd

Data2 = [[4,1900],[3,1600],[2,1100],[1,850]]
Data3 = pd.DataFrame(Data2)#, columns = ['No_of_Bedrooms','Square_Feet'])
print(Data2)
print("")
print (Data3)


[[4, 1900], [3, 1600], [2, 1100], [1, 850]]

   0     1
0  4  1900
1  3  1600
2  2  1100
3  1   850
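
Without a `columns` argument the columns are labeled 0 and 1. A minimal sketch of the same frame with named columns, using the names that appear in the commented-out part of the cell above (the variable names `rows` and `houses` are illustrative):

```python
import pandas as pd

rows = [[4, 1900], [3, 1600], [2, 1100], [1, 850]]

# Each inner list becomes one row; columns= supplies the header labels
houses = pd.DataFrame(rows, columns=['No_of_Bedrooms', 'Square_Feet'])

print(houses)
```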


### 2.3 Assigning indexes within a data frame

In [19]:
import pandas as pd

Data4 = {'Name':['Ankit','Rishitha','Karthik','Vishnu'],'Marks':[78,67,98,56]}
Data5 = pd.DataFrame(Data4,index = ['Rank 2','Rank 3','Rank 1','Rank 4'])

print (Data5)


            Name  Marks
Rank 2     Ankit     78
Rank 3  Rishitha     67
Rank 1   Karthik     98
Rank 4    Vishnu     56


### 2.4 Creating dataframes from list of dictionaries

In [20]:
import pandas as pd

Data6 = [{'A':65,'B':66,'C':67},{'A':97,'B':98,'D':99}]
Data7 = pd.DataFrame(Data6)
a = Data7.iloc[:,1:3]
print(a)
print (Data7)
print(type(Data6))
print(type(Data7))
print(Data6[1])

    B     C
0  66  67.0
1  98   NaN
    A   B     C     D
0  65  66  67.0   NaN
1  97  98   NaN  99.0
<class 'list'>
<class 'pandas.core.frame.DataFrame'>
{'A': 97, 'B': 98, 'D': 99}
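
The output above shows `NaN` wherever a dictionary lacks a key, and the affected columns print as floats. A small sketch that checks this on the same records (the names `records` and `df` are illustrative):

```python
import pandas as pd

records = [{'A': 65, 'B': 66, 'C': 67}, {'A': 97, 'B': 98, 'D': 99}]
df = pd.DataFrame(records)

# Columns present in every record keep an integer dtype;
# columns with a missing key hold NaN and are stored as float.
print(df.dtypes)

# Number of missing cells per column
print(df.isna().sum())
```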


### 2.5 Creating a DataFrame with Timestamp and Categorical columns

In [21]:
import numpy as np
import pandas as pd

Data8 = pd.DataFrame({'A':[1,2,3,'',5], 'B':pd.Timestamp('20190305'),'C':np.array([3]*5)
                     , 'D' : pd.Categorical(["Test","Train","Car","Bike", "Bus"])
                     , 'E':'Hello, Welcome!'})
print(Data8)


   A          B  C      D                E
0  1 2019-03-05  3   Test  Hello, Welcome!
1  2 2019-03-05  3  Train  Hello, Welcome!
2  3 2019-03-05  3    Car  Hello, Welcome!
3    2019-03-05  3   Bike  Hello, Welcome!
4  5 2019-03-05  3    Bus  Hello, Welcome!


## 3 Working with a data file (CSV)


### 3.1 Read csv file

In [22]:
import pandas as pd

LOL = pd.read_csv('League_of_Legends.csv')
LOL


Unnamed: 0,gameId,blueWins,blueWardsPlaced,blueWardsDestroyed,blueFirstBlood,blueKills,blueDeaths,blueAssists,blueEliteMonsters,blueDragons,...,redTowersDestroyed,redTotalGold,redAvgLevel,redTotalExperience,redTotalMinionsKilled,redTotalJungleMinionsKilled,redGoldDiff,redExperienceDiff,redCSPerMin,redGoldPerMin
0,4519157822,0,28,2,1,9,6,11,0,0,...,0,16567,6.8,17047,197,55,-643,8,19.7,1656.7
1,4523371949,0,12,1,0,5,5,5,0,0,...,1,17620,6.8,17438,240,52,2908,1173,24.0,1762.0
2,4521474530,0,15,0,0,7,11,4,1,1,...,0,17285,6.8,17254,203,28,1172,1033,20.3,1728.5
3,4524384067,0,43,1,0,4,5,5,1,0,...,0,16478,7.0,17961,235,47,1321,7,23.5,1647.8
4,4436033771,0,75,4,0,6,6,6,0,0,...,0,17404,7.0,18313,225,67,1004,-230,22.5,1740.4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9874,4527873286,1,17,2,1,7,4,5,1,1,...,0,15246,6.8,16498,229,34,-2519,-2469,22.9,1524.6
9875,4527797466,1,54,0,0,6,4,8,1,1,...,0,15456,7.0,18367,206,56,-782,-888,20.6,1545.6
9876,4527713716,0,23,1,0,6,7,5,0,0,...,0,18319,7.4,19909,261,60,2416,1877,26.1,1831.9
9877,4527628313,0,14,4,1,2,3,3,1,1,...,0,15298,7.2,18314,247,40,839,1085,24.7,1529.8


### 3.2 Get the dimensions of the dataset

In [23]:

LOL.shape


(9879, 40)

### 3.3 Top 10 rows of the Data Set

In [24]:

LOL.head(10)


Unnamed: 0,gameId,blueWins,blueWardsPlaced,blueWardsDestroyed,blueFirstBlood,blueKills,blueDeaths,blueAssists,blueEliteMonsters,blueDragons,...,redTowersDestroyed,redTotalGold,redAvgLevel,redTotalExperience,redTotalMinionsKilled,redTotalJungleMinionsKilled,redGoldDiff,redExperienceDiff,redCSPerMin,redGoldPerMin
0,4519157822,0,28,2,1,9,6,11,0,0,...,0,16567,6.8,17047,197,55,-643,8,19.7,1656.7
1,4523371949,0,12,1,0,5,5,5,0,0,...,1,17620,6.8,17438,240,52,2908,1173,24.0,1762.0
2,4521474530,0,15,0,0,7,11,4,1,1,...,0,17285,6.8,17254,203,28,1172,1033,20.3,1728.5
3,4524384067,0,43,1,0,4,5,5,1,0,...,0,16478,7.0,17961,235,47,1321,7,23.5,1647.8
4,4436033771,0,75,4,0,6,6,6,0,0,...,0,17404,7.0,18313,225,67,1004,-230,22.5,1740.4
5,4475365709,1,18,0,0,5,3,6,1,1,...,0,15201,7.0,18060,221,59,-698,-101,22.1,1520.1
6,4493010632,1,18,3,1,7,6,7,1,1,...,0,14463,6.4,15404,164,35,-2411,-1563,16.4,1446.3
7,4496759358,0,16,2,0,5,13,3,0,0,...,0,17920,6.6,16938,157,54,2615,800,15.7,1792.0
8,4443048030,0,16,3,0,7,7,8,0,0,...,0,18380,7.2,19298,240,53,1979,771,24.0,1838.0
9,4509433346,1,13,1,1,4,5,5,1,1,...,0,16605,6.8,18379,247,43,1548,1574,24.7,1660.5


### 3.4 Bottom 5 rows of the Data Set

In [25]:

LOL.tail()


Unnamed: 0,gameId,blueWins,blueWardsPlaced,blueWardsDestroyed,blueFirstBlood,blueKills,blueDeaths,blueAssists,blueEliteMonsters,blueDragons,...,redTowersDestroyed,redTotalGold,redAvgLevel,redTotalExperience,redTotalMinionsKilled,redTotalJungleMinionsKilled,redGoldDiff,redExperienceDiff,redCSPerMin,redGoldPerMin
9874,4527873286,1,17,2,1,7,4,5,1,1,...,0,15246,6.8,16498,229,34,-2519,-2469,22.9,1524.6
9875,4527797466,1,54,0,0,6,4,8,1,1,...,0,15456,7.0,18367,206,56,-782,-888,20.6,1545.6
9876,4527713716,0,23,1,0,6,7,5,0,0,...,0,18319,7.4,19909,261,60,2416,1877,26.1,1831.9
9877,4527628313,0,14,4,1,2,3,3,1,1,...,0,15298,7.2,18314,247,40,839,1085,24.7,1529.8
9878,4523772935,1,18,0,1,6,6,5,0,0,...,0,15339,6.8,17379,201,46,-927,58,20.1,1533.9


### 3.5 Get all column names of the Data Set

In [26]:

LOL.columns


Index(['gameId', 'blueWins', 'blueWardsPlaced', 'blueWardsDestroyed',
       'blueFirstBlood', 'blueKills', 'blueDeaths', 'blueAssists',
       'blueEliteMonsters', 'blueDragons', 'blueHeralds',
       'blueTowersDestroyed', 'blueTotalGold', 'blueAvgLevel',
       'blueTotalExperience', 'blueTotalMinionsKilled',
       'blueTotalJungleMinionsKilled', 'blueGoldDiff', 'blueExperienceDiff',
       'blueCSPerMin', 'blueGoldPerMin', 'redWardsPlaced', 'redWardsDestroyed',
       'redFirstBlood', 'redKills', 'redDeaths', 'redAssists',
       'redEliteMonsters', 'redDragons', 'redHeralds', 'redTowersDestroyed',
       'redTotalGold', 'redAvgLevel', 'redTotalExperience',
       'redTotalMinionsKilled', 'redTotalJungleMinionsKilled', 'redGoldDiff',
       'redExperienceDiff', 'redCSPerMin', 'redGoldPerMin'],
      dtype='object')

### 3.6 Get the statistical summary of the data

In [27]:

LOL.describe()


Unnamed: 0,gameId,blueWins,blueWardsPlaced,blueWardsDestroyed,blueFirstBlood,blueKills,blueDeaths,blueAssists,blueEliteMonsters,blueDragons,...,redTowersDestroyed,redTotalGold,redAvgLevel,redTotalExperience,redTotalMinionsKilled,redTotalJungleMinionsKilled,redGoldDiff,redExperienceDiff,redCSPerMin,redGoldPerMin
count,9879.0,9879.0,9879.0,9879.0,9879.0,9879.0,9879.0,9879.0,9879.0,9879.0,...,9879.0,9879.0,9879.0,9879.0,9879.0,9879.0,9879.0,9879.0,9879.0,9879.0
mean,4500084000.0,0.499038,22.288288,2.824881,0.504808,6.183925,6.137666,6.645106,0.549954,0.36198,...,0.043021,16489.041401,6.925316,17961.730438,217.349226,51.313088,-14.414111,33.620306,21.734923,1648.90414
std,27573280.0,0.500024,18.019177,2.174998,0.500002,3.011028,2.933818,4.06452,0.625527,0.480597,...,0.2169,1490.888406,0.305311,1198.583912,21.911668,10.027885,2453.349179,1920.370438,2.191167,149.088841
min,4295358000.0,0.0,5.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,11212.0,4.8,10465.0,107.0,4.0,-11467.0,-8348.0,10.7,1121.2
25%,4483301000.0,0.0,14.0,1.0,0.0,4.0,4.0,4.0,0.0,0.0,...,0.0,15427.5,6.8,17209.5,203.0,44.0,-1596.0,-1212.0,20.3,1542.75
50%,4510920000.0,0.0,16.0,3.0,1.0,6.0,6.0,6.0,0.0,0.0,...,0.0,16378.0,7.0,17974.0,218.0,51.0,-14.0,28.0,21.8,1637.8
75%,4521733000.0,1.0,20.0,4.0,1.0,8.0,8.0,9.0,1.0,1.0,...,0.0,17418.5,7.2,18764.5,233.0,57.0,1585.5,1290.5,23.3,1741.85
max,4527991000.0,1.0,250.0,27.0,1.0,22.0,22.0,29.0,2.0,1.0,...,2.0,22732.0,8.2,22269.0,289.0,92.0,10830.0,9333.0,28.9,2273.2


### 3.7 Get the information related to the Data Frame

In [28]:

LOL.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9879 entries, 0 to 9878
Data columns (total 40 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   gameId                        9879 non-null   int64  
 1   blueWins                      9879 non-null   int64  
 2   blueWardsPlaced               9879 non-null   int64  
 3   blueWardsDestroyed            9879 non-null   int64  
 4   blueFirstBlood                9879 non-null   int64  
 5   blueKills                     9879 non-null   int64  
 6   blueDeaths                    9879 non-null   int64  
 7   blueAssists                   9879 non-null   int64  
 8   blueEliteMonsters             9879 non-null   int64  
 9   blueDragons                   9879 non-null   int64  
 10  blueHeralds                   9879 non-null   int64  
 11  blueTowersDestroyed           9879 non-null   int64  
 12  blueTotalGold                 9879 non-null   int64  
 13  blu

### 3.8 Transposing the Dataframe

In [29]:

LOL.T.head()


Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,9869,9870,9871,9872,9873,9874,9875,9876,9877,9878
gameId,4519158000.0,4523372000.0,4521475000.0,4524384000.0,4436034000.0,4475366000.0,4493011000.0,4496759000.0,4443048000.0,4509433000.0,...,4527875000.0,4527811000.0,4527716000.0,4527650000.0,4527878000.0,4527873000.0,4527797000.0,4527714000.0,4527628000.0,4523773000.0
blueWins,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0.0,0.0,1.0
blueWardsPlaced,28.0,12.0,15.0,43.0,75.0,18.0,18.0,16.0,16.0,13.0,...,12.0,46.0,12.0,12.0,18.0,17.0,54.0,23.0,14.0,18.0
blueWardsDestroyed,2.0,1.0,0.0,1.0,4.0,0.0,3.0,2.0,3.0,1.0,...,1.0,2.0,2.0,0.0,2.0,2.0,0.0,1.0,4.0,0.0
blueFirstBlood,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,...,0.0,1.0,0.0,1.0,1.0,1.0,0.0,0.0,1.0,1.0


### 3.9 Get columns using column names

In [30]:

LOL.loc[:,['gameId','redKills','blueKills']].tail(10)


Unnamed: 0,gameId,redKills,blueKills
9869,4527875317,12,9
9870,4527811425,3,5
9871,4527715781,5,4
9872,4527650398,7,7
9873,4527878058,6,12
9874,4527873286,4,7
9875,4527797466,4,6
9876,4527713716,7,6
9877,4527628313,3,2
9878,4523772935,6,6


### 3.10 Get columns using position

In [31]:

LOL.iloc[:,-1]


0       1656.7
1       1762.0
2       1728.5
3       1647.8
4       1740.4
         ...  
9874    1524.6
9875    1545.6
9876    1831.9
9877    1529.8
9878    1533.9
Name: redGoldPerMin, Length: 9879, dtype: float64
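
`.loc` selects by label and `.iloc` by integer position. The sketch below contrasts the two on the small marks table from section 2.3, so it runs without the CSV file; the result variable names are illustrative.

```python
import pandas as pd

marks = pd.DataFrame(
    {'Name': ['Ankit', 'Rishitha', 'Karthik', 'Vishnu'],
     'Marks': [78, 67, 98, 56]},
    index=['Rank 2', 'Rank 3', 'Rank 1', 'Rank 4'])

# Label-based: row labeled 'Rank 1', column labeled 'Name'
top_name = marks.loc['Rank 1', 'Name']

# Position-based: third row, last column
top_marks = marks.iloc[2, -1]

# Label slices include both endpoints; position slices exclude the stop
label_slice = marks.loc['Rank 2':'Rank 1']
position_slice = marks.iloc[0:2]

print(top_name, top_marks, len(label_slice), len(position_slice))
```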

### 3.11 Get the mean of all the columns in the dataset

In [32]:

LOL.mean()


gameId                          4.500084e+09
blueWins                        4.990384e-01
blueWardsPlaced                 2.228829e+01
blueWardsDestroyed              2.824881e+00
blueFirstBlood                  5.048082e-01
blueKills                       6.183925e+00
blueDeaths                      6.137666e+00
blueAssists                     6.645106e+00
blueEliteMonsters               5.499544e-01
blueDragons                     3.619800e-01
blueHeralds                     1.879745e-01
blueTowersDestroyed             5.142221e-02
blueTotalGold                   1.650346e+04
blueAvgLevel                    6.916004e+00
blueTotalExperience             1.792811e+04
blueTotalMinionsKilled          2.166996e+02
blueTotalJungleMinionsKilled    5.050967e+01
blueGoldDiff                    1.441411e+01
blueExperienceDiff             -3.362031e+01
blueCSPerMin                    2.166996e+01
blueGoldPerMin                  1.650346e+03
redWardsPlaced                  2.236795e+01
redWardsDe

### 3.12 Get the correlation between all the columns in the dataset

In [33]:

LOL.corr()
 

Unnamed: 0,gameId,blueWins,blueWardsPlaced,blueWardsDestroyed,blueFirstBlood,blueKills,blueDeaths,blueAssists,blueEliteMonsters,blueDragons,...,redTowersDestroyed,redTotalGold,redAvgLevel,redTotalExperience,redTotalMinionsKilled,redTotalJungleMinionsKilled,redGoldDiff,redExperienceDiff,redCSPerMin,redGoldPerMin
gameId,1.0,0.000985,0.005361,-0.012057,-0.011577,-0.038993,-0.01316,-0.023329,0.016599,0.008962,...,0.003557,-0.010622,-0.012419,-0.021187,-0.005118,0.00604,0.01467,0.012315,-0.005118,-0.010622
blueWins,0.000985,1.0,8.7e-05,0.044247,0.201769,0.337358,-0.339297,0.276685,0.221944,0.213768,...,-0.103696,-0.411396,-0.352127,-0.387588,-0.212171,-0.110994,-0.511119,-0.489558,-0.212171,-0.411396
blueWardsPlaced,0.005361,8.7e-05,1.0,0.034447,0.003228,0.018138,-0.002612,0.033217,0.019892,0.017676,...,-0.008225,-0.005685,-0.008882,-0.013,-0.012395,0.001224,-0.0158,-0.027943,-0.012395,-0.005685
blueWardsDestroyed,-0.012057,0.044247,0.034447,1.0,0.017717,0.033748,-0.073182,0.067793,0.0417,0.040504,...,-0.023943,-0.067467,-0.05909,-0.057314,0.040023,-0.035732,-0.078585,-0.077946,0.040023,-0.067467
blueFirstBlood,-0.011577,0.201769,0.003228,0.017717,1.0,0.269425,-0.247929,0.229485,0.151603,0.134309,...,-0.069584,-0.301479,-0.182602,-0.19492,-0.156711,-0.024559,-0.378511,-0.240665,-0.156711,-0.301479
blueKills,-0.038993,0.337358,0.018138,0.033748,0.269425,1.0,0.004044,0.813667,0.17854,0.170436,...,-0.082491,-0.161127,-0.412219,-0.462333,-0.472203,-0.214454,-0.654148,-0.58373,-0.472203,-0.161127
blueDeaths,-0.01316,-0.339297,-0.002612,-0.073182,-0.247929,0.004044,1.0,-0.026372,-0.204764,-0.188852,...,0.15678,0.885728,0.433383,0.464584,-0.040521,-0.100271,0.64,0.577613,-0.040521,0.885728
blueAssists,-0.023329,0.276685,0.033217,0.067793,0.229485,0.813667,-0.026372,1.0,0.149043,0.170873,...,-0.06088,-0.133948,-0.356928,-0.396652,-0.337515,-0.160915,-0.549761,-0.437002,-0.337515,-0.133948
blueEliteMonsters,0.016599,0.221944,0.019892,0.0417,0.151603,0.17854,-0.204764,0.149043,1.0,0.781039,...,-0.052029,-0.216616,-0.169649,-0.189816,-0.074838,-0.087893,-0.281464,-0.263991,-0.074838,-0.216616
blueDragons,0.008962,0.213768,0.017676,0.040504,0.134309,0.170436,-0.188852,0.170873,0.781039,1.0,...,-0.032865,-0.192871,-0.149806,-0.159485,-0.059803,-0.098446,-0.233875,-0.211496,-0.059803,-0.192871


### 3.13 Get the maximum value of each column in the dataset

In [34]:

LOL.max().head(10)


gameId                4.527991e+09
blueWins              1.000000e+00
blueWardsPlaced       2.500000e+02
blueWardsDestroyed    2.700000e+01
blueFirstBlood        1.000000e+00
blueKills             2.200000e+01
blueDeaths            2.200000e+01
blueAssists           2.900000e+01
blueEliteMonsters     2.000000e+00
blueDragons           1.000000e+00
dtype: float64

### 3.14 Get the minimum value of each column in the dataset

In [35]:

LOL.min().tail(10)
#print(type(LOL))


redTowersDestroyed                 0.0
redTotalGold                   11212.0
redAvgLevel                        4.8
redTotalExperience             10465.0
redTotalMinionsKilled            107.0
redTotalJungleMinionsKilled        4.0
redGoldDiff                   -11467.0
redExperienceDiff              -8348.0
redCSPerMin                       10.7
redGoldPerMin                   1121.2
dtype: float64

### 3.15 Get the median of the Dataset

In [36]:

LOL.median().head(13)


gameId                 4.510920e+09
blueWins               0.000000e+00
blueWardsPlaced        1.600000e+01
blueWardsDestroyed     3.000000e+00
blueFirstBlood         1.000000e+00
blueKills              6.000000e+00
blueDeaths             6.000000e+00
blueAssists            6.000000e+00
blueEliteMonsters      0.000000e+00
blueDragons            0.000000e+00
blueHeralds            0.000000e+00
blueTowersDestroyed    0.000000e+00
blueTotalGold          1.639800e+04
dtype: float64

### 3.16 Get the standard deviation of the dataset

In [37]:

LOL.std().head(10)


gameId                2.757328e+07
blueWins              5.000244e-01
blueWardsPlaced       1.801918e+01
blueWardsDestroyed    2.174998e+00
blueFirstBlood        5.000022e-01
blueKills             3.011028e+00
blueDeaths            2.933818e+00
blueAssists           4.064520e+00
blueEliteMonsters     6.255265e-01
blueDragons           4.805974e-01
dtype: float64

### 3.17 Append the dataset to itself

In [38]:

print(LOL.shape)

# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
LOL_temp = pd.concat([LOL, LOL])

print(LOL_temp.shape)


(9879, 40)
(19758, 40)

### 3.18 Drop the duplicates present in the dataset.

In [None]:

print(LOL_temp.shape)

LOL_temp = LOL_temp.drop_duplicates()

print(LOL_temp.shape)
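
The effect of stacking a frame on itself and then removing duplicates can be seen without the CSV file. This is a minimal sketch on a made-up three-row frame; the `gameId` values and the variable names are illustrative, not taken from the dataset.

```python
import pandas as pd

games = pd.DataFrame({'gameId': [101, 102, 103], 'blueWins': [0, 1, 1]})

# Stack the frame on itself; ignore_index=True renumbers the rows 0..5
doubled = pd.concat([games, games], ignore_index=True)

# drop_duplicates keeps the first occurrence of each repeated row
deduplicated = doubled.drop_duplicates()

print(doubled.shape, deduplicated.shape)
```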


### 3.19 isnull: returns True for each cell that is null and False otherwise

In [None]:
#import pandas as pd

#LOL = pd.read_csv('League_of_Legends.csv')
LOL.isnull()


### 3.20 Count the null values in each column

In [None]:

LOL.isnull().sum()
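
Because `True` counts as 1 when summed, `isnull().sum()` gives the number of missing cells per column. A small sketch on an illustrative frame (the column names `x` and `y` are assumptions, not part of the dataset):

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'x': [1.0, np.nan, 3.0], 'y': ['a', None, None]})

mask = df.isnull()               # Boolean frame, True where a value is missing
per_column = df.isnull().sum()   # number of missing values in each column
total = int(per_column.sum())    # number of missing values in the whole frame

print(per_column)
print(total)
```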


### 3.21 Drop NA values (delete rows)

In [None]:
import pandas as pd
import numpy as np

Data9 = pd.DataFrame({"Name":["Iron-Man","Wonder-Woman","Avengers", "Abc"],
                     "House":["Marvel","DC Comics","Marvel", np.nan],
                     "Start":[pd.NaT,pd.Timestamp("2017-05-15"),pd.NaT,pd.NaT]})

Data9


In [None]:

Data9.dropna()


### 3.22 Drop the columns that contain null values

In [None]:

Data9.dropna(axis = 'columns')


### 3.23 Drop a row only if ALL of its values are null

In [None]:

Data9.dropna(how = 'all')


### 3.24 Drop a row if ANY of its values is null

In [None]:

Data9.dropna(how = 'any')
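
The `dropna` variants differ only in which axis they scan and how many nulls trigger a drop. The sketch below compares them on a small illustrative frame (not the one above) whose first row is complete, whose second row has one null, and whose third row is entirely null. The `thresh` call is an extra option not used elsewhere in this notebook.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'P': [1.0, 2.0, np.nan],
                   'Q': [4.0, np.nan, np.nan],
                   'R': [7.0, 8.0, np.nan]})

any_null_rows = df.dropna()                    # default how='any': drop rows with at least one null
all_null_rows = df.dropna(how='all')           # drop only rows where every value is null
complete_columns = df.dropna(axis='columns')   # drop columns with at least one null
at_least_two = df.dropna(thresh=2)             # keep rows with two or more non-null values

print(any_null_rows.shape, all_null_rows.shape,
      complete_columns.shape, at_least_two.shape)
```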


### 3.25 Fill the null values with 0

In [None]:
import pandas as pd
import numpy as np

Data10 = pd.DataFrame([[3,np.nan,4,2],[5,2,np.nan,9],
                       [np.nan,np.nan,7,np.nan],[4,np.nan,5,np.nan]]
                      ,columns=list('PQRS'))
Data10

In [None]:

Data10.fillna(0)


### 3.26 Fill null values with a different value per column

In [None]:

Replace_Values = {'P':10,'Q':11,'R':12,'S':13}

Data10.fillna(Replace_Values)


### 3.27 Fill at most one null value per column (limit = 1)

In [None]:

Data11=Data10.fillna(Replace_Values, limit = 1)
Data11
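
With a dictionary of replacement values, `limit = 1` is applied column by column, so only the first null in each column is filled. The sketch below rebuilds the same data under illustrative names (`df`, `replacements`) and counts the nulls that remain.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame([[3, np.nan, 4, 2], [5, 2, np.nan, 9],
                   [np.nan, np.nan, 7, np.nan], [4, np.nan, 5, np.nan]],
                  columns=list('PQRS'))
replacements = {'P': 10, 'Q': 11, 'R': 12, 'S': 13}

filled_all = df.fillna(replacements)             # every null is replaced
filled_once = df.fillna(replacements, limit=1)   # at most one null per column is replaced

# Nulls remaining before and after the limited fill
print(int(df.isna().sum().sum()), int(filled_once.isna().sum().sum()))
```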

### 3.28 Calculate the mean of a column (NA values are ignored)

In [None]:

Mean1 = Data10['R'].mean()
Mean1


### 3.29 Fill the missing values with the calculated mean

In [None]:

# inplace=False returns a new Series; Data10 itself is left unchanged.
Data10['R'].fillna(Mean1,inplace= False)


In [None]:
Data10


### 3.30 Describe the dataset

In [None]:

Data10.describe()


### 3.31 Describe a single column of the dataset

In [None]:

Data10['P'].describe()


### 3.32 Fill the missing values in every column with its mean

In [None]:
Mean1 = Data10['P'].mean()
Mean2 = Data10['Q'].mean()
Mean3 = Data10['R'].mean()
Mean4 = Data10['S'].mean()


In [None]:
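# Assign the result back to the column; calling fillna(..., inplace=True)
# on a selected column is deprecated and may not update Data10 itself.
Data10['P'] = Data10['P'].fillna(Mean1)
Data10['Q'] = Data10['Q'].fillna(Mean2)
Data10['R'] = Data10['R'].fillna(Mean3)
Data10['S'] = Data10['S'].fillna(Mean4)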


In [None]:

Data10


In [None]:
Data10['P'].mean()
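
The four per-column fills can also be written as a single call, because `fillna` accepts a Series of per-column values such as the one returned by `mean()`. A minimal sketch on a fresh copy of the same data, with illustrative variable names:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame([[3, np.nan, 4, 2], [5, 2, np.nan, 9],
                   [np.nan, np.nan, 7, np.nan], [4, np.nan, 5, np.nan]],
                  columns=list('PQRS'))

# mean() skips nulls by default and returns one value per column
column_means = df.mean()

# Each column's nulls are filled with that column's own mean
filled = df.fillna(column_means)

print(filled)
```

Filling with the mean leaves each column's mean unchanged, which is why the final `Data10['P'].mean()` matches the value computed before the fill.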