![cover](cover/06.%20Fillna%20Method.png)

#### Outline
* Value
* Fill Methods
  * ffill
  * bfill
* Axis
  * Column
  * Row
* Limit
* Inplace

##### Import Pandas

In [1]:
import pandas

### The FillNA Method
`dataset.fillna(x)`

Replaces missing values (`NaN`) in the dataset with the value passed as the argument.

In [2]:
zoo = pandas.read_csv("./datasets/zoo.csv")
zoo

Unnamed: 0,Exhibit,Animal,Daily_Visitors,Profit,Rating,Opening_Year
0,1,Lions,320.0,2500.0,90.0,2015.0
1,2,Elephants,400.0,,85.0,2010.0
2,3,Reptiles,290.0,1800.0,,2012.0
3,4,Birds,,1500.0,88.0,2018.0
4,5,Monkeys,370.0,2100.0,92.0,
5,6,Tigers,,2300.0,86.0,2016.0
6,7,Otters,340.0,,89.0,2012.0
7,8,Snakes,,1700.0,80.0,2011.0
8,9,Giraffes,420.0,2600.0,,2014.0
9,10,Penguins,380.0,2200.0,91.0,2017.0


### Value

`dataset.fillna(x)`

Replaces every missing value directly with the given argument `x`.

In [3]:
zoo.fillna("Missing Value")     # replace missing values in the dataset with the string "Missing Value"

Unnamed: 0,Exhibit,Animal,Daily_Visitors,Profit,Rating,Opening_Year
0,1,Lions,320.0,2500.0,90.0,2015.0
1,2,Elephants,400.0,Missing Value,85.0,2010.0
2,3,Reptiles,290.0,1800.0,Missing Value,2012.0
3,4,Birds,Missing Value,1500.0,88.0,2018.0
4,5,Monkeys,370.0,2100.0,92.0,Missing Value
5,6,Tigers,Missing Value,2300.0,86.0,2016.0
6,7,Otters,340.0,Missing Value,89.0,2012.0
7,8,Snakes,Missing Value,1700.0,80.0,2011.0
8,9,Giraffes,420.0,2600.0,Missing Value,2014.0
9,10,Penguins,380.0,2200.0,91.0,2017.0


In [4]:
zoo.fillna(7)   # fill in missing values with the number 7

Unnamed: 0,Exhibit,Animal,Daily_Visitors,Profit,Rating,Opening_Year
0,1,Lions,320.0,2500.0,90.0,2015.0
1,2,Elephants,400.0,7.0,85.0,2010.0
2,3,Reptiles,290.0,1800.0,7.0,2012.0
3,4,Birds,7.0,1500.0,88.0,2018.0
4,5,Monkeys,370.0,2100.0,92.0,7.0
5,6,Tigers,7.0,2300.0,86.0,2016.0
6,7,Otters,340.0,7.0,89.0,2012.0
7,8,Snakes,7.0,1700.0,80.0,2011.0
8,9,Giraffes,420.0,2600.0,7.0,2014.0
9,10,Penguins,380.0,2200.0,91.0,2017.0
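
The two outputs above differ in one detail worth checking: the number 7 appears as `7.0`, while a string cannot be stored in a numeric column at all. The following is a minimal sketch on a small, made-up `profits` table (a hypothetical stand-in, not the zoo dataset) that compares the column type after each kind of fill.

```python
import pandas

# Hypothetical sample data, used only for illustration
profits = pandas.DataFrame({"Profit": [2500.0, None, 1800.0]})

numeric_fill = profits.fillna(7)             # number into a numeric column
text_fill = profits.fillna("Missing Value")  # string into a numeric column

# The numeric fill keeps the float column, so 7 is stored as 7.0
print(numeric_fill["Profit"].dtype)

# The text fill can no longer be a float column, because it now mixes
# numbers and strings
print(text_fill["Profit"].dtype)
```

A column that mixes numbers and text generally cannot be used directly for calculations such as `mean()`, so a numeric replacement is often the safer choice for numeric columns.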


#### Using Dictionary
`dataset.fillna({` `col1 : val,` 
`col2 : val,` 
`... ,` 
`coln : val` `})` 

Specifies, for each column, the value used to replace its missing data. Columns not listed in the dictionary are left unchanged.

In [5]:
zoo

Unnamed: 0,Exhibit,Animal,Daily_Visitors,Profit,Rating,Opening_Year
0,1,Lions,320.0,2500.0,90.0,2015.0
1,2,Elephants,400.0,,85.0,2010.0
2,3,Reptiles,290.0,1800.0,,2012.0
3,4,Birds,,1500.0,88.0,2018.0
4,5,Monkeys,370.0,2100.0,92.0,
5,6,Tigers,,2300.0,86.0,2016.0
6,7,Otters,340.0,,89.0,2012.0
7,8,Snakes,,1700.0,80.0,2011.0
8,9,Giraffes,420.0,2600.0,,2014.0
9,10,Penguins,380.0,2200.0,91.0,2017.0


In [6]:
zoo.fillna({ "Animal" : "Unknown", # we can individually set the replacement values for each column
            "Daily_Visitors" : 0.0, 
            "Profit" : 0.0, 
            "Rating" : 0.0, 
            "Opening_Year" : 2000.0 })

Unnamed: 0,Exhibit,Animal,Daily_Visitors,Profit,Rating,Opening_Year
0,1,Lions,320.0,2500.0,90.0,2015.0
1,2,Elephants,400.0,0.0,85.0,2010.0
2,3,Reptiles,290.0,1800.0,0.0,2012.0
3,4,Birds,0.0,1500.0,88.0,2018.0
4,5,Monkeys,370.0,2100.0,92.0,2000.0
5,6,Tigers,0.0,2300.0,86.0,2016.0
6,7,Otters,340.0,0.0,89.0,2012.0
7,8,Snakes,0.0,1700.0,80.0,2011.0
8,9,Giraffes,420.0,2600.0,0.0,2014.0
9,10,Penguins,380.0,2200.0,91.0,2017.0
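
As a small sketch of how the dictionary form treats columns that are not listed, the example below uses a made-up `animals` table (hypothetical data, not the zoo dataset). Only `Animal` and `Profit` appear in the dictionary, so the missing value in `Rating` should stay missing.

```python
import pandas

# Hypothetical sample data, used only for illustration
animals = pandas.DataFrame({
    "Animal": ["Lions", None, "Birds"],
    "Profit": [2500.0, None, 1500.0],
    "Rating": [90.0, 85.0, None],
})

# "Rating" is deliberately left out of the dictionary
filled = animals.fillna({"Animal": "Unknown", "Profit": 0.0})

print(filled)
print(filled.isna().sum())   # count of remaining missing values per column
```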


### Individual Columns
`dataset[col].fillna(x)`

In [7]:
zoo

Unnamed: 0,Exhibit,Animal,Daily_Visitors,Profit,Rating,Opening_Year
0,1,Lions,320.0,2500.0,90.0,2015.0
1,2,Elephants,400.0,,85.0,2010.0
2,3,Reptiles,290.0,1800.0,,2012.0
3,4,Birds,,1500.0,88.0,2018.0
4,5,Monkeys,370.0,2100.0,92.0,
5,6,Tigers,,2300.0,86.0,2016.0
6,7,Otters,340.0,,89.0,2012.0
7,8,Snakes,,1700.0,80.0,2011.0
8,9,Giraffes,420.0,2600.0,,2014.0
9,10,Penguins,380.0,2200.0,91.0,2017.0


In [8]:
zoo["Daily_Visitors"].fillna(zoo["Daily_Visitors"].median())    # replace missing values in the daily visitors column with the median of that column

0    320.0
1    400.0
2    290.0
3    370.0
4    370.0
5    370.0
6    340.0
7    370.0
8    420.0
9    380.0
Name: Daily_Visitors, dtype: float64

In [9]:
zoo["Profit"].fillna(zoo["Profit"].mean())      # replace missing values in the profit column with the mean of that column

0    2500.0
1    2087.5
2    1800.0
3    1500.0
4    2100.0
5    2300.0
6    2087.5
7    1700.0
8    2600.0
9    2200.0
Name: Profit, dtype: float64
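
The calls above return a new column and do not change `zoo` itself. To keep the filled values, one option is to assign the result back to the column. The sketch below assumes a small, made-up `zoo_sample` table (hypothetical data) rather than the real dataset.

```python
import pandas

# Hypothetical sample data, used only for illustration
zoo_sample = pandas.DataFrame({
    "Animal": ["Lions", "Elephants", "Birds"],
    "Daily_Visitors": [320.0, 400.0, None],
})

# median() skips missing values, so it is computed from 320.0 and 400.0
median_visitors = zoo_sample["Daily_Visitors"].median()

# Assign the filled column back so the table keeps the result
zoo_sample["Daily_Visitors"] = zoo_sample["Daily_Visitors"].fillna(median_visitors)

print(zoo_sample)
```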

### Fill Methods

Specifies which method is used to fill in missing values.

In [10]:
weather = pandas.read_csv("./datasets/weather.csv")
weather

Unnamed: 0,Date,Temperature,Humidity,Wind_Speed,PM2.5,Weather
0,2025-10-25,12.0,80.0,10.0,45.0,Cloudy
1,2025-10-26,14.0,75.0,,50.0,Sunny
2,2025-10-27,,82.0,15.0,55.0,Rainy
3,2025-10-28,13.0,,18.0,60.0,Cloudy
4,2025-10-29,11.0,85.0,20.0,,Rainy
5,2025-10-30,9.0,88.0,22.0,70.0,
6,2025-10-31,8.0,90.0,,68.0,Cloudy


#### ffill

`dataset.ffill()`

Forward fill: replaces each missing value with the most recent non-missing value from the rows above it.

In [11]:
weather.ffill()     # values above are used to fill in missing values below

Unnamed: 0,Date,Temperature,Humidity,Wind_Speed,PM2.5,Weather
0,2025-10-25,12.0,80.0,10.0,45.0,Cloudy
1,2025-10-26,14.0,75.0,10.0,50.0,Sunny
2,2025-10-27,14.0,82.0,15.0,55.0,Rainy
3,2025-10-28,13.0,82.0,18.0,60.0,Cloudy
4,2025-10-29,11.0,85.0,20.0,60.0,Rainy
5,2025-10-30,9.0,88.0,22.0,70.0,Rainy
6,2025-10-31,8.0,90.0,22.0,68.0,Cloudy


#### bfill

`dataset.bfill()`

Backward fill: replaces each missing value with the next non-missing value from the rows below it.

In [12]:
weather.bfill()     # values below are used to fill in missing values above

Unnamed: 0,Date,Temperature,Humidity,Wind_Speed,PM2.5,Weather
0,2025-10-25,12.0,80.0,10.0,45.0,Cloudy
1,2025-10-26,14.0,75.0,15.0,50.0,Sunny
2,2025-10-27,13.0,82.0,15.0,55.0,Rainy
3,2025-10-28,13.0,85.0,18.0,60.0,Cloudy
4,2025-10-29,11.0,85.0,20.0,70.0,Rainy
5,2025-10-30,9.0,88.0,22.0,70.0,Cloudy
6,2025-10-31,8.0,90.0,,68.0,Cloudy
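
In the `bfill` output above, the last `Wind_Speed` value stays missing because there is no row below it to copy from. The same happens to a leading gap with `ffill`. The sketch below, on a made-up `wind` column (hypothetical data), shows both edge cases and one possible way to cover them by chaining the two methods.

```python
import pandas

# Hypothetical sample data: gaps at the start, in the middle, and at the end
wind = pandas.Series([None, 10.0, None, None, 22.0, None], name="Wind_Speed")

forward = wind.ffill()        # the leading gap has nothing above it
backward = wind.bfill()       # the trailing gap has nothing below it
both = wind.ffill().bfill()   # forward fill first, then backward fill the rest

print(forward.tolist())
print(backward.tolist())
print(both.tolist())          # [10.0, 10.0, 10.0, 10.0, 22.0, 22.0]
```

Whether chaining is appropriate depends on the data; for time series it means the earliest rows borrow a value from the future.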


### Axis
`axis = n`

Specifies the axis along which values are filled: down each column (`axis = 0`, the default) or across each row (`axis = 1`).

#### Column
`axis = 0`

#### Row
`axis = 1`

In [13]:
weather.ffill(axis=1)       # each missing value is taken from the previous column in the same row

Unnamed: 0,Date,Temperature,Humidity,Wind_Speed,PM2.5,Weather
0,2025-10-25,12.0,80.0,10.0,45.0,Cloudy
1,2025-10-26,14.0,75.0,75.0,50.0,Sunny
2,2025-10-27,2025-10-27,82.0,15.0,55.0,Rainy
3,2025-10-28,13.0,13.0,18.0,60.0,Cloudy
4,2025-10-29,11.0,85.0,20.0,20.0,Rainy
5,2025-10-30,9.0,88.0,22.0,70.0,70.0
6,2025-10-31,8.0,90.0,90.0,68.0,Cloudy


In [14]:
weather.bfill(axis=1)       # each missing value is taken from the next column in the same row

Unnamed: 0,Date,Temperature,Humidity,Wind_Speed,PM2.5,Weather
0,2025-10-25,12.0,80.0,10.0,45.0,Cloudy
1,2025-10-26,14.0,75.0,50.0,50.0,Sunny
2,2025-10-27,82.0,82.0,15.0,55.0,Rainy
3,2025-10-28,13.0,18.0,18.0,60.0,Cloudy
4,2025-10-29,11.0,85.0,20.0,Rainy,Rainy
5,2025-10-30,9.0,88.0,22.0,70.0,
6,2025-10-31,8.0,90.0,68.0,68.0,Cloudy
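
Filling across a row makes the most sense when neighbouring columns hold comparable values, which is not the case in the weather table (a date ends up in `Temperature`). As a sketch, the made-up `readings` table below (hypothetical data, with all columns in the same unit) compares the two axes side by side.

```python
import pandas

# Hypothetical sample data: three temperature readings per day
readings = pandas.DataFrame({
    "Morning": [12.0, None, 9.0],
    "Noon": [None, 15.0, 11.0],
    "Evening": [10.0, 13.0, None],
})

down = readings.ffill(axis=0)     # copy from the row above, column by column
across = readings.ffill(axis=1)   # copy from the column to the left, row by row

print(down)
print(across)
```

With `axis=0` the first `Noon` value stays missing (no row above it); with `axis=1` the second `Morning` value stays missing (no column to its left).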


### Limit

`limit = n`

Specifies the maximum number of missing values, `n`, that can be filled in each column.


In [15]:
flights = pandas.read_csv("./datasets/flights.csv")
flights

Unnamed: 0,Flight_ID,Source,Destination,Departure_Time,Passengers,Plane_Type
0,PK101,Krakow,Paris,07:45,150.0,Airbus A320
1,PK102,Paris,London,09:15,,Boeing 737
2,PK103,London,Berlin,11:00,,Embraer 190
3,PK104,Berlin,Krakow,,180.0,Boeing 737
4,PK105,Krakow,Rome,14:25,170.0,
5,PK106,Rome,London,16:10,155.0,Airbus A320
6,PK107,London,Krakow,18:00,,Airbus A319
7,PK108,Krakow,Berlin,20:45,140.0,Embraer 190
8,PK109,Berlin,Paris,,130.0,Boeing 737
9,PK110,Paris,Krakow,23:30,160.0,


In [16]:
flights.fillna("missing", limit=2)  # only the first two missing values in each column are replaced

Unnamed: 0,Flight_ID,Source,Destination,Departure_Time,Passengers,Plane_Type
0,PK101,Krakow,Paris,07:45,150.0,Airbus A320
1,PK102,Paris,London,09:15,missing,Boeing 737
2,PK103,London,Berlin,11:00,missing,Embraer 190
3,PK104,Berlin,Krakow,missing,180.0,Boeing 737
4,PK105,Krakow,Rome,14:25,170.0,missing
5,PK106,Rome,London,16:10,155.0,Airbus A320
6,PK107,London,Krakow,18:00,,Airbus A319
7,PK108,Krakow,Berlin,20:45,140.0,Embraer 190
8,PK109,Berlin,Paris,missing,130.0,Boeing 737
9,PK110,Paris,Krakow,23:30,160.0,missing


In [17]:
flights.fillna("missing", limit=1)      # only the first missing value in each column is replaced

Unnamed: 0,Flight_ID,Source,Destination,Departure_Time,Passengers,Plane_Type
0,PK101,Krakow,Paris,07:45,150.0,Airbus A320
1,PK102,Paris,London,09:15,missing,Boeing 737
2,PK103,London,Berlin,11:00,,Embraer 190
3,PK104,Berlin,Krakow,missing,180.0,Boeing 737
4,PK105,Krakow,Rome,14:25,170.0,missing
5,PK106,Rome,London,16:10,155.0,Airbus A320
6,PK107,London,Krakow,18:00,,Airbus A319
7,PK108,Krakow,Berlin,20:45,140.0,Embraer 190
8,PK109,Berlin,Paris,,130.0,Boeing 737
9,PK110,Paris,Krakow,23:30,160.0,
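
The outputs above suggest that the limit is counted separately for every column rather than once for the whole table. A minimal sketch on a made-up `flights_sample` table (hypothetical data and column names) makes this easier to see with only two columns.

```python
import pandas

# Hypothetical sample data, used only for illustration
flights_sample = pandas.DataFrame({
    "Passengers": [150.0, None, None, 180.0],
    "Delay_Min": [None, 5.0, None, None],
})

# limit=1: at most one missing value is filled in each column
filled = flights_sample.fillna(0.0, limit=1)

print(filled)
print(filled.isna().sum())   # missing values that remain per column
```

Each column gets its own budget of one fill, so `Passengers` keeps one missing value and `Delay_Min` keeps two.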


### Inplace

Specifies whether `fillna` updates the current dataset (`inplace=True`) or returns a new one and leaves the original unchanged (`inplace=False`, the default).

In [18]:
flights.fillna("missing", inplace=True)


In [19]:
flights     # same variable but values have changed

Unnamed: 0,Flight_ID,Source,Destination,Departure_Time,Passengers,Plane_Type
0,PK101,Krakow,Paris,07:45,150.0,Airbus A320
1,PK102,Paris,London,09:15,missing,Boeing 737
2,PK103,London,Berlin,11:00,missing,Embraer 190
3,PK104,Berlin,Krakow,missing,180.0,Boeing 737
4,PK105,Krakow,Rome,14:25,170.0,missing
5,PK106,Rome,London,16:10,155.0,Airbus A320
6,PK107,London,Krakow,18:00,missing,Airbus A319
7,PK108,Krakow,Berlin,20:45,140.0,Embraer 190
8,PK109,Berlin,Paris,missing,130.0,Boeing 737
9,PK110,Paris,Krakow,23:30,160.0,missing


In [20]:
zoo

Unnamed: 0,Exhibit,Animal,Daily_Visitors,Profit,Rating,Opening_Year
0,1,Lions,320.0,2500.0,90.0,2015.0
1,2,Elephants,400.0,,85.0,2010.0
2,3,Reptiles,290.0,1800.0,,2012.0
3,4,Birds,,1500.0,88.0,2018.0
4,5,Monkeys,370.0,2100.0,92.0,
5,6,Tigers,,2300.0,86.0,2016.0
6,7,Otters,340.0,,89.0,2012.0
7,8,Snakes,,1700.0,80.0,2011.0
8,9,Giraffes,420.0,2600.0,,2014.0
9,10,Penguins,380.0,2200.0,91.0,2017.0


In [21]:
zoo.fillna("missing", inplace=True)


In [22]:
zoo     # same variable but values have changed

Unnamed: 0,Exhibit,Animal,Daily_Visitors,Profit,Rating,Opening_Year
0,1,Lions,320.0,2500.0,90.0,2015.0
1,2,Elephants,400.0,missing,85.0,2010.0
2,3,Reptiles,290.0,1800.0,missing,2012.0
3,4,Birds,missing,1500.0,88.0,2018.0
4,5,Monkeys,370.0,2100.0,92.0,missing
5,6,Tigers,missing,2300.0,86.0,2016.0
6,7,Otters,340.0,missing,89.0,2012.0
7,8,Snakes,missing,1700.0,80.0,2011.0
8,9,Giraffes,420.0,2600.0,missing,2014.0
9,10,Penguins,380.0,2200.0,91.0,2017.0
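
To contrast the two modes directly, the sketch below uses a made-up `flights_sample` table (hypothetical data). It first calls `fillna` with the default setting and checks that the original is untouched, then repeats the call with `inplace=True`.

```python
import pandas

# Hypothetical sample data, used only for illustration
flights_sample = pandas.DataFrame({"Passengers": [150.0, None, 170.0]})

# Default (inplace=False): a new table is returned, the original is unchanged
copy_filled = flights_sample.fillna(0.0)
still_missing = int(flights_sample["Passengers"].isna().sum())
print(still_missing)          # 1

# inplace=True: the original table itself is updated
flights_sample.fillna(0.0, inplace=True)
print(flights_sample)
```

Reassigning the result, as in `flights_sample = flights_sample.fillna(0.0)`, is an alternative that reaches the same end state without `inplace`.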


### For Source code:
https://sites.google.com/view/aorbtech/programming/

#### @Aorb Tech