Handling Missing Value

There are multiple ways of handling missing data, and the right one varies case by case. There is no universally best method: dropping rows discards information, while filling values can distort the distribution of a column. Use your best judgement and try different options to determine which method suits your data set.

Drop rows that have any missing value (pass axis=1 to drop columns instead) - df.dropna()
Drop only the rows (or columns) in which every value is missing - df.dropna(how='all')
Drop rows or columns based on a threshold of non-missing values. For instance, thresh=4 keeps the rows that have at least 4 non-missing values and drops the rest - df.dropna(thresh=4)
Drop based on a particular subset of columns - df.dropna(subset=['column1','column2'])
Fill with a constant value - df.fillna(0)
Fill with an aggregated value such as the column mean - df['column1'] = df['column1'].fillna(df['column1'].mean())
Replace with the previous or next valid value - df.ffill() or df.bfill() (the older df.fillna(method='bfill') form is deprecated in recent pandas versions)
Fill by using another data frame that has the same columns and index - df.fillna(df2)
Fill with an estimated value, either interpolated from neighbouring values (df.interpolate()) or predicted by a machine learning model
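
The options listed above can be tried side by side on a small example. The sketch below is only an illustration: the data frame, its values, and the second frame `df2` are made up, and the column names simply reuse the placeholder names from the list. It assumes a reasonably recent pandas version and does not cover the model-based prediction approach.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; values are invented for illustration.
df = pd.DataFrame(
    {
        "column1": [1.0, np.nan, 3.0, np.nan, np.nan],
        "column2": [np.nan, np.nan, 6.0, 8.0, np.nan],
        "column3": [10.0, 20.0, 30.0, 40.0, np.nan],
    }
)

# --- Dropping ---
dropped_any = df.dropna()                # keep rows with no missing values
dropped_all = df.dropna(how="all")       # drop rows where every value is missing
dropped_thresh = df.dropna(thresh=2)     # keep rows with at least 2 non-missing values
dropped_subset = df.dropna(subset=["column1", "column2"])  # check only these columns

# --- Filling ---
filled_const = df.fillna(0)                                # constant value
filled_mean = df["column1"].fillna(df["column1"].mean())   # mean of the same column
filled_fwd = df.ffill()                                    # carry the previous value forward
filled_back = df.bfill()                                   # pull the next value backward

# Hypothetical second frame with the same columns and index.
df2 = pd.DataFrame(
    {"column1": [10.0] * 5, "column2": [20.0] * 5, "column3": [30.0] * 5}
)
filled_other = df.fillna(df2)            # take missing cells from df2

# Linear interpolation between neighbouring valid values.
interpolated = df["column1"].interpolate()

print(dropped_thresh)
print(filled_mean)
```

Note that every call returns a new object and leaves `df` unchanged, which makes it easy to compare the results before committing to one method.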

MohammadAnas5/Handling-Missing-Value