This project demonstrates how to handle missing data using Maximum Likelihood Estimation (MLE) on the NHANES dataset (2015–2016).
- Maximum_Likelihood_Estimation.ipynb: Jupyter Notebook containing the full analysis and imputation process.
- NHANES.csv: Raw dataset with missing values.
- NHANES_2015_2016_tidy_imputed.csv: Cleaned and imputed dataset after applying MLE.
Load & Explore Data
- Import the NHANES dataset.
- Perform exploratory data analysis (EDA) to identify missing values.
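A minimal sketch of the missing-value check in this step. The column names (`age`, `bmi`, `sbp`) and the in-memory sample are hypothetical stand-ins so the snippet runs on its own; in the project the frame would presumably come from `pd.read_csv("NHANES.csv")`. The helper `summarize_missing` is illustrative and not taken from the notebook.

```python
import io

import pandas as pd


def summarize_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Return per-column missing counts and percentages, most-missing first."""
    n_missing = df.isna().sum()
    summary = pd.DataFrame(
        {
            "n_missing": n_missing,
            "pct_missing": (100 * n_missing / len(df)).round(1),
        }
    )
    return summary.sort_values("n_missing", ascending=False)


# Hypothetical stand-in for the raw file; replace with
# df = pd.read_csv("NHANES.csv") when running against the real data.
sample_csv = io.StringIO(
    "age,bmi,sbp\n"
    "34,27.1,118\n"
    "51,,130\n"
    "47,31.4,\n"
    "29,,\n"
)
df = pd.read_csv(sample_csv)

missing_summary = summarize_missing(df)
print(missing_summary)
```

For a visual view of the same information, `missingno.matrix(df)` from the `missingno` package draws a per-row missingness matrix.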
Imputation using Maximum Likelihood Estimation (MLE)
- Apply statistical modeling for estimating missing values.
- Validate assumptions and distributions.
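The statistical model used in the notebook is not described here, so the following is only one possible sketch of MLE-based imputation. It assumes the variables are jointly multivariate normal and the data are missing at random, estimates the mean and covariance by maximum likelihood with the EM algorithm, and fills each gap with its conditional mean given the observed values in the same row. The function name `em_mvn_impute` and the synthetic data are hypothetical.

```python
import numpy as np


def em_mvn_impute(X, n_iter=200, tol=1e-6):
    """EM for a multivariate normal with missing entries (NaN).

    Returns the completed matrix plus the ML estimates of mean and covariance.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    missing = np.isnan(X)

    # Start from column means and the covariance of the mean-filled data.
    mu = np.nanmean(X, axis=0)
    X_filled = np.where(missing, mu, X)
    sigma = np.atleast_2d(np.cov(X_filled, rowvar=False, bias=True)) + 1e-6 * np.eye(p)

    for _ in range(n_iter):
        cond_cov_sum = np.zeros((p, p))
        # E-step: conditional mean and covariance of the missing part of each row.
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                X_filled[i] = mu
                cond_cov_sum += sigma
                continue
            # Regression coefficients of missing on observed: S_mo @ inv(S_oo).
            coef = np.linalg.solve(sigma[np.ix_(o, o)], sigma[np.ix_(o, m)]).T
            X_filled[i, m] = mu[m] + coef @ (X[i, o] - mu[o])
            cond_cov_sum[np.ix_(m, m)] += sigma[np.ix_(m, m)] - coef @ sigma[np.ix_(o, m)]

        # M-step: update the parameters from the expected sufficient statistics.
        mu_new = X_filled.mean(axis=0)
        centered = X_filled - mu_new
        sigma_new = (centered.T @ centered + cond_cov_sum) / n

        converged = (
            np.max(np.abs(mu_new - mu)) < tol
            and np.max(np.abs(sigma_new - sigma)) < tol
        )
        mu, sigma = mu_new, sigma_new
        if converged:
            break

    return X_filled, mu, sigma


# Synthetic two-column example (not NHANES data): column 1 depends on column 0,
# and about 30% of column 1 is removed at random.
rng = np.random.default_rng(0)
n_rows = 1000
x0 = rng.normal(50, 10, n_rows)
x1 = 27 + 0.2 * (x0 - 50) + rng.normal(0, 3, n_rows)
X_full = np.column_stack([x0, x1])

X_missing = X_full.copy()
drop = rng.random(n_rows) < 0.3
X_missing[drop, 1] = np.nan

X_imputed, mu_hat, sigma_hat = em_mvn_impute(X_missing)
print("estimated mean:", mu_hat)
```

Conditional-mean imputation leaves observed values untouched but understates the variance of the imputed column, so comparing the observed and imputed distributions is a reasonable part of the validation step.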
Save Final Dataset
- Export the imputed data into a tidy CSV format.
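The export step can be sketched as follows. The small frame and its column names are hypothetical placeholders for the imputed NHANES data, and the file is written to a temporary directory here so the snippet does not overwrite anything; in the project the target would be `NHANES_2015_2016_tidy_imputed.csv` in the repository.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for the imputed data frame produced by the notebook.
imputed = pd.DataFrame(
    {
        "age": [34, 51, 47],
        "bmi": [27.1, 29.3, 31.4],
        "sbp": [118.0, 130.0, 124.5],
    }
)

# Guard against exporting a frame that still contains gaps.
if imputed.isna().any().any():
    raise ValueError("imputed data still contains missing values")

# index=False keeps the pandas row index out of the file, so the CSV holds
# one row per observation and one column per variable.
out_path = Path(tempfile.mkdtemp()) / "NHANES_2015_2016_tidy_imputed.csv"
imputed.to_csv(out_path, index=False)

# Read the file back to confirm the round trip.
reloaded = pd.read_csv(out_path)
print(reloaded.shape)
```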
Install the following Python libraries before running the notebook:
pip install pyampute missingno