# A CASE STUDY OF ANEMIA DETECTION IN CHILDREN IN NIGERIA

## PROBLEM STATEMENT

Anemia is a widespread health challenge across Africa. It particularly affects young children, impairing their development, immunity, and long-term health outcomes. Sub-Saharan Africa has some of the world’s highest rates of childhood anemia, driven by factors such as nutritional deficiencies, poverty, and limited access to healthcare.

In Nigeria, the largest economy and most populous country in Africa, anemia affects approximately 68.9% of children under five. This study aims to explore the sociodemographic and maternal factors contributing to anemia prevalence among Nigerian children aged 6–59 months, with the goal of identifying critical areas for intervention and support.

*Citations*
1. https://iris.who.int/bitstream/handle/10665/85839/WHO_NMH_NHD_MNM_11.1_eng.pdf?sequence=22
2. https://cdn.who.int/media/docs/default-source/2021-dha-docs/ida_assessment_prevention_control.pdf


## OBJECTIVES

1. Determine the prevalence of anemia in Nigerian children aged 6–59 months using data from the Demographic and Health Survey (DHS).
2. Identify sociodemographic factors associated with anemia in young children, including maternal education, family income, and household size.
3. Analyze dietary factors contributing to childhood anemia, focusing on complementary feeding practices, breastfeeding duration, and overall dietary diversity.
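
Objective 1 reduces to a simple share: the number of children classified as anemic divided by the number with a valid measurement. The following is a minimal sketch, not the study's actual method. It assumes anemia status is stored as text labels such as `Not anemic` and `Moderate` (both appear in the dataset's anemia columns) and that children without a measurement are excluded from the denominator. The helper name `anemia_prevalence` is hypothetical, the toy labels are illustrative only, and the result is unweighted, so it ignores DHS sampling weights.

```python
import pandas as pd


def anemia_prevalence(levels: pd.Series, not_anemic_label: str = "Not anemic") -> float:
    """Return the unweighted share of measured children who are anemic.

    Hypothetical helper: assumes every non-missing label other than
    `not_anemic_label` denotes some degree of anemia.
    """
    measured = levels.dropna()  # exclude children with no measurement
    if measured.empty:
        return float("nan")
    return float((measured != not_anemic_label).mean())


# Toy labels for illustration only (not survey results)
toy = pd.Series(["Not anemic", "Moderate", None, "Mild", "Not anemic"])
prevalence = anemia_prevalence(toy)
print(f"Unweighted prevalence: {prevalence:.1%}")
```

On the real data, the same function could be applied to whichever anemia column holds the child's status, for example `anemia_prevalence(df["Anemia level.1"])` if that column is confirmed to describe the child rather than the mother.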

In [2]:
import pandas as pd
import numpy as np

In [14]:
df = pd.read_csv('../anemia_dataset.csv')
df


Unnamed: 0,Age in 5-year groups,Type of place of residence,Highest educational level,Wealth index combined,Births in last five years,Age of respondent at 1st birth,Hemoglobin level adjusted for altitude and smoking (g/dl - 1 decimal),Anemia level,Have mosquito bed net for sleeping (from household questionnaire),Smokes cigarettes,Current marital status,Currently residing with husband/partner,When child put to breast,Had fever in last two weeks,Hemoglobin level adjusted for altitude (g/dl - 1 decimal),Anemia level.1,"Taking iron pills, sprinkles or syrup"
0,40-44,Urban,Higher,Richest,1,22,,,Yes,No,Living with partner,Staying elsewhere,Immediately,No,,,Yes
1,35-39,Urban,Higher,Richest,1,28,,,Yes,No,Married,Living with her,Hours: 1,No,,,No
2,25-29,Urban,Higher,Richest,1,26,,,No,No,Married,Living with her,Immediately,No,,,No
3,25-29,Urban,Secondary,Richest,1,25,95.0,Moderate,Yes,No,Married,Living with her,105,No,114.0,Not anemic,No
4,20-24,Urban,Secondary,Richest,1,21,,,Yes,No,No longer living together/separated,,Immediately,No,,,No
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
33919,35-39,Rural,Secondary,Richer,2,19,120.0,Not anemic,Yes,No,Married,Living with her,,No,120.0,Not anemic,Yes
33920,25-29,Rural,No education,Richer,1,27,120.0,Not anemic,Yes,No,Never in union,,Hours: 1,No,120.0,Not anemic,No
33921,25-29,Rural,Higher,Richer,1,22,149.0,Not anemic,Yes,No,Married,Living with her,Hours: 1,No,119.0,Not anemic,No
33922,20-24,Rural,Secondary,Richer,1,21,123.0,Not anemic,Yes,No,Married,Living with her,Immediately,No,75.0,Moderate,Yes


In [9]:
df.tail()

Unnamed: 0,Age in 5-year groups,Type of place of residence,Highest educational level,Wealth index combined,Births in last five years,Age of respondent at 1st birth,Hemoglobin level adjusted for altitude and smoking (g/dl - 1 decimal),Anemia level,Have mosquito bed net for sleeping (from household questionnaire),Smokes cigarettes,Current marital status,Currently residing with husband/partner,When child put to breast,Had fever in last two weeks,Hemoglobin level adjusted for altitude (g/dl - 1 decimal),Anemia level.1,"Taking iron pills, sprinkles or syrup"
33919,35-39,Rural,Secondary,Richer,2,19,120.0,Not anemic,Yes,No,Married,Living with her,,No,120.0,Not anemic,Yes
33920,25-29,Rural,No education,Richer,1,27,120.0,Not anemic,Yes,No,Never in union,,Hours: 1,No,120.0,Not anemic,No
33921,25-29,Rural,Higher,Richer,1,22,149.0,Not anemic,Yes,No,Married,Living with her,Hours: 1,No,119.0,Not anemic,No
33922,20-24,Rural,Secondary,Richer,1,21,123.0,Not anemic,Yes,No,Married,Living with her,Immediately,No,75.0,Moderate,Yes
33923,40-44,Rural,Secondary,Richest,1,35,,,No,No,Married,Living with her,Immediately,,,,
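
The hemoglobin columns are labelled `g/dl - 1 decimal`, yet the previews show values such as 95.0, 114.0, and 120.0. That pattern suggests the values are stored with one implied decimal, so 95.0 would mean 9.5 g/dl. This is an assumption that should be checked against the survey documentation before any thresholds are applied. The sketch below shows the conversion under that assumption; the helper name `to_g_per_dl` is hypothetical.

```python
import pandas as pd


def to_g_per_dl(stored: pd.Series) -> pd.Series:
    """Convert values stored with one implied decimal (e.g. 95.0) to g/dl (9.5).

    Assumption: stored value = hemoglobin in g/dl multiplied by 10.
    """
    # Coerce non-numeric entries to NaN so they stay missing after scaling
    return pd.to_numeric(stored, errors="coerce") / 10.0


# Values copied from the preview rows, plus one missing entry
sample = pd.Series([95.0, 114.0, None, 120.0])
converted = to_g_per_dl(sample)
print(converted)
```

Keeping the conversion in a small function makes it easy to apply the same scaling to both hemoglobin columns.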


In [11]:
# Summarize column dtypes and non-null counts
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 33924 entries, 0 to 33923
Data columns (total 17 columns):
 #   Column                                                                 Non-Null Count  Dtype  
---  ------                                                                 --------------  -----  
 0   Age in 5-year groups                                                   33924 non-null  object 
 1   Type of place of residence                                             33924 non-null  object 
 2   Highest educational level                                              33924 non-null  object 
 3   Wealth index combined                                                  33924 non-null  object 
 4   Births in last five years                                              33924 non-null  int64  
 5   Age of respondent at 1st birth                                         33924 non-null  int64  
 6   Hemoglobin level adjusted for altitude and smoking (g/dl - 1 decimal)  13136 non-null 

In [10]:
# Count missing values per column
df.isnull().sum()

Age in 5-year groups                                                         0
Type of place of residence                                                   0
Highest educational level                                                    0
Wealth index combined                                                        0
Births in last five years                                                    0
Age of respondent at 1st birth                                               0
Hemoglobin level adjusted for altitude and smoking (g/dl - 1 decimal)    20788
Anemia level                                                             20788
Have mosquito bed net for sleeping (from household questionnaire)            0
Smokes cigarettes                                                            0
Current marital status                                                       0
Currently residing with husband/partner                                   1698
When child put to breast                            
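
Raw counts are easier to judge as a share of all rows: 20788 missing hemoglobin values out of 33924 rows is roughly 61% of the dataset. The following sketch reports counts and percentages together. The helper name `missing_summary` and the toy frame are hypothetical and for illustration only.

```python
import pandas as pd


def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return missing-value counts and percentages per column, largest first."""
    counts = frame.isnull().sum()
    summary = pd.DataFrame(
        {
            "missing": counts,
            "percent": (counts / len(frame) * 100).round(1),
        }
    )
    return summary.sort_values("missing", ascending=False)


# Toy frame for illustration only
toy_frame = pd.DataFrame({"a": [1, None, None, 4], "b": [1, 2, 3, 4]})
toy_summary = missing_summary(toy_frame)
print(toy_summary)
```

Calling `missing_summary(df)` on the loaded dataset would list the columns with the most gaps first.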

In [24]:
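# Select rows that exactly repeat an earlier row across all columns
# (duplicated() leaves the first occurrence of each group unflagged)
duplicates = df[df.duplicated()]
print("Number of duplicate rows:", duplicates.shape[0])
# Transpose so the long column names become row labels
display(duplicates.T)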

Number of duplicate rows: 4678


Unnamed: 0,125,153,266,269,325,331,465,486,501,502,...,33818,33835,33861,33868,33869,33870,33881,33882,33884,33885
Age in 5-year groups,25-29,20-24,25-29,25-29,30-34,25-29,25-29,25-29,25-29,25-29,...,25-29,35-39,35-39,20-24,20-24,20-24,25-29,25-29,30-34,30-34
Type of place of residence,Rural,Rural,Rural,Rural,Rural,Rural,Rural,Rural,Rural,Rural,...,Rural,Rural,Rural,Rural,Rural,Rural,Rural,Rural,Rural,Rural
Highest educational level,Secondary,Secondary,Primary,Primary,Higher,No education,No education,Secondary,No education,No education,...,Secondary,Primary,Secondary,Secondary,Secondary,Secondary,No education,No education,No education,No education
Wealth index combined,Richer,Richer,Poorer,Poorest,Richer,Middle,Poorer,Poorer,Poorest,Poorest,...,Middle,Middle,Richer,Poorer,Poorer,Poorer,Middle,Middle,Middle,Middle
Births in last five years,3,3,3,3,3,2,2,3,2,2,...,2,1,1,3,3,3,2,2,2,2
Age of respondent at 1st birth,17,16,21,17,18,19,21,24,17,17,...,22,15,26,17,17,17,22,22,20,20
Hemoglobin level adjusted for altitude and smoking (g/dl - 1 decimal),,,,,,,,,,,...,,,,,,,,,,
Anemia level,,,,,,,,,,,...,,,,,,,,,,
Have mosquito bed net for sleeping (from household questionnaire),Yes,Yes,Yes,Yes,No,Yes,Yes,Yes,Yes,Yes,...,Yes,Yes,Yes,No,No,No,Yes,Yes,Yes,Yes
Smokes cigarettes,No,No,No,No,No,No,No,No,No,No,...,No,No,No,No,No,No,No,No,No,No
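
The columns shown include no respondent or household identifier, so rows that match on every column may be distinct respondents who gave the same categorical answers, not repeated records. In the visible part of the transposed output, hemoglobin is missing for every displayed duplicate, which makes such coincidental matches more likely. The sketch below profiles the flagged rows before any decision to drop them. The helper name `duplicate_profile` and the toy frame are hypothetical.

```python
import pandas as pd


def duplicate_profile(frame: pd.DataFrame, outcome: str) -> dict:
    """Summarize fully duplicated rows and how many lack the outcome value."""
    flagged = frame[frame.duplicated()]  # repeats of an earlier row
    return {
        "duplicate_rows": int(len(flagged)),
        # keep=False also counts the first occurrence in each duplicate group
        "rows_in_duplicate_groups": int(frame.duplicated(keep=False).sum()),
        "duplicates_missing_outcome": int(flagged[outcome].isna().sum()),
    }


# Toy frame for illustration only
toy_rows = pd.DataFrame(
    {
        "Type of place of residence": ["Rural", "Rural", "Urban", "Rural"],
        "Anemia level": [None, None, "Moderate", "Not anemic"],
    }
)
profile = duplicate_profile(toy_rows, "Anemia level")
print(profile)
```

If most flagged rows lack the outcome, removing them changes little for the prevalence estimate, while dropping them blindly could discard legitimate respondents.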
