## Business Understanding

This study examines how socioeconomic factors, particularly maternal age and education, affect anemia levels among children aged 0-59 months in Nigeria. Anemia is a significant public health issue that can hinder child development and overall health outcomes. Because socioeconomic conditions vary widely across Nigeria's 36 states and the Federal Capital Territory (FCT), insights drawn from this data can inform public health strategies, targeted interventions, and resource allocation aimed at reducing childhood anemia and improving maternal and child health.

## Problem Statement

In Nigeria, the high prevalence of anemia among children aged 0-59 months poses a substantial public health challenge, impairing their growth, cognitive development, and overall wellbeing. Preliminary observations suggest that socioeconomic factors, including maternal age, education level, wealth index, and living conditions, may influence anemia levels. This study aims to explain these relationships and to predict the severity of anemia (mild, moderate, or severe), enabling stakeholders to develop effective policies and interventions to combat childhood anemia and enhance maternal health.

## Limitations of Using the Provided Data

While the provided dataset offers valuable insights into anemia prevalence and related socioeconomic factors, it's important to consider its limitations:

- **Missing data:** Missing values in certain columns (e.g., hemoglobin levels, anemia levels) could affect the analysis and potentially bias the results.
- **Data quality:** The accuracy and reliability of the data depend on the data collection methods and the quality of the responses. Errors in data entry or measurement could lead to inaccuracies.
- **Limited socioeconomic factors:** Although the dataset includes some socioeconomic variables such as wealth index and education, it might not capture the full spectrum of factors that influence anemia. Other factors that could contribute to anemia, such as dietary intake, healthcare access, and sanitation, are not represented, potentially limiting the explanatory power of the analysis.
- **Lack of longitudinal data:** The data appear to be cross-sectional, meaning they capture information at a single point in time. Longitudinal data, which track individuals over time, would provide a more comprehensive understanding of anemia trends and the impact of interventions.
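
To gauge how much the missing-data limitation matters, it can help to express the gaps as a share of rows rather than as raw counts. The sketch below does that with `isna().mean()`. The helper name `missing_report` and the toy frame are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd


def missing_report(frame: pd.DataFrame) -> pd.DataFrame:
    """Return per-column missing counts and percentages, worst first."""
    report = pd.DataFrame({
        "n_missing": frame.isna().sum(),
        "pct_missing": (frame.isna().mean() * 100).round(1),
    })
    # Keep only the columns that actually have gaps.
    report = report[report["n_missing"] > 0]
    return report.sort_values("pct_missing", ascending=False)


# Toy data standing in for the real dataset (values are made up).
toy = pd.DataFrame({
    "Anemia level": ["Moderate", None, None, "Mild"],
    "Wealth index combined": ["Richest", "Poorer", "Middle", "Richest"],
})
print(missing_report(toy))
```

Applied to the notebook's `df`, the hemoglobin-derived columns would be expected near the top of this report: the `df.info()` output later in the notebook shows 13,136 non-null hemoglobin values out of 33,924 rows, which is about 61% missing.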

Background references on the causes and prevention of anemia:

- WHO fact sheet: https://www.who.int/health-topics/anaemia
- National Institutes of Health (NIH) publication: https://www.ncbi.nlm.nih.gov/books/NBK499994/


In [34]:
import pandas as pd

df = pd.read_csv('../anemia_dataset.csv')
df.head()

Unnamed: 0,Age in 5-year groups,Type of place of residence,Highest educational level,Wealth index combined,Births in last five years,Age of respondent at 1st birth,Hemoglobin level adjusted for altitude and smoking (g/dl - 1 decimal),Anemia level,Have mosquito bed net for sleeping (from household questionnaire),Smokes cigarettes,Current marital status,Currently residing with husband/partner,When child put to breast,Had fever in last two weeks,Hemoglobin level adjusted for altitude (g/dl - 1 decimal),Anemia level.1,"Taking iron pills, sprinkles or syrup"
0,40-44,Urban,Higher,Richest,1,22,,,Yes,No,Living with partner,Staying elsewhere,Immediately,No,,,Yes
1,35-39,Urban,Higher,Richest,1,28,,,Yes,No,Married,Living with her,Hours: 1,No,,,No
2,25-29,Urban,Higher,Richest,1,26,,,No,No,Married,Living with her,Immediately,No,,,No
3,25-29,Urban,Secondary,Richest,1,25,95.0,Moderate,Yes,No,Married,Living with her,105,No,114.0,Not anemic,No
4,20-24,Urban,Secondary,Richest,1,21,,,Yes,No,No longer living together/separated,,Immediately,No,,,No
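
The column header "(g/dl - 1 decimal)", together with values such as 95.0 and 114.0, suggests that hemoglobin is stored as g/dl multiplied by 10. Assuming that encoding, and assuming the labels follow the WHO 2011 cutoffs for children aged 6-59 months (severe below 7.0 g/dl, moderate 7.0-9.9, mild 10.0-10.9, not anemic from 11.0), a minimal sketch of the mapping could look like this. The function names are hypothetical, and the cutoffs should be checked against the survey documentation before use.

```python
def classify_child_anemia(hb_stored):
    """Map a stored hemoglobin value (assumed g/dl x 10) to an anemia label.

    Cutoffs are the WHO 2011 thresholds for children aged 6-59 months,
    which is an assumption about how the dataset's labels were derived.
    """
    # Treat None and NaN (NaN is the only value not equal to itself) as missing.
    if hb_stored is None or hb_stored != hb_stored:
        return None
    if hb_stored < 70:
        return "Severe"
    if hb_stored < 100:
        return "Moderate"
    if hb_stored < 110:
        return "Mild"
    return "Not anemic"


def to_g_per_dl(hb_stored):
    """Convert the stored value to g/dl under the same x10 assumption."""
    return hb_stored / 10


for value in (95.0, 114.0, float("nan")):
    print(value, classify_child_anemia(value))
```

Both labelled values in row 3 of the preview (95.0 as Moderate, 114.0 as Not anemic) are consistent with these thresholds, though two values are far too few to confirm the encoding.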


In [35]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 33924 entries, 0 to 33923
Data columns (total 17 columns):
 #   Column                                                                 Non-Null Count  Dtype  
---  ------                                                                 --------------  -----  
 0   Age in 5-year groups                                                   33924 non-null  object 
 1   Type of place of residence                                             33924 non-null  object 
 2   Highest educational level                                              33924 non-null  object 
 3   Wealth index combined                                                  33924 non-null  object 
 4   Births in last five years                                              33924 non-null  int64  
 5   Age of respondent at 1st birth                                         33924 non-null  int64  
 6   Hemoglobin level adjusted for altitude and smoking (g/dl - 1 decimal)  13136 non-null 

In [36]:
df.isna().sum()

Age in 5-year groups                                                         0
Type of place of residence                                                   0
Highest educational level                                                    0
Wealth index combined                                                        0
Births in last five years                                                    0
Age of respondent at 1st birth                                               0
Hemoglobin level adjusted for altitude and smoking (g/dl - 1 decimal)    20788
Anemia level                                                             20788
Have mosquito bed net for sleeping (from household questionnaire)            0
Smokes cigarettes                                                            0
Current marital status                                                       0
Currently residing with husband/partner                                   1698
When child put to breast                            