# Flu Shot Learning: Predict H1N1 and Seasonal Flu Vaccines

# 1.0 Business Understanding

## 1.1 Business Overview

Vaccination is one of the most effective public health measures for preventing the spread of infectious diseases. In recent years, vaccines have also been developed for other pandemics, such as COVID-19. Vaccination protects not only the individuals who are immunised but also the wider community, by limiting the spread of the virus.

For this study, we use data from a survey conducted in 2009 during the H1N1 influenza pandemic, also known as "swine flu". The pandemic caused an estimated 151,000 to 575,000 deaths worldwide in its first year. To reduce this toll, an H1N1 vaccine was introduced in late 2009, alongside the seasonal flu vaccine that was already available.

The survey was used to understand the uptake of both vaccines. Respondents shared information on their health conditions, demographics, risk perception, and behaviours. By analysing this dataset, we can better understand which factors influenced vaccine uptake. These insights can help healthcare professionals design more effective, targeted campaigns to improve vaccine acceptance and coverage in future pandemics.

## 1.2 Problem Statement

The study aims to predict, from survey data, whether individuals received the H1N1 and/or seasonal flu vaccines, and to identify the key factors influencing uptake so that public health interventions can be made more effective.


## 1.3 Business Objectives

### Main Objective

Build a predictive model that estimates, for each individual, the probability of having received the H1N1 vaccine and the probability of having received the seasonal flu vaccine, based on features from the survey.
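As a rough illustration of what estimating a probability for each of the two vaccines could look like, the sketch below fits one logistic regression per target on synthetic data. The synthetic features and labels, and the choice of scikit-learn's `MultiOutputClassifier` wrapping `LogisticRegression`, are assumptions made for illustration only; they are not the model built in this project.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

# Synthetic stand-in data (hypothetical): 200 respondents, 4 numeric features.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))

# Two binary targets, loosely tied to the features so the model has some signal.
y_h1n1 = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0.8).astype(int)
y_seasonal = (X[:, 1] + rng.normal(scale=0.5, size=200) > 0.0).astype(int)
Y = np.column_stack([y_h1n1, y_seasonal])

# One independent logistic regression is fitted per target column.
model = MultiOutputClassifier(LogisticRegression(max_iter=1000))
model.fit(X, Y)

# predict_proba returns a list with one (n_samples, 2) array per target;
# column 1 holds the probability of the positive class (vaccinated).
proba = model.predict_proba(X)
p_h1n1 = proba[0][:, 1]
p_seasonal = proba[1][:, 1]
print(p_h1n1[:3], p_seasonal[:3])
```

Treating the two vaccines as separate binary targets keeps the example simple; it does not model any dependence between the two outcomes.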

### Specific Objectives

- Identify which demographic, behavioral, and opinion factors are most strongly associated with vaccine uptake.
- Provide actionable insights to public health decision-makers for designing targeted awareness campaigns.
- Evaluate differences between H1N1 and seasonal flu vaccine uptake patterns.
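One simple way to start on the last objective is to compare vaccination rates overall and within groups. The sketch below does this on a tiny made-up sample; the `age_group` column name and its values are assumptions for illustration, and the numbers are not taken from the survey.

```python
import pandas as pd

# Tiny hypothetical sample; the values are invented for illustration.
sample = pd.DataFrame({
    "age_group": ["18 - 34 Years", "18 - 34 Years", "65+ Years", "65+ Years"],
    "h1n1_vaccine": [0, 1, 0, 0],
    "seasonal_vaccine": [0, 1, 1, 1],
})
targets = ["h1n1_vaccine", "seasonal_vaccine"]

# The mean of a 0/1 column is the share of respondents who were vaccinated.
overall_rates = sample[targets].mean()

# The same calculation per group shows where uptake differs between vaccines.
rates_by_group = sample.groupby("age_group")[targets].mean()

print(overall_rates)
print(rates_by_group)
```

The same pattern applies to any categorical feature, which also helps flag groups with low uptake.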

## 1.4 Success Criteria

- Clearly show what factors make people more or less likely to get vaccinated, so healthcare professionals can improve vaccination plans in future pandemics.
- Identify which groups of people are less likely to get vaccinated, so campaigns can be directed where they are needed most.


In [1]:
# Import the libraries
import numpy as np
import pandas as pd

In [2]:
# Load the test-set features and preview the first two rows
df1 = pd.read_csv("C:/Users/PC/Desktop/School work/Projects/Phase 3/Phase-3-Project/Data/test_set_features.csv") 
df1.head(2)

Unnamed: 0,respondent_id,h1n1_concern,h1n1_knowledge,behavioral_antiviral_meds,behavioral_avoidance,behavioral_face_mask,behavioral_wash_hands,behavioral_large_gatherings,behavioral_outside_home,behavioral_touch_face,...,income_poverty,marital_status,rent_or_own,employment_status,hhs_geo_region,census_msa,household_adults,household_children,employment_industry,employment_occupation
0,26707,2.0,2.0,0.0,1.0,0.0,1.0,1.0,0.0,1.0,...,"> $75,000",Not Married,Rent,Employed,mlyzmhmf,"MSA, Not Principle City",1.0,0.0,atmlpfrs,hfxkjkmi
1,26708,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,Below Poverty,Not Married,Rent,Employed,bhuqouqj,Non-MSA,3.0,0.0,atmlpfrs,xqwwgdyp


In [3]:
# Load the training-set features and preview the first two rows
df2 = pd.read_csv("C:/Users/PC/Desktop/School work/Projects/Phase 3/Phase-3-Project/Data/training_set_features.csv")
df2.head(2)

Unnamed: 0,respondent_id,h1n1_concern,h1n1_knowledge,behavioral_antiviral_meds,behavioral_avoidance,behavioral_face_mask,behavioral_wash_hands,behavioral_large_gatherings,behavioral_outside_home,behavioral_touch_face,...,income_poverty,marital_status,rent_or_own,employment_status,hhs_geo_region,census_msa,household_adults,household_children,employment_industry,employment_occupation
0,0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,...,Below Poverty,Not Married,Own,Not in Labor Force,oxchjgsf,Non-MSA,0.0,0.0,,
1,1,3.0,2.0,0.0,1.0,0.0,1.0,0.0,1.0,1.0,...,Below Poverty,Not Married,Rent,Employed,bhuqouqj,"MSA, Not Principle City",0.0,0.0,pxcmvdjn,xgwztkwe


In [4]:
# Load the training-set labels (the two target columns) and preview the first two rows
df3 = pd.read_csv("C:/Users/PC/Desktop/School work/Projects/Phase 3/Phase-3-Project/Data/training_set_labels.csv")
df3.head(2)

Unnamed: 0,respondent_id,h1n1_vaccine,seasonal_vaccine
0,0,0,0
1,1,0,1
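The training features and labels both carry a `respondent_id` column, so one way to combine them into a single table is a join on that key. The sketch below uses small hypothetical frames that mirror a few of the columns shown above (the third row is invented); `features`, `labels`, and `train` are illustrative names, not variables from this notebook.

```python
import pandas as pd

# Small hypothetical frames mirroring a few of the columns shown above.
features = pd.DataFrame({
    "respondent_id": [0, 1, 2],
    "h1n1_concern": [1.0, 3.0, 1.0],
    "h1n1_knowledge": [0.0, 2.0, 1.0],
})
labels = pd.DataFrame({
    "respondent_id": [0, 1, 2],
    "h1n1_vaccine": [0, 0, 0],
    "seasonal_vaccine": [0, 1, 0],
})

# validate="one_to_one" raises an error if respondent_id is duplicated on either side.
train = features.merge(labels, on="respondent_id", how="inner", validate="one_to_one")

# Every respondent should survive the join.
assert len(train) == len(features) == len(labels)
print(train.shape)
```

Checking the row count after the merge is a cheap guard against silently dropping respondents whose IDs do not match.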
