Costa Rican Household Poverty Level Prediction

Motivation

This project was carried out as a group project to fulfill the requirements of the Statistical Learning coursework. The final report and presentation will be uploaded at the end of the assessment period.

Background

Many social programs have a hard time making sure the right people are given enough aid. It’s especially tricky when a program focuses on the poorest segment of the population. The world’s poorest typically can’t provide the necessary income and expense records to prove that they qualify.

In Latin America, one popular method uses an algorithm to verify income qualification. It’s called the Proxy Means Test (or PMT). With PMT, agencies use a model that considers a family’s observable household attributes, such as the material of their walls and ceiling or the assets found in the home, to classify them and predict their level of need. While this is an improvement over relying on income records, accuracy remains a problem as the region’s population grows and poverty declines. New methods beyond traditional econometrics, applied to a dataset of Costa Rican household characteristics, might help improve PMT’s performance.
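To make the idea concrete, here is a minimal sketch of how a PMT-style score could work: each observable attribute carries a weight, the weights are summed, and cutoffs turn the score into a need level. The attribute names, weights, and cutoffs below are hypothetical, chosen only for illustration. They are not taken from any real PMT or from the dataset used in this project.

```python
# Hypothetical weights for observable household attributes.
# These values are illustrative assumptions, not real PMT coefficients.
WEIGHTS = {
    "solid_walls": 1.5,
    "solid_ceiling": 1.0,
    "owns_refrigerator": 1.0,
    "owns_computer": 1.5,
}

# Hypothetical cutoffs: a lower score is read as greater need.
CUTOFFS = [(1.5, "high need"), (3.5, "moderate need")]


def pmt_score(household):
    """Sum the weights of the attributes that are present in the household."""
    return sum(w for name, w in WEIGHTS.items() if household.get(name, False))


def classify(household):
    """Map the proxy score to a need level using the cutoffs above."""
    score = pmt_score(household)
    for upper, label in CUTOFFS:
        if score < upper:
            return label
    return "low need"


example = {"solid_walls": True, "owns_refrigerator": True}
print(pmt_score(example), classify(example))
```

In practice the weights would be estimated from survey data rather than set by hand, which is where a learned model could replace the fixed scoring rule.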

Beyond Costa Rica, many countries face this same problem of inaccurately assessing social need. If a new algorithm can be implemented successfully in Costa Rica, other countries facing similar problems could follow suit.

Research Question

It is difficult for social programs such as this to direct the right amount of aid to the right people. The problem becomes considerably harder when the program targets the least fortunate portion of the population, because these households often cannot provide the income, asset, or expense records needed to show that they qualify for aid.

Hence, this paper’s defining question is: how can we effectively determine the right amount of aid for each household, given the large number of variables in the dataset?