sam03152000/Poverty-Prediction

STAT689-Final-Project

Shoa Hung Lin, Chendong Cai

Abstract

Determining appropriate poverty reduction strategies is hard, because it first requires measuring poverty. The dataset, provided by the World Bank, comes from in-depth household surveys conducted with a subset of a country's population. To measure poverty, most of these surveys collect detailed data on household consumption in order to get a clearer picture of a household's poverty status.

The aim of this project is to build a model that accurately predicts poverty for a specific country, using techniques such as data preprocessing, logistic regression, gradient descent, cross-validation, and regularization.

Project Goal

  1. Build a logistic regression model with gradient descent

  2. Compare our model to the scikit-learn package

  3. Introduce regularization terms (L1 and L2) and compare the results

  4. Compare results across different probability thresholds for predicting poverty
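
Goals 1, 3, and 4 can be illustrated with a minimal sketch. This is not the notebook's code: the function names (`fit_logistic`, `predict_proba`, `predict`), the hyperparameter defaults, and the toy data are assumptions chosen for illustration, and plain Python is used only to keep the example dependency-free. The L1 term uses a simple subgradient step, which is one of several possible ways to handle it.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, n_iter=2000, l1=0.0, l2=0.0):
    """Batch gradient descent on the mean log loss.

    X is a list of feature rows, y a list of 0/1 labels.
    l1 and l2 are hypothetical penalty strengths; the bias is not penalized.
    Returns (weights, bias).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one sample: p - y
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        for j in range(d):
            sign = (w[j] > 0) - (w[j] < 0)  # subgradient of |w_j|
            w[j] -= lr * (grad_w[j] + l2 * w[j] + l1 * sign)
        b -= lr * grad_b
    return w, b

def predict_proba(X, w, b):
    """Predicted probability of the positive class for each row."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def predict(X, w, b, threshold=0.5):
    """Label a row as positive when its probability reaches the threshold."""
    return [int(p >= threshold) for p in predict_proba(X, w, b)]

# Toy data (made up for illustration): one feature, two classes.
X_toy = [[0.0], [1.0], [2.0], [3.0]]
y_toy = [0, 0, 1, 1]

w, b = fit_logistic(X_toy, y_toy)                 # no penalty
w_reg, b_reg = fit_logistic(X_toy, y_toy, l2=0.1)  # with an L2 penalty

print(predict(X_toy, w, b))                 # default threshold of 0.5
print(predict(X_toy, w, b, threshold=0.3))  # a lower threshold flags more rows
print(w, w_reg)                             # the L2 penalty shrinks the weight
```

Lowering the threshold trades more false positives for fewer missed poor households, which is why comparing thresholds is listed as a separate goal.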

Using the Software

The fitting code is written in Python and is demonstrated in the file STAT689_Poverty Prediction.ipynb.

Performance metrics

We use the log loss function to evaluate our results: https://en.wikipedia.org/wiki/Loss_functions_for_classification
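
For reference, binary log loss is the mean of -[y log(p) + (1 - y) log(1 - p)] over all samples, where y is the true 0/1 label and p is the predicted probability of the positive class. The sketch below is one straightforward way to compute it; the function name `log_loss` and the clipping constant `eps` are assumptions for illustration, not necessarily what the notebook uses.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

# An uninformative prediction of 0.5 for every sample scores log(2).
print(log_loss([0, 1], [0.5, 0.5]))
# Confident, correct predictions score lower (better).
print(log_loss([0, 1], [0.1, 0.9]))
```

Because the metric is computed on probabilities rather than hard labels, it does not depend on the classification threshold.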

References

DrivenData https://www.drivendata.org/competitions/50/worldbank-poverty-prediction/

Stanford cs229 http://cs229.stanford.edu/notes/cs229-notes1.pdf

Log loss function https://en.wikipedia.org/wiki/Loss_functions_for_classification

Gradient descent https://en.wikipedia.org/wiki/Gradient_descent
