ML competition hosted on Kaggle: a binary classification problem to predict the income of the U.S. population.
For the Machine Learning Kaggle Competition 2022, we were given a dataset derived from the United States Census Bureau (USCB). The USCB conducts various yearly surveys, as well as the decennial census, which produce data about the U.S. population and its economy. These data enable federal and local governments to make informed decisions regarding the allocation of federal funds, international trade, health, housing, and other factors that influence the standard of living.
This project gave us hands-on experience with the theory learned during the course. Among the aspects we put into practice were:
- Data exploration
- Feature engineering
- Missing-value imputation
- Model selection and training
- Hyperparameter tuning by cross-validation
- Model performance comparison using a unified criterion
- Stacking
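
As a rough illustration of two of the items above (hyperparameter tuning by cross-validation and comparing models under a unified criterion), the sketch below scores every candidate on the same folds with the same metric. It is not the competition code: the 1-D synthetic data, the majority-class baseline, the toy k-nearest-neighbours classifier, and all function names are assumptions made for this example, and it uses only the Python standard library.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) once and split it into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_val_score(fit, X, y, folds):
    """Mean validation accuracy; fit(X_train, y_train) returns predict(x)."""
    scores = []
    for i, val_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        predict = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        y_val = [y[j] for j in val_idx]
        scores.append(accuracy(y_val, [predict(X[j]) for j in val_idx]))
    return mean(scores)


def fit_majority(X_train, y_train):
    """Baseline (hypothetical): always predict the most frequent class."""
    majority = int(2 * sum(y_train) >= len(y_train))
    return lambda x: majority


def make_fit_knn(k):
    """Toy 1-D k-nearest-neighbours classifier; k is the hyperparameter."""
    def fit(X_train, y_train):
        def predict(x):
            nearest = sorted(zip(X_train, y_train), key=lambda p: abs(p[0] - x))[:k]
            return int(2 * sum(label for _, label in nearest) > k)
        return predict
    return fit


# Synthetic data (assumption): one feature, label is a noisy threshold at 50.
rng = random.Random(42)
X = [rng.uniform(0, 100) for _ in range(200)]
y = [int(x + rng.gauss(0, 5) > 50) for x in X]

# Unified criterion: identical folds and identical metric for every candidate.
folds = k_fold_indices(len(X), k=5)
candidates = {"majority": fit_majority}
candidates.update({f"knn(k={k})": make_fit_knn(k) for k in (1, 5, 15)})
scores = {name: cross_val_score(fit, X, y, folds) for name, fit in candidates.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("best:", best)
```

Fixing the folds once and reusing them means that differences in score come from the models rather than from different data splits. In practice a library such as scikit-learn would normally replace these hand-written helpers.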
Our final model reached 10th place on the private leaderboard with an