Project for Data Mining and Text Mining course at Politecnico di Milano
Definition of the problem
In this project we worked with a database of customers. Our goal was to predict the risk of default for credit card users. The evaluation metric was the F score, which we wanted to be as high as possible. The focus was on making good predictions rather than on explaining why a customer defaults.
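As a reminder of what the metric measures, here is a minimal sketch of the F score for a binary default/no-default problem. It assumes the F1 variant (the harmonic mean of precision and recall), since the text does not say which F score was used. The function name `f1_binary` and the toy labels are hypothetical.

```python
def f1_binary(y_true, y_pred):
    """F1 score for binary labels, where 1 marks the positive class (default)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive prediction: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, made up for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
f1 = f1_binary(y_true, y_pred)
```

Because the score ignores true negatives, it is a common choice when defaults are much rarer than non-defaults.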
We first studied how the database was structured and then converted all values to numerical ones, applying techniques such as one-hot encoding. We also deleted the rows that contained NaNs.
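The preprocessing described above could look roughly like the following sketch with pandas. The column names and values are hypothetical and not taken from the project dataset. Dropping incomplete rows before encoding is also an assumption, since the text does not state the order of the two steps.

```python
import numpy as np
import pandas as pd

# Toy data; column names and values are hypothetical.
raw = pd.DataFrame({
    "age": [25, 40, np.nan, 31],
    "education": ["university", "high_school", "university", None],
    "default": [0, 1, 0, 1],
})

# Delete every row that contains at least one missing value.
clean = raw.dropna()

# One-hot encode the categorical column so that all features are numerical.
encoded = pd.get_dummies(clean, columns=["education"], dtype=int)
```

Dropping rows is the simplest way to handle missing values, at the cost of discarding the complete fields in those rows.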
Our final model was a voting classifier composed of a Random Forest and an XGBoost model. Further details can be found in the project documentation.
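A voting ensemble of this kind can be sketched with scikit-learn's `VotingClassifier`, as below. This is not the project's actual configuration: the synthetic data, the hyperparameters, and the choice of soft voting are all assumptions. If the `xgboost` package is not installed, the sketch falls back to scikit-learn's `GradientBoostingClassifier` as a stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

try:
    from xgboost import XGBClassifier
    booster = XGBClassifier(n_estimators=100, random_state=0)
except ImportError:
    # Stand-in only: a different gradient boosting implementation, not XGBoost.
    booster = GradientBoostingClassifier(n_estimators=100, random_state=0)

# Synthetic, imbalanced binary data standing in for the customer database.
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Soft voting averages the predicted class probabilities of both models
# (an assumption; the project may have used hard voting instead).
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("xgb", booster),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)

y_pred = ensemble.predict(X_test)
score = f1_score(y_test, y_pred)
print(f"F1 on the held-out split: {score:.3f}")
```

Combining a bagging-style model with a boosting-style model is a common pairing, since the two tend to make different kinds of errors.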
Francisco Carrillo Pérez
Jorge Ramírez Carrasco < https://github.com/jramirezc93 >