📈 Data Mining and Text Mining course Project at Politecnico di Milano (2017)
.ipynb_checkpoints
DataKNIMEProject
dmtmProject
CourseProject_DMTM_2017_MontyPython.docx
CourseProject_DMTM_2017_MontyPython.pdf
CourseProject_DMTM_2017_MontyPython.pptx
LICENSE
README.md
dmtmProject.ipynb
my_solution_one.csv
test.csv
test_export.csv
train.csv


Project for Data Mining and Text Mining course at Politecnico di Milano

Definition of the problem

In this project we worked with a database of customers. Our goal was to predict the risk of default for credit card users. The evaluation metric was the F score, which we wanted to be as high as possible. The task did not require explaining why a customer defaults; what mattered was making good predictions.
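For readers unfamiliar with the metric, here is a minimal sketch of how an F score can be computed for binary labels. It assumes the standard F-beta definition with `beta=1` (the F1 score), since the README does not say which variant was used; the function name `f_score` and the toy labels are illustrative only.

```python
def f_score(y_true, y_pred, beta=1.0):
    """F-beta score for binary labels (1 = default, 0 = no default).

    beta=1.0 gives the usual F1 score; which variant the project
    used is an assumption here.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correctly predicted defaults: precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy labels, not project data.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
score = f_score(y_true, y_pred)
print(round(score, 3))
```

Because the score combines precision and recall, it penalizes both missed defaults and false alarms, which makes it a common choice when defaulting customers are a minority class.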

Preprocess

We first studied how the database was structured and then converted all values to numerical ones, applying techniques such as one-hot encoding. We also deleted the rows that contained NaNs.
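The two preprocessing steps described above can be sketched with pandas as follows. This is only an illustration: the column names and values are hypothetical, not taken from the project's `train.csv`, and the order of the steps (dropping rows first, then encoding) is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are illustrative, not the project's.
df = pd.DataFrame({
    "age": [25, 40, np.nan, 31],
    "education": ["university", "high school", "university", None],
    "default": [0, 1, 0, 1],
})

# Delete every row that contains at least one missing value.
clean = df.dropna()

# One-hot encode the categorical column so that all values are numerical.
encoded = pd.get_dummies(clean, columns=["education"], dtype=int)

print(encoded)
```

Dropping rows is the simplest way to handle missing values, at the cost of discarding data; whether that trade-off is acceptable depends on how many rows are affected.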

Final model

Our final model was a voting classifier composed of a Random Forest and an XGBoost model. Further details can be found in the project documentation.
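A voting classifier of this kind could be assembled with scikit-learn roughly as shown below. This is a sketch under several assumptions, not the project's actual configuration: the data is synthetic, the hyperparameters are arbitrary, and soft voting is chosen for illustration because the README does not say whether hard or soft voting was used. If `xgboost` is not installed, the sketch falls back to scikit-learn's `GradientBoostingClassifier`, which is a different implementation of gradient boosting.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

try:
    from xgboost import XGBClassifier as Booster
except ImportError:
    # Fallback for environments without xgboost (not the project's model).
    from sklearn.ensemble import GradientBoostingClassifier as Booster

# Synthetic, imbalanced stand-in for the credit card data.
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Combine the two models; soft voting averages predicted probabilities.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("boost", Booster(n_estimators=100, random_state=0)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)

y_pred = ensemble.predict(X_test)
score = f1_score(y_test, y_pred)
print(f"F1 on held-out data: {score:.3f}")
```

Combining a bagging-based model with a boosting-based one is a common way to reduce the variance of either model alone, since their errors tend to differ.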

Authors

Francisco Carrillo Pérez

Jorge Ramírez Carrasco < https://github.com/jramirezc93 >
