Add your fifth homework as a pull request to this folder.
Deadline: 2020-05-04 EOD
Task: For a selected data set (you can use data from your project or data from Homework 1), prepare a knitr/Jupyter notebook covering the following points. Submit your results on GitHub to the directory Homeworks/H5.
TODO:
- For the selected data set, train at least one tree-based ensemble model (random forest, gbm, catboost or any other boosting method).
- Calculate permutation-based variable importance for the selected model.
- Train three or more candidate models (different variables, different transformations, different model structures) and compare the ranking of important features between these models. Are the rankings similar or different?
- Comment on the results for points (2) and (3), i.e. the variable importance and the comparison between models.
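
To make points (2) and (3) concrete, here is a minimal, model-agnostic sketch of permutation importance in plain Python. It is an illustration under stated assumptions, not a required implementation: the helper names (`permutation_importance`, `ranking`), the synthetic data, and the two lambda "models" are all hypothetical stand-ins for your own fitted models. In a real notebook you would more likely rely on a library routine, for example `sklearn.inspection.permutation_importance`.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, loss=mse, n_repeats=10, seed=0):
    """Mean increase in loss after shuffling each column of X.

    predict: function mapping one row (a list of floats) to a prediction.
    X: list of rows, y: list of targets.
    """
    rng = random.Random(seed)
    baseline = loss(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only; all other columns stay untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = loss(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def ranking(importances):
    """Feature indices ordered from most to least important."""
    return sorted(range(len(importances)), key=lambda j: -importances[j])


# Synthetic data (an assumption for illustration):
# y depends on x0 and x1, while x2 is a noisy copy of x1.
data_rng = random.Random(42)
X = []
for _ in range(200):
    x0, x1 = data_rng.gauss(0, 1), data_rng.gauss(0, 1)
    X.append([x0, x1, x1 + 0.1 * data_rng.gauss(0, 1)])
y = [3.0 * row[0] + 1.0 * row[1] for row in X]

# Hypothetical stand-ins for two fitted candidate models:
# model A relies on x1, model B relies on the correlated x2 instead.
model_a = lambda row: 3.0 * row[0] + 1.0 * row[1]
model_b = lambda row: 3.0 * row[0] + 1.0 * row[2]

imp_a = permutation_importance(model_a, X, y)
imp_b = permutation_importance(model_b, X, y)
rank_a, rank_b = ranking(imp_a), ranking(imp_b)
print("model A ranking:", rank_a)
print("model B ranking:", rank_b)
```

In this toy setup both models predict almost equally well, yet they disagree on whether x1 or x2 matters, because the two variables are strongly correlated. Effects of this kind are worth looking for when you compare rankings across your own candidate models.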
Important note:
The submitted homework should be in HTML format (generated from knitr/Jupyter) and should consist of two parts.
The first part contains the key results and comments from points 3-4. In this part PLEASE DO NOT SHOW ANY R/PYTHON CODE; RESULTS (IMAGES, COMMENTS) ONLY.
The second part should start with the word Appendix or Załącznik and should include the reproducible R/Python code used to implement points 1-4.
This division (1) makes the homework more readable and (2) builds good reporting habits.