This project uses data describing media events to predict the mean number of active discussions.
The data is described in the datadescription.en.txt file.
The solution consists of the following steps:
- Correlation matrix
- PCA
- Standardization
- Box plots
- Calculating outliers with Chauvenet's criterion
- Multilayer Perceptron
- Recurrent Neural Network
- Ridge Regression
- ElasticNet regression
- XGBoost
- Box-Cox transformation
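
As an illustration of the outlier step listed above, here is a minimal sketch of Chauvenet's criterion using only the Python standard library. It assumes the values are approximately normally distributed. The function name `chauvenet_outliers` and the sample values are hypothetical and are not taken from the project code, so the project's own implementation may differ.

```python
import math
import statistics


def chauvenet_outliers(values):
    """Return a list of booleans, True where a value is flagged as an outlier.

    Hypothetical helper for illustration only (not from the project code).
    """
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return [False] * n

    flags = []
    for x in values:
        # Two-sided probability of a deviation at least this large
        # under a normal model with the estimated mean and std.
        prob = math.erfc(abs(x - mean) / (std * math.sqrt(2)))
        # Chauvenet's rule: reject when fewer than half an observation
        # this extreme would be expected among n samples.
        flags.append(n * prob < 0.5)
    return flags


# Made-up sample: nine values near 10 and one extreme value.
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 25.0]
flags = chauvenet_outliers(sample)
cleaned = [x for x, is_outlier in zip(sample, flags) if not is_outlier]
```

In this made-up sample only the last value is flagged. The criterion is usually applied per feature, and whether to apply it once or repeatedly is a design choice that this sketch does not settle.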
Model | RMSE | MAE | R^2 |
---|---|---|---|
XGBoost | 165.07 | 52.84 | 0.9 |
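
For readers unfamiliar with the three metrics in the table, the sketch below shows how RMSE, MAE and R^2 are conventionally defined. The function name `regression_metrics` and the toy numbers are hypothetical; they are not the project's data and do not reproduce the reported results.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE and R^2 for two equal-length sequences.

    Hypothetical helper for illustration only (not from the project code).
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Root mean squared error: penalizes large errors more strongly.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Mean absolute error: average magnitude of the errors.
    mae = sum(abs(e) for e in errors) / n

    # R^2: share of the variance in y_true explained by the predictions.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot

    return rmse, mae, r2


# Toy values chosen only to make the arithmetic easy to follow.
rmse, mae, r2 = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
```

An RMSE much larger than the MAE, as in the table, typically indicates that a small number of predictions have very large errors, since RMSE squares each error before averaging.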
- documentation.pdf
Ula Żukowska, Monika Zielińska
The project was done as part of the course "Introduction to Data Processing" at Warsaw University of Technology.
Mark: 4.5/5