Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
Kaggle Blue Book for Bulldozers Competition
branch: master
Failed to load latest commit information.
README.md Update README.md
prepare_data.py Update prepare_data.py
train_and_predict.py Update train_and_predict.py

README.md

Fast-Iron

Kaggle Blue Book for Bulldozers Competition

Windows 7 64bit on Intel QuadCore with 12GB RAM, Python 2.7 with Pandas, Numpy ,Scikit-Learn 0.13.1

How to train your model

How to make predictions on a new test set.

Before train make predictions, data need to be pre-processed, step below:
1) Place the training, appendix and test data in the Data folder
2) Edit prepare_data.py and change the following line with names of training, appendix and test data

  • trainData = "Data\TrainAndValid.csv"
  • testData = "Data\Test.csv"
  • appendixData = "Data\Machine_Appendix.csv",

3) Run the script. This will create four files in DataProcessed. This step take about 10-15 minutes depending on machine and file sizes.

PREDICT on Test.csv data

Simply run train_and_predict.py will create the output named current_prediction.csv.
train_and_predict.py is already set to run to recreate the output. gradient boosting regressor are serialized and trained. random forest need to be re-trained (too big to attach). Training the random forests takes 102 minutes.

TRAIN on new data

1) Edit train_and_predict.py
To train GB models change trainGB_models to True
To train RF models change trainRF_models to True
To save the models, change dumpModels to True

Something went wrong with that request. Please try again.