alzmcr/allstate

Allstate Purchase Prediction Challenge

Requirements

Python 2.7.5 with scikit-learn 0.14a1, NumPy 1.8, Pandas 0.12
Windows 8, Intel i5-3230M @ 2.60GHz, 16GB RAM
Developed on an HP Envy 17 j100tx laptop

How to generate the solution

Run "python majorityvote_modelselection.py" from a command shell, or simply double-click the script on Windows. Watch the memory usage, even though the default settings "should" keep it from exceeding 8 GB.

Comments

Using the default settings, the script fits the models and creates a submission that scores 0.53705 on the private LB. This is the setting which, combined with Breakfast Pirate's ABCDEF combination, scored 0.53715 on the private LB and 0.54535 on the public LB. On the system configuration above this takes approximately 3 hours. If you're impatient, set N=10 and NS=7 and it will score 0.53710 in just 30 minutes! If you think that is still slow, try setting N=8, NS=6, params=[(30,5,23)]: it is even faster and scores 0.53705, the same as my best submission, though lower on the public LB. If it is still slow, get a better computer!!!

The script performs the following steps:

  1. Prepare the data (load the files, transform, clean and create the engineered features)
  2. Fit the Random Forests
  3. Make the prediction of the product G
  4. Select the best Random Forests given the train set accuracy
  5. Do a majority vote using all the N model(s) and print the score on the cross-validation set
  6. Do a majority vote using the NS selected model(s) and print the score on the cross-validation set
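
Steps 4 to 6 can be illustrated with a minimal sketch. It is not taken from majorityvote_modelselection.py: the function names, the toy predictions and the train accuracies below are all hypothetical, and plain lists stand in for the Random Forest outputs so the example runs without scikit-learn.

```python
from collections import Counter


def accuracy(predicted, actual):
    """Fraction of rows where the predicted label equals the true label."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / float(len(actual))


def majority_vote(predictions):
    """Combine per-model predictions row by row, keeping the most common label.

    `predictions` holds one list of labels per model. Ties are broken
    arbitrarily in this sketch.
    """
    return [Counter(row).most_common(1)[0][0] for row in zip(*predictions)]


def select_best(predictions, train_accuracies, ns):
    """Keep the predictions of the `ns` models with the highest train accuracy."""
    ranked = sorted(range(len(predictions)),
                    key=lambda i: train_accuracies[i], reverse=True)
    return [predictions[i] for i in ranked[:ns]]


# Hypothetical toy data: true product G on a cross-validation set of 4 rows,
# and the predictions of N=5 models on those rows.
cv_actual = [1, 2, 3, 4]
cv_predictions = [
    [1, 2, 3, 4],
    [1, 2, 3, 1],
    [1, 2, 1, 1],
    [2, 2, 1, 1],
    [2, 1, 1, 1],
]
# Hypothetical train-set accuracies, one per model, used only for selection.
train_accuracies = [0.9, 0.8, 0.7, 0.6, 0.5]

vote_all = majority_vote(cv_predictions)                      # all N models
vote_selected = majority_vote(
    select_best(cv_predictions, train_accuracies, ns=3))      # NS=3 selected models

score_all = accuracy(vote_all, cv_actual)
score_selected = accuracy(vote_selected, cv_actual)
```

In this toy case the vote over the selected models beats the vote over all models because the weaker models outvote the stronger ones on two rows. Whether that holds on the real data depends on N, NS and the forests themselves.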

Then, if submit is set to False:
a. Record the performance of the k-fold and continue the loop
b. Exit the loop, make the prediction on the test set, do a majority vote using the selected models, fix the product according to the state rule and create the submission file
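
As a rough illustration of the last part of step b, the sketch below appends a voted G value to a six-character ABCDEF string and formats submission lines. The function name and sample values are hypothetical. The "customer_ID,plan" header is an assumption based on the competition's sample submission format, and the state rule is left out because this README does not describe it.

```python
def submission_lines(customer_ids, abcdef, g_votes):
    """Build CSV lines pairing each customer with a 7-character plan.

    `abcdef` holds the six-character strings for products A to F (taken from
    elsewhere, not predicted here); `g_votes` holds the majority-vote value
    for product G. The header names are an assumption, not read from the script.
    """
    lines = ["customer_ID,plan"]
    for customer_id, base, g in zip(customer_ids, abcdef, g_votes):
        lines.append("%s,%s%s" % (customer_id, base, g))
    return lines


# Hypothetical sample values.
example = submission_lines([10000001, 10000002], ["101112", "001103"], [2, 4])
```

Writing the result to disk is then a matter of joining the lines with newlines and saving them to a file.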

License

Please refer to the LICENSE.txt file.
