Skip to content

MathieuCliche/Kaggle-Acquire-Valued-Shoppers-Challenge

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 

Repository files navigation

Python code for the kaggle competition : Acquire valued shoppers challenge

http://www.kaggle.com/c/acquire-valued-shoppers-challenge

The data was loaded in a MySQL database. For example, the main table which contains the transaction history of all customers over a 1 year period had the following schema :

CREATE TABLE transactions (

id char(50) NOT NULL,

chain_ int NOT NULL ,

dept int NOT NULL,

category int NOT NULL,

company char(50) NOT NULL,

brand char(10) NOT NULL,

date_ date NOT NULL,

productsize int NOT NULL,

productmeasure char(10) NOT NULL,

purchasequantity int NOT NULL,

purchaseamount float NOT NULL );

ALTER TABLE transactions ADD INDEX (date_);
ALTER TABLE transactions ADD INDEX (category);
ALTER TABLE transactions ADD INDEX (company);
ALTER TABLE transactions ADD INDEX (brand);

LOAD DATA LOCAL INFILE 'C:/Users/Mathieu/Documents/shopper/transactions.csv'

INTO TABLE transactions
FIELDS TERMINATED BY ','

LINES TERMINATED BY '\n'

IGNORE 1 LINES;

Then features were generated with the featureeng.py python script. All features were generated via SQL queries using the python librairy MySQLdb which can access the MySQL database. The features were then saved in a numpy array file.

The model is then trained and tested in training.py. In this python code, the customers are first sorted with respect of the offer date so that the cross-validation respects the time structure of this problem. Then the features are scaled and each sample is weighted with respect to the number of repeat trips the customer did. After this preprocessing step a random-forest classifier is trained and cross-validated with the ROC AUC score. Finally we predict the probability of repeat trips for each customer of the test set and we make a csv file which can be submitted to Kaggle.

About

Kaggle: Shoppers

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages