fywalter/Killer-Queen

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

69 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Task 4

How to run

  1. Generate .npy files from the raw .csv files with csv2npy.py.
  2. Generate FFT features with prepare_fft_feature.py (a sketch of steps 2 and 3 follows this list).
  3. Preprocess (normalization and concatenation of adjacent epoch features) with preprocessing_ffts.
  4. Model:
     4.1. Run cross-validation with cnn_cv_single_epoch.py or cnn_cv_multi_epoch.py.
     4.2. Make predictions with cnn_pred_single_epoch.py or cnn_pred_multi_epoch.py (a CNN sketch appears at the end of this section).
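The exact array layouts used by these scripts are not documented here, so the following is only a minimal sketch of steps 2 and 3, assuming the raw recordings arrive as a float array of shape (n_epochs, n_samples) and that per-epoch log power spectra serve as the FFT features. Function names and file names are illustrative, not the repo's actual API.

```python
# Minimal sketch of FFT feature extraction plus preprocessing (normalization
# and adjacent-epoch concatenation). Shapes and file names are assumptions.
import numpy as np

def fft_features(epochs, n_bins=64):
    """Log power spectrum per epoch, truncated to the first n_bins frequency bins."""
    spectrum = np.abs(np.fft.rfft(epochs, axis=1))      # (n_epochs, n_freqs)
    return np.log1p(spectrum[:, :n_bins])               # compress dynamic range

def preprocess(features, n_adjacent=1):
    """Z-score normalization followed by concatenation of adjacent epochs."""
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True) + 1e-8
    normed = (features - mean) / std
    # Stack each epoch with its n_adjacent neighbours on both sides.
    padded = np.pad(normed, ((n_adjacent, n_adjacent), (0, 0)), mode="edge")
    windows = [padded[i:i + len(normed)] for i in range(2 * n_adjacent + 1)]
    return np.concatenate(windows, axis=1)              # (n_epochs, n_bins * (2k + 1))

if __name__ == "__main__":
    raw = np.load("train_signals.npy")                  # assumed output of csv2npy.py
    feats = preprocess(fft_features(raw), n_adjacent=1)
    np.save("train_fft_features.npy", feats)
```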

Current stage: CNN based model

Main schedule:

| Stage 1 | Stage 2 | Stage 3 | Stage 4 | Stage 5 |
| --- | --- | --- | --- | --- |
| Visualization & statistics | Preprocessing & feature extraction | Simple XGBoost model | CNN-based model | RNN-based model |
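For the current CNN stage, something along these lines could sit behind the cnn_cv_* / cnn_pred_* scripts from step 4 above. The framework (Keras), the layer sizes, and the class count are assumptions, not the repo's actual architecture.

```python
# Illustrative single-epoch 1D CNN with 5-fold cross-validation; the number of
# classes (5), layer sizes, and file names are assumptions.
import numpy as np
from tensorflow import keras
from sklearn.model_selection import StratifiedKFold

def build_cnn(n_features, n_classes=5):
    # Small 1D CNN over the per-epoch feature vector.
    return keras.Sequential([
        keras.layers.Input(shape=(n_features, 1)),
        keras.layers.Conv1D(32, 5, activation="relu", padding="same"),
        keras.layers.MaxPooling1D(2),
        keras.layers.Conv1D(64, 5, activation="relu", padding="same"),
        keras.layers.GlobalAveragePooling1D(),
        keras.layers.Dense(n_classes, activation="softmax"),
    ])

X = np.load("train_fft_features.npy")[..., None]   # (n_epochs, n_features, 1); assumed file
y = np.load("train_labels.npy").astype(int)        # assumed integer class labels

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (tr, va) in enumerate(skf.split(np.zeros(len(y)), y)):
    model = build_cnn(X.shape[1])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(X[tr], y[tr], validation_data=(X[va], y[va]),
              epochs=20, batch_size=128, verbose=0)
    print(f"fold {fold}: val accuracy = {model.evaluate(X[va], y[va], verbose=0)[1]:.3f}")
```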

Task 3

Current stage: Reading

Main schedule:

| Stage 1 | Stage 2 | Stage 3 | Stage 4 | Stage 5 | Stage 6 |
| --- | --- | --- | --- | --- | --- |
| Visualization & statistics | Data preprocessing: abnormal data / noise (auto-encoder / FFT) | Slicing & statistics | Feature extraction (manual, auto-encoder, FFT) | Aggregation model | Sequence model |

Task 2

Current stage: Reading

Main schedule:

| Stage 1 | Stage 2 | Stage 3 | Stage 4 |
| --- | --- | --- | --- |
| Code restructuring | Deal with imbalanced data | Feature selection | Model selection and hyper-parameter tuning |
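One common way to handle the class imbalance planned for Stage 2 is to weight the loss by inverse class frequency. A minimal sketch follows; the model choice and file names are assumptions, not the team's plan.

```python
# Hedged example: class-weighted training evaluated with balanced accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X = np.load("X_train_task2.npy")        # assumed file names
y = np.load("y_train_task2.npy")

clf = LogisticRegression(max_iter=1000, class_weight="balanced")
print("balanced accuracy:", cross_val_score(clf, X, y, cv=5,
                                             scoring="balanced_accuracy").mean())
```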

References:

- https://www.kaggle.com/c/bnp-paribas-cardif-claims-management/discussion/19240#110095
- https://www.kaggle.com/c/bnp-paribas-cardif-claims-management/discussion/20247#latest-356655
- https://www.kaggle.com/c/bnp-paribas-cardif-claims-management/discussion/20258#latest-133476
- https://www.kaggle.com/c/telstra-recruiting-network/discussion/19239#latest-381687
- https://www.kaggle.com/c/prudential-life-insurance-assessment/discussion/19003#latest-229720
- https://www.kaggle.com/c/otto-group-product-classification-challenge/discussion/14335#latest-622005
- https://www.kaggle.com/c/airbnb-recruiting-new-user-bookings/discussion/18918#latest-627461
- https://www.kaggle.com/c/mlsp-2014-mri/discussion/9854#latest-568751
- https://github.com/diefimov/santander_2016/blob/master/README.pdf

Task 1

Current stage: Hyper-parameter tuning

Best result: 0.7389

Best Params:

| Filling method | Pre-feature number | Feature selection | Model (2nd) | Outlier detection |
| --- | --- | --- | --- | --- |
| Random forest | 200 | 126 (lasso selection with alpha 0.02) | Ensemble | Ask Chen Le |
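The lasso-based feature selection in the table (alpha 0.02, 200 features reduced to 126) could look roughly like the sketch below; only the alpha value and feature counts come from the table, while the file names and the imputation step upstream are assumptions.

```python
# Hedged sketch of lasso-based feature selection with alpha = 0.02.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectFromModel
from sklearn.pipeline import make_pipeline

X = np.load("X_train_filled.npy")       # assumed: data after random-forest imputation
y = np.load("y_train.npy")

selector = make_pipeline(
    StandardScaler(),
    SelectFromModel(Lasso(alpha=0.02), threshold=1e-5),
)
X_selected = selector.fit_transform(X, y)
print("kept", X_selected.shape[1], "of", X.shape[1], "features")
```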

Main schedule:

| Stage 1 | Stage 2 | Stage 3 | Stage 4 | Stage 5 |
| --- | --- | --- | --- | --- |
| Code restructuring | Fill the missing data | Outlier detection | Feature selection | Model selection and hyper-parameter tuning |

Ensemble Reference: https://www.kaggle.com/serigne/stacked-regressions-top-4-on-leaderboard
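In the spirit of the referenced stacked-regressions kernel, a minimal stacking setup with scikit-learn might look like the following; the base learners, meta-model, and file names are placeholders rather than the team's actual ensemble.

```python
# Hedged stacking sketch: two base regressors with a ridge meta-model.
import numpy as np
from sklearn.ensemble import StackingRegressor, GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X = np.load("X_train_selected.npy")     # assumed output of the feature-selection step
y = np.load("y_train.npy")

stack = StackingRegressor(
    estimators=[
        ("gbr", GradientBoostingRegressor(random_state=0)),
        ("rf", RandomForestRegressor(n_estimators=300, random_state=0)),
    ],
    final_estimator=Ridge(alpha=1.0),
    cv=5,
)
print("CV R^2:", cross_val_score(stack, X, y, cv=5, scoring="r2").mean())
```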
