
moa_kaggle2020: weighted blending of three models to obtain 20th position (20/4373)

In this competition, many people trained their models with SmoothBCEwLogits(smoothing=1e-3) (or with other, similar smoothing values).
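For context, below is a minimal PyTorch sketch of what such a label-smoothed BCE-with-logits loss typically looks like; the class name matches the one mentioned above, but the exact implementation circulated in the competition kernels may differ.

```python
import torch.nn as nn
import torch.nn.functional as F

class SmoothBCEwLogits(nn.Module):
    """BCE with logits where hard 0/1 targets are pulled toward 0.5 by `smoothing`."""
    def __init__(self, smoothing=1e-3):
        super().__init__()
        self.smoothing = smoothing

    def forward(self, logits, targets):
        # smoothed target = t * (1 - s) + 0.5 * s, so 1 -> 1 - s/2 and 0 -> s/2
        targets = targets.float() * (1.0 - self.smoothing) + 0.5 * self.smoothing
        return F.binary_cross_entropy_with_logits(logits, targets)
```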

I did that and, I think, it was one of my biggest mistakes!

As Chris Deotte says in 3rd Place Public - We Should Have Trusted CV - 118th Private, we should have trusted CV! I think this mistake was more common than usual, though, because of the big difference between the Public Test and the Private Test.

First, I have been trying to understand why the hosts decided to create a public test with a different distribution from the private test, and why they did not publish 'train_drug.csv' at the beginning of the competition.

I am not complaining about that, because I know it is typical in Kaggle competitions, but I would like to point it out because many competitors spent a lot of time (and wasted a lot of submissions) trying to understand the relation between the local CV and the Public LB.

After the publication of 'train_drug.csv', and with @Chris's excellent Drug and Multilabel Stratification Code, competitors were able to make better decisions.

However, in my case, the big differences observed on the Public LB with different smoothing values, and the poor correspondence between CV and Public LB, led me to the wrong decision of using SmoothBCEwLogits(smoothing=1e-3).

The next figure shows my best three models (ANN, ResNet and TabNet) trained with smoothing=1e-3 and without smoothing.

"4x4 10-CV LogLoss" corresponds to the logloss of the average of oofs using 4 different distribution of folds with 4 different seeds. (and with 10-fold CV). The fold distributions were obtained with the @Chris code Drug and Multilabel Stratification Code but using this version: vc1 = vc.loc[(vc==6)|(vc==12)|(vc==18)].index.sort_values().

The weighted blending of the three models with smoothing=1e-3 gave me 0.01614, but I could have obtained 0.01609 without smoothing.

My second and best submission (position 135/4373 teams) used the best models' parameters from @Chris's code but trained with the original random stratified CV (the old CV). To combine the models, I used the same weights that minimized the CV in @Chris's validation code. This weighted blending gave me my best Private LB (0.01612) but, without smoothing, I could have obtained 0.01607, close to the gold zone (position 20/4373).
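To illustrate how such blending weights can be found from the OOF predictions, here is a small sketch (not the code actually used): it searches for non-negative weights, normalized to sum to one, that minimize the OOF log loss with scipy; the function and variable names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def mean_log_loss(y_true, y_pred, eps=1e-15):
    # binary cross-entropy averaged over all rows and target columns
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)))

def find_blend_weights(oof_preds, y_true):
    """oof_preds: list of OOF prediction arrays (one per model), same shape as y_true."""
    n_models = len(oof_preds)

    def objective(w):
        w = np.abs(w) / np.abs(w).sum()  # keep weights non-negative and summing to 1
        blend = sum(wi * p for wi, p in zip(w, oof_preds))
        return mean_log_loss(y_true, blend)

    res = minimize(objective, x0=np.ones(n_models) / n_models, method='Nelder-Mead')
    return np.abs(res.x) / np.abs(res.x).sum()

# usage (hypothetical arrays):
# weights = find_blend_weights([oof_ann, oof_resnet, oof_tabnet], y_train)
# blend = sum(w * p for w, p in zip(weights, [pred_ann, pred_resnet, pred_tabnet]))
```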
