- 'models': subfolder with .h5 files containing the weights of each trained network.
- 'results': plots and tables summarizing the results and the exploration.
- other files (.py / .ipynb): different versions used to produce submissions for the competition. See the exploration below for more details.

Kaggle competition: data can be found here

Competition description : In this competition, Kagglers will develop models capable of classifying mixed patterns of proteins in microscope images. Proteins are “the doers” in the human cell, executing many functions that together enable life. Historically, classification of proteins has been limited to single patterns in one or a few cell types, but in order to fully understand the complexity of the human cell, models must classify mixed patterns across a range of different human cells. Images visualizing proteins in cells are commonly used for biomedical research, and these cells could hold the key for the next breakthrough in medicine. However, thanks to advances in high-throughput microscopy, these images are generated at a far greater pace than what can be manually evaluated. Therefore, the need is greater than ever for automating biomedical image analysis to accelerate the understanding of human cells and disease.

Credits to Allunia for the Jupyter notebook, which gives a great description of this computer vision challenge (protein-atlas-exploration-and-baseline.ipynb), and to NikitPatel for the medical explanations (Atlas_medical_explanations.ipynb).

Problem Description: Given 30172 images (provided in two sizes, 512 * 512 and 2048 * 2048) with multilabels taking values in {0;1}^28, we aim to classify 11702 test images (29% of this test set is used for the first stage of the competition) whose labels are known only to the Kaggle organizers. For each image, 4 separate channels (green, red, blue, yellow) showing different proteins are given. Our goal is to maximize the 'macro F1-score': F1 is computed separately for each class over all images, then averaged across classes.
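To make the metric concrete, here is a minimal sketch of the macro F1-score in plain Python. The function name `macro_f1` and the convention of scoring an empty class as 0 are assumptions made for illustration, not taken from the competition code.

```python
def macro_f1(y_true, y_pred):
    """Macro F1: per-class F1 over all images, then an unweighted mean over classes.

    y_true, y_pred: lists of equal-length 0/1 label vectors, one per image.
    """
    n_classes = len(y_true[0])
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 0 and p[c] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 0)
        denom = 2 * tp + fp + fn
        # Assumed convention: a class with no positives and no predictions scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0]]
print(macro_f1(y_true, y_pred))
```

Because every class contributes equally to the mean, a rare class that is never predicted costs as much as a frequent one.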

My exploration

Baseline: As I joined the competition late, 3 or 4 architectures for protein classification had already been proposed, so I used the most recent one, which has interesting properties (a relatively small number of parameters, input flexibility, and features drawn from different depths): see the associated article. My limited computational resources forced me to train this model on 256 * 256 images.

  • First, as the dataset is highly unbalanced, I looked for a way to improve predictions with a fixed setup (baseline model, optimizer: Adam (0.001), loss: BCE, ReduceLROnPlateau, rotation/flip augmentation), followed by threshold optimization for all labels. With this dataset it is impossible to stratify within batches (of 16 or 32), so the training and validation sets are created with multilabel stratification. I compared 3 approaches:
    • without upsampling
    • upsampling based on weights
    • upsampling only rare classes

As the evaluation metric weights all labels equally and network learning is influenced by the balance of the dataset, the last approach worked best (based on LB score).
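One possible way to upsample only the rare classes is sketched below. This is an illustration under assumptions: the rule for calling a class rare (`rare_threshold`) and the repetition `factor` are hypothetical parameters, and the actual settings used in this repository are not stated here.

```python
import random
from collections import Counter


def upsample_rare(labels, rare_threshold, factor, seed=0):
    """Return training indices where samples holding a rare class are repeated.

    labels: list of sets of class ids, one set per image.
    rare_threshold: classes with fewer positives than this count as rare (assumed rule).
    factor: how many times a sample containing a rare class appears (assumed value).
    """
    counts = Counter(c for sample in labels for c in sample)
    rare = {c for c, n in counts.items() if n < rare_threshold}
    indices = []
    for i, sample in enumerate(labels):
        repeat = factor if sample & rare else 1
        indices.extend([i] * repeat)
    # Shuffle so the repeated samples are spread across batches.
    random.Random(seed).shuffle(indices)
    return indices, rare


labels = [{0}, {0, 1}, {0}, {0, 2}, {0}]
indices, rare = upsample_rare(labels, rare_threshold=2, factor=3)
print(sorted(indices), rare)
```

Since images are multilabel, repeating a rare-class image also repeats its frequent labels, so the balance improves but never becomes perfect.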

  • Secondly, I tuned the loss function to optimize validation & LB scores:
    • binary cross-entropy already investigated
    • F1 loss: aiming to directly optimize the competition score by changing paradigm, according to this article
    • BCE followed by F1 loss on the last layers
    • Focal loss, which derives from alpha-balanced CE and introduces another parameter gamma (> 0) that reduces the relative loss for well-classified examples. I used gamma = 2 as in the article and, as alpha is often set to the inverse class frequency, I set alpha to the mean of this criterion for each label.
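As a rough illustration of the focal loss described in the last item, here is a minimal plain-Python sketch of the alpha-balanced binary form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function name is hypothetical, and `alpha` is left as a parameter because the exact value used for these experiments is not given here.

```python
import math


def binary_focal_loss(y_true, y_prob, alpha, gamma=2.0, eps=1e-7):
    """Mean alpha-balanced focal loss over all (image, label) pairs.

    y_true: list of 0/1 label vectors; y_prob: matching predicted probabilities.
    alpha weights positives and (1 - alpha) weights negatives.
    """
    total = 0.0
    count = 0
    for t_row, p_row in zip(y_true, y_prob):
        for t, p in zip(t_row, p_row):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            p_t = p if t == 1 else 1.0 - p   # probability given to the true outcome
            alpha_t = alpha if t == 1 else 1.0 - alpha
            total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
            count += 1
    return total / count


easy = binary_focal_loss([[1]], [[0.9]], alpha=0.5)  # well-classified example
hard = binary_focal_loss([[1]], [[0.3]], alpha=0.5)  # poorly classified example
print(easy, hard)
```

With gamma = 0 and alpha = 0.5 this reduces to half the usual binary cross-entropy. With gamma = 2, an example predicted at 0.9 keeps only (1 - 0.9)^2 = 1% of its cross-entropy weight, which is how the loss shifts attention toward hard examples.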

Focal loss outperformed the others on validation and LB scores! Quite a surprise, as I didn't have time to grid-search gamma and the label frequencies have high variance. More precisely, I found: Focal > BCE + F1 > BCE > F1.

After that, I tried to set one threshold per label by maximizing F1 for each label, instead of using a single global threshold (simple greedy method...). Even with 5-fold CV, results improved on the training and validation datasets but not on the LB score. Kagglers found out that the LB dataset was not balanced in the same way, so for the first stage of the competition I considered it wiser to stay with global threshold optimization.
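The two thresholding strategies can be sketched as a simple grid search in plain Python. All function names and the threshold grid are assumptions made for illustration, not the code used for the submissions.

```python
def f1_for_class(y_true, y_prob, c, threshold):
    """F1 of class c when probabilities >= threshold are predicted positive."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_prob):
        pred = p_row[c] >= threshold
        if pred and t_row[c]:
            tp += 1
        elif pred:
            fp += 1
        elif t_row[c]:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1_at(y_true, y_prob, thresholds):
    """Macro F1 given one threshold per class."""
    n = len(thresholds)
    return sum(f1_for_class(y_true, y_prob, c, thresholds[c]) for c in range(n)) / n


def best_global_threshold(y_true, y_prob, grid):
    """Single threshold shared by all classes, chosen to maximize macro F1."""
    n_classes = len(y_true[0])
    return max(grid, key=lambda th: macro_f1_at(y_true, y_prob, [th] * n_classes))


def best_per_class_thresholds(y_true, y_prob, grid):
    """Greedy choice: each class independently keeps its own best threshold."""
    n_classes = len(y_true[0])
    return [max(grid, key=lambda th: f1_for_class(y_true, y_prob, c, th))
            for c in range(n_classes)]


y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_prob = [[0.8, 0.2], [0.7, 0.4], [0.3, 0.35], [0.1, 0.1]]
grid = [0.25, 0.5, 0.75]
print(best_global_threshold(y_true, y_prob, grid))
print(best_per_class_thresholds(y_true, y_prob, grid))
```

On the data used for tuning, per-class thresholds can never score below the best global threshold, because macro F1 decomposes over classes. That is also why they overfit more easily when the test distribution differs from the validation one.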

I trained 5 similar models on different stratified datasets to ensemble them. I tested only two ensembling strategies, and this matter clearly deserved more investigation (cf. Thresholding Classifiers to Maximize F1 Score): averaging the probabilities given by each model and then applying a global threshold, or taking a majority vote among the predictions of each model (each with a global threshold). My best score was obtained with the first approach.
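The difference between the two ensembling strategies can be seen in a small plain-Python sketch. The function names and the toy probabilities are made up for illustration.

```python
def average_then_threshold(model_probs, threshold):
    """Average the probabilities of all models, then apply one global threshold."""
    n_models = len(model_probs)
    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [1 if sum(m[i][c] for m in model_probs) / n_models >= threshold else 0
         for c in range(n_classes)]
        for i in range(n_samples)
    ]


def majority_vote(model_probs, threshold):
    """Threshold each model separately, then keep labels voted by a strict majority."""
    n_models = len(model_probs)
    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [1 if 2 * sum(1 for m in model_probs if m[i][c] >= threshold) > n_models else 0
         for c in range(n_classes)]
        for i in range(n_samples)
    ]


# Three hypothetical models, one image, two labels.
model_probs = [[[0.9, 0.45]], [[0.4, 0.45]], [[0.4, 0.7]]]
print(average_then_threshold(model_probs, 0.5))  # [[1, 1]]
print(majority_vote(model_probs, 0.5))           # [[0, 0]]
```

Averaging keeps the confidence of each model, so one very confident model can outweigh two hesitant ones, while voting discards that information.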

  • Then, I tried cropping 256 * 256 images from the 512 * 512 ones instead of resizing, which gives a higher resolution, assuming each crop is still correctly labeled (a fair assumption with our samples). With the same settings as the previous best model, I trained 5 models on these cropped images and ensembled them. It clearly improved the LB score and shows how much image size (and resolution) can influence predictive power.
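A minimal sketch of such a random crop, written in plain Python on nested lists to stay dependency-free (with NumPy arrays the same idea is a single slice). The function name and the `rng` argument are assumptions made for illustration.

```python
import random


def random_crop(image, crop_size, rng):
    """Cut a random crop_size x crop_size window out of an image.

    image: list of rows, each row a list of pixels (a pixel may hold 4 channel values).
    The crop inherits the labels of the full image, which is the assumption
    discussed above.
    """
    height, width = len(image), len(image[0])
    if crop_size > height or crop_size > width:
        raise ValueError("crop_size is larger than the image")
    top = rng.randint(0, height - crop_size)   # randint bounds are inclusive
    left = rng.randint(0, width - crop_size)
    return [row[left:left + crop_size] for row in image[top:top + crop_size]]


# Toy 8 x 8 image standing in for a 512 x 512 one; crop 4 x 4 instead of 256 x 256.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
crop = random_crop(image, 4, random.Random(0))
print(len(crop), len(crop[0]))
```

Unlike resizing, the crop keeps the original pixel density, at the cost of seeing only part of the field of view in each training sample.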

  • Finally I wanted to make use of pre-trained models. For VGG-16, RESNET50 & InceptionV3, I made a simple classification scheme:

Input -> BN -> Pre-trained model -> Conv + Relu -> Flatten -> Dropout -> Dense + Relu -> Dropout -> Dense+sigmoid.

I trained it using cropped images, upsampling of rare classes and focal loss for 15 epochs with the pre-trained model frozen (batch size and augmentation adapted to the pre-trained input size and 4 GB of GPU memory), then unfroze it for 2 (VGG16) or 5 (RESNET50 & InceptionV3) epochs, depending on the number of parameters. InceptionV3 was the best of the three based on LB score, but it scored below the GAPNET ensemble. I couldn't train these pre-trained models any longer because I had to focus on other projects and exams so... I don't blame them.

A fascinating approach, which I would have loved to have time to try, is to increase network depth by connecting an encoder-decoder mechanism (like a mask) based on a pre-trained model, and to train it with dual losses (focal for classification and BCE/Dice for segmentation), with GAPNET as the classifier. (coming soon...)

It was part of the 4th place solution, which I tried to implement before its publication, but I didn't have time to train it anyway.

I link the Bestfitting solution here

It was a real pleasure to participate in such a competition (thanks Kaggle!). I made mistakes but definitely learned a lot thanks to it and to other Kagglers. This competition was a blessing, linking the two subjects that fascinate me the most


Computer vision - multilabel classification


