Human protein atlas image classification

competition link

Dataset

Interesting kernels:

fastai v1. starter (github), datablocks api

useful links data augmentation

Workflow

Things to check out: gpu-stats in python notebook

GitHub todo lists; add an overview of important libraries at the top of each file

Needs sorting

Data

sample dataset
Use more data: A dataset of images and morphological profiles of 30 000 small-molecule treatments using the Cell Painting assay

Preprocessing

Discarding the yellow channel runs faster and doesn't change the result. Discard more channels?
Merge two semantically similar channels
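
A minimal sketch of the two preprocessing ideas above, using tiny nested lists in place of real image arrays. The helper names, the choice of red and yellow as the pair to merge, and the pixelwise mean as the merge rule are all assumptions made for illustration; the notes do not say which channels are similar or how to combine them.

```python
def drop_channels(image, drop=("yellow",)):
    """Return a copy of the channel dict without the named channels."""
    return {name: pixels for name, pixels in image.items() if name not in drop}


def merge_channels(image, first, second, merged_name):
    """Replace two channels by their pixelwise mean (an assumed merge rule)."""
    merged = [
        [(a + b) / 2 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image[first], image[second])
    ]
    out = {name: pixels for name, pixels in image.items()
           if name not in (first, second)}
    out[merged_name] = merged
    return out


# A 2x2 toy image with the four stains as separate channels.
image = {
    "red": [[0, 2], [4, 6]],
    "green": [[1, 1], [1, 1]],
    "blue": [[2, 4], [6, 8]],
    "yellow": [[9, 9], [9, 9]],
}

three_channel = drop_channels(image)  # red, green, blue only
merged = merge_channels(image, "red", "yellow", "red_yellow")
print(sorted(three_channel), sorted(merged))
```

Either variant yields a three-channel input, which fits a pretrained RGB model without changing its first layer.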

Training

Progressively increase image size
Stratified training data https://github.com/trent-b/iterative-stratification
Cross validation example here
Other data augmentation techniques (cropping images), what would be logical?
Find optimal weight decay: link

Model

4-Channel model: swap first layer: link. I added a simple ConvBlock reducing from 4 to 3 channels before the pretrained model. Works well. @ryches I freeze the pretrained network and set this conv layer and the last layers to trainable. https://forums.fast.ai/t/how-to-do-transfer-learning-with-different-inputs/28395/3
Save best model
Use other pre-trained model? https://github.com/Cadene/pretrained-models.pytorch
Fastai v1 starter pack: kaggle, github, notebook
other notebook: lesson2-protein-human-protein-atlas-v1_256-resnet34.ipynb
another fastai starter: link
learn = create_cnn(data, arch, metrics=[acc_02, f_score]).to_fp16()
Replace average pooling layer with adaptive average pooling layer
Papers: GapNet-PL paper, Cell organelle classification with fully convolutional neural networks
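
Why the adaptive pooling swap helps can be seen in a 1D sketch: the output length is fixed, and the window sizes adapt to the input length. That is what lets one model accept several image sizes, which matters when the image size is increased progressively. The floor/ceil rule for window boundaries below is my understanding of how PyTorch's adaptive pooling picks its windows; treat it as an assumption, and `adaptive_avg_pool_1d` is a hypothetical name.

```python
def adaptive_avg_pool_1d(values, output_size):
    """Average `values` into exactly `output_size` bins, whatever the input length."""
    n = len(values)
    pooled = []
    for i in range(output_size):
        start = (i * n) // output_size              # floor(i * n / output_size)
        end = -(-((i + 1) * n) // output_size)      # ceil((i + 1) * n / output_size)
        window = values[start:end]
        pooled.append(sum(window) / len(window))
    return pooled


# Two different input lengths, same output length.
short_input = adaptive_avg_pool_1d([1, 2, 3, 4, 5, 6], 2)
long_input = adaptive_avg_pool_1d([1, 2, 3, 4, 5, 6, 7, 8], 2)
print(short_input, long_input)
```

With `output_size=1` this reduces to a global average, so the layers after the pooling always see the same feature shape.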

Score

Check if f1-score is used
Threshold selection for multi-label classification, paper
Check out code from here

other competitions

image preprocessing

filters:
green filter for prediction, others for reference
merging images improves score
Discarding the yellow channel runs faster and doesn't change the result. Discard more channels?
Green is the protein itself. The other colors are other parts of the cell. While they are not required, they can provide useful information.
Does yellow mean endoplasmic reticulum?

Train/Test data split:
multilabel stratification python package
class imbalance
“huge” difference between validation and test score
Use cross-validation to get a better understanding of predictions on different validation sets
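
To make the stratification idea concrete, here is a simplified greedy stand-in written in plain Python. It is not the algorithm of the multilabel stratification package mentioned above, only a sketch of the same goal: handle the rarest labels first so every fold gets a share of them. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def greedy_multilabel_folds(label_sets, n_folds):
    """Assign each sample to a fold, spreading the rarest labels first."""
    label_counts = Counter(label for labels in label_sets for label in labels)
    fold_of = [None] * len(label_sets)
    fold_sizes = [0] * n_folds
    fold_label_counts = [Counter() for _ in range(n_folds)]

    def assign(idx, fold):
        fold_of[idx] = fold
        fold_sizes[fold] += 1
        fold_label_counts[fold].update(label_sets[idx])

    # Rarest labels first, so they are distributed before the folds fill up.
    for label, _ in sorted(label_counts.items(), key=lambda kv: (kv[1], kv[0])):
        for idx, labels in enumerate(label_sets):
            if fold_of[idx] is None and label in labels:
                # Prefer the fold with the fewest samples of this label,
                # breaking ties by overall fold size.
                fold = min(
                    range(n_folds),
                    key=lambda f: (fold_label_counts[f][label], fold_sizes[f]),
                )
                assign(idx, fold)

    # Samples without any label go to the currently smallest fold.
    for idx in range(len(label_sets)):
        if fold_of[idx] is None:
            assign(idx, min(range(n_folds), key=lambda f: fold_sizes[f]))
    return fold_of


# Toy multi-label data: label 0 is common, labels 1 and 2 are rare.
label_sets = [{0}, {0}, {0, 1}, {0}, {1}, {0}, {0, 2}, {2}]
folds = greedy_multilabel_folds(label_sets, n_folds=2)
print(folds)
```

A plain random split could easily put both samples of a rare label into the same fold, which is one plausible source of a large gap between validation and test scores.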

f1 metric:
order of ids is important!
macro f1 score
sklearn.metrics.f1_score with average="macro"
focal loss + soft F1 and focal loss - log(soft F1) for faster convergence
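
A plain-Python sketch of what macro F1 computes: one F1 per class, then an unweighted mean. Passing probabilities instead of 0/1 predictions to the same function gives a "soft" F1, and `-log(soft F1)` is the extra loss term named above. The focal loss part is left out, and scoring a class with no true and no predicted positives as 0 is an assumption of this sketch.

```python
import math


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1; rows are samples, columns are classes."""
    n_classes = len(y_true[0])
    scores = []
    for c in range(n_classes):
        tp = sum(t[c] * p[c] for t, p in zip(y_true, y_pred))
        fp = sum((1 - t[c]) * p[c] for t, p in zip(y_true, y_pred))
        fn = sum(t[c] * (1 - p[c]) for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
hard_score = macro_f1(y_true, y_pred)

# Soft F1: the same formula on predicted probabilities (made-up values).
y_prob = [[0.9, 0.2, 0.4], [0.1, 0.8, 0.3], [0.7, 0.4, 0.2], [0.2, 0.6, 0.9]]
soft_score = macro_f1(y_true, y_prob)
soft_f1_loss = -math.log(soft_score)
print(hard_score, soft_score, soft_f1_loss)
```

On the hard predictions this should agree with `sklearn.metrics.f1_score(y_true, y_pred, average="macro")`, since every class here has at least one positive.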

LB: LB probing using all labels benchmark

Postprocessing: Threshold selection for multi-label classification
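
One simple way to select thresholds, sketched under the assumption that validation labels and predicted probabilities are available: try a grid of candidate thresholds per class and keep the one with the best F1. The function names, the grid, and the toy numbers are hypothetical, and the linked material may describe a different method.

```python
def f1_binary(truth, preds):
    """F1 for one class from 0/1 truth and boolean predictions."""
    tp = sum(1 for t, p in zip(truth, preds) if t and p)
    fp = sum(1 for t, p in zip(truth, preds) if not t and p)
    fn = sum(1 for t, p in zip(truth, preds) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_thresholds(y_true, y_prob, candidates):
    """Pick, per class, the candidate threshold with the highest F1."""
    chosen = []
    for c in range(len(y_true[0])):
        truth = [row[c] for row in y_true]
        probs = [row[c] for row in y_prob]
        # On ties, max() keeps the first (lowest) candidate in the list.
        chosen.append(
            max(candidates,
                key=lambda th: f1_binary(truth, [p >= th for p in probs]))
        )
    return chosen


# Toy validation set with two classes.
y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_prob = [[0.9, 0.2], [0.6, 0.35], [0.4, 0.3], [0.1, 0.05]]
thresholds = best_thresholds(y_true, y_prob, candidates=[0.1, 0.3, 0.5, 0.7])
print(thresholds)
```

Thresholds tuned this way can overfit a small validation set, so checking them across cross-validation folds is a reasonable precaution.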

Improvement ideas:

split big image into smaller
make a network that will give me bounding boxes of cells to process. Then from the large scale images I can get smaller images of cells to train a network on.
ensemble ideas (NASNet)

If the image contains e.g. "one" object of type A, it has label A. If it contains two, three, … objects, it still has label A.

You can always break the image into smaller images and ensemble back again. My CDiscount challenge solution gives you a clue on how to do it!
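
A rough sketch of the split-and-ensemble idea: compute tile boxes that cover the image, predict per tile, then combine the per-tile class probabilities into one image-level prediction. Taking the maximum matches the observation that an image has label A as soon as any part of it shows A; the mean is a softer alternative. The function names, tile sizes, and probabilities are made up for illustration, and this is not the CDiscount solution referred to above.

```python
def tile_boxes(height, width, tile, stride):
    """Return (top, left, bottom, right) boxes; the last tile is shifted inward to fit."""
    def starts(size):
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile + 1, stride))
        if positions[-1] != size - tile:
            positions.append(size - tile)  # cover the remaining border
        return positions

    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in starts(height)
        for left in starts(width)
    ]


def ensemble_tile_predictions(tile_probs, how="max"):
    """Combine per-tile class probabilities into one prediction per class."""
    per_class = list(zip(*tile_probs))
    if how == "max":
        return [max(values) for values in per_class]
    return [sum(values) / len(values) for values in per_class]


boxes = tile_boxes(height=512, width=512, tile=256, stride=256)

# Pretend a model returned these probabilities for two tiles and two classes.
tile_probs = [[0.1, 0.8], [0.7, 0.2]]
by_max = ensemble_tile_predictions(tile_probs, how="max")
by_mean = ensemble_tile_predictions(tile_probs, how="mean")
print(len(boxes), by_max, by_mean)
```

A stride smaller than the tile size gives overlapping tiles, which costs more compute but avoids cutting cells at tile borders.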

Use features from different resolutions

