Kaggle iMaterialist 2020

This repository is forked from https://github.com/apls777/kaggle-imaterialist2020-model for me to easily use it.

Kaggle iMaterialist 2020
- Setup
Train
Predict
- TPU
- CPU/GPU
Prediction Schema
Detection Prediction Data Flow
Check accuracy when editing training code

Setup

Set up gcloud command.

Install ctpu command.

https://github.com/tensorflow/tpu/tree/master/tools/ctpu#download

Copy and modify a TPU config file.

# at the local machine
cp tpu_configs/example.json tpu_configs/YOUR_TPU.json

Create a VM instance and TPU.

# at the local machine
export TPU_CONFIG_JSON=tpu_configs/YOUR_TPU.json
./scripts/tpu.sh create

You will automatically log in the VM via SSH.

Grant roles/storage.admin the TPU service account (e.g., service-1123456789@cloud-tpu.iam.gserviceaccount.com).

Install Python dependencies to the VM instance.

# at the remote VM
cd $HOME
git clone https://github.com/hrsma2i/kaggle-imaterialist2020-model.git

cd $HOME/kaggle-imaterialist2020-model
./scripts/setup_bashrc.sh
source ~/.bashrc
./scripts/install_requirements.sh

Setup is done. Log out from the terminal by ctrl-D, and stop the VM and TPU.

# at the local machine
./scripts/tpu.sh stop

# or
./scripts/tpu.sh stop --vm
./scripts/tpu.sh stop --tpu

You can know the other sub-commands for the VM and TPU by the following:

./scripts/tpu.sh -h

Train

Log in to the VM instance via SSH.

# at the local machine
./scripts/tpu.sh start
./scripts/tpu.sh ssh

Set up iMaterialist dataset.

Create TF Records from iMaterialist COCO format annotations.

poetry shell

./scripts/create_tf_records.sh \
    train \
    $IMAGE_DIR \
    $COCO_JSON_FILE \
    $OUTPUT_FILE_PREFIX

# e.g.,
./scripts/create_tf_records.sh \
    train \
    ~/iMaterialist/raw/train \
    ~/iMaterialist/raw/instances_attributes_train2020.json \
    gs://yourbucket/tfrecords/train

TF Records will be created like gs://yourbucket/tfrecords/train-00001-of-00050.tfrecord.

Train a model.

./scripts/train.sh $INPUT_GCS_PATTERN $OUT_GCS_DIR

# e.g.,
./scripts/train.sh \
    gs://yourbucket/tfrecords/train-* \
    gs://yourbucket/model

Training artifacts (checkpoints, hyperparmeters, and logs etc) will be dumped into gs://yourbucket/model/.

Don't forget to stop your TPU if the training finishes.

./scripts/tpu.sh stop --tpu

Warning

If the training fails, delete the training artifacts from GCS. Otherwise, the configurations of the failed training will be loaded and it will fail again. For example, tensor's shape mismatch.

Predict

TPU

Log in to the VM instance via SSH.

# at the local machine
./scripts/tpu.sh start --vm
./scripts/tpu.sh ssh

Predict for your images.

poetry shell

./scripts/predict.sh \
    $MODEL_GCS_DIR \
    $IMAGE_GCS_DIR \
    $TF_RECORD_GCS_DIR \
    $OUT_GCS_DIR

# e.g.,
./scripts/predict.sh \
    gs://yourbucket/model \
    gs://yourbucket/yourdataset/images \
    gs://yourbucket/yourdataset/tfrecords \
    gs://yourbucket/yourdataset/predictions

The TPU should be automatically shut down by scripts/predict.sh.

CPU/GPU

Example:

poetry run kaggle_imaterialist2020_model/cmd/segment.py \
  --config-file gs://bucket/model/config.yaml \
  --checkpoint-file gs://bucket/model/model.ckpt-200000 \
  --image-dir gs://bucket/images \
  --out /tmp/segmentation.jsonlines

More details:

poetry run kaggle_imaterialist2020_model/cmd/segment.py --help

Prediction Schema

The prediction results are dumped to gs://yourbucket/yourdataset/predictions/predictions.json (JSON Lines). Its schema is:

{
  "image_id": 0,
  "category_id": 32,
  "bbox": [
    382.5208129883,
    660.4463500977,
    156.1093292236,
    122.5846252441
  ],
  "score": 0.984375,
  "segmentation": {
    "size": [
      1024, // height
      839   // width
    ],
    "counts": "<compressed RLE>"
  },
  "mask_mean_score": 0.9516192675,
  "mask_area_fraction": 0.7079081535,
  "attribute_probabilities": [
    0.0067976713,
    0.0062188804,
    //...
    0.0126225352
  ],
  "id": 1,
  "filename": "example.jpg"
}
{
    //...
}
//...

You can get the binary mask numpy.ndarray for a particular category using hrsma2i/segmentation.

Detection Prediction Data Flow

The data flow of tf_tpu_models/official/detection/main.py is depicted as the following figure.

Check accuracy when editing training code

If you edit the training code ( tf_tpu_models/official/detection/main.py ), you must check that the accuracy of a new model doesn't get worse.

poetry run python kaggle_imaterialist2020_model/cmd/check.py \
--checkpoint-path "gs://bucket/model/model.ckpt-20000" \
--config-file "gs://bucket/model/config.yaml" \
--out-dir tmp/segment/

All expected masks ( tests/resources/masks/*.npy ) must be included in the new predictions, if the accuracy doesn't get worse.

You'll get a message like:

tests/resources/masks/top.npy: OK
tests/resources/masks/coat.npy: OK
tests/resources/masks/stockings_right.npy: OK
tests/resources/masks/skirt.npy: OK
...

On the other hand, you'll get a message like the following, if the accuracy gets worse:

AssertionError: belt.npy mask doesn't exist in the prediction.

You can evaluate the predictions in more detail by seeing the actual and expcted images in the directory tmp/segment/mask_images given by --out-dir.

├── mask_images
|   ├── actual_0_leg_warmer.png
|   ├── actual_1_leg_warmer.png
|   ├── ...
|   ├── actual_8_shirt|blouse.png
|   ├── actual_9_shirt|blouse.png
|   ├── expected_belt.png
|   ├── expected_coat.png
|   ├── expected_skirt.png
|   ├── expected_stockings_right.png
|   └── expected_top.png
└── actual_masks
    ├── 0_leg_warmer.npy
    ├── ...
    └── 9_shirt|blouse.npy

If you update the segmentation model, also update the test cases in tests/resources/masks by choosing appropriate masks from /tmp/segment/actual_masks.

Name		Name	Last commit message	Last commit date
Latest commit History 1,152 Commits
.github		.github
benchmarks		benchmarks
configs/spinenet		configs/spinenet
kaggle_imaterialist2020_model		kaggle_imaterialist2020_model
scripts		scripts
tests/resources		tests/resources
tf_tpu_models		tf_tpu_models
tools		tools
tpu_configs		tpu_configs
.flake8		.flake8
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
detection_main_predict_flow.png		detection_main_predict_flow.png
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml
requirements_from_poetry.txt		requirements_from_poetry.txt
requiremets-spotty.txt		requiremets-spotty.txt
setup.py		setup.py
spotty.yaml		spotty.yaml

License

hrsma2i/kaggle-imaterialist2020-model

Folders and files

Latest commit

History

Repository files navigation

Kaggle iMaterialist 2020

Setup

Train

Predict

TPU

CPU/GPU

Prediction Schema

Detection Prediction Data Flow

Check accuracy when editing training code

About

Resources

License

Stars

Watchers

Forks

Languages