This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

87 lines (62 loc) · 3.27 KB

image_classification_multi_label.rst

File metadata and controls

87 lines (62 loc) · 3.27 KB

Multi-label Image Classification

The Task

Multi-label classification is the task of assigning any number of labels from a fixed set to each data point, which can be in any modality (images in this case). Unlike single-label classification, a data point may carry several labels at once. Multi-label image classification is supported by the ~flash.image.classification.model.ImageClassifier via the multi_label argument.
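To make the idea concrete, here is a minimal, library-free sketch of how multi-label targets and predictions are commonly represented: targets as a 0/1 vector with one slot per class, and predictions as every class whose independent sigmoid probability reaches a threshold. The GENRES list, the helper names, and the 0.5 threshold are assumptions for illustration only; this is not how Flash implements it internally.

```python
import math
from typing import List, Sequence

# Hypothetical label set, chosen for illustration only.
GENRES = ["Action", "Romance", "Crime", "Thriller", "Adventure"]


def to_multi_hot(labels: Sequence[str], classes: Sequence[str]) -> List[int]:
    """Encode a set of label names as a 0/1 vector with one slot per class."""
    present = set(labels)
    return [1 if name in present else 0 for name in classes]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode(
    logits: Sequence[float], classes: Sequence[str], threshold: float = 0.5
) -> List[str]:
    """Keep every class whose independent probability reaches the threshold."""
    return [
        name for name, logit in zip(classes, logits) if sigmoid(logit) >= threshold
    ]


# A poster tagged with two genres becomes a vector with two ones.
target = to_multi_hot(["Action", "Thriller"], GENRES)

# Each class is scored independently, so zero, one, or many can be selected.
predicted = decode([2.0, -1.5, -0.3, 0.8, -4.0], GENRES)
```

Because each class is thresholded on its own, the prediction can contain several labels or none, which is the key difference from a softmax over mutually exclusive classes.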


Example

Let's look at the task of predicting movie genres from an image of the movie poster. The data we will use is a subset of the awesome movie poster genre prediction data set from the paper "Movie Genre Classification based on Poster Images with Deep Neural Networks" by Wei-Ta Chu and Hung-Jui Guo, with the images resized to 128 by 128 pixels. Take a look at their paper here (and please consider citing it if you use the data): www.cs.ccu.edu.tw/~wtchu/projects/MoviePoster/. The data set contains train and val folders, and each folder contains the poster images and a metadata.csv file which stores the labels. Here's an overview:

movie_posters
├── train
│   ├── metadata.csv
│   ├── tt0084058.jpg
│   ├── tt0084867.jpg
│   ...
└── val
    ├── metadata.csv
    ├── tt0200465.jpg
    ├── tt0326965.jpg
    ...

Once we've downloaded the data using ~flash.core.data.download_data, we need to create the ~flash.image.classification.data.ImageClassificationData. We first create a function (load_data) to extract the list of images and associated labels which can then be passed to ~flash.image.classification.data.ImageClassificationData.from_files. We select a pre-trained backbone to use for our ~flash.image.classification.model.ImageClassifier and fine-tune on the posters data. We then use the trained ~flash.image.classification.model.ImageClassifier for inference. Finally, we save the model. Here's the full example:

../../../flash_examples/image_classification_multi_label.py
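As a rough illustration of the load_data step described above, the sketch below reads one split's metadata.csv and returns the image paths alongside multi-hot targets. It assumes a CSV layout that is not confirmed by this page: an identifier column (here called "Id") matching the image file names, plus one 0/1 column per genre. The default root path, the parameter names, and the genre names passed in are likewise hypothetical; the included example file above is the authoritative version.

```python
import csv
import os
from typing import List, Sequence, Tuple


def load_data(
    split: str,
    genres: Sequence[str],
    root: str = "data/movie_posters",  # assumed download location
    id_column: str = "Id",  # assumed name of the identifier column
) -> Tuple[List[str], List[List[int]]]:
    """Return image paths and multi-hot targets for one split ("train" or "val")."""
    folder = os.path.join(root, split)
    files: List[str] = []
    targets: List[List[int]] = []
    with open(os.path.join(folder, "metadata.csv"), newline="") as handle:
        for row in csv.DictReader(handle):
            # Image files are assumed to be named "<id>.jpg", as in the tree above.
            files.append(os.path.join(folder, row[id_column] + ".jpg"))
            # One 0/1 entry per genre, in the order given by `genres`.
            targets.append([int(row[genre]) for genre in genres])
    return files, targets
```

The two returned lists line up index by index, so they can be handed to ~flash.image.classification.data.ImageClassificationData.from_files as the files and their targets for the corresponding split.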

To learn how to view the available backbones / heads for this task, see backbones_heads.


Flash Zero

The multi-label image classifier can be used directly from the command line with zero code using flash_zero. You can run the movie posters example with:

flash image_classification from_movie_posters

To view configuration options and options for running the image classifier with your own data, use:

flash image_classification --help

Serving

The ~flash.image.classification.model.ImageClassifier is servable. For more information, see image_classification.