Dogs vs Cats

This topic describes how to manage the Dogs vs Cats Dataset, a dataset labeled with reference/label_format:Classification.

Authorize a Client Instance

An reference/glossary:accesskey is needed to authenticate identity when using TensorBay.

../../../docs/code/DogsVsCats.py

Create Dataset

../../../docs/code/DogsVsCats.py

Organize Dataset

Organizing the "Dogs vs Cats" dataset into a ~tensorbay.dataset.dataset.Dataset instance takes the following steps.

Step 1: Write the Catalog

A reference/dataset_structure:catalog contains all label information of one dataset and is typically stored in a JSON file.

../../../tensorbay/opendataset/DogsVsCats/catalog.json

The only annotation type for "Dogs vs Cats" is reference/label_format:Classification, and there are 2 reference/label_format:category types.
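As a rough illustration of what such a catalog holds, the sketch below builds a two-category classification catalog in plain Python and reads the category names back. The top-level key name "CLASSIFICATION" and the `{"name": ...}` entry shape are assumptions made for this example; the included catalog.json is the authoritative version.

```python
import json

# Assumed layout: a "CLASSIFICATION" entry holding a "categories" list.
# The key names here are illustrative, not copied from catalog.json.
catalog = {
    "CLASSIFICATION": {
        "categories": [{"name": "cat"}, {"name": "dog"}],
    }
}

# Serialize the catalog the way it would be stored in a JSON file.
catalog_text = json.dumps(catalog, indent=4)

# Load it back and collect the category names.
loaded = json.loads(catalog_text)
category_names = [item["name"] for item in loaded["CLASSIFICATION"]["categories"]]
print(category_names)
```

Every category assigned to an image later on should be one of the names collected in `category_names`.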

Important

See catalog table <reference/dataset_structure:catalog> for more catalogs with different label types.

Step 2: Write the Dataloader

A reference/glossary:dataloader is needed to organize the dataset into a ~tensorbay.dataset.dataset.Dataset instance.

../../../tensorbay/opendataset/DogsVsCats/loader.py

See Classification annotation <reference/label_format:Classification> for more details.
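The organizing logic of a dataloader can be sketched without any TensorBay classes. The example below is a simplified stand-in, not the included loader: it assumes a hypothetical directory layout with `train` and `test` folders, assumes that labeled file names start with the category (for example `cat.0.jpg`), and returns plain dictionaries instead of a Dataset instance. The function name `organize_dogs_vs_cats` is likewise made up for this sketch.

```python
import os
from glob import glob

# Assumed segments: name -> whether the images in it carry a label.
SEGMENTS = {"train": True, "test": False}


def organize_dogs_vs_cats(path):
    """Group image files by segment and attach a category where available.

    Returns {segment_name: [{"path": str, "category": str or None}, ...]}.
    """
    root_path = os.path.abspath(os.path.expanduser(path))
    dataset = {}
    for segment_name, is_labeled in SEGMENTS.items():
        records = []
        pattern = os.path.join(root_path, segment_name, "*.jpg")
        for image_path in sorted(glob(pattern)):
            record = {"path": image_path, "category": None}
            if is_labeled:
                # Assumed naming scheme: "cat.0.jpg" -> "cat", "dog.12.jpg" -> "dog".
                record["category"] = os.path.basename(image_path).split(".")[0]
            records.append(record)
        dataset[segment_name] = records
    return dataset
```

A real dataloader follows the same shape but creates segments on a Dataset instance, wraps each file in a data object, and stores the category as a Classification label.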

Note

Since the Dogs vs Cats dataloader <dogsvscats-dataloader> above is already included in TensorBay, it uses relative imports. However, regular imports should be used when writing a new dataloader.

../../../docs/code/DogsVsCats.py

The TensorBay SDK already includes a number of dataloaders provided by the community. Thus, instead of writing a new dataloader, it is also possible to import an available one.

../../../docs/code/DogsVsCats.py

Note

Catalogs are loaded automatically by the available dataloaders, so users do not have to write them again.

Important

See dataloader table <reference/glossary:dataloader> for more examples of dataloaders with different label types.

Visualize Dataset

Optionally, the organized dataset can be visualized by Pharos, which is a TensorBay SDK plug-in. This step can help users to check whether the dataset is correctly organized. Please see features/visualization:Visualization for more details.

Upload Dataset

The organized "Dogs vs Cats" dataset can be uploaded to TensorBay for sharing, reuse, etc.

../../../docs/code/DogsVsCats.py

Similar to Git, the commit step after uploading records changes to the dataset as a version. If needed, make further modifications and commit again. Please see features/version_control:Version Control for more details.

Read Dataset

Now the "Dogs vs Cats" dataset can be read from TensorBay.

../../../docs/code/DogsVsCats.py

In reference/dataset_structure:dataset "Dogs vs Cats", there are two segments <reference/dataset_structure:segment>: train and test. Get the segment names by listing them all.

../../../docs/code/DogsVsCats.py

Get a segment by passing the required segment name.

../../../docs/code/DogsVsCats.py

In the train reference/dataset_structure:segment, there is a sequence of reference/dataset_structure:data, which can be obtained by index.

../../../docs/code/DogsVsCats.py

In each reference/dataset_structure:data, there is a reference/label_format:Classification annotation, from which the category can be read.

../../../docs/code/DogsVsCats.py

There is only one label type in the "Dogs vs Cats" dataset, which is classification. The information stored in reference/label_format:category is one of the names in the "categories" list of catalog.json <dogsvscats-catalog>. See reference/label_format:Classification label format for more details.
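The read steps above walk a dataset, segment, data, label hierarchy. The toy model below mirrors that access pattern with plain dataclasses so the nesting is easy to see. `ToyDataset` and `ToyData` are hypothetical names invented for this sketch and are not TensorBay classes; the file names and categories are made-up sample values.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyData:
    """One file plus its optional classification category."""

    path: str
    category: Optional[str] = None


@dataclass
class ToyDataset:
    """A named collection of segments, each a list of ToyData."""

    name: str
    segments: Dict[str, List[ToyData]] = field(default_factory=dict)

    def keys(self):
        # Segment names, in insertion order.
        return tuple(self.segments)

    def __getitem__(self, segment_name):
        return self.segments[segment_name]


dataset = ToyDataset(
    "DogsVsCats",
    {
        "train": [ToyData("cat.0.jpg", "cat"), ToyData("dog.0.jpg", "dog")],
        "test": [ToyData("1.jpg")],
    },
)

segment_names = dataset.keys()  # list all segment names
segment = dataset["train"]      # get a segment by name
data = segment[0]               # get data by index
category = data.category        # read the classification category
```

In this model the category of a labeled item is always one of the names declared in the catalog, while unlabeled items in the test segment carry no category.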

Delete Dataset

../../../docs/code/DogsVsCats.py