Merge Datasets

This topic describes how to merge datasets.

Take the Oxford-IIIT Pet and Dogs vs Cats datasets as examples. Their structures look like this:

Oxford-IIIT Pet/
    test/
        Abyssinian_002.jpg
        ...
    trainval/
        Abyssinian_001.jpg
        ...

Dogs vs Cats/
    test/
        1.jpg
        10.jpg
        ...
    train/
        cat.0.jpg
        cat.1.jpg
        ...

Both datasets contain many pictures of cats and dogs. Merging them yields a more diverse dataset.

Note

Before merging datasets, fork both of the open datasets first.

Create a dataset named mergedDataset.

../../../docs/code/merge_datasets.py

Copy all segments in OxfordIIITPet to mergedDataset.

../../../docs/code/merge_datasets.py

Use the catalog of OxfordIIITPet as the catalog of the merged dataset.

../../../docs/code/merge_datasets.py

Unify the categories of the train segment.

../../../docs/code/merge_datasets.py

Note

Categories in OxfordIIITPet have two levels, such as cat.Abyssinian, whereas categories in Dogs vs Cats have only one level, such as cat. The categories must therefore be unified, for example by renaming cat.Abyssinian to cat.
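
The renaming described in the note above can be sketched in plain Python, independent of any SDK. This is only an illustration: the helper name ``unify_category`` is hypothetical, and the sketch assumes the two levels are separated by a dot, as in cat.Abyssinian.

```python
def unify_category(category: str) -> str:
    """Reduce a two-level category such as "cat.Abyssinian" to its top level."""
    # Assumption: levels are dot-separated; one-level names pass through unchanged.
    return category.split(".", 1)[0]


examples = ["cat.Abyssinian", "cat"]
unified = [unify_category(name) for name in examples]
print(unified)
```

Because one-level names are returned unchanged, the same helper could be applied to labels from either dataset without checking where they came from.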

Copy data from Dogs vs Cats to mergedDataset.

../../../docs/code/merge_datasets.py
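
As a rough mental model of the overall result, the merge can be mimicked with plain Python containers. This sketch does not use the SDK and is not the content of merge_datasets.py. It rests on several assumptions: segments are modeled as lists of (file name, category) pairs, the helper ``merge_segment`` is hypothetical, the Oxford-IIIT Pet training data is assumed to end up in a segment named train, and the category attached to each file is illustrative.

```python
from typing import Dict, List, Tuple

# Each segment is modeled as a list of (file name, category) pairs.
Segment = List[Tuple[str, str]]


def merge_segment(target: Dict[str, Segment], name: str, source: Segment) -> None:
    """Append the data of a source segment to the segment `name` in `target`.

    Categories are reduced to their top level on the way in, so that
    "cat.Abyssinian" and "cat" end up as the same category.
    """
    target.setdefault(name, []).extend(
        (path, category.split(".", 1)[0]) for path, category in source
    )


# Hypothetical stand-ins for the two datasets; the labels are assumptions.
merged: Dict[str, Segment] = {}
merge_segment(merged, "train", [("Abyssinian_001.jpg", "cat.Abyssinian")])
merge_segment(merged, "train", [("cat.0.jpg", "cat"), ("cat.1.jpg", "cat")])

categories = {category for _, category in merged["train"]}
print(len(merged["train"]), categories)
```

After both calls, the train segment holds data from both sources under a single set of one-level categories, which is the state the steps above aim for.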