fewview_train subset JSON contain frames that belong in both of train and test sets #49

zhizdev · 2022-08-02T21:20:29Z

I am trying to use the CO3Dv2 dataset, but I ran into some odd behavior in the set_lists/set_lists_fewview_train.json few-view train subset lists.

As defined in `co3d.implicitron.dataset.json_index_dataset_map_provider_v2.py`, line 104, each JSON file should have the following structure:

Each `set_lists_<subset_name_l>.json` file contains the following dictionary:
{
    "train": [
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
    "val": [
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
    "test": [
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
}

For the tv, hydrant, and donut categories (and, I believe, all others), every frame (image_path) listed under "train" in set_lists_fewview_train.json is also listed under "test".

However, set_lists_fewview_dev.json and set_lists_fewview_test.json contain clearly separated "train" and "test" frames.

I am not sure if this behavior is a design choice or a bug. My goal is to train a model only on the training set, and not on the dev or test sets. What would be the correct JSON subset list and subset to use?


davnov134 · 2022-08-03T09:00:37Z

Hello, this is by design.

Tl;dr: Indeed, using the train setlist of set_lists_fewview_train is the best way to train your few-view model.

In more detail, all frames within a category are split into 6 sets named <sequence_set>_<known|unseen>, i.e.:

train_unseen
train_known
dev_unseen
dev_known
test_unseen
test_known

The set_lists_fewview_*.json set lists are defined as follows:

set_lists_fewview_train: {
    "train": train_known,
    "val": train_known + train_unseen,
    "test": train_known + train_unseen,
}
set_lists_fewview_dev: {
    "train": train_known,
    "val": dev_known + dev_unseen,
    "test": dev_known + dev_unseen,
}
set_lists_fewview_test: {
    "train": train_known,
    "val": dev_known + dev_unseen,
    "test": test_known + test_unseen,
}
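The composition above can be sketched in a few lines of Python. This is only an illustration of the definitions, not the script that generated the released files; `build_fewview_set_lists` and the `groups` dictionary are hypothetical names, and each entry is assumed to be a `(sequence_name, frame_number, image_path)` tuple.

```python
def build_fewview_set_lists(groups):
    """Compose the three few-view set lists from the six frame groups.

    `groups` is a hypothetical dict mapping "train_known", "train_unseen",
    "dev_known", ... to lists of (sequence_name, frame_number, image_path).
    """
    def both(split):
        # Known source views followed by the unseen target views of one split.
        return groups[f"{split}_known"] + groups[f"{split}_unseen"]

    train = groups["train_known"]
    return {
        "set_lists_fewview_train": {
            "train": train, "val": both("train"), "test": both("train"),
        },
        "set_lists_fewview_dev": {
            "train": train, "val": both("dev"), "test": both("dev"),
        },
        "set_lists_fewview_test": {
            "train": train, "val": both("dev"), "test": both("test"),
        },
    }
```

Note that "train" is the same train_known list in all three files; only "val" and "test" change.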

For your case specifically, the train setlist of set_lists_fewview_train contains only the train_known frames, which are the ones to use for training. The val and test setlists of set_lists_fewview_train, however, contain train_known and ALSO train_unseen. This is why you see that all frames from train are also in val and test.
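If you want to confirm this on your own copy of the data, a small check along these lines should do. It is a sketch that assumes the file layout quoted in the question (a dictionary of "train"/"val"/"test" lists of `[sequence_name, frame_number, image_path]` entries); `train_overlap` is a hypothetical helper, not part of the dataset code.

```python
import json


def train_overlap(set_list_path):
    """Return, per non-train subset, the fraction of "train" entries it contains."""
    with open(set_list_path) as f:
        set_lists = json.load(f)
    # JSON stores each entry as a list; convert to tuples so they are hashable.
    frames = {
        name: {tuple(entry) for entry in entries}
        for name, entries in set_lists.items()
    }
    train = frames["train"]
    return {
        name: (len(train & other) / len(train) if train else 0.0)
        for name, other in frames.items()
        if name != "train"
    }
```

Given the definitions above, set_lists_fewview_train should report 1.0 for both "val" and "test", while set_lists_fewview_dev and set_lists_fewview_test should report 0.0.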

The "val" set also contains the "train" views because, when validating/testing, one needs access to the "known" source views (from the train set) in order to generate the unseen views. This requires both known and unseen views to live in the same set of loaded images.

Indeed, if you inspect the eval_batches files, you will discover that the first (target) frame in an eval batch is always drawn from the unseen set of frames, while the rest of the frames come from the known frames.
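A rough way to check this pattern is sketched below. It assumes you have already loaded the eval batches as a list of batches, each a list of `(sequence_name, frame_number, image_path)` entries, and that you have a lookup from `(sequence_name, frame_number)` to the frame type; both the function name and these in-memory layouts are assumptions for illustration.

```python
def find_irregular_batches(eval_batches, frame_type):
    """Return indices of batches that break the target-unseen / sources-known pattern.

    eval_batches: list of batches, each a list of
        (sequence_name, frame_number, image_path) entries (assumed layout).
    frame_type: dict mapping (sequence_name, frame_number) to a string such as
        "train_known" or "train_unseen" (assumed lookup).
    """
    irregular = []
    for index, batch in enumerate(eval_batches):
        types = [frame_type[(seq, num)] for seq, num, _path in batch]
        if not types:
            continue  # nothing to check in an empty batch
        target_is_unseen = types[0].endswith("_unseen")
        sources_are_known = all(t.endswith("_known") for t in types[1:])
        if not (target_is_unseen and sources_are_known):
            irregular.append(index)
    return irregular
```

An empty return value means every batch has an unseen target followed only by known source frames.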

In order to find out which frames are known/unseen, feel free to inspect the meta.frame_type fields in frame_annotations.jgz.
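For example, something like the following could build that lookup. It is a sketch that assumes frame_annotations.jgz is a gzip-compressed JSON list of records, each carrying `sequence_name`, `frame_number`, and a `meta` dictionary with a `frame_type` entry; `load_frame_types` is a hypothetical helper, so please adjust the field access if your copy is laid out differently.

```python
import gzip
import json
from collections import Counter


def load_frame_types(annotation_path):
    """Map (sequence_name, frame_number) to meta.frame_type for one category."""
    # Assumption: the .jgz file is plain JSON compressed with gzip.
    with gzip.open(annotation_path, "rt", encoding="utf-8") as f:
        annotations = json.load(f)
    return {
        (a["sequence_name"], a["frame_number"]): a["meta"]["frame_type"]
        for a in annotations
    }


def count_frame_types(frame_types):
    """Count how many frames fall into each frame type."""
    return Counter(frame_types.values())
```

Calling `count_frame_types(load_frame_types(path))` gives a quick overview of how many frames of a category are known and unseen in each sequence set.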

I hope this helps, let me know if further clarification is needed.

zhizdev · 2022-08-03T14:47:34Z

Thank you so much for the reply! This is super helpful!

zhizdev changed the title ~~fewview_train subset JSON contain frames that belong in all of train, val, and test sets~~ fewview_train subset JSON contain frames that belong in both of train and test sets Aug 2, 2022

zhizdev closed this as completed Aug 3, 2022

davnov134 mentioned this issue Mar 15, 2023

Data relationship difference between v1 and v2 #66

Open
