[Feature] Support loading data from Ceph in SOT dataset class #494

JingweiZhang12 · 2022-04-13T07:07:51Z

Modification

This PR mainly contains the following:

Loading data information files in text format from Ceph in the SOT dataset class
Loading images and annotations from Ceph in the SOT dataset class

Use cases (Optional)

To load data from Ceph for SOT, modify your configs as follows:

file_client_args = dict(
    backend='petrel',
    path_mapping=dict({
        'data/got10k':
        'openmmlab:s3://openmmlab/datasets/tracking/GOT10k',
        'data/trackingnet':
        'openmmlab:s3://openmmlab/datasets/tracking/TrackingNet',
        'data/lasot':
        'openmmlab:s3://openmmlab/datasets/tracking/LaSOT_full',
        'data/coco':
        'openmmlab:s3://openmmlab/datasets/detection/coco',
        'data/ILSVRC':
        'openmmlab:s3://openmmlab/datasets/tracking/ILSVRC',
        'data/otb100':
        'openmmlab:s3://openmmlab/datasets/tracking/OTB100',
        'data/UAV123':
        'openmmlab:s3://openmmlab/datasets/tracking/UAV123',
        'data/vot2018':
        'openmmlab:s3://openmmlab/datasets/tracking/VOT2018'
    }))

train_pipeline = [
    ...
    dict(
        type='LoadMultiImagesFromFile',
        to_float32=True,
        file_client_args=file_client_args),
    dict(
        type='SeqLoadAnnotations',
        with_bbox=True,
        with_label=False,
        file_client_args=file_client_args),
    ...
]

test_pipeline = [
    dict(
        type='LoadImageFromFile',
        to_float32=True,
        file_client_args=file_client_args),
    dict(
        type='LoadAnnotations',
        with_bbox=True,
        with_label=False,
        file_client_args=file_client_args),
    ...
]

data = dict(
    ...,
    train=dict(
        type='RandomSampleConcatDataset',
        dataset_sampling_weights=[1,1],
        dataset_cfgs=[
            dict(
                type='GOT10kDataset',
                ann_file=data_root +
                'got10k/annotations/got10k_train_infos.txt',
                img_prefix=data_root + 'got10k',
                pipeline=train_pipeline,
                split='train_vot',
                test_mode=False,
                file_client_args=file_client_args),
            dict(
                type='LaSOTDataset',
                ann_file=data_root + 'lasot/annotations/lasot_train_infos.txt',
                img_prefix=data_root + 'lasot/LaSOTBenchmark',
                pipeline=train_pipeline,
                split='train',
                test_mode=False,
                file_client_args=file_client_args),
            ],
        ...),
    val=dict(
        type='GOT10kDataset',
        ann_file=data_root + 'got10k/annotations/got10k_test_infos.txt',
        img_prefix=data_root + 'got10k',
        pipeline=test_pipeline,
        split='test',
        test_mode=True,
        file_client_args=file_client_args),
    test=dict(
        type='GOT10kDataset',
        ann_file=data_root + 'got10k/annotations/got10k_test_infos.txt',
        img_prefix=data_root + 'got10k',
        pipeline=test_pipeline,
        split='test',
        test_mode=True,
        file_client_args=file_client_args))
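
The `path_mapping` entries above let the configs keep local-looking paths while the data is read from Ceph. The following is a minimal sketch of that prefix rewriting in plain Python. It is not mmcv's implementation: `map_path` is a hypothetical helper, and the example assumes `data_root = 'data/'`, which the config above does not show.

```python
# Two of the mapping entries from the config above.
path_mapping = {
    'data/got10k': 'openmmlab:s3://openmmlab/datasets/tracking/GOT10k',
    'data/lasot': 'openmmlab:s3://openmmlab/datasets/tracking/LaSOT_full',
}


def map_path(filepath, mapping):
    """Hypothetical helper: rewrite a matching local prefix to its remote one."""
    for local_prefix, remote_prefix in mapping.items():
        if filepath.startswith(local_prefix):
            return remote_prefix + filepath[len(local_prefix):]
    return filepath  # no entry matches, so the path is returned unchanged


# Assumes data_root = 'data/' (not shown in the config above).
ann_file = 'data/' + 'got10k/annotations/got10k_train_infos.txt'
remote_ann_file = map_path(ann_file, path_mapping)
print(remote_ann_file)
```

With a mapping like this, `ann_file` and `img_prefix` can stay identical between local and Ceph runs; only `file_client_args` changes.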

JingweiZhang12 · 2022-04-13T07:09:37Z

I'll verify loading the text-format information files from Ceph during training and testing later.

codecov · 2022-04-13T08:01:09Z

Codecov Report

Merging #494 (63702a4) into master (c6e1e85) will increase coverage by 0.03%.
The diff coverage is 86.36%.

@@            Coverage Diff             @@
##           master     #494      +/-   ##
==========================================
+ Coverage   73.15%   73.19%   +0.03%     
==========================================
  Files         126      126              
  Lines        7306     7361      +55     
  Branches     1378     1379       +1     
==========================================
+ Hits         5345     5388      +43     
- Misses       1536     1545       +9     
- Partials      425      428       +3

Flag         Coverage Δ
unittests    73.16% <86.36%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files                                  Coverage Δ
mmtrack/datasets/otb_dataset.py                 55.81% <42.85%> (ø)
mmtrack/datasets/trackingnet_dataset.py         78.87% <85.71%> (ø)
mmtrack/datasets/base_sot_dataset.py            87.67% <100.00%> (+0.80%) ⬆️
mmtrack/datasets/got10k_dataset.py              87.67% <100.00%> (ø)
mmtrack/datasets/lasot_dataset.py               100.00% <100.00%> (ø)
mmtrack/datasets/sot_coco_dataset.py            100.00% <100.00%> (ø)
mmtrack/datasets/sot_imagenet_vid_dataset.py    100.00% <100.00%> (ø)
mmtrack/datasets/uav123_dataset.py              100.00% <100.00%> (ø)
mmtrack/datasets/vot_dataset.py                 91.66% <100.00%> (ø)
mmtrack/models/reid/base_reid.py                78.26% <0.00%> (-8.11%) ⬇️
... and 18 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c6e1e85...63702a4.

GT9505

Please make sure that we support both the disk backend and the petrel backend in the SOT dataset classes.
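
One way to exercise both backends from a config is to switch only `file_client_args` and leave the dataset and pipeline entries untouched. This is a minimal sketch, assuming the backend names `disk` and `petrel`; the `use_ceph` flag is hypothetical and not part of this PR.

```python
use_ceph = False  # hypothetical switch; set to True where Ceph is reachable

if use_ceph:
    # Remote storage: local-looking prefixes are mapped to Ceph locations.
    file_client_args = dict(
        backend='petrel',
        path_mapping={
            'data/got10k':
            'openmmlab:s3://openmmlab/datasets/tracking/GOT10k',
        })
else:
    # Local files: no path mapping is needed.
    file_client_args = dict(backend='disk')
```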

GT9505 · 2022-04-13T11:14:59Z

mmtrack/datasets/got10k_dataset.py

+        data_infos_str = self.file_client.get_text(
+            self.ann_file).strip().split('\n')


Can we use the self.loadtxt() API?

I didn't use the self.loadtxt() API here for efficiency reasons.
self.loadtxt() is suited to loading a whole txt file as a NumPy array.

I have readjusted it.
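
To illustrate the approach in the diff above: an information file that mixes paths and integers can be split into records directly from the text returned by the file client, without a NumPy parse. This is a sketch only. The in-memory `FakeTextClient`, the header line, and the comma-separated record layout are assumptions made for illustration and are not taken from this PR.

```python
class FakeTextClient:
    """Hypothetical stand-in for a file client that exposes get_text()."""

    def __init__(self, files):
        self._files = files  # maps path -> file content as str

    def get_text(self, filepath):
        return self._files[filepath]


def load_info_lines(file_client, ann_file, skip_header=True):
    """Return the non-empty record lines of a text information file."""
    lines = file_client.get_text(ann_file).strip().split('\n')
    if skip_header:
        lines = lines[1:]
    return [line.strip() for line in lines if line.strip()]


# Assumed layout: one header line, then one comma-separated record per video.
client = FakeTextClient({
    'infos.txt': ('video_path,ann_path,start,end\n'
                  'vid_a,vid_a/gt.txt,1,120\n'
                  'vid_b,vid_b/gt.txt,1,80\n')
})
records = [line.split(',') for line in load_info_lines(client, 'infos.txt')]
print(records)
```

Because the fields stay as strings, paths and frame indices can share one line, and only the numeric fields need converting afterwards.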

mmtrack/datasets/base_sot_dataset.py

JingweiZhang12 · 2022-04-15T12:28:29Z

I have verified loading from Ceph for all SOT datasets. It's OK.

GT9505 · 2022-04-15T12:26:01Z

mmtrack/datasets/sot_coco_dataset.py

+        if 'file_client_args' in kwargs and kwargs['file_client_args'][
+                'backend'] != 'disk':
+            self.file_client = mmcv.FileClient(**kwargs['file_client_args'])
+            with self.file_client.get_local_path(ann_file) as local_path:
+                self.coco = COCO(local_path)
+        else:
+            self.coco = COCO(ann_file)
        super().__init__(*args, **kwargs)


This is too ugly. Can we keep only lines 28-30?

If we keep only lines 28-30, it will create a temporary file and read from it even when reading local files.

I think that is acceptable, and mmdet does it the same way.

OK, I'll modify it right now.
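
The simplification agreed on here is to go through `get_local_path` for every backend instead of branching on the backend name. The sketch below shows the shape of that pattern with hypothetical stand-in clients in place of `mmcv.FileClient` and `json.load` in place of `COCO`, so it runs without extra dependencies. The bucket path is made up, and in this sketch the local client simply yields the original path.

```python
import json
import os
import tempfile
from contextlib import contextmanager


class LocalClient:
    """Hypothetical disk-like client: the file is already local."""

    @contextmanager
    def get_local_path(self, filepath):
        yield filepath


class RemoteClient:
    """Hypothetical remote-like client: copies the bytes to a temp file."""

    def __init__(self, store):
        self._store = store  # maps remote path -> bytes

    @contextmanager
    def get_local_path(self, filepath):
        fd, tmp_path = tempfile.mkstemp(suffix='.json')
        try:
            with os.fdopen(fd, 'wb') as f:
                f.write(self._store[filepath])
            yield tmp_path
        finally:
            os.remove(tmp_path)  # clean up once the caller is done


def load_annotations(file_client, ann_file):
    """Single code path for both clients; no backend check is needed."""
    with file_client.get_local_path(ann_file) as local_path:
        with open(local_path) as f:
            return json.load(f)


remote = RemoteClient({'s3://bucket/ann.json': b'{"images": []}'})
print(load_annotations(remote, 's3://bucket/ann.json'))
```

The caller never inspects the backend, which is what removes the `if`/`else` from the dataset constructor.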

GT9505 · 2022-04-15T12:26:15Z

mmtrack/datasets/sot_imagenet_vid_dataset.py

+            with self.file_client.get_local_path(ann_file) as local_path:
+                self.coco = CocoVID(local_path)
+        else:
+            self.coco = CocoVID(ann_file)


JingweiZhang12 added 14 commits April 3, 2022 01:33

add lasot got trackingnet and vot data parsing

bd07704

add docs

8a3dadd

update SOT dataset class and configs for the new info_file

3afc059

update docs, configs and fix some bugs

fd5ef98

update unit test

a91d204

del zip files in unit test

e165fec

fix path error in win

01c7653

fix path of unit test in win

568071e

Merge branch 'offline_ann' of github.com:JingweiZhang12/mmtracking in…

d0cb44e

…to sot_ceph

support ceph for SOT

aa33d0d

merge from upstream master

f0955fe

update docs

3e8698a

fix comflict

e8ef790

support reading images in vot eavluation

d1ff81b

JingweiZhang12 requested a review from GT9505 April 13, 2022 07:11

fix lint

f173043

GT9505 added the Awaiting response label Apr 13, 2022

GT9505 reviewed Apr 13, 2022

JingweiZhang12 added 3 commits April 13, 2022 19:43

update coco and imagenetvid

5a55932

fix docs and adjust severals

34f4994

adjust bugs about self.laodtxt

1a69be3

JingweiZhang12 requested a review from GT9505 April 15, 2022 08:40

fix gen_vot_infos path bug

c8329d7

GT9505 reviewed Apr 15, 2022

JingweiZhang12 added 2 commits April 15, 2022 20:55

simplify path parsing for sot_coco and sot_imagenetvid

3710bfa

fix bugs in sot_coco and sot_imagenetvid

7995385

fix readme.md in stark configs

63702a4

JingweiZhang12 requested a review from GT9505 April 16, 2022 10:06

GT9505 approved these changes Apr 16, 2022

GT9505 merged commit 503756c into open-mmlab:master Apr 16, 2022

GT9505 mentioned this pull request Apr 19, 2022

[Fix] Fix an important bug about loading lasot dataset #480

Closed

JingweiZhang12 deleted the sot_ceph branch October 13, 2022 11:49
