[Feature] Support loading data from Ceph in SOT dataset class #494

Merged: 22 commits, Apr 16, 2022

JingweiZhang12
Collaborator

Modification

This PR mainly contains the following:

  1. Loading data information files in text format from Ceph in the SOT dataset classes
  2. Loading images and annotations from Ceph in the SOT dataset classes

Use cases (Optional)

To load SOT data from Ceph, modify your config as follows:

file_client_args = dict(
    backend='petrel',
    path_mapping=dict({
        'data/got10k':
        'openmmlab:s3://openmmlab/datasets/tracking/GOT10k',
        'data/trackingnet':
        'openmmlab:s3://openmmlab/datasets/tracking/TrackingNet',
        'data/lasot':
        'openmmlab:s3://openmmlab/datasets/tracking/LaSOT_full',
        'data/coco':
        'openmmlab:s3://openmmlab/datasets/detection/coco',
        'data/ILSVRC':
        'openmmlab:s3://openmmlab/datasets/tracking/ILSVRC',
        'data/otb100':
        'openmmlab:s3://openmmlab/datasets/tracking/OTB100',
        'data/UAV123':
        'openmmlab:s3://openmmlab/datasets/tracking/UAV123',
        'data/vot2018':
        'openmmlab:s3://openmmlab/datasets/tracking/VOT2018'
    }))

train_pipeline = [
    ...
    dict(
        type='LoadMultiImagesFromFile',
        to_float32=True,
        file_client_args=file_client_args),
    dict(
        type='SeqLoadAnnotations',
        with_bbox=True,
        with_label=False,
        file_client_args=file_client_args),
    ...
]

test_pipeline = [
    dict(
        type='LoadImageFromFile',
        to_float32=True,
        file_client_args=file_client_args),
    dict(
        type='LoadAnnotations',
        with_bbox=True,
        with_label=False,
        file_client_args=file_client_args),
    ...
]

data = dict(
    ...,
    train=dict(
        type='RandomSampleConcatDataset',
        dataset_sampling_weights=[1, 1],
        dataset_cfgs=[
            dict(
                type='GOT10kDataset',
                ann_file=data_root +
                'got10k/annotations/got10k_train_infos.txt',
                img_prefix=data_root + 'got10k',
                pipeline=train_pipeline,
                split='train_vot',
                test_mode=False,
                file_client_args=file_client_args),
            dict(
                type='LaSOTDataset',
                ann_file=data_root + 'lasot/annotations/lasot_train_infos.txt',
                img_prefix=data_root + 'lasot/LaSOTBenchmark',
                pipeline=train_pipeline,
                split='train',
                test_mode=False,
                file_client_args=file_client_args),
        ],
        ...),
    val=dict(
        type='GOT10kDataset',
        ann_file=data_root + 'got10k/annotations/got10k_test_infos.txt',
        img_prefix=data_root + 'got10k',
        pipeline=test_pipeline,
        split='test',
        test_mode=True,
        file_client_args=file_client_args),
    test=dict(
        type='GOT10kDataset',
        ann_file=data_root + 'got10k/annotations/got10k_test_infos.txt',
        img_prefix=data_root + 'got10k',
        pipeline=test_pipeline,
        split='test',
        test_mode=True,
        file_client_args=file_client_args))
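
As a rough illustration of what path_mapping is for: the petrel backend is given a table that turns the local-looking prefixes used in the config (such as data/got10k) into Ceph URIs. The sketch below imitates that rewrite with a simple prefix swap. map_path is a hypothetical helper written for this example, not mmcv's API, the example assumes data_root is 'data/', and the real backend may differ in details.

```python
# Hypothetical helper for illustration only; not part of mmcv or mmtrack.
def map_path(filepath, path_mapping):
    """Rewrite the first matching local prefix in `filepath` to its remote URI."""
    for local_prefix, remote_prefix in path_mapping.items():
        if filepath.startswith(local_prefix):
            return remote_prefix + filepath[len(local_prefix):]
    # No prefix matched: the path is returned unchanged.
    return filepath


# Two entries copied from the config above.
path_mapping = {
    'data/got10k': 'openmmlab:s3://openmmlab/datasets/tracking/GOT10k',
    'data/lasot': 'openmmlab:s3://openmmlab/datasets/tracking/LaSOT_full',
}

# Assumes data_root == 'data/', so ann_file resolves to this local-style path.
mapped = map_path('data/got10k/annotations/got10k_train_infos.txt',
                  path_mapping)
print(mapped)
```

With a mapping like this, the dataset and pipeline entries can keep their data_root-relative paths, and only file_client_args decides where the bytes actually come from.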

@JingweiZhang12
Collaborator Author

JingweiZhang12 commented Apr 13, 2022

I'll verify loading the text-format information files from Ceph during training and testing later.

@codecov

codecov bot commented Apr 13, 2022

Codecov Report

Merging #494 (63702a4) into master (c6e1e85) will increase coverage by 0.03%.
The diff coverage is 86.36%.

@@            Coverage Diff             @@
##           master     #494      +/-   ##
==========================================
+ Coverage   73.15%   73.19%   +0.03%     
==========================================
  Files         126      126              
  Lines        7306     7361      +55     
  Branches     1378     1379       +1     
==========================================
+ Hits         5345     5388      +43     
- Misses       1536     1545       +9     
- Partials      425      428       +3     
Flag       Coverage Δ
unittests  73.16% <86.36%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files                                Coverage Δ
mmtrack/datasets/otb_dataset.py               55.81% <42.85%> (ø)
mmtrack/datasets/trackingnet_dataset.py       78.87% <85.71%> (ø)
mmtrack/datasets/base_sot_dataset.py          87.67% <100.00%> (+0.80%) ⬆️
mmtrack/datasets/got10k_dataset.py            87.67% <100.00%> (ø)
mmtrack/datasets/lasot_dataset.py             100.00% <100.00%> (ø)
mmtrack/datasets/sot_coco_dataset.py          100.00% <100.00%> (ø)
mmtrack/datasets/sot_imagenet_vid_dataset.py  100.00% <100.00%> (ø)
mmtrack/datasets/uav123_dataset.py            100.00% <100.00%> (ø)
mmtrack/datasets/vot_dataset.py               91.66% <100.00%> (ø)
mmtrack/models/reid/base_reid.py              78.26% <0.00%> (-8.11%) ⬇️
... and 18 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c6e1e85...63702a4.

@GT9505 GT9505 left a comment

Please make sure that we support both the disk backend and the petrel backend in the SOT datasets.
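
For comparison, a config that keeps reading from local disk would presumably only swap the backend, assuming the dataset classes and pipeline steps accept the same file_client_args argument in both cases. 'disk' is the name mmcv's FileClient uses for its local-disk backend; the rest of the config in the PR description would stay unchanged.

```python
# Local-disk counterpart of the petrel config in the PR description.
# No path_mapping is needed because the configured paths are already local.
file_client_args = dict(backend='disk')
```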

Comment on lines 47 to 48
data_infos_str = self.file_client.get_text(
    self.ann_file).strip().split('\n')

Collaborator

Can we use the self.loadtxt() API?

Collaborator Author

I didn't use the self.loadtxt() API for efficiency reasons.
self.loadtxt() is suited to loading a whole txt file into a NumPy array.

Collaborator Author

I have readjusted it.
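
To make the thread above concrete, here is a minimal sketch of reading a text information file through a client that exposes get_text, as in the snippet under review. LocalTextClient is a hypothetical stand-in so the example runs without mmcv or Ceph, and the two-column comma-separated layout of the file is an assumption for illustration, not the actual mmtrack format.

```python
import os
import tempfile


class LocalTextClient:
    """Hypothetical stand-in for a file client; it only reads local files."""

    def get_text(self, filepath, encoding='utf-8'):
        with open(filepath, encoding=encoding) as f:
            return f.read()


def load_info_lines(client, ann_file):
    """Read the whole file once and split it into non-empty lines."""
    text = client.get_text(ann_file)
    return [line for line in text.strip().split('\n') if line]


# Write a tiny info file (assumed layout: video_path,ann_path) and parse it.
with tempfile.TemporaryDirectory() as tmp_dir:
    ann_file = os.path.join(tmp_dir, 'infos.txt')
    with open(ann_file, 'w', encoding='utf-8') as f:
        f.write('video_a,video_a/gt.txt\nvideo_b,video_b/gt.txt\n')
    infos = [
        line.split(',')
        for line in load_info_lines(LocalTextClient(), ann_file)
    ]

print(infos)
```

Because the dataset code only depends on get_text, the same parsing would work with any backend that returns the file content as a string.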

@JingweiZhang12
Collaborator Author

I have verified loading from Ceph for all SOT datasets. It's OK.

Comment on lines 26 to 33
if 'file_client_args' in kwargs and kwargs['file_client_args'][
        'backend'] != 'disk':
    self.file_client = mmcv.FileClient(**kwargs['file_client_args'])
    with self.file_client.get_local_path(ann_file) as local_path:
        self.coco = COCO(local_path)
else:
    self.coco = COCO(ann_file)
super().__init__(*args, **kwargs)

Collaborator

This is too ugly. Can we keep only lines 28-30?

Collaborator Author

If we keep only lines 28-30, it will create a temporary file and read from it even when reading local files.

Collaborator

I think that is acceptable; mmdet does it the same way.

Collaborator Author

OK, I'll modify it right now.
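
The simplification agreed on here, always going through get_local_path regardless of backend, can be sketched as follows. StubClient and load_annotations are hypothetical stand-ins so the example runs without mmcv or pycocotools, and 's3://bucket/ann.json' is a made-up path. In this stub the disk case yields the original path and the remote case copies into a temporary file; that is an assumption for illustration, not a statement about mmcv's implementation.

```python
import json
import os
import tempfile
from contextlib import contextmanager


class StubClient:
    """Hypothetical file client; `remote_files` fakes a remote store."""

    def __init__(self, backend='disk', remote_files=None):
        self.backend = backend
        self.remote_files = remote_files or {}

    @contextmanager
    def get_local_path(self, filepath):
        if self.backend == 'disk':
            # Already local: hand the path back unchanged.
            yield filepath
            return
        # "Remote": materialize the content in a temporary file, then clean up.
        fd, tmp_path = tempfile.mkstemp(suffix='.json')
        try:
            with os.fdopen(fd, 'w', encoding='utf-8') as f:
                f.write(self.remote_files[filepath])
            yield tmp_path
        finally:
            os.remove(tmp_path)


def load_annotations(client, ann_file):
    """Single code path for every backend, standing in for COCO(local_path)."""
    with client.get_local_path(ann_file) as local_path:
        with open(local_path, encoding='utf-8') as f:
            return json.load(f)


remote = StubClient(
    backend='petrel',
    remote_files={'s3://bucket/ann.json': '{"images": []}'})
print(load_annotations(remote, 's3://bucket/ann.json'))
```

The trade-off discussed in the thread is visible here: one branch-free call site in the dataset class, with any per-backend cost hidden inside the client.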

    with self.file_client.get_local_path(ann_file) as local_path:
        self.coco = CocoVID(local_path)
else:
    self.coco = CocoVID(ann_file)

Collaborator

Same here.

@GT9505 GT9505 merged commit 503756c into open-mmlab:master Apr 16, 2022
@JingweiZhang12 JingweiZhang12 deleted the sot_ceph branch October 13, 2022 11:49