
Add feature with_offset for rawframe dataset #48

Merged 11 commits on Jul 29, 2020

Conversation

kennymckormick
Member

Add the feature 'with_offset' for the rawframe dataset; this feature is needed to train recognition models on untrimmed datasets.

@codecov

codecov bot commented Jul 23, 2020

Codecov Report

Merging #48 into master will increase coverage by 0.13%.
The diff coverage is 93.15%.


@@            Coverage Diff             @@
##           master      #48      +/-   ##
==========================================
+ Coverage   84.87%   85.00%   +0.13%     
==========================================
  Files          73       73              
  Lines        3827     3889      +62     
  Branches      618      629      +11     
==========================================
+ Hits         3248     3306      +58     
- Misses        480      482       +2     
- Partials       99      101       +2     
Flag Coverage Δ
#unittests 85.00% <93.15%> (+0.13%) ⬆️


Impacted Files Coverage Δ
mmaction/models/backbones/resnet3d.py 89.69% <76.47%> (-0.89%) ⬇️
mmaction/datasets/rawframe_dataset.py 87.36% <95.23%> (+2.00%) ⬆️
mmaction/datasets/pipelines/loading.py 93.13% <100.00%> (+0.04%) ⬆️
mmaction/models/backbones/resnet3d_slowfast.py 87.97% <100.00%> (+0.07%) ⬆️
mmaction/models/backbones/resnet_tsm.py 95.08% <100.00%> (+1.67%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7dc99cf...cedfbd2.

@@ -0,0 +1,2 @@
test_imgs 2 5 127
Contributor

are you defining a new dataset annotation format?

Contributor

what does each column mean?

Member Author

I added an optional format for ann_file: instead of 'frame_dir num_frame label[s]', it can be 'frame_dir start_idx num_frame label[s]'. The code remains compatible with the original format.
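A minimal sketch of how such a line could be parsed, supporting both formats. The helper name and dictionary keys here are illustrative, not the actual RawframeDataset implementation:

```python
def parse_rawframe_line(line, with_offset=False):
    """Parse one rawframe annotation line.

    Original format:     frame_dir num_frame label[s]
    With-offset format:  frame_dir start_idx num_frame label[s]
    """
    parts = line.split()
    info = {'frame_dir': parts[0]}
    if with_offset:
        info['offset'] = int(parts[1])
        info['total_frames'] = int(parts[2])
        labels = [int(x) for x in parts[3:]]
    else:
        info['total_frames'] = int(parts[1])
        labels = [int(x) for x in parts[2:]]
    # Multi-label samples keep a list; single-label samples keep an int.
    info['label'] = labels if len(labels) > 1 else labels[0]
    return info

# The example line from the test annotation file in the diff above:
print(parse_rawframe_line('test_imgs 2 5 127', with_offset=True))
# {'frame_dir': 'test_imgs', 'offset': 2, 'total_frames': 5, 'label': 127}
```

With `with_offset=False` the same helper handles the original three-column format, which is what keeps the two formats compatible.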

Contributor

What does num_frame mean in this context: the length of the video or the length of the segment?

Member Author

The length of the video clip (which is part of the entire untrimmed video).

Contributor

I think we need a doc page to describe all possible annotation formats we used. @dreamerlin

Collaborator

We can prepare an annotation_description.md in docs/ to describe all possible annotation formats.

@innerlee
Contributor

innerlee commented Jul 24, 2020

I'm thinking about a short descriptor for the annotation file, resembling the channel-order description.

Say we let

  • X: video
  • Y: label
  • S: snippet start index (begins with 0; this index is included)
  • D: duration of snippet
  • L: length of video

Then this can be represented as " XSDY". Note the leading space, which represents the delimiter.
Other formats could be ",XLY", meaning comma-delimited, giving video name, video length, and label.

For one-based frame datasets, one can add an extra column S=1, with format " VSDL".
We can deprecate the many options like start_index, etc.

cc. @dreamerlin @hellock @kennymckormick

and @JoannaLXY , could this fit into localization tasks? Anything else needed to be added?
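A rough sketch of how such a descriptor string could drive a generic parser. The function name and field names are hypothetical, invented for illustration:

```python
# Sketch of parsing a format descriptor such as " XSDY" or ",XLY".
# The first character is the delimiter; the remaining characters map
# columns, in order, to named fields.
FIELD_NAMES = {
    'X': 'video',
    'Y': 'label',
    'S': 'start_index',
    'D': 'duration',
    'L': 'length',
}

def parse_with_descriptor(line, descriptor):
    delim, columns = descriptor[0], descriptor[1:]
    values = line.split(delim)
    record = {}
    for code, value in zip(columns, values):
        name = FIELD_NAMES[code]
        # The video column stays a string; all other columns are integers.
        record[name] = value if code == 'X' else int(value)
    return record

print(parse_with_descriptor('test_imgs 2 5 127', ' XSDY'))
# {'video': 'test_imgs', 'start_index': 2, 'duration': 5, 'label': 127}
```

One descriptor string per annotation file would then replace the separate with_offset, start_index, etc. switches.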

@dreamerlin dreamerlin changed the title Hd/with offset Add feature with_offset for rawframe dataset Jul 24, 2020
@kennymckormick
Member Author

> I'm thinking about a short descriptor for the annotation file, resembling the channel-order description. […] We can deprecate the many options like start_index, etc.

That's a good idea. Where should we place the description?

@innerlee
Contributor

Proposal 2

On second thought, maybe there is an easier solution: unify all recognition (single-class) annotations to
video start_index length class

This should cover all use cases, right?

@kennymckormick
Member Author

> Proposal 2: maybe there is an easier solution, which is to unify all recognition (single-class) annotations to `video start_index length class`. This should cover all use cases, right?

I think the current design is OK since:

  1. To unify all recognition (single-class) annotations, you would need to change a lot of code: data preprocessing, datasets, etc.
  2. The current code is both forward and backward compatible, so we don't need extra effort to unify annotations. Also consider this case: we may need to add many unforeseen features in the future. For example, to support HVU, we need a dataset that supports multiple attributes per sample (action class, scene class, sentiment class, etc.). At that point, the effort made now may become worthless.

@innerlee
Contributor

The current design is not ideal because it stacks many switches. What's worse, those switches are not independent of each other.

A better design is needed. Any idea?

@kennymckormick
Member Author

@hellock @dreamerlin Any good ideas about annotation design for recognition? I think a JSON file containing a list of dictionaries (each representing one sample) is a good choice. Each dictionary would look like dict(frame_dir='path', num_frame=100, label=1), etc. This design is good because it can easily be extended to fit future needs.
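A small sketch of what such a JSON annotation could look like. The file contents and key names below are hypothetical, chosen to mirror the dict example above:

```python
import json

# Hypothetical JSON annotation: a list of dicts, one per sample.
# New fields (e.g. HVU-style multi-attribute labels) can be added to a
# dict without breaking readers that only know the older keys.
ann = '''
[
  {"frame_dir": "test_imgs", "offset": 2, "total_frames": 5, "label": 127},
  {"frame_dir": "other_clip", "total_frames": 300, "label": 4}
]
'''
samples = json.loads(ann)
for sample in samples:
    # Optional keys are simply absent; readers fall back to a default.
    print(sample['frame_dir'], sample.get('offset', 0), sample['label'])
```

Compared with a fixed-column txt list, the extensibility comes for free: every sample carries its own named fields, so optional columns never shift positions.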

@hellock
Member

hellock commented Jul 26, 2020

I suggest refactoring as early as possible. Once there are more users, backward-compatibility issues will become technical debt. We can define a primary annotation format described in a JSON file, which will be more extensible.

@kennymckormick
Member Author

> I suggest refactoring as early as possible. Once there are more users, backward-compatibility issues will become technical debt. We can define a primary annotation format described in a JSON file, which will be more extensible.

You mean we can support both txt lists and JSON, while using JSON as the primary format?

@innerlee
Contributor

Let's merge it for now and leave the unification of data annotations to the future.

@innerlee innerlee merged commit a9aa5f8 into open-mmlab:master Jul 29, 2020
@kennymckormick kennymckormick deleted the hd/with_offset branch September 7, 2020 09:07