Add feature vector dataloader #291

xianyuanliu · 2022-01-22T14:14:42Z

Description

Update the action_dann_lightn example with a dataloader for feature vector input. This dataloader is used in the EPIC challenge.
The trainer and models for feature vectors will be updated in the next PR, after this one is merged. Coverage of videos.py will also be improved in that PR, because most of the uncovered parts are used in training.
The changes are summarized below.

Add EPIC100DatasetAccess
Add new hyperparameters: CLASS_TYPE, INPUT_TYPE and NUM_SEGMENTS
Improve video-related trainers and ClassNetVideo for multi-class output.
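
The three new hyperparameters could be grouped and validated roughly as follows. This is a minimal sketch, not the code in this PR: it uses a plain dataclass in place of the example's yacs config so that it runs with the standard library alone, and the class name `FeatureInputConfig`, the accepted values, and the defaults are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed option sets; the PR description does not list the accepted values.
ALLOWED_INPUT_TYPES = ("image", "feature")
ALLOWED_CLASS_TYPES = ("verb", "verb+noun")


@dataclass(frozen=True)
class FeatureInputConfig:
    """Hypothetical stand-in for the new hyperparameters."""

    input_type: str = "feature"  # INPUT_TYPE: raw frames or pre-extracted feature vectors
    class_type: str = "verb"     # CLASS_TYPE: single-task or multi-task labels
    num_segments: int = 1        # NUM_SEGMENTS: assumed to be segments sampled per clip

    def __post_init__(self):
        # Fail early on values the loader would not know how to handle.
        if self.input_type not in ALLOWED_INPUT_TYPES:
            raise ValueError(f"unknown INPUT_TYPE: {self.input_type!r}")
        if self.class_type not in ALLOWED_CLASS_TYPES:
            raise ValueError(f"unknown CLASS_TYPE: {self.class_type!r}")
        if self.num_segments < 1:
            raise ValueError("NUM_SEGMENTS must be a positive integer")


cfg = FeatureInputConfig(input_type="feature", class_type="verb+noun", num_segments=5)
```

Validating at construction time keeps a mistyped option from surfacing only later, inside the dataloader.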

Status

Ready

Types of changes

Non-breaking change (fix or new feature that would not break existing functionality).
Breaking change (fix or new feature that would cause existing functionality to change).
New tests added to cover the changes.
In-line docstrings updated.
Source for documentation at docs manually updated for new API.

codecov-commenter · 2022-01-23T09:47:48Z

Codecov Report

Merging #291 (2773771) into main (3706d7c) will decrease coverage by 0.39%.
The diff coverage is 80.92%.

@@            Coverage Diff             @@
##             main     #291      +/-   ##
==========================================
- Coverage   92.96%   92.56%   -0.40%     
==========================================
  Files          45       45              
  Lines        4622     4724     +102     
==========================================
+ Hits         4297     4373      +76     
- Misses        325      351      +26

Impacted Files	Coverage Δ
kale/loaddata/videos.py	`73.18% <51.85%> (-13.48%)`	⬇️
kale/loaddata/video_access.py	`97.05% <94.91%> (-0.86%)`	⬇️
kale/embed/video_feature_extractor.py	`96.66% <100.00%> (+0.05%)`	⬆️
kale/pipeline/video_domain_adapter.py	`96.93% <100.00%> (ø)`
kale/predict/class_domain_nets.py	`91.89% <100.00%> (+1.89%)`	⬆️
kale/prepdata/video_transform.py	`88.00% <100.00%> (+1.04%)`	⬆️
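
The summary line reports a decrease of 0.39%, while the diff table shows -0.40%. A small arithmetic check using only the hit and line counts from the table suggests that both figures are consistent if percentages are truncated to two decimals rather than rounded. This is an observation about these numbers, not a description of how Codecov computes its report, and the helper name `truncated_percent` is hypothetical.

```python
from fractions import Fraction


def truncated_percent(ratio: Fraction) -> str:
    """Format a non-negative ratio as a percentage truncated to two decimals."""
    hundredths = int(ratio * 10000)  # int() drops the fractional part
    return f"{hundredths // 100}.{hundredths % 100:02d}"


main_cov = Fraction(4297, 4622)  # hits / lines on main
pr_cov = Fraction(4373, 4724)    # hits / lines after merging #291

print(truncated_percent(main_cov))           # 92.96
print(truncated_percent(pr_cov))             # 92.56
print(truncated_percent(main_cov - pr_cov))  # 0.39
```

The exact difference is about 0.3986 percentage points, which truncates to 0.39, whereas subtracting the two displayed values (92.96 - 92.56) gives 0.40.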

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

shuo-zhou · 2022-01-28T20:27:22Z

examples/action_dann_lightn/config.py

@@ -16,15 +16,17 @@
 # Dataset
 # -----------------------------------------------------------------------------
 _C.DATASET = CN()
-_C.DATASET.ROOT = "I:/Datasets/EgoAction/"  # "/shared/tale2/Shared"
+_C.DATASET.ROOT = "J:/Datasets/EgoAction/"  # "/shared/tale2/Shared"


Avoid using absolute paths in examples.

xianyuanliu · 2022-02-07

It may be necessary, because the data is stored somewhere else (e.g. /Shared/data), not inside pykale.

shuo-zhou · 2022-02-07

Is it possible to use a public dataset for demonstration? Do you expect users to change the .yaml configurations and download the data themselves?

xianyuanliu · 2022-02-08

It uses public video datasets (e.g. EPIC KITCHENS), but each is large (>10 GB). Users therefore need to download and prepare the data by following the dataset instructions before running this example, and then change the configuration to match their data.

This example does not include video dataset downloading for the following reasons:

Some datasets provide tools or packages to speed up downloading, and there is no need to add these to pykale.

PyKale and the data may not be on the same hardware because of the datasets' large scale.

I may update the README to describe the dataset format. We may add video dataset downloading in the future, once we reduce dataset storage by using dataset tools (e.g. Hub).

shuo-zhou · 2022-02-09

Examples are for demonstration only and should not be run on large, full datasets. Is there a small public dataset that can be used?

haipinglu · 2022-02-09

I agree with Shuo. PyKale examples should be relatively easy to run. Large-scale data experiments can be created in a separate repo under PyKale or in your personal repo. We can discuss tomorrow.
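
One way to reconcile the two positions in this thread (no absolute path in the example, data stored outside the repository) is to resolve the dataset root at run time. The sketch below is only an illustration under stated assumptions: the function name `resolve_dataset_root`, the environment variable `PYKALE_DATA_ROOT`, and the relative default `data` are hypothetical and are not part of PyKale or of this PR.

```python
import os
from pathlib import Path


def resolve_dataset_root(default="data", env_var="PYKALE_DATA_ROOT", base_dir=None):
    """Return the dataset root, preferring an environment variable over a relative default."""
    override = os.environ.get(env_var)
    if override:
        # The user points the example at data stored elsewhere, e.g. a shared drive.
        return Path(override).expanduser()
    # Otherwise fall back to a path relative to the example (or the working directory).
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return base / default
```

With this pattern the committed config can keep a relative default, and each user sets the environment variable once instead of editing a machine-specific drive letter into the example.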

shuo-zhou · 2022-01-28T20:34:51Z

examples/action_dann_lightn/main.py

    # ---- setup dataset ----
    seed = cfg.SOLVER.SEED
-    source, target, num_classes = VideoDataset.get_source_target(
+    source, target, dict_num_classes = VideoDataset.get_source_target(


What is the reason for renaming num_classes to dict_num_classes? It is inconsistent with the variable names in other examples.

xianyuanliu · 2022-02-07

The reason is that num_classes is a dictionary covering the multi-task labels (verb and noun), which other examples do not consider. But I think there is no need to rename num_classes now. I will change it back.
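
To illustrate how a single name can cover both the single-task and the multi-task case, here is a minimal sketch. It is not the code in this PR: the helper names `as_class_dict` and `head_sizes` are hypothetical, and the class counts are illustrative values only.

```python
def as_class_dict(num_classes, default_task="verb"):
    """Return a {task: class_count} dict from either an int or a dict."""
    if isinstance(num_classes, int):
        # Single-task examples pass a plain integer.
        return {default_task: num_classes}
    # Multi-task examples pass a mapping such as {"verb": ..., "noun": ...}.
    return dict(num_classes)


def head_sizes(num_classes):
    """List (task, output_size) pairs, one per classification head."""
    return list(as_class_dict(num_classes).items())


print(head_sizes(8))                          # [('verb', 8)]
print(head_sizes({"verb": 97, "noun": 300}))  # [('verb', 97), ('noun', 300)]
```

Normalizing at the boundary lets callers keep the variable name `num_classes` used in the other examples while the classifier still receives one output size per task.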

github-actions · 2022-10-07T01:01:29Z

This pull request has been automatically marked as stale due to lack of activity.

xianyuanliu added 13 commits January 20, 2022 16:46

update .gitignore

7ccd345

update .gitignore

d955f73

change root dir

1cecdf2

add EPIC100DatasetAccess

f9d0577

change transform_kind to transform

046ef98

add NUM_SEGMENTS

77f1b0f

add INPUT_TYPE

8a8581b

add functions in VideoDatasetAccess for feature vector input

23b0e8e

add get_class_type

f993f8d

add CLASS_TYPE

60951d4

change num_classes to dict_num_classes

76f3e72

update ClassNetVideo for dual-class task

feaf72a

update test

f5bc2b7

xianyuanliu marked this pull request as draft January 22, 2022 14:14

github-actions bot added this to In progress in v0.1.0 Jan 22, 2022

xianyuanliu added the new feature and work-in-progress labels Jan 22, 2022

xianyuanliu added 10 commits January 22, 2022 22:16

Merge branch 'main' into add_feature_vector_dataloader

63c5be9

change output folder to tb_logs

f89d8fc

add get_class_type test

b845a88

update test_video_access

ef74b72

update config

b43802c

test bug fixes

ba6f5c5

add VideoFeatureRecord in Videos.py & improve doc

bdf9cbb

add epic100 test & bug fixes

3ea4678

test bug fixes

1540051

test bug fixes

de0e6cd

xianyuanliu marked this pull request as ready for review January 23, 2022 11:27

xianyuanliu requested a review from haipinglu January 23, 2022 11:27

xianyuanliu removed the work-in-progress label Jan 23, 2022

This was referenced Jan 24, 2022

Simplify video_domain_adapter #292

Open

Add a new example for video feature vector input #293

Open

xianyuanliu requested a review from shuo-zhou January 28, 2022 08:10

shuo-zhou requested changes Feb 1, 2022


xianyuanliu and others added 3 commits February 7, 2022 21:25

rename to num_classes

a95a185

change root dir

ab23896

Merge branch 'main' into add_feature_vector_dataloader

2773771

haipinglu removed this from In progress in v0.1.0 Aug 11, 2022

haipinglu added this to In progress in v0.2.0 via automation Aug 11, 2022

github-actions bot added the Stale label Oct 7, 2022
