Add feature vector dataloader #291
base: main
Codecov Report
@@ Coverage Diff @@
## main #291 +/- ##
==========================================
- Coverage 92.96% 92.56% -0.40%
==========================================
Files 45 45
Lines 4622 4724 +102
==========================================
+ Hits 4297 4373 +76
- Misses 325 351 +26
@@ -16,15 +16,17 @@
# Dataset
# -----------------------------------------------------------------------------
_C.DATASET = CN()
-_C.DATASET.ROOT = "I:/Datasets/EgoAction/" # "/shared/tale2/Shared"
+_C.DATASET.ROOT = "J:/Datasets/EgoAction/" # "/shared/tale2/Shared"
Avoid using absolute paths in examples.
It may be necessary because the data is stored somewhere else (e.g. /Shared/data), not in pykale.
Is it possible to use a public dataset for demonstration? Do you expect users to change the .yaml configurations and download the data themselves?
It uses public video datasets (e.g. EPIC-KITCHENS), but each is large (>10 GB). Users therefore need to download and prepare the data following each dataset's instructions before running this example, and then adjust the configuration to match their data.
This example does not include video dataset downloading for the following reasons:
- Some datasets provide their own tools or packages to speed up downloading, and there is no need to add these to PyKale.
- PyKale and the data may not be on the same hardware because the datasets are so large.
I may update the README to describe the dataset format. We may add video dataset downloading in the future, once we reduce dataset storage by using dataset tools (e.g. Hub).
Examples are for demonstration only and should not be run on a large, full dataset. Is there a small public dataset that can be used?
I agree with Shuo. PyKale examples should be relatively easy to run. Large-scale data experiments can be created in a separate repo under PyKale or in your personal repo. We can discuss tomorrow.
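One way to keep an absolute path out of the example config, sketched here as an assumption and not as PyKale's actual approach, is to resolve the dataset root at run time: an explicit config value wins, then an environment variable, then a relative default. The names `resolve_dataset_root`, `KALE_DATA_ROOT`, and the `data/EgoAction` default are hypothetical.

```python
import os
from pathlib import Path

# Hypothetical relative default, resolved against the directory the example runs from.
DEFAULT_ROOT = Path("data") / "EgoAction"


def resolve_dataset_root(configured_root=None, env=None):
    """Pick the dataset root: explicit config value, then env var, then relative default."""
    env = os.environ if env is None else env
    if configured_root:
        # A value set in the user's own .yaml overrides everything else.
        return Path(configured_root)
    if env.get("KALE_DATA_ROOT"):
        # Hypothetical environment variable for data stored outside the repo.
        return Path(env["KALE_DATA_ROOT"])
    return DEFAULT_ROOT
```

With this shape the committed config can leave the root empty, and users whose data lives elsewhere (e.g. /Shared/data) set the variable or their own .yaml without editing the example.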
examples/action_dann_lightn/main.py
Outdated
# ---- setup dataset ----
seed = cfg.SOLVER.SEED
-source, target, num_classes = VideoDataset.get_source_target(
+source, target, dict_num_classes = VideoDataset.get_source_target(
What is the reason for renaming num_classes as dict_num_classes? It is not consistent with the variable names in other examples.
The reason is that num_classes is a dictionary holding multi-task labels (verb and noun), which other examples do not consider. But I think there is no need to rename num_classes now. I will change it back.
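To illustrate how one name could serve both cases, here is a minimal sketch that normalizes num_classes so single-task examples (an int) and the multi-task case (a dict with verb and noun) look the same downstream. The helper `as_class_dict`, the `"action"` default key, and the class counts are hypothetical and not taken from PyKale or EPIC-KITCHENS.

```python
def as_class_dict(num_classes, default_task="action"):
    """Normalize num_classes so int (single-task) and dict (multi-task) look alike."""
    if isinstance(num_classes, dict):
        # Multi-task case: copy so callers cannot mutate the original mapping.
        return dict(num_classes)
    # Single-task case: wrap the count under a default task name.
    return {default_task: int(num_classes)}


# Placeholder class counts, for illustration only.
multi_task = as_class_dict({"verb": 8, "noun": 5})
single_task = as_class_dict(10)
```

Code that builds classifier heads can then iterate over the dictionary items in both cases.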
This pull request has been automatically marked as stale due to lack of activity.
Description
Update action_dann_lightn example with a dataloader of feature vector input. This dataloader is used in the EPIC challenge.
The trainer and models for feature vectors will be updated in the next PR after this one is merged. Coverage in video.py will be improved in the next PR because most of its parts are used in training.
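For readers unfamiliar with feature vector input, the following is a rough sketch of the idea and not the code in this PR. It assumes precomputed feature vectors paired with verb and noun labels, and it follows the map-style dataset protocol (`__len__` and `__getitem__`) that PyTorch's `DataLoader` accepts, so no torch import is needed here. The class name `FeatureVectorDataset` and the toy data are hypothetical.

```python
class FeatureVectorDataset:
    """Map-style dataset over precomputed feature vectors (hypothetical sketch)."""

    def __init__(self, features, verb_labels, noun_labels):
        if not (len(features) == len(verb_labels) == len(noun_labels)):
            raise ValueError("features and labels must have the same length")
        self.features = features
        self.verb_labels = verb_labels
        self.noun_labels = noun_labels

    def __len__(self):
        return len(self.features)

    def __getitem__(self, index):
        # Return one feature vector together with its (verb, noun) label pair.
        return self.features[index], (self.verb_labels[index], self.noun_labels[index])


# Toy data: three 4-dimensional feature vectors with placeholder labels.
toy = FeatureVectorDataset(
    features=[[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]],
    verb_labels=[0, 1, 2],
    noun_labels=[3, 4, 5],
)
```

Loading features instead of raw video frames keeps each sample small, which may also help with the example-size concern raised in the review.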
Changes are summarized below.
Status
Ready
Types of changes
docs
manually updated for new API.