Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve dataset test infrastructure #3450

Merged
merged 8 commits into from
Feb 25, 2021

Conversation

pmeier
Copy link
Collaborator

@pmeier pmeier commented Feb 24, 2021

  • Currently, we only use the default config if the user didn't specify any CONFIGS. This means on the other hand that if he specifies a config, only the specified values are forwarded to inject_fake_data.

    Since it may be the case that the user only specifies a part of available parameters and wants to leave the others on their default value, we should include them. With this PR the default config is always used as base and simply updated by the config of the user.

  • Sometimes the image or video filename is a random string. This PR adds a create_random_string() function for a convenient interface.

  • The output checking of the inject_fake_data() method takes up most of the body of the create_dataset() and thus drowning the actual important parts. In order to make the base class more maintainable, this PR moves the checking into a dedicated function.

  • This PR splits the patching of the download / extract and check logic and only makes the patching of the check logic optional. Before we might run into scenarios where the test_not_found_or_corrupted test actually tried to download the dataset. This happens for example for datasets.SBU since it uses download=True as default and in contrast to most other datasets.

@pmeier pmeier changed the title Always use default config as base for dataset tests Improve dataset test infrastructure Feb 25, 2021
@codecov
Copy link

codecov bot commented Feb 25, 2021

Codecov Report

Merging #3450 (75de680) into master (a24191e) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #3450   +/-   ##
=======================================
  Coverage   76.11%   76.11%           
=======================================
  Files         105      105           
  Lines        9697     9697           
  Branches     1556     1556           
=======================================
  Hits         7381     7381           
  Misses       1836     1836           
  Partials      480      480           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a24191e...75de680. Read the comment docs.

Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@fmassa fmassa merged commit 2e8c124 into pytorch:master Feb 25, 2021
@pmeier pmeier deleted the test-datasets-default-config branch February 25, 2021 18:31
facebook-github-bot pushed a commit that referenced this pull request Mar 4, 2021
Summary:
* always use default config as base

* fix test_all_configs decorator

* lint

* add a utility function to create a random string

* move output check of inject_fake_data to dedicated method

* always disable download and extract functionality

Reviewed By: fmassa

Differential Revision: D26756261

fbshipit-source-id: a720dce48148d4d69678d43a2c5ec50ac92d69a0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants