Add dataset validation step and refactoring data modules #417

Merged: 23 commits merged into dev on Apr 17, 2024

illian01
Collaborator

Description

Please include a summary of this pull request in English. If it closes an issue, please mention it here.

Closes: #253

We recommend linking at least one existing issue to the PR. Before you create a PR, please check whether there is an issue for this change.

Change(s)

  • Massive refactoring
    • Rename BaseSampler to BaseSampleLoader.
    • Move the logic shared across tasks into BaseSampleLoader.
    • Consolidate task-specific modules into one file per task.
  • Check the dataset path at the start of build_dataset.
    • For training, the train split must be provided; the test split is ignored.
    • For testing, the test split must be provided; the train and valid splits are ignored.
  • Check the loaded dataset at the end of build_dataset.
    • Log the dataset volume.
    • If a loaded dataset has no samples, raise an error.
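
The two validation steps above could look roughly like the sketch below. This is an illustration only, not the code in this PR: the helper names `check_split_paths` and `check_loaded`, the `mode` argument, and the dictionary layout of the split paths are all assumptions made for the example.

```python
import logging
from pathlib import Path
from typing import Dict, Optional, Sequence

logger = logging.getLogger(__name__)

# Hypothetical mapping (assumption): which splits are ignored in each mode.
IGNORED_SPLITS = {"train": ("test",), "test": ("train", "valid")}


def check_split_paths(mode: str, split_paths: Dict[str, Optional[str]]) -> Dict[str, Path]:
    """Validate split paths before loading and return only the splits used in this mode."""
    if mode not in IGNORED_SPLITS:
        raise ValueError(f"Unknown mode: {mode!r}")

    # The split named after the mode is mandatory ("train" for train, "test" for test).
    if split_paths.get(mode) is None:
        raise ValueError(f"The '{mode}' split must be provided when mode is '{mode}'.")

    used: Dict[str, Path] = {}
    for name, raw_path in split_paths.items():
        if raw_path is None:
            continue
        if name in IGNORED_SPLITS[mode]:
            # Ignored splits are skipped without checking that they exist.
            logger.info("Ignoring '%s' split in '%s' mode.", name, mode)
            continue
        path = Path(raw_path)
        if not path.exists():
            raise FileNotFoundError(f"Path for '{name}' split does not exist: {path}")
        used[name] = path
    return used


def check_loaded(datasets: Dict[str, Sequence]) -> None:
    """Log the size of each loaded split and fail if any split is empty."""
    for name, samples in datasets.items():
        logger.info("Loaded '%s' split with %d samples.", name, len(samples))
        if len(samples) == 0:
            raise ValueError(f"The '{name}' split has no samples.")
```

In this sketch, a `build_dataset`-style function would call `check_split_paths` first, load the splits it returns, and call `check_loaded` on the result before returning. Failing early on a missing path avoids spending time loading data for a run that cannot start.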

Code Formatting

If you open a PR against either the master or dev branch, you should follow the code linting process. Please check your code with lint_check.sh in the ./scripts directory.
For more information, please read the contribution guide in CONTRIBUTING.md.

Changelog

If your PR is accepted into the dev branch, a code owner will add a brief summary of the change to the Upcoming Release section of the CHANGELOG.md file, including a link to the PR (formatted in markdown) and a link to your GitHub profile.

For example,

- Added a new feature by `@myusername` in [PR 2023](https://github.com/Nota-NetsPresso/netspresso-trainer/pull/2023)

illian01 added labels on Apr 16, 2024: enhancement (New feature or request), refactoring (Changes code or architecture internally without changing the output.)
illian01 self-assigned this on Apr 16, 2024
illian01 marked this pull request as ready for review on April 16, 2024 09:45
illian01 merged commit c34875f into dev on Apr 17, 2024
2 checks passed
Development

Successfully merging this pull request may close these issues.

[Sprint] Adding dataset validator