Fix invalid parsing for old export cache cleanup calls #8039
Review skipped: auto incremental reviews are disabled on this repository.
Walkthrough
The recent modifications update the CVAT application's dataset manager tests and the utility code that handles export paths. The tests now cover new export path scenarios, and the utility functions parse export path details more reliably, including timestamps and format tags.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- cvat/apps/dataset_manager/tests/test_rest_api_formats.py (4 hunks)
- cvat/apps/dataset_manager/util.py (2 hunks)
Additional comments not posted (3)
cvat/apps/dataset_manager/util.py (2)
220-221: The use of the walrus operator (:=) here is a modern Python feature that improves readability by combining the assignment and condition in a single expression.
198-198: Ensure the regex pattern used here handles both the old and new filename formats correctly.
cvat/apps/dataset_manager/tests/test_rest_api_formats.py (1)
Line range hint 1975-2015: The test simulates an old file path and uses backward compatibility to ensure that cleanup tasks can still be executed on files from previous versions. This is crucial for avoiding disruptions in environments where long-running tasks might still be using older versions of the API.
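The walrus-based check mentioned above could look roughly like the following. This is a minimal sketch, not the code at lines 220-221: the pattern is a simplified fragment assembled from the hunks in this PR, the extension is assumed to be stripped beforehand, and `parse_basename` is a hypothetical helper name.

```python
import re

# Hypothetical, simplified pattern built only from fragments quoted in this PR.
# The real pattern in util.py contains more components than shown here.
_NAME_RE = re.compile(
    r'(?P<export_mode>dataset|annotations)'
    r'(?:-instance(?P<instance_timestamp>\d+\.\d+)-|_)'
    r'(?P<format_tag>.+)'
)


def parse_basename(basename: str) -> dict:
    """Parse an extension-less export file name (hypothetical helper)."""
    # The walrus operator assigns the match object and tests it in one expression.
    if (match := _NAME_RE.fullmatch(basename)) is None:
        raise ValueError(f"Couldn't parse filename components in '{basename}'")
    return match.groupdict()
```

With this sketch, an old-style name such as annotations_cvat-for-images-11 yields no instance_timestamp, while a name that carries neither separator raises the ValueError quoted in the PR description.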
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## develop #8039 +/- ##
===========================================
- Coverage 83.65% 83.63% -0.03%
===========================================
Files 383 383
Lines 40440 40440
Branches 3815 3815
===========================================
- Hits 33830 33821 -9
- Misses 6610 6619 +9
cvat/apps/dataset_manager/util.py
Outdated
@@ -195,7 +195,7 @@ def parse_export_file_path(file_path: os.PathLike[str]) -> ParsedExportFilename:
     (
         r'(?P<export_mode>dataset|annotations)'
         r'(?:-instance(?P<instance_timestamp>\d+\.\d+))?' # optional for backward compatibility
-        r'-(?P<format_tag>.+)'
+        r'(?(instance_timestamp)-|_)(?P<format_tag>.+)'
The conditional regex here seems unnecessary. Wouldn't it be possible to express the same thing as (?:-instance(?P<instance_timestamp>\d+\.\d+)-|_)?
Yes, seems to be working as well.
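The equivalence discussed in this thread can be checked with a small sketch. It uses only the fragments shown in the diff, so it is not the full pattern from util.py, and the sample names (apart from the old-style one taken from the PR description, with its extension removed) are hypothetical.

```python
import re

# Conditional form from the first revision: the separator depends on whether
# the instance_timestamp group took part in the match.
CONDITIONAL = re.compile(
    r'(?P<export_mode>dataset|annotations)'
    r'(?:-instance(?P<instance_timestamp>\d+\.\d+))?'
    r'(?(instance_timestamp)-|_)(?P<format_tag>.+)'
)

# Reviewer's alternative: a plain alternation between the two separators.
ALTERNATION = re.compile(
    r'(?P<export_mode>dataset|annotations)'
    r'(?:-instance(?P<instance_timestamp>\d+\.\d+)-|_)'
    r'(?P<format_tag>.+)'
)

SAMPLES = [
    'annotations_cvat-for-images-11',             # old naming scheme
    'dataset-instance1717000000.123456-coco-10',  # hypothetical new-style name
    'dataset-coco',                               # matches neither scheme
]


def parse(pattern: re.Pattern, name: str):
    """Return the named groups, or None when the name does not match."""
    match = pattern.fullmatch(name)
    return match.groupdict() if match else None


# Pairs of (conditional result, alternation result) for each sample.
results = [(parse(CONDITIONAL, s), parse(ALTERNATION, s)) for s in SAMPLES]
for name, (cond, alt) in zip(SAMPLES, results):
    print(name, cond == alt)
```

On these samples both forms produce identical groups, which is consistent with the reply above. The alternation avoids the conditional construct, which not every regex engine supports.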
cvat/apps/dataset_manager/util.py
Outdated
-        r'(?:-instance(?P<instance_timestamp>\d+\.\d+))?' # optional for backward compatibility
-        r'-(?P<format_tag>.+)'
+        # optional for backward compatibility
+        r'(?:-instance(?P<instance_timestamp>\d+\.\d+)-|_)?'
Suggested change:
-        r'(?:-instance(?P<instance_timestamp>\d+\.\d+)-|_)?'
+        r'(?:-instance(?P<instance_timestamp>\d+\.\d+)-|_)'
At least a separator must be present, right?
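A short sketch of why the trailing ? matters. It assumes, based on the first hunk, that the separator group is followed directly by the format_tag group; the malformed sample name is hypothetical.

```python
import re

PREFIX = r'(?P<export_mode>dataset|annotations)'
SEPARATOR = r'(?:-instance(?P<instance_timestamp>\d+\.\d+)-|_)'
FORMAT_TAG = r'(?P<format_tag>.+)'  # assumed to follow the separator directly

optional_sep = re.compile(PREFIX + SEPARATOR + '?' + FORMAT_TAG)
required_sep = re.compile(PREFIX + SEPARATOR + FORMAT_TAG)

malformed = 'dataset-coco'  # hypothetical name with neither valid separator

loose = optional_sep.fullmatch(malformed)
strict = required_sep.fullmatch(malformed)

# With the optional separator the stray '-' is absorbed into format_tag.
print(loose.group('format_tag') if loose else None)
# With the required separator the malformed name is rejected.
print(strict)
```

When the separator is optional, a malformed name still matches and the leftover characters end up in format_tag, so the parser reports success with a wrong tag. Requiring the separator turns that case into a non-match.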
Quality Gate passed
Motivation and context
In #7864, a new file naming scheme was introduced for dataset cache entries, and the old naming convention was deprecated. File names in the old format were parsed incorrectly, so cache cleanup attempts in clear_export_cache() failed. A test was added in that PR, but it did not fully reproduce the old behavior.
The failures showed up as "ValueError: Couldn't parse filename components in 'annotations_cvat-for-images-11.ZIP'" errors.
How has this been tested?
Unit tests.
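The parsing failure described under Motivation and context can be reproduced with a minimal sketch. It uses only the pattern fragments visible in the review hunks, not the full expression from util.py, and it assumes the file extension is split off before matching.

```python
import os.path
import re

# Fragment as it appears on the removed lines of the diff: a '-' is always
# required before format_tag.
BEFORE_FIX = re.compile(
    r'(?P<export_mode>dataset|annotations)'
    r'(?:-instance(?P<instance_timestamp>\d+\.\d+))?'
    r'-(?P<format_tag>.+)'
)

# Fragment after the review: either the new '-instance<timestamp>-' separator
# or the old '_' separator.
AFTER_FIX = re.compile(
    r'(?P<export_mode>dataset|annotations)'
    r'(?:-instance(?P<instance_timestamp>\d+\.\d+)-|_)'
    r'(?P<format_tag>.+)'
)

# Old-style file name quoted in the PR description.
stem, ext = os.path.splitext('annotations_cvat-for-images-11.ZIP')

before = BEFORE_FIX.fullmatch(stem)
after = AFTER_FIX.fullmatch(stem)

print(before)                     # no match: the old '_' separator is not accepted
print(after.group('format_tag'))  # the format tag is recovered after the fix
```

The pre-fix fragment rejects the old name because it only accepts a hyphen before the format tag, whereas old names use an underscore.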
Summary by CodeRabbit
instance_timestamp and format_tag in dataset exports.