[#795] Show error when adding duplicate warc file #797

Merged: 2 commits, Jan 19, 2023

Conversation

kuechensofa
Contributor

Description

Show an error when adding a warc file that has the same filename as an existing file in the collection and do not overwrite the existing file.

Motivation and Context

See #795

Types of changes

  • Replay fix (fixes a replay specific issue)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added or updated tests to cover my changes.
  • All new and existing tests passed.

Show an error when adding a warc file that has the same filename as an existing file in the collection and do not overwrite the existing file.
@tw4l tw4l self-requested a review January 18, 2023 20:07
@tw4l
Contributor

tw4l commented Jan 18, 2023

Thank you for the bug report and patch @kuechensofa! It would be great if, instead of immediately raising an IOError when encountering a warc whose name already exists in the collection, we allowed the other warcs (those without naming conflicts) to be added and indexed, and threw the exception at the end of the process with the list of warcs that couldn't be added.

Would you be able to make that change? Otherwise happy to do it myself.

@kuechensofa
Contributor Author

Thanks for the reply! I'm happy to implement that change.

Skip duplicate warcs and print a warning but add and index warcs that don't have duplicates.
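
For readers following along, the skip-and-warn behaviour described in this commit message could look roughly like the sketch below. It is only an illustration, not the code merged in this PR: the function name `add_warcs`, its parameters, and the returned `(added, skipped)` tuple are all hypothetical, and indexing is left out.

```python
import logging
import os
import shutil

logger = logging.getLogger(__name__)


def add_warcs(warc_paths, archive_dir):
    """Copy WARC files into archive_dir, skipping names that already exist.

    Hypothetical helper for illustration only; not the project's actual API.
    Returns (added, skipped): destination paths that were written, and
    source paths that were skipped because of a filename conflict.
    """
    os.makedirs(archive_dir, exist_ok=True)
    added, skipped = [], []
    for path in warc_paths:
        dest = os.path.join(archive_dir, os.path.basename(path))
        if os.path.exists(dest):
            # Never overwrite an existing file: warn and move on to the next one.
            logger.warning("Skipping %s: %s already exists", path, dest)
            skipped.append(path)
            continue
        shutil.copy2(path, dest)
        added.append(dest)
    return added, skipped
```

A caller would then index only the files in `added`, and could report or raise on a non-empty `skipped` list once every file has been processed.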
@tw4l tw4l merged commit e6ec8b4 into webrecorder:main Jan 19, 2023
@tw4l
Contributor

tw4l commented Jan 19, 2023

thanks @kuechensofa ! merged into main and will be included in the next point release :)

@kuechensofa
Contributor Author

@tw4l Thanks for merging! :)

@kuechensofa kuechensofa deleted the 795-duplicate-warc-filename-error branch January 19, 2023 16:54