ensure all warcwriter write operations go through a queue. #528

Merged
merged 1 commit into from
Apr 4, 2024

ikreymer
Member

@ikreymer ikreymer commented Apr 4, 2024

Currently, only the recorder's WARCWriter writes records through a queue, so other WARCs can be hit by concurrent write attempts. This PR fixes that by:

  • adding the concurrent queue to WARCWriter itself
  • routing all writeRecord, writeRecordPair, and writeNewResourceRecord calls through the PQueue, which ensures writes happen in order and one at a time
  • making flush() also wait until the queue is empty/idle

This should avoid any issues with concurrent writes to any WARC.
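
For readers who want to see the pattern in isolation, here is a rough sketch of the idea, not the code from this PR. To stay dependency-free it uses a hand-rolled promise chain in place of PQueue, and `SerialQueue`, `QueuedWriter`, `writeRaw`, and the string "records" are all hypothetical names made up for illustration. Writing a pair as a single queue task is also an assumption of the sketch.

```typescript
// Stand-in for a concurrency-1 queue (assumption: used here instead of PQueue
// so the sketch has no dependencies).
class SerialQueue {
  private tail: Promise<void> = Promise.resolve();

  // Run `task` only after every previously added task has settled.
  add<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task);
    // Swallow the outcome on the chain so one failed task does not block later ones.
    this.tail = result.then(
      () => undefined,
      () => undefined,
    );
    return result;
  }

  // Resolves once every task added so far has settled.
  onIdle(): Promise<void> {
    return this.tail;
  }
}

// Hypothetical writer: every public write method only enqueues work.
class QueuedWriter {
  private queue = new SerialQueue();
  readonly written: string[] = [];

  // Hypothetical low-level write; the delay simulates slow I/O.
  private async writeRaw(record: string, delayMs: number): Promise<void> {
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    this.written.push(record);
  }

  writeRecord(record: string, delayMs = 0): Promise<void> {
    return this.queue.add(() => this.writeRaw(record, delayMs));
  }

  writeRecordPair(response: string, request: string): Promise<void> {
    // Both records are one queue task, so no other write can land between them.
    return this.queue.add(async () => {
      await this.writeRaw(response, 0);
      await this.writeRaw(request, 0);
    });
  }

  // flush() waits for the queue to drain before returning.
  async flush(): Promise<void> {
    await this.queue.onIdle();
  }
}

async function demo(): Promise<string[]> {
  const writer = new QueuedWriter();
  // Fired without awaiting: the slow first write still lands first.
  void writer.writeRecord("a", 20);
  void writer.writeRecord("b", 0);
  void writer.writeRecordPair("c-response", "c-request");
  await writer.flush();
  return writer.written;
}
```

In this sketch `demo()` resolves to the records in call order (`a`, `b`, `c-response`, `c-request`) even though the first write is the slowest, which is the ordering guarantee the queue is meant to provide.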

…rder:

- add warcQ to WARCWriter itself
- all writeRecord, writeRecordPair, writeNewResourceRecord just add to the PQueue,
which ensures writes happen in order and one-at-a-time
- flush also ensures queue is empty/idle
- should avoid any issues with concurrent writes to any WARC
@ikreymer ikreymer requested a review from tw4l April 4, 2024 00:50
Contributor

@tw4l tw4l left a comment

Thanks for catching this! Tested and code looks good

@ikreymer ikreymer merged commit 98f6445 into main Apr 4, 2024
4 checks passed
@ikreymer ikreymer deleted the warcwriter-queue branch April 4, 2024 16:36