Lock-free stream implementation #61

Merged: 7 commits, May 2, 2023

@martinling (Member) commented Feb 16, 2023

This PR adds a new lock-free implementation of the basic underlying data structure required by Packetry: an append-only data stream, backed by a file, for which we require fast random access to any part of the data whilst it continues to grow.

Packetry's current implementation, in the FileVec and HybridIndex types, uses file I/O, which adds significant overhead. Both reading and writing require exclusive access to a stream object, and that access is currently managed by a single Mutex protecting the whole Capture database. This creates contention between the decoder and UI threads.

The new implementation creates two objects for each stream: a unique StreamWriter, and a clonable StreamReader. The writer and any number of readers may operate concurrently without blocking each other.

  • StreamReader creates memory-mapped regions on demand to access the file in blocks of a configurable size, which must be a multiple of the system page size. Each reader maintains a fixed-size cache of mappings for its most recently used blocks, using the LruBTreeMap type.

  • StreamWriter operates much like a conventional buffered writer, but uses interchangeable write buffers whose lifetimes are managed by Arc. To read stream data not yet written to the file, readers clone the Arc of the current write buffer, and can keep using that buffer after the writer has moved on to another one.

Atomically switching from one write buffer to another is achieved using the ArcSwap type.
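
As a rough illustration of the buffer handover described above, here is a minimal std-only sketch. It is not the code in this PR: `CurrentBuffer`, `load` and `swap` are hypothetical names, and a `Mutex` stands in for ArcSwap purely to avoid an external dependency, so this version is not lock-free. What it does show is the lifetime behaviour: a reader that cloned the Arc keeps a valid view of the old buffer after the writer installs a new one.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the slot that holds the current write buffer.
/// The PR uses ArcSwap for this; a Mutex is substituted here only to keep
/// the sketch std-only, so this version is NOT lock-free.
struct CurrentBuffer {
    slot: Mutex<Arc<Vec<u8>>>,
}

impl CurrentBuffer {
    fn new(initial: Vec<u8>) -> Self {
        CurrentBuffer {
            slot: Mutex::new(Arc::new(initial)),
        }
    }

    /// Reader side: clone the Arc of the current buffer.
    /// This bumps a reference count; the bytes are not copied.
    fn load(&self) -> Arc<Vec<u8>> {
        let guard = self.slot.lock().unwrap();
        Arc::clone(&*guard)
    }

    /// Writer side: install a fresh buffer and hand back the old one.
    /// The old buffer stays alive for as long as any reader holds its Arc.
    fn swap(&self, next: Vec<u8>) -> Arc<Vec<u8>> {
        let mut guard = self.slot.lock().unwrap();
        std::mem::replace(&mut *guard, Arc::new(next))
    }
}
```

A reader that called `load()` before a `swap()` still sees the old contents afterwards, while later calls to `load()` return the new buffer.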

@martinling mentioned this pull request Feb 22, 2023 (7 tasks)
@martinling force-pushed the stream branch 2 times, most recently from 2a55fef to 0d96de9, February 28, 2023 13:36
@martinling marked this pull request as ready for review February 28, 2023 15:34
@martinling force-pushed the stream branch 2 times, most recently from fca5718 to b7c5ac8, March 8, 2023 20:06
@martinling (Member, Author) commented

I've added a runtime check that the block size is a multiple of the system page size.
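
For illustration, a check of this kind might look like the sketch below. The names `check_block_size`, `BlockSizeError` and `locate` are hypothetical and not taken from the PR, and the page size is passed in as a parameter because querying it is platform-specific. The second function shows how a stream offset could be split into a block index and an offset within that block once the block size is known to be valid.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct BlockSizeError {
    block_size: usize,
    page_size: usize,
}

/// Check that `block_size` is a non-zero multiple of `page_size`.
/// The caller is assumed to have obtained `page_size` from the OS.
fn check_block_size(block_size: usize, page_size: usize) -> Result<(), BlockSizeError> {
    if page_size == 0 || block_size == 0 || block_size % page_size != 0 {
        Err(BlockSizeError { block_size, page_size })
    } else {
        Ok(())
    }
}

/// Split a byte offset in the stream into
/// (index of the block containing it, offset within that block).
fn locate(offset: u64, block_size: u64) -> (u64, u64) {
    (offset / block_size, offset % block_size)
}
```

With a 4096-byte page, a block size of 8192 passes the check and 6000 is rejected. With 4096-byte blocks, `locate(10_000, 4096)` gives block 2 at offset 1808.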

@antoinevg (Member) left a comment

👍

@martinling requested review from miek and antoinevg May 1, 2023 16:34
@antoinevg (Member) left a comment

👍

@miek merged commit 77812b7 into greatscottgadgets:main May 2, 2023
3 participants