Adds streaming append capability. #11
Conversation
The append threshold when appending to the store is calculated as follows: append_threshold = remaining_store_capacity + max_store_overflow

BREAKING CHANGE: adds new member max_store_overflow to segment::Config
… if no read segments are present
Codecov Report — Patch coverage:
@@ Coverage Diff @@
## develop #11 +/- ##
===========================================
+ Coverage 80.57% 87.95% +7.38%
===========================================
Files 19 37 +18
Lines 942 4875 +3933
===========================================
+ Hits 759 4288 +3529
- Misses 183 587 +404
Add development updates from #11
laminarmq-specific enhancements to the segmented_log data structure

While the conventional segmented_log data structure is quite performant for a commit_log implementation, it still requires certain properties to hold true for the record being appended — in particular, the record bytes' length and checksum must be known upfront. It is not possible to know this information when the record bytes are read from an asynchronous stream of bytes. Without these enhancements, we would have to concatenate the intermediate byte buffers into a vector. This would not only incur more allocations, but also slow down our system. Hence, to accommodate this use case, we introduced an intermediate indexing layer into our design.
Fig: Data organisation for persisting the segmented_log data structure on a *nix file system.

In the new design, instead of referring to records with a raw offset, we refer to them with indices. The index in each segment translates the record indices to raw file positions in the segment store file.
Now, the store append operation accepts an asynchronous stream of bytes instead of a contiguously laid-out slice of bytes. We use this operation to write the record bytes, and while writing them, we calculate the record bytes' length and checksum. Once we are done writing the record bytes to the store, we write the corresponding record_header (containing the checksum and length), position, and index as an index_record in the segment index. This provides two quality-of-life enhancements.
Now, to prevent a malicious user from overloading our storage capacity and memory with a maliciously crafted request that loops infinitely over some data and sends it to our server, we have provided an optional append_threshold parameter to all append operations. When provided, it prevents streaming append writes from writing more bytes than the given append_threshold.
At the segment level, this requires us to keep a segment overflow capacity. All segment append operations now use segment_capacity - segment.size + segment_overflow_capacity as the append_threshold value. A good segment_overflow_capacity value could be segment_capacity / 2.
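Putting numbers to the segment-level formula above (the function is a worked example, not the crate's API):

```rust
/// The segment-level append threshold described above:
/// remaining capacity in the segment, plus the overflow allowance.
fn segment_append_threshold(
    segment_capacity: u64,
    segment_size: u64,
    segment_overflow_capacity: u64,
) -> u64 {
    segment_capacity - segment_size + segment_overflow_capacity
}

fn main() {
    let segment_capacity = 1024;
    let segment_overflow_capacity = segment_capacity / 2; // suggested value

    // A fresh segment can absorb up to capacity + overflow bytes in one append.
    let fresh = segment_append_threshold(segment_capacity, 0, segment_overflow_capacity);
    assert_eq!(fresh, 1536);

    // A nearly full segment still leaves headroom for one in-flight record.
    let nearly_full = segment_append_threshold(segment_capacity, 1000, segment_overflow_capacity);
    assert_eq!(nearly_full, 536);
}
```

The overflow allowance is what lets a streaming append that started before the segment filled up run to completion; the segment is rotated afterwards rather than mid-write.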