Describe the bug
Each StorageWriter iteration works along these lines:
1. Fetch a sequence of operations from the DurableLog.
2. Process each operation in sequence.
3. While processing an operation, determine which Segment it belongs to and initialize an appropriate SegmentAggregator for that segment if necessary.
4. Flush anything if needed.
5. Ack.
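The steps above can be sketched as follows. This is an illustrative simplification, not Pravega's actual API: `Operation`, `fetch`, and `aggregate` are hypothetical placeholders standing in for the real DurableLog/SegmentAggregator machinery.

```java
import java.util.*;

// Hypothetical sketch of one StorageWriter iteration; all names are placeholders.
public class WriterIterationSketch {
    record Operation(long segmentId, String data) {}

    // Step 1: fetch a batch of operations from a simulated DurableLog.
    static List<Operation> fetch(Queue<Operation> durableLog, int max) {
        List<Operation> batch = new ArrayList<>();
        Operation op;
        while (batch.size() < max && (op = durableLog.poll()) != null) {
            batch.add(op);
        }
        return batch;
    }

    // Steps 2-3: route each operation to its segment (a stand-in for
    // per-segment SegmentAggregator initialization and accumulation).
    static Map<Long, List<Operation>> aggregate(List<Operation> batch) {
        Map<Long, List<Operation>> bySegment = new HashMap<>();
        for (Operation op : batch) {
            bySegment.computeIfAbsent(op.segmentId(), id -> new ArrayList<>()).add(op);
        }
        return bySegment;
    }

    public static void main(String[] args) {
        Queue<Operation> durableLog = new ArrayDeque<>(List.of(
                new Operation(1, "append-a"),
                new Operation(2, "append-b"),
                new Operation(1, "append-c")));
        List<Operation> batch = fetch(durableLog, 10);
        Map<Long, List<Operation>> aggregators = aggregate(batch);
        // Steps 4-5: flush each aggregator, then ack everything processed.
        aggregators.forEach((id, ops) ->
                System.out.println("segment " + id + ": flushed " + ops.size() + " ops"));
        System.out.println("acked " + batch.size() + " operations");
    }
}
```

Note that once `fetch` drains the queue, the batch is the only copy of those operations outside Tier 1; this is what makes the failure described below lossy.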
The problem is that if, in step 3, the SegmentAggregator.initialize errors out (it needs to do a Storage.getStreamSegmentInfo on LTS), then the entire iteration is aborted and all non-processed operations are lost. They have already been fetched out of the DurableLog and now there's nowhere to get them from again.
Fortunately there are plenty of safeguards and sanity checks in the StorageWriter and SegmentAggregator classes to detect situations like these before any actual data loss or corruption happens (the data itself will still be in Tier 1 since it cannot be truncated out).
To Reproduce
See above.
Expected behavior
StorageWriter should be resilient in the face of SegmentAggregator.initialize errors and not "lose" in-flight operations.
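One way this resilience could be approached is sketched below: catch the per-segment initialization failure, set that segment's operations aside for the next iteration instead of discarding them, and continue with the remaining segments. This is a hedged illustration under invented signatures, not Pravega's actual fix; `SegmentAggregator`, `initialize`, and `process` here only mirror the issue's terminology.

```java
import java.util.*;

// Hypothetical sketch: prevent a single segment's initialize() failure from
// losing the whole batch. All names and signatures are invented for illustration.
public class ResilientIterationSketch {
    record Operation(long segmentId, String data) {}

    // Simulated aggregator whose initialize() may fail, e.g. if the
    // Storage.getStreamSegmentInfo call to LTS errors out.
    static class SegmentAggregator {
        final boolean failOnInit;
        final List<Operation> buffered = new ArrayList<>();
        SegmentAggregator(boolean failOnInit) { this.failOnInit = failOnInit; }
        void initialize() {
            if (failOnInit) throw new RuntimeException("getStreamSegmentInfo failed");
        }
    }

    // Processes the batch; on an initialize() failure, re-queues that segment's
    // operations for retry instead of aborting the iteration and losing them.
    static List<Operation> process(List<Operation> batch, Map<Long, SegmentAggregator> aggregators) {
        List<Operation> retry = new ArrayList<>();
        Set<Long> failedSegments = new HashSet<>();
        for (Operation op : batch) {
            if (failedSegments.contains(op.segmentId())) {
                retry.add(op);
                continue;
            }
            SegmentAggregator agg = aggregators.get(op.segmentId());
            try {
                if (agg.buffered.isEmpty()) {
                    agg.initialize(); // may throw on the first op for this segment
                }
                agg.buffered.add(op);
            } catch (RuntimeException ex) {
                failedSegments.add(op.segmentId());
                retry.add(op); // keep the operation instead of losing it
            }
        }
        return retry;
    }

    public static void main(String[] args) {
        Map<Long, SegmentAggregator> aggs = new HashMap<>();
        aggs.put(1L, new SegmentAggregator(false));
        aggs.put(2L, new SegmentAggregator(true)); // segment 2's initialization fails
        List<Operation> batch = List.of(
                new Operation(1, "a"), new Operation(2, "b"), new Operation(1, "c"));
        List<Operation> retry = process(batch, aggs);
        System.out.println("processed=" + aggs.get(1L).buffered.size() + " retry=" + retry.size());
    }
}
```

The key design point is that the failure is contained to the affected segment: operations for healthy segments still flush, and the failed segment's operations remain available for a later attempt rather than vanishing after they have left the DurableLog.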
Cherry-picking these PRs:
#5841: Issue #5840: (SegmentStore) Fixed a deadlock in SegmentKeyCache.
#5851: Issue #5850: (SegmentStore) Fixed a bug in WriterTableProcessor where it would attempt to flush to a deleted segment.
#5586: Issue #5581: (SegmentStore) Disabling non-essential cache inserts if cache utilization is high
#5804: Issue #5789: (SegmentStore) Improving stability during Segment Container Recoveries
#5811: Issue #5810: (SegmentStore) Fixed a StorageWriter bug that could lead to data loss
#5783: Issue #5771: (SegmentStore) Reducing the amount of heap memory used when doing Table Segment Reads.
Signed-off-by: Andrei Paduroiu <andrei.paduroiu@emc.com>