Skip to content

fix: Handle TCP segmentation in SLK file transitions#3925

Merged
as51340 merged 2 commits intomasterfrom
fix/slk-files-bug
Mar 25, 2026
Merged

fix: Handle TCP segmentation in SLK file transitions#3925
as51340 merged 2 commits intomasterfrom
fix/slk-files-bug

Conversation

@as51340
Copy link
Copy Markdown
Contributor

@as51340 as51340 commented Mar 18, 2026

What

Fix a bug in SLK stream processing where TCP segmentation during file transitions caused CheckStreamStatus to incorrectly return NEW_FILE instead of PARTIAL. The fix adds metadata completeness validation
in CheckStreamStatus() (src/slk/streams.cpp) when a kFileSegmentMask is encountered during a file transition (i.e., remaining_file_size is set), ensuring the full file metadata (filename string + file
size uint) is present before signaling the transition.

Why

When the SLK protocol transitions between files in a multi-file transfer, the sender emits kFileSegmentMask followed by the new file's metadata (filename + size). TCP can split the segment boundary such that
the mask arrives in one packet but the metadata arrives later. Without this fix, the OpenFile path in the Reader would attempt to parse incomplete metadata and fail with "Size data missing in SLK stream!".

How

In CheckStreamStatus, when kFileSegmentMask is encountered and remaining_file_size is set (indicating a file-to-file transition, not the first file), the code now performs a two-stage check before
returning NEW_FILE:

  1. String prefix check — verifies at least 9 bytes (1 marker + 8 length) are available after the mask.
  2. Full metadata check — reads the string length, then verifies the entire metadata block (string prefix + string data + uint marker + uint value) fits in the remaining buffer.

If either check fails, PARTIAL is returned with the appropriate stream_size hint so the caller knows how much more data to wait for. The check is gated on remaining_file_size being set, so first-file
detection is unaffected.

Testing

Added 5 unit tests in tests/unit/slk_streams.cpp:

  • FileTransitionMaskOnly — mask present but zero metadata bytes; expects PARTIAL
  • FileTransitionPartialStringPrefix — fewer than 9 bytes after mask; expects PARTIAL
  • FileTransitionPartialStringData — string prefix present but string body truncated; expects PARTIAL
  • FileTransitionSufficientMetadata — complete metadata present; expects NEW_FILE (positive case)
  • FirstFileMaskNotAffectedByCheck — no remaining_file_size; verifies the new guard doesn't regress first-file detection

Notes for reviewers

  • The stream_size values returned in the PARTIAL branches are best-effort hints (pos + kSegmentMaxTotalSize when the string length is unknown, pos + needed when it is). Verify these are sufficient for
    the caller's read-more logic.
  • The guard str_len <= kSegmentMaxDataSize protects against corrupted length fields — if the length is impossibly large, the code falls through to the existing NEW_FILE return, which will let the Reader
    surface a more specific error during actual parsing

Fixed CheckStreamStatus() to request additional data when the
file-segment mask arrives but the next file's metadata (filename +
filesize) hasn't been received yet due to TCP segmentation.
@as51340 as51340 added this to the mg-v3.10.0 milestone Mar 18, 2026
@as51340 as51340 self-assigned this Mar 18, 2026
@as51340 as51340 added bug bug Docs - changelog only Docs - changelog only CI -build=coverage -test=core Run coverage build and core tests on push CI -build=jepsen -test=core Run jepsen build and core tests on push CI -build=debug -test=core Run debug build and core tests on push CI -build=release -test=core Run release build and core tests on push CI -build=release -test=e2e Run release build and e2e tests on push labels Mar 18, 2026
@as51340
Copy link
Copy Markdown
Contributor Author

as51340 commented Mar 18, 2026

Tracking

  • [Link to Epic/Issue]

Standard development

CI Testing Labels

  • Select the appropriate CI test labels (CI -build=build-name -test=test-suite)

Documentation checklist

  • Add the documentation label
  • Add the bug / feature label
  • Add the milestone for which this feature is intended
    • If not known, set for a later milestone
  • Write a release note, including added/changed clauses
    • Fixed a bug where large file transfers between Memgraph instances (such as snapshots during replication) could fail with a "Size data missing in SLK stream!" error when TCP split the data at an unfortunate
      boundary between consecutive files. This was most likely to occur under high network load or with large snapshot files. No action is required — upgrading to this version resolves the issue automatically. #3925
    • What has changed? What does it mean for a user? What should a user do with it? [#{{PR_number}}]({{link to the PR}})
  • [ Documentation PR link memgraph/documentation#XXXX ]
    • Is back linked to this development PR

@sonarqubecloud
Copy link
Copy Markdown

Quality Gate Failed Quality Gate failed

Failed conditions
B Maintainability Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud

Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE

@as51340 as51340 requested a review from andrejtonev March 20, 2026 12:23
@as51340 as51340 marked this pull request as ready for review March 20, 2026 12:23
@as51340 as51340 added this pull request to the merge queue Mar 25, 2026
Merged via the queue into master with commit ead7b30 Mar 25, 2026
43 of 44 checks passed
@as51340 as51340 deleted the fix/slk-files-bug branch March 25, 2026 16:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug bug CI -build=coverage -test=core Run coverage build and core tests on push CI -build=debug -test=core Run debug build and core tests on push CI -build=jepsen -test=core Run jepsen build and core tests on push CI -build=release -test=core Run release build and core tests on push CI -build=release -test=e2e Run release build and e2e tests on push Docs - changelog only Docs - changelog only

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants