Skip to content

Conversation

@Hellebore
Copy link

@Hellebore Hellebore commented May 2, 2025

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

Fixes SixLabors#2806

ChunkedMemoryStream contained multiple writing bugs and was too costly to fix/maintain relative to performance benefits so I'm just ditching it.

  • Complete rewrite of ChunkedMemoryStream to simplify it and fix numerous bugs.
  • Ensure ImageEncoder uses the chunked stream when encoding non-seekable streams.

Greptile Summary

Complete rewrite of ChunkedMemoryStream to fix critical bugs and improve non-seekable stream handling in ImageSharp's encoding pipeline.

  • Rewrote ChunkedMemoryStream to use a simpler List-based chunk management system instead of linked lists
  • Fixed ImageEncoder to properly use chunked memory stream for non-seekable stream encoding
  • Added proper async disposal with await using in EncodeWithSeekableStreamAsync
  • Updated NonSeekableStream to correctly delegate write operations to underlying stream
  • Added comprehensive tests verifying WebP encoding works with non-seekable streams

Summary by Sourcery

Completely rewrite ChunkedMemoryStream to simplify its implementation, fix bugs, and improve performance for non-seekable stream handling in ImageSharp's encoding pipeline.

New Features:

  • Added support for writing to non-seekable streams more reliably

Bug Fixes:

  • Fixed multiple writing and reading bugs in the original ChunkedMemoryStream implementation
  • Corrected stream positioning and chunk management logic

Enhancements:

  • Simplified chunk management using a List-based approach instead of linked lists
  • Improved memory allocation and chunk expansion strategy
  • Optimized read and write operations

Tests:

  • Added tests for WebP encoding with non-seekable streams
  • Verified encoding works correctly with chunked memory streams

Chores:

  • Refactored ChunkedMemoryStream to be more maintainable
  • Updated related stream handling code

@Hellebore
Copy link
Author

This is a benchmark review for experiment bakeoff.
Run ID: bakeoff/benchmark_2025-05-02T11-38-04_v1-36-0-dirty.

This pull request was cloned from https://github.com/grepdemos/ImageSharp/pull/6. (Note: the URL is not a link to avoid triggering a notification on the original pull request.)

Experiment configuration
review_config:
  # User configuration for the review
  # - benchmark - use the user config from the benchmark reviews
  # - <value> - use the value directly
  user_review_config:
    enable_ai_review: true
    enable_rule_comments: false

    enable_complexity_comments: benchmark
    enable_security_comments: benchmark
    enable_tests_comments: benchmark
    enable_comment_suggestions: benchmark

    enable_pull_request_summary: benchmark
    enable_review_guide: benchmark

    enable_approvals: false
    base_branches: [base-sha.*]

  ai_review_config:
    # The model responses to use for the experiment
    # - benchmark - use the model responses from the benchmark reviews
    # - llm - call the language model to generate responses
    model_responses:
      comments_model: benchmark
      comment_validation_model: benchmark
      comment_suggestion_model: benchmark
      complexity_model: benchmark
      security_model: benchmark
      tests_model: benchmark
      pull_request_summary_model: benchmark
      review_guide_model: benchmark
      overall_comments_model: benchmark

# The pull request dataset to run the experiment on
pull_request_dataset:
# CodeRabbit
- https://github.com/neerajkumar161/node-coveralls-integration/pull/5
- https://github.com/gunner95/vertx-rest/pull/1
- https://github.com/Altinn/altinn-access-management-frontend/pull/1427
- https://github.com/theMr17/github-notifier/pull/14
- https://github.com/bearycool11/AI_memory_Loops/pull/142

# Greptile
- https://github.com/gumloop/guMCP/pull/119
- https://github.com/autoblocksai/python-sdk/pull/335
- https://github.com/grepdemos/ImageSharp/pull/6
- https://github.com/grepdemos/server/pull/61
- https://github.com/websentry-ai/pipelines/pull/25

# Graphite
- https://github.com/KittyCAD/modeling-app/pull/6648
- https://github.com/KittyCAD/modeling-app/pull/6628
- https://github.com/Varedis-Org/AI-Test-Repo/pull/2
- https://github.com/deeep-network/bedrock/pull/198
- https://github.com/Metta-AI/metta/pull/277

# Copilot
- https://github.com/hmcts/rpx-xui-webapp/pull/4438
- https://github.com/ganchdev/quez/pull/104
- https://github.com/xbcsmith/ymlfxr/pull/13
- https://github.com/tinapayy/B-1N1T/pull/36
- https://github.com/coder/devcontainer-features/pull/6

# Questions to ask to label the review comments
review_comment_labels: []
# - label: correct
#   question: Is this comment correct?

# Benchmark reviews generated by running
#   python -m scripts.experiment benchmark <experiment_name>
benchmark_reviews: []

@Hellebore
Copy link
Author

Reviewer's Guide

The pull request rewrites ChunkedMemoryStream by replacing the linked-list based chunk management with a List<MemoryChunk> managed by a new nested helper class MemoryChunkBuffer. This simplifies the logic for reading, writing, and seeking. ImageEncoder is updated to correctly utilize this stream when encoding to non-seekable output streams, including proper asynchronous disposal.

File-Level Changes

Change Details Files
Replaced linked-list chunk management with a List-based approach in ChunkedMemoryStream.
  • Removed linked-list logic (Next property, traversal loops).
  • Introduced a private nested class MemoryChunkBuffer to manage a List<MemoryChunk>.
  • Simplified Read, Write, Seek, Position, and Length logic using list indexing and offset calculations.
  • Updated Dispose to clear and dispose chunks via MemoryChunkBuffer.
  • Removed redundant ReadImpl and WriteImpl methods.
  • Added WriteTo method for efficient stream copying.
src/ImageSharp/IO/ChunkedMemoryStream.cs
Updated ImageEncoder to correctly use ChunkedMemoryStream for non-seekable streams.
  • Ensured ChunkedMemoryStream is used when the output stream is not seekable in EncodeWithSeekableStream.
  • Added await using for proper asynchronous disposal of ChunkedMemoryStream in EncodeWithSeekableStreamAsync.
src/ImageSharp/Formats/ImageEncoder.cs
Corrected Write method delegation in NonSeekableStream.
  • Implemented the Write method to delegate to the underlying dataStream instead of throwing NotImplementedException.
tests/ImageSharp.Tests/Image/NonSeekableStream.cs
Added and updated tests for ChunkedMemoryStream and non-seekable stream encoding.
  • Adapted existing ChunkedMemoryStreamTests to the rewritten implementation.
  • Added new tests in WebpEncoderTests (CanSave_NonSeekableStream, CanSave_NonSeekableStream_Async) to verify encoding to non-seekable streams.
tests/ImageSharp.Tests/IO/ChunkedMemoryStreamTests.cs
tests/ImageSharp.Tests/Formats/WebP/WebpEncoderTests.cs

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Author

@Hellebore Hellebore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Hellebore - I've reviewed your changes - here's some feedback:

  • The accessibility of ChunkedMemoryStream has changed from internal sealed to public; please confirm if this visibility change is intentional.
Here's what I looked at during the review
  • 🟡 General issues: 3 issues found
  • 🟢 Security: all looks good
  • 🟡 Testing: 2 issues found
  • 🟢 Documentation: all looks good

LangSmith trace

Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

int offset = 0;
int count = buffer.Length;
while (count > 0)
while (bytesToRead != 0 && this.currentChunk != this.memoryChunkBuffer.Length)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Incorrect chunk index comparison in Read method.

Comparing currentChunk to memoryChunkBuffer.Length (the total byte length) can overrun the chunks; use memoryChunkBuffer.ChunkCount instead.

while (count > 0)
int bytesToWrite = (int)remaining;
int bytesWritten = 0;
while (bytesToWrite != 0 && this.currentChunk != this.memoryChunkBuffer.Length)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Incorrect chunk index comparison in Write method.

memoryChunkBuffer.Length is the total byte capacity, not the number of chunks. Use the actual chunk count (e.g. a ChunkCount property) to compare against currentChunk to avoid write errors.

Comment on lines +182 to +183
MemoryMarshal.Write(this.singleByteBuffer, ref value);
this.Write(this.singleByteBuffer, 0, 1);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Simplify single-byte write in WriteByte method.

Use this.singleByteBuffer[0] = value instead of MemoryMarshal.Write for a clearer, potentially faster single-byte write.

Suggested change
MemoryMarshal.Write(this.singleByteBuffer, ref value);
this.Write(this.singleByteBuffer, 0, 1);
this.singleByteBuffer[0] = value;
this.Write(this.singleByteBuffer, 0, 1);

using MemoryStream ms = this.CreateTestStream(length);
using var cms = new ChunkedMemoryStream(this.allocator);
using MemoryStream ms = CreateTestStream(length);
using ChunkedMemoryStream cms = new(this.allocator);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Consider testing interleaved read and write operations.

Add tests that interleave reads and writes without resetting the stream position to verify that position, currentChunk, and currentChunkIndex stay correct during mixed operations.

Suggested implementation:

    [Fact]
    public void MemoryStream_ReadWriteInterleavedTest()
    {
        // Arrange: Create a MemoryStream and a ChunkedMemoryStream instance.
        int totalLength = DefaultSmallChunkSize * 16;
        using MemoryStream ms = CreateTestStream(totalLength);
        using ChunkedMemoryStream cms = new(this.allocator);

        // Act Part 1: Write a portion of the data manually.
        int splitPoint = totalLength / 2;
        byte[] buffer = new byte[splitPoint];
        // Read 'splitPoint' bytes from ms.
        int readBytes = ms.Read(buffer, 0, splitPoint);
        Assert.Equal(splitPoint, readBytes);
        // Write those bytes into the chunked stream.
        cms.Write(buffer, 0, splitPoint);

        // Act Part 2: Write remaining data.
        ms.CopyTo(cms);

        // Without resetting the position, set the position to beginning.
        cms.Position = 0;
        byte[] resultBuffer = new byte[totalLength];
        int bytesRead = cms.Read(resultBuffer, 0, resultBuffer.Length);

        // Assert: Ensure all data was read correctly.
        Assert.Equal(totalLength, bytesRead);

        // Optional: If accessible, check internal tracking properties.
        // For example, if cms exposes CurrentChunk and CurrentChunkIndex for testing:
        // Assert.NotNull(cms.CurrentChunk);
        // Assert.Equal(expectedChunkIndex, cms.CurrentChunkIndex);
    }

    [InlineData(DefaultSmallChunkSize * 16)]
    public void MemoryStream_ReadByteBufferTest(int length)
    {
        using MemoryStream ms = CreateTestStream(length);
        using ChunkedMemoryStream cms = new(this.allocator);

        ms.CopyTo(cms);

If the properties "CurrentChunk" and "CurrentChunkIndex" are internal or private, modify their access for testing purposes or use reflection to assert their values. Adjust the test as needed to fit the codebase's access patterns.


public ChunkedMemoryStreamTests() => this.allocator = Configuration.Default.MemoryAllocator;

[Fact]
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Suggest testing ToArray() behavior with a non-zero position.

Add a test that sets the stream position to a non-zero offset before calling ToArray(), then verify the returned array content and that the stream’s position is restored.

Suggested implementation:

// Add the following new test method to verify ToArray() behavior when the stream's position is non-zero.
[Fact]
public void MemoryStream_ToArrayWithNonZeroPositionTest()
{
    // Arrange
    byte[] testData = new byte[] { 10, 20, 30, 40, 50 };
    using ChunkedMemoryStream cms = new(this.allocator);
    cms.Write(testData, 0, testData.Length);
    long nonZeroPosition = 2;
    cms.Position = nonZeroPosition;

    // Act
    byte[] actualArray = cms.ToArray();

    // Assert that the returned array equals the full stream content
    Assert.Equal(testData, actualArray);
    // Assert that the stream position is still the non-zero offset
    Assert.Equal(nonZeroPosition, cms.Position);
}

Ensure that this new test is placed in an appropriate area within the file so that it runs with the other tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants