Slowdown in MagickImage.ToByteArray on arm64 Mac #1447

schnoberts1 · 2023-10-04T04:08:23Z

Magick.NET version

Magick.NET-Q8-arm64 13.1.2

Environment (Operating system, version and so on)

arm64, Mac M1, .NET 7.0.11/7.0.401

Description

            using var image = new MagickImage(stream, new MagickReadSettings { Format = MagickFormat.Psd });
            image.ToByteArray();

takes minutes in 13.1.2 when it used to take seconds for a 3k/4k file of reasonable image complexity. All the additional time is in ToByteArray. In 13.1.2 there's an overflow exception and in 13.1.0 the speed is as I expect.

I scaled the image down so I can attach it (if I can work out how to do that) and the perforce is 25ms for 13.1.0 and 4123ms for 13.1.2. Note: the slow down is less extreme when there is less detail in the image.

Steps to Reproduce

Create a PSD (RGB/8 is fine), make it say 1000 x 1000 and ensure it has some detail in it. Time the code above. Increase the image size and complexity to make the timing difference larger. My Mac has 64GB of RAM and an Apple M1 Max processor.

The text was updated successfully, but these errors were encountered:

dlemstra · 2023-10-04T05:49:00Z

Have you tested this with the most recent version?

schnoberts1 · 2023-10-04T07:56:02Z

Yes, same issue. On 4 Oct 2023, at 06:49, Dirk Lemstra ***@***.***> wrote: Have you tested this with the most recent version? —Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you authored the thread.Message ID: ***@***.***>

dlemstra · 2023-10-04T15:47:42Z

Do you get the "old" performance when you do this instead:

using var stream = new MemoryStream();
image.Write(stream);
var bytes = stream.ToArray();

schnoberts1 · 2023-10-04T16:07:08Z

Writing the image into a MemoryStream and then getting the array takes < 1ms on both new and old version. [CORRECTION: ~20-25ms]

dlemstra · 2023-10-04T16:15:07Z

I really wonder what kind of measurements you are using. This is what was happening inside the old version of ToByteArray and that should not be less than 1ms when you are writing a PSD image. Can you create a small project with an image on GitHub that reproduces what you are seeing?

schnoberts1 · 2023-10-04T16:23:44Z

namespace Ezen.Uploader.Tests;

using ImageMagick;
using System.IO;
using System.Diagnostics;
public class Tests
{
    [SetUp]
    public void Setup()
    {
    }

    [Test]
    public void Test1()
    {
        FileStream stream = File.OpenRead("/Users/andy/code/ezen/ezen-uploader/JPN485.psd");
        using var image = new MagickImage(stream, new MagickReadSettings { Format = MagickFormat.Psd });

        var s = Stopwatch.StartNew();
        var ba = image.ToByteArray();        
        Console.WriteLine($"ToByteArray {s.ElapsedMilliseconds}ms {ba.Length} bytes");

        s = Stopwatch.StartNew();
        using var mstream = new MemoryStream();
        image.Write(mstream);
        var bytes = mstream.ToArray();
        Console.WriteLine($"ToArray {s.ElapsedMilliseconds}ms {bytes.Length} bytes");

        Assert.Pass();
    }
}

csproj snippet:

  <Choose>
    <When Condition="'$(Platform)' == 'x64'">
      <ItemGroup>
        <PackageReference Include="Magick.NET-Q8-x64" Version="13.3.0"/>
      </ItemGroup>
    </When>
  </Choose>
  <Choose>
    <When Condition="'$(Platform)' == 'arm64'">
      <ItemGroup>
        <PackageReference Include="Magick.NET-Q8-arm64" Version="13.3.0"/>
      </ItemGroup>
    </When>
  </Choose>

Run test with 13.3.0

(base) andy@ice ezen-uploader % dotnet test -c release Ezen.Uploader.Tests  -p:Platform=arm64 -l console
  Determining projects to restore...
  All projects are up-to-date for restore.
  Ezen.Uploader -> /Users/andy/code/ezen/ezen-uploader/Ezen.Uploader/bin/arm64/release/net7.0/Ezen.Uploader.dll
  Ezen.Uploader.Tests -> /Users/andy/code/ezen/ezen-uploader/Ezen.Uploader.Tests/bin/arm64/release/net7.0/Ezen.Uploader.Tests.dll
Test run for /Users/andy/code/ezen/ezen-uploader/Ezen.Uploader.Tests/bin/arm64/release/net7.0/Ezen.Uploader.Tests.dll (.NETCoreApp,Version=v7.0)
Microsoft (R) Test Execution Command Line Tool Version 17.7.1 (arm64)
Copyright (c) Microsoft Corporation.  All rights reserved.

Starting test execution, please wait...
A total of 1 test files matched the specified pattern.
ToByteArray 4167ms 7842052 bytes
ToArray 22ms 7842052 bytes

Run test with 13.1.0

(base) andy@ice ezen-uploader % dotnet test -c release Ezen.Uploader.Tests  -p:Platform=arm64 -l console
  Determining projects to restore...
  Restored /Users/andy/code/ezen/ezen-uploader/Ezen.Uploader.Tests/Ezen.Uploader.Tests.csproj (in 148 ms).
  Restored /Users/andy/code/ezen/ezen-uploader/Ezen.Uploader/Ezen.Uploader.csproj (in 148 ms).
  Ezen.Uploader -> /Users/andy/code/ezen/ezen-uploader/Ezen.Uploader/bin/arm64/release/net7.0/Ezen.Uploader.dll
  Ezen.Uploader.Tests -> /Users/andy/code/ezen/ezen-uploader/Ezen.Uploader.Tests/bin/arm64/release/net7.0/Ezen.Uploader.Tests.dll
Test run for /Users/andy/code/ezen/ezen-uploader/Ezen.Uploader.Tests/bin/arm64/release/net7.0/Ezen.Uploader.Tests.dll (.NETCoreApp,Version=v7.0)
Microsoft (R) Test Execution Command Line Tool Version 17.7.1 (arm64)
Copyright (c) Microsoft Corporation.  All rights reserved.

Starting test execution, please wait...
A total of 1 test files matched the specified pattern.
ToByteArray 27ms 7842052 bytes
ToArray 23ms 7842052 bytes

The output of file on the DLLs is PE32+ executable (console) Aarch64 Mono/.Net assembly, for MS Windows

I'll pull this into a project if this isn't enough, let me know. I'll send you the image.

schnoberts1 · 2023-10-04T16:25:04Z

My mistake, not 1ms, 23ish. I guess I must have read 21 as '1' :(

schnoberts1 · 2023-10-04T16:47:48Z

... just so I can follow on at home:

So versions that perform differently are post a change where ToByteArray was changed to look a lot like the Stream writer:

   public byte[] ToByteArray()
    {
        _settings.FileName = null;

        using var wrapper = new ByteArrayWrapper();
        var writer = new ReadWriteStreamDelegate(wrapper.Write);
        var seeker = new SeekStreamDelegate(wrapper.Seek);
        var teller = new TellStreamDelegate(wrapper.Tell);
        var reader = new ReadWriteStreamDelegate(wrapper.Read);

        _nativeInstance.WriteStream(_settings, writer, seeker, teller, reader);

        return wrapper.GetBytes();
    }

... what you are alluding to is the memory stream is how ToByteArray used to look:

            using var stream = new MemoryStream();
            Write(stream);
            return stream.ToArray();

and Write looks a lot like what ToByteArray was changed to:

            Throw.IfNull(nameof(stream), stream);

            _settings.FileName = null;

            using var wrapper = StreamWrapper.CreateForWriting(stream);
            var writer = new ReadWriteStreamDelegate(wrapper.Write);
            ReadWriteStreamDelegate? reader = null;
            SeekStreamDelegate? seeker = null;
            TellStreamDelegate? teller = null;

            if (stream.CanSeek)
            {
                seeker = new SeekStreamDelegate(wrapper.Seek);
                teller = new TellStreamDelegate(wrapper.Tell);
            }

            if (stream.CanRead)
                reader = new ReadWriteStreamDelegate(wrapper.Read);

            _nativeInstance.WriteStream(_settings, writer, seeker, teller, reader);

I have to agree, these are very very similar things.

schnoberts1 · 2023-10-04T17:17:30Z

I ran both versions through a profiler. In my test code using version 13.3.0 it's spending a lot of time in Seek because Seek is spending its time calling System.Array.Resize 7269 times. I can't find any calls to Resize in the profile for 13.1.0.

ByyeArrayWrapper.Seek in 13.3.0 looks like:

    public long Seek(long offset, IntPtr whence, IntPtr user_data)
    {
        AppendBufferToBytes();

        var newOffset = (int)((SeekOrigin)whence switch
        {
            SeekOrigin.Begin => offset,
            SeekOrigin.Current => _offset + offset,
            SeekOrigin.End => _bytes.Length - offset,
            _ => -1,
        });

        if (_offset == newOffset)
            return _offset;

        if (newOffset < 0)
            return -1;
        else if (newOffset > _bytes.Length)
            ResizeBytes(newOffset);

        _offset = newOffset;
        return _offset;
    }

... could the interaction of this with the calling function be the smoking gun. In 13.1.0 StreamWrapper is used and there Seek delegates to the Stream passed in, which presumably is doing something quite different in Seek than the ByteArrayWrapper as it’s a core .NET library and likely doesn’t increment in the way ByteArrayWrapper does.

schnoberts1 · 2023-10-05T07:13:48Z

... this also explains why the slow down is related to the complexity of the image (something I noticed yesterday). If the image is entirely white (one of my test cases that didn't exhibit as bad a slowdown) then perhaps it's uncompressed in a very small number of chunks as the information content is low, so the array size is grown to size in a small number of Resize calls.

I don't know the codebase very well at all, am I heading in the right direction here?

If I am right then I should be able to replicate this in a test in the Native repo. I'll give that a go in the next 3-4 weeks. If I replicate it then I'll think about a fix if you would like.

dlemstra · 2023-10-05T19:31:25Z

Thanks for all the research. I was suspecting this was the issue but I was not sure about it. The ToByteArray method was changed to reduce memory allocations.

Would it be possible to share an image that I can use to reproduce the issue? I might be able to find a workaround for this myself then. Maybe this would work ResizeBytes(newOffset + BufferSize);?

schnoberts1 · 2023-10-06T08:31:14Z

I thought I had shared an image but didn’t so I will do that ASAP. On 5 Oct 2023, at 20:31, Dirk Lemstra ***@***.***> wrote: Thanks for all the research. I was suspecting this was the issue but I was not sure about it. Would it be possible to share an image that I can use to reproduce the issue? I might be able to find a workaround for this myself then. Maybe this would work ResizeBytes(newOffset + BufferSize);? —Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you authored the thread.Message ID: ***@***.***>

schnoberts1 · 2023-10-06T11:08:14Z

I would imagine so. It'd likely reduce the number of Resize calls which would also have the handy feature of reducing the chances of heap fragmentation requiring the whole array is moved in memory as part of resizing so it can maintain a contiguous chunk of memory (assuming that's how C# arrays work, I've only been coding in C# for less than a month part time so I know very little.

I would think you don't need to allocate memory on a Seek as no Read or Write is happening. You could consider Seek() a declaration of intent. If you took that approach then you only need the array resized when something happens, e.g. when reads or writes overlap a region that was intended to be resized via a previous seek and has yet to be resized. Write already handles something similar to this case but it'd need adjusting and you'd need to add Resize support to Read to handle when Read is reading off the end of the allocated area into the "to be allocated" area. It also means you can Seek around as you wish without incurring cost until you do something. That being said, I don't know what the call pattern of the Readers/Writers is in this library.

Re: reproduction, a simple test that just seeks forward 1 byte at a time is going to produce this problem on arm64 (I guess). I'd implement this test myself but with my inexperience I can't get the stuff building on osx-arm64. dotnet build -r osx-arm64 Magick.NET.sln is just generating some complaint that NETSDK1100: To build a project targeting Windows on this operating system, set the EnableWindowsTargeting property to true. which is broadly equivalent to "pilot does not know what he is doing" :)

schnoberts1 · 2023-10-06T11:20:01Z

I've just checked the C# Memory Stream implementation and it takes the approach I describe above, which is why that performs in a more consistent fashion. Look at Seek() in https://github.com/microsoft/referencesource/blob/master/mscorlib/system/io/memorystream.cs.

dlemstra · 2023-10-13T14:18:00Z

Could you please share a test file I could use for this? I would like to investigate this but having no test file is blocking me.

schnoberts1 · 2023-10-13T16:54:48Z

...sent to the email address on your github profile

dlemstra · 2023-10-15T21:17:24Z

Thanks for the file and I can see what is happening. I really think I should simplify the implementation of the ByteArrayWrapper.

dlemstra · 2023-10-16T18:40:54Z

I made some changes to the implementation and this should fix the performance again in the next release.

schnoberts1 · 2023-10-16T21:45:08Z

That’s great, it’ll let me upgrade :) thank you. On 16 Oct 2023, at 19:41, Dirk Lemstra ***@***.***> wrote: I made some changes to the implementation and this should fix the performance again in the next release. —Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you authored the thread.Message ID: ***@***.***>

dlemstra added a commit that referenced this issue Oct 16, 2023

Improved performance of the ToByteArray methods (#1447)

d284055

dlemstra added this to the 13.4.0 milestone Oct 16, 2023

dlemstra closed this as completed Dec 11, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slowdown in MagickImage.ToByteArray on arm64 Mac #1447

Slowdown in MagickImage.ToByteArray on arm64 Mac #1447

schnoberts1 commented Oct 4, 2023 •

edited

Loading

dlemstra commented Oct 4, 2023

schnoberts1 commented Oct 4, 2023 via email

dlemstra commented Oct 4, 2023

schnoberts1 commented Oct 4, 2023 •

edited

Loading

dlemstra commented Oct 4, 2023

schnoberts1 commented Oct 4, 2023

schnoberts1 commented Oct 4, 2023 •

edited

Loading

schnoberts1 commented Oct 4, 2023

schnoberts1 commented Oct 4, 2023 •

edited

Loading

schnoberts1 commented Oct 5, 2023 •

edited

Loading

dlemstra commented Oct 5, 2023 •

edited

Loading

schnoberts1 commented Oct 6, 2023 via email

schnoberts1 commented Oct 6, 2023 •

edited

Loading

schnoberts1 commented Oct 6, 2023

dlemstra commented Oct 13, 2023

schnoberts1 commented Oct 13, 2023 •

edited

Loading

dlemstra commented Oct 15, 2023

dlemstra commented Oct 16, 2023

schnoberts1 commented Oct 16, 2023 via email

Slowdown in MagickImage.ToByteArray on arm64 Mac #1447

Slowdown in MagickImage.ToByteArray on arm64 Mac #1447

Comments

schnoberts1 commented Oct 4, 2023 • edited Loading

Magick.NET version

Environment (Operating system, version and so on)

Description

Steps to Reproduce

dlemstra commented Oct 4, 2023

schnoberts1 commented Oct 4, 2023 via email

dlemstra commented Oct 4, 2023

schnoberts1 commented Oct 4, 2023 • edited Loading

dlemstra commented Oct 4, 2023

schnoberts1 commented Oct 4, 2023

schnoberts1 commented Oct 4, 2023 • edited Loading

schnoberts1 commented Oct 4, 2023

schnoberts1 commented Oct 4, 2023 • edited Loading

schnoberts1 commented Oct 5, 2023 • edited Loading

dlemstra commented Oct 5, 2023 • edited Loading

schnoberts1 commented Oct 6, 2023 via email

schnoberts1 commented Oct 6, 2023 • edited Loading

schnoberts1 commented Oct 6, 2023

dlemstra commented Oct 13, 2023

schnoberts1 commented Oct 13, 2023 • edited Loading

dlemstra commented Oct 15, 2023

dlemstra commented Oct 16, 2023

schnoberts1 commented Oct 16, 2023 via email

schnoberts1 commented Oct 4, 2023 •

edited

Loading

schnoberts1 commented Oct 4, 2023 •

edited

Loading

schnoberts1 commented Oct 4, 2023 •

edited

Loading

schnoberts1 commented Oct 4, 2023 •

edited

Loading

schnoberts1 commented Oct 5, 2023 •

edited

Loading

dlemstra commented Oct 5, 2023 •

edited

Loading

schnoberts1 commented Oct 6, 2023 •

edited

Loading

schnoberts1 commented Oct 13, 2023 •

edited

Loading