Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buffered Result Sets over-allocating memory #244

Closed
caleblloyd opened this issue Apr 20, 2017 · 3 comments
Closed

Buffered Result Sets over-allocating memory #244

caleblloyd opened this issue Apr 20, 2017 · 3 comments

Comments

@caleblloyd
Copy link
Contributor

Originally reported at PomeloFoundation/Pomelo.EntityFrameworkCore.MySql#262

      SELECT `m`.`Id`
      FROM `Blogs` AS `m`
      LIMIT 1000

Here are the actual allocations by ArrayPool.Shared.Rent/Return:

bufferDataLengths: Wanted 1 Got 16
bufferedDataOffsets: Wanted 1 Got 16
bufferedPayload: Wanted 16384 Got 16384
bufferDataLengths: Wanted 1 Got 16
bufferedDataOffsets: Wanted 1 Got 16
bufferedPayload: Wanted 16384 Got 16384
bufferDataLengths: Wanted 1 Got 16
bufferedDataOffsets: Wanted 1 Got 16
bufferedPayload: Wanted 16384 Got 16384

bufferedPayload must span multiple MySQL rows, need to re-work buffering. ArrayPool.Shared.Rent/Return may be overkill here as it seems to like to allocate a power of 2, we should probably switch this to allocating the correct sized objects.

@caleblloyd caleblloyd self-assigned this Apr 20, 2017
caleblloyd added a commit to caleblloyd/MySqlConnector that referenced this issue Apr 21, 2017
@caleblloyd
Copy link
Contributor Author

As a bit of a postmortem, I'm fairly sure that the sequence of events that read to this bug were:

  1. Buffering Rows was initially written into Split out MySqlDataReader ResultSet and Row operations into classes #136, but at that time socket reads were not buffered, so the only data in m_buffer pertained to the current row.
  2. Socket reads were adjusted to be buffered in Buffer Socket Reads #167, which meant that m_buffer could now contain more than one MySQL row
  3. I did not catch the high memory usage when implementing BufferResultSets Buffer Result Sets Implementation #204

Most people using this library will use the default BufferResultSets=false and never encounter this bug. The exception is users of Pomelo.EntityFramework.MySql, which sets it to true.

The effects of this bug are that memory usage per row read from MySQL will often be 16384 bytes instead of whatever storage the row would take normally. This means that if the row would only normally be 16 bytes, memory usage would be 1,024 times higher than it should.

If all rows are read from the buffered result set, then memory would be returned properly. This bug becomes a really big issue if several thousand rows are read. For example, 62,500 rows being read would cause the reader to consume 1GB of memory before it was available for consumption so that memory could be freed.

@bgrainger
Copy link
Member

Fixed in 0.18.2.

@xPaw
Copy link
Contributor

xPaw commented Aug 7, 2024

One thing to note here is that ArrayPool will return an existing array, and not allocate one if it already had an array of equal or bigger size available (rented somewhere else in the app, used by previous mysql result etc).

One thing I notice with the old code is it was renting ArrayPool<int> which would be using its own pool, renting byte and memory marshalling it to int would be beneficial.

Over the long run of the app, ArrayPool probably wins over the manual cache, as other places in the app can also be sharing the pool.

https://learn.microsoft.com/en-us/dotnet/api/system.buffers.arraypool-1.rent?view=net-8.0#remarks

Context: I was looking at memory allocations in ReadPayloadAsync, which would be caused by ArraySegmentHolder<byte> previousPayloads growing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

3 participants