FileChannelJournalSegmentReader throws half buffers away #4248

Closed
Zelldon opened this issue Apr 7, 2020 · 0 comments · Fixed by #4372

Zelldon commented Apr 7, 2020

## Description

During our investigation of the engine latency we found that the Atomix journal is a big performance hotspot. We took a deeper look at the code and found the readNext method.

For context: the Zeebe default maxEntrySize is 4 MB, and the reader allocates a buffer that is 8 MB. Our assumption is that this is meant to enable read-ahead, but in practice it doesn't, because the buffer position and limit are reset on every refill.

```java
  /** Reads the next entry in the segment. */
  @SuppressWarnings("unchecked")
  private void readNext() {
    // Compute the index of the next entry in the segment.
    final long index = getNextIndex();

    try {
      // Read more bytes from the segment if necessary.
      if (memory.remaining() < maxEntrySize) {
        final long position = channel.position() + memory.position();
        channel.position(position);
        memory.clear(); // <=== this resets to position 0, limit to capacity
        channel.read(memory);  // <=== we read now again 8 MB
        channel.position(position);
        memory.flip();
      }

      // Mark the buffer so it can be reset if necessary.
      memory.mark();

      try {
        // Read the length of the entry.
        final int length = memory.getInt();

        // If the buffer length is zero then return.
        if (length <= 0 || length > maxEntrySize) {
          memory.reset().limit(memory.position());
          nextEntry = null;
          return;
        }

        // Read the checksum of the entry.
        final long checksum = memory.getInt() & 0xFFFFFFFFL;

        // Compute the checksum for the entry bytes.
        final Checksum crc32 = new CRC32();
        crc32.update(memory.array(), memory.position(), length);

        // If the stored checksum equals the computed checksum, return the entry.
        if (checksum == crc32.getValue()) {
          final int limit = memory.limit();
          memory.limit(memory.position() + length);
          final E entry = namespace.deserialize(memory);
          memory.limit(limit);
          nextEntry = new Indexed<>(index, entry, length);
        } else {
          memory.reset().limit(memory.position());
          nextEntry = null;
        }
      } catch (final BufferUnderflowException e) {
        memory.reset().limit(memory.position());
        nextEntry = null;
      }
    } catch (final IOException e) {
      throw new StorageException(e);
    }
  }
```

This means we read 8 MB but only process roughly the first ~4 MB of it: once remaining() drops below maxEntrySize, i.e. once we have passed the middle of the buffer, we read another 8 MB starting at the first unconsumed entry. The ~4 MB that were still in the buffer are thrown away and end up at the beginning of the new buffer, because they are read from disk again.
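
The following is a minimal, self-contained sketch (not Atomix code, and with tiny made-up sizes) that mimics the refill logic above and shows how clear() causes bytes that are already in memory to be fetched from disk a second time:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ClearDiscardsDemo {

  public static void main(final String[] args) throws IOException {
    final Path file = Files.createTempFile("journal", ".log");
    Files.write(file, new byte[32]); // tiny stand-in for a segment file
    final int maxEntrySize = 8;      // stand-in for the 4 MB default

    try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
      final ByteBuffer memory = ByteBuffer.allocate(2 * maxEntrySize); // 16 bytes, like the 8 MB buffer
      long bytesFetched = channel.read(memory); // first fill: 16 bytes
      channel.position(0);                      // channel position marks where the buffered region starts
      memory.flip();

      memory.position(9); // pretend we deserialized 9 bytes worth of entries

      // remaining() is now 7 < maxEntrySize, so the refill branch from readNext kicks in:
      final long position = channel.position() + memory.position(); // file offset 9
      channel.position(position);
      memory.clear();                        // the 7 unread bytes are forgotten here
      bytesFetched += channel.read(memory);  // reads 16 bytes: the 7 we already had plus 9 new ones
      channel.position(position);
      memory.flip();

      System.out.println("bytes fetched from disk: " + bytesFetched);                 // 32
      System.out.println("distinct offsets covered: " + (position + memory.limit())); // 25 -> 7 bytes fetched twice
    }
  }
}
```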

This is mostly a problem when maxEntrySize and the real entry size are very different, which is exactly our case: we use 4 MB to support big deployments, but most of our records are less than 1 KB. If the two sizes were closer together we would consume most of the buffer before refilling, but that is not our use case, so we should fix this!
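
One way to avoid throwing the tail of the buffer away would be to compact() it before refilling, so the next channel read only appends the bytes that are actually missing. The helper below is only a sketch of that idea, not necessarily how the actual fix is implemented; the class and method names are made up, and it assumes the buffer is in read mode (already flipped) when called:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

final class BufferRefill {

  private BufferRefill() {}

  /**
   * Refills {@code memory} from {@code channel} without dropping the bytes that were
   * already read but not yet consumed. Expects {@code memory} to be in read mode
   * (i.e. already flipped) and leaves it in read mode again.
   */
  static void fillBufferIfNeeded(
      final FileChannel channel, final ByteBuffer memory, final int maxEntrySize)
      throws IOException {
    if (memory.remaining() >= maxEntrySize) {
      return; // still enough buffered data for the largest possible entry
    }
    // Move the unread tail to the front of the buffer instead of clearing it away.
    memory.compact();
    // Append only the bytes that are actually missing behind the preserved tail.
    channel.read(memory);
    memory.flip();
  }
}
```

Note that the reader's surrounding position bookkeeping, which currently rewinds the channel after every read, would have to be adjusted to match.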

Zelldon added the kind/bug, kind/toil, scope/broker and area/performance labels on Apr 7, 2020
Zelldon changed the title from "FileChannelJournalSegmentReader seems to work in an unexpected way" to "FileChannelJournalSegmentReader throws half buffers away" on Apr 7, 2020
Zelldon added this to the Maintenance milestone on Apr 7, 2020
zeebe-bors bot added a commit that referenced this issue Apr 29, 2020
4372: fix(atomix): consume read buffer correctly r=Zelldon a=Zelldon

## Description

Fix the issue where only half of the read buffer was consumed and the rest was thrown away.

Refactor `FileChannelJournalSegmentReader#readNext` to improve readability and hopefully maintainability.

I also ran a benchmark with these changes (#4358 was also part of the branch I benchmarked):

![metrics](https://user-images.githubusercontent.com/2758593/80199113-45202a00-8621-11ea-9595-e8c65130ec97.png)


## Related issues


closes #4248 


Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
zeebe-bors bot closed this as completed in bf5d2fa on May 1, 2020