Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster bulk numeric reads from BufferedIndexInput #12453

Merged
merged 7 commits into from Jul 24, 2023

Conversation

original-brownbear
Copy link
Contributor

Reading ints/floats/longs one-by-one from a heap-byte-buffer, including doing our own bounds checks is not very efficient. We can use the ability to translate the buffer and read in bulk while taking turns with one-off reading/refilling instead.

Reading ints/floats/longs one-by-one from a heap-byte-buffer, including
doing our own bounds checks is not very efficient. We can use the
ability to translate the buffer and read in bulk while taking turns with
one-off reading/refilling instead.
Copy link
Contributor

@stefanvodita stefanvodita left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some nit-picky comments and some questions, but overall this looks like a good and useful change.

@@ -159,6 +159,63 @@ public final long readLong() throws IOException {
}
}

@Override
public void readFloats(float[] floats, int offset, int len) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Could we change the argument names to be consistent (dst, len)? I see the parent class, DataInput, has the same inconsistency.

@@ -159,6 +159,63 @@ public final long readLong() throws IOException {
}
}

@Override
public void readFloats(float[] floats, int offset, int len) throws IOException {
int remaining = len;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I found remaining to be a little confusing just because the buffer also calls to remaining(). What if we called it remainingDst to better differentiate from the buffer's remaining?

}

@Override
public void readLongs(long[] dst, int offset, int length) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering why we're following a different pattern with these methods than with readBytes. If we rewrote readBytes using this model, would it just work?

  @Override
  public final void readBytes(byte[] dst, int offset, int len) throws IOException {
    int remaining = len;
    while (remaining > 0) {
      int cnt = Math.min(buffer.remaining(), remaining);
      buffer.get(dst, offset + len - remaining, cnt);
      remaining -= cnt;
      if (remaining > 0) {
        if (buffer.hasRemaining()) {
          dst[offset + len - remaining] = readByte();
          --remaining;
        } else {
          refill();
        }
      }
    }
  }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would come out to about that code if we inlined readBytes with useBuffer=true except that

        if (buffer.hasRemaining()) {
          dst[offset + len - remaining] = readByte();
          --remaining;
        }

is never entered, hasRemaining is just always false because we don't have anything like 1 one of 4 int bytes available.

-> the difference I think is just in that we have to deal with the alignment at the buffer boundary

}
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should TestBufferedIndexInput have some basic tests for these new methods?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ good call, already had one bug that I only found when running benchmarks. Added a test for each of the 3 that should cover all the boundary/alignment/offset cases we can expect.

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Can you add an entry to CHANGES.txt?

.put(byten(offset + 1))
.put(byten(offset + 2))
.put(byten(offset + 3));
assertEquals(bb.getFloat(0), floatBuffer[idx], 0f);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would it make more sense to compare the int bits rather than the float itself, to make sure +/-0 or the different representations of NaN are considered different?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right that's technically more exact I think :) Adjusted that and added changes.txt entry

@@ -84,6 +84,8 @@ Optimizations

* GITHUB#12372: Reduce allocation during HNSW construction (Jonathan Ellis)

* GITHUB#12453: Faster bulk numeric reads from BufferedIndexInput (Armin Braun)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the section for the upcoming Lucene major (10.0), your change looks safe for backport, let's move it to section 9.8?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Adrien!

@jpountz jpountz merged commit 20e97fb into apache:main Jul 24, 2023
4 checks passed
@original-brownbear original-brownbear deleted the faster-bulk-nums-read-bii branch July 25, 2023 08:23
jpountz pushed a commit that referenced this pull request Jul 26, 2023
Reading ints/floats/longs one-by-one from a heap-byte-buffer, including
doing our own bounds checks is not very efficient. We can use the
ability to translate the buffer and read in bulk while taking turns with
one-off reading/refilling instead.
@zhaih zhaih added this to the 9.8.0 milestone Sep 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants