BufferOverflowException for certain cachePercent and file sizes #72

Closed
timhurkmans opened this Issue Mar 9, 2018 · 0 comments

timhurkmans commented Mar 9, 2018

Given version 1.3.1, and a file of size between 100 and 799 bytes, with a cachePercent of 1.

When running

        RandomAccessFile raf = new RandomAccessFile(new File("file799bytes.txt"), "r");
        new FileWordList(raf, true, 1);

Then

java.nio.BufferOverflowException
	at java.nio.Buffer.nextPutIndex(Buffer.java:521)
	at java.nio.ByteBufferAsLongBufferB.put(ByteBufferAsLongBufferB.java:128)
	at org.passay.dictionary.AbstractFileWordList$Cache.put(AbstractFileWordList.java:386)
	at org.passay.dictionary.AbstractFileWordList.initialize(AbstractFileWordList.java:139)
	at org.passay.dictionary.FileWordList.<init>(FileWordList.java:138)
	at org.passay.dictionary.FileWordList.<init>(FileWordList.java:109)
	at org.passay.dictionary.FileWordList.<init>(FileWordList.java:83)
	at org.passay.dictionary.FileWordList.<init>(FileWordList.java:62)
	at org.passay.dictionary.FileWordList.<init>(FileWordList.java:43)

Analysis:

For small files combined with a small cachePercent, the allocated cache buffer can be too small to hold a single entry. This follows from how the cache is sized. In the Cache constructor, cacheSize is computed as:

final long cacheSize = (fileSize / 100) * cachePercent;

If this cacheSize is larger than 0, it is then used in resize(..) to create a ByteBuffer using:

ByteBuffer.allocate((int) size)

This ByteBuffer is then converted to a longBuffer using:

ByteBuffer.allocate((int) size).asLongBuffer()

Assuming the ByteBuffer is created (cacheSize is larger than 0), substituting the cacheSize formula into the ByteBuffer allocation gives:

ByteBuffer.allocate((int) (fileSize / 100) * cachePercent).asLongBuffer()

Assume cachePercent is 1:

ByteBuffer.allocate((int) (fileSize / 100) * 1).asLongBuffer()
ByteBuffer.allocate((int) (fileSize / 100)).asLongBuffer()

The error occurs when the ByteBuffer created by allocate is smaller than 8 bytes, because asLongBuffer then produces a LongBuffer of capacity 0 (the byte size divided by 8, using integer division). Trying to put a value into that empty buffer throws the BufferOverflowException.
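The capacity behaviour described above can be reproduced without Passay. The following is a minimal sketch using only java.nio; the class and method names (LongBufferCapacityDemo, longCapacity, putOverflows) are made up for illustration and are not part of the library.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.nio.LongBuffer;

// Hypothetical demo class, not part of Passay.
final class LongBufferCapacityDemo {

  // Capacity, in longs, of a LongBuffer view over a ByteBuffer of byteSize bytes.
  static int longCapacity(int byteSize) {
    return ByteBuffer.allocate(byteSize).asLongBuffer().capacity();
  }

  // Returns true if a single relative put into the view overflows.
  static boolean putOverflows(int byteSize) {
    LongBuffer view = ByteBuffer.allocate(byteSize).asLongBuffer();
    try {
      view.put(42L);
      return false;
    } catch (BufferOverflowException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    for (int size : new int[] {1, 7, 8, 16}) {
      System.out.println(size + " bytes -> " + longCapacity(size)
          + " longs, put overflows: " + putOverflows(size));
    }
  }
}
```

Sizes 1 and 7 give a view with capacity 0 and the put overflows; size 8 gives capacity 1 and the put succeeds.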

For example, 100 bytes:

ByteBuffer.allocate((int) (100 / 100) * 1).asLongBuffer()
ByteBuffer.allocate(1).asLongBuffer()
1 size ByteBuffer -> asLongBuffer -> 0 size LongBuffer

For example, 799 bytes:

ByteBuffer.allocate((int) (799 / 100) * 1).asLongBuffer()
ByteBuffer.allocate(7).asLongBuffer()
7 size ByteBuffer -> asLongBuffer -> 0 size LongBuffer

No problem, 800 or more bytes:

ByteBuffer.allocate((int) (800 / 100) * 1).asLongBuffer()
ByteBuffer.allocate(8).asLongBuffer()
8 size ByteBuffer -> asLongBuffer -> 1 size LongBuffer

With a cachePercent of 2, it fails for file sizes up to 399 bytes:

ByteBuffer.allocate((int) (399 / 100) * 2).asLongBuffer()
ByteBuffer.allocate(3 * 2).asLongBuffer()
6 size ByteBuffer -> asLongBuffer -> 0 size LongBuffer

Failing combinations (any allocation of less than 8 bytes):

  • cachePercent 1 -> fileSize (100 - 799 bytes): (799 / 100) * 1 = 7 * 1 = 7
  • cachePercent 2 -> fileSize (100 - 399 bytes): (399 / 100) * 2 = 3 * 2 = 6
  • cachePercent 3 -> fileSize (100 - 299 bytes): (299 / 100) * 3 = 2 * 3 = 6
  • cachePercent 4 -> fileSize (100 - 199 bytes): (199 / 100) * 4 = 1 * 4 = 4
  • cachePercent 5 -> fileSize (100 - 199 bytes): (199 / 100) * 5 = 1 * 5 = 5
  • cachePercent 6 -> fileSize (100 - 199 bytes): (199 / 100) * 6 = 1 * 6 = 6
  • cachePercent 7 -> fileSize (100 - 199 bytes): (199 / 100) * 7 = 1 * 7 = 7
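The ranges in this list can be re-derived mechanically. The sketch below only models the initial allocation, using the cacheSize formula quoted above; it does not cover the resize path. The names FailingRanges, cacheSize, fails and largestFailing are hypothetical helpers, not Passay code.

```java
// Hypothetical helper class, not part of Passay.
final class FailingRanges {

  // Same arithmetic as the formula quoted in the analysis.
  static long cacheSize(long fileSize, int cachePercent) {
    return (fileSize / 100) * cachePercent;
  }

  // A cache is created (size > 0) but cannot hold even one long (size < 8).
  static boolean fails(long fileSize, int cachePercent) {
    long size = cacheSize(fileSize, cachePercent);
    return size > 0 && size < Long.BYTES;
  }

  // Largest failing file size in [1, limit], or -1 if none fails.
  static long largestFailing(int cachePercent, long limit) {
    for (long fileSize = limit; fileSize >= 1; fileSize--) {
      if (fails(fileSize, cachePercent)) {
        return fileSize;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    for (int percent = 1; percent <= 8; percent++) {
      System.out.println("cachePercent " + percent
          + " -> largest failing fileSize: " + largestFailing(percent, 2000));
    }
  }
}
```

Under this model, cachePercent 8 reports -1, since the smallest non-zero cacheSize is already 8 bytes.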

Edit: It can also fail when the buffer is resized when calling cache.put(..).

A workaround is to disable the cache (cachePercent 0), use a cachePercent of at least 8, or avoid small files. A fix would probably be to check for a cacheSize that is too small (< 8) and allocate at least 8 bytes in that case.
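The suggested fix could look roughly like the sketch below. It illustrates the idea of clamping to 8 bytes and is not the change that was actually committed; CacheSizing, cacheBytes and longSlots are hypothetical names.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the suggested fix, not the actual Passay change.
final class CacheSizing {

  // Byte size to allocate for the cache; 0 means no cache is created.
  static int cacheBytes(long fileSize, int cachePercent) {
    long size = (fileSize / 100) * cachePercent;
    if (size <= 0) {
      return 0; // caching disabled or file too small for any cache
    }
    // Ensure room for at least one long, and stay within int range.
    long clamped = Math.max(size, Long.BYTES);
    return (int) Math.min(clamped, Integer.MAX_VALUE);
  }

  // Number of long slots in the LongBuffer view for the given inputs.
  static int longSlots(long fileSize, int cachePercent) {
    return ByteBuffer.allocate(cacheBytes(fileSize, cachePercent))
        .asLongBuffer()
        .capacity();
  }

  public static void main(String[] args) {
    System.out.println(longSlots(100, 1));  // previously failing input
    System.out.println(longSlots(799, 1));  // previously failing input
    System.out.println(longSlots(1600, 1)); // unaffected by the clamp
  }
}
```

With the clamp, the previously failing inputs get one slot for a position. Whether the resize path needs a similar guard is not addressed here.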

dfish3r added a commit that referenced this issue Mar 9, 2018

Require a minimum file cache size.
For a cache to operate properly it must be at least two bytes in size.
That allows for storage of one position and incrementing to the next position before a resize occurs.
Fixes #72.

serac closed this in #73 Mar 9, 2018
