New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework I/O buffering #633
Rework I/O buffering #633
Conversation
@@ -7,7 +7,7 @@ final class ByteBufferPool { | |||
private var buffers: List[ByteBuffer] = Nil | |||
|
|||
private def alloc(): ByteBuffer = { | |||
ByteBuffer.allocateDirect(64 * 1024 * 1024) | |||
ByteBuffer.allocateDirect(8 * 1024 * 1024) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test this with a large file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested it with ;rebuild;tests/run
. The problem with large files is in CodeGen
which I changed to not use ByteBufferPool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The largest .nir
file I could find in the repo is 2M so yes this is somewhat of a stopgap measure, but fixing the NIR code generation is a much larger effort to refactor.
> find . -name "*.nir" | xargs stat -f "%z %N" | sort -n | tail -5
546211 ./scalalib/target/scala-2.11/classes/scala/collection/immutable/StreamViewLike$$anon$13.nir
580412 ./scalalib/target/scala-2.11/classes/scala/collection/mutable/IndexedSeqLike$$anon$1.nir
612962 ./javalib/target/scala-2.11/classes/java/math/BigDecimal.nir
699387 ./javalib/target/scala-2.11/classes/java/util/Arrays$.nir
2301253 ./javalib/target/scala-2.11/classes/java/lang/Character$.nir
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solution proposed here is a bit of bandaid, we'll probably rework this more fundamentally later on. There is quite a bit of cruft that has been accumulated around here and quite a few things that can be improved, but that would require drastic changes which are out of scope for this pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, should we create a new more focused ticket to refactor buffering in NIR and code gen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably go in hand with binary format revamp I've planned for a while in #132. I've added buffer size concern to the list on that issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's also the codegen side, where we could investigate whether we can generate in one pass by reordering instructions to be aware of dependencies and whether we can render directly into a CharBuffer
to not pay the StringBuffer
=> byte[]
=> ByteBuffer
tax
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That one probably deserves its own ticket, feel free to file a new one with possible ideas for the solution. 👍
I will rebase to fix conflict. But else I think this is ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
The buffers provided by withScratchBuffer has a fixed size which for large programs might be insufficient. Since CodeGen internally uses a StringBuffer we can instead allocate a buffer of the suitable size.
Reduce the buffer size to 8M based on the fact that the largest .nir file currently in the repo is java.lang.Character at ~2M: > find . -name "*.nir" | xargs stat -f "%z %N" | sort -n | tail -5 546211 ./scalalib/target/scala-2.11/classes/scala/collection/immutable/StreamViewLike$$anon$13.nir 580412 ./scalalib/target/scala-2.11/classes/scala/collection/mutable/IndexedSeqLike$$anon$1.nir 612962 ./javalib/target/scala-2.11/classes/java/math/BigDecimal.nir 699387 ./javalib/target/scala-2.11/classes/java/util/Arrays$.nir 2301253 ./javalib/target/scala-2.11/classes/java/lang/Character$.nir
Needs to be rebased on the sbt refactoring.