script with long DATA block hangs #873
How big is this file, size-wise? We do not have the same implementation as MRI (which uses `FILE*`); we end up allocating a big `ByteArrayInputStream` out of that section, 1k at a time. Assuming memory is not an issue, we can probably bump this size up to a larger number like 32k, since not many people use `__END__` and you are not the first person with a large data set.
If you could make a script to generate a representative `__END__` dataset, we can poke at this and improve our implementation. Ultimately, we want a read/write `__END__` data section, preferably on top of NIO, but we looked at that in the past and there were some issues.
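For anyone who wants to reproduce this, here is a minimal sketch of such a generator (the file name `data_bench.rb` and the 10 MB default are my own choices, not from this thread). It writes a small driver script whose `__END__` section is padded to the requested size, so the DATA load time can be measured directly:

```ruby
# Generate a script with an N-megabyte __END__/DATA section.
# Usage: ruby gen_data_bench.rb [size_in_mb]   (defaults to 10)
size_mb = Integer(ARGV[0] || 10)

File.open("data_bench.rb", "w") do |f|
  f.puts "start = Time.now"
  f.puts "bytes = DATA.read.bytesize"
  f.puts 'printf("read %d bytes in %.2fs\n", bytes, Time.now - start)'
  f.puts "__END__"
  line = "x" * 79  # 80 bytes per line including the newline
  (size_mb * 1024 * 1024 / 80).times { f.puts line }
end
```

Running `jruby data_bench.rb` should then show how long the DATA read takes for that size.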
So yeah, @enebo was right about the cause. We currently read the DATA contents entirely into memory, 1k at a time. Those bytes go into a slowly-growing array, so larger files take roughly DATA.size / 1024 read, resize, and copy operations. It just ends up doing too much work.
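To see why this blows up, here is a rough model of that read loop (an approximation for illustration, not the actual JRuby code): if each 1 KB read triggers a resize that copies everything buffered so far, the total bytes copied grow quadratically with the DATA size.

```ruby
# Model: per 1 KB chunk read, the growing array is resized and its
# current contents copied. Returns total bytes copied for a given
# DATA size. (Simplified model, not the real implementation.)
def copied_bytes(data_size, chunk = 1024)
  copied = 0
  filled = 0
  while filled < data_size
    copied += filled   # resize copies the bytes already buffered
    filled += chunk
  end
  copied
end

# For a 10 MB DATA section, this model copies ~53.7 GB in total:
puts copied_bytes(10 * 1024 * 1024)
```

Even if the real growth strategy is smarter than copy-per-chunk, the constant churn of small reads plus copies is where the time goes.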
I'm going to do a short-term fix to increase the buffer size. For a 10MB file, a 64k buffer loads DATA almost immediately.
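The arithmetic behind that short-term fix is simple: the iteration count scales inversely with the buffer size.

```ruby
data = 10 * 1024 * 1024    # a 10 MB DATA section
puts data / 1024           # iterations with the old 1 KB buffer
puts data / (64 * 1024)    # iterations with a 64 KB buffer
```

That is 10,240 read/resize/copy rounds shrinking to 160, which is why the load becomes near-instant.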
We are also talking about the longer-term fix to actually pass the real stream/channel for DATA rather than reading into memory.
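In Ruby terms, the longer-term idea amounts to something like the following sketch (a hypothetical helper, not the planned JRuby code): instead of copying the DATA section into memory, reopen the script file and seek past the `__END__` marker, yielding a real `IO` backed by the file.

```ruby
# Return an IO positioned at the first byte after the __END__ line
# of the given script, i.e. at the start of its DATA section.
# (Illustrative sketch only.)
def data_io(script_path)
  io = File.open(script_path, "rb")
  while (line = io.gets)
    break if line.chomp == "__END__"
  end
  io  # now positioned at the start of DATA (or at EOF if no __END__)
end

File.write("data_io_demo.rb", "puts :hi\n__END__\nhello from DATA\n")
puts data_io("data_io_demo.rb").read  # prints the DATA payload
```

The real fix would do the equivalent on the Java side, handing DATA a stream/channel over the underlying file instead of a pre-read byte array.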