Array#shift is slow #5126
Comments
Well, doesn't your implementation leak memory? That is, the array will never shrink again? However, I agree we should only move after some threshold, that is, move if there's X free space in front of the buffer, and also check that when pushing. In any case check whether |
Duplicate of #573 |
We should probably document that |
Well Deque has the fast shift implementation, but I need other functions from array first before shifting. |
@benoist As to why your implementation doesn't work: later, when you need to grow the array, you have lost the reference to the original pointer to invoke |
Alternatively we could store an offset inside the array, but maybe that's a big penalty for just this use case. |
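To make the "store an offset inside the array" idea concrete, here is a minimal sketch in Ruby. It is only a model: `OffsetArray` and its methods are hypothetical names, a plain Ruby array stands in for the raw buffer, and nothing here is taken from Crystal's actual `Array` source.

```ruby
# Hypothetical model of an array that keeps a start offset into its storage.
# OffsetArray is an illustrative name, not a real Crystal or Ruby class.
class OffsetArray
  def initialize(items)
    @store = items.dup # backing storage, stands in for the raw buffer
    @offset = 0        # index of the first live element in @store
  end

  def size
    @store.size - @offset
  end

  def [](index)
    return nil if index < 0 || index >= size
    @store[@offset + index]
  end

  def push(value)
    @store.push(value)
    self
  end

  # O(1): clear the slot so the element can be collected, then advance the offset.
  def shift
    return nil if size.zero?
    value = @store[@offset]
    @store[@offset] = nil
    @offset += 1
    value
  end

  def to_a
    @store[@offset, size]
  end
end
```

Note that this sketch never reclaims the slots in front of the offset, which is exactly the memory concern raised earlier in the thread; a real implementation would need a rule for when to move or shrink.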
converting to deque before shift has some overhead of course, but it's already a 160x speedup
|
In Ruby they do increase the base pointer... I don't know how they can later reallocate. So it's worth investigating how they do it; it might be worth doing the same in our case. |
If I'm reading the Ruby implementation correctly, it seems like they hold off the memmove until inserts, and they double the capacity upon the insert. That would give you the move penalty only once, when you actually need to reallocate. If the array is shifted until empty, it would never need the reallocation. Just need to make sure it doesn't leak memory like @jhass pointed out. |
But how do they retain the original pointer and the current pointer? |
If the array is not shared when shift is called, it will call the ary_make_shared function. I think this makes a copy of the original pointer and freezes it. This pointer is used in ary_modify to determine the shift. |
Well, what functions do you need that Another option is |
I'm currently using uniq and sort from array, which might be a lot slower to do in deque, as those functions are not just pushes and shifts, for which deque is optimized. That's just an assumption though, I might be wrong here. |
Have you tried it? Benchmarked it? Don't work on assumptions. You can always convert the array into a deque rather easily. Or if use |
There's a bit of a problem: |
@RX14 Yes, I know I can convert it to deque or use reverse and pop, but that's not really the issue. If we can make Array#shift a lot faster, isn't that worth investigating? It's not very intuitive to do all these workarounds because Array#shift is slow. The reason I'm making assumptions for the other parts now is that putting the effort into testing and verifying still leaves this problem unsolved. If the general conclusion is that Array#shift is as fast as it's going to get for now, then I have no problem if this issue gets closed. :-) |
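For readers unfamiliar with the "reverse and pop" workaround mentioned above, a small Ruby sketch follows. The helper name `drain_in_order` is made up for illustration; the idea is simply to do the Array-only work (`uniq`, `sort`) first, reverse once, and then take elements from the end, which does not require moving the remaining elements.

```ruby
# Hypothetical helper: consume elements front-to-back without calling shift.
def drain_in_order(items)
  # Do the array-only work first, then reverse once so the "front" is at the end.
  queue = items.uniq.sort.reverse
  drained = []
  # pop removes from the end, so elements come out in sorted order.
  drained << queue.pop until queue.empty?
  drained
end

p drain_in_order(%w[c a b a])
```

Whether this is acceptable is the point under debate: it works, but it pushes the cost of knowing about `shift` onto the caller.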
Yeah, exactly. The non-workaround is |
I would still consider that a workaround, but if that's just me, I can live with that :-) |
@benoist Using the correct datastructure for the job with the correct algorithmic complexity is a workaround? |
If Deque is the only data structure that should do shifts, then shift should be removed from array. But I don't think that should be the case. If the overall operations on the same data are faster within one data structure, then it makes no sense to change. I think Array#shift can be made faster.
start = Time.now
a = Array.new(100_000, "a")
a.size.times do
  a.shift
end
puts Time.now - start
|
Ruby implements an optionally shared buffer for Array. Some Array instances own the memory they use, and they reallocate and free it themselves when needed. Other Array instances are shared: they keep a reference to an external reference-counted buffer and maintain an offset into it. A shared Array never touches the buffer unless it is the only array referencing it. This also gives copy-on-write semantics: when a shared Array tries to modify some data, it first allocates its own buffer and copies everything into it. This is all done without any external impact, so the user of an Array cannot tell the difference, except through timing. Here is some proof of it, using @benoist's sample:
start = Time.now
a = Array.new(100_000, "a")
a.size.times do
  a[0] = "b" # Modifying the array will force the array to own its own memory.
  a.shift
end
puts Time.now - start
Running on my computer: (I did not run crystal with release optimizations!)
This kind of optimization is nice because it makes some things faster and you only have to pay some costs if you need to pay the costs. But it also brings inexplicable slowdowns in functions that shouldn't ever be slow. Who could tell that Refs: |
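The copy-on-write behaviour described in this comment can be modelled roughly as below. This is a loose Ruby sketch of the idea only, not MRI's C implementation: `CowArray`, the `@shared` flag, and the `copies` counter are all hypothetical, and reference counting across several arrays is left out.

```ruby
# Hypothetical model: shift turns the array into a cheap view (offset into a
# buffer it no longer treats as its own); the next write copies the live part.
class CowArray
  attr_reader :copies

  def initialize(items)
    @buffer = items.dup
    @offset = 0      # only non-zero while the array is in "shared" mode
    @shared = false
    @copies = 0      # counts O(n) copies, to make the hidden cost visible
  end

  def size
    @buffer.size - @offset
  end

  # O(1): switch to view mode and advance the offset, no elements are moved.
  def shift
    return nil if size.zero?
    @shared = true
    value = @buffer[@offset]
    @offset += 1
    value
  end

  # Writing to a shared view first copies the live elements into an owned buffer.
  def []=(index, value)
    raise IndexError, "index #{index} out of bounds" if index < 0 || index >= size
    unshare if @shared
    @buffer[index] = value
  end

  def to_a
    @buffer[@offset, size]
  end

  private

  def unshare
    @buffer = @buffer[@offset, size] # O(n) copy of the remaining elements
    @offset = 0
    @shared = false
    @copies += 1
  end
end
```

In this model a loop that only shifts never copies, while a loop that alternates a write and a shift copies the remaining elements on every iteration, which matches the shape of the slowdown the modified benchmark is meant to expose.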
@lbguilherme thank you for this explanation! Just to be sure, what would be your suggestion to do with the current Array#shift implementation, leave it as is or change it into something similar to ruby? |
I'm not sure in which direction Crystal wants to go here. We could:
Still on point 2, there are simpler optimizations, like keeping a buffer and an offset and only reallocating when they differ too much. This would be of much less impact than shared buffers, but still. Just to be clear, there are possible optimizations for many other functions as well, not just I don't dislike either solution. Maybe a speed-focused container could come as a shard so not everybody would have to fear unpredictable performance. Or maybe it should be in the standard Array itself, so that everybody can benefit from the performance. I particularly like optimizing for the average user, even with surprising behavior. I would like to hear from @asterite on this. |
From a user perspective, I always liked how Array, Hash and String are super generally optimized data structures in Ruby. They can share data with other instances, they are mutable and can be made immutable, they have generally fast operations, etc. Of course that comes at the cost of implementing all of that. Maybe in Ruby it makes more sense because it's a dynamic language and implementing other data structures is inefficient, unless implemented in C. In most (all?) compiled languages you have different data structures, like in Java you have ArrayList, LinkedList, Deque, etc. That's nice but it's more cumbersome for the user, because she has to pick a data structure. So... I don't know. If you need to Maybe adding an offset to Array is acceptable, I don't know. Maybe |
This discussion should happen later once optimizations like this come to the top of our agenda, instead of simply features and stability. At that time we'll have a lot more data on array performance in practice. |
@asterite I've written a simple column storage database with encoding, and to convert the columns back to rows I was shifting the values to form the rows. I'm using iterators because the column storage is not always aligned to values that belong to the same row, due to compression. I'm keeping an offset now and iterating through the array instead of shifting. This allows me to use the sort function without an extra conversion to Deque. |
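The "keep an offset and iterate instead of shifting" approach described here might look roughly like this in Ruby. The `Cursor` class and `rows_from_columns` helper are hypothetical, and the sketch assumes plain, already-decoded columns of equal length; the compression and alignment details from the comment are not modelled.

```ruby
# Hypothetical cursor: reads a column front-to-back without mutating it.
class Cursor
  def initialize(values)
    @values = values
    @pos = 0 # offset of the next unread value
  end

  def next_value
    value = @values[@pos]
    @pos += 1
    value
  end

  def done?
    @pos >= @values.size
  end
end

# Rebuild rows by taking one value from each column cursor per row.
def rows_from_columns(columns)
  return [] if columns.empty?
  cursors = columns.map { |column| Cursor.new(column) }
  rows = []
  rows << cursors.map(&:next_value) until cursors.any?(&:done?)
  rows
end

p rows_from_columns([[1, 2, 3], %w[a b c]])
```

Because the columns are never mutated, they stay ordinary arrays, so `sort` and `uniq` remain available without converting to another container.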
What I find interesting is that I can read more about how ruby implements arrays here on crystal, than I can find on the ruby issue tracker or what not. Sorry for the distraction. ;-) |
@asterite there is no need for a circular buffer. Just move the array's content to the beginning of the allocation when the end of the allocation is reached. That is the way Ruby's Array actually works. 'Shared array' is just an implementation trick. |
@funny-falcon Yes, that's a possibility. But if you do |
If Array will be used as a Deque, then there should be amortization room, so the move doesn't happen too often. I did a fix for Ruby's Array for exactly this scenario (i.e. Ruby's Array used as a Deque). |
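A rough Ruby sketch of the "move the contents back to the start when the end of the allocation is reached" scheme, with some amortization room, is shown below. `SlidingBuffer`, the half-full growth rule, and the `moves` counter are assumptions made for illustration; they are not taken from the Ruby fix or from any Crystal PR.

```ruby
# Hypothetical model of a fixed allocation with a sliding window of live elements.
class SlidingBuffer
  attr_reader :size, :moves

  def initialize(capacity)
    @store = Array.new([capacity, 1].max) # stands in for the raw allocation
    @start = 0  # index of the first live element
    @size = 0
    @moves = 0  # how many times live elements had to be moved
  end

  def push(value)
    make_room if @start + @size == @store.size # reached the end of the allocation
    @store[@start + @size] = value
    @size += 1
    self
  end

  # O(1): clear the slot and slide the window start forward.
  def shift
    return nil if @size.zero?
    value = @store[@start]
    @store[@start] = nil
    @start += 1
    @size -= 1
    @start = 0 if @size.zero? # an empty buffer can reset its window for free
    value
  end

  def to_a
    @store[@start, @size]
  end

  private

  # Assumed rule: double the allocation when more than half of it is live,
  # otherwise keep the capacity and reuse the free space at the front.
  def make_room
    capacity = (@size * 2 > @store.size) ? @store.size * 2 : @store.size
    fresh = Array.new(capacity)
    @size.times { |i| fresh[i] = @store[@start + i] }
    @store = fresh
    @start = 0
    @moves += 1
  end
end
```

With this rule a move that keeps the capacity copies at most half the allocation and frees at least half of it, so the next move is at least that many pushes away. That spacing is the "amortization room" the comment asks for.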
As I've said, when |
If I'm not persona non grata here, I can make a PR for this. |
@funny-falcon Sure! No one is persona non grata here. I still have my doubts about this, though: adding four bytes to all arrays for just one method that's maybe not used all the time. And for example, in the OP's use case there was really no need for a shift. |
Haven't we already mentioned that it's too early for this kind of optimization? Besides, this is nothing more than a workaround. |
|
Just a quick question, what downsides are there of a deque over an array with a start offset from the allocation start? Apart from the obvious |
@asterite, the single benefit of implementing this is getting rid of Deque, and being closer to Ruby. All other issues could be solved with programmer discipline. @RX14 simplest way is to not change usage of |
I'd prefer to store a pointer to the allocation. |
First about the upsides of The downside of a |
I really don't think that In actual fact, I think that 95% of arrays will never wrap. Adding elements to the end is by far the most common array mutation op, and so I think most arrays won't ever wrap around. |
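For context on what "wrapping" costs, here is a minimal Ruby sketch of circular-buffer indexing. `RingBuffer` is a hypothetical fixed-capacity model (growth is omitted) and is not Crystal's `Deque` source; the point is only that each access adds one addition and one comparison, and that a buffer which is only pushed to never takes the wrapping branch.

```ruby
# Hypothetical fixed-capacity ring buffer; growth is deliberately left out.
class RingBuffer
  attr_reader :size

  def initialize(capacity)
    @store = Array.new(capacity)
    @start = 0 # physical index of logical element 0
    @size = 0
  end

  def [](index)
    return nil if index < 0 || index >= @size
    @store[physical(index)]
  end

  def push(value)
    raise "buffer full (growth not modelled)" if @size == @store.size
    @store[physical(@size)] = value
    @size += 1
    self
  end

  # O(1): clear the slot and advance the start, wrapping at the end.
  def shift
    return nil if @size.zero?
    value = @store[@start]
    @store[@start] = nil
    @start = physical(1)
    @size -= 1
    value
  end

  # True when the live elements straddle the end of the storage.
  def wrapped?
    @start + @size > @store.size
  end

  private

  # Map a logical index to a physical slot: one addition, one comparison.
  def physical(index)
    pos = @start + index
    pos -= @store.size if pos >= @store.size
    pos
  end
end
```

A buffer that has only been pushed to keeps `@start` at zero, so `wrapped?` stays false and `physical` never subtracts.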
In fact, instead of this suggestion, why not actually mix the implementations of |
@oprypin what do you mean by a "shortcut scenario"? The check that |
That or just |
@oprypin no, why? That means there are more branches in the non-zero case, and for what gain? Avoiding a single 1-cycle addition instruction in a piece of code that is memory bound, not IPC bound, anyway? |
@oprypin , you are not fully correct about complexity. Yes, occasional single operation is |
@funny-falcon what about simple Hash implementation w/o normalization etc.? Now I think that small PRs will be reviewed quickly. |
LBTM, there are no reasons to mix Array with Deque. |
If I were designing a language, I'd repeat Ruby's trick, just "because I can". But yeah, given that Deque is already implemented, there is not much point. |
@akzhan what are the reasons not to? Deque appears to be a superset of Array's functionality with the cost being in implementation complexity. |
It's just my point. I rarely use Arrays in Deque manner. |
To achieve it, pointer to allocation is preserved, and @buffer is moved instead of moving elements on `#shift`. For crystal-lang#5126
I was using array shift a lot, but I found out it's pretty slow compared to using an index lookup
This is the current implementation
If I change this function by
It is a lot faster
Shifting Array(String).new(100_000, "a") shows the following speed improvement
The question is: can it really be implemented like this, or is there an important reason the buffer move and clear are required for this operation? I've made the change in the current master branch and the tests all pass.