Allow lists to be built from iterators #486
Conversation
Nice! Did you compare the performance of both approaches?
make_list_from_end was actually 20% faster when encoding arrow arrays. I have yet to benchmark whether it makes the default Vector encoding faster too. Later today I will have some numbers. :)
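For context on why building from the end helps, here is a minimal, illustrative sketch, not rustler's actual API: `build_forward` and `build_from_end` are made-up names, and a plain `Vec` stands in for the Erlang cons list. An Erlang list only supports cheap prepend (as with `enif_make_list_cell`), so a forward pass of prepends comes out reversed, while walking the input back-to-front with a double-ended iterator yields the elements in order with no extra reversal pass.

```rust
// Hypothetical sketch: Vec<i64> stands in for an Erlang cons list, and
// insert(0, _) plays the role of prepending a cons cell.

// Prepending while iterating forwards produces the list reversed, so a
// second reversal pass would be needed.
fn build_forward(items: &[i64]) -> Vec<i64> {
    let mut list = Vec::new();
    for &x in items {
        list.insert(0, x); // prepend, like enif_make_list_cell
    }
    list // reversed!
}

// Prepending while iterating from the end produces the list in order,
// with no reversal pass and no intermediate buffer of encoded terms.
fn build_from_end(items: &[i64]) -> Vec<i64> {
    let mut list = Vec::new();
    for &x in items.iter().rev() {
        list.insert(0, x); // prepend, starting from the last element
    }
    list // already in order
}

fn main() {
    assert_eq!(build_forward(&[1, 2, 3]), vec![3, 2, 1]);
    assert_eq!(build_from_end(&[1, 2, 3]), vec![1, 2, 3]);
    println!("ok");
}
```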
Ok, so this is tricky. Here are the numbers:
You can see this PR is faster for small lists but then it gets slower. I will be glad to revert the changes. EDIT: After thinking a bit more about it, I would argue that this PR may be the way to go:
With regards to erlang/otp#6293: there are a couple of places I can think of where the performance difference between approach 1 and approach 2 might come from: a. the alloc/free of the buffer used to store the terms being passed. The two different approaches are affected by these factors as follows:
I have nothing hard to base this on, but I have a gut feeling the reason approach 2 gets slower is that c becomes a pretty large dominating factor. If this is the case, introducing a new API may not help much. A solution to this would be to introduce some way of allowing our code to populate the data on the process heap directly, without going through a function call per element. This might be a hard sell to the OTP team, because (at least in the simplest approach) it would require compiled NIFs to have some knowledge of how lists are laid out on process heaps. Of course, all of this is very speculative on my end, but I think it would be useful to validate what causes performance to drop before we introduce new NIF APIs that might not improve things by much.
@hansihe if the NIF cost is the culprit, then we can emulate it? Here is what I did: I changed `encode` to perform an extra NIF call to `_enif_make_list` for every element:

```rust
impl<T> Encoder for [T]
where
    T: Encoder,
{
    fn encode<'b>(&self, env: Env<'b>) -> Term<'b> {
        let env_as_c_arg = env.as_c_arg();
        let term_array: Vec<NIF_TERM> = self
            .iter()
            .map(|x| {
                // Extra no-op NIF call per element, just to measure its overhead.
                unsafe { rustler_sys::_enif_make_list(env_as_c_arg, 0) };
                x.encode(env).as_c_arg()
            })
            .collect();
        unsafe { Term::new(env, list::make_list(env.as_c_arg(), &term_array)) }
    }
}
```

These are the results:
As you can see, it is slightly slower for the large case, but sometimes even slightly faster, so I don't think the NIF call adds too much? WDYT?
@josevalim Smart way of testing it! Seems I was very off target about which cost dominated. The API you suggested would probably work pretty well.
In particular, allow lists to be built from double-ended iterators, which avoids allocating intermediate vectors.
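As a sketch of what the summary above means, under stated assumptions: this is plain Rust, the names `via_buffer` and `from_end` are hypothetical, and a `Cons`/`Nil` enum stands in for Erlang cons cells. The old path collects elements into an intermediate `Vec` before building the list; the iterator path consumes a `DoubleEndedIterator` directly and skips that allocation.

```rust
// Illustrative-only contrast of the two construction strategies
// (not rustler's API).

#[derive(Debug, PartialEq)]
enum List {
    Nil,
    Cons(i64, Box<List>),
}
use List::{Cons, Nil};

fn cons(head: i64, tail: List) -> List {
    Cons(head, Box::new(tail))
}

// Old path: collect into an intermediate vector, then fold it from
// the right into cons cells.
fn via_buffer<I: Iterator<Item = i64>>(iter: I) -> List {
    let buffer: Vec<i64> = iter.collect(); // intermediate allocation
    buffer.into_iter().rev().fold(Nil, |tail, h| cons(h, tail))
}

// New path: walk the double-ended iterator back-to-front directly;
// no intermediate buffer is ever allocated.
fn from_end<I: DoubleEndedIterator<Item = i64>>(iter: I) -> List {
    iter.rev().fold(Nil, |tail, h| cons(h, tail))
}

fn main() {
    // Both strategies build the same list [1, 2, 3].
    assert_eq!(via_buffer(1..=3), from_end(1..=3));
    assert_eq!(from_end(1..=3), cons(1, cons(2, cons(3, Nil))));
    println!("ok");
}
```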