
Allow lists to be built from iterators #486

Closed · josevalim wants to merge 4 commits

Conversation

@josevalim (Contributor)

In particular, allow lists to be built from double-ended iterators, which avoids allocating intermediate vectors.

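For context, here is a minimal sketch of the technique (not the exact PR code, and assuming rustler's Term::list_new_empty and Term::list_prepend helpers): walk the iterator from the back and prepend one cons cell per element, so no intermediate Vec of terms is needed.

```rust
use rustler::{Encoder, Env, Term};

// Build the list back-to-front: start from [] and prepend each element,
// consuming the iterator in reverse so the result keeps the original order.
fn encode_from_end<'a, T, I>(env: Env<'a>, iter: I) -> Term<'a>
where
    T: Encoder,
    I: DoubleEndedIterator<Item = T>,
{
    let mut list = Term::list_new_empty(env);
    for item in iter.rev() {
        list = list.list_prepend(item.encode(env));
    }
    list
}
```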
@filmor (Member) commented Sep 11, 2022

Nice! Did you compare the performance of both approaches?

@josevalim (Contributor, Author)

make_list_from_end was actually 20% faster when encoding Arrow arrays. I have yet to benchmark whether it makes the default Vec encoding faster too. Later today I will have some numbers. :)

@josevalim (Contributor, Author) commented Sep 11, 2022

Ok, so this is tricky. Here are the numbers:

## New (this PR)

Name               ips        average  deviation         median         99th %
16 elems        1.03 M      971.63 ns  ±1498.29%         958 ns        1083 ns
32 elems      668.49 K        1.50 μs   ±918.15%        1.46 μs        1.58 μs
128 elems     220.57 K        4.53 μs   ±388.58%        4.38 μs        4.75 μs
1k elems       31.51 K       31.73 μs     ±5.36%       31.29 μs       36.92 μs
1M elems         28.28       35.37 ms     ±7.33%       33.73 ms       40.14 ms

## Old (master)

Name               ips        average  deviation         median         99th %
16 elems      915.15 K        1.09 μs  ±1399.89%        1.08 μs        1.21 μs
32 elems      696.04 K        1.44 μs   ±943.38%        1.42 μs        1.54 μs
128 elems     274.22 K        3.65 μs   ±289.96%        3.54 μs        3.88 μs
1k elems       42.16 K       23.72 μs    ±67.46%       23.17 μs       29.50 μs
1M elems         38.43       26.02 ms     ±7.68%       24.69 ms       30.16 ms

You can see this PR is faster for small lists but then it gets slower, most likely because enif_make_list_from_array allocates space for the whole list upfront, while this PR has to allocate cell by cell. On the other hand, the approach in this PR uses less memory, since it never materializes the intermediate vector.
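To make the double allocation concrete, here is a sketch of the master-style path (not verbatim library code; it reuses the internal as_c_arg/list::make_list helpers that appear in the snippet further down): the elements are first collected into a Vec<NIF_TERM>, and enif_make_list_from_array then builds the whole list from it in one go.

```rust
// Master-style encoding: two allocations per list, the intermediate
// Vec<NIF_TERM> plus the block of cons cells that
// enif_make_list_from_array reserves on the process heap upfront.
fn encode_via_array<'a, T: Encoder>(env: Env<'a>, items: &[T]) -> Term<'a> {
    let terms: Vec<NIF_TERM> = items.iter().map(|x| x.encode(env).as_c_arg()).collect();
    unsafe { Term::new(env, list::make_list(env.as_c_arg(), &terms)) }
}
```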

I will be glad to revert the changes to rustler::types::list if you would prefer not to change the default encoding.

EDIT: After thinking a bit more about it, I would argue that this PR may be the way to go:

  1. This PR is faster for small lists (say up to 16 elems)
  2. For larger lists, it is slower, but that's when avoiding the double allocation matters more

@hansihe (Member) commented Sep 11, 2022

With regards to erlang/otp#6293:

There are a couple of places I can think of where the performance difference between approaches 1 and 2 might come from:

a. The alloc/free of the buffer used to store terms passed to enif_make_list_from_array
b. The allocation on the process heap inside enif_make_list_from_array/enif_make_list_cell
c. The function call overhead each time enif_make_list_cell is called

The two different approaches are affected by the different factors as follows:

  1. a and b dominate
  2. b and c dominate

I have nothing hard to base this on, but I have a gut feeling that the reason approach 2 gets slower is that c becomes a pretty large dominating factor.

If this is the case, introducing an API like enif_alloc_list/enif_set_list might not improve things by much.
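Purely for illustration (neither function exists; these are the hypothetical additions under discussion, with guessed signatures), such an API might look roughly like this. Note that it would still pay cost c once per element:

```rust
use rustler_sys::{ErlNifEnv, ERL_NIF_TERM};

extern "C" {
    // Hypothetical: reserve the cons cells for an n-element list in one
    // heap allocation, addressing costs a and b.
    fn enif_alloc_list(env: *mut ErlNifEnv, n: usize) -> ERL_NIF_TERM;
    // Hypothetical: write one element into a pre-allocated cell. Still one
    // function call per element, so cost c is unchanged.
    fn enif_set_list(env: *mut ErlNifEnv, list: ERL_NIF_TERM, index: usize, value: ERL_NIF_TERM);
}
```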

A solution to this would be to introduce some way of allowing our code to populate the data on the process heap directly, without going through a function call. This might be a hard sell to the OTP team, because (at least in the simplest approach) it would require compiled NIFs to have some knowledge of how lists are laid out on process heaps.

Of course all of this is very speculative on my end, but I think it would be useful to validate what causes performance to drop before we introduce new NIF APIs that might not improve things by much.

@josevalim (Contributor, Author)

@hansihe if the NIF call cost is the culprit, then we can emulate it. Here is what I did: I changed encode to perform a call to enif_make_list(env, 0) for every element, which is as cheap as a NIF call can be:

```rust
impl<T> Encoder for [T]
where
    T: Encoder,
{
    fn encode<'b>(&self, env: Env<'b>) -> Term<'b> {
        let env_as_c_arg = env.as_c_arg();
        let term_array: Vec<NIF_TERM> = self
            .iter()
            .map(|x| {
                // Extra no-op NIF call per element, purely to emulate the
                // per-cell function call overhead (cost c) on top of the
                // old array-based approach.
                unsafe { rustler_sys::_enif_make_list(env_as_c_arg, 0) };
                x.encode(env).as_c_arg()
            })
            .collect();
        // Old approach: build the whole list in one call from the
        // intermediate term array.
        unsafe { Term::new(env, list::make_list(env.as_c_arg(), &term_array)) }
    }
}
```

These are the results:

Name               ips        average  deviation         median         99th %
16 elems       945.72 K        1.06 μs  ±1417.14%        1.04 μs        1.17 μs
32 elems       703.49 K        1.42 μs   ±889.53%        1.38 μs        1.58 μs
128 elems      278.42 K        3.59 μs   ±367.05%        3.46 μs        3.88 μs
1k elems        43.41 K       23.04 μs    ±51.07%       22.46 μs       29.45 μs
1M elems          37.20       26.88 ms     ±7.57%       25.52 ms       31.26 ms

As you can see, it is slightly slower for the large case, but sometimes even slightly faster, so I don't think the NIF call adds too much. WDYT?

@hansihe (Member) commented Sep 11, 2022

@josevalim Smart way of testing it!

Seems I was very off target about which cost dominated. The API you suggested would probably work pretty well.

@josevalim closed this by deleting the head repository on Feb 10, 2023.