Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce enif_alloc_list / enif_make_alloc_list / enif_release_alloc_list #6293

Open
josevalim opened this issue Sep 11, 2022 · 7 comments
Open
Assignees
Labels
enhancement team:VM Assigned to OTP team VM

Comments

@josevalim
Copy link
Contributor

When working with NIFs and you have to allocate large lists, we typically have two options:

  1. Use enif_make_list_from_array but that requires allocating the data into an array, which will then be discarded
  2. Use enif_make_list_cell but that requires traversing from the back and allocating the cells one at a time

For large lists (32 elements over), 1 ends up being faster because the whole list is allocated at once but it uses more memory. 2 uses less memory but ends up slower (at least on Apple M1).

It would be nice if we had a similar API to binaries, where the list is preallocated once and then we can set the list elements one by one:

ErlNifBinary mutable_list;
enif_alloc_list(env, 10, enif_make_list(env, 0), mutable_list);

The above will allocate a list of 10 elements, where each element points to an empty list (the third argument). Now, we can update the given list element with:

enif_set_list(env, 0, enif_make_int(env, 1), mutable_list);
enif_set_list(env, 1, enif_make_atom(env, "awesome"), mutable_list);

Eventually, enif_make_alloc_list or enif_release_alloc_list needs to be invoked, similar to the binary API. The hope is that this API will put the performance closer to enif_make_list_from_array (the cost is one extra pass setting the CARs) while avoiding allocating an intermediate array.

I am not sure if this proposal makes sense, therefore feedback is welcome. :)

Is your feature request related to a problem? Please describe.

This came up here: rusterlium/rustler#486

@sverker
Copy link
Contributor

sverker commented Sep 12, 2022

I see two problems with the API:

  1. It's not functional. It's possible to create cyclic terms (by mistake).
  2. It kind of forces the implementation to allocate the list as one contiguous heap block.

Nr 1 is more about API consistency than a practical problem I think.

What do you think about a callback driven API, something like this:

{
    int counter = 1;
    return enif_generate_list(env, 10, callback, &counter);
}

ERL_NIF_TERM callback(ErlNifEnv* env, void* arg)
{
    int* counter_ptr = (int*)arg;
    return enif_make_int(env, (*counter_ptr)++);
}

This would make a lists:seq(1,10).
It's a compact API, just one function enif_generate_list.
It could also be expanded to let the callback function decide when to terminate the list.
The drawback is the list terms must be generated sequentially from first to last. No random access like your enif_set_list.

@josevalim
Copy link
Contributor Author

API wise that looks great to me, although I am unsure if this is something we could leverage from Rust. @hansihe, do you have thoughts here?

The drawback is the list terms must be generated sequentially from first to last.

If that ever becomes a need, we could also have enif_generate_reverse_list or something.

@hansihe
Copy link

hansihe commented Sep 12, 2022

I think an API like that would work very well with Rust iterators.

Allowing the callback to decide when the list terminates is a good idea as well, as it would enable us to build lists from iterators of unknown length. I don't know the internals on the OTP side, but if it could be beneficial, allowing the caller to specify an optional list size hint would also be an option.

@sverker
Copy link
Contributor

sverker commented Sep 12, 2022

Proposal:

ERL_NIF_TERM enif_generate_list(ErlNifEnv*,
                                ErlNifGenerateListFn* callback,
                                size_t length_hint);

typedef ERL_NIF_TERM ErlNifGenerateListFn(ErlNifEnv*,
                                          ERL_NIF_TERM* end_tail,
                                          void* arg);

To terminate the list, the callback would do
*end_tail = enif_make_list(env,0);
or set something else to append on another list or even make an improper list.

It should probably be named something else as all other term creating functions are named enif_make_*. The other list creating functions are called enif_make_list*, so maybe

enif_make_list_generate
enif_make_list_while

@sverker
Copy link
Contributor

sverker commented Dec 14, 2022

A revised API with simple implementation proposal.
In short, in each callback

  • write to *head to create a new list cell.
  • write to *tail to end the list with whatever tail you want.
typedef void ErlNifMakeListWhileFn(ErlNifEnv*,
                                  ERL_NIF_TERM* head,
                                  ERL_NIF_TERM* tail,
                                  void* arg);

ERL_NIF_TERM enif_make_list_while(ErlNifEnv* env,
                                  ErlNifMakeListWhileFn* callback,
                                  void* arg,
                                  size_t length_hint)
{
    Eterm result;
    Eterm *prev_tailp = &result;
    Eterm head, tail;
    do {
        head = tail = THE_NON_VALUE;
        callback(env, &head, &tail, arg);

        if (head != THE_NON_VALUE) {
            Eterm* hp = alloc_heap(env, 2);
            *prev_tailp = make_list(hp);
            CAR(hp) = head;
            prev_tailp = &CDR(hp);
        }
    } while (tail == THE_NON_VALUE);

    *prev_tailp = tail;
    return result;
}

Features

  • Create list from head to tail without prior knowledge of list length
  • Create improper lists
  • Append on existing list
  • Create empty lists
  • Option to end the list either at creation of last cell or at next call after last cell, depending on what fit users iteration best.
  • Bonus "feature": Make any term you want (by writing only to *tail at first call).

I thought about adding some way to abort and signal error back to caller, but the user can do that themself with arg.

@josevalim
Copy link
Contributor Author

LGTM and i think it would also work with Rust iterators, right @hansihe?

@hansihe
Copy link

hansihe commented Dec 14, 2022

Yep, this looks like a really nice API that would fit in with Rust iterators very well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement team:VM Assigned to OTP team VM
Projects
None yet
Development

No branches or pull requests

4 participants