NIF-free Riak #961

Open
martinsumner opened this issue Mar 14, 2019 · 7 comments


@martinsumner
Contributor

This has come up at Riak meetups, and also conferences. Can Riak be run NIF-free (or at least without eleveldb)?

There has also been a mailing list question about running Riak on different architectures - http://lists.basho.com/pipermail/riak-users_lists.basho.com/2019-March/039337.html - and a NIF-free Riak would make this much easier.

There is now a pure-Erlang KV backend for Riak in leveled (it does import an lz4 NIF, but this is not used in the default configuration, so stripping it out would be fairly easy).

However, eleveldb is also used as the backend for the riak_core hashtree implementation. In non-yokozuna deployments the hashtree can be disabled for KV and Tictac AAE used instead, but the hashtree is still used in riak_core for AAE of cluster metadata.

Cluster metadata isn't particularly big though, so when providing AAE for cluster metadata, eleveldb could be swapped for dets fairly easily. Performance-wise, that may be imperfect if AAE is being used in KV. It would also be possible to swap it for leveled (although that, like leveldb, seems overkill when the only use is AAE).
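
As a rough illustration of the dets option, here is a minimal sketch of a key/hash store built only on the OTP `dets` module. The module name `dets_hash_store` and its functions are hypothetical and do not come from riak_core; the sketch only shows that the basic store and fetch operations need no NIF.

```erlang
%% Hypothetical sketch: a tiny NIF-free key/hash store on top of dets.
%% Module and function names are illustrative, not riak_core APIs.
-module(dets_hash_store).
-export([open/1, store/3, fetch/2, close/1]).

%% Open (or create) a dets table backed by File (a string path).
%% The file name doubles as the table name.
open(File) ->
    dets:open_file(File, [{file, File}, {type, set}]).

%% Store the hash for a key, overwriting any previous value.
store(Tab, Key, Hash) ->
    dets:insert(Tab, {Key, Hash}).

%% Look up the hash for a key.
fetch(Tab, Key) ->
    case dets:lookup(Tab, Key) of
        [{Key, Hash}] -> {ok, Hash};
        [] -> not_found
    end.

%% Flush and close the table.
close(Tab) ->
    dets:close(Tab).
```

Note that a dets `set` table is not ordered and a dets file is limited to 2 GB, so this would only suit a small dataset such as cluster metadata, and any ordered iteration the hashtree depends on would need a different approach.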

I may have a look at this next month as a side project, trying to get a NIF-free Riak branch (with some feature limitations), and see how difficult this is.

@martinsumner
Contributor Author

martinsumner commented Mar 14, 2019

I think the three main places where there are NIFs are:

erlang_js (JS map reduce)
leveldb
bitcask

But there is also:

ebloom (not sure it is used outside of riak_repl, and there are pure erlang alternatives)
riak_ensemble (has a monotonic clock NIF - could drop ensemble, or with OTP 20 work use native erlang version?)
syslog (used by lager syslog backend. Just drop this? Does anyone use it?)
canola (PAM authentication used by riak_auth_mods. Just drop this? Does anyone use it?)

So it would need to be a slimmed down riak.

@lemenkov

I think the three main places where there are NIFs are:

erlang_js (JS map reduce)

Technically this isn't a NIF yet but yes, this is arch-dependent.

leveldb
bitcask

But there is also:

ebloom (not sure it is used outside of riak_repl, and there are pure erlang alternatives)
riak_ensemble (has a monotonic clock NIF - could drop ensemble, or with OTP 20 work use native erlang version?)

I believe we can use erlang:monotonic_time() as a monotonic clock.
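
A minimal sketch of what that could look like, assuming OTP 20 or later. The module name `mono_clock` and its functions are hypothetical and are not the riak_ensemble API:

```erlang
%% Hypothetical sketch: a NIF-free monotonic clock helper.
%% Names are illustrative, not the riak_ensemble API.
-module(mono_clock).
-export([now_ms/0, elapsed_ms/1]).

%% Current monotonic time in milliseconds. The absolute value is
%% arbitrary (it may be negative); only differences are meaningful.
now_ms() ->
    erlang:monotonic_time(millisecond).

%% Milliseconds elapsed since an earlier now_ms/0 reading.
elapsed_ms(Start) ->
    now_ms() - Start.
```

Usage would be along the lines of `T0 = mono_clock:now_ms()` followed later by `mono_clock:elapsed_ms(T0)`.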

syslog (used by lager syslog backend. Just drop this? Does anyone use it?)

Syslog can easily be reimplemented as a NIF-free library nowadays. Or, even better, rely on lager/logger and let the user decide which backend to use.

canola (PAM authentication used by riak_auth_mods. Just drop this? Does anyone use it?)

So it would need to be a slimmed down riak.

@bryanhuntesl
Contributor

bryanhuntesl commented Mar 21, 2019 via email

@andytill

I'm not that familiar with the changes on this branch, but is sext still in use? The performance boost over pure Erlang was considerable; the numbers are probably in a Basho private repo somewhere.

@llelf
Contributor

llelf commented Mar 21, 2019

sext still in use?

Still used by the eleveldb backend (but correct me if I'm wrong).

@martinsumner
Contributor Author

SEXT:

Yes it is used by eleveldb (only), and so if we strip eleveldb then sext will go with it.

(BTW, I also remember that the performance benefits were real, and there were some big differences with 2i queries. In testing, though, we get bigger 2i improvements by switching to leveled, although that is as much due to better vnode queue management as to raw query throughput.)

ENSEMBLE:

This was something we wanted to burn altogether, but a customer emerged who was using it (they built their tech platform for a Super Bowl ad campaign on it). There is also continued use in academia (I think @cmeiklejohn is actively using it in his research).

Just for clarity on the purpose of this issue (and the proposed side project): the idea would be to have a pure-Erlang "nucleus" of Riak. There would still be the option for people to build optional extensions that are non-Erlang (e.g. ensemble, yokozuna, etc).

However, I do think that JS map/reduce should probably be junked altogether and not even be an option in the future.

@aramallo

My 2 cents here: if sext is used to keep sort order in eleveldb for composite keys, then maybe it can be avoided with a technique Russell showed me he was using in the BigSets prototype, and which I've been using in my Riak Core based DB. Basically, you use term_to_binary/1 for each key component and separate the components with <<$0>>. Leveldb will keep the order when separating components with $0. In our case we ended up using <<0,$\31,0>> as the key separator for 6-element keys and it works like a charm, removing 50 microsecs for every sext:decode/1 call!
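
A minimal sketch of the encoding side of that technique, under assumptions: the module name `composite_key` and the function `encode/2` are hypothetical, and the separator is passed in as a parameter rather than fixed, so either separator mentioned above can be tried.

```erlang
%% Hypothetical sketch of the composite-key encoding described above.
%% Names are illustrative; the separator is supplied by the caller.
-module(composite_key).
-export([encode/2]).

%% Encode each component with term_to_binary/1 and join the results
%% with Sep (a binary), producing a single binary key.
encode(Components, Sep) when is_list(Components), is_binary(Sep) ->
    Encoded = [term_to_binary(C) || C <- Components],
    iolist_to_binary(lists:join(Sep, Encoded)).
```

Keys that share leading components share a byte prefix, so they sit next to each other under a bytewise comparator. Ordering between different values of one component follows the byte order of the term_to_binary/1 output, which is not necessarily Erlang term order. Decoding is left out of the sketch, because an encoded component can itself contain the separator bytes.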
