
Starting riak_core as application, how-to #1

Closed · djui opened this issue Oct 12, 2010 · 2 comments

djui commented Oct 12, 2010

Hi,

I was wondering whether the riak_core_util:start_app_deps(riak_core) call in riak_core_app:start/2 doesn't create a chicken-and-egg problem. When I try to start riak_core (for testing) from a shell with application:start(riak_core), it throws an exception because the other applications (crypto, webmachine) are not running yet. So the code in start_app_deps/1, which reads the same app-file key that ERTS itself uses to get the list of dependent applications, is never reached. And if the dependency applications are started beforehand, there is no need for it.

So I wonder if I missed something when starting up the riak_core app. I currently use:

erl -pa ebin deps/*/ebin \
    -boot start_sasl \
    -eval "application:start(crypto), application:start(webmachine), application:start(riak_core)."
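The helper under discussion follows a common OTP pattern: read the app file's `applications' key and try to start each dependency, tolerating ones that are already running. A minimal sketch of that pattern (names and structure are illustrative, not the actual riak_core code; see riak_core_util:start_app_deps/1 for the real thing):

```erlang
%% Sketch of the start-dependencies pattern discussed in this issue.
-module(app_deps_sketch).
-export([start_app_deps/1]).

%% Read the 'applications' key from App's .app file and start each
%% listed dependency (this is the same key ERTS consults).
start_app_deps(App) ->
    {ok, Deps} = application:get_key(App, applications),
    lists:foreach(fun ensure_started/1, Deps),
    ok.

%% Start an application, treating "already started" as success.
ensure_started(App) ->
    case application:start(App) of
        ok -> ok;
        {error, {already_started, App}} -> ok
    end.
```

Note the chicken-and-egg djui describes: application:start/1 refuses to start riak_core until its declared dependencies are already running, so in the failing case the application controller never reaches riak_core_app:start/2, and this helper never runs.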
argv0 added a commit to argv0/riak_core that referenced this issue Apr 22, 2011
slfritchie added a commit that referenced this issue May 11, 2012
In an ideal world, this module would live in a repo that would be
easily sharable across multiple Basho projects.  The tricky bit for
this would be trying to make generic the
`application:get_env(riak_core, dtrace_support)' call that's
currently in the `riak_kv_dtrace:dtrace/1' function.  But we'll
wait for another day, I think.

The purpose of this module is to reduce the overhead of DTrace (and
SystemTap) probes when those probes are: 1. not supported by the VM,
or 2. disabled by application configuration.  #1 is the bigger
problem: a single call to the code loader can take several
milliseconds.  #2 is useful in the case that we want to try to reduce
the overhead of adding these probes even further by avoiding the NIF
call entirely.
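The gating described above amounts to computing the probe-enabled decision once and caching it where a lookup is nearly free (the benchmark below measures mochiglobal for this). A rough sketch of the idea, assuming runtime_tools' dyntrace:available/0 and using the process dictionary as a stand-in for the real cache (the actual code lives in riak_kv_dtrace:dtrace/1):

```erlang
%% Sketch: decide once whether DTrace/SystemTap probes are usable,
%% cache the answer, and short-circuit before any NIF call.
dtrace_enabled() ->
    case get(dtrace_enabled_cached) of      % stand-in for mochiglobal
        undefined ->
            Enabled = dyntrace:available()  % 1. VM support present?
                andalso application:get_env(riak_core, dtrace_support)
                         =:= {ok, true},    % 2. enabled by app config?
            put(dtrace_enabled_cached, Enabled),
            Enabled;
        Cached ->
            Cached
    end.
```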

SLF's MacBook Pro tests

without cover, with R14B04 + DTrace:

timeit_naive                 average  2236.306 usec/call over     500.0 calls
timeit_mochiglobal           average     0.509 usec/call over  225000.0 calls
timeit_best OFF (fastest)    average     0.051 usec/call over  225000.0 calls
timeit_best ON -init         average     1.027 usec/call over  225000.0 calls
timeit_best ON +init         average     0.202 usec/call over  225000.0 calls

with cover, with R14B04 + DTrace:

timeit_naive                 average  2286.202 usec/call over     500.0 calls
timeit_mochiglobal           average     1.255 usec/call over  225000.0 calls
timeit_best OFF (fastest)    average     1.162 usec/call over  225000.0 calls
timeit_best ON -init         average     2.207 usec/call over  225000.0 calls
timeit_best ON +init         average     1.303 usec/call over  225000.0 calls
@rzezeski
Contributor

This is interesting. For a Riak deployment the boot scripts should be starting all the applications, and in the appropriate order. If you are just messing with core at a console, then I don't see how this works at all: the applications will be returned in the wrong order, and when ensure_started calls application:start it will get {error,{not_started,foo}} and cause a badmatch. Although, in practice, when I run application:start(riak_core) I don't see a badmatch.

I'm not sure why this code is needed at all and I don't think it should be called by riak_core_app:start.

ghost assigned rzezeski Aug 23, 2012
@jrwest
Contributor

jrwest commented Aug 9, 2013

Closing this out. Perhaps there is an issue here, but after working on two separate riak_core applications (Riak and another application for a previous employer), booting applications has not been a problem. If the issue still applies today, please create a new one.

@jrwest jrwest closed this as completed Aug 9, 2013
jrwest added a commit that referenced this issue Dec 19, 2013
Yet Another Round of Cluster Metadata Improvements (YAROCMI #1)
rzezeski added a commit that referenced this issue Sep 9, 2014
Abandon the idea of having an implicit lookup from bucket to index
when using Yokozuna to drive search input to map-reduce.

This is a very long-winded commit message because the subject matter
is confusing.  Scroll to end for TL;DR.

Riak Search and Yokozuna can both be used as engines to produce
results for a search input to a map-reduce job.  For example, in
pre-Yokozuna days, if a user specified the following HTTP JSON
map-reduce job, Riak Search would be used to run a query against the
"bucket" foo.

"inputs":{"bucket":"foo", "query":"bar"}

Calling it a bucket is really a misnomer, because what is being
searched is actually an _index_ of bucket foo.  However, since Riak
Search forced a 1:1 mapping from bucket to index name, there wasn't
much of a difference.  The possibility of an M:1 bucket-to-index
mapping in the future was never considered.

Fast forward to today.  There are two search systems in Riak now.
Yokozuna will eventually replace Riak Search but for the time being
there will be migrations from one to the other.  Furthermore, Search
APIs have already been exposed that assumed the functionality of Riak
Search, which is a small subset of Yokozuna.  Therefore, Yokozuna must
work around these weird cases.

In this case, passing the previous HTTP JSON map-reduce input to
Yokozuna, the user might expect one of two behaviors.

1. A bucket has an associated index.  Get the index from the mapping
and run the query against that.

2. Since the input should have been an index name to begin with, and
the 1:1 mapping was just a coincidence of Riak Search, simply treat
the 'bucket' input as the index name to search against.

The first option might seem like the most obvious, but what if the
user decided to index multiple buckets under the same index?  The
results would now include results from other buckets.  An implicit
filter could be added to the query, but then we've started introducing
more implicit (magical) behavior.  Furthermore, under the covers, it
complicates the code.  For security purposes, do we check for
permission on the bucket, the index, or both?  But even if that
weren't enough, things get EVEN STRANGER when you consider that you
can use 2i as input to a map-reduce job with the following JSON.

"inputs":{"bucket":"foo", "index":"field1_bin", "key":"bar"}

Hey, I know, we could overload this "inputs" field even more and
convert the following into a search request.

"inputs":{"index":"foo", "query":"bar"}

But if we chose option #1 from above, we have a problem here.  The
Yokozuna code needs to know whether the incoming search request is
using an index name or a bucket name.  If the latter, it needs to look
up the index to query.  But in order to do this, the Riak KV code
needs to create a new data structure to pass as the "Bucket" parameter
so that the Yokozuna callback knows which is which.  Does this sound
very confusing?  I hope so, because it is!

So after a long talk with Bryan Fink we decided to stop trying to be
cute and just face the fact that "bucket" was a poor name for the
input structure.  Just pretend it is actually "index" and Yokozuna
will treat it as such.

TL;DR:

If you are migrating from Riak Search, then you should make sure to
keep a 1:1 bucket-to-index mapping if you don't want to change your
map-reduce input.

If you are starting fresh, then just use the new JSON format to feed
map-reduce with Yokozuna.

"inputs":{"index":"foo", "query":"field1:bar", "filter":"field2:baz"}

(Note that "filter" is optional)
3 participants