restructure supervision tree so that folsom is an included app #217

Merged
merged 1 commit into from Jul 25, 2012

Projects

None yet

4 participants

@russelldb
Contributor

All stat mods depend on folsom, yet they are not linked to it.
This change brings folsom under supervision of a core stat sup,
which also supervises the riak stat subsystem. Now when folsom exits
everyone gets to restart clean and recover.

riak_core_sup
   |
   riak_core_stat_sup (rest_for_one)
    \
       - folsom_sup
       - riak_core_stats_sup (one_for_one)
        \
          - riak_*_stat
          - riak_stat_cache

riak_core_stats_sup will start and supervise gen_server stat mods at
registration time, and will re-start them should the sup crash.

@russelldb russelldb restructure supervision tree so that folsom is an included app
All stat mods depend on folsom, yet they are not linked to it.
This change brings folsom under supervision of a core stat sup,
which also supervises the riak stat subsystem. Now when folsom exits
everyone gets to restart clean and recover.

riak_core_sup
    |
riak_core_stat_sup (rest_for_one)
    \
       - folsom_sup
       - riak_core_stats_sup (one_for_one)
        \
          - riak_*_stat
          - riak_stat_cache

riak_core_stats_sup will start and supervise gen_server stat mods at
registration time, and will re-start them should the sup crash.
9279d51
@russelldb russelldb was assigned Jul 25, 2012
@argv0
Contributor
argv0 commented Jul 25, 2012

+1 to merge. Tested various crash scenarios, does what it says on the tin.

@argv0 argv0 merged commit 9279d51 into 1.2 Jul 25, 2012
@yrashk
yrashk commented Mar 5, 2013

I am wondering if this was a good decision. Systems that use riak_core would also like to use folsom (this is our case, too) — but we can't nicely initialize our metrics for these reasons:

  • folsom_sup doesn't read out folsom application env to initialize metrics declared there
  • with riak_core being an included application (at least in our case) we can't initialize our metrics before we start our supervisor because in this case riak_core startup will fail — because we would have to start folsom to initialize our metrics, and riak_core_stat_sup will obviously choke on that
  • also, we can't initialize our metrics after we start our own supervision tree because some processes in that supervision tree need the metrics to be initialized right away, at their startup time

I only have two workarounds, and neither of them is pretty:

  • Start a temporary (or a permanent) worker right after riak_core to let it declare the metrics. It can also find folsom_sup and link to it, I guess
  • Initialize required metrics in their respective workers — which makes metrics initialization really scattered around and in certain case, leads to runtime initialization duplication overhead.

Any thoughts?

@aerosol

Unfortunately this prevents riak_core from starting when having an application that depends on riak_core and folsom :(

@russelldb could you ELI5 the purpose of it? I'm unable to comprehend the commit message tbh. Were you experiencing folsom crashing? Was folsom's own supervision not good enough?

Contributor

ELI15? Folsom's supervisor is not good enough. Ets tables containing histogram data were crashing and then forever errorring as some other table would not let you re-create them, nor delete them. TBH we're removing folsom in preference to exometer, so a PR that removes this change might get some attention.

@seancribbs seancribbs deleted the rdb-stats-aas branch Apr 1, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment