Replace Folsom and improve performance

Folsom histograms are a major bottleneck under high concurrency, as described
in #4650. This was noticed during performance testing, confirmed using Erlang
VM lock counting, then verified by creating a test release with histogram
update logic commented out [1].

CouchDB doesn't use most of the Folsom statistics and metrics; we only use
counters, gauges and one type of sliding time window, sampling histogram.
Instead of trying to re-design and update Folsom, which is a generic stats and
metrics library, take a simpler approach: implement just the three metric types
we need, then remove the Folsom and Bear dependencies altogether.

All the metric types we re-implement are based on two relatively new
Erlang/OTP features: counters [2] and persistent terms [3]. Counters are
mutable arrays of integers which allow fast concurrent updates, and persistent
terms allow fast, global, constant-time access to Erlang terms.
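
As a rough illustration of these two primitives (the registry key and layout
below are made up for the example, not the actual couch_stats ones):

```erlang
-module(stats_sketch).
-export([example/0]).

%% A made-up registry key; the real couch_stats metric names differ.
-define(KEY, {stats_sketch, my_counter}).

example() ->
    %% One-slot counter array; write_concurrency favors concurrent updates.
    CRef = counters:new(1, [write_concurrency]),
    %% Publish it globally, so readers avoid an ETS lookup or a server call.
    persistent_term:put(?KEY, CRef),
    %% Any process can now bump slot 1 directly, without locks.
    counters:add(persistent_term:get(?KEY), 1, 1),
    %% Reading is just as cheap.
    counters:get(persistent_term:get(?KEY), 1).
```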

Gauges and counters are implemented as counter arrays with one element.
Histograms are represented as counter arrays where each array element is a
histogram bin. Since we're dealing with sliding time window histograms, we have
a tuple of counter arrays, where each time instant (each second) is a counter
array. The overall histogram object then looks something like:

```
Histogram = {
     1          = [1, 2, ..., ?BIN_COUNT]
     2          = [1, 2, ..., ?BIN_COUNT]
     ...
     TimeWindow = [1, 2, ..., ?BIN_COUNT]
  }
```
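
As a rough sketch, such an object could be assembled from the primitives above
like this; the module and argument names are made up, and the real constants
live in `couch_stats_histogram.erl`:

```erlang
-module(histogram_sketch).
-export([new/2]).

%% Build a sliding time window histogram: a tuple with one counter array per
%% second in the window, each array holding BinCount bins.
new(TimeWindowSec, BinCount) ->
    list_to_tuple([
        counters:new(BinCount, [write_concurrency])
     || _ <- lists:seq(1, TimeWindowSec)
    ]).
```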

To keep the structure immutable we need to set a limit on both the number of
bins and the time window size. To limit the number of bins we need to set
minimum and maximum value limits. Since almost all our histograms record access
times in milliseconds, we pick a range from 10 microseconds up to over one
hour. Histogram bin widths increase exponentially in order to keep reasonable
precision across the whole range of values. This encoding is similar to how
floating point numbers work. Additional details on how this works are described
in the `couch_stats_histogram.erl` module.
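
A rough sketch of the exponential bin mapping; the lower bound and growth
factor here are illustrative only, the actual mapping is in
`couch_stats_histogram.erl`:

```erlang
-module(bin_sketch).
-export([bin_index/1]).

%% Map a positive value (e.g. microseconds) to an exponentially growing bin.
%% With Min = 10 and Base = 1.1, bins keep roughly 10% relative precision
%% while still covering values from 10 microseconds to beyond an hour.
bin_index(Value) when Value > 0 ->
    Min = 10,
    Base = 1.1,
    1 + max(0, trunc(math:log(Value / Min) / math:log(Base))).
```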

To keep the histogram object structure immutable, the time window is used in a
circular fashion. The time parameter to the histogram `update/3` function is the
monotonic clock time, and the histogram time index is computed as `Time rem
TimeWindow`. So, as the monotonic time advances, the histogram time index loops
around. This comes with a minor annoyance: the time window has to be allocated
a bit larger than strictly needed, so that a cleanup process can clear stale
(expired) histogram entries, with some extra buffer to ensure the interval
currently being updated and the interval being cleaned never overlap. This
periodic cleanup is performed in the couch_stats_server process.
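
A sketch of the circular indexing described above; the function shape is made
up for illustration, the real update logic lives in `couch_stats_histogram.erl`:

```erlang
-module(window_sketch).
-export([update/3]).

%% Histogram is a tuple of counter arrays, one per second in the window.
%% Time is the monotonic clock in seconds; the slot index wraps around as
%% time advances, so the structure itself never changes.
update(Histogram, Time, BinIndex) ->
    TimeWindow = tuple_size(Histogram),
    Slot = (Time rem TimeWindow) + 1,
    counters:add(element(Slot, Histogram), BinIndex, 1).
```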

Besides performance, the new histograms have two other improvements over the
Folsom ones:

  - They record every single value. The previous histograms sampled, mostly
    recording just the first 1024 values during each time instant (second).

  - They are mergeable. Multiple histograms can be merged, with corresponding
    bins summed together. This could allow cluster-wide histogram summaries, or
    gathering histograms from individual processes and then combining them at
    the end in a central process; a rough sketch of the merge follows this
    list.
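
A minimal sketch of the merge just mentioned, assuming both histograms share
the same window and bin count; the result is accumulated into plain lists only
to keep the example short, and the module name is made up:

```erlang
-module(merge_sketch).
-export([merge/2]).

%% Merge two histograms of the same shape by summing corresponding bins.
%% Each histogram is a tuple of counter arrays, one per second in the window.
merge(HistA, HistB) when tuple_size(HistA) =:= tuple_size(HistB) ->
    TimeWindow = tuple_size(HistA),
    [sum_bins(element(T, HistA), element(T, HistB)) || T <- lists:seq(1, TimeWindow)].

%% Sum the bins of two counter arrays into a plain list.
sum_bins(CRefA, CRefB) ->
    #{size := Bins} = counters:info(CRefA),
    [counters:get(CRefA, B) + counters:get(CRefB, B) || B <- lists:seq(1, Bins)].
```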

Another performance improvement in this commit is eliminating the need to
periodically flush or scrape stats in the background in both the couch_stats
and couch_prometheus apps. Fetching stats from persistent terms and counters
takes less than 5 milliseconds, and the sliding time window histograms always
return the last 10 seconds of data no matter when the stats are queried. Now
that work is done only when the stats are actually queried.

Since the Folsom library was abstracted away behind a couch_stats API, the rest
of the applications do not need to be updated. They still call
`couch_stats:update_histogram/2`, `couch_stats:increment_counter/1`, etc.
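
For illustration, instrumented code keeps the same shape it always had;
`do_handle/1` and the metric names below are placeholders:

```erlang
%% Typical call sites, unchanged by this commit.
handle_request(Req) ->
    couch_stats:increment_counter([couchdb, httpd, requests]),
    T0 = erlang:monotonic_time(microsecond),
    Resp = do_handle(Req),
    USec = erlang:monotonic_time(microsecond) - T0,
    couch_stats:update_histogram([couchdb, request_time], USec),
    Resp.
```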

Previously couch_stats did not have any tests at all. Folsom and Bear had some
tests, but I don't think we ever ran those test suites. To rectify the
situation, this commit adds tests to cover the functionality. All the newly
added or updated modules should have near or exactly 100% test coverage.

[1] #4650 (comment)
[2] https://www.erlang.org/doc/man/counters.html
[3] https://www.erlang.org/doc/man/persistent_term.html
nickva committed Jul 12, 2023
1 parent 890c26d commit 69b10f0
Showing 27 changed files with 1,577 additions and 410 deletions.
2 changes: 1 addition & 1 deletion Makefile
```diff
@@ -73,7 +73,7 @@ DESTDIR=

 # Rebar options
 apps=
-skip_deps=folsom,meck,mochiweb,triq,proper,snappy,bcrypt,hyper,ibrowse
+skip_deps=meck,mochiweb,triq,proper,snappy,bcrypt,hyper,ibrowse
 suites=
 tests=
```

2 changes: 1 addition & 1 deletion Makefile.win
```diff
@@ -78,7 +78,7 @@ DESTDIR=

 # Rebar options
 apps=
-skip_deps=folsom,meck,mochiweb,triq,proper,snappy,bcrypt,hyper,ibrowse,local
+skip_deps=meck,mochiweb,triq,proper,snappy,bcrypt,hyper,ibrowse,local
 suites=
 tests=
```

2 changes: 1 addition & 1 deletion build-aux/print-committerlist.sh
```diff
@@ -40,7 +40,7 @@ function get_contributors {

 function print_comitter_list {
   # list of external repos that we exclude
-  local EXCLUDE=("bear" "folsom" "goldrush" "ibrowse" "jiffy" "lager" "meck" "mochiweb" "snappy")
+  local EXCLUDE=("goldrush" "ibrowse" "jiffy" "lager" "meck" "mochiweb" "snappy")
   local EXCLUDE=$(printf "\|%s" "${EXCLUDE[@]}")
   local EXCLUDE=${EXCLUDE:2}
   local SUBREPOS=$(ls src/ | grep -v "$EXCLUDE")
```

4 changes: 1 addition & 3 deletions mix.exs
```diff
@@ -146,7 +146,6 @@ defmodule CouchDBTest.Mixfile do
       "unicode_util_compat",
       "b64url",
       "exxhash",
-      "bear",
       "mochiweb",
       "snappy",
       "rebar",
@@ -155,8 +154,7 @@ defmodule CouchDBTest.Mixfile do
       "meck",
       "khash",
       "hyper",
-      "fauxton",
-      "folsom"
+      "fauxton"
     ]

     deps |> Enum.map(fn app -> "src/#{app}" end)
```

1 change: 0 additions & 1 deletion rebar.config.script
```diff
@@ -152,7 +152,6 @@ DepDescs = [
 {fauxton, {url, "https://github.com/apache/couchdb-fauxton"},
     {tag, "v1.2.9"}, [raw]},
 %% Third party deps
-{folsom, "folsom", {tag, "CouchDB-0.8.4"}},
 {hyper, "hyper", {tag, "CouchDB-2.2.0-7"}},
 {ibrowse, "ibrowse", {tag, "CouchDB-4.4.2-5"}},
 {jiffy, "jiffy", {tag, "CouchDB-1.0.9-2"}},
```

4 changes: 0 additions & 4 deletions rel/reltool.config
```diff
@@ -28,7 +28,6 @@
     %% couchdb
     b64url,
     exxhash,
-    bear,
     chttpd,
     config,
     couch,
@@ -46,7 +45,6 @@
     dreyfus,
     ets_lru,
     fabric,
-    folsom,
     global_changes,
     hyper,
     ibrowse,
@@ -92,7 +90,6 @@
     %% couchdb
     {app, b64url, [{incl_cond, include}]},
     {app, exxhash, [{incl_cond, include}]},
-    {app, bear, [{incl_cond, include}]},
    {app, chttpd, [{incl_cond, include}]},
     {app, config, [{incl_cond, include}]},
     {app, couch, [{incl_cond, include}]},
@@ -110,7 +107,6 @@
     {app, dreyfus, [{incl_cond, include}]},
     {app, ets_lru, [{incl_cond, include}]},
     {app, fabric, [{incl_cond, include}]},
-    {app, folsom, [{incl_cond, include}]},
     {app, global_changes, [{incl_cond, include}]},
     {app, hyper, [{incl_cond, include}]},
     {app, ibrowse, [{incl_cond, include}]},
```

13 changes: 2 additions & 11 deletions src/chttpd/src/chttpd_node.erl
```diff
@@ -159,7 +159,6 @@ handle_node_req(#httpd{path_parts = [_, _Node, <<"_config">>, _Section, _Key | _
     chttpd:send_error(Req, not_found);
 % GET /_node/$node/_stats
 handle_node_req(#httpd{method = 'GET', path_parts = [_, Node, <<"_stats">> | Path]} = Req) ->
-    flush(Node, Req),
     Stats0 = call_node(Node, couch_stats, fetch, []),
     Stats = couch_stats_httpd:transform_stats(Stats0),
     Nested = couch_stats_httpd:nest(Stats),
@@ -169,8 +168,8 @@ handle_node_req(#httpd{method = 'GET', path_parts = [_, Node, <<"_stats">> | Pat
 handle_node_req(#httpd{path_parts = [_, _Node, <<"_stats">>]} = Req) ->
     send_method_not_allowed(Req, "GET");
 handle_node_req(#httpd{method = 'GET', path_parts = [_, Node, <<"_prometheus">>]} = Req) ->
-    Metrics = call_node(Node, couch_prometheus_server, scrape, []),
-    Version = call_node(Node, couch_prometheus_server, version, []),
+    Metrics = call_node(Node, couch_prometheus, scrape, []),
+    Version = call_node(Node, couch_prometheus, version, []),
     Type = "text/plain; version=" ++ Version,
     Header = [{<<"Content-Type">>, ?l2b(Type)}],
     chttpd:send_response(Req, 200, Header, Metrics);
@@ -261,14 +260,6 @@ call_node(Node, Mod, Fun, Args) when is_atom(Node) ->
         Else
     end.

-flush(Node, Req) ->
-    case couch_util:get_value("flush", chttpd:qs(Req)) of
-        "true" ->
-            call_node(Node, couch_stats_aggregator, flush, []);
-        _Else ->
-            ok
-    end.
-
 get_stats() ->
     Other =
         erlang:memory(system) -
```

2 changes: 1 addition & 1 deletion src/couch_prometheus/src/couch_prometheus.app.src
```diff
@@ -14,7 +14,7 @@
     {description, "Aggregated metrics info for Prometheus consumption"},
     {vsn, git},
     {registered, []},
-    {applications, [kernel, stdlib, folsom, couch_stats, couch_log, mem3, couch]},
+    {applications, [kernel, stdlib, couch_stats, couch_log, mem3, couch]},
     {mod, {couch_prometheus_app, []}},
     {env, []}
 ]}.
```

src/couch_prometheus/src/couch_prometheus_server.erl → src/couch_prometheus/src/couch_prometheus.erl (renamed)
```diff
@@ -10,9 +10,7 @@
 % License for the specific language governing permissions and limitations under
 % the License.

--module(couch_prometheus_server).
-
--behaviour(gen_server).
+-module(couch_prometheus).

 -import(couch_prometheus_util, [
     couch_to_prom/3,
@@ -26,72 +24,15 @@
     version/0
 ]).

--export([
-    start_link/0,
-    init/1,
-    handle_call/3,
-    handle_cast/2,
-    handle_info/2,
-    code_change/3,
-    terminate/2
-]).
-
 -ifdef(TEST).
 -export([
     get_internal_replication_jobs_stat/0
 ]).
 -endif.

--include("couch_prometheus.hrl").
-
-start_link() ->
-    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).
-
--record(st, {
-    metrics,
-    refresh
-}).
-
-init([]) ->
-    Metrics = refresh_metrics(),
-    RT = update_refresh_timer(),
-    {ok, #st{metrics = Metrics, refresh = RT}}.
+-define(PROMETHEUS_VERSION, "2.0").

-scrape() ->
-    {ok, Metrics} = gen_server:call(?MODULE, scrape),
-    Metrics.
-
-version() ->
-    ?PROMETHEUS_VERSION.
-
-handle_call(scrape, _from, #st{metrics = Metrics} = State) ->
-    {reply, {ok, Metrics}, State};
-handle_call(refresh, _from, #st{refresh = OldRT} = State) ->
-    timer:cancel(OldRT),
-    Metrics = refresh_metrics(),
-    RT = update_refresh_timer(),
-    {reply, ok, State#st{metrics = Metrics, refresh = RT}};
-handle_call(Msg, _From, State) ->
-    {stop, {unknown_call, Msg}, error, State}.
-
-handle_cast(Msg, State) ->
-    {stop, {unknown_cast, Msg}, State}.
-
-handle_info(refresh, #st{refresh = OldRT} = State) ->
-    timer:cancel(OldRT),
-    Metrics = refresh_metrics(),
-    RT = update_refresh_timer(),
-    {noreply, State#st{metrics = Metrics, refresh = RT}};
-handle_info(Msg, State) ->
-    {stop, {unknown_info, Msg}, State}.
-
-terminate(_Reason, _State) ->
-    ok.
-
-code_change(_OldVsn, State, _Extra) ->
-    {ok, State}.
-
 refresh_metrics() ->
     CouchDB = get_couchdb_stats(),
     System = couch_stats_httpd:to_ejson(get_system_stats()),
     couch_prometheus_util:to_bin(
@@ -103,6 +44,9 @@ refresh_metrics() ->
         )
     ).

+version() ->
+    ?PROMETHEUS_VERSION.
+
 get_couchdb_stats() ->
     Stats = lists:sort(couch_stats:fetch()),
     lists:flatmap(
@@ -416,29 +360,3 @@ get_distribution_stats() ->
 get_ets_stats() ->
     NumTabs = length(ets:all()),
     to_prom(erlang_ets_table, gauge, "number of ETS tables", NumTabs).
-
-drain_refresh_messages() ->
-    receive
-        refresh -> drain_refresh_messages()
-    after 0 ->
-        ok
-    end.
-
-update_refresh_timer() ->
-    drain_refresh_messages(),
-    RefreshTime = 1000 * config:get_integer("prometheus", "interval", ?REFRESH_INTERVAL),
-    erlang:send_after(RefreshTime, self(), refresh).
-
--ifdef(TEST).
-
--include_lib("couch/include/couch_eunit.hrl").
-
-drain_refresh_messages_test() ->
-    self() ! refresh,
-    {messages, Mq0} = erlang:process_info(self(), messages),
-    ?assert(lists:member(refresh, Mq0)),
-    drain_refresh_messages(),
-    {messages, Mq1} = erlang:process_info(self(), messages),
-    ?assert(not lists:member(refresh, Mq1)).
-
--endif.
```

15 changes: 0 additions & 15 deletions src/couch_prometheus/src/couch_prometheus.hrl

This file was deleted.

5 changes: 2 additions & 3 deletions src/couch_prometheus/src/couch_prometheus_http.erl
```diff
@@ -19,7 +19,6 @@
     handle_request/1
 ]).

--include("couch_prometheus.hrl").
 -include_lib("couch/include/couch_db.hrl").

 start_link() ->
@@ -63,13 +62,13 @@ handle_request(MochiReq) ->
     end.

 send_prometheus(MochiReq, Node) ->
-    Type = "text/plain; version=" ++ ?PROMETHEUS_VERSION,
+    Type = "text/plain; version=" ++ couch_prometheus:version(),
     Headers =
         couch_httpd:server_header() ++
             [
                 {<<"Content-Type">>, ?l2b(Type)}
             ],
-    Body = call_node(Node, couch_prometheus_server, scrape, []),
+    Body = call_node(Node, couch_prometheus, scrape, []),
     send_resp(MochiReq, 200, Headers, Body).

 send_resp(MochiReq, Status, ExtraHeaders, Body) ->
```

4 changes: 1 addition & 3 deletions src/couch_prometheus/src/couch_prometheus_sup.erl
```diff
@@ -27,9 +27,7 @@ start_link() ->
 init([]) ->
     {ok, {
         {one_for_one, 5, 10},
-        [
-            ?CHILD(couch_prometheus_server, worker)
-        ] ++ maybe_start_prometheus_http()
+        [] ++ maybe_start_prometheus_http()
     }}.

 maybe_start_prometheus_http() ->
```

2 changes: 0 additions & 2 deletions src/couch_prometheus/src/couch_prometheus_util.erl
```diff
@@ -20,8 +20,6 @@
     to_prom_summary/2
 ]).

--include("couch_prometheus.hrl").
-
 couch_to_prom([couch_log, level, alert], Info, _All) ->
     to_prom(couch_log_requests_total, counter, "number of logged messages", {
         [{level, alert}], val(Info)
```

11 changes: 5 additions & 6 deletions src/couch_prometheus/test/eunit/couch_prometheus_e2e_tests.erl
```diff
@@ -88,8 +88,6 @@ setup_prometheus(WithAdditionalPort) ->
     % It's already started by default, so restart to pick up config
     ok = application:stop(couch_prometheus),
     ok = application:start(couch_prometheus),
-    % Flush so that stats aggregator starts using the new, shorter interval
-    couch_stats_aggregator:flush(),
     Ctx.

 t_chttpd_port(Port) ->
@@ -175,17 +173,18 @@ t_starts_with_couchdb(Port) ->
     ).

 t_survives_mem3_sync_termination(_) ->
-    ServerPid = whereis(couch_prometheus_server),
-    ?assertNotEqual(undefined, ServerPid),
     ?assertNotEqual(undefined, whereis(mem3_sync)),
     ok = supervisor:terminate_child(mem3_sup, mem3_sync),
     ?assertEqual(undefined, whereis(mem3_sync)),
     ?assertMatch(
         [[_, _], <<"couchdb_internal_replication_jobs 0">>],
-        couch_prometheus_server:get_internal_replication_jobs_stat()
+        couch_prometheus:get_internal_replication_jobs_stat()
     ),
     {ok, _} = supervisor:restart_child(mem3_sup, mem3_sync),
-    ?assertEqual(ServerPid, whereis(couch_prometheus_server)).
+    ?assertMatch(
+        [[_, _], <<"couchdb_internal_replication_jobs", _/binary>>],
+        couch_prometheus:get_internal_replication_jobs_stat()
+    ).

 node_local_url(Port) ->
     Addr = config:get("chttpd", "bind_address", "127.0.0.1"),
```

21 changes: 8 additions & 13 deletions src/couch_stats/README.md
```diff
@@ -1,18 +1,13 @@
 # couch_stats

-couch_stats is a simple statistics collection app for Erlang applications. Its
-core API is a thin wrapper around a stat storage library (currently Folsom,) but
-abstracting over that library provides several benefits:
+couch_stats is a simple statistics collection app for Erlang applications. It
+uses https://www.erlang.org/doc/man/counters.html to implement counters,
+gauges and histograms. By default histograms record 10 seconds worth of data,
+with a granularity of 1 second.

-* All references to stat storage are in one place, so it's easy to swap
-  the module out.
-
-* Some common patterns, such as tying a process's lifetime to a counter value,
-  are straightforward to support.
-
-* Configuration can be managed in a single place - for example, it's much easier
-  to ensure that all histogram metrics use a 10-second sliding window if those
-  metrics are instantiated/configured centrally.
+Stats can be fetched with `couch_stats:fetch()`. That returns the current
+values of all the counters and gauges as well as the histogram statistics for
+the last 10 seconds.

 ## Adding a metric
@@ -26,4 +21,4 @@

 2. Tell couch_stats to use your description file via application configuration.

-2. Instrument your code with the helper functions in `couch_stats.erl`.
+3. Instrument your code with the helper functions in `couch_stats.erl`.
```

2 changes: 1 addition & 1 deletion src/couch_stats/src/couch_stats.app.src
```diff
@@ -14,7 +14,7 @@
     {description, "Simple statistics collection"},
     {vsn, git},
     {registered, [couch_stats_aggregator, couch_stats_process_tracker]},
-    {applications, [kernel, stdlib, folsom]},
+    {applications, [kernel, stdlib]},
     {mod, {couch_stats_app, []}},
     {env, []}
 ]}.
```