Add specification for BridgeDB's metrics format.
We implemented BridgeDB's metrics in <https://bugs.torproject.org/9316>
but haven't specified its format yet.
NullHypothesis committed Sep 18, 2019
1 parent 317d09b commit 8e49989519ca9f38955cc5f240e5c7cb6fb9a7c7
Showing with 75 additions and 0 deletions.
  1. +75 −0 bridgedb-spec.txt
@@ -389,3 +389,78 @@
bucket name as value to indicate which file bucket a bridge is assigned
to.

9. BridgeDB metrics (version 1)

BridgeDB exports usage metrics once every 24 hours. These metrics
encode approximately how many requests BridgeDB has seen per
distribution mechanism, per pluggable transport, per country code or
email provider, and per success/fail. For example, one of these

cohosh Sep 18, 2019

nitpick: the wording of "per success/fail" doesn't seem quite right. Isn't it more "approximate request successes/failures per..."?

NullHypothesis Sep 18, 2019

Author Owner

Agreed.

metrics lines can tell us that over the last 24 hours, BridgeDB has
seen between 21 and 30 successful requests for obfs4 over moat from
Zimbabwe.
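
The "between 21 and 30" wording above suggests that exact counts are binned before export. This section does not define the binning rule, so the following is only a sketch under the assumption that counts are rounded up to the next multiple of 10. The names BIN_SIZE and bin_count are hypothetical and do not come from BridgeDB.

```python
BIN_SIZE = 10  # assumed bin width; the section above does not state it


def bin_count(exact: int, bin_size: int = BIN_SIZE) -> int:
    """Round a non-negative count up to the next multiple of bin_size."""
    if exact < 0:
        raise ValueError("request counts cannot be negative")
    # Integer ceiling division avoids floating-point rounding.
    return ((exact + bin_size - 1) // bin_size) * bin_size


# Under this assumption, any exact count from 21 to 30 is reported as 30.
print(bin_count(21), bin_count(30))
```

With this rule, a reported count of 30 would only reveal that the exact count was somewhere in the range 21 to 30.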

This section specifies the format of BridgeDB's metrics. Each metrics
file is formatted as follows:

"bridgedb-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
[At start, exactly once.]

YYYY-MM-DD HH:MM:SS defines the end (in UTC) of the included
measurement interval of length NSEC seconds (86400 seconds by
default).

Example:
bridgedb-stats-end 2019-09-18 00:33:44 (86400 s)
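
A consumer could recover the measurement interval from this line roughly as follows. This is a minimal sketch, not BridgeDB code: the function name parse_stats_end and the regular expression are illustrative assumptions, and the keyword is the one used in this version of the text.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches e.g. "bridgedb-stats-end 2019-09-18 00:33:44 (86400 s)".
END_LINE = re.compile(
    r"^bridgedb-stats-end "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((\d+) s\)$"
)


def parse_stats_end(line: str):
    """Return (start, end) of the measurement interval as UTC datetimes."""
    match = END_LINE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError("not a valid bridgedb-stats-end line: %r" % line)
    end = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    end = end.replace(tzinfo=timezone.utc)  # the timestamp is given in UTC
    length = timedelta(seconds=int(match.group(2)))  # NSEC
    return end - length, end


start, end = parse_stats_end("bridgedb-stats-end 2019-09-18 00:33:44 (86400 s)")
print(start.isoformat(), end.isoformat())
```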

"bridgedb-stats-version" VERSION NL
[Exactly once.]

VERSION determines the version of the metrics format. As the
format changes over time, we will increment VERSION. The latest
version is 1 -- the first iteration of the metrics format.

Example:
bridgedb-stats-version 1
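
The occurrence rules for the two header lines ("At start, exactly once" and "Exactly once") can be checked before any metric lines are interpreted. The sketch below is one possible way to do so; check_header and SUPPORTED_VERSIONS are hypothetical names, and rejecting unknown versions is an assumption about how a consumer might behave, not something this section requires.

```python
SUPPORTED_VERSIONS = {1}  # assumption: this consumer only understands version 1


def keyword(line: str) -> str:
    """Return the first space-separated token of a line."""
    return line.split(" ", 1)[0]


def check_header(lines):
    """Validate the header lines of a metrics file and return its version."""
    if not lines or keyword(lines[0]) != "bridgedb-stats-end":
        raise ValueError("file must start with a bridgedb-stats-end line")
    keywords = [keyword(line) for line in lines]
    if keywords.count("bridgedb-stats-end") != 1:
        raise ValueError("bridgedb-stats-end must appear exactly once")
    if keywords.count("bridgedb-stats-version") != 1:
        raise ValueError("bridgedb-stats-version must appear exactly once")
    version_line = lines[keywords.index("bridgedb-stats-version")]
    parts = version_line.split()
    if len(parts) != 2 or not parts[1].isdigit():
        raise ValueError("malformed version line: %r" % version_line)
    version = int(parts[1])
    if version not in SUPPORTED_VERSIONS:
        raise ValueError("unsupported metrics format version %d" % version)
    return version
```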

"bridgedb-metric-count" METRICS_KEY COUNT NL

cohosh Sep 18, 2019

Do we want this to be bridgedb-metric-count or bridgedb-metrics-count?

NullHypothesis Sep 18, 2019

Author Owner

I would like to keep the name bridgedb-metric-count because it refers to a specific metric. However, I'll rename the following:

  • METRICS_KEY to METRIC_KEY
  • bridgedb-stats-end to bridgedb-metrics-end
  • bridgedb-stats-version to bridgedb-metrics-version

[Any number.]

METRICS_KEY determines a metrics key, which consists of several
fields, separated by a period:

DISTRIBUTION "." TRANSPORT "." CC/EMAIL "." ("success" | "fail") "." RESERVED

DISTRIBUTION is BridgeDB's distribution mechanism, which
includes "https", "email", and "moat". These distribution
mechanisms may change in the future.

TRANSPORT refers to a pluggable transport protocol. This
includes "obfs2", "obfs3", "obfs4", "scramblesuit", and "fte".
These pluggable transports will change in the future.

CC/EMAIL refers to a two-letter country code iff DISTRIBUTION is

cohosh Sep 18, 2019

What is this the country code of? We should state that precisely.

If it is clients, do we do binning to make sure we're not logging individual client requests?

NullHypothesis Sep 18, 2019

Author Owner

Yes, it's the country code of clients' IP addresses, and yes, we do binning. I agree that we should be explicit here.

"moat" or "https"; or to an email provider iff DISTRIBUTION is
"email". We use two reserved country codes, "??" and "zz".
"??" denotes that we couldn't map an IP address to its country,
e.g., because our geolocation API was unable to do so. "zz" denotes
a proxy IP address, e.g., that of a Tor exit relay. The two allowed email
providers are "gmail" and "riseup".

The next field is either "success" or "fail", depending on whether
the BridgeDB request was successful or not. A request is
successful if BridgeDB attempts to provide the user with
bridges, even if BridgeDB currently has no bridges available. A
request has failed if BridgeDB won't provide the user with
bridges, for example, if the user could not solve the CAPTCHA.

The field RESERVED is reserved for an anomaly score. It is
currently set to "none" and should be ignored by
implementations.

COUNT is the approximate number of requests for the given
METRICS_KEY.

cohosh Sep 18, 2019

If you do binning (rounding to nearest multiple of something) for these requests, it would be nice to see exactly how in this spec.

NullHypothesis Sep 18, 2019

Author Owner

Agreed.


Examples:
bridgedb-metric-count https.scramblesuit.zz.fail.none 100
bridgedb-metric-count moat.obfs4.??.success.none 3550
bridgedb-metric-count email.fte.gmail.fail.none 10
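
Lines like the examples above can be split into their fields as sketched below. This is an illustration rather than a reference parser: MetricCount and parse_metric_count are hypothetical names, and the code assumes that no individual field contains a period or a space. The RESERVED field is read but ignored, as the text above asks.

```python
from typing import NamedTuple


class MetricCount(NamedTuple):
    distribution: str  # e.g. "https", "email", "moat"
    transport: str     # e.g. "obfs4"
    cc_or_email: str   # country code or email provider
    success: bool      # True for "success", False for "fail"
    count: int         # approximate number of requests


def parse_metric_count(line: str) -> MetricCount:
    """Parse one "bridgedb-metric-count" line into its fields."""
    parts = line.split()
    if len(parts) != 3 or parts[0] != "bridgedb-metric-count":
        raise ValueError("not a bridgedb-metric-count line: %r" % line)
    fields = parts[1].split(".")
    if len(fields) != 5:
        raise ValueError("metrics key must have five fields: %r" % parts[1])
    distribution, transport, cc_or_email, outcome, _reserved = fields
    if outcome not in ("success", "fail"):
        raise ValueError("unexpected outcome field: %r" % outcome)
    # int() raises ValueError itself if COUNT is not a number.
    return MetricCount(distribution, transport, cc_or_email,
                       outcome == "success", int(parts[2]))


print(parse_metric_count("bridgedb-metric-count moat.obfs4.??.success.none 3550"))
```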
