This repository has been archived by the owner on Oct 9, 2018. It is now read-only.

Clarifications on histograms format #123

Open
Yoric opened this issue Oct 21, 2015 · 4 comments

Comments

@Yoric

Yoric commented Oct 21, 2015

I'm currently working on a clean-slate implementation of Telemetry as a self-contained library, for Servo, and I need a few clarifications regarding the format accepted for histograms. See this issue for this specific piece of work.

Fields that make no sense for some histograms

If I read the source code correctly, convert.py sets sum, sum_squares_lo, sum_squares_hi, log_sum, and log_sum_squares to -1 when these fields cannot be found. Does this mean that histograms for which some of these fields have no meaningful value (e.g. enumerated, count, boolean, and flag histograms) can omit them? Will the dashboard, for example, still work?
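For illustration, the defaulting behaviour described above could look roughly like the sketch below. This is not the actual convert.py code: the function name `fill_missing_stats` and the constant `OPTIONAL_STATS` are hypothetical, and the sample histogram simply reuses field names from the payload format.

```python
# Hypothetical reader-side helper; NOT the real convert.py implementation.
OPTIONAL_STATS = (
    "sum",
    "sum_squares_lo",
    "sum_squares_hi",
    "log_sum",
    "log_sum_squares",
)


def fill_missing_stats(histogram, default=-1):
    """Return a copy of `histogram` with absent statistics fields set to `default`."""
    filled = dict(histogram)
    for field in OPTIONAL_STATS:
        # setdefault only writes the field when it is missing.
        filled.setdefault(field, default)
    return filled


# A histogram submitted without any of the statistics fields.
bare = {"values": {"0": 1, "1": 0}, "histogram_type": 4, "bucket_count": 3}
filled = fill_missing_stats(bare)
print(filled["sum"], filled["log_sum"])
```

Fields that are already present are left untouched, so a histogram that does report a meaningful sum keeps it.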

Min/max

For count histograms, if I read the C++ source correctly, the min is hardcoded to 1, the max is hardcoded to 2, and the number of buckets is hardcoded to 3. Can I deduce that all three are ignored?

(I may have other questions later)

@mreid-moz
Contributor

Yes, the sum, sum_squares_lo, sum_squares_hi, log_sum, and log_sum_squares fields can be omitted.

I don't even see the min and max values in the submitted JSON payloads, so I believe these can be ignored.

There is a bucket_count field in the JSON, which I assume is the same as the "number of buckets" you describe, and if so, this field should be included.
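On the writer side, the answer above suggests that a client may drop the five statistics fields while keeping bucket_count. A minimal sketch, assuming a histogram is represented as a plain dict; the helper name `strip_omittable_fields` is hypothetical, and the sketch leaves every other field (including range) as it found it, since the thread does not settle whether those are required.

```python
# Fields the thread says can be omitted from a submitted histogram.
OMITTABLE = frozenset(
    ["sum", "sum_squares_lo", "sum_squares_hi", "log_sum", "log_sum_squares"]
)


def strip_omittable_fields(histogram):
    """Return a copy of `histogram` without the omittable statistics fields."""
    return {k: v for k, v in histogram.items() if k not in OMITTABLE}


full = {
    "sum_squares_hi": 0,
    "values": {"1": 0, "0": 1},
    "histogram_type": 4,
    "bucket_count": 3,
    "sum_squares_lo": 1,
    "range": [1, 2],
    "sum": 1,
}
slim = strip_omittable_fields(full)
print(sorted(slim))
```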

@Yoric
Author

Yoric commented Oct 31, 2015

@mreid-moz Where can I find examples of the format used for keyed histograms?

@mreid-moz
Contributor

Here is an example from an actual submission (with some empty keyed histograms deleted for brevity):

{
  "SEARCH_COUNTS": {
    "google.searchbar": {
      "sum_squares_hi": 0,
      "values": {
        "1": 0,
        "0": 1
      },
      "histogram_type": 4,
      "bucket_count": 3,
      "sum_squares_lo": 1,
      "range": [
        1,
        2
      ],
      "sum": 1
    }
  },
  "ABOUT_ACCOUNTS_CONTENT_SERVER_LOADED_RATE": {},
  "DEVTOOLS_WEBIDE_CONNECTED_RUNTIME_ID": {},
  "DEVTOOLS_HUD_APP_MEMORY_NAVIGATIONLOADED_V2": {},
  "ABOUT_ACCOUNTS_CONTENT_SERVER_FAILURE_TIME_MS": {},
  "DEVTOOLS_HUD_APP_STARTUP_TIME_SCANEND": {},
  "BLOCKED_ON_PLUGIN_INSTANCE_DESTROY_MS": {},
  "DEVTOOLS_HUD_APP_STARTUP_TIME_NAVIGATIONLOADED": {},
  "BLOCKED_ON_PLUGIN_MODULE_INIT_MS": {},
  "ADDON_SHIM_USAGE": {
    "onepassword4@agilebits.com": {
      "sum_squares_hi": 0,
      "values": {
        "5": 0,
        "7": 0,
        "6": 4
      },
      "histogram_type": 1,
      "bucket_count": 16,
      "sum_squares_lo": 144,
      "range": [
        1,
        15
      ],
      "sum": 24
    },
    "{b9db16a4-6edc-47ec-a1f4-b86292ed211d}": {
      "sum_squares_hi": 0,
      "values": {
        "5": 2,
        "4": 0,
        "7": 0,
        "6": 7
      },
      "histogram_type": 1,
      "bucket_count": 16,
      "sum_squares_lo": 302,
      "range": [
        1,
        15
      ],
      "sum": 52
    },
    "{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}": {
      "sum_squares_hi": 0,
      "values": {
        "9": 1,
        "3": 1,
        "2": 0,
        "5": 2,
        "10": 0,
        "6": 1
      },
      "histogram_type": 1,
      "bucket_count": 16,
      "sum_squares_lo": 176,
      "range": [
        1,
        15
      ],
      "sum": 28
    }
  },
  "TELEMETRY_TEST_KEYED_FLAG": {},
  "DEVTOOLS_WEBIDE_CONNECTED_RUNTIME_VERSION": {},
  "TELEMETRY_TEST_KEYED_RELEASE_OPTIN": {}
}
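As a sanity check on the sample, the sum and sum_squares_lo values of the three ADDON_SHIM_USAGE entries can be recomputed from their values maps. The sketch below assumes each key in values is the recorded value itself, which appears to hold here because the range 1 to 15 with 16 buckets gives buckets of width 1. It does not apply to the SEARCH_COUNTS entry, where sum is 1 although the only non-empty bucket is "0". The function name `recompute_stats` is hypothetical.

```python
def recompute_stats(values):
    """Recompute (sum, sum of squares) from a bucket -> count mapping.

    Assumes every sample in a bucket is equal to the bucket label.
    """
    total = sum(int(bucket) * count for bucket, count in values.items())
    total_squares = sum(int(bucket) ** 2 * count for bucket, count in values.items())
    return total, total_squares


# The ADDON_SHIM_USAGE entries from the sample, reduced to the relevant fields.
sample = {
    "onepassword4@agilebits.com": {
        "values": {"5": 0, "7": 0, "6": 4},
        "sum": 24,
        "sum_squares_lo": 144,
    },
    "{b9db16a4-6edc-47ec-a1f4-b86292ed211d}": {
        "values": {"5": 2, "4": 0, "7": 0, "6": 7},
        "sum": 52,
        "sum_squares_lo": 302,
    },
    "{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}": {
        "values": {"9": 1, "3": 1, "2": 0, "5": 2, "10": 0, "6": 1},
        "sum": 28,
        "sum_squares_lo": 176,
    },
}

for key, histogram in sample.items():
    reported = (histogram["sum"], histogram["sum_squares_lo"])
    print(key, recompute_stats(histogram["values"]) == reported)
```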

In general, the keyed histograms format should be:

{
  "HISTOGRAM_1_NAME": {
    "KEY_A": { ... same type of histogram as in payload.histograms ... },
    "KEY_B": { ... same type of histogram as in payload.histograms ... }
  },
  "HISTOGRAM_2_NAME": {
    ...
  },
  "HISTOGRAM_N_NAME": {
    ...
  }
}
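A consumer can walk this two-level structure with a pair of nested loops. The sketch below is only an illustration of the layout: the generator name `iter_keyed_histograms` is hypothetical, and the input is a trimmed copy of the sample submission.

```python
def iter_keyed_histograms(keyed):
    """Yield (histogram_name, key, histogram) for every keyed histogram entry."""
    for name, by_key in keyed.items():
        # Histograms with no recorded keys appear as empty objects and yield nothing.
        for key, histogram in by_key.items():
            yield name, key, histogram


keyed = {
    "SEARCH_COUNTS": {
        "google.searchbar": {"histogram_type": 4, "bucket_count": 3},
    },
    "TELEMETRY_TEST_KEYED_FLAG": {},
    "ADDON_SHIM_USAGE": {
        "onepassword4@agilebits.com": {"histogram_type": 1, "bucket_count": 16},
    },
}

for name, key, histogram in iter_keyed_histograms(keyed):
    print(name, key, histogram["bucket_count"])
```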

@Yoric
Author

Yoric commented Nov 3, 2015

Ok, so we get the histogram_type and bucket_count repeated for each key.

I am starting to see a few places where we can optimize the payload if we ever find time to work on this :)
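To make the repetition concrete, the sketch below reports which fields carry the same value under every key of one keyed histogram. It only measures redundancy in an existing payload and does not propose a new format; the helper name `repeated_fields` is hypothetical.

```python
def repeated_fields(by_key, fields=("histogram_type", "bucket_count")):
    """Return {field: value} for fields whose value is identical under every key."""
    shared = {}
    for field in fields:
        seen = {histogram.get(field) for histogram in by_key.values()}
        # Exactly one distinct value, and the field was present everywhere.
        if len(seen) == 1 and None not in seen:
            shared[field] = seen.pop()
    return shared


# ADDON_SHIM_USAGE from the sample, reduced to the two fields of interest.
addon_shim_usage = {
    "onepassword4@agilebits.com": {"histogram_type": 1, "bucket_count": 16},
    "{b9db16a4-6edc-47ec-a1f4-b86292ed211d}": {"histogram_type": 1, "bucket_count": 16},
    "{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}": {"histogram_type": 1, "bucket_count": 16},
}
print(repeated_fields(addon_shim_usage))
```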
