Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability for clients to pass custom events #12

Merged
merged 10 commits into from Jun 14, 2018

Conversation

Projects
None yet
2 participants
@annthurium
Copy link
Contributor

commented Jun 12, 2018

There are some event types that are common across all client apps. usage events, ping events, and opt in / out events. Telemetry encapsulates the complexity around those so clients don't have to care.

However, apps might want to collect more flexible data. For example, Atom currently collects "file open" events, which preserve the grammar (e.g. language) of the opened file. That way we can answer questions like "what languages are most popular among users of Atom? (More on that in #8 (comment)).

This pull request adds the ability for telemetry to support custom/arbitrary event data that the clients pass in, which is then passed along to Central with the daily metrics dump.

Tilde Ann Thurium added some commits Jun 12, 2018

@@ -50,6 +51,9 @@ export interface IMetrics {
// metrics names are defined by the client and thus aren't knowable
// at compile time here.
measures: object;

// array of custom events that can be defined by the client
customEvents: object[];

This comment has been minimized.

Copy link
@annthurium

annthurium Jun 12, 2018

Author Contributor

note: I'm not 100% certain that Central is going to want the custom events in this format. However, that can easily be changed in the future so I'm not super worried about it.

@annthurium annthurium requested a review from jasonrudolph Jun 12, 2018

Tilde Ann Thurium added some commits Jun 12, 2018

Tilde Ann Thurium
Tilde Ann Thurium
@jasonrudolph
Copy link
Member

left a comment

👋 @annthurium: Thanks for digging into this. ⚡️ I've reviewed the diff and offered some suggestions below. I hope this helps.

README.md Outdated
@@ -43,8 +43,20 @@ store.setOptOut(true);

Please note that there are several methods of the `StatsStore` class that are public for unit testing purposes only. The methods describe above are the ones that clients should care about.

### Measures vs. custom events

There are some event types that are common across all client apps. usage events, ping events, and opt in / out events. `telemetry` encapsulates as much complexity around these as possible so clients don't have to deal with it.

This comment has been minimized.

Copy link
@jasonrudolph

jasonrudolph Jun 13, 2018

Member

There are some event types that are common across all client apps. usage events, ping events, and opt in / out events.

Suggestion: Use a colon, since the first clause is introducing a list of items.

There are some event types that are common across all client apps: usage events, ping events, and opt in / out events.

This comment has been minimized.

Copy link
@annthurium

annthurium Jun 13, 2018

Author Contributor

good suggestion!

README.md Outdated

Measures are a great fit for understanding the number of times a certain action happened. For example, how many times per day do users click a particular button?

However, apps might want to collect more complex metrics with arbitrary metadata. For example, Atom currently collects "file open" events, which preserve the grammar (e.g. language) of the opened file. For those use cases, the `addCustomEvent` function is your friend. `addCustomEvent` takes any object and stuffs it in the database, giving clients the flexibility to define their own data destiny. The events are sent to the metrics back end along with the daily payload.

This comment has been minimized.

Copy link
@jasonrudolph

jasonrudolph Jun 13, 2018

Member

which preserve the grammar (e.g. language)

"e.g." should probably be "i.e." here.

e.g. ➡️ "for example"
i.e. ➡️ "in other words"

In this case, "language" is not an example of a grammar. Instead, we're saying "grammar (in other words, language)".

This comment has been minimized.

Copy link
@annthurium

annthurium Jun 13, 2018

Author Contributor

ah oops, I always mix up i.e. and e.g. Will fix.

@@ -1,28 +1,45 @@
import * as loki from "lokijs";

export default class MeasuresDatabase {
private database: Collection<any>;
private measures: Collection<any>;
private customEvents: Collection<any>;

This comment has been minimized.

Copy link
@jasonrudolph

jasonrudolph Jun 13, 2018

Member

Naming is always hard. 🙈 In this case, I think measures is referring to the batch of simple counts that will get sent to the metrics backend on a daily basis. For example, if an app wants to record the number of commits that the app performs in a given day, and there's no need to record additional metadata (e.g., time of the commit, number of lines added, number of files involved, etc.), number of authors, then the app should use this batch of simple counts to store the commit count. If the app wants to record additional metadata, then the app would record an event instead.

Am I understanding correctly? If so, I wonder if there is alternative terminology that we could use to express this intent. Maybe dailyCounts and events instead of measures and customEvents?

In Atom's case, I anticipate the majority of the metrics will be events with metadata, as opposed to simple aggregate daily counts. Two reasons:

  1. Atom's metrics have this richness today, so it would be nice to keep that richness with the move to telemetry. 🙏
  2. You can never de-aggregate. 😇

This comment has been minimized.

Copy link
@annthurium

annthurium Jun 13, 2018

Author Contributor

measure are themselves sent as part of a usage event. So I kinda wanted to differentiate usage events (which are standard) from custom events (which can be whatever you want.) .

I'll think about what naming best captures that distinction. 🤔

This comment has been minimized.

Copy link
@jasonrudolph

jasonrudolph Jun 13, 2018

Member

Maybe measures could be usageCounts?

This comment has been minimized.

Copy link
@annthurium

annthurium Jun 13, 2018

Author Contributor

ooh, I like usageCounts!

README.md Outdated

```
const event = { type: "open", grammar: "javascript", timestamp: "now" };
await measuresDb.addCustomEvent(event);

This comment has been minimized.

Copy link
@jasonrudolph

jasonrudolph Jun 13, 2018

Member

I think we want these calls to always provide an event type. Optionally, callers would have the freedom to include additional metadata. I also think that we can automatically set the timestamp to the current time. If that makes sense, then I'd propose an API a bit more like the Google Analytics API or the Datadog API

await measuresDb.record('open', { grammar: 'javascript' })

What do you think?

This comment has been minimized.

Copy link
@annthurium

annthurium Jun 13, 2018

Author Contributor

so if I'm understanding you correctly, you are suggesting we enforce that each event has a type? That's do-able and probably a good idea.

This comment has been minimized.

Copy link
@jasonrudolph

jasonrudolph Jun 13, 2018

Member

Yes. I think each "class" of event should have some unique identifier (i.e., something that identifies the thing being measured). We could refer to that unique identifier as the event's type or its name or something else entirely. 😄

Tilde Ann Thurium and others added some commits Jun 13, 2018

Clarify documentation.
Co-Authored-By: Jason Rudolph <jasonrudolph@github.com>
Tilde Ann Thurium
@annthurium

This comment has been minimized.

Copy link
Contributor Author

commented Jun 14, 2018

@jasonrudolph, I believe I have addressed all your feedback and am ready to merge. Thanks again for your review!

@annthurium annthurium merged commit 0d424e8 into master Jun 14, 2018

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@annthurium annthurium deleted the 18-jun-custom-event branch Jun 14, 2018

@jasonrudolph

This comment has been minimized.

Copy link
Member

commented Jun 15, 2018

I believe I have addressed all your feedback and am ready to merge.

@annthurium: Thanks for the ping.

Following up on #12 (comment), I'd still like to recommend that we don't think of "custom events" as "custom." Quite the contrary: I think this should be the default/recommended way of recording events. These events provide richer information (by automatically including the timestamp of the event and supporting arbitrary metadata), and they're consistent with the event reporting approach in commonly-used tools like Google Analytics and Datadog.

To me, the simple usage counts are a more custom concept for a metrics library to provide. Along those lines, if we wanted to do so, I think we could implement the simple usage counts in terms of the more generic events implemented in this PR. For example, for events that should be aggregated locally and reported to the metrics backend on a daily basis:

  • We could "tag" those events with metadata:
    store.increment('some-count-to-be-aggregated', {aggregate: true})
  • Or, perhaps we could let clients specify which events should be aggregated when initializing the client:
    store = new StatsStore("atom", "1.24.1", {aggregatedEvents: [
      'some-count-to-be-aggregated',
      'some-other-count-to-be-aggregated',
    ]})
    
    store.increment('some-count-to-be-aggregated')       // will be aggregated
    store.increment('file-open', {grammar: 'source.js'}) // won't be aggregated
    store.increment('file-rename')                       // won't be aggregated
    store.increment('some-other-count-to-be-aggregated') // will be aggregated
    

Would you consider changing the terminology of "custom events" in the code and in the docs to remove the "custom" phrasing and refer to them as the primary way that clients should record event data?

@annthurium

This comment has been minimized.

Copy link
Contributor Author

commented Jun 15, 2018

I don't think "custom" implies "not the default." I think it implies "flexibility to send whatever data you want." I'm going to run these names past a few other devs and see how they interpret it and then get back to you - that way we have a few more data points about the clarity (or lack thereof) of this terminology. :-) thanks for bringing it up!

@jasonrudolph jasonrudolph referenced this pull request Jun 25, 2018

Merged

integrate telemetry #91

3 of 3 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.