-
-
Notifications
You must be signed in to change notification settings - Fork 598
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update Metrics Platform article instruments to use core interactions schema #4286
Update Metrics Platform article instruments to use core interactions schema #4286
Conversation
hi @sharvaniharan @cooltey (iirc @dbrant is on sabbatical?) -- pending schema changes (which I pushed to the For a nutshell update, the Metrics Platform team went through a re-org over the summer and pivoted our strategy from a monoschema approach to what we're calling a "data contract" approach wherein we are owning/maintaining base core interactions schemas (we will add interactions over time and update them accordingly pending data science approvals) that can be extended and owned by feature/product teams. This PR is to test the latest library release, provide examples of using the base interaction and extended custom schemas, and verify data on the consumer end of the pipeline. Since the first iteration of these article instruments sending events via MP was only ever released to the beta channel, I'm hoping it will be a fairly easy ask to merge this as it will not affect production. Happy to discuss if you have questions. I believe our respective PMs are/will-be in touch to discuss if/when the Android team would like to do further adoption of the Metrics Platform for current/future instrumentation efforts. |
…ruments-to-multischema
…ruments-to-multischema
…ruments-to-multischema
…ruments-to-multischema
Per discussion with @sharvaniharan, sounds like this MR will not be able to be reviewed and merged before the next release which is scheduled for Monday 11/27. It's possible another release could be scheduled before the end of the year (12/18?) in which case these changes could go out then pending review. Otherwise it is likely this MR will not go out until the following release in early January. |
…ruments-to-multischema
actionSubtype: String?, | ||
actionSource: String?, | ||
actionContext: String?, | ||
elementId: String?, | ||
elementFriendlyName: String?, | ||
funnelEntryToken: String?, | ||
funnelEventSequencePosition: Int? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For these nullable parameters, you can give them default values like:
actionSubtype: String? = null
And then In line 139, you can just call return InteractionData(action)
without sending null
for the other parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @cooltey -- i guess i can get rid of getInteractionDataMinimal()
altogether in L139 -- thanks!
…ruments-to-multischema
Hi @cjming, I have another question: how does the library handle exceptions? For example, will the app crash if we send the events that contain some special letters or if the custom schema ID does not match? We are planning to release the changes before the holiday and just want to make sure it is crash-free in the 2.0 library. |
…ruments-to-multischema
hi @cooltey - apologies I missed your comment (my GH notifications were going to an old email 😵💫 - now fixed) - responses inline:
All throughout the lifecycle of an event in the Java client library, if production stream configs don't match, or if event data is invalid, the event will basically not be added to the event queue (which is batch processed at a set interval) or the malformed event data is omitted from the event. You can see in the default event sender that any response from EventGate that is not in the 200s will log an exception.
The even data is validated by type in the event serializer -- if its value type doesn't validate, the data property key is omitted from the event.
The MetricsClient object fetches stream configs from prod (meta api) as a scheduled task so when an event is added to the queue, the stream name corresponding to the event is checked (and its corresponding schema ID which is pulled from prod config) and must be available as a production stream config or else it won't be added to the queue. So while an event can be created by passing in a custom schema id, if that custom schema id doesn't exist in production config, the event will not be submitted. This means that custom schemas must be in production along with their corresponding stream configs. Clear as mud?
Totally understand -- i think the worst that could happen is that we'll get validation errors in EventGate. If it makes everyone more comfortable, I am fine with waiting to ship this when we're all back in the new year. But I do think this is low-risk and based on the numbers I see in Grafana for the current article instruments sending events via MP (using the deprecated monoschema method), I don't anticipate a high volume of errors even in the unlikely situation that most of the events don't pass validation. |
Got it! That's super clear to me! I am going to +1 now and let @sharvaniharan review and merge if it looks good to her as well. Thanks for the PR and the information :) |
…ruments-to-multischema
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some inline comments but we can make a follow-up PR for these minor changes after merging.
action, | ||
null, | ||
"time_spent_ms.$timer.elapsedMillis", | ||
null, null, null, null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you have assigned a default null
values to these parameters, these null
can be ignored now.
getInteractionData(
"article_toolbar_interaction",
action,
null,
"time_spent_ms.$timer.elapsedMillis"
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here "time_spent_ms.${timer.elapsedMillis}"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool - updated in latest commit
null, | ||
null, | ||
null, | ||
null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above:
getInteractionData(
"article_link_preview_interaction",
action,
source.toString(),
"time_spent_ms.$timer.elapsedMillis"
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also another minor change would be to change it to "time_spent_ms.${timer.elapsedMillis}"
to get the ellapsedMillis value, else it will just give a string representation of the timer object.
Also curious, is this the variable where we can compose and add a string with extra data that might need to go with the event?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @cooltey ! removed extraneous nulls!
thanks @sharvaniharan ! updated the timer value
is this the variable where we can compose and add a string with extra data that might need to go with the event?
Yes - InteractionData maps to the MP common fragment which is referenced from the MP base app schema. As you can see, the properties of InteractionData
are mostly strings so data can be appended to any or all of them where appropriate.
@@ -219,10 +227,15 @@ class ArticleLinkPreviewInteraction : TimedMetricsEvent { | |||
private fun submitEvent(action: String) { | |||
submitEvent( | |||
"article_link_preview_interaction", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @cjming
The updates look good to me... just have a couple questions:
In the new way of wiring the events, I see that we are no more committing them to a specific stream. Does this mean we will not have an option to have multiple streams to the same schema? I ask this because, this is the easiest way for @Shay to work with a generic schema for multiple features.
However, I see that the schema name, event name and action and actionSubtype distinction which might help with the same. Just wanted to understand better, the role of streams here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @sharvaniharan - thanks for the Qs - responses inline below:
Does this mean we will not have an option to have multiple streams to the same schema?
Multiple streams can be defined with the same schema in this context -- this will have to be done in config which would need to be in production before events can be submitted/validated/accepted.
So for example the MP base app schema can be defined as the schema for multiple stream names as long as there isn't any need for custom data. Here's an example:
'android.example_stream1' => [
'schema_title' => 'analytics/product_metrics/app/base',
'destination_event_service' => 'eventgate-analytics-external',
'producers' => [
'metrics_platform_client' => [
'provide_values' => [
'page_namespace_name,
'performer_is_temp',
'performer_groups',
],
],
],
],
'android.example_stream2' => [
'schema_title' => 'analytics/product_metrics/app/base',
'destination_event_service' => 'eventgate-analytics-external',
'producers' => [
'metrics_platform_client' => [
'provide_values' => [
'page_id',
'page_namespace_id,
'performer_is_logged_in',
'performer_session_id',
],
],
],
],
In the above examples, multiple streams can have events validated/accepted by the same base schema and even customize the contextual values submitted with their events. There definitely can be a one-to-many relationship between stream names and schema titles.
If custom data is needed however, then there will most likely be a 1-to-1 relationship between stream name and custom schema in config but this isn't a rule (see below).
this is the easiest way for @Shay to work with a generic schema for multiple features
So even in the case of custom schemas that Android might create, you could still have more than one stream defined that references the same custom schema similar to the examples above.
Let me know if this isn't clear of if you have more questions.
@sharvaniharan @cooltey thanks for the reviews -- just a friendly reminder that the schema patch should be merged before this MR - https://gerrit.wikimedia.org/r/c/schemas/event/secondary/+/974674 Per our docs (https://wikitech.wikimedia.org/wiki/Metrics_Platform/Creating_a_Custom_Schema#Namespacing), I placed the custom schemas for the 2 article instruments inside a new And per https://wikitech.wikimedia.org/wiki/Metrics_Platform/Creating_a_Custom_Schema#Ownership, these custom schemas will be owned by Android (the MP fragments they reference are owned by us) - hopefully that makes sense too. |
…ruments-to-multischema
…ruments-to-multischema
…ruments-to-multischema
…ruments-to-multischema
…ruments-to-multischema
The next major version bump of the Metrics Platform Java client library makes use of the core interactions schemas (instead of monoschema) to submit events.
This PR is a proof of concept to send events for validation against the base core interaction schema for apps as well as an example of extending the base interaction schema to capture custom data:
MetricsPlatform
to submit updatedClientData
tailored for appsMetricsEvent
to overloadsubmitEvent()
methodMetricsEvent
to pass in correct parameters for the newMetricsClient::submitInteraction()
methodsArticleEvent
instruments to submitInteractionData
to be validated against the core interaction schemaArticleEvent
instruments to submit custom data to be validated against the extended custom schemas respectively.Note: this PR depends on https://gerrit.wikimedia.org/r/c/schemas/event/secondary/+/974674
Bug: T351292