Add element-level overrides for variables for amp-analytics #1298

Closed
rudygalfi opened this Issue Jan 5, 2016 · 41 comments

9 participants

@rudygalfi
Contributor

Offshoot from #871.

More detail from @avimehta:

The idea here is to enable certain variable values to be specified in the DOM. For example, sometimes, links/buttons can have additional data-* attributes associated with them in the DOM. amp-analytics could read these to fill in some variable templates.

cc @btownsend

@rudygalfi rudygalfi added this to the M? milestone Jan 22, 2016
@rudygalfi rudygalfi modified the milestone: Up Next, Backlog Mar 4, 2016
@rudygalfi
Contributor

@avimehta I think this is one of the next important projects. Could you put together a design? Let me know if you need any input.

@georgecrawford

I'm interested in providing a barrier variable to analytics calls, which would depend on the value of the AUTHDATA(access) field returned by the amp-access callback. barrier would be true if access was not granted, and false if the user is able to read the full content. Would this ticket satisfy that request, perhaps? I could set a variable value for use in amp-analytics, but vary it using amp-access="request" and amp-access="NOT request" markup.

@rudygalfi
Contributor

@georgecrawford Can you detail the intended use of barrier?

@georgecrawford

Nothing special, just a value I can send in my analytics payload to indicate the presence of a paywall barrier on the current page.

@georgecrawford

Any further thoughts on this?

@avimehta avimehta was assigned by dvoytenko Mar 16, 2016
@avimehta
Collaborator

@georgecrawford I don't think this issue will be able to address what you are asking for. Let's continue the discussion on #2476.

@georgecrawford

👍

@cdukes
cdukes commented Apr 17, 2016

Hi, any word on when this will be complete?

I'm working on implementing KISSMetrics tracking using amp-analytics, and this feature would really help with pushing custom event data to KM. (For example, when we send a clicked link event to KM, we encode the title of the linked page in a data-title attribute on the tag.)

@rudygalfi
Contributor

@cdukes We haven't gotten to this yet. We're working on the element-level trigger support (#1297) first.

@avimehta
Collaborator

PRs are obviously welcome. We needed per-event variables to get this done and I sent out a PR for that recently. It should be a straightforward addition to support what this issue is asking for. Unfortunately as @rudygalfi mentioned, #1297 is higher priority for now.

@senthilp
senthilp commented Jun 7, 2016

One suggestion would be to use the on attribute on elements, with values prefixed by track:, which would be substituted into the request URL query string when the event happens. This is similar to how we attach events to components like lightbox, sidebar, etc. For instance:

<a href="#deals" class="deals" on="track:p1233|m222|l111">Daily Deals</a>

In this case if the requests.event URL in the config is https://example.com/analytics?eid=${eventId}&sid=${track}, when the event happens ${track} will be replaced by p1233|m222|l111 and the beacon will be fired.

One thing to note here: the same element can have other events associated with it. For example, it might also open an amp-lightbox component. In that case we would need a syntax to handle multiple events. Is there one already?

The other approach, as mentioned above, would be to use a data-track attribute, whose value would be substituted into the request URL query string when the event happens:

<a href="#deals" class="deals" data-track="p1233|m222|l111">Daily Deals</a>
@avimehta
Collaborator

I personally like the second proposal. The on attribute means something different, and overloading its meaning might not be desirable. For example, on="click..." does something on clicks. track is not an action, though; the action is usually click or tap.

The negative of using the second proposal is that some data might have to be duplicated in on and data-*. I think duplication is okay because clarity wins.

@dvoytenko what do you say?

@dvoytenko
Collaborator

I'll be pretty happy with a track="" attribute - its meaning will be clear and all DOM bubbling/nesting behavior will be obvious. The format itself is tbd.

@rajkumarsrk
Contributor

@dvoytenko In that case, will the "track" attribute be whitelisted in AMP validation? Or can we go with "data-track"?

@dvoytenko
Collaborator

Yes, I'd prefer we whitelist track for validation. But first let's confirm a more complete format so that we can make this decision easier. @avimehta could you please summarize this solution?

@avimehta avimehta was unassigned by rudygalfi Jun 17, 2016
@rudygalfi
Contributor

@senthilp @rajkumarsrk Any interest in putting together a PR for this feature?

@senthilp

Yes that is the plan. We are waiting on @avimehta to finalize the format for the track attribute.

@avimehta
Collaborator
avimehta commented Jun 20, 2016 edited

We could implement this in two ways.

  1. Prescribe only one attribute that can be specified by markup that amp-analytics reads and fills in.
  2. Read all the data- attributes on the clicked element and provide the values for substitution.

A solution like (2) allows one to have markup like this:

<a href="#menu" data-event="click" data-eventLabel="menu" data-eventValue="1234">Link</a>

When the link is clicked, a request of format http://example.com?ev=${event}&el=${eventLabel}&ev=${eventValue} will be expanded to http://example.com?ev=click&el=menu&ev=1234
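
A rough sketch of the expansion described above, not the amp-analytics implementation. The helper name expandTemplate and the choice to leave unknown placeholders untouched are assumptions made for illustration; the template and values are the ones from this comment.

```typescript
// Hypothetical helper: expands ${name} placeholders in a request template
// using values that would be read from the clicked element.
type Vars = { [name: string]: string };

function expandTemplate(template: string, vars: Vars): string {
  return template.replace(/\$\{([^}]+)\}/g, (match: string, key: string) => {
    // Unknown placeholders are left as-is (an assumption for this sketch);
    // known values are URL-encoded before insertion.
    return Object.prototype.hasOwnProperty.call(vars, key)
      ? encodeURIComponent(vars[key])
      : match;
  });
}

// Values as they would be read from data-event, data-eventLabel, data-eventValue.
const clickVars: Vars = { event: "click", eventLabel: "menu", eventValue: "1234" };

const expandedUrl = expandTemplate(
  "http://example.com?ev=${event}&el=${eventLabel}&ev=${eventValue}",
  clickVars
);
// expandedUrl is "http://example.com?ev=click&el=menu&ev=1234"
```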

We could prescribe camelCase for data- attributes and put a limit on the attribute values, but I think both of these are optional.

The disadvantage of (2) is that validation becomes difficult.

@rudygalfi
Contributor

Looping in validator folks to help with design direction from validation point of view: cc @Gregable @powdercloud.

To summarize so far, it sounds like we won't go with

on="track:p1233|m222|l111"

Instead, we'll use one or several attributes as @avimehta outlined. The decision point is whether to allow just one or many.

@avimehta, I'm assuming in your proposal (1) above, that would require something like

<a href="#menu" data-track="event=click,label=menu,value=1234">Link</a>

On the other hand, (2) is as you described:

<a href="#menu" data-event="click" data-eventLabel="menu" data-eventValue="1234">Link</a>

And the question hinges on whether (2) presents validation challenges.

I'm leaning toward (1) but have no issues with (2) if it's technically feasible.

Can we talk about the substitution variable part of the picture?

For (1), I'm assuming we'd expose as something like TRACK or DATA_TRACK.
For (2), it could look like DATA(varName) (inspired by AMP Access' AUTHDATA).

@Gregable
Member

Preface: I'm not certain I follow this discussion completely. The data-* attributes are allowed by the validator universally on all tags already with any value, so if all you want is that the validator allows these, we're done already.

If, on the other hand, you want to validate the specific attribs/values, then we will be removing currently allowed states from validation. Today, any combination of these attribs is allowed; in the future, some combinations will not be. That's not ideal if any of these are already in use, though in general that's unlikely.

@rudygalfi
Contributor

I think your first comment about data-* attributes is what I was wondering and good to know.

@senthilp

Option 1, "Prescribe only one attribute that can be specified by markup that amp-analytics reads and fills in", which @avimehta proposed, looks like a good solution. But it should also support variable substitution. Something like this would also work:

<a href="#menu" data-track="sw=${screenWidth}&sh=${screenHeight}&timezone=${timezone}">Link</a>

Here the variables would be substituted and added to the request payload. Other than variable substitution we will not be doing any other processing on the data-track attribute and it will just be appended to the payload.

Please let us know if this approach is ok and then we will start implementing it.

@rajkumarsrk
Contributor
rajkumarsrk commented Jun 21, 2016 edited

@rudygalfi @avimehta Senthil and I liked option 1:

<a href="#menu" data-track="event=click,label=menu,value=1234">Link</a>

Consider the example below, taken from
https://github.com/ampproject/amphtml/blob/master/extensions/amp-analytics/amp-analytics.md

<amp-analytics>
<script type="application/json">
{
  "requests": {
    "pageview": "https://example.com/analytics?url=${canonicalUrl}&title=${title}&acct=${account}",
    "event": "https://example.com/analytics?eid=${eventId}&elab=${eventLabel}&acct=${account}"
  },
  "vars": {
    "account": "ABC123"
  },
  "triggers": {
    "trackPageview": {
      "on": "visible",
      "request": "pageview"
    },
    "trackAnchorClicks": {
      "on": "click",
      "selector": "a",
      "request": "event",
      "vars": {
        "eventId": "42",
        "eventLabel": "clicked on a link"
      }
    }
  }
}
</script>
</amp-analytics>

Will it be like

<a href=".menu" data-track="event=click,eventLabel=clicked on a link 1">Link 1</a>
<a href=".menu" data-track="event=click,eventLabel=clicked on a link 2">Link 2</a>
<a href=".menu">Link 3</a>

And can the "vars" on "data-track" override the event-level definition? For example, here we have "eventLabel". On a click of the link with text "Link 1", "eventLabel" would be "clicked on a link 1", and on a click of the link with text "Link 3" it would be "clicked on a link", based on the value provided in the "amp-analytics" tag. This way the inline value overrides the event-level definition in "amp-analytics".

@rudygalfi
Contributor

@rajkumarsrk One concern I have with what you're proposing is that

data-track="event=click,label=menu,value=1234"

requires AMP to parse out name/value pairs.

I do, however, like that there's a way to fall back to other variables defined as part of the config.

I propose going with (2) mentioned previously, which is separate data-* attributes:

<a href="#menu" data-event="click" data-eventLabel="menu" data-eventValue="1234">Link</a>

The part of the attribute name after the "data-" part will be interpreted as a locally defined variable name to enable usage like ${event} in a hit request. This would override any other definition of event, e.g. in vars.
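
A minimal sketch of that override order, under stated assumptions: attributes are modeled as a plain name/value map rather than read from a real DOM, the helper names are hypothetical, and the config-level event value is invented only to show it being overridden. The ordering of config vars below trigger vars is also an assumption of this sketch.

```typescript
// Illustrative only: these names are not amp-analytics APIs.
type VarTable = { [name: string]: string };

// Treats every data-* attribute as a variable named after the part
// that follows "data-", as proposed above.
function varsFromDataAttrs(attrs: VarTable): VarTable {
  const found: VarTable = {};
  for (const attr of Object.keys(attrs)) {
    if (attr.slice(0, 5) === "data-") {
      found[attr.slice(5)] = attrs[attr];
    }
  }
  return found;
}

// Later spreads overwrite earlier ones, so element-level values win.
function resolveVars(
  configVars: VarTable,
  triggerVars: VarTable,
  elementAttrs: VarTable
): VarTable {
  return { ...configVars, ...triggerVars, ...varsFromDataAttrs(elementAttrs) };
}

const resolved = resolveVars(
  { account: "ABC123", event: "pageview" }, // "event" here is a made-up default
  { eventLabel: "clicked on a link" },
  { href: "#menu", "data-event": "click", "data-eventLabel": "menu", "data-eventValue": "1234" }
);
// resolved: account "ABC123", event "click", eventLabel "menu", eventValue "1234"
```

Non-data attributes such as href are ignored, and variables that the element does not define keep their config or trigger values.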

@cramforce @avimehta @dvoytenko Does this sound good to you?

@senthilp Does this work? I know you'd proposed an alternative in #1298 (comment), but it seems like the option I described can work just as well.

@avimehta
Collaborator

+1 to Rudy's comment. To recreate the example that @senthilp mentioned, we would do it as follows:

<a href="#menu" data-sw="${screenWidth}" data-sh="${screenHeight}" data-timezone="${timezone}">
  Link
</a>

This ensures that any values provided by the publishers are opaque to AMP, and AMP can just encode and add the values to the URL. With this format, the values can contain commas and whatever else pubs would like.
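
To illustrate the parsing concern with the single-attribute format, here is a naive parser, assuming pairs are split on "," and "=" (the thread never fixed a grammar, so this is only one plausible reading). A value that itself contains a comma is silently truncated.

```typescript
// Hypothetical parser for data-track="name=value,name=value".
function parseTrackPairs(raw: string): { [name: string]: string } {
  const pairs: { [name: string]: string } = {};
  for (const part of raw.split(",")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue; // a fragment without "=" is dropped
    pairs[part.slice(0, eq)] = part.slice(eq + 1);
  }
  return pairs;
}

const simplePairs = parseTrackPairs("event=click,label=menu,value=1234");
// simplePairs: event "click", label "menu", value "1234"

const brokenPairs = parseTrackPairs("event=click,label=Deals, today only");
// brokenPairs.label is "Deals": the text after the comma is lost
```

With one attribute per variable, no delimiter is needed, so this class of problem does not arise.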

@dvoytenko
Collaborator

Given the complexity of values, +1 to option (2) with a per-var attribute. One question I have: should it be a data-varName or a data-track-varName? I'm looking for a way to ensure that tracking data- attributes never overlap unintentionally with other data- attributes.

@rudygalfi
Contributor

I guess a couple things to consider are:

  • possibility of use case collision as @dvoytenko points out (i.e. using data-* for non-analytics purposes and then being required to use the same format to take advantage of this feature)
  • data- requirement to use hyphens and be all lowercase.

Perhaps something so clearly non-standard for the usual data- case would be good, like data-track_varName which results in varName being exposed to amp-analytics with the value of the attribute. varName would be restricted in the same way amp-analytics vars usually are.

@cramforce
Member

Some thoughts:

  • casing should follow html5 rules for converting data attribute names to camelcase. So, it would not be data-track-varName but rather data-track-var-name.
  • data-var-var-name might be a bit more semantic.

QQ: This is only for user event based analytics, right? So, the algorithm is:

  • on event inspect the target
  • walk up the DOM from the target
  • collect data-var- (or whatever the name) variables.
  • closer to the target wins with overlapping values.
  • make the result available to variable substitution

correct?
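
The steps above can be sketched roughly as follows. This is an illustration, not AMP code: SketchNode is a made-up stand-in for a DOM element so the example runs without a browser, and the data-var- prefix is just one of the names under discussion.

```typescript
// Minimal stand-in for a DOM element with a parent pointer.
interface SketchNode {
  attrs: { [attr: string]: string };
  parent: SketchNode | null;
}

const VAR_PREFIX = "data-var-";

// HTML5 dataset-style conversion: "event-label" -> "eventLabel".
function toCamelCase(hyphenated: string): string {
  return hyphenated.replace(/-([a-z])/g, (_m: string, ch: string) => ch.toUpperCase());
}

function collectVars(target: SketchNode): { [name: string]: string } {
  const vars: { [name: string]: string } = {};
  // Walk from the event target up to the root.
  for (let node: SketchNode | null = target; node !== null; node = node.parent) {
    for (const attr of Object.keys(node.attrs)) {
      if (attr.slice(0, VAR_PREFIX.length) !== VAR_PREFIX) continue;
      const varName = toCamelCase(attr.slice(VAR_PREFIX.length));
      // First write wins, so values closer to the target take precedence.
      if (!Object.prototype.hasOwnProperty.call(vars, varName)) {
        vars[varName] = node.attrs[attr];
      }
    }
  }
  return vars;
}

const sectionNode: SketchNode = {
  attrs: { "data-var-section": "deals", "data-var-event-label": "section" },
  parent: null,
};
const linkNode: SketchNode = {
  attrs: { href: "#deals", "data-var-event-label": "menu" },
  parent: sectionNode,
};

const collected = collectVars(linkNode);
// collected: eventLabel "menu" (from the link), section "deals" (from the ancestor)
```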

Agreed that we should not do structured values (name-value pairs) using a custom syntax. If the pure attribute-based approach is insufficient, then AMP typically uses JSON-encoded attribute values.

@dvoytenko
Collaborator

If @cramforce's flow is what we'd like, we should consider the performance of scanning data attributes. In this case, a cheaper approach would be to:

  1. Have a single (JSON encoded) track attribute. This will allow us to optimize DOM navigation using closest.
  2. The track attribute should be lazily parsed and saved on the instance of the element. This would have good savings as well on parsing.
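
A sketch of the two points above, under assumptions: TrackedNode stands in for a DOM element, closestWithTrack approximates what Element.closest('[track]') would do in a browser, and parseCount exists only to show that parsing happens once.

```typescript
interface TrackedNode {
  attrs: { [attr: string]: string };
  parent: TrackedNode | null;
  // Cache slot on the element instance; undefined until first use.
  parsedTrack?: { [name: string]: string };
}

let parseCount = 0;

// Finds the nearest node (starting at the target) that has a track attribute.
function closestWithTrack(start: TrackedNode): TrackedNode | null {
  for (let n: TrackedNode | null = start; n !== null; n = n.parent) {
    if (Object.prototype.hasOwnProperty.call(n.attrs, "track")) return n;
  }
  return null;
}

function getTrackVars(start: TrackedNode): { [name: string]: string } {
  const owner = closestWithTrack(start);
  if (owner === null) return {};
  if (owner.parsedTrack === undefined) {
    // Parse lazily, on first use, and keep the result on the node.
    parseCount++;
    owner.parsedTrack = JSON.parse(owner.attrs["track"]) as { [name: string]: string };
  }
  return owner.parsedTrack;
}

const wrapperNode: TrackedNode = {
  attrs: { track: '{"eventLabel":"menu","eventValue":"1234"}' },
  parent: null,
};
const innerNode: TrackedNode = { attrs: {}, parent: wrapperNode };

const firstRead = getTrackVars(innerNode);
const secondRead = getTrackVars(innerNode);
// parseCount is 1, and secondRead is the same cached object as firstRead.
```
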
@avimehta
Collaborator

@senthilp @rajkumarsrk Do you think traversing up the tree and collecting all the variables along the way is needed for your use case?

I was looking at the instrumentation code and I feel like we shouldn't do any traversal other than what is already done. We should look at only the element whose selector was specified in the config.

So, if the config said "#button", we only look at that one element. If the selector is '*' or something that results in matching multiple elements, the event target should be used. This is both cheap and keeps things simple.

We can use either track="<json-encoded vars object>" or data-var- for this. If json encoded attributes are already used in other places, I am okay with that. If they are not used elsewhere, data-var- is easier to read and understand.

@senthilp

@avimehta We do not need to traverse the tree and our use case doesn't require it. We will only look at the target element on which the event triggered and extract the data attribute out of it. Hope this will reduce the complexity.

@dvoytenko
Collaborator

@avimehta @senthilp While it might be better to use data- attributes when traversal is not needed, I'd still like to see if we can do a single track instead to be able to extend it in the future, since this seems like a natural thing. I'll loop in @cramforce to help make this decision.

@cramforce
Member

One more question: Which events would this apply to? I suppose it is urgent for click tracking, but I'm wondering if it would need to be supported elsewhere.

@avimehta
Collaborator

This would apply to click targets for now. I think it could be applied to selectors in the viewability trigger as well. (We can implement that in the future if a request for it comes along.)

@cramforce
Member

Cool, then let's just implement the data-var- scheme.

@rudygalfi
Contributor

Thanks @cramforce. data-var- SGTM.

@senthilp @rajkumarsrk Do you have all of the decisions you need to put together a PR?

@rajkumarsrk
Contributor

@rudygalfi One clarification needed: can we have all the vars as one group, like data-vars='{"eventId":"42","eventLabel":"clicked on a link"}',
or does it have to be data-var-eventId="42" data-var-eventLabel="clicked on a link"?

@rudygalfi
Contributor

I prefer data-var-eventId="42" data-var-eventLabel="clicked on a link"

@rudygalfi
Contributor

Confirmed with @cramforce and @avimehta off-thread. We will go with data-var-eventId="42" data-var-eventLabel="clicked on a link" as I wrote above.
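
A sketch of how the agreed scheme could behave for the Link 1 / Link 3 example earlier in the thread. It is not the eventual implementation: buildRequest and varsFromTarget are hypothetical names, and attributes are passed in as a plain map. One caveat that is certain: HTML parsers lowercase attribute names, so code reading a real DOM would see data-var-eventid rather than data-var-eventId; the sketch sidesteps this by taking the names as written.

```typescript
type StrMap = { [key: string]: string };

const DATA_VAR_PREFIX = "data-var-";

// Reads data-var-* attributes from the event target only (no DOM traversal).
function varsFromTarget(targetAttrs: StrMap): StrMap {
  const out: StrMap = {};
  for (const attr of Object.keys(targetAttrs)) {
    if (attr.slice(0, DATA_VAR_PREFIX.length) === DATA_VAR_PREFIX) {
      out[attr.slice(DATA_VAR_PREFIX.length)] = targetAttrs[attr];
    }
  }
  return out;
}

// Element-level values override the trigger's vars, then fill the template.
function buildRequest(template: string, triggerVars: StrMap, targetAttrs: StrMap): string {
  const vars: StrMap = { ...triggerVars, ...varsFromTarget(targetAttrs) };
  return template.replace(/\$\{([^}]+)\}/g, (match: string, key: string) => {
    return Object.prototype.hasOwnProperty.call(vars, key)
      ? encodeURIComponent(vars[key])
      : match;
  });
}

const eventTemplate = "https://example.com/analytics?eid=${eventId}&elab=${eventLabel}";
const triggerDefaults: StrMap = { eventId: "42", eventLabel: "clicked on a link" };

const link1Url = buildRequest(eventTemplate, triggerDefaults, {
  href: ".menu",
  "data-var-eventLabel": "clicked on a link 1",
});
const link3Url = buildRequest(eventTemplate, triggerDefaults, { href: ".menu" });
// link1Url ends with "eid=42&elab=clicked%20on%20a%20link%201"
// link3Url ends with "eid=42&elab=clicked%20on%20a%20link"
```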

@rajkumarsrk
Contributor

@rudygalfi Thanks for confirming

@senthilp

I think we have most of the information to begin implementation. We will work on it starting next week.

@rudygalfi rudygalfi modified the milestone: Current, Next Jul 29, 2016
@avimehta avimehta closed this Aug 8, 2016