Add active engaged time heuristics for amp-analytics #1296

Closed
rudygalfi opened this issue Jan 5, 2016 · 8 comments

Comments

@rudygalfi
Contributor

Offshoot from #871.

Some way to provide engaged time data.

More from @breauxc:

How do I get the ball rolling on the implementation of active engaged time? In the last couple of check-ins with Josh from Chartbeat and Andrew from Parse.ly, we agreed to move forward with active engaged time once we had a universal definition; we think we are now there.

To summarize: for each second (or other time interval), we count a user as engaged if the window is in focus and the user has interacted with the page during the last five seconds. Chartbeat uses [pageload, focus, mousedown, mousemove, scroll, keydown, resize] as a minimal set of interaction events. We don’t include mobile-specific events (e.g., touchstart, touchenter) because some of our client sites have had performance issues with the handlers for these events. Instead, we rely on the fact that essentially all mobile browsers fire desktop mouse/scroll events at the end of a touch tap to better accommodate legacy web pages. The only other substantial difference from Parse.ly’s open source implementation (https://github.com/Parsely/time-engaged) is that we will not consider video in the AMP definition.
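
For illustration only, the definition above could be sketched roughly as follows. The class and method names are hypothetical (they are not taken from AMP, Chartbeat, or Parse.ly), timestamps are passed in explicitly so the logic can run without a browser, and treating an interaction exactly five seconds old as still "recent" is an assumption.

```typescript
// Hypothetical sketch; not AMP's, Chartbeat's, or Parse.ly's actual code.
const ENGAGED_WINDOW_MS = 5000; // "interacted ... during the last five seconds"

class EngagedTimeCounter {
  private focused = true;
  private lastActivityMs: number | null = null;
  private engagedSeconds = 0;

  // Call from one shared handler for pageload, focus, mousedown, mousemove,
  // scroll, keydown and resize.
  recordActivity(nowMs: number): void {
    this.lastActivityMs = nowMs;
  }

  // Call when the window gains or loses focus.
  setFocused(focused: boolean): void {
    this.focused = focused;
  }

  // Call once per second. The second is counted only if the window is in
  // focus and the latest interaction is at most five seconds old
  // (inclusive boundary is an assumption).
  tick(nowMs: number): void {
    const recent =
      this.lastActivityMs !== null &&
      nowMs - this.lastActivityMs <= ENGAGED_WINDOW_MS;
    if (this.focused && recent) {
      this.engagedSeconds += 1;
    }
  }

  total(): number {
    return this.engagedSeconds;
  }
}
```

In a page, `recordActivity` and `tick` would be driven by real event listeners and a one-second timer; that wiring is left out here.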

cc @avimehta @btownsend

@philwills

Our current implementation is based on http://upworthy.github.io/2014/06/implementing-attention-minutes-part-1/ as that was the first published example we saw.

There's a bit of variation in the events used, but I think it's close enough to the Parsely definition for that to be comparable.

@rudygalfi
Contributor Author

One thing I think we should consider in the design of this is whether we can enable the calculation of active engaged time through more of a server-based solution. The idea would be to provide all of the relevant events (scrolling, touch start, a count of click events since last ping) on a timer trigger. Then each endpoint could take the data and apply its own standard for what should count as an active user. This means AMP doesn't need to own the logic of calculating something where there may be many varied opinions on how to do it.

cc @cramforce who I think shared the above views

@breauxc
Contributor

breauxc commented Jan 14, 2016

Apologies for the delayed response on this.

From Chartbeat's perspective, pushing the calculation of active engaged time to the server side isn't ideal because handling pings from AMP becomes more complex -- we'd have to handle them in a bespoke manner. However, if this is the way we want to go, we've had previous success for data science purposes using bit vectors that record, for each time interval, whether a user had activity and whether a user was in focus, as a minimal set of information for the calculation.
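
As a rough illustration of the bit-vector idea, a server could derive engaged time from the two vectors along these lines. This is a sketch under assumptions: the vectors are modeled as plain boolean arrays with one entry per interval, the function name is made up, and the rule applied (in focus, with activity in the current or the previous `windowSize - 1` intervals) is one possible reading of the definition discussed earlier in the thread, not Chartbeat's actual pipeline.

```typescript
// One entry per time interval (e.g. one second); true means "yes".
type BitVector = boolean[];

// Hypothetical server-side rule: an interval is engaged when the window was
// in focus and some activity occurred within the last `windowSize` intervals,
// counting the current one.
function engagedIntervals(
  activity: BitVector,
  focus: BitVector,
  windowSize: number
): number {
  const n = Math.min(activity.length, focus.length);
  let lastActive = -Infinity; // index of the most recent active interval
  let engaged = 0;
  for (let i = 0; i < n; i++) {
    if (activity[i]) {
      lastActive = i;
    }
    if (focus[i] && i - lastActive < windowSize) {
      engaged += 1;
    }
  }
  return engaged;
}
```

Because the raw vectors are sent, each endpoint could substitute its own window size or rule without any change on the client.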

@msukmanowsky
Contributor

$0.02 from Parsely - the server side implementation for engaged time would not be ideal and would require non-trivial updates to our processing pipeline to handle the pixels that @rudygalfi or @breauxc are describing.

The benefit of the engaged time implementation Parsely open sourced is minimization of pixel requests on the client side. Currently, every 5 seconds, we send a single pixel request that includes an inc=<unsigned int> parameter which represents the amount of engaged time since the last engaged time ping. Server-side aggregation of these pixels then becomes fairly trivial.
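
To make the `inc` scheme concrete, here is a minimal sketch of both ends. Only the `inc=<unsigned int>` parameter comes from the description above; the function names and the example URLs are hypothetical and do not reflect Parsely's real endpoint or code.

```typescript
// Client side: build the pixel URL for one ping. `incSeconds` is the engaged
// time accumulated since the previous ping.
function buildEngagedPing(baseUrl: string, incSeconds: number): string {
  const inc = Math.max(0, Math.floor(incSeconds)); // keep it an unsigned int
  const sep = baseUrl.indexOf("?") === -1 ? "?" : "&";
  return `${baseUrl}${sep}inc=${inc}`;
}

// Server side: total engaged time is the sum of the inc values received.
function totalEngagedSeconds(pingUrls: string[]): number {
  let total = 0;
  for (const url of pingUrls) {
    const match = /[?&]inc=(\d+)/.exec(url);
    const value = match ? match[1] : undefined;
    if (value !== undefined) {
      total += parseInt(value, 10);
    }
  }
  return total;
}
```

Since each ping carries only an increment, the server needs no per-user state beyond a running sum.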

From a performance perspective, one issue that's been raised with our implementation is that the events we listen for may be too obtrusive for AMP. We're looking to implement changes suggested by @RByers that'll address some of those concerns at least on Android/Chrome.

If AMP pages register only a single listener for the "engaged events" and then follow a similar callback to what we use here, I'd assume that's a more amenable approach than an increased number of HTTP requests to analytics providers.

@cramforce
Member

Just for my understanding: If I'm reading an article I spend most time reading (especially on a tablet with a larger screen) and a little bit of time scrolling. Do you only count the time around the scrolling as actively engaged?

I'm not really invested in making any of these calculations client side or server side, but I'd like a solution that has a little "opinion" about whether something counts as engaged or not.

@msukmanowsky How do you actually add up that inc parameter? Is it the same as what @breauxc writes above "each second (or time interval) we count someone as being engaged if the user has the window in focus and has interacted with the page during the last five seconds"?

@joshschwartz
Contributor

On the client vs server side counting question: I believe that every company that tracks engaged time currently does it on the client side, so I think it's most useful if it's done in the client — otherwise everyone has to make their own significant server-side changes. Definitions are also pretty much uniform, so I think that we're all happy using a single methodology. If we want to allow for server-side implementations, we could also send through bit vectors telling whether an act of engagement occurred at each given second.

@cramforce On your first question, yes. The idea is to have a number that provides a rough lower bound on the amount of time a visitor actually spent reading. Most readers make regular interactions with the device while reading.

@britice

britice commented Feb 5, 2016

I've started a branch to explore addressing this issue. I have a service "skeleton" which will be used by url-replacements.js to provide a replacement for 'TOTAL_ENGAGED_TIME'.
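
As a generic illustration of what such a replacement could look like (this is not the actual url-replacements.js API; the function, the type, and the example URL are all hypothetical), a URL template containing 'TOTAL_ENGAGED_TIME' might be expanded by looking up a provider function for each upper-case token:

```typescript
// Hypothetical stand-in for a URL-replacement step.
type Replacements = { [name: string]: () => string };

function expandUrl(template: string, replacements: Replacements): string {
  // Replace each UPPER_CASE token that has a registered provider; leave
  // unknown tokens untouched.
  return template.replace(/[A-Z][A-Z_]*/g, (name) => {
    const provider = replacements[name];
    return provider ? encodeURIComponent(provider()) : name;
  });
}

// A hard-coded value stands in for the real engaged-time service.
const exampleUrl = expandUrl(
  "https://example.com/ping?t=TOTAL_ENGAGED_TIME",
  { TOTAL_ENGAGED_TIME: () => "42" }
);
```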

In the interest of getting feedback, I'll likely put out a PR for this as soon as I have tests created for just passing a hard coded engaged time value.

@rudygalfi
Contributor Author

#1818 addressed this. Thanks all for the feedback in building this and let us know if you spot any opportunities for improvement!
