Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make page context resilient to quick navigation changes #852

Merged
merged 50 commits into from
Oct 5, 2023

Conversation

silesky
Copy link
Contributor

@silesky silesky commented Apr 28, 2023

Fix stale page context information for buffered events, so page data is resilient to quick navigation changes.

What is this????

Previously, we only captured page information at event flush time, and it was possible for page information like URL / params to be wrong by the time analytics was fully loaded. Now, we capture page information at function call time and store it in the arguments buffer, as if it was any other explicitly passed function argument like traits or properties.

Basically, every time there's a buffered method call (track, identify, etc), we gather the current page information like document.title, location.href, and use it to construct a BufferedPageContext object, that gets added as the last argument of the buffered argument list. Then, when we flush events, that last argument acts a kind of hidden parameter that we pop off the arguments list when the 'real' Analytics method is called -- and we use that stored page information to augment the event with the url, search params, title, etc.

Other Change list summary

  • Move page enrichment step to event factory, rename page enrichment plugin 'EnvironmentEnrichment'.
  • Parse UTM params into context.campaign if consumers override the default search params with an object.
  • No changes to auto-track, but fixed tests and refactored poorly configured, brittle tests with hidden dependencies
  • Fix warnings in UniversalStorage

In order for the snippets to get the same resilliency, they require this update

// using es5 to maximize bw compat
   !function () {
      // add this function - using single key names to save bytes
     function p() {
        var  c = document.querySelector("link[rel='canonical']");
        return {
           __t: 'bpc',  // discriminant key for extra uniqueness guarantee
          c: (c && c.getAttribute('href')) || undefined,
          p: location.pathname,
          u: location.href,
          s: location.search,
          t:  document.title,
          r:  document.referrer,
        }
      }
   
         // full snippet example in `index-buffered-page-ctx.html`
          analytics.factory = function (e) {
            return function () {
            ....
              var t = Array.prototype.slice.call(arguments);
             // add this statement
              if (['track', 'screen', 'alias', 'group', 'page', 'identify'].indexOf(e) > -1) t.push(p());
   

Manual Test

  1. cd examples/standalone-playground && yarn dev
  2. open index-buffered-page-ctx.html (and read the description)

@changeset-bot
Copy link

changeset-bot bot commented Apr 28, 2023

🦋 Changeset detected

Latest commit: d3da1c8

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
Name Type
@segment/analytics-core Patch
@segment/analytics-next Minor
@segment/analytics-node Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@silesky silesky force-pushed the fix-utm-params branch 3 times, most recently from 0e248e8 to b76314b Compare April 29, 2023 22:56
@silesky silesky changed the title Fix utm params Make page context resilient to quick navigation changes Apr 29, 2023
@silesky silesky marked this pull request as ready for review May 1, 2023 22:09
@silesky silesky requested a review from oscb May 1, 2023 22:16
oscb

This comment was marked as outdated.

@silesky silesky changed the title Make page context resilient to quick navigation changes WIP: Make page context resilient to quick navigation changes May 8, 2023
@silesky silesky marked this pull request as draft May 11, 2023 21:13
@silesky silesky force-pushed the fix-utm-params branch 3 times, most recently from 0c985ca to 4329979 Compare September 12, 2023 21:15
@silesky silesky force-pushed the fix-utm-params branch 2 times, most recently from 2295176 to e7e9a4f Compare September 22, 2023 20:38
@@ -169,9 +167,7 @@ async function flushFinalBuffer(
): Promise<void> {
// Call popSnippetWindowBuffer before each flush task since there may be
// analytics calls during async function calls.
buffer.push(...popSnippetWindowBuffer())
Copy link
Contributor Author

@silesky silesky Sep 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI - Refactor: moved this into the PreInitMethodCall class, so hydration happens as part of ".getCalls()" or toArray()

@silesky silesky marked this pull request as ready for review September 22, 2023 20:53
@silesky silesky changed the title WIP: Make page context resilient to quick navigation changes Make page context resilient to quick navigation changes Sep 22, 2023
@silesky silesky force-pushed the fix-utm-params branch 2 times, most recently from 69c925c to 2566557 Compare September 26, 2023 21:17
update size limit

wip
Copy link
Contributor

@danieljackins danieljackins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Test page is awesome 👌

packages/browser/src/core/page/__tests__/index.test.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@oscb oscb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments. Some architectural suggestins, but I don't think they are a blocker. good work

Comment on lines +382 to +383
this.integrations,
pageCtx
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These arguments lists are getting too long. While we cannot do away from the current method signatures for compatibility maybe we should consider making this last argument an extensible object, so that we don't have to add one more thing in the future.

Copy link
Contributor Author

@silesky silesky Oct 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, but I maybe we can save that if we ever add another argument? It seems pretty unlikely that we will, though not totally out of the question. Also feels a little weird to have an object with one optional property. Also, I would imagine any other meta page information would likely be added to pageCtx. Let me know if you feel strongly!

).includes(method)
this.args =
shouldAddPageContext && !hasBufferedPageContextAsLastArg(args)
? [...args, getDefaultBufferedPageContext()]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also be adding this here? Isn't the snippet already doing this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is mostly for the scenarios outside of the snippet, if I'm reading this correctly, would it make more sense then to only do it only on PreInitMethodCallBuffer.push given that class already has most of the logic? Keep the data object simple I mean.

Copy link
Contributor Author

@silesky silesky Oct 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make more sense then to only do it only on PreInitMethodCallBuffer.push given that class already has most of the logic? Keep the data object simple I mean.

That's a good refactor suggestion! done.

packages/browser/src/core/buffer/index.ts Show resolved Hide resolved
packages/browser/src/core/events/__tests__/index.test.ts Outdated Show resolved Hide resolved
packages/browser/src/core/page/get-page-context.ts Outdated Show resolved Hide resolved
Comment on lines 91 to 94
const a = document.createElement('a')
a.href = canonicalUrl
// remove leading slash from canonical URL path
return a.pathname[0] === '/' ? a.pathname : '/' + a.pathname
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious: why create an <a> to get the URL parts? can't new URL work here?

Copy link
Contributor Author

@silesky silesky Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't know the answer -- I've always assumed this was copypasta via Netto from analytics.js, but looking at the old codebase -- it's not. So I'm not sure why it is the way it is. Browser Support wouldn't really explain it, unless it was just taken from some obsolete StackOverflow post.

There are some behaviorial differences:

image

The person who wrote this originally did not write tests do not cover edge cases like assigning a non-URL to canonical URL -- so it's probably safe to change, but I didn't want to add any risk to an already relatively risky PR.

Edit: I went down the rabbit hole and updated it (and added some tests to ensure the weird behavior is the same)

Comment on lines +11 to +32
export const addPageContext = (
event: SegmentEvent,
pageCtx = getDefaultPageContext()
): void => {
const evtCtx = event.context! // Context should be set earlier in the flow
let pageContextFromEventProps: Pick<EventProperties, string> | undefined
if (event.type === 'page') {
pageContextFromEventProps =
event.properties && pick(event.properties, Object.keys(pageCtx))

event.properties = {
...pageCtx,
...event.properties,
...(event.name ? { name: event.name } : {}),
}
}

evtCtx.page = {
...pageCtx,
...pageContextFromEventProps,
...evtCtx.page,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I prefer to keep these kind of functions immutable so that there's no invisible sideeffects

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I more or less copied and pasted this from the plugin, which is why it mutates. You mean have addPageContext return a new event, rather than mutate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the answer is "it does not", but these mutations aren't modifying the properties object the customer passes in right?

Copy link
Contributor Author

@silesky silesky Oct 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These mutations aren't modifying the properties object the customer passes in right?

Correct

}

const query: string = evtCtx.page.search || ''
const search = evtCtx.page!.search || ''
Copy link
Contributor

@oscb oscb Sep 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have an integration test that can guarantee that we don't move this order I think this is fine. If we don't then it might be better to add the extra lines to check it and prevent a crash.

packages/browser/src/core/page/__tests__/index.test.ts Outdated Show resolved Hide resolved

const query: string = evtCtx.page.search || ''
const query =
typeof search === 'object' ? objectToQueryString(search) : search
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what scenario is search and object?

Copy link
Contributor Author

@silesky silesky Oct 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users currently will occasionally override search with a dictionary or something weird that contains parsed search params (e.g they use NextJS where it's on the router object) when they are making a page call.

Were you thinking that we want to support the URLSearchParams interface?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering how we'd support query strings that had multiple values for the same key. (Wasn't sure what was making the object in the first place for it to be a concern) For example ?foo=a&foo=b

Looks like next allows strings or string arrays for the value:

NextApiRequest.query: Partial<{
    [key: string]: string | string[];
}>

But new URLSearchParam() won't work with { 'foo': ['a', 'b'] } - it'd have to look like [['foo', 'a'], ['foo', 'b']]. It does accept Record<string, string>, but treats Record<string, string[]> oddly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably convert the object straight to the array of arrays version easily enough.

Copy link
Contributor Author

@silesky silesky Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also be fine ripping this out, since it is a new feature. Just don't want to block on this.

Copy link
Contributor Author

@silesky silesky Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisradek I updated the query string parsing logic to handle [param: string]: string[] + tests.

It retains the current logic (which is the same as classic, I checked), which is:
'utm_custom=123&utm_custom=456 -> {campaign: {utm_custom: 456}}


/**
* A list of the method calls before initialization for snippet users
* For example, [["track", "foo", {bar: 123}], ["page"], ["on", "ready", function(){..}]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appreciate the example comments!

* Pull any buffered method calls from the window object, and use them to hydrate the instance buffer.
*/
private get calls() {
this._pushSnippetWindowBuffer()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to this improvement

Copy link
Contributor

@chrisradek chrisradek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall think this looks great! Are you going to make a similar update to the snippet repo once this is merged?

@silesky
Copy link
Contributor Author

silesky commented Oct 5, 2023

Overall think this looks great! Are you going to make a similar update to the snippet repo once this is merged?

yep!

@silesky silesky merged commit 897f4cc into master Oct 5, 2023
5 checks passed
@silesky silesky deleted the fix-utm-params branch October 5, 2023 11:16
@github-actions github-actions bot mentioned this pull request Oct 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants