Option to reset pageViewId. #1125

Open
VIKIVIKA opened this issue Oct 25, 2022 · 7 comments
Labels
type:defect Bugs or weaknesses. The issue has to contain steps to reproduce.

Comments

@VIKIVIKA

VIKIVIKA commented Oct 25, 2022

Describe the bug

If we create multiple trackers using newTracker and call window.snowplow("trackPageView") for the first time,

Snowplow generates the same pageViewId for every tracker's page view,

whereas on subsequent calls to window.snowplow("trackPageView")
each tracker gets a different pageViewId.

To Reproduce
Consider the following:

<script>
	;(function(p,l,o,w,i,n,g){if(!p[i]){p.GlobalSnowplowNamespace=p.GlobalSnowplowNamespace||[]; p.GlobalSnowplowNamespace.push(i);p[i]=function(){(p[i].q=p[i].q||[]).push(arguments) };p[i].q=p[i].q||[];n=l.createElement(o);g=l.getElementsByTagName(o)[0];n.async=1; n.src=w;g.parentNode.insertBefore(n,g)}}(window,document,"script","https://cdnjs.cloudflare.com/ajax/libs/snowplow/2.18.2/sp.min.js","snowplow"));
		
	snowplow('newTracker', 'tracker1', 'collector.tracker1.com', initOptions);
	snowplow('newTracker', 'tracker2', 'collector.tracker1.com', initOptions);
</script>

After this first step, if we run snowplow('trackPageView');, tracker1 and tracker2 will have the same pageViewId.

But if we run snowplow('trackPageView'); again after that first call, tracker1 and tracker2 will each have a different pageViewId.

The example above is a use case for a single-page application, where we call trackPageView on page load as well as when a user navigates to a different route without reloading the page.

@VIKIVIKA VIKIVIKA added the type:defect Bugs or weaknesses. The issue has to contain steps to reproduce. label Oct 25, 2022
@igneel64
Contributor

Hello @VIKIVIKA, thanks for reporting this. We will take a look and update you soon :)

In the meantime, a CodeSandbox with the reproduction would be really helpful. Cheers!

@VIKIVIKA
Author

VIKIVIKA commented Oct 26, 2022

@igneel64 thank you for the response. Here is the sandbox with reproducible code:

https://codesandbox.io/s/withered-cache-xjio6q?file=/src/App.js

Also, here are screenshots of the pageViewId behaviour, both on page load and when the method is called manually:

image

You can click the Track PageView button to create page views manually.

image

@VIKIVIKA
Author

@igneel64 even the events we generate other than page views use the page view ID of the most recent tracker,
even when I attach the events to a specific tracker.

tracker 1 page view id:

image

tracker 2 page view id:

image

SD event attached to tracker 1, but it shows tracker 2's page view id:

image

SD event attached to tracker 2, which shows tracker 2's page view id:

image

This way, all events are tracked as part of the last page view ID only, even if we associate them with a specific tracker.

Am I missing something here? Or is there a way to link events to a particular tracker and its respective page view ID?

@igneel64
Contributor

@VIKIVIKA Thank you for the reproduction link!

From a quick look at the sandbox, it seems you are logging the pageViewId at a point where the output of getPageViewId is expected to be different.

From our point of view, this feature is designed so that each page view has the same ID across all trackers, as a kind of shared state, which keeps analysis sane. So when you call trackPageView on a tracker, the pageViewId is updated and shared with subsequent calls from other trackers as well, except when one of them tracks another page view event, which changes the ID again.

In the sandbox, as far as I can see, you log the pageViewId at a different point for each tracker, specifically after a new page view event has been triggered. By changing the code to something like:

      window.trackSnowplowPageView = () => {
        window.snowplow("trackPageView:tracker1");
        window.snowplow(function () {
          console.log("pageViewId1", this.tracker1?.getPageViewId());
          console.log("pageViewId2", this.tracker2?.getPageViewId());
        });
        window.snowplow("trackPageView:tracker2");
        window.snowplow(function () {
          console.log("pageViewId1", this.tracker1?.getPageViewId());
          console.log("pageViewId2", this.tracker2?.getPageViewId());
        });
      };

you can see that the pageViewId remains consistent.

Now based on your initial comment:

> tracker1 and tracker2 will have different pageViewID's respectively.

If there is any pageview event sent, the pageViewId will change but is shared.

> This way all the event's are being tracked as part of only the last page view id, even if we associate with specific tracker.

Yes, events will be tracked using the last page view id.

> or is there any way to link the event's to a particular tracker and with respective page view id

In this case, you would use the tracker name attribute to differentiate between the trackers, but I suppose your use case requires something different.

Would it be possible to give us a hint about what you are trying to achieve in your analysis?

Note:

The initial pageViewIds are the same because, when no tracker has sent a page view yet, all trackers use the initial ID, so that there is a common start.

@VIKIVIKA
Author

@igneel64 We are trying to load two different trackers in a single-page application and implement a set of events and page pings, and we expect the following behaviour.

Example:

window.snowplow("trackPageView:tracker1") should give pageViewId1 from tracker1,

window.snowplow("trackPageView:tracker2") should give pageViewId2 from tracker2.

Now, say I create a self-describing event using tracker1 with window.snowplow("trackSelfDescribingEvent:tracker1"). My expectation is that this event will use the pageViewId generated by tracker1, but because the last page view was generated for tracker2, it will use tracker2's pageViewId.

As we collect page pings and other events based on the pageViewId we receive, we might miss tracker1's page pings and events in this case.

Also, we use different collectors for these trackers, so tracker1 and tracker2 data do not end up in the same place. In this scenario we cannot map page pings by pageViewId in the collector used for tracker1, because they reference tracker2's pageViewIds.

@igneel64
Contributor

Thank you for the detailed description. We will get back to you soon :)
Just to mention that this does not seem to be a technical issue of the pageViewId per se.

@GideonShils

GideonShils commented Oct 6, 2023

Hey @igneel64, bumping this issue as we're seeing it in our production app as we're trying to migrate collectors. We're pretty convinced this is in fact a bug in the tracker code related to how pageViewIds are generated in multi collector setups.

We call trackPageView on each url change in our SPA, including on initial page load. Here's an example illustrating what we're seeing:

Code:

  const trackerOne = newTracker('sp1', ...);
  const trackerTwo = newTracker('sp2', ...);

  const track = () => {
    console.log("pageViewId1:", trackerOne.getPageViewId());
    console.log("pageViewId2:", trackerTwo.getPageViewId());

    // Since we don't specify the tracker here, it should use both
    trackPageView();

    console.log("pageViewId1:", trackerOne.getPageViewId());
    console.log("pageViewId2:", trackerTwo.getPageViewId());
  };

Console output on initial page load:

pageViewId1: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId2: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId1: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId2: 52657899-e540-40ff-993b-f494fe7f7297

Console output on page change:

pageViewId1: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId2: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId1: 9fd9459b-8d58-4f34-86bd-9efed31b5590
pageViewId2: 9fd9459b-8d58-4f34-86bd-9efed31b5590

The above output is correct, and what we would expect to see. On initial page load, we get a single pageViewId that is shared across all of our trackers. The initial call to trackPageView uses this initial page view ID. On page change, we regenerate the pageViewId, which is updated for both trackers.

However, this misses what's actually going on behind the scenes. If you look at the actual pageViewIds that are sent with the events, they do not all align with the above.

Sent page views on initial page load:

trackerOne event: 52657899-e540-40ff-993b-f494fe7f7297
trackerTwo event: 52657899-e540-40ff-993b-f494fe7f7297

Sent page views on page change:

trackerOne event: 6b08f421-d347-4cb2-8a04-5746c0ebb3bd
trackerTwo event: 9fd9459b-8d58-4f34-86bd-9efed31b5590

Notice that on page change, trackerOne is sending a pageViewId that is not represented in the console.log output.

Here's what we think is going on here:

Behind the scenes, for any one call on our end to trackPageView, the JavaScript tracker iterates over each collector and regenerates the pageViewId once for each. First, we track using the first collector and generate a pageViewId, 6b08f421-d347-4cb2-8a04-5746c0ebb3bd; then we track using the second collector and generate a new pageViewId, 9fd9459b-8d58-4f34-86bd-9efed31b5590. By the time we console.log the output at the end, the global pageViewId (which is rightfully shared across all trackers) has been updated to the newer ID, so the print statements obscure what's actually happening.
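The hypothesis above can be reproduced with a small model. This is only a sketch of the behaviour as described in this thread, not the tracker's actual source: it assumes one page view ID shared by all trackers plus a per-tracker "page view already sent" flag, and every name in it (shared, makeTracker, trackAll) is made up for illustration. A counter stands in for UUID generation so the result is easy to follow.

```javascript
// Simplified model of the behaviour described in this thread; NOT the tracker's real code.
// Assumptions: one page view ID shared by all trackers, and a per-tracker flag
// that triggers regeneration once that tracker has already sent a page view.
let counter = 0;
const nextId = () => `id-${++counter}`; // deterministic stand-in for a UUID

const shared = { pageViewId: nextId() }; // state shared by every tracker

function makeTracker(name) {
  let pageViewSent = false; // per-tracker flag
  return {
    name,
    getPageViewId: () => shared.pageViewId,
    trackPageView() {
      if (pageViewSent) shared.pageViewId = nextId(); // regenerated once per tracker
      pageViewSent = true;
      return { tracker: name, pageViewId: shared.pageViewId }; // the "sent" event
    },
  };
}

const trackers = [makeTracker("sp1"), makeTracker("sp2")];
const trackAll = () => trackers.map((t) => t.trackPageView());

const initialLoad = trackAll(); // both events carry id-1
const pageChange = trackAll(); // sp1 sends id-2, sp2 sends id-3

console.log(initialLoad, pageChange);
console.log(trackers.map((t) => t.getPageViewId())); // both trackers now report id-3
```

In this model, logging getPageViewId after the call only ever shows the last generated ID, which matches the mismatch between the console output and the sent events.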

We would expect the browser tracker to have handling to ensure we generate a single pageViewId per external call to trackPageView. If we call trackPageView with multiple collectors, we would expect them to only generate a single new pageViewId, rather than regenerating on a per-collector basis. The page view event is being sent as a single unit, so it would therefore make sense that the event has a single ID.

I'm not very familiar with the internals of this repo, but I think what needs to be added is some kind of additional check that ensures we only call resetPageView for the first tracker in the list of all trackers. A super naive implementation to illustrate what I mean could look something like:

export function trackPageView(event) {
  const firstTracker = trackers[0];
  const remainingTrackers = trackers.slice(1);

  // Set some global state val indicating that this is the first in a set of trackers
  config.isFirstTracker = true;

  firstTracker.trackPageView(event);

  // Reset the global state val
  config.isFirstTracker = false;

  dispatchToTrackers(remainingTrackers, (t) => {
    t.trackPageView(event);
  });
}

export function logPageView(event) {
  // Only the first tracker in the set is allowed to regenerate the shared ID
  if (pageViewSent && config.isFirstTracker) {
    resetPageView();
  }
}
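The same idea can be tried out in a runnable form. The sketch below is an assumption-laden illustration, not a patch for the tracker: instead of a first-tracker flag, it regenerates the shared ID at most once per external call and then fans the page view out to every tracker. All names (shared, makeTracker, sendPageView) are hypothetical, and a counter replaces UUID generation.

```javascript
// Hypothetical sketch; every name here is illustrative, not the tracker's API.
let counter = 0;
const nextId = () => `id-${++counter}`; // deterministic stand-in for a UUID

// The ID and the "already sent" flag both live in shared state.
const shared = { pageViewId: nextId(), pageViewSent: false };

function makeTracker(name) {
  return {
    name,
    // A tracker only reads the shared ID; it never regenerates it itself.
    sendPageView: () => ({ tracker: name, pageViewId: shared.pageViewId }),
  };
}

// Regenerate at most once per external call, then dispatch to every tracker.
function trackPageView(trackers) {
  if (shared.pageViewSent) shared.pageViewId = nextId();
  shared.pageViewSent = true;
  return trackers.map((t) => t.sendPageView());
}

const trackers = [makeTracker("sp1"), makeTracker("sp2")];
const firstCall = trackPageView(trackers); // both events carry id-1
const secondCall = trackPageView(trackers); // both events carry id-2

console.log(firstCall, secondCall);
```

This sketch only covers calls that go to all trackers at once. Page views sent to one named tracker at a time would still regenerate the ID on each call.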

Thanks, and lemme know your thoughts!


Edit:

One more note on why I think this is a bug as opposed to expected behavior:

The page ping events for the trackers fire using the latest global pageViewId. This means that the page view events from all but the last tracker will be orphaned, with no associated page ping events after a page change.
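To make the orphaning concrete, here is a small sketch of how it would show up when joining events on pageViewId. The two page view IDs are the ones quoted earlier in this comment. The page ping rows and the event shape (tracker, type, pageViewId) are invented for the example and simply follow the "latest global ID" behaviour described above.

```javascript
// Illustrative data only: the page pings and the event shape are assumptions.
const ID_ONE = "6b08f421-d347-4cb2-8a04-5746c0ebb3bd"; // sent by trackerOne on page change
const ID_TWO = "9fd9459b-8d58-4f34-86bd-9efed31b5590"; // sent by trackerTwo on page change

const events = [
  { tracker: "trackerOne", type: "page_view", pageViewId: ID_ONE },
  { tracker: "trackerTwo", type: "page_view", pageViewId: ID_TWO },
  // Page pings use the latest shared ID, whichever tracker sends them.
  { tracker: "trackerOne", type: "page_ping", pageViewId: ID_TWO },
  { tracker: "trackerTwo", type: "page_ping", pageViewId: ID_TWO },
];

// Return page views with no page ping from the same tracker carrying the same ID.
function findOrphanedPageViews(allEvents) {
  const pinged = new Set(
    allEvents
      .filter((e) => e.type === "page_ping")
      .map((e) => `${e.tracker}|${e.pageViewId}`)
  );
  return allEvents.filter(
    (e) => e.type === "page_view" && !pinged.has(`${e.tracker}|${e.pageViewId}`)
  );
}

const orphans = findOrphanedPageViews(events);
console.log(orphans.map((e) => e.tracker)); // only trackerOne's page view is orphaned
```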

Jack-Keene pushed a commit that referenced this issue Jun 5, 2024
…e page URL to account for events tracked before page views in SPAs (close #1307 and #1125)

PR #1308 
* Add an option to generate the page view ID according to changes in the page URL to account for events tracked before page views in SPAs (close #1307)

* Generate a new page view ID on the second page view tracked on the same page regardless of whether preservePageViewIdForUrl is enabled

* Handle multiple trackers with shared state such that they share the page view IDs for events tracked after each other
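As a rough reading of the first two bullets in this commit message, the sketch below keeps a page view ID for as long as the URL is unchanged and still issues a fresh ID for a second page view on the same page. It is an interpretation for illustration, not the library's implementation: only preservePageViewIdForUrl is named in the source, and createPageViewState, trackEvent and the URL strings are hypothetical. It models a single tracker and leaves out the multi-tracker bullet.

```javascript
// Rough model of the commit message; an interpretation, not the library's code.
let counter = 0;
const nextId = () => `id-${++counter}`; // deterministic stand-in for a UUID

function createPageViewState() {
  let id = null;
  let idUrl = null; // URL the current ID was generated for
  let pageViewTracked = false; // has a page view already used the current ID?

  // Reuse the ID while the URL is unchanged; regenerate when it changes.
  function idFor(url) {
    if (id === null || url !== idUrl) {
      id = nextId();
      idUrl = url;
      pageViewTracked = false;
    }
    return id;
  }

  return {
    trackEvent: (url) => idFor(url),
    trackPageView(url) {
      idFor(url);
      // A second page view on the same URL still gets a fresh ID.
      if (pageViewTracked) id = nextId();
      pageViewTracked = true;
      return id;
    },
  };
}

const state = createPageViewState();
const early = state.trackEvent("/home"); // id-1: event tracked before the page view
const view = state.trackPageView("/home"); // id-1: the page view reuses that ID
const repeat = state.trackPageView("/home"); // id-2: second page view on the same page
const next = state.trackEvent("/about"); // id-3: the URL changed

console.log({ early, view, repeat, next });
```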
greg-el pushed a commit that referenced this issue Jun 17, 2024
matus-tomlein added a commit that referenced this issue Jun 25, 2024