Problem
Starting in January 2024, the www site has been incorrectly setting the Segment userId to the Segment anonymousId by calling analytics.identify(analytics.user().anonymousId()). Per the Segment documentation, the identify method is not meant to be used in this way, and as a result our analytics engineers can't properly track the acquisition funnel for these users.
Background
We started working to fix this issue in February 2025 by removing the problematic code from the www site. A few straggling bits of in-page JS that aren't tracked in this repo were cleaned up in early March 2025.
Solution
Now that we believe we've removed all of the spots where we were creating bad data, we need to write a script to remove the bad userIds for returning visitors who have already had their userId mis-assigned. Unfortunately, Segment doesn't officially support removing the userId from a user, so we'll have to take a somewhat hacky approach.
Prerequisite: The main site-wide header and footer scripts for the www site are not currently tracked in this repo, so we should first correct that problem. This matters because the script outlined below, while not large, could have a substantial negative impact (i.e., erasing correct userIds) if not written properly, so I feel it's particularly important that it get a proper code review.
Write a script that:
- Checks whether analytics.user().id() is non-null and doesn’t start with user_. Bail out if the userId is blank or looks okay.
- For the remaining "bad" userIds, captures analytics.user().anonymousId().
- Calls analytics.reset() to reset both the userId and the anonymousId.
- Calls analytics.setAnonymousId(oldAnonymousId) to set the anonymous ID back to what it was.
This should result in a Segment user with no userId and the correct anonymousId.
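The steps above could be sketched roughly as follows. This is only a starting point for review, not the final script: the function name clearMisassignedUserId and the VALID_USER_ID_PREFIX constant are hypothetical, and the sketch assumes the Analytics.js client exposes the current IDs through analytics.user().id() and analytics.user().anonymousId(). The client is passed in as an argument so the logic can be exercised against a stub before it runs on the live site.

```javascript
// Hypothetical constant: correct userIds are assumed to start with "user_".
const VALID_USER_ID_PREFIX = "user_";

// Hypothetical helper. Clears a mis-assigned userId while keeping the
// visitor's existing anonymousId. Returns true if a cleanup was performed.
function clearMisassignedUserId(analytics) {
  const user = analytics.user();
  const userId = user.id();

  // Bail out if the userId is blank or looks okay.
  if (!userId || String(userId).startsWith(VALID_USER_ID_PREFIX)) {
    return false;
  }

  // Capture the anonymousId first, because reset() clears it as well.
  const oldAnonymousId = user.anonymousId();

  // Reset both the userId and the anonymousId.
  analytics.reset();

  // Put the anonymousId back to what it was.
  if (oldAnonymousId) {
    analytics.setAnonymousId(oldAnonymousId);
  }
  return true;
}
```

On the site this would presumably be invoked once the library has loaded, for example from inside an analytics.ready() callback with window.analytics as the argument.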
We should confirm that Segment doesn't cache the userId server side and restore it when it sees the anonymousId again. If it does, we could instead reset both IDs by calling analytics.reset() whenever the userId is non-null but doesn’t start with user_, without attempting to restore the anonymousId. Unfortunately, this would mean we lose those users' previous analytics data, so hopefully it's not necessary.