Change percentages should match service UI. #1280

Closed
adamsilverstein opened this issue Mar 24, 2020 · 30 comments
Labels
P1 Medium priority Type: Bug Something isn't working
Milestone

Comments

@adamsilverstein
Collaborator

adamsilverstein commented Mar 24, 2020

Bug Description

The "Change" percent displayed on the Analytics detail page in Site Kit does not match what is shown in the Analytics product UI. The time periods are correct, but the calculations seem off.

A related issue where these numbers may be swapped on the Search Console details page: #870

Steps to reproduce

  1. Go to the Analytics detail page on a site with moderate traffic.
  2. Check reports for 28 days and 90 days.
  3. Compare the percent change shown for Users, Sessions, Bounce Rate and Session Duration against the numbers shown in the Analytics UI.
  4. Note the numbers do not match.

Screenshots

90 day view for my personal site:
image

Users, Sessions, Bounce Rate and Session Duration: 129.7, 128.9, 0 (green), 21.3

same 90 days in Analytics UI:
image

Users, Sessions, Bounce Rate and Session Duration: 133.6, 133.6, 0 (red), 19.5

Additional Context


Do not alter or remove anything below. The following sections will be managed by moderators only.

Acceptance criteria

Implementation Brief

  • In includes/Core/Modules/Module.php:parse_date_range add a new day_align parameter that defaults to false. When true, the previous period's time range will be adjusted to match the closest period with the same days of the week (i.e. starting on the same day of the week).
  • Compare the current period's end (yesterday) day of the week to the calculated end of the previous date range. When they don't match, adjust the date...
$previous_day_of_week = date( 'w', strtotime( $date_end ) );
$yesterday_day_of_week = date( 'w', strtotime( 'yesterday' ) );
if ( $day_align && $previous_day_of_week !== $yesterday_day_of_week ) { ...
  • To adjust the date, offset the period to move it to the closest period with matching days. If the previous period is earlier in the week and less than 4 days away, move the period forward. If the previous period is later in the week and more than 4 days away, move the period forward. Otherwise, move the period backwards. Move the period by the difference in the days of the week between the two periods.
// Adjust the date to closest period that matches the same days of the week.
if (
	// Previous day of the week earlier and less than 4 days away: move forward.
	(
		$previous_day_of_week < $yesterday_day_of_week &&
		$yesterday_day_of_week - $previous_day_of_week < 4
	) ||
	// Previous day of the week later and more than 4 days away: move forward.
	(
		$previous_day_of_week > $yesterday_day_of_week &&
		$previous_day_of_week - $yesterday_day_of_week > 4
	)
) {
	// Move the past date forward to the same day of the week.
	$date_end   = gmdate( 'Y-m-d', strtotime( '+' . absint( $yesterday_day_of_week - $previous_day_of_week ) . ' days', strtotime( $date_end ) ) );
	$date_start = gmdate( 'Y-m-d', strtotime( '+' . absint( $yesterday_day_of_week - $previous_day_of_week ) . ' days', strtotime( $date_start ) ) );
} else {
	// Move the past date backwards to the same day of the week.
	$date_end   = gmdate( 'Y-m-d', strtotime( '-' . absint( $yesterday_day_of_week - $previous_day_of_week ) . ' days', strtotime( $date_end ) ) );
	$date_start = gmdate( 'Y-m-d', strtotime( '-' . absint( $yesterday_day_of_week - $previous_day_of_week ) . ' days', strtotime( $date_start ) ) );
}
}
  • Only Analytics uses this alignment (as far as we know). In includes/Modules/Analytics.php in the 'GET:report' case, pass $day_align as true:
// Add the previous period date range, aligned to the day of the week.
if ( ! empty( $data['multiDateRange'] ) ) {
	$date_ranges[] = $this->parse_date_range( $date_range, 1, 1, true, true );
}

Note this will fix all change percentages except for the users datapoint.
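
A minimal sketch of the alignment described above, written as a signed shift rather than two branches. The helper names nearest_weekday_shift and align_previous_range are hypothetical and not part of Site Kit, and the sketch assumes 'Y-m-d' date strings evaluated in UTC. Folding the weekday difference into the range -3 to +3 also covers the wrap-around case where the two weekdays sit at opposite ends of the week (for example Sunday and Saturday), which the branch conditions above would treat as six days apart.

```php
<?php
/**
 * Hypothetical helper (not part of Site Kit): returns the signed number of
 * days, from -3 to +3, needed to move a date onto the target weekday.
 * Weekdays use the date( 'w' ) convention: 0 (Sunday) through 6 (Saturday).
 */
function nearest_weekday_shift( int $from_weekday, int $target_weekday ): int {
	// Normalise the difference into 0..6, then fold it into -3..+3.
	$diff = ( $target_weekday - $from_weekday + 7 ) % 7;
	return $diff > 3 ? $diff - 7 : $diff;
}

/**
 * Hypothetical helper: shifts a previous-period range so that it ends on the
 * same weekday as $current_end. Returns array( $date_start, $date_end ).
 */
function align_previous_range( string $date_start, string $date_end, string $current_end ): array {
	$utc   = new DateTimeZone( 'UTC' );
	$shift = nearest_weekday_shift(
		(int) ( new DateTimeImmutable( $date_end, $utc ) )->format( 'w' ),
		(int) ( new DateTimeImmutable( $current_end, $utc ) )->format( 'w' )
	);

	// Produces a modifier such as "-1 days" or "+2 days".
	$modifier = sprintf( '%+d days', $shift );

	return array(
		( new DateTimeImmutable( $date_start, $utc ) )->modify( $modifier )->format( 'Y-m-d' ),
		( new DateTimeImmutable( $date_end, $utc ) )->modify( $modifier )->format( 'Y-m-d' ),
	);
}
```

For the 90 day report run on May 15 that is discussed in the comments, align_previous_range( '2019-11-17', '2020-02-14', '2020-05-14' ) moves the previous range back one day, to 2019-11-16 through 2020-02-13.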

QA Brief

  • The values displayed for Analytics Users, Sessions, Bounce rate etc. should match the value in the Analytics frontend for a 90 day period.
    • For the other available date ranges this was already the case prior to this issue.

Changelog entry

  • Fix inconsistency where Analytics numbers displayed for the last 90 days were slightly off from the values in the Analytics frontend.
@adamsilverstein adamsilverstein added the Type: Bug Something isn't working label Mar 24, 2020
@felixarntz felixarntz added the P1 Medium priority label Mar 26, 2020
@eclarke1 eclarke1 added this to the Sprint 21 milestone Apr 9, 2020
@felixarntz
Member

IB ✅

@felixarntz felixarntz removed their assignment Apr 10, 2020
@ryanwelcher ryanwelcher self-assigned this Apr 13, 2020
@ryanwelcher
Contributor

ryanwelcher commented Apr 13, 2020

In my working PR, I have added/updated tests for the functions that ultimately handle the output of the percentage and they are behaving correctly.

As such, I believe that this is a data issue that may be similar to #1202. When I compare the data in Site Kit with the UI, the only percentage that is incorrect is for Users:

1280-compare

The values being passed for the previous month and last month, 310 and 370 respectively, correctly produce a percent change of 19.4%.

However, unless I am mistaken, the value for last month should be 325, not 370, which would also mean that 310 is incorrect as well.
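
For reference, the arithmetic behind the 19.4% figure can be sketched as follows. The percent_change helper is hypothetical and only illustrates the usual relative-change formula; Site Kit's own calculation may differ in detail.

```php
<?php
/**
 * Hypothetical helper: relative change from $previous to $current, in percent.
 * Returns null when $previous is zero, because the change is then undefined.
 */
function percent_change( float $previous, float $current ): ?float {
	if ( 0.0 === $previous ) {
		return null;
	}
	return ( $current - $previous ) / $previous * 100;
}

// (370 - 310) / 310 * 100 is roughly 19.35, which rounds to 19.4.
echo round( percent_change( 310, 370 ), 1 ), PHP_EOL; // 19.4
```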

@felixarntz
Member

@ryanwelcher I think the percentage for "Users" is indeed related to #1202 not covering that part - see https://github.com/google/site-kit-wp/pull/1277/files#diff-e44d069f62420f6d21b8b9f50383e354L68 where the percentage still comes from the old API request. @adamsilverstein Could you open a PR against this issue here that ensures the percentage is calculated from your new ga:users API call as well?

Regarding the other percentages, which are also off though, that must be caused by something else. Maybe the date range we're querying for to compare is slightly off from what Analytics internally queries?

@adamsilverstein
Collaborator Author

@ryanwelcher Thanks for looking into this. As you discovered, the issue here is with the data we are getting rather than the calculation itself.

Do you see the same percentage changes for all fields when you increase the time period to 28 or 90 days? The issue with the queries is more apparent then. I think I was seeing other changes listed incorrectly as well.

In #1202 I discovered that the query we use here, to which we pass the date dimension (to retrieve the daily chart data), aggregates calculations into daily totals.

I expect that we need to adjust the query I added there to use the feature that queries the previous range as well (plus add any other metrics that are off) so we can do the percent calculation on the correct numbers. I will work on this more to see if I can get the correct data.
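
A toy illustration of why daily totals can differ from a total computed over the whole range. The visitor IDs are made up, and the sketch assumes that users are counted once per day when the date dimension is present; it does not call the Analytics API.

```php
<?php
// Made-up visitor IDs per day, purely for illustration.
$visitors_by_day = array(
	'2020-04-01' => array( 'a', 'b', 'c' ),
	'2020-04-02' => array( 'a', 'b' ),
	'2020-04-03' => array( 'a', 'd' ),
);

// Summing per-day user counts counts a returning visitor once for every day they appear.
$sum_of_daily_users = array_sum( array_map( 'count', $visitors_by_day ) );

// Counting distinct visitors over the whole range counts each visitor once.
$unique_users = count( array_unique( array_merge( ...array_values( $visitors_by_day ) ) ) );

echo $sum_of_daily_users, PHP_EOL; // 7
echo $unique_users, PHP_EOL;       // 4
```

A percent change computed from summed daily totals would therefore be based on different numbers than one computed from range-wide totals.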

> Regarding the other percentages, which are also off though, that must be caused by something else. Maybe the date range we're querying for to compare is slightly off from what Analytics internally queries?

I've checked the date ranges repeatedly and am pretty confident we use the correct date spans. I suspect the date dimension may be the source of the issue (because it changes the way analytics calculates these changes).

@ryanwelcher
Contributor

@adamsilverstein @felixarntz I just did a comparison of the ranges in Site Kit locally vs Analytics Dashboard. Taking into account that the percentage for Users is already known as incorrect, the only other range that was incorrect was 90 days.

7-Days
7-days

14-days
14-days

28-days
28-days

90-days
90-days

This seems to point to an issue with the date range. I'll dig into it further on my end.

@adamsilverstein
Collaborator Author

> This seems to point to an issue with the date range. I'll dig into it further on my end.

@ryanwelcher Probably worth checking the dates again, it is possible something is off, although I have looked at them carefully several times.

I suspect you will find the issue has to do with how the numbers are returned from our query due to the date dimension in our query. We probably need to add a similar query that doesn't include this dimension or expand the query I added to fix the user count. Also: in analytics, you should be able to drill down to see what numbers are used to calculate the percent change.

I found the query explorer invaluable in troubleshooting the user count issue: https://ga-dev-tools.appspot.com/query-explorer/ - here you can try the constructed query directly, with and without the date dimension to see how the totals are reported and compare them.

@ryanwelcher
Contributor

I was able to confirm that the dates being requested match the data being returned and displayed. There now seems to be a discrepancy between the Analytics UI Home and Audience overview screens.

For example, the screenshot below shows data from the last 90 days:
site-kite-analytics-overview-90

Compared with the Analytics Home view, the percent changes are incorrect:
analytics-home-90

However, when comparing the same date ranges we request using the Audience Overview page, we can see that the percent changes match Site Kit (after some rounding):
analytics-ui-90-compared

@adamsilverstein
Collaborator Author

Very interesting. I confirmed the same for my site. The user percent change number in Site Kit matches what I see on the detail report, but the Analytics Dashboard shows a different number. I am starting to suspect the number showing in Analytics is incorrect or using some hidden calculation approach. Perhaps reaching out to the Analytics team is appropriate at this point to verify how this number is being calculated.

Site Kit reports 39.6% change:
image

Analytics Audience Overview report shows 39.05%:
image

While the Analytics Dashboard shows 41%:
image

@ryanwelcher were you ever able to replicate a query or calculation that matches the number reported on the Dashboard page?

@felixarntz
Member

This issue is now blocked because further exploration is necessary on how Analytics is determining these numbers. I have merged the existing PR because it included useful test coverage, but I'll move this back to ACs, to be defined once we know what is happening here.

@felixarntz felixarntz removed this from the Sprint 21 milestone Apr 23, 2020
@felixarntz felixarntz removed their assignment Apr 23, 2020
@adamsilverstein
Collaborator Author

A quick note that I have been working to track down this issue by opening a ticket with Analytics support. I will update here when we get more information.

@adamsilverstein
Collaborator Author

One more update: Working on the IB I found that with the date modified I can reproduce every number exactly in my 90 day report except user percent change:

image

Still trying to determine the correct date/query/calculation used for this number, I've tried several variations.

Worth noting the date ranges shown in analytics suggest an aggregation by nth week (week of the year). In this short screencast you can see each data point is a week of data and the first and final weeks are partial weeks.

Screen Recording 2020-05-21 at 11 48 AM

@adamsilverstein
Collaborator Author

I have an update about the data discrepancy we are seeing here:

An apparent bug in the Analytics dashboard causes inaccurate date ranges to be used on the Analytics home page for the 90 day report. In particular, there is a one day gap between the compared date ranges.

For example:

On May 15th I ran the 90 day report and by hovering over the chart I can see the start and end date of the reporting period.

The 90 day period compares:
Feb 15 - May 14 (yesterday -> 90 days ago)
Nov 16 - Feb 13 (92 days ago -> 182 days ago?) - note Feb 14 is missing

When I use these date ranges for the 90 day period, the percent changes match - other than the user datapoint, which I am guessing is due to aggregation, like we saw with the total. Testing that and then I will add a new IB for this issue.
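
A short sketch of the date arithmetic in this example, assuming the previous period is meant to end the day before the current one begins. The consecutive_periods helper is hypothetical and only reproduces the ranges for comparison; it is not how Site Kit or Analytics builds them.

```php
<?php
/**
 * Hypothetical helper: returns the period of $days days ending the day
 * before $today, plus the period of equal length directly before it.
 */
function consecutive_periods( string $today, int $days ): array {
	$utc            = new DateTimeZone( 'UTC' );
	$current_end    = ( new DateTimeImmutable( $today, $utc ) )->modify( '-1 day' );
	$current_start  = $current_end->modify( '-' . ( $days - 1 ) . ' days' );
	$previous_end   = $current_start->modify( '-1 day' );
	$previous_start = $previous_end->modify( '-' . ( $days - 1 ) . ' days' );

	return array(
		'current'  => array( $current_start->format( 'Y-m-d' ), $current_end->format( 'Y-m-d' ) ),
		'previous' => array( $previous_start->format( 'Y-m-d' ), $previous_end->format( 'Y-m-d' ) ),
	);
}

$periods = consecutive_periods( '2020-05-15', 90 );
echo implode( ' to ', $periods['current'] ), PHP_EOL;  // 2020-02-15 to 2020-05-14
echo implode( ' to ', $periods['previous'] ), PHP_EOL; // 2019-11-17 to 2020-02-14
```

The comparison range read from the Analytics chart, Nov 16 to Feb 13, sits one day earlier than this directly preceding range, which is why Feb 14 falls into neither period.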

@adamsilverstein
Collaborator Author

@felixarntz / @ryanwelcher I updated the IB with instructions to fix the change percentages for 90 day reports. This fixes everything except the user change percentage which I am still investigating (awaiting help from support).

Maybe we can act on the existing IB, then re-open the issue to figure out the last data point?

@felixarntz felixarntz self-assigned this Jun 15, 2020
@felixarntz felixarntz removed this from the Sprint 28 milestone Jul 20, 2020
@adamsilverstein
Collaborator Author

I have discovered the underlying issue with our Analytics percent change numbers being off: it turns out the past period used for comparison is always aligned on the same days of the week as the current period.

So if the current period starts on a Monday, the past period also starts on a Monday.

This makes some sense, especially when you look at a day by day comparison, because traffic patterns will vary by day of the week, e.g. weekends might be busier times. By aligning on days of the week, this difference is reduced.

For 7 or 28 day reports, this works out so that the previous period ends exactly before the current period, because 7 and 28 are whole week periods (the number of days is evenly divisible by 7).

When we try 90 days (or 30 in the Analytics UI), the previous period is "offset" to match the start day of the week; depending on the alignment this could mean moving the period forward, or backward by up to 3 days.
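
The relationship between the report length and the offset can be sketched with a little modular arithmetic. This assumes the unaligned previous period ends the day before the current one begins; weekday_shift_for_length is a hypothetical name used only for illustration.

```php
<?php
/**
 * Hypothetical helper: signed shift (in days) that brings a directly
 * preceding period of $days days onto the current period's weekdays.
 */
function weekday_shift_for_length( int $days ): int {
	// The previous period ends $days days before the current one, so its end
	// weekday lags by $days % 7; moving forward by that remainder realigns it.
	$forward = $days % 7;

	// Moving back by 7 - $forward days is equivalent, so prefer the shorter move.
	return $forward > 3 ? $forward - 7 : $forward;
}

foreach ( array( 7, 14, 28, 30, 90 ) as $days ) {
	printf( "%d days: shift %d\n", $days, weekday_shift_for_length( $days ) );
}
```

Under that assumption, 7, 14 and 28 day reports need no shift, a 90 day report shifts back by one day (90 leaves a remainder of 6), and a 30 day report shifts forward by two days.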

I will add an IB for this change as well as the other user data changes you specified.

@felixarntz
Member

@adamsilverstein IB ✅ , just one thing, let's name the parameter $weekday_align as it's a bit more precise.

@adamsilverstein adamsilverstein removed their assignment Jul 24, 2020
@adamsilverstein adamsilverstein removed their assignment Jul 28, 2020
@adamsilverstein adamsilverstein removed their assignment Aug 5, 2020
@aaemnnosttv aaemnnosttv self-assigned this Aug 6, 2020
@adamsilverstein adamsilverstein removed their assignment Aug 11, 2020
@felixarntz felixarntz assigned felixarntz and unassigned felixarntz Aug 11, 2020
@cole10up cole10up self-assigned this Aug 13, 2020
@cole10up

Tested as a part of this ticket: #1681

Passed QA ✅
