Analyst SQL files chapter SEO #103
#standardSQL
SELECT
@rviscomi Testing this query takes 5.2 TB. I tried with the sample data (query below), but it seems pages_desktop_1k contains different URLs than pages_mobile_1k.
What I'm trying to do is compare the mobile response URL with the desktop response URL and flag the site if they differ (e.g. a custom mobile site).
That's right, a website may only exist in one of the tables. The full mobile table actually has 1M more websites than desktop.
I am stuck on this one. To see if mobile is different from desktop, I look at the response and the 'redirectURL' field. If they don't match up, mobile is served a different URL.
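A minimal sketch of that comparison, assuming each page's final (post-redirect) URL has already been extracted per client. The inline rows and the column names `desktop_final_url` and `mobile_final_url` are hypothetical stand-ins for illustration, not HTTP Archive fields:

```sql
#standardSQL
-- Hypothetical sample rows; real data would come from the desktop and mobile crawls.
WITH final_urls AS (
  SELECT
    'https://example.com/' AS url,
    'https://example.com/' AS desktop_final_url,
    'https://m.example.com/' AS mobile_final_url
  UNION ALL
  SELECT
    'https://example.org/',
    'https://www.example.org/',
    'https://www.example.org/'
)
SELECT
  url,
  -- TRUE when mobile ends up on a different URL than desktop.
  desktop_final_url != mobile_final_url AS custom_mobile
FROM
  final_urls
```

Only URLs present in both crawls can be compared this way, so sites that exist in just one of the tables would drop out of the result.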
One possible approach may be to compare the page weights of desktop vs mobile sites having the same URL:
#standardSQL
SELECT
  APPROX_QUANTILES(desktop_bytes - mobile_bytes, 10) AS bytes_diff
FROM
  (SELECT url, bytesTotal AS desktop_bytes FROM `httparchive.summary_pages.2019_07_01_desktop`)
JOIN
  (SELECT url, bytesTotal AS mobile_bytes FROM `httparchive.summary_pages.2019_07_01_mobile`)
USING (url)
Ideally we'll see fewer bytes on mobile.
Another approach could be to detect media query usage:
#standardSQL
SELECT
  client,
  num_urls,
  pct_urls
FROM
  `httparchive.blink_features.usage`
WHERE
  yyyymmdd = '20190701' AND
  feature = 'CSSAtRuleMedia'
The metric is more about answering whether websites serve a 'custom mobile' website. Page weight or media queries don't answer that imho.
Another round. More work than expected, they all have their own quirks.
Is this ready for another review?
Yes.
apologies if these points were already discussed and not necessary :)
sql/2019/10_SEO/10_03.sql
SELECT
  COUNTIF(hasAmpLink(payload)) AS score_sum,
  COUNTIF(hasAmpLink(payload)) / COUNT(0) AS score_percentage
I think we want all the percentage queries as ROUND(numerator * 100 / denominator, 2)
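As a rough illustration of that convention, the sketch below computes a rounded percentage over inline sample values instead of the real `payload` column. The `has_amp` column is a hypothetical stand-in for the result of a check such as `hasAmpLink(payload)`:

```sql
#standardSQL
-- Three hypothetical pages, one of which passes the check.
SELECT
  COUNTIF(has_amp) AS freq,
  COUNT(0) AS total,
  -- Percentage on a 0-100 scale, rounded to two decimals.
  ROUND(COUNTIF(has_amp) * 100 / COUNT(0), 2) AS pct
FROM
  UNNEST([TRUE, FALSE, FALSE]) AS has_amp
```

With these sample values the query returns `freq = 1`, `total = 3`, and `pct = 33.33`, whereas the unrounded fraction form would report roughly 0.3333.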
  COUNTIF(parseStructuredData(payload)) AS occurence,
  ROUND(COUNTIF(parseStructuredData(payload)) * 100 / SUM(COUNT(0)) OVER (), 2) AS occurence_perc
FROM
  `httparchive.pages.2019_07_01_*`
Rick recommended grouping by _TABLE_SUFFIX AS client in all of my queries. I'm not sure if that's something you're considering as well.
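A minimal sketch of that grouping, assuming the wildcard matches one desktop table and one mobile table so that `_TABLE_SUFFIX` resolves to the client name. The `total_pages` alias is only illustrative:

```sql
#standardSQL
-- One result row per table matched by the wildcard.
SELECT
  _TABLE_SUFFIX AS client,
  COUNT(0) AS total_pages
FROM
  `httparchive.pages.2019_07_01_*`
GROUP BY
  client
```

Once results are split by client, any percentage that divides by an overall total would presumably need its denominator computed per client as well, otherwise each share is measured against desktop and mobile combined.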
+1
@ymschaap can you resolve this feedback ASAP?
@ymschaap have you resolved @patrickhulce's feedback? Let me know if this is ready for another review.
Re: #12
Re: #91
See my comment below for latest state of queries: