
SEO 2024 queries #3791

Merged: tunetheweb merged 8 commits into HTTPArchive:main from henryp25:adding-SEO-files on Nov 10, 2024


@henryp25 (Contributor) commented Oct 14, 2024

Makes progress on #3600

This PR adds the finalized SQL files, which now include an is_root_page element that distinguishes the homepage from secondary pages. All SQL files use the June dataset, because that is the dataset the queries were built against.

Context:
These changes finalize the SQL queries for the 2024 SEO analysis. The new is_root_page element separates homepage data from data for other pages, which makes the analysis more accurate. The 2022 SQL queries were also updated slightly to match the new dataset structure, and Common Table Expressions (CTEs) were introduced to improve efficiency and readability.

Changes Made:

  • Introduced an is_root_page element to separate homepage and secondary page data.
  • Updated all queries to use the June dataset, the same dataset they were developed against.
  • Slightly modified the 2022 SQL files for compatibility with the new dataset, and added CTEs to improve efficiency.
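
For readers who have not opened the SQL files, here is a minimal sketch of the pattern described above: a CTE that selects the fields of interest, followed by an aggregate grouped by client and is_root_page. It is not taken from the PR. The `pages` table, the `has_lazy_img` column, and the sample rows are hypothetical, and it runs on an in-memory SQLite database rather than BigQuery, so syntax and schema will differ from the real queries.

```python
import sqlite3

# Hypothetical toy rows standing in for crawl data. The real queries run on
# BigQuery against the HTTP Archive dataset; this schema is an assumption.
ROWS = [
    # (client, page, is_root_page, has_lazy_img)
    ("mobile", "https://a.example/", 1, 1),
    ("mobile", "https://a.example/about", 0, 0),
    ("mobile", "https://b.example/", 1, 0),
    ("mobile", "https://b.example/blog", 0, 1),
    ("desktop", "https://a.example/", 1, 1),
    ("desktop", "https://a.example/about", 0, 1),
]

QUERY = """
WITH page_flags AS (   -- CTE: keep only the fields the aggregate needs
  SELECT client, is_root_page, has_lazy_img
  FROM pages
)
SELECT
  client,
  is_root_page,                                  -- 1 = homepage, 0 = secondary page
  COUNT(*) AS pages,
  SUM(has_lazy_img) AS pages_with_lazy_img,
  1.0 * SUM(has_lazy_img) / COUNT(*) AS pct
FROM page_flags
GROUP BY client, is_root_page
ORDER BY client, is_root_page DESC
"""


def run_breakdown(rows=ROWS):
    """Load the toy rows into SQLite and return the per-group breakdown."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages ("
        "client TEXT, page TEXT, is_root_page INTEGER, has_lazy_img INTEGER)"
    )
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_breakdown():
        print(row)
```

Grouping on is_root_page gives one result row per client for homepages and one for secondary pages, so the two populations are never mixed in a single percentage.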

@tunetheweb changed the title from "Add finalized SQL files with is_root_page element for improved efficiency" to "SEO 2024 queries" on Oct 14, 2024
@tunetheweb added the "analysis" (Querying the dataset) label on Oct 14, 2024
@tunetheweb (Member) left a comment


Overall LGTM with a couple of small comments.

Let me know when good to merge.

Comment thread sql/2024/seo/image-loading-property-usage-2024.sql Outdated
Comment thread sql/2024/seo/robots-text-size-2024.sql Outdated
Comment thread sql/2024/seo/html-response-content-language-2024.sql
Comment thread sql/2024/seo/html-response-vary-header-used-2024.sql
henryp25 and others added 2 commits November 7, 2024 20:13
Co-authored-by: Barry Pollard <barrypollard@google.com>
Co-authored-by: Barry Pollard <barrypollard@google.com>
Comment thread sql/2024/seo/lighthouse-seo-stats-2024.sql Outdated
@tunetheweb tunetheweb merged commit 083de67 into HTTPArchive:main Nov 10, 2024
