bump up domain clustering #241

Merged
merged 1 commit into from
Jun 6, 2023

Conversation

@ohnorobo ohnorobo commented Jun 6, 2023

Cluster earlier by domain. This will make FE queries generally cheaper, since the default view filters by domain.

@ohnorobo ohnorobo requested a review from fortuna June 6, 2023 12:22
@ohnorobo ohnorobo merged commit 57c1c0b into master Jun 6, 2023
1 check passed

ohnorobo commented Jun 6, 2023

Merging so we have everything in place to turn nightly back on.

@@ -38,7 +38,7 @@ CREATE OR REPLACE TABLE `PROJECT_NAME.DERIVED_DATASET.merged_reduced_scans_v2`
PARTITION BY date
# Columns `source` and `country_name` are always used for filtering and must come first.
# `network` and `domain` are useful for filtering and grouping.

It may be worth mentioning that the order matters, with a link to https://cloud.google.com/bigquery/docs/clustered-tables#cluster_column_ordering
Also worth noting that domain needs to come after country for us to benefit from the default domain selection.
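
To illustrate the ordering point, here is a rough sketch of a query that would benefit. The diff hunk above does not show the actual `CLUSTER BY` line, so the column order in the comment is an assumption, and the filter values are hypothetical placeholders. The idea is that BigQuery can prune blocks when a query filters on a leading prefix of the clustering columns, so a default view that filters on source, country, and then domain lines up with that order.

```sql
# Assumed clustering order (NOT shown in the diff hunk above):
#   CLUSTER BY source, country_name, domain, network
#
# The filters below cover the first three clustering columns in order,
# so block pruning can apply to all of them.
# All literal values are hypothetical placeholders.
SELECT
  date,
  network,
  COUNT(*) AS scans
FROM `PROJECT_NAME.DERIVED_DATASET.merged_reduced_scans_v2`
WHERE date = DATE '2023-06-01'          # partition filter
  AND source = 'SOURCE_VALUE'           # 1st clustering column
  AND country_name = 'COUNTRY_VALUE'    # 2nd clustering column
  AND domain = 'example.com'            # 3rd clustering column (assumed position)
GROUP BY date, network;
```

If `domain` sat later in the order, behind a column the default view does not filter on, the domain filter would likely prune far fewer blocks.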
