New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add checks for DAU increased by more than 50% versus previous date. #5548
Add checks for DAU increased by more than 50% versus previous date. #5548
Conversation
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment, otherwise lgtm!
@@ -57,3 +57,33 @@ SELECT | |||
), | |||
NULL | |||
); | |||
|
|||
{% raw -%} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this raw
in the correct place? I'm not familiar with it's use, it it seems different from the other files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case raw
is used to indicate that the text between this and the corresponding end block should not be modified by jinja when generating the output of this template. We have to do this in some cases, for example in this case we generate another jinja template so we want to preserve some of the template features like:
`{{ project_id }}.{{ dataset_id }}.{{ table_name }}`
@lucia-vargas-a you may have not see this change. A change has been added to My question, if we're trying to check for duplication should we use the |
Thanks for the link. This checks are more oriented towards avoiding overcounting when joining the baseline and metrics ping. I think we could use |
Integration report for "Merge branch 'main' into DENG-2989_checks_for_possible_duplicates"
|
) | ||
SELECT | ||
IF( | ||
ABS((SELECT SUM(dau) FROM dau_current) / (SELECT SUM(dau) FROM dau_previous)) > 1.5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why you would need the ABS
here?
But if it's to account for the unlikely case of negative DAU (?) what about securing against dividing by zero?
Add checks for DAU increased by more than 50% vs. previous date which could indicate duplicates.
Address the review comment in PR-5396 related to the join of baseline or main ping and metrics ping in active_users_aggregates.
Checklist for reviewer:
<username>:<branch>
of the fork as parameter. The parameter will also show upin the logs of the
manual-trigger-required-for-fork
CI task together with more detailed instructions.For modifications to schemas in restricted namespaces (see
CODEOWNERS
):┆Issue is synchronized with this Jira Task