Fix cohorts stuck on empty querysets #2441

Merged
timgl merged 2 commits into master from fix-cohort-empty-query
Nov 19, 2020

timgl (Collaborator) commented Nov 19, 2020

Changes

  • Catch the EmptyResultSet error (see Sentry).
  • Run cohort updates continuously, 15 at a time, to avoid overwhelming ClickHouse every hour.
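
A minimal sketch of the first change, assuming the cohort got stuck because the exception escaped before the `is_calculating` flag was reset. The `Cohort` class and `calculate_people` helper are hypothetical stand-ins, not the PostHog code; only `django.core.exceptions.EmptyResultSet` is the real Django exception, with a fallback so the sketch runs without Django installed.

```python
from typing import Callable, List

try:
    # Real Django exception, raised when a query is known to match nothing.
    from django.core.exceptions import EmptyResultSet
except ImportError:
    # Stand-in so the sketch runs without Django installed.
    class EmptyResultSet(Exception):
        pass


class Cohort:
    """Hypothetical stand-in for the cohort model (assumption)."""

    def __init__(self, cohort_id: int) -> None:
        self.id = cohort_id
        self.is_calculating = False
        self.people: List[int] = []


def calculate_people(cohort: Cohort, run_query: Callable[[], List[int]]) -> None:
    """Recalculate a cohort, treating an empty queryset as 'no people'."""
    cohort.is_calculating = True
    try:
        cohort.people = run_query()
    except EmptyResultSet:
        # The filter can never match, so record an empty cohort instead of failing.
        cohort.people = []
    finally:
        # Always clear the flag so the cohort is not left stuck.
        cohort.is_calculating = False
```

With this shape, a query that raises `EmptyResultSet` leaves the cohort empty and eligible for the next run rather than permanently marked as calculating.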

Checklist

  • All querysets/queries filter by Organization, Team, and User (if this PR affects ANY querysets/queries).
  • Django backend tests (if this PR affects the backend).
  • Cypress end-to-end tests (if this PR affects the frontend).

@timgl timgl requested a review from macobo November 19, 2020 10:27
@timgl timgl temporarily deployed to posthog-fix-cohort-empt-bep7rc November 19, 2020 10:29 Inactive
-     Q(is_calculating=False) | Q(last_calculation__lte=timezone.now() - relativedelta(minutes=max_age_minutes))
- ).order_by("id"):
+     Q(is_calculating=False) | Q(last_calculation__lte=timezone.now() - relativedelta(minutes=MAX_AGE_MINUTES))
+ ).order_by("last_calculation")[0:15]:
Contributor

Let's extract const: PARALLEL_COHORTS.
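
One way the suggestion could look, sketched in plain Python rather than the Django ORM so it runs on its own. `PARALLEL_COHORTS` is the name proposed above; `CohortRow`, `select_batch`, and the value of `MAX_AGE_MINUTES` are assumptions for illustration, and the sketch assumes every row already has a `last_calculation` timestamp.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple

PARALLEL_COHORTS = 15  # batch size, previously the literal in [0:15]
MAX_AGE_MINUTES = 15   # assumed value; the real setting is not shown in the diff


class CohortRow(NamedTuple):
    """Hypothetical in-memory stand-in for a cohort record."""
    id: int
    is_calculating: bool
    last_calculation: datetime  # assumed non-null in this sketch


def select_batch(rows: List[CohortRow], now: datetime) -> List[CohortRow]:
    """Mirror the filter in the diff: idle cohorts, or ones whose calculation is stale."""
    cutoff = now - timedelta(minutes=MAX_AGE_MINUTES)
    eligible = [
        r for r in rows
        if not r.is_calculating or r.last_calculation <= cutoff
    ]
    # Least recently calculated first, then cap the batch.
    eligible.sort(key=lambda r: r.last_calculation)
    return eligible[:PARALLEL_COHORTS]
```

Naming the limit keeps the slice and any logging or documentation of the batch size in sync.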

Contributor

Does .order_by put nulls first or last? If last, some cohorts may never be processed.

Collaborator Author

Good question. By default I think yes, but I've made it super explicit now.
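
The concern can be illustrated with a small plain-Python sketch of nulls-first ordering. The final diff is not shown in this thread, so how the ordering was made explicit is an assumption; in Django it is commonly written as `order_by(F("last_calculation").asc(nulls_first=True))`. The helper names below are hypothetical.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def nulls_first_key(last_calculation: Optional[datetime]) -> Tuple[bool, datetime]:
    """Sort key that places None (never calculated) ahead of any timestamp."""
    # False sorts before True, so rows without a timestamp come first.
    return (last_calculation is not None, last_calculation or datetime.min)


def order_for_update(cohorts: List[Tuple[int, Optional[datetime]]]) -> List[int]:
    """Return cohort ids: never-calculated first, then oldest calculation first."""
    ranked = sorted(cohorts, key=lambda c: nulls_first_key(c[1]))
    return [cohort_id for cohort_id, _ in ranked]
```

If nulls sorted last instead, a steady supply of stale cohorts could fill every batch and never-calculated cohorts would wait indefinitely, which is the failure mode raised in the question.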

@timgl timgl temporarily deployed to posthog-fix-cohort-empt-bep7rc November 19, 2020 10:38 Inactive
macobo (Contributor) commented Nov 19, 2020

🥳

@timgl timgl merged commit 5ce7b47 into master Nov 19, 2020
@timgl timgl deleted the fix-cohort-empty-query branch November 19, 2020 10:50