Find out which tests are flaky so that we know which ones to investigate. Decide on a flake-rate cutoff above which a test counts as flaky. We might be able to use Prevent for this: https://sentry.sentry.io/prevent/tests/?integratedOrgName=getsentry&preventPeriod=7d&repository=sentry-python
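One possible way to pick and apply a cutoff, sketched in Python. This is only an illustration under assumptions: the `RunRecord` shape, the function names, and the 5% default cutoff are all hypothetical and do not reflect Prevent's data model or API. It also assumes a test counts as flaky on a commit when it both passed and failed on that same commit.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class RunRecord(NamedTuple):
    """Hypothetical record of one test execution (not a Prevent type)."""
    test_name: str
    commit: str
    passed: bool


def flake_rates(runs: Iterable[RunRecord]) -> Dict[str, float]:
    """Return, per test, the fraction of commits with mixed pass/fail results."""
    # Collect the distinct outcomes seen for each (test, commit) pair.
    outcomes: Dict[Tuple[str, str], Set[bool]] = defaultdict(set)
    for run in runs:
        outcomes[(run.test_name, run.commit)].add(run.passed)

    commits_seen: Dict[str, int] = defaultdict(int)
    commits_flaky: Dict[str, int] = defaultdict(int)
    for (test_name, _commit), results in outcomes.items():
        commits_seen[test_name] += 1
        # Assumption: both a pass and a fail on the same commit means flaky.
        if results == {True, False}:
            commits_flaky[test_name] += 1

    return {
        name: commits_flaky[name] / commits_seen[name]
        for name in commits_seen
    }


def flaky_tests(runs: Iterable[RunRecord], cutoff: float = 0.05) -> List[str]:
    """List tests whose flake rate is at or above the cutoff.

    The 0.05 default is a placeholder, not a recommended value.
    """
    rates = flake_rates(runs)
    return sorted(name for name, rate in rates.items() if rate >= cutoff)
```

A test that fails consistently on a commit is treated here as broken rather than flaky, so it does not raise the flake rate. Trying a few cutoff values against a week of data and checking how many tests each one flags may help settle on a workable number.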