Functional tests that create/update/delete resources are flaky #1033

Closed
nejch opened this issue Feb 25, 2020 · 3 comments · Fixed by #1205
Comments

nejch (Member) commented Feb 25, 2020

I realized the weird unrelated failure in #1020 (see https://travis-ci.org/python-gitlab/python-gitlab/jobs/652064210) was a flaky test for rate limits. I'm getting it locally sometimes, but only rarely.

I haven't had the time to investigate, but I'll try to have a look at some point. Maybe something in the environment slows down the requests sometimes so they don't trigger the 429, or something like that.

max-wittig (Member) commented:
This is not the only flaky test in these functional tests. Many of them fail randomly, but only in CI, never locally.

I spent a lot of effort on this a while ago (even switched CI systems), but haven't figured it out yet.
I would be happy for every hint I can get 😄

Or maybe we need to completely change this functional test setup, not sure.

nejch (Member, Author) commented Feb 29, 2020

Ah I see :( I do get some failures locally, so maybe my measly laptop is closer to the Travis VM specs :P
It's hard to mark things for re-run right now because certain tests prepare state for others, so maybe a refactor will help, let's see :)

nejch changed the title from "Functional test for rate limits is flaky" to "Functional tests that create/update/delete resources are flaky" on Mar 1, 2020
nejch (Member, Author) commented Mar 1, 2020

I now think that a lot of these failures happen because GitLab hands off tasks to Sidekiq asynchronously, so getting a response from the server does not mean the operation is actually done (like deleting a project etc.). If the next step/assert relies on that, it fails.
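
To make the failure mode concrete, here is a minimal sketch of the pattern (the instance URL, token and project name are made up for illustration):

```python
import gitlab

# Hypothetical local test instance and token, purely for illustration
gl = gitlab.Gitlab("http://localhost", private_token="test-token")

project = gl.projects.create({"name": "flaky-example"})
project.delete()  # the response only means the deletion was queued in Sidekiq

# Flaky: the project can still appear here until Sidekiq processes the job
assert project.id not in [p.id for p in gl.projects.list(all=True)]
```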

I see this more often now as I try to convert asserts to test cases for #1024 (I will push that once it gets greener :P), especially as I try to delete a resource in teardown and re-create it for the next test. If I run tests with pytest-random-order, it goes green when those 2 tests are further apart.

For example with the rate limit: if I try to reuse tools/reset_gitlab.py for teardown and add some checks with projects.list(), it takes around 10-20 seconds in my case to delete all the projects that the test creates (around 20-40 projects?).
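
For illustration, the kind of teardown check I mean would look roughly like this (just a sketch, not the actual tools/reset_gitlab.py; the helper name and timeout are made up):

```python
import time

import gitlab


def delete_all_projects(gl: gitlab.Gitlab, timeout=60, interval=1):
    """Sketch of a teardown: delete every project, then wait for Sidekiq to catch up."""
    for project in gl.projects.list(all=True):
        project.delete()

    start = time.time()
    while gl.projects.list(all=True):  # deletion happens asynchronously
        if time.time() - start > timeout:
            raise TimeoutError("projects were not cleaned up in time")
        time.sleep(interval)
```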

Have you or anyone here had a similar experience? Maybe we could have some kind of wait_resource_available() and wait_resource_deleted() helpers, and keep as much of that in fixtures as possible to avoid having sleeps in tests. It would probably slow down the tests quite a bit, but also make them more robust. Anyway, splitting these into test cases is already some progress, it seems :)
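
A rough sketch of what such helpers could look like (the signatures are invented, not an existing python-gitlab API):

```python
import time

from gitlab.exceptions import GitlabGetError


def wait_resource_deleted(manager, resource_id, timeout=30, interval=0.5):
    """Poll until GET returns 404, i.e. Sidekiq has really deleted the resource."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            manager.get(resource_id)
        except GitlabGetError:
            return
        time.sleep(interval)
    raise TimeoutError(f"resource {resource_id} was not deleted within {timeout}s")


def wait_resource_available(manager, resource_id, timeout=30, interval=0.5):
    """Poll until GET succeeds, e.g. after an asynchronous create."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            manager.get(resource_id)
            return
        except GitlabGetError:
            time.sleep(interval)
    raise TimeoutError(f"resource {resource_id} did not appear within {timeout}s")
```

Fixtures could then call these so individual tests never need explicit sleeps.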

github-actions bot locked as resolved and limited conversation to collaborators on Nov 7, 2021