[hold] Bug: Randomly failing E2E tests #376
Comments
Moving this back in; Gil is saying this still happens sometimes with the server.
I couldn't reproduce the random failures, so we decided to move it to "done" for now.
First we need to resolve the issue with the wrong backend payment URL.
Hey Marcin, E2E is fixed. Can you resume the investigation?
I looked at it for a few hours and couldn't find the real issue. The failures are completely random: one time a test fails, then it works for a few runs; in the meantime another one fails, and the story repeats. I will look at this more in the next weeks and might check some alternatives to Protractor that are more stable and easier to debug and work with. The current setup slows the whole development down too much. We lose way too much time triggering pipelines, and the process of writing tests is also not efficient: we need to check our work line by line, and writing a whole test that just runs on the first try is almost impossible. After finding a better replacement I will introduce it to a wider audience, and we will decide the future of E2E tests in the project.
@marlass Please take into account that Protractor may not be to blame (or at least not the only culprit), but rather the underlying Selenium. So if you decide on an alternative library that also uses Selenium, the same issues will probably resurface after some time as the number of tests grows, especially in our app, where almost every part of the page is created dynamically (we need to make a backend call first). What I wanted to point out is that there may be no easy or obvious solution to this, so please be cautious about any miraculous alternatives. And yes, writing these tests is hard (and it would be good to make it easier), but keep in mind that testing all of this manually at the same rate is not just hard, it's impossible. Also, there is a good read on this:
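The dynamic-rendering problem described above is usually what explicit waits are for. Below is a minimal Protractor sketch, assuming a hypothetical product page and CSS selector (neither is taken from the project); `browser.wait` and `ExpectedConditions` are standard Protractor APIs:

```typescript
import { browser, by, element, ExpectedConditions as EC } from 'protractor';

describe('product page (hypothetical example)', () => {
  it('shows the price once the backend call resolves', async () => {
    await browser.get('/product/123');

    // The price element only exists after the backend responds, so wait
    // for its presence explicitly instead of asserting immediately.
    const price = element(by.css('.product-price'));
    await browser.wait(EC.presenceOf(price), 10000, 'price never rendered');

    expect(await price.getText()).toContain('$');
  });
});
```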
Yeah. I first want to check Cypress, which does not use Selenium under the hood. In a previous project, we didn't have a failure rate anywhere close to the current situation, and one thing that improved dramatically was the ease of debugging and writing tests. I will first try to move the happy path to it, and if that brings enough improvement, we will discuss it and decide whether we want to migrate.
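For comparison, here is a minimal sketch of what such a happy-path spec could look like in Cypress (the routes and selectors are hypothetical, not taken from the project). Cypress retries its queries and assertions automatically until they pass or time out, which removes most of the manual waiting:

```typescript
describe('checkout happy path (hypothetical example)', () => {
  it('adds a product to the cart and reaches the payment step', () => {
    cy.visit('/product/123');

    // cy.get keeps retrying until the dynamically rendered element
    // appears, so no manual browser.wait is needed.
    cy.get('.add-to-cart').click();
    cy.get('.cart-count').should('have.text', '1');

    cy.visit('/checkout');
    cy.contains('button', 'Proceed to payment').click();
    cy.url().should('include', '/payment');
  });
});
```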
According to the link above that @dunqan provided (and a couple of others, if you do some research), flakiness in Selenium-based tests is normal and expected. Rewriting the tests in another framework is not in scope for this ticket, nor something we should consider at this point. I'd propose to consider the following:
Agree with @hackergil that changing the testing framework is not in the scope of this ticket, but...
I checked our build statistics with the Travis API and got the following results (the last 3000 builds of 3725 total). However, this doesn't give the whole picture: the Travis API doesn't return any information about job/build restarts, so we only see the final result (which might sometimes be the outcome of 3-4 restarts).

An almost 50% failure rate doesn't look good. Most builds fail at the 'Unit tests' stage, which probably indicates the flaky E2E tests. Source code: https://github.com/marlass/travis-build-stats

We can inspect the tests further, introduce some sort of auto-retry, or try some new solutions, because this really has a great impact on everyone's work. Waiting sometimes 30 minutes to merge, then finding out that your branch is again not up to date and repeating the whole process, is extremely frustrating. In my opinion this one thing is the biggest drag on our current development speed. Let me know what, in your opinion, we should do next.
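For illustration, here is a rough sketch of the kind of query such a stats script makes — this is not the actual marlass/travis-build-stats code. It assumes the public Travis v3 API (the `Travis-API-Version: 3` header) and Node 18+ for the global `fetch`; the repository slug is a placeholder:

```typescript
// Placeholder slug; replace with the real "owner/repo" of the project.
const REPO = encodeURIComponent('owner/repo');

async function fetchBuildStates(pages = 30): Promise<string[]> {
  const states: string[] = [];
  for (let page = 0; page < pages; page++) {
    const res = await fetch(
      `https://api.travis-ci.org/repo/${REPO}/builds?limit=100&offset=${page * 100}`,
      { headers: { 'Travis-API-Version': '3' } }
    );
    const body = await res.json();
    // Note: only the final state per build is visible; restarts are not
    // exposed by the API, as pointed out above.
    states.push(...body.builds.map((b: { state: string }) => b.state));
  }
  return states;
}

fetchBuildStates().then(states => {
  const failed = states.filter(s => s === 'failed' || s === 'errored').length;
  console.log(`${failed}/${states.length} builds failed or errored`);
});
```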
As we discussed: because Travis runs for each commit (and even runs twice if we have a PR), those stats include failures from every commit on work-in-progress branches that may not yet have adjusted unit (or E2E) tests. So I'd vote for implementing a more robust logging mechanism that takes into account the Protractor output, job id, branch, and commit; then we could use that info to find the most often failing tests, the most flaky ones (those that finally pass after some restarts), etc. And of course, test/implement a retry mechanism for Protractor as soon as possible, to ease developers' lives. It's a part of this ticket (#580), but IMO it deserves its own ticket.
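A minimal sketch of the coarsest possible retry mechanism: rerun the whole Protractor suite up to N times and fail the CI job only if every attempt fails. The config path and attempt count are assumptions; a per-spec solution (e.g. the protractor-flake package) would waste less time, but this version needs no extra dependencies:

```typescript
import { spawnSync } from 'child_process';

const MAX_ATTEMPTS = 3; // assumed value; tune for your pipeline

for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
  console.log(`Protractor attempt ${attempt}/${MAX_ATTEMPTS}`);
  const result = spawnSync('npx', ['protractor', 'protractor.conf.js'], {
    stdio: 'inherit', // stream test output straight to the CI log
  });
  if (result.status === 0) {
    process.exit(0); // suite passed, stop retrying
  }
}
process.exit(1); // all attempts failed: a real (non-flaky) failure
```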
After a quick search, I found that implementing a better logging mechanism is pretty easy.
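One way such a logging mechanism could look: a custom Jasmine reporter registered in the Protractor config that records each failed spec together with the Travis metadata mentioned above (`TRAVIS_JOB_ID`, `TRAVIS_BRANCH`, and `TRAVIS_COMMIT` are standard Travis environment variables). The log file name is an assumption; the entries could just as well be posted to a stats service:

```typescript
import { appendFileSync } from 'fs';

// Only the fields of Jasmine's spec result that this reporter uses.
interface SpecResult {
  fullName: string;
  status: string;
  failedExpectations: Array<{ message: string }>;
}

export const flakyTestReporter = {
  specDone(result: SpecResult): void {
    if (result.status !== 'failed') {
      return;
    }
    // Append one JSON line per failure so runs can be aggregated later.
    appendFileSync(
      'e2e-failures.log',
      JSON.stringify({
        spec: result.fullName,
        messages: result.failedExpectations.map(e => e.message),
        jobId: process.env.TRAVIS_JOB_ID, // set by Travis on CI runs
        branch: process.env.TRAVIS_BRANCH,
        commit: process.env.TRAVIS_COMMIT,
        time: new Date().toISOString(),
      }) + '\n'
    );
  },
};

// In protractor.conf.js:
//   onPrepare: () => jasmine.getEnv().addReporter(flakyTestReporter)
```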
Status of the issue: I will review the stats next week and prepare some script for the future.
No longer needed, as we moved from Protractor to Cypress.
Expected Results
Stop E2E from failing randomly.
Observed Results
Random tests are failing, both locally and in the pipeline. There is no known case in which this can be reliably reproduced: sometimes it works, sometimes not. We have to check whether it is caused by our E2E tests, connection issues, or backend problems.