-
Notifications
You must be signed in to change notification settings - Fork 8.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FTR: test suite teardown takes significant amount of time #171743
Comments
dmlemeshko
changed the title
FTR:
FTR: test suite teardown takes significant amount of time
Nov 22, 2023
Pinging @elastic/appex-qa (Team:QA) |
dmlemeshko
added a commit
that referenced
this issue
Jan 12, 2024
…xists (#171672) ## Summary Part of #171743 Since we no longer collecting code coverage for functional tests, I removed flushing on stop signal. We also noticed a failure on test completion and it is probably due to resubscribing for logs polling when WebDriver session is already destroyed. See #171743 for actual logs example My guess, our current approach to keep fetching WebDriver/Browser logs based on FTR lifecycle is not very robust: `takeUntil(lifecycle.cleanup.after$)` keeps emitting until the `cleanup` stage is finished, and internally we keep resubscribing on WebDriver failures due to `NoSuchSessionError` and keep trying to fetch logs again & again. Since we don't re-create WebDriver session, it simply doesn't make sense. Resubscribing seems useless. I'm replacing `takeUntil(lifecycle.cleanup.after$)` with `takeWhile((loggingEntry: logging.Entry) =>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX))` so that we stop as soon as we get NoSuchSessionError even if it is before cleanup.after stage in FTR lifecycle (e.g. browser crashed during the test run) For **reviewers**: please give it a try locally and check that you are able to see browser logs, but no errors/delay on FTR teardown. How to test: - start servers & test runner - let the browser window to be opened by WebDriver and then close it manually - check console logs: `browser` logs should be stopped immediately, you don't see any browser/webdriver logs other than general FTR error
kibanamachine
pushed a commit
to kibanamachine/kibana
that referenced
this issue
Jan 12, 2024
…xists (elastic#171672) ## Summary Part of elastic#171743 Since we no longer collecting code coverage for functional tests, I removed flushing on stop signal. We also noticed a failure on test completion and it is probably due to resubscribing for logs polling when WebDriver session is already destroyed. See elastic#171743 for actual logs example My guess, our current approach to keep fetching WebDriver/Browser logs based on FTR lifecycle is not very robust: `takeUntil(lifecycle.cleanup.after$)` keeps emitting until the `cleanup` stage is finished, and internally we keep resubscribing on WebDriver failures due to `NoSuchSessionError` and keep trying to fetch logs again & again. Since we don't re-create WebDriver session, it simply doesn't make sense. Resubscribing seems useless. I'm replacing `takeUntil(lifecycle.cleanup.after$)` with `takeWhile((loggingEntry: logging.Entry) =>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX))` so that we stop as soon as we get NoSuchSessionError even if it is before cleanup.after stage in FTR lifecycle (e.g. browser crashed during the test run) For **reviewers**: please give it a try locally and check that you are able to see browser logs, but no errors/delay on FTR teardown. How to test: - start servers & test runner - let the browser window to be opened by WebDriver and then close it manually - check console logs: `browser` logs should be stopped immediately, you don't see any browser/webdriver logs other than general FTR error (cherry picked from commit d34915b)
semd
pushed a commit
to semd/kibana
that referenced
this issue
Jan 12, 2024
…xists (elastic#171672) ## Summary Part of elastic#171743 Since we no longer collecting code coverage for functional tests, I removed flushing on stop signal. We also noticed a failure on test completion and it is probably due to resubscribing for logs polling when WebDriver session is already destroyed. See elastic#171743 for actual logs example My guess, our current approach to keep fetching WebDriver/Browser logs based on FTR lifecycle is not very robust: `takeUntil(lifecycle.cleanup.after$)` keeps emitting until the `cleanup` stage is finished, and internally we keep resubscribing on WebDriver failures due to `NoSuchSessionError` and keep trying to fetch logs again & again. Since we don't re-create WebDriver session, it simply doesn't make sense. Resubscribing seems useless. I'm replacing `takeUntil(lifecycle.cleanup.after$)` with `takeWhile((loggingEntry: logging.Entry) =>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX))` so that we stop as soon as we get NoSuchSessionError even if it is before cleanup.after stage in FTR lifecycle (e.g. browser crashed during the test run) For **reviewers**: please give it a try locally and check that you are able to see browser logs, but no errors/delay on FTR teardown. How to test: - start servers & test runner - let the browser window to be opened by WebDriver and then close it manually - check console logs: `browser` logs should be stopped immediately, you don't see any browser/webdriver logs other than general FTR error
kibanamachine
added a commit
that referenced
this issue
Jan 12, 2024
…onger exists (#171672) (#174770) # Backport This will backport the following commits from `main` to `8.12`: - [[ftr/remote] stop WebDriver logs polling when its session no longer exists (#171672)](#171672) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Dzmitry Lemechko","email":"dzmitry.lemechko@elastic.co"},"sourceCommit":{"committedDate":"2024-01-12T15:12:35Z","message":"[ftr/remote] stop WebDriver logs polling when its session no longer exists (#171672)\n\n## Summary\r\n\r\nPart of #171743\r\n\r\nSince we no longer collecting code coverage for functional tests, I\r\nremoved flushing on stop signal.\r\n\r\nWe also noticed a failure on test completion and it is probably due to\r\nresubscribing for logs polling when WebDriver session is already\r\ndestroyed. See #171743 for actual logs example\r\n\r\nMy guess, our current approach to keep fetching WebDriver/Browser logs\r\nbased on FTR lifecycle is not very robust:\r\n`takeUntil(lifecycle.cleanup.after$)` keeps emitting until the `cleanup`\r\nstage is finished, and internally we keep resubscribing on WebDriver\r\nfailures due to `NoSuchSessionError` and keep trying to fetch logs again\r\n& again. Since we don't re-create WebDriver session, it simply doesn't\r\nmake sense. Resubscribing seems useless.\r\n\r\nI'm replacing `takeUntil(lifecycle.cleanup.after$)` with\r\n`takeWhile((loggingEntry: logging.Entry)\r\n=>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX))` so that we\r\nstop as soon as we get NoSuchSessionError even if it is before\r\ncleanup.after stage in FTR lifecycle (e.g. browser crashed during the\r\ntest run)\r\n\r\nFor **reviewers**: please give it a try locally and check that you are\r\nable to see browser logs, but no errors/delay on FTR teardown.\r\n\r\nHow to test:\r\n- start servers & test runner\r\n- let the browser window to be opened by WebDriver and then close it\r\nmanually\r\n- check console logs: `browser` logs should be stopped immediately, you\r\ndon't see any browser/webdriver logs other than general FTR error","sha":"d34915bec78de95af02e775924d83ef95e6ea5bb","branchLabelMapping":{"^v8.13.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","FTR","v8.12.1","v8.13.0"],"title":"[ftr/remote] stop WebDriver logs polling when its session no longer exists","number":171672,"url":"#171672 stop WebDriver logs polling when its session no longer exists (#171672)\n\n## Summary\r\n\r\nPart of #171743\r\n\r\nSince we no longer collecting code coverage for functional tests, I\r\nremoved flushing on stop signal.\r\n\r\nWe also noticed a failure on test completion and it is probably due to\r\nresubscribing for logs polling when WebDriver session is already\r\ndestroyed. See #171743 for actual logs example\r\n\r\nMy guess, our current approach to keep fetching WebDriver/Browser logs\r\nbased on FTR lifecycle is not very robust:\r\n`takeUntil(lifecycle.cleanup.after$)` keeps emitting until the `cleanup`\r\nstage is finished, and internally we keep resubscribing on WebDriver\r\nfailures due to `NoSuchSessionError` and keep trying to fetch logs again\r\n& again. Since we don't re-create WebDriver session, it simply doesn't\r\nmake sense. Resubscribing seems useless.\r\n\r\nI'm replacing `takeUntil(lifecycle.cleanup.after$)` with\r\n`takeWhile((loggingEntry: logging.Entry)\r\n=>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX))` so that we\r\nstop as soon as we get NoSuchSessionError even if it is before\r\ncleanup.after stage in FTR lifecycle (e.g. browser crashed during the\r\ntest run)\r\n\r\nFor **reviewers**: please give it a try locally and check that you are\r\nable to see browser logs, but no errors/delay on FTR teardown.\r\n\r\nHow to test:\r\n- start servers & test runner\r\n- let the browser window to be opened by WebDriver and then close it\r\nmanually\r\n- check console logs: `browser` logs should be stopped immediately, you\r\ndon't see any browser/webdriver logs other than general FTR error","sha":"d34915bec78de95af02e775924d83ef95e6ea5bb"}},"sourceBranch":"main","suggestedTargetBranches":["8.12"],"targetPullRequestStates":[{"branch":"8.12","label":"v8.12.1","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.13.0","branchLabelMappingKey":"^v8.13.0$","isSourceBranch":true,"state":"MERGED","url":"#171672 stop WebDriver logs polling when its session no longer exists (#171672)\n\n## Summary\r\n\r\nPart of #171743\r\n\r\nSince we no longer collecting code coverage for functional tests, I\r\nremoved flushing on stop signal.\r\n\r\nWe also noticed a failure on test completion and it is probably due to\r\nresubscribing for logs polling when WebDriver session is already\r\ndestroyed. See #171743 for actual logs example\r\n\r\nMy guess, our current approach to keep fetching WebDriver/Browser logs\r\nbased on FTR lifecycle is not very robust:\r\n`takeUntil(lifecycle.cleanup.after$)` keeps emitting until the `cleanup`\r\nstage is finished, and internally we keep resubscribing on WebDriver\r\nfailures due to `NoSuchSessionError` and keep trying to fetch logs again\r\n& again. Since we don't re-create WebDriver session, it simply doesn't\r\nmake sense. Resubscribing seems useless.\r\n\r\nI'm replacing `takeUntil(lifecycle.cleanup.after$)` with\r\n`takeWhile((loggingEntry: logging.Entry)\r\n=>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX))` so that we\r\nstop as soon as we get NoSuchSessionError even if it is before\r\ncleanup.after stage in FTR lifecycle (e.g. browser crashed during the\r\ntest run)\r\n\r\nFor **reviewers**: please give it a try locally and check that you are\r\nable to see browser logs, but no errors/delay on FTR teardown.\r\n\r\nHow to test:\r\n- start servers & test runner\r\n- let the browser window to be opened by WebDriver and then close it\r\nmanually\r\n- check console logs: `browser` logs should be stopped immediately, you\r\ndon't see any browser/webdriver logs other than general FTR error","sha":"d34915bec78de95af02e775924d83ef95e6ea5bb"}}]}] BACKPORT--> Co-authored-by: Dzmitry Lemechko <dzmitry.lemechko@elastic.co>
CoenWarmer
pushed a commit
to CoenWarmer/kibana
that referenced
this issue
Feb 15, 2024
…xists (elastic#171672) ## Summary Part of elastic#171743 Since we no longer collecting code coverage for functional tests, I removed flushing on stop signal. We also noticed a failure on test completion and it is probably due to resubscribing for logs polling when WebDriver session is already destroyed. See elastic#171743 for actual logs example My guess, our current approach to keep fetching WebDriver/Browser logs based on FTR lifecycle is not very robust: `takeUntil(lifecycle.cleanup.after$)` keeps emitting until the `cleanup` stage is finished, and internally we keep resubscribing on WebDriver failures due to `NoSuchSessionError` and keep trying to fetch logs again & again. Since we don't re-create WebDriver session, it simply doesn't make sense. Resubscribing seems useless. I'm replacing `takeUntil(lifecycle.cleanup.after$)` with `takeWhile((loggingEntry: logging.Entry) =>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX))` so that we stop as soon as we get NoSuchSessionError even if it is before cleanup.after stage in FTR lifecycle (e.g. browser crashed during the test run) For **reviewers**: please give it a try locally and check that you are able to see browser logs, but no errors/delay on FTR teardown. How to test: - start servers & test runner - let the browser window to be opened by WebDriver and then close it manually - check console logs: `browser` logs should be stopped immediately, you don't see any browser/webdriver logs other than general FTR error
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
We noticed a strange delay on FTR suite teardown:
Test runner finish the suite at
18:29:39 CEST
and Kibana server is stopped at18:33:58 CEST
, which means something took 4+ minutes to finalise the teardown.https://buildkite.com/elastic/kibana-pull-request/builds/175830#018bcebe-dd02-4045-b36a-bd9c7d78a7fe
Few problems to fix:
Back in August teardown was quite fast:
The text was updated successfully, but these errors were encountered: