Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ftr/remote] stop WebDriver logs polling when its session no longer exists #171672

Merged
merged 20 commits into from
Jan 12, 2024

Conversation

dmlemeshko
Copy link
Member

@dmlemeshko dmlemeshko commented Nov 21, 2023

Summary

Part of #171743

Since we no longer collecting code coverage for functional tests, I removed flushing on stop signal.

We also noticed a failure on test completion and it is probably due to resubscribing for logs polling when WebDriver session is already destroyed. See #171743 for actual logs example

My guess, our current approach to keep fetching WebDriver/Browser logs based on FTR lifecycle is not very robust: takeUntil(lifecycle.cleanup.after$) keeps emitting until the cleanup stage is finished, and internally we keep resubscribing on WebDriver failures due to NoSuchSessionError and keep trying to fetch logs again & again. Since we don't re-create WebDriver session, it simply doesn't make sense. Resubscribing seems useless.

I'm replacing takeUntil(lifecycle.cleanup.after$) with takeWhile((loggingEntry: logging.Entry) =>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX)) so that we stop as soon as we get NoSuchSessionError even if it is before cleanup.after stage in FTR lifecycle (e.g. browser crashed during the test run)

For reviewers: please give it a try locally and check that you are able to see browser logs, but no errors/delay on FTR teardown.

How to test:

  • start servers & test runner
  • let the browser window to be opened by WebDriver and then close it manually
  • check console logs: browser logs should be stopped immediately, you don't see any browser/webdriver logs other than general FTR error

@dmlemeshko dmlemeshko added FTR v8.12.0 v8.11.2 release_note:skip Skip the PR/issue when compiling release notes labels Nov 22, 2023
@dmlemeshko dmlemeshko changed the title [ftr/remote] improve poll_for_log_entry [ftr/remote] stop WebDriver logs polling as soon as its session no longer exists Nov 22, 2023
@dmlemeshko dmlemeshko changed the title [ftr/remote] stop WebDriver logs polling as soon as its session no longer exists [ftr/remote] stop WebDriver logs polling when its session no longer exists Nov 22, 2023
@dmlemeshko dmlemeshko marked this pull request as ready for review November 22, 2023 16:05
@dmlemeshko dmlemeshko requested a review from a team as a code owner November 22, 2023 16:05
@dmlemeshko dmlemeshko requested a review from a team November 22, 2023 16:05
@dmlemeshko
Copy link
Member Author

Adding Operations team to be aware of the change.

@dmlemeshko
Copy link
Member Author

@pheyos, I think it is ready for another look

});
// pause 10 seconds then resubscribe
Rx.of(1).pipe(
delay(10 * 1000),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Short delay (e.g. 10 ms) will help to reproduce logs duplication in main. With this change you won't see multiple messages like

warn browser[SEVERE] ERROR FETCHING BROWSR LOGS: This driver instance does not have a valid session ID (did you call WebDriver.quit()?) and may no longer be used.

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Member

@pheyos pheyos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally and works fine.
LGTM

@dmlemeshko dmlemeshko merged commit d34915b into elastic:main Jan 12, 2024
20 checks passed
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Jan 12, 2024
…xists (elastic#171672)

## Summary

Part of elastic#171743

Since we no longer collecting code coverage for functional tests, I
removed flushing on stop signal.

We also noticed a failure on test completion and it is probably due to
resubscribing for logs polling when WebDriver session is already
destroyed. See elastic#171743 for actual logs example

My guess, our current approach to keep fetching WebDriver/Browser logs
based on FTR lifecycle is not very robust:
`takeUntil(lifecycle.cleanup.after$)` keeps emitting until the `cleanup`
stage is finished, and internally we keep resubscribing on WebDriver
failures due to `NoSuchSessionError` and keep trying to fetch logs again
& again. Since we don't re-create WebDriver session, it simply doesn't
make sense. Resubscribing seems useless.

I'm replacing `takeUntil(lifecycle.cleanup.after$)` with
`takeWhile((loggingEntry: logging.Entry)
=>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX))` so that we
stop as soon as we get NoSuchSessionError even if it is before
cleanup.after stage in FTR lifecycle (e.g. browser crashed during the
test run)

For **reviewers**: please give it a try locally and check that you are
able to see browser logs, but no errors/delay on FTR teardown.

How to test:
- start servers & test runner
- let the browser window to be opened by WebDriver and then close it
manually
- check console logs: `browser` logs should be stopped immediately, you
don't see any browser/webdriver logs other than general FTR error

(cherry picked from commit d34915b)
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.12

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

semd pushed a commit to semd/kibana that referenced this pull request Jan 12, 2024
…xists (elastic#171672)

## Summary

Part of elastic#171743

Since we no longer collecting code coverage for functional tests, I
removed flushing on stop signal.

We also noticed a failure on test completion and it is probably due to
resubscribing for logs polling when WebDriver session is already
destroyed. See elastic#171743 for actual logs example

My guess, our current approach to keep fetching WebDriver/Browser logs
based on FTR lifecycle is not very robust:
`takeUntil(lifecycle.cleanup.after$)` keeps emitting until the `cleanup`
stage is finished, and internally we keep resubscribing on WebDriver
failures due to `NoSuchSessionError` and keep trying to fetch logs again
& again. Since we don't re-create WebDriver session, it simply doesn't
make sense. Resubscribing seems useless.

I'm replacing `takeUntil(lifecycle.cleanup.after$)` with
`takeWhile((loggingEntry: logging.Entry)
=>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX))` so that we
stop as soon as we get NoSuchSessionError even if it is before
cleanup.after stage in FTR lifecycle (e.g. browser crashed during the
test run)

For **reviewers**: please give it a try locally and check that you are
able to see browser logs, but no errors/delay on FTR teardown.

How to test:
- start servers & test runner
- let the browser window to be opened by WebDriver and then close it
manually
- check console logs: `browser` logs should be stopped immediately, you
don't see any browser/webdriver logs other than general FTR error
kibanamachine added a commit that referenced this pull request Jan 12, 2024
…onger exists (#171672) (#174770)

# Backport

This will backport the following commits from `main` to `8.12`:
- [[ftr/remote] stop WebDriver logs polling when its session no longer
exists (#171672)](#171672)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Dzmitry
Lemechko","email":"dzmitry.lemechko@elastic.co"},"sourceCommit":{"committedDate":"2024-01-12T15:12:35Z","message":"[ftr/remote]
stop WebDriver logs polling when its session no longer exists
(#171672)\n\n## Summary\r\n\r\nPart of #171743\r\n\r\nSince we no longer
collecting code coverage for functional tests, I\r\nremoved flushing on
stop signal.\r\n\r\nWe also noticed a failure on test completion and it
is probably due to\r\nresubscribing for logs polling when WebDriver
session is already\r\ndestroyed. See #171743 for actual logs
example\r\n\r\nMy guess, our current approach to keep fetching
WebDriver/Browser logs\r\nbased on FTR lifecycle is not very
robust:\r\n`takeUntil(lifecycle.cleanup.after$)` keeps emitting until
the `cleanup`\r\nstage is finished, and internally we keep resubscribing
on WebDriver\r\nfailures due to `NoSuchSessionError` and keep trying to
fetch logs again\r\n& again. Since we don't re-create WebDriver session,
it simply doesn't\r\nmake sense. Resubscribing seems useless.\r\n\r\nI'm
replacing `takeUntil(lifecycle.cleanup.after$)`
with\r\n`takeWhile((loggingEntry:
logging.Entry)\r\n=>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX))`
so that we\r\nstop as soon as we get NoSuchSessionError even if it is
before\r\ncleanup.after stage in FTR lifecycle (e.g. browser crashed
during the\r\ntest run)\r\n\r\nFor **reviewers**: please give it a try
locally and check that you are\r\nable to see browser logs, but no
errors/delay on FTR teardown.\r\n\r\nHow to test:\r\n- start servers &
test runner\r\n- let the browser window to be opened by WebDriver and
then close it\r\nmanually\r\n- check console logs: `browser` logs should
be stopped immediately, you\r\ndon't see any browser/webdriver logs
other than general FTR
error","sha":"d34915bec78de95af02e775924d83ef95e6ea5bb","branchLabelMapping":{"^v8.13.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","FTR","v8.12.1","v8.13.0"],"title":"[ftr/remote]
stop WebDriver logs polling when its session no longer
exists","number":171672,"url":"#171672
stop WebDriver logs polling when its session no longer exists
(#171672)\n\n## Summary\r\n\r\nPart of #171743\r\n\r\nSince we no longer
collecting code coverage for functional tests, I\r\nremoved flushing on
stop signal.\r\n\r\nWe also noticed a failure on test completion and it
is probably due to\r\nresubscribing for logs polling when WebDriver
session is already\r\ndestroyed. See #171743 for actual logs
example\r\n\r\nMy guess, our current approach to keep fetching
WebDriver/Browser logs\r\nbased on FTR lifecycle is not very
robust:\r\n`takeUntil(lifecycle.cleanup.after$)` keeps emitting until
the `cleanup`\r\nstage is finished, and internally we keep resubscribing
on WebDriver\r\nfailures due to `NoSuchSessionError` and keep trying to
fetch logs again\r\n& again. Since we don't re-create WebDriver session,
it simply doesn't\r\nmake sense. Resubscribing seems useless.\r\n\r\nI'm
replacing `takeUntil(lifecycle.cleanup.after$)`
with\r\n`takeWhile((loggingEntry:
logging.Entry)\r\n=>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX))`
so that we\r\nstop as soon as we get NoSuchSessionError even if it is
before\r\ncleanup.after stage in FTR lifecycle (e.g. browser crashed
during the\r\ntest run)\r\n\r\nFor **reviewers**: please give it a try
locally and check that you are\r\nable to see browser logs, but no
errors/delay on FTR teardown.\r\n\r\nHow to test:\r\n- start servers &
test runner\r\n- let the browser window to be opened by WebDriver and
then close it\r\nmanually\r\n- check console logs: `browser` logs should
be stopped immediately, you\r\ndon't see any browser/webdriver logs
other than general FTR
error","sha":"d34915bec78de95af02e775924d83ef95e6ea5bb"}},"sourceBranch":"main","suggestedTargetBranches":["8.12"],"targetPullRequestStates":[{"branch":"8.12","label":"v8.12.1","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.13.0","branchLabelMappingKey":"^v8.13.0$","isSourceBranch":true,"state":"MERGED","url":"#171672
stop WebDriver logs polling when its session no longer exists
(#171672)\n\n## Summary\r\n\r\nPart of #171743\r\n\r\nSince we no longer
collecting code coverage for functional tests, I\r\nremoved flushing on
stop signal.\r\n\r\nWe also noticed a failure on test completion and it
is probably due to\r\nresubscribing for logs polling when WebDriver
session is already\r\ndestroyed. See #171743 for actual logs
example\r\n\r\nMy guess, our current approach to keep fetching
WebDriver/Browser logs\r\nbased on FTR lifecycle is not very
robust:\r\n`takeUntil(lifecycle.cleanup.after$)` keeps emitting until
the `cleanup`\r\nstage is finished, and internally we keep resubscribing
on WebDriver\r\nfailures due to `NoSuchSessionError` and keep trying to
fetch logs again\r\n& again. Since we don't re-create WebDriver session,
it simply doesn't\r\nmake sense. Resubscribing seems useless.\r\n\r\nI'm
replacing `takeUntil(lifecycle.cleanup.after$)`
with\r\n`takeWhile((loggingEntry:
logging.Entry)\r\n=>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX))`
so that we\r\nstop as soon as we get NoSuchSessionError even if it is
before\r\ncleanup.after stage in FTR lifecycle (e.g. browser crashed
during the\r\ntest run)\r\n\r\nFor **reviewers**: please give it a try
locally and check that you are\r\nable to see browser logs, but no
errors/delay on FTR teardown.\r\n\r\nHow to test:\r\n- start servers &
test runner\r\n- let the browser window to be opened by WebDriver and
then close it\r\nmanually\r\n- check console logs: `browser` logs should
be stopped immediately, you\r\ndon't see any browser/webdriver logs
other than general FTR
error","sha":"d34915bec78de95af02e775924d83ef95e6ea5bb"}}]}] BACKPORT-->

Co-authored-by: Dzmitry Lemechko <dzmitry.lemechko@elastic.co>
CoenWarmer pushed a commit to CoenWarmer/kibana that referenced this pull request Feb 15, 2024
…xists (elastic#171672)

## Summary

Part of elastic#171743

Since we no longer collecting code coverage for functional tests, I
removed flushing on stop signal.

We also noticed a failure on test completion and it is probably due to
resubscribing for logs polling when WebDriver session is already
destroyed. See elastic#171743 for actual logs example

My guess, our current approach to keep fetching WebDriver/Browser logs
based on FTR lifecycle is not very robust:
`takeUntil(lifecycle.cleanup.after$)` keeps emitting until the `cleanup`
stage is finished, and internally we keep resubscribing on WebDriver
failures due to `NoSuchSessionError` and keep trying to fetch logs again
& again. Since we don't re-create WebDriver session, it simply doesn't
make sense. Resubscribing seems useless.

I'm replacing `takeUntil(lifecycle.cleanup.after$)` with
`takeWhile((loggingEntry: logging.Entry)
=>!loggingEntry.message.startsWith(FINAL_LOG_ENTRY_PREFIX))` so that we
stop as soon as we get NoSuchSessionError even if it is before
cleanup.after stage in FTR lifecycle (e.g. browser crashed during the
test run)

For **reviewers**: please give it a try locally and check that you are
able to see browser logs, but no errors/delay on FTR teardown.

How to test:
- start servers & test runner
- let the browser window to be opened by WebDriver and then close it
manually
- check console logs: `browser` logs should be stopped immediately, you
don't see any browser/webdriver logs other than general FTR error
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FTR release_note:skip Skip the PR/issue when compiling release notes v8.12.0 v8.12.1 v8.13.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants