acceptance: make `TestDockerCLI_test_exec_log` wait for unambiguous l… #107563

xinhaoz · 2023-07-25T19:14:56Z

…og line

The TestDockerCLI_test_exec_log test had a section where it waits for the last executed stmt to appear in the exec_log file before proceeding to the next section. This is done by grep'ing the query string of the last stmt. This works fine most of the time, however, the 2nd last and last stmt executed happen to have the same query string, so it's possible that we are only seeing the 2nd to last stmt in the log. This can cause the test to fail as in the next section, we expect all query logs to be present in the file.

This commit fixes this by issuing a distinct query as the last query so that we can grep for a distinct query string to verify all stmts should have been written.

A new proc is also added to common.tcl: flush_and_sync_logs is a proc that takes 2 arguments, filename and grep_text. It will trigger a signal hang up to trigger a log flush and then attempt to find the grep_text in the specified file.

Epic: none
Fixes: #106367

cockroach-teamcity · 2023-07-25T19:15:03Z

This change is

abarganier

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @xinhaoz)

pkg/cli/interactive_tests/test_exec_log.tcl line 117 at r1 (raw file):

# succeeds.
system "for i in `seq 1 3`; do
  grep 'SELECT ..*111..*' $logfile && exit 0;

Here's my mental model for this test:

Issue some queries that will be logged to the exec log.
Those query logs are buffered in CRDB, waiting to be flushed at some unpredictable time to the exec log.
Trigger flush_server_logs, which returns once cockroach.log is flushed.
Use the same loop logic to make assertions against the exec log, but this time look for a different query from the same file.

Don't we need to synchronize on the flush of the exec log to ensure that this race condition doesn't appear again? Maybe there's something about adding an additional query to the buffer of things waiting to be flushed to the exec log that I'm not seeing, let me know if so!

Based on what flush_server_logs is doing, what if we did something similar that's a bit more configurable? e.g.

proc flush_and_sync_file {filename, grep_text} {
    report "BEGIN FLUSH LOGS"
    system "kill -HUP `cat server_pid` 2>/dev/null"
    # Wait for flush to occur.
    system "for i in `seq 1 3`; do
              grep '$grep_text' $filename && exit 0;
              echo still waiting
              sleep 1
            done
            echo 'server failed to flush logs?'
            exit 1"
    report "END FLUSH LOGS"
}

# Then call in our test... NB: I'm not sure the exact filename
flush_and_sync_file "logs/db/logs/exec.log" "SELECT ..*550..* +"

I'm no expert when it comes to these TCL tests of course, just an idea. Lmk what you think!

…og line The `TestDockerCLI_test_exec_log` test had a section where it waits for the last executed stmt to appear in the exec_log file before proceeding to the next section. This is done by grep'ing the query string of the last stmt. This works fine most of the time, however, the 2nd last and last stmt executed happen to have the same query string, so it's possible that we are only seeing the 2nd to last stmt in the log. This can cause the test to fail as in the next section, we expect all query logs to be present in the file. This commit fixes this by issuing a distinct query as the last query so that we can grep for a distinct query string to verify all stmts should have been written. A new proc is also added to `common.tcl`: flush_and_sync_logs is a proc that takes 2 arguments, `filename` and `grep_text`. It will trigger a signal hang up to trigger a log flush and then attempt to find the `grep_text` in the specified file. Epic: none Fixes: cockroachdb#106367

xinhaoz · 2023-07-27T18:58:51Z

pkg/cli/interactive_tests/test_exec_log.tcl line 117 at r1 (raw file):

Previously, abarganier (Alex Barganier) wrote…

Here's my mental model for this test:

Issue some queries that will be logged to the exec log.

Those query logs are buffered in CRDB, waiting to be flushed at some unpredictable time to the exec log.

Trigger flush_server_logs, which returns once cockroach.log is flushed.

Use the same loop logic to make assertions against the exec log, but this time look for a different query from the same file.

Don't we need to synchronize on the flush of the exec log to ensure that this race condition doesn't appear again? Maybe there's something about adding an additional query to the buffer of things waiting to be flushed to the exec log that I'm seeing, let me know if so!

Based on what flush_server_logs is doing, what if we did something similar that's a bit more configurable? e.g.
proc flush_and_sync_file {filename, grep_text} {
    report "BEGIN FLUSH LOGS"
    system "kill -HUP `cat server_pid` 2>/dev/null"
    # Wait for flush to occur.
    system "for i in `seq 1 3`; do
              grep '$grep_text' $filename && exit 0;
              echo still waiting
              sleep 1
            done
            echo 'server failed to flush logs?'
            exit 1"
    report "END FLUSH LOGS"
}

# Then call in our test... NB: I'm not sure the exact filename
flush_and_sync_file "logs/db/logs/exec.log" "SELECT ..*550..* +"
I'm no expert when it comes to these TCL tests of course, just an idea. Lmk what you think!

Sure, I like the idea of making it more generic 👍 I kept the grep text as the newly added line to make we wait for all logs to appear and increased the max attempts to 5. I'm not familiar with the log flushing process at that step so it might not be necessary so lmk!

abarganier

- nice work & thank you!

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @xinhaoz)

pkg/cli/interactive_tests/test_exec_log.tcl line 117 at r1 (raw file):

Previously, xinhaoz (Xin Hao Zhang) wrote…

Sure, I like the idea of making it more generic 👍 I kept the grep text as the newly added line to make we wait for all logs to appear and increased the max attempts to 5. I'm not familiar with the log flushing process at that step so it might not be necessary so lmk!

I have no problem with increasing the # of attempts to 5. The most brittle piece of this is assuming that the flush will take under X seconds, which isn't guaranteed (although it is likely). Increasing from 3->5 should help with that!

xinhaoz · 2023-07-28T19:08:38Z

TFTR!
bors r=abarganier

craig · 2023-07-28T19:57:41Z

Build succeeded:

Bazel Essential CI (Cockroach)

xinhaoz marked this pull request as ready for review July 25, 2023 19:33

xinhaoz requested a review from a team July 25, 2023 20:08

maryliag requested a review from abarganier July 25, 2023 20:17

abarganier reviewed Jul 27, 2023

View reviewed changes

xinhaoz force-pushed the fix-exec-log branch from 25af579 to bb5d813 Compare July 27, 2023 18:58

xinhaoz requested a review from abarganier July 27, 2023 20:59

abarganier approved these changes Jul 28, 2023

View reviewed changes

craig bot merged commit 7ba8059 into cockroachdb:master Jul 28, 2023

xinhaoz deleted the fix-exec-log branch April 1, 2024 17:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

acceptance: make `TestDockerCLI_test_exec_log` wait for unambiguous l… #107563

acceptance: make `TestDockerCLI_test_exec_log` wait for unambiguous l… #107563

Uh oh!

xinhaoz commented Jul 25, 2023 •

edited

Loading

Uh oh!

cockroach-teamcity commented Jul 25, 2023

Uh oh!

abarganier left a comment •

edited

Loading

Uh oh!

xinhaoz commented Jul 27, 2023

Uh oh!

abarganier left a comment

Uh oh!

xinhaoz commented Jul 28, 2023

Uh oh!

craig bot commented Jul 28, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

acceptance: make TestDockerCLI_test_exec_log wait for unambiguous l… #107563

acceptance: make TestDockerCLI_test_exec_log wait for unambiguous l… #107563

Uh oh!

Conversation

xinhaoz commented Jul 25, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cockroach-teamcity commented Jul 25, 2023

Uh oh!

abarganier left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

xinhaoz commented Jul 27, 2023

Uh oh!

abarganier left a comment

Choose a reason for hiding this comment

Uh oh!

xinhaoz commented Jul 28, 2023

Uh oh!

craig bot commented Jul 28, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

acceptance: make `TestDockerCLI_test_exec_log` wait for unambiguous l… #107563

acceptance: make `TestDockerCLI_test_exec_log` wait for unambiguous l… #107563

xinhaoz commented Jul 25, 2023 •

edited

Loading

abarganier left a comment •

edited

Loading