feat(server): allow tests to be stopped in all test pipeline steps #3504

schoren · 2024-01-08T20:23:24Z

This PR adds code to handle test stop requests in all test pipeline steps.

Changes

Fixes

Checklist

tested locally
added new dependencies
updated the docs
added a test

mathnogueira

LGTM

danielbdias

The code seems OK! Just a question: where do we handle the test status change? Is it before or after these operations?
Why I'm asking: is there any chance of a race condition that overrides the test status with something else?

xoscar · 2024-01-08T20:52:27Z

@schoren I think this is a simple way of enabling the stopping mechanism. My only concern is that it will only be executed if one of the workers is currently executing the job. The problem I see is that if the job is in an idle state, nothing will be able to stop it. Have you thought about that?

xoscar · 2024-01-08T20:53:15Z

We are facing a similar behavior, described here:

https://github.com/kubeshop/tracetest-cloud-frontend/issues/154

schoren · 2024-01-08T21:45:28Z

@xoscar I'm just extending the existing behavior, not modifying it. If the existing behavior is a source of bugs, then we'll need to invest some time fixing it. That being said, it should be very difficult to hit the "skip test" button at the exact moment the run is between steps, so I don't think the test being "idle" is an issue. What can happen is that the context cancellation arrives after the context cancellation handling has already executed, so the request might be "ignored": the test would have a "trace skipped" state but still loop through the polling process. I know this can happen in theory, but I wasn't able to reproduce it. In any case, an easy fix could be to add a run state validation before each step, so we can be sure that a test that's supposed to be stopped or to skip traces jumps to the correct step. Do you think this should be in the scope of this PR?
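For readers following along, the "run state validation before each step" idea could look roughly like the sketch below. It is not the tracetest implementation: `RunState`, `Step`, `runPipeline`, and the step names are all hypothetical, chosen only to show a pipeline loop that re-reads the run state before every step so a request that lands between steps is still honored.

```go
package main

import (
	"context"
	"fmt"
)

// RunState is a hypothetical run status used only for this sketch.
type RunState int

const (
	StateRunning RunState = iota
	StateStopped
	StateSkipTrace
)

// Step is a hypothetical pipeline step, not a tracetest type.
type Step struct {
	Name         string
	TraceRelated bool // bypassed when the run is in StateSkipTrace
	Run          func(ctx context.Context) error
}

// runPipeline validates the context and the run state before every step,
// so a stop or skip request that arrived between steps is still honored.
func runPipeline(ctx context.Context, steps []Step, state func() RunState) ([]string, error) {
	var executed []string
	for _, s := range steps {
		if err := ctx.Err(); err != nil {
			return executed, err
		}
		switch state() {
		case StateStopped:
			return executed, nil
		case StateSkipTrace:
			if s.TraceRelated {
				continue
			}
		}
		if err := s.Run(ctx); err != nil {
			return executed, fmt.Errorf("step %q: %w", s.Name, err)
		}
		executed = append(executed, s.Name)
	}
	return executed, nil
}

// runDemo simulates a skip request arriving right after the trigger step.
func runDemo() []string {
	current := StateRunning
	noop := func(context.Context) error { return nil }
	steps := []Step{
		{Name: "trigger", Run: func(context.Context) error {
			current = StateSkipTrace // the request lands between steps
			return nil
		}},
		{Name: "poll-trace", TraceRelated: true, Run: noop},
		{Name: "assert", Run: noop},
	}
	executed, _ := runPipeline(context.Background(), steps, func() RunState { return current })
	return executed
}

func main() {
	fmt.Println(runDemo()) // [trigger assert]
}
```

In this sketch the polling step is skipped even though no step was running when the request arrived, which is the case the context cancellation alone might miss.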

@danielbdias the cancel handling function is this:

tracetest/server/executor/queue.go, line 368 in 6d6143a:

cancelCtx(nil)

The first thing that happens is the cancellation of the context, so any ongoing operations will be canceled. After that we update the status, so it shouldn't be possible to have a race condition related to this specifically.
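As an illustration of the ordering described here (cancel first, then update the status), a minimal sketch follows. It assumes the cancel function is a `context.CancelCauseFunc`, which the `cancelCtx(nil)` call suggests but the thread does not confirm. The `run` type, `finish`, and `stopRun` are hypothetical names, not code from `queue.go`.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// run is a hypothetical record holding the test run status.
type run struct {
	mu     sync.Mutex
	status string
}

// finish is what a worker calls when its step completes. It refuses to
// write once the context is canceled, so it cannot override "stopped".
func (r *run) finish(ctx context.Context) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if ctx.Err() != nil {
		return
	}
	r.status = "finished"
}

func (r *run) get() string {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.status
}

// stopRun cancels the context first and only then updates the status.
// A worker write that slipped in before the cancel is overwritten here;
// a worker write attempted after the cancel is rejected by finish.
func stopRun(cancel context.CancelCauseFunc, r *run) {
	cancel(nil) // nil cause means context.Cause reports context.Canceled
	r.mu.Lock()
	r.status = "stopped"
	r.mu.Unlock()
}

// runDemo starts a simulated worker, stops the run, and reports the result.
func runDemo() (string, error) {
	ctx, cancel := context.WithCancelCause(context.Background())
	r := &run{status: "running"}
	done := make(chan struct{})
	go func() {
		defer close(done)
		<-ctx.Done() // simulated long-running step interrupted by the cancel
		r.finish(ctx)
	}()
	stopRun(cancel, r)
	<-done
	return r.get(), context.Cause(ctx)
}

func main() {
	status, cause := runDemo()
	fmt.Println(status, cause) // stopped context canceled
}
```

Under these assumptions the final status is always "stopped", whichever side takes the lock first.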

feat(server): allow tests to be stopped in all test pipeline steps

869bc16

mathnogueira approved these changes Jan 8, 2024

jorgeepc approved these changes Jan 8, 2024

danielbdias approved these changes Jan 8, 2024

xoscar approved these changes Jan 9, 2024

schoren merged commit db11183 into main Jan 9, 2024
37 checks passed

schoren deleted the allow-test-stop-on-all-steps branch January 9, 2024 18:00
