New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: allow users to stop a running test #2367
Conversation
@@ -825,7 +825,7 @@ paths: | |||
application/yaml: | |||
schema: | |||
type: string | |||
/test/{testId}/run/{runId}/stop: | |||
/tests/{testId}/run/{runId}/stop: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all test endpoints are actually /tests
with a trailing s
@@ -13,7 +13,7 @@ exporters: | |||
loglevel: debug | |||
|
|||
otlp/1: | |||
endpoint: host.docker.internal:21321 | |||
endpoint: tracetest:21321 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was probablly a leftover from someone doing docker tests, it should point to the internal url
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: is this file intended just for local testing, or do we use it in some pipelines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
local testing. It's used when you run the ./run.sh
script for example, that is mainly targetted to run an env similar to CI, so you can run the tracetesting tests locally easily.
@@ -4,3 +4,12 @@ spec: | |||
name: OpenTelemetry Collector | |||
type: otlp | |||
isdefault: true | |||
--- | |||
type: Demo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addin a default demo locally eased the testing process
ctx context.Context | ||
test model.Test | ||
run model.Run | ||
subscriptionManager *subscription.Manager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this var was unused
@@ -132,14 +131,18 @@ func getNewCtx(ctx context.Context) context.Context { | |||
} | |||
|
|||
func (r persistentRunner) Run(ctx context.Context, test model.Test, metadata model.RunMetadata, environment model.Environment) model.Run { | |||
ctx = getNewCtx(ctx) | |||
ctx, cancelCtx := context.WithCancel( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we want to make the context cancellable, since it isn't by default
server/executor/trace_poller.go
Outdated
select { | ||
default: | ||
case <-job.ctx.Done(): | ||
fmt.Println("trace poller. Context cancelled.") | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle context cancelation
@@ -190,7 +196,13 @@ func (tp tracePoller) runAssertions(job PollingRequest) { | |||
func (tp tracePoller) handleTraceDBError(job PollingRequest, err error) (bool, string) { | |||
run := job.run | |||
|
|||
pp := *tp.ppGetter.GetDefault(job.ctx).Periodic | |||
profile := tp.ppGetter.GetDefault(job.ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sometimes the context can get cancelled in the middle of this func call, causing a nil pointer panic. This new approach handles the case more gracefully
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to update the OpenAPI specs to include the new run state
Line 124 in b4f9814
enum: |
@jorgeepc fixed! |
server/executor/trace_poller.go
Outdated
select { | ||
default: | ||
case <-job.ctx.Done(): | ||
fmt.Println("trace poller. Context cancelled.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we should do an early return here in case of cancelation? Or do we need to execute the Job process even after receiving the cancel signal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something to think about in another PR: is there a way to add a unit test that checks if the runner behaved correctly on a cancel case?
29ab437
to
8b38d1e
Compare
@@ -4,13 +4,13 @@ import "sync" | |||
|
|||
type Manager struct { | |||
subscriptions map[string][]Subscriber | |||
mutex sync.Mutex | |||
mutex sync.RWMutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All changes in this file are to fix the race condition making some tests fail
server/executor/eventemitter.go
Outdated
em.mutex.Lock() | ||
defer em.mutex.Unlock() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this lock is moved to the publisher now
@@ -32,26 +32,27 @@ func (r *fakeTestRunner) Run(ctx context.Context, test model.Test, metadata mode | |||
} | |||
|
|||
go func() { | |||
run := newRun // make a local copy to avoid race conditions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by using a copy instead of the original var, we avoid race conditions
<-done | ||
|
||
assert(t, transactionRun) | ||
done := make(chan model.TransactionRun, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of using the channel as a done
signal, we use it to return the relevant value. by not sharing a variable between 2 goroutines, we eliminate a race condition
case finalRun = <-done: | ||
subscriptionManager.Unsubscribe(transactionRun.ResourceID(), sf.ID()) //cleanup to avoid race conditions | ||
fmt.Println("run completed") | ||
case <-time.After(10 * time.Second): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also it allows to set a timeout instead of waiting forever
server/subscription/manager.go
Outdated
m.mutex.RLock() | ||
defer m.mutex.RUnlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
Co-authored-by: Daniel Baptista Dias <danielbdias@users.noreply.github.com>
Co-authored-by: Daniel Baptista Dias <danielbdias@users.noreply.github.com>
4bfd856
to
2464636
Compare
This PR implements the stop running request feature. The endpoint was already mocked, but the URL was inconistent with the rest of the tests urls.
This still needs some FE work. I was able to make the web not crash with the new stopped run state, but the UI doesn’t show the trace once stopped, even when it is saved to the run info. Also we probably need a button or something.
Fixes
Checklist