Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: allow users to stop a running test #2367

Merged
merged 18 commits into from Apr 13, 2023
Merged

feat: allow users to stop a running test #2367

merged 18 commits into from Apr 13, 2023

Conversation

schoren
Copy link
Collaborator

@schoren schoren commented Apr 11, 2023

This PR implements the stop running request feature. The endpoint was already mocked, but the URL was inconistent with the rest of the tests urls.

This still needs some FE work. I was able to make the web not crash with the new stopped run state, but the UI doesn’t show the trace once stopped, even when it is saved to the run info. Also we probably need a button or something.

Fixes

Checklist

  • tested locally
  • added new dependencies
  • updated the docs
  • added a test

@@ -825,7 +825,7 @@ paths:
application/yaml:
schema:
type: string
/test/{testId}/run/{runId}/stop:
/tests/{testId}/run/{runId}/stop:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all test endpoints are actually /tests with a trailing s

@@ -13,7 +13,7 @@ exporters:
loglevel: debug

otlp/1:
endpoint: host.docker.internal:21321
endpoint: tracetest:21321
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was probablly a leftover from someone doing docker tests, it should point to the internal url

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: is this file intended just for local testing, or do we use it in some pipelines?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

local testing. It's used when you run the ./run.sh script for example, that is mainly targetted to run an env similar to CI, so you can run the tracetesting tests locally easily.

@@ -4,3 +4,12 @@ spec:
name: OpenTelemetry Collector
type: otlp
isdefault: true
---
type: Demo
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addin a default demo locally eased the testing process

ctx context.Context
test model.Test
run model.Run
subscriptionManager *subscription.Manager
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this var was unused

@@ -132,14 +131,18 @@ func getNewCtx(ctx context.Context) context.Context {
}

func (r persistentRunner) Run(ctx context.Context, test model.Test, metadata model.RunMetadata, environment model.Environment) model.Run {
ctx = getNewCtx(ctx)
ctx, cancelCtx := context.WithCancel(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we want to make the context cancellable, since it isn't by default

Comment on lines 135 to 142
select {
default:
case <-job.ctx.Done():
fmt.Println("trace poller. Context cancelled.")
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handle context cancelation

@@ -190,7 +196,13 @@ func (tp tracePoller) runAssertions(job PollingRequest) {
func (tp tracePoller) handleTraceDBError(job PollingRequest, err error) (bool, string) {
run := job.run

pp := *tp.ppGetter.GetDefault(job.ctx).Periodic
profile := tp.ppGetter.GetDefault(job.ctx)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sometimes the context can get cancelled in the middle of this func call, causing a nil pointer panic. This new approach handles the case more gracefully

Copy link
Contributor

@jorgeepc jorgeepc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to update the OpenAPI specs to include the new run state

enum:

@schoren
Copy link
Collaborator Author

schoren commented Apr 11, 2023

@jorgeepc fixed!

server/executor/run_stop.go Outdated Show resolved Hide resolved
server/executor/run_stop.go Outdated Show resolved Hide resolved
select {
default:
case <-job.ctx.Done():
fmt.Println("trace poller. Context cancelled.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should do an early return here in case of cancelation? Or do we need to execute the Job process even after receiving the cancel signal?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something to think about in another PR: is there a way to add a unit test that checks if the runner behaved correctly on a cancel case?

@schoren schoren linked an issue Apr 11, 2023 that may be closed by this pull request
@@ -4,13 +4,13 @@ import "sync"

type Manager struct {
subscriptions map[string][]Subscriber
mutex sync.Mutex
mutex sync.RWMutex
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All changes in this file are to fix the race condition making some tests fail

Comment on lines 37 to 39
em.mutex.Lock()
defer em.mutex.Unlock()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this lock is moved to the publisher now

@@ -32,26 +32,27 @@ func (r *fakeTestRunner) Run(ctx context.Context, test model.Test, metadata mode
}

go func() {
run := newRun // make a local copy to avoid race conditions
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by using a copy instead of the original var, we avoid race conditions

<-done

assert(t, transactionRun)
done := make(chan model.TransactionRun, 1)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of using the channel as a done signal, we use it to return the relevant value. by not sharing a variable between 2 goroutines, we eliminate a race condition

case finalRun = <-done:
subscriptionManager.Unsubscribe(transactionRun.ResourceID(), sf.ID()) //cleanup to avoid race conditions
fmt.Println("run completed")
case <-time.After(10 * time.Second):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also it allows to set a timeout instead of waiting forever

Comment on lines 51 to 52
m.mutex.RLock()
defer m.mutex.RUnlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@schoren schoren merged commit ba47e42 into main Apr 13, 2023
26 checks passed
@schoren schoren deleted the add-stop-test-feature branch April 13, 2023 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Error Handling] - Backend - Stopping a Test Run
3 participants