fix(controller): Workflow stop and resume by node didn't properly support offloaded nodes. Fixes #2543 #2548
Am starting off by running some builds with the increased timeouts to look at the logs.
Codecov Report
@@            Coverage Diff             @@
##           master    #2548     +/-   ##
=========================================
+ Coverage   11.16%   11.19%   +0.02%
=========================================
  Files          83       83
  Lines       32673    32645      -28
=========================================
+ Hits         3649     3653       +4
+ Misses      28525    28495      -30
+ Partials      499      497       -2
I'm currently unable to progress with this as I can't reproduce locally and the CI build is not executing.
Try merging master in?
bf09b61 to 88dd2d9
e7d340b to 961000b
@alexec it looks like the issue here was related to node offload. Why is there a different set of tests in CLIWithServerSuite as compared to CLISuite? It seems we would want to run this test in both.
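The difference is that in CLIWithServerSuite the Argo Server used has access to the Offload Node Status Repo, while in CLISuite it does not. If your test needs access to the repo, add it to CLIWithServerSuite; if it doesn't, add it to CLISuite. You can also add a:
    if !s.Persistence.Enabled() {
        s.T().SkipNow()
    }
check to protect tests that you only want to run in an environment where the Argo Server has (or doesn't have) persistence enabled. I just did something similar, take a look at https://github.com/argoproj/argo/pull/2645/files#diff-4f326ca1865929af6dabf0119581b09cR473 for an example.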
Thanks. I came across that, but in this case I would think the test should run in both cases but I don't want to duplicate the code. I'll have a think how to do this.
On Sat, 11 Apr 2020 at 17:06, Simon Behar wrote:
> @alexec it looks like the issue here was related to node offload. Why is there a different set of tests in CLIWithServerSuite as compared to CLISuite? It seems we would want to run this test in both.
>
> The difference is that CLIWithServerSuite the Argo Server used has access to the Offload Node Status Repo, while in the CLISuite it does not. If your test needs access to the repo, add it to CLIWithServerSuite, if it doesn't add it to CLISuite. You can also add a:
>     if !s.Persistence.Enabled() {
>         s.T().SkipNow()
>     }
> check to protect tests that you only want to run in an environment where the Argo Server has (or doesn't have) persistence enabled.
> I just did something similar, take a look at https://github.com/argoproj/argo/pull/2645/files#diff-4f326ca1865929af6dabf0119581b09cR473 for an example.
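One way to run the same test in both suites without duplicating the body is to move the body into a shared helper and have each suite call it. The following is only a sketch using the standard library: `suiteEnv`, `stopAndResumeCase`, and `offloadOnlyCase` are hypothetical names, and `persistenceEnabled` merely stands in for `s.Persistence.Enabled()`. It is not how the Argo e2e suites are actually wired.

```go
package main

import "fmt"

// suiteEnv is a hypothetical stand-in for the state a test suite exposes.
// persistenceEnabled plays the role of s.Persistence.Enabled().
type suiteEnv struct {
	name               string
	persistenceEnabled bool
}

// stopAndResumeCase is the shared test body. It is written once and does not
// care which suite calls it.
func stopAndResumeCase(e suiteEnv) string {
	return fmt.Sprintf("%s: ran stop/resume case (persistence=%t)", e.name, e.persistenceEnabled)
}

// offloadOnlyCase shows the guard pattern from the comment above: it reports
// a skip when the environment has no persistence, instead of failing.
func offloadOnlyCase(e suiteEnv) string {
	if !e.persistenceEnabled {
		return e.name + ": skipped"
	}
	return e.name + ": ran offload-only case"
}

func main() {
	envs := []suiteEnv{
		{name: "CLISuite", persistenceEnabled: false},
		{name: "CLIWithServerSuite", persistenceEnabled: true},
	}
	for _, e := range envs {
		// Both suites reuse the same bodies; only the environment differs.
		fmt.Println(stopAndResumeCase(e))
		fmt.Println(offloadOnlyCase(e))
	}
}
```

In a real suite, each suite type would presumably get a thin test method that delegates to the shared helper, so the assertions live in one place.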
961000b to 6eeae7c
Am I right in thinking there was a bug in stopping workflows? If so, can the title of the PR be updated to reflect this?
The issue was that workflow stop and resume by node didn't properly support offloaded nodes. I'll change the name.
0df8aaa to e84e313
@alexec FYI I have just rebased on master.
workflow/util/util.go (outdated)
    wf.Status.Nodes = nil
    }

    err = packer.CompressWorkflowIfNeeded(wf)
This ordering is incorrect: CompressWorkflowIfNeeded should always come before repo.Save.
This file misses unit tests. Would you please be able to add them? "No - it is too hard" would be a fine answer.
In terms of the ordering, wouldn't that result in it always compressing the nodes if they're too big? Currently, if offload is enabled for the workflow, it won't try to compress the nodes.
By missing tests, are you talking about adding a test to util_test.go that covers the updateWorkflowNodeByKey function?
You should be able to see this pattern elsewhere: compression can fail, and then we offload.
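The ordering described in this thread (try compression first, offload only when compression cannot make the workflow fit) might look roughly like the sketch below. Everything in it is an assumption made for illustration: `workflow`, `compressIfNeeded`, `repo`, `persist`, `errTooLarge`, and the size limit are hypothetical stand-ins, and compression is simulated by halving the size. It is not the code in workflow/util/util.go.

```go
package main

import (
	"errors"
	"fmt"
)

// errTooLarge is a hypothetical sentinel meaning "too big even when compressed".
var errTooLarge = errors.New("workflow too large even after compression")

// maxSize is an arbitrary illustrative limit on the stored node status.
const maxSize = 10

// workflow is a simplified stand-in for the real workflow object.
type workflow struct {
	nodes          string // serialized node status kept in the workflow
	offloadVersion string // set when the nodes live in the offload repo
}

// compressIfNeeded is a hypothetical stand-in for CompressWorkflowIfNeeded.
// Compression is only simulated here: it is assumed to halve the size.
func compressIfNeeded(wf *workflow) error {
	if len(wf.nodes) <= maxSize {
		return nil // small enough, nothing to do
	}
	if len(wf.nodes)/2 > maxSize {
		return errTooLarge // compression would not be enough
	}
	return nil // pretend the compressed form fits
}

// repo is a hypothetical in-memory offload store keyed by version.
type repo map[string]string

// save stores the nodes and returns the version they were stored under.
func (r repo) save(nodes string) string {
	version := fmt.Sprintf("v%d", len(r)+1)
	r[version] = nodes
	return version
}

// persist applies the ordering from the review: compress first, and fall
// back to offloading only when compression reports the workflow is too large.
func persist(wf *workflow, r repo) error {
	err := compressIfNeeded(wf)
	if errors.Is(err, errTooLarge) {
		wf.offloadVersion = r.save(wf.nodes)
		wf.nodes = "" // nodes now live in the repo, not in the workflow
		return nil
	}
	return err
}

func main() {
	r := repo{}
	small := &workflow{nodes: "abc"}
	big := &workflow{nodes: "a-very-long-serialized-node-status"}
	fmt.Println(persist(small, r), small.offloadVersion == "")
	fmt.Println(persist(big, r), big.offloadVersion)
}
```

With this shape, a small workflow never touches the repo, which is one way to read the reviewer's point that offloading is the fallback rather than the first step.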
@alexec please take another look now.
Please fix lint error:
Can you log in to https://app.circleci.com/? I think you might need to do that to prevent the unauthorized error.
@alexec I get a 404 at https://app.circleci.com but I have logged into https://circleci.com/dashboard - unfortunately it didn't help.
OK. I've just merged a fix to master that I think will fix it. Can you sync with master please?
9e3e284 to aad287b
Kudos, SonarCloud Quality Gate passed! 0 Bugs
@alexec
Boom! Merged! 🚀
Fixes #2621