Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zbctl: extend deadline for withResult with flag #5415

Closed
jwulf opened this issue Sep 24, 2020 · 2 comments · Fixed by #8823
Closed

zbctl: extend deadline for withResult with flag #5415

jwulf opened this issue Sep 24, 2020 · 2 comments · Fixed by #8823
Labels
good first issue Marks an issue as simple enough for first time contributors kind/feature Categorizes an issue or PR as a feature, i.e. new behavior

Comments

@jwulf
Copy link
Member

jwulf commented Sep 24, 2020

Is your feature request related to a problem? Please describe.
I'm waiting for the result from a workflow that takes longer than 15 seconds, using zbctl. I get a timeout, and there is no flag in the help to set a longer deadline to wait for the response.

Describe the solution you'd like
zbctl create instance <process.id> --withResult --timeout <ms>

@jwulf jwulf added the kind/feature Categorizes an issue or PR as a feature, i.e. new behavior label Sep 24, 2020
@npepinpe npepinpe added Impact: Usability good first issue Marks an issue as simple enough for first time contributors hacktoberfest Marks an issue as a candidate to be a Hacktoberfest contribution and removed Status: Needs Triage labels Sep 24, 2020
@haardikdharma10
Copy link
Contributor

@jwulf, I would like to help in solving this issue. Can you help me on how to get started?

@jwulf
Copy link
Member Author

jwulf commented Sep 30, 2020

Falko and I did a deep dive into the Zeebe Go Client source code the other day: https://www.youtube.com/watch?v=u-ObMrsZQR0

This will help you get an idea of the source code. We didn't look at the section where the zbctl command interface is implemented, but it will give you a sense of it.

@menski menski removed the hacktoberfest Marks an issue as a candidate to be a Hacktoberfest contribution label Aug 16, 2021
ghost pushed a commit that referenced this issue Feb 23, 2022
8823: Improve zbctl integration tests r=npepinpe a=npepinpe

## Description

This PR improves the visibility of `zbctl`'s tests.

At the moment, the container suite works by printing out the container logs when a test fails, at the suite level. This means, for zbctl's tests, when `TestCommonCommands` fails. However, `TestCommonCommands` runs multiple sub tests. This means, when one of the sub tests fails, the failed container logs are printed at the end for the parent test, which makes it harder to aggregate properly in Jenkins. Furthermore, when multiple sub tests fail, you still only print the logs once at the end after all tests ran.

This PR suppresses this behavior and instead prints out all the container logs for every failed sub test. This will guarantee the logs for the failed tests are grouped with the test, and it will allow us to distinguish the logs between failed tests. To accomplish, a new method was added on the suite to reuse the internal `printFailedContainerLogs` function for the suite.

Additionally, we now properly detect that a time out occurred. There's some debate about the behavior of `command.Run()`, and the consensus is the error returned should express the status of the process. When killed via a context, this means the error just says the process was killed. To know it was via timeout, we need to also check the context used to spawn the command. However, it's perfectly to just bail out when the exit code was -1, which according to the Go documentation, means the process was killed by a signal - in our case, most likely the time out. At any rate, it's safe to bail out if the status code was less than 0.

To be able to modify the command's timeout, a new flag to `zbctl` is also added, I've made `--requestTimeout` a global flag. This supersedes the previous command specific timeout flags, but since the flags are the same, replacing them is perfectly backwards compatible.

I also disabled long polling for the `zbctl` tests. It wasn't really useful, and caused some tests to hang for a long time. I think this might have been one of the reason for the flaky tests, actually, since the timeout of the command and of the requests with long polling were quite close. To disable long polling, I extended the container suite to allow passing arbitrary environment variables.

Finally, I also added a new flag to have the worker for `zbctl` exit early: `--maxJobsHandle`. When omitted, it behaves as expected, hanging forever. When specified, as soon as it has handled the max number of jobs, it will exit. This means the `createWorker` test doesn't have to wait until the command is killed to exit, which may also cause some flakiness.

This PR tentatively fixes #8284 - I think disabling long polling, proper time out handling might fix the whole thing, and exiting early might solve it.

NOTE: removing `zbc.DefaultRequestTimeout`, a constant which was not used anywhere in our code, is technically a breaking change. We could leave it for people using it - but again, it's not used in our code, so it has very little value...let me know what you think.

I realize this is quite a large PR, so let me know if you'd like me to break it off.  I don't think the changes are complex, but it's quite a few little ones, so I understand.

## Related issues

closes #8284
closes #5415



Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
@ghost ghost closed this as completed in 3c3622a Feb 24, 2022
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Marks an issue as simple enough for first time contributors kind/feature Categorizes an issue or PR as a feature, i.e. new behavior
Projects
None yet
6 participants