Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Major version change: task_helper.sh takes full output control #2

Merged
merged 16 commits into from
May 13, 2022

Conversation

reidmv
Copy link
Contributor

@reidmv reidmv commented Aug 23, 2021

This commit refactors task_helper.sh to take full and complete control of output, by trapping EXIT. Review code changes for details.

This is a proposed significant change, and so would mandate a major version bump. After this change, the user experience is as follows.

  1. Source the script
  2. To use task parameters, optionally use ${parameter_name} (rather than e.g. ${PT_parameter_name})
  3. To set output keys, call task-output key value
  4. Call any of task-succeed, task-fail, task-exit, or just let the script run out (rely on the default EXIT trap)

Regardless of how the user exits the script, the helper will ensure valid JSON output is printed to stdout, with:

  • Each key-value string pair the user set at any point in the script by calling task-output $key $value
  • If the user called task-succeed:
    • A status key containing "success"
    • An _output key containing a user message $1
  • If the user called task-fail:
    • A status key containing "error"
    • An _output key containing all stdout/stderr output, ending with a user message $1
  • If neither task-succeed nor task-fail is called but the script terminates with a non-zero exit code:
    • An _output key containing all stdout/stderr output

This PR also adds a task-json-escape helper function which should work on any of the supported platforms. Design intent is lowest-common-denominator posix tools only. Has been tested on OSX.

@reidmv reidmv requested a review from a team as a code owner August 23, 2021 17:26
@reidmv reidmv force-pushed the full-control branch 6 times, most recently from 9d653fc to d65c894 Compare August 25, 2021 03:59
@adreyer
Copy link

adreyer commented Aug 25, 2021

commenting here instead of code since it's more design based.

All script output in an _output key

This is beneficial when things go wrong but I think it is a worst practice.

  1. if the task computes a more meaningful value than its intermediate runs that will be hidden from the user by default since _output is expected to be the human readable result.
  2. It will result in significant memory bloat in orchestrator which is already a problem for plan authors.

task-output "message" "${1:-(no message given)}"

A primary message from a task is what _output is supposed to be used for.

@reidmv
Copy link
Contributor Author

reidmv commented Aug 25, 2021

@adreyer thanks for the feedback. Trying to translate that into suggested changes, here's what I'm coming up with.

All script output in an _output key

This is beneficial when things go wrong but I think it is a worst practice.

Is this a worst practice for a helper, or is that just commentary on bash script writing practices? "Avoid unnecessary output" is already one of Eric Raymond's 17 Unix Rules, and shell script writers should ideally already be following that.

I assert, however, that the helper should faithfully relay whatever output a task writer chooses to permit. It shouldn't be up to the helper to try and suppress anything, and therefore "all script output in an _output key" is helper-correct.

Note that as written, if a user doesn't have any output, _output will be an empty string.

#!/bin/bash
source "$(dirname $0)/../../bash_task_helper/files/task_helper.sh"

task-output "key" "value"

will currently result in

{
  "key": "value",
  "_output": ""
}

Adding task-succeed without an argument will currently do something like this:

{
  "key": "value",
  "status": "success",
  "message": "(no message given)",
  "_output": ""
}

A primary message from a task is what _output is supposed to be used for.

If that's the case, perhaps task-fail and task-succeed should simply echo any message given, sending the data to _output, rather than setting a "message" key?

@@ -45,7 +45,7 @@
 #
 task-fail() {
   task-output "status" "error"
-  task-output "message" "${1:-(no message given)}"
+  echo -n "$1"
   exit ${2:-1}
 }

@@ -63,7 +63,7 @@ task-fail() {
 #
 task-succeed() {
   task-output "status" "success"
-  task-output "message" "${1:-(no message given)}"
+  echo -n "$1"
   exit 0
 }

That would change the output from the above, to this:

{
  "key": "value",
  "status": "success",
  "_output": ""
}

If the user calls task-succeed "we did it!" instead, the output would become:

{
  "key": "value",
  "status": "success",
  "_output": "we did it!"
}

Would that be better aligned with intended idiomatic approach to task return data?

@adreyer
Copy link

adreyer commented Aug 25, 2021

I think it's a worst practice for the helper for the two reasons I outlined.
Even appending isn't great what about when you run across a dozen targets? You have a wall of text instead of a nice message. If you really want every step to be printed every time you run the task you should probably be writing a script not a task.
These sorts of tasks are fundamentally bad for use in plans since they blow up the memory requirements and at this point I think tasks primarily exist to be high quality building blocks for plans.

I think these trivial examples are misleading

yum install dependent library
rsync 
make
task-output  "successfully built new app version `bin/app --version`"

The output of the yum, rsync, and make commands should be thrown away during successful runs not displayed to the user as an important message.

#
task-succeed() {
task-output "status" "success"
task-output "message" "${1:-(no message given)}"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the message should be returned to _output that is what the _output special key is for. In that case (no message given) should not be the default so the user can let task runners print the full data if they don't have an important message

for i in "${!_task_output_keys[@]}"; do
printf ' "%s": "%s",\n' "${_task_output_keys[$i]}" "${_task_output_values[$i]}"
done
printf ' "_output": "%s"\n' "$(task-json-escape < "$_output_tmpfile")"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic should only be called on failure by default.
I think a reasonable API for this would be to have a full-script-output function that returns this output so users can opt into setting on success if they want. either as _output with task-succeed full-script-output or setting it to the variable they want.

@reidmv
Copy link
Contributor Author

reidmv commented Aug 26, 2021

@adreyer made the following changes:

  • _output does not contain full task output if the task is successful
  • task-succeed "message" will cause _output to be set to "message"
  • task-fail "message" will print "message", then set _output to all output the task has emitted
  • If the task succeeds, and no message was given, there will be no _output key
  • If the user invokes the task-verbose-output function in their bash code, _output will contain all output the task has emitted, regardless of exit code

@reidmv reidmv changed the title Refactor: task_helper.sh takes full output control Major version change: task_helper.sh takes full output control Aug 27, 2021
@m0dular
Copy link
Contributor

m0dular commented Aug 28, 2021

After thinking it over, the trap requirement is simply too restrictive and I cannot approve using it. One example would be here where we use it to clean up temp files. This is an extremely common paradigm in shell, and taking that away for the purpose of printing an output message is not an acceptable use.

Personally, if I were looking into this module and saw that it takes trap and redirection away from me, I wouldn't even consider it.

@CLAassistant
Copy link

CLAassistant commented Sep 1, 2021

CLA assistant check
All committers have signed the CLA.

@reidmv
Copy link
Contributor Author

reidmv commented Sep 7, 2021

Hey @m0dular,

Okay. How about making task-exit() a publicly-callable function, and documenting that if you trap EXIT, you need to call it yourself at the end of your trap?

The documentation then changes from:

Consumers MUST NOT trap EXIT

To:

The task helper sets up a default EXIT trap. If consumers trap EXIT themselves, they MUST call task-exit at the end of their trap to trigger the helper's output finalization routine.

@m0dular
Copy link
Contributor

m0dular commented Sep 7, 2021

That's a good question, and maybe it would help to answer it if we thought about what we want in terms of exiting and output messages at a high level? Currently, ca_extend and several of our modules call fail as part of error checking. This function:

  • takes a non-json string as a parameter
  • Uses that and $_tmp (if non-empty) to construct a json object and prints it
  • Exits non-zero

Otherwise, they just call success with a JSON string. I agree that success doesn't make much sense or really provide anything, it's just how it ended up working over the years. I really like the fail approach because you only care about stderr if something fails, and the exec 2>$_tmp gets you stderr from all processes run by the task. You could also integrate this with a trap on ERR if you don't want to add || fail after everything. A feature request automated trap handling sounds good to me.

So I think my main question is: we know success sucks (lol), what do we want from it? I think something similar to fail where we take a non-json string and construct a predicable json object makes sense.

@reidmv
Copy link
Contributor Author

reidmv commented Sep 7, 2021

@m0dular I've updated the PR description with the following stanza, which I think clarifies the proposed value of the helper in whole, as well as task-success, task-fail, and the new task-output specifically. I believe this is aligned to what you've described.

The helper will ensure valid JSON output is always printed to stdout, with:

  • Each key-value string pair the user set by calling task-output $key $value at any point in the script
  • If the user called task-succeed:
    • A status key containing "success"
    • An _output key containing a user message $1
  • If the user called task-fail:
    • A status key containing "error"
    • An _output key containing all stdout/stderr output, ending with a user message $1
  • If neither task-succeed nor task-fail is called but the script terminates with a non-zero exit code:
    • An _output key containing all stdout/stderr output

In this model, the user does not worry about either stdout or stderr. Both task-fail and task-succeed are less important than task-output; both are little more than syntactic sugar wrapping calls to task-output. To construct return keys, the user calls task-output $key $value whenever they want to set a key.

The current implementation only supports strings (and will quote + escape them), but could be expanded in the future to support passing through a "raw" value (user must make sure it is valid JSON) when setting a key.

This commit refactors task_helper.sh to take full and complete control
of output, by trapping EXIT. See commit for details.
This helps keep it very clear when a task helper function is being used.
We have two functions, one for when a task fails and one for when it
succeeds. We should standardize on whether our function names are nouns
or verbs.

This commit makes the names verbs. If we do nouns, it would be
failure/success.
Using the special key _output, we can coerce Bolt into displaying
terminal output in a human-friendly way when printing to the terminal.
Bolt will un-escape the text string for us, showing the end user exactly
what they expect to see when they run the script by hand.
Matches better with the _output key this content populates
Make it so that task-output handles re-setting output variables,
removing the risk of duplicate keys in the output.
This commit ensures that on task success, _output is minimal, and
contains a value only if the user has intentionally set one. On task
failure, _output will contain all output for the entire task, as well as
any user-set value.

On success, if there is no user-set value, the _output key will not be
included.
This permits consumers to set their tasks to always return full output
The 2.x API is not fully backwards compatible with the 1.x API, but
providing deprecated stubs for these functions should help.
Permit the traps by documenting that consumers must call `task-exit` in
their EXIT trap, if they set one.
Otherwise sed will only replace the _first_ instance of that character
type, rather than all of them.
This function permits advanced users to set output keys to pre-formatted
JSON values.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants