Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add status reason messaging #833

Merged
merged 31 commits into from Dec 9, 2022
Merged

Add status reason messaging #833

merged 31 commits into from Dec 9, 2022

Conversation

vshah23
Copy link
Contributor

@vshah23 vshah23 commented Dec 1, 2022

Checklist

Version

Requires a MINOR version update

Context

This change introduces better status reason messaging for steps for the build log grouping feature.

The relation between step run status, status_reasons and errors are here:

status status_reason errors
success x x
skipped This Step was skipped, because a previous Step failed, and this Step was not marked "is_always_run". x
skipped_with_run_if This Step was skipped, because its "run_if" expression evaluated to false.\nThe "run_if" expression was: {run_if_exp} x
failed_skippable This Step failed, but it was marked as "is_skippable", so the build continued. {error}
failed x {error}
preparation_failed x {error}
aborted_with_custom_timeout x This Step timed out after {timeout}.
aborted_with_no_output x This Step failed, because it has not sent any output for {timeout}.

Changes

  • The StatusReasons method is now part of the StepRunResultsModel struct, which returns the status_reason and/or errors related to the given step run status.
  • A new time formatting method is introduced to accommodate the formatting of (int) seconds to a string like this example:
    3661 -> "1h 1m 1s"
  • The step run results footer in both the step run output and in summary now contains the status as either Failed or Skipped.
  • Step run errors and status reasons are printed into the step log box.

Investigation details

Decisions

bitrise/print.go Outdated Show resolved Hide resolved
bitrise/print.go Outdated Show resolved Hide resolved
log/models.go Outdated Show resolved Hide resolved
vshah23 and others added 7 commits December 7, 2022 09:54
* Remove unnecessary fields from StepFinishedParams

* Update status reasons

* Introduce the new StepRunResultsModel.StatusReasons function

* Version -> FormatVersion

* Remove unused method

* Handle switch cases exhaustively

* Update unit tests

* Update unit test

Co-authored-by: vikas <vikas.shah@bitrise.io>
@vshah23 vshah23 closed this Dec 8, 2022
@vshah23 vshah23 deleted the CI-318 branch December 8, 2022 18:50
@vshah23 vshah23 restored the CI-318 branch December 9, 2022 07:19
@vshah23 vshah23 reopened this Dec 9, 2022
Comment on lines +34 to +35
// TODO: if len(titleWithSuffix) > titleBoxWidth because of a long suffix,
// might we trim too much from the title
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still a todo?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is. The mentioned issue won't come up in real Bitrise builds, if we consider the current max box width (stepRunSummaryBoxWidthInChars), the lenght of titles, and title statuses. But still in itself won't work as expected for arbitrary input.

Comment on lines -78 to -89
isExitStatusError := true
if err != nil {
isExitStatusError = errorutil.IsExitStatusError(err)
}

// Print step preparation errors before the step header box,
// other errors are printed within the step box.
if status == models.StepRunStatusCodePreparationFailed && err != nil {
if !isExitStatusError {
log.Errorf("Preparing Step (%s) failed: %s", pointers.StringWithDefault(stepInfoCopy.Step.Title, "missing title"), err)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer this way when every step related error appears inside the step box. Even if technically for these step preparation errors the step was not started. But does this new behaviour needs to be coordinated with the FE team? Because previously this error was appearing as a bitrise cli error before the step start event and now the error only appears in the step finished event.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK bitrise CLI logs are not shown when structured logging is enabled, this mostly changes how the console log is presented.

cli/run_test.go Outdated
return strings.Contains(output, expectedEvent)
}
assert.Eventuallyf(t, condition, 5*time.Second, 150*time.Millisecond, "", "")
//assert.True(t, strings.Contains(output, expectedEvent), output)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removable comment?

Comment on lines +1633 to +1637
condition := func() bool {
output := buf.String()
return strings.Contains(output, expectedEvent)
}
assert.Eventuallyf(t, condition, 5*time.Second, 150*time.Millisecond, "", "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this new assert needed? From what I can see in the test that we save the loggers output into a buffer and run a test workflow. By the time we reach this assertion the workflow run should be over and we should have the output in the buffer. I think we do not do any concurrent operation in the cli (workflow related).

I was curious and restored the previous assertion and did 5 test runs locally and it was always passing. Is this flaky on the CI?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test case (GivenPluginRegisteredForMultipleTriggers_ThenPluginTriggeredTwice ) is flaky on CI, here is an example build: https://app.bitrise.io/build/0e707cfb-7591-4465-b6ab-e57f6e28925f.

Comment on lines +710 to +720
if mergedStep.RunIf != nil {
stepInfoPtr.Step.RunIf = pointers.NewStringPtr(*mergedStep.RunIf)
}

if mergedStep.Timeout != nil {
stepInfoPtr.Step.Timeout = pointers.NewIntPtr(*mergedStep.Timeout)
}

if mergedStep.NoOutputTimeout != nil {
stepInfoPtr.Step.NoOutputTimeout = pointers.NewIntPtr(*mergedStep.NoOutputTimeout)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are needed because they were not set on the stepInfoPtr previously? I was trying to see where we are reading these from the stepInfoPtr but could not see it. Is this a change to just make the code technically correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are needed because they were not set on the stepInfoPtr previously?

Yes, if you search for stepInfoPtr.Step. in the activateAndRunSteps func, you can see that we only set the step properties that are later used for presentation (step boxes).

stepInfoPtr is passed to the buildRunResultCollector.registerStepRunResults, which uses it for creating the stepResults:

Then the added fields are used for creating the status reason: https://github.com/bitrise-io/bitrise/blob/CI-318/cli/build_run_result_collector.go#L184

Comment on lines +96 to +101
for _, stepError := range params.Errors {
lines = append(lines, colorstring.Red(stepError.Message))
}
if params.StatusReason != "" {
lines = append(lines, colorstring.Blue(params.StatusReason))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to admit you tricked me with this clever solution. 😂

I set a step preparation failed error and was debugging who prints the error inside the step box because at first look it did not make sense. The step did not run and still someone printed the error there. And if I ran it with json output then there were only the step started and step finished events and nothing in between.

But then I pieced it together and realised that the step footer now prints the errors and status reason as part of the main step body. It is was tricky for me to understand because my mind thought that the step footer always starts with the step timing information.

I do not have a better solution but just wanted highlight that this might be a little bit confusing the first time you encounter it.

Comment on lines 6 to 15
const (
StepRunStatusCodeSuccess StepRunStatus = iota
StepRunStatusCodeFailed
StepRunStatusCodeFailedSkippable
StepRunStatusCodeSkipped
StepRunStatusCodeSkippedWithRunIf
StepRunStatusCodePreparationFailed
StepRunStatusAbortedWithCustomTimeout // step times out due to a custom timeout
StepRunStatusAbortedWithNoOutputTimeout // step times out due to no output received (hang)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to highlight that in the original constant list the value 6 was missing. So StepRunStatusCodePreparationFailed was 5 and StepRunStatusAbortedWithCustomTimeout 7. With this change the last two statuses will not have their original value.

And this value reordering and using the iota might be dangerous ot use. If someone inserts a new state in the middle of the list (or reintroduces the missing 6 value) then the whole status is off. And this is used by (based on GitHub code search):

  • build log service
  • bitrise analytics plugin
    and based on the code they use it for mapping purposes.

Either lets use dedicated integer values like before or add a big comment to warn not to modify the order of the list because other tools might depend on them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting this, it was not intentional, restored here: b58b743

models/models.go Outdated
return ""
case StepRunStatusCodeFailed, StepRunStatusCodePreparationFailed, StepRunStatusCodeFailedSkippable:
return fmt.Sprintf("exit code: %d", exitCode)
func (s StepRunResultsModel) StatusReasons() (string, []StepError) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method returns more than what the name suggest. I do not have a great suggestion but I would rename it to StatusReasonAndErrors or FinalStatus or something else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to use this name because both the status_reason and the errors are reasonings for the given status.

tothszabi
tothszabi previously approved these changes Dec 9, 2022
@godrei godrei merged commit 432a240 into master Dec 9, 2022
@godrei godrei deleted the CI-318 branch December 9, 2022 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants