Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add BuildpackOutput #721

Merged
merged 100 commits into from
Feb 14, 2024
Merged

Add BuildpackOutput #721

merged 100 commits into from
Feb 14, 2024

Conversation

schneems
Copy link
Contributor

@schneems schneems commented Nov 6, 2023

This adds a new build_output module containing a BuildOutput interface, whose goal is to provide a standard format for outputting build information to the end user while a buildpack is running.

This implementation is an evolution of the approach prototyped in the Ruby CNB:
https://github.com/heroku/buildpacks-ruby/tree/dda4ede413fc3fe4d6d2f2f63f039c7c1e5cc5fd/commons/src/output

@schneems schneems force-pushed the schneems/output branch 4 times, most recently from 45bf3db to 7130e45 Compare November 6, 2023 22:18
@schneems schneems marked this pull request as ready for review November 12, 2023 16:13
@schneems schneems requested a review from a team as a code owner November 12, 2023 16:13
@Malax Malax self-assigned this Nov 14, 2023
@edmorley edmorley added enhancement New feature or request libherokubuildpack labels Nov 16, 2023
@edmorley edmorley requested a review from Malax January 15, 2024 11:19
@schneems schneems force-pushed the schneems/output branch 4 times, most recently from 8888f99 to 45b3405 Compare January 24, 2024 23:00
Copy link
Member

@edmorley edmorley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this.

I've only had an initial review look so far - at 1600+ lines long this PR is pretty large (and I see that's after several simplifications already) - which will take some time to review, and is also larger than I was expecting for an initial landing of a logger feature.

The comments I've left so far focus on ways we can try to reduce the scope of this PR to make it easier to land something sooner. For example, by removing the print_interval feature (and all the associated background threads, locking, ... handling).

I think when first landing a new feature it's best to keep the feature set smaller, and then incrementally add additional features in follow-up PRs, since:

  • the time taken to review a PR is exponential wrt PR size
  • each new feature is another potential place where design decisions/discussions need to be had, particularly for advanced features - such as the print_interval feature
  • the print_interval feature is also something that not all buildpacks will even need (as mentioned in a meeting a few months ago) - so the chances are that it will require more discussion than some of the other more core logger features

Once those removals/simplifications are made, it will be easier for me to continue the review.

libherokubuildpack/examples/print_style_guide.rs Outdated Show resolved Hide resolved
libherokubuildpack/examples/print_style_guide.rs Outdated Show resolved Hide resolved
libherokubuildpack/examples/print_style_guide.rs Outdated Show resolved Hide resolved
.github/workflows/ci.yml Outdated Show resolved Hide resolved
libherokubuildpack/src/output/background.rs Outdated Show resolved Hide resolved
libherokubuildpack/Cargo.toml Outdated Show resolved Hide resolved
libherokubuildpack/Cargo.toml Outdated Show resolved Hide resolved
libherokubuildpack/Cargo.toml Outdated Show resolved Hide resolved
libherokubuildpack/src/output/util.rs Outdated Show resolved Hide resolved
libherokubuildpack/src/output/background.rs Outdated Show resolved Hide resolved
Richard Schneeman and others added 11 commits January 30, 2024 13:39
This is a port of https://github.com/heroku/buildpacks-ruby/tree/016ee969647afc6aa9565ff3a44594e308f83380/commons/src/output from the Ruby build pack.

The goal is to provide a standard format for outputting build information to the end user while a buildpack is running.
* Refactor background timer logic

The background timer prints dots to the UI while the buildpack performs some expensive operation. This requires we use threading and Rust really wants to be sure that we're not going to have data races and that our data is secure.

The original version of the code relied on wrapping our IO in an Arc<Mutex<W>> so that we don't "lose" it when we pass it to the background thread. This worked, but was overly complicated based on my limited understanding of working with lifetimes and threads. This version instead relies on the ability to retrieve a struct when we join the thread to get the IO object back out. This reduces complexity that the BuildLog interface needs to know about.

# This is the commit message #2:

Inject tick state

The tick values for the printer was hardcoded. Passing in the values allows us to remove the `style` module dependency and makes the tests simpler.

# This is the commit message #3:

Rename modules and add docs

The prior name `print_dots` isn't technically valid anymore as it could be printing something else. I also moved the guard struct into it's own module to make it a bit safer (ensure that Option is always Some at creation time).

# This is the commit message #4:

Clarify intention that option is Some in code

The debug_assert adds a stronger signal that the option must be Some when it is being constructed.

* [Stacked PR] Remove warn_later from output

The `warn_later` functionality relies on global state in a way that we're not thrilled about putting into everyone's build packs by default. This removes the feature, plain and simple.
…ruct returns.

Boxed traits proved to be difficult to work with. On the user side: the buildpack user mostly interacts with associated methods on `BuildLog` they cannot be called unless the traits that implement them are in scope so the user was forced to have a `*` import. On the implementation side, boxed traits are limited. For example, there was no way to have an associated function that accepts a `Fn` or `FnOnce` to call it and return itself.

The boxed trait state machine pattern was a useful exercise in refining the shape of the API, but it is no longer bringing joy. With that, we wish it farewell.
The ReadYourWrite struct aided in testing as it implemented `Write` and produced a way to read the contents that were written. This struct was used for testing purposes and Manuel identified that it would be preferable to have a way to retrieve the Write IO struct directly. Previously this would have meant introducing a public interface to the build log trait as all type information was hidden behind a boxed trait return. As the trait implementation is removed we can now add a testing function that exposes the writer and can remove this single purpose class.

Fun fact: Prior to introducing the ReadYourWrite struct, earlier versions of the code used a file to pass into the BuildLog as a `Write` that could later be read since the filename was not consumed and could be read even once the `BuildLog` and associated file handle are not accessible.
* cull

* Add changes from pairing
- Annotate that it's only used in tests
- Rename to make it clearer that it operates on all lines
- Refactor to not rely on regex
This was used for the style guide printing.
@schneems
Copy link
Contributor Author

Alright. I'm ready for another round. We all missed a HUGE piece of the public interface being that log was still in place. I feel like that's the level I'm aiming to get feedback on originally.

Really I'm asking for help getting this over the line, whatever that looks like. I know you're out of steam (maybe projecting? I'm out of steam).

We still want the `log` feature to exist until all buildpacks have
transitioned over to `build_output`, since:
1. It will ease the transition for repositories where there are
   multiple buildpacks, since each buildpack can be migrated
   to `build_output` individually, rather than needing to do
   all of them at once.
2. It means any buildpacks that cannot switch to `build_output`
   right away, can still stay up to date with other libcnb changes.

The change to `on_error` has also been reverted, since:
1. It's a change that's standalone from introducing the
   `build_output` module (ie: lets first introduce a new thing,
   then afterwards handle migrating things to use it)
2. It's otherwise yet another thing we need to review in this
   already very-long lived PR, and it would be best to avoid
   further delays. (I can see at least one issue with the change
   that would need fixing, and I haven't even reviewed it fully
  yet.)
This was introduced when the changelog was updated as part of merging
`main` into this branch in:
a76c490
Since:
- both myself and Manuel find the previous implementation clearer
- the rustdocs say "If an empty string is provided ...", rather than
  "if there are no lines after splitting ...", so the previous implementation
  makes even more sense in that light
- the previous implementation was what was code reviewed already
I think this might have been a leftover from:
1c27a74
Previously `write_paragraph` would use the incorrect prefix when the
line wasn't empty but had trailing whitespace in the input string.

ie for this input:

`foo\n\t\t\nbar`

Previously the output would be:

`! foo\n!\t\t\n! bar`

And now it will be:

`! foo\n! \t\t\n! bar`

I've adjusted the tests to test this case too.

I've also added a comment to the implementation to make it clearer
why the prefix is being adjusted at all.
@edmorley
Copy link
Member

We all missed a HUGE piece of the public interface being that log was still in place. I feel like that's the level I'm aiming to get feedback on originally.

We didn't miss this when reviewing - we want the log feature to continue to exist until all buildpacks have transitioned over to build_output, since:

  1. It will ease the transition for repositories where there are multiple buildpacks, since each buildpack can be migrated to build_output individually, rather than needing to do all of them at once.
  2. It means any buildpacks that cannot switch to build_output right away, can still stay up to date with other libcnb changes.

We can definitely remove it in the future though :-)

To speed things up I've reverted the log removal (and the on_error change; more on that in the commit message), in:
1da805e

Copy link
Member

@edmorley edmorley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the updates!

To help speed things up, I've pushed some commits to resolve a few things - more on those in these comments (feel free to resolve the comment threads once you've had a chance to read them):

This PR now very close to being done :-D The one last part to resolve is this possible bug (but the fix should be easy, if we decide it's something that should be fixed):
#721 (review)

Mentioned in the comments: A user can add a stray newline to an input and mess up the formatting. The original implementation guarded against this at the point of origination. The basic concept is that user input isn't to be trusted, and at the point that they've given it to us, we can sanitize it for consistency https://github.com/heroku/buildpacks-ruby/blob/dda4ede413fc3fe4d6d2f2f63f039c7c1e5cc5fd/commons/src/output/build_log.rs#L224. With that concept in mind, the BuildOutput needs to own the consistency of the data it passes to other modules and functions.

The build output concept largely takes ownership over newlines between outputs so I feel comfortable trimming these. I picked shared entry points to trim before any writeln_now calls. 

I extended the test to check for trailing newlines in additional locations.
@schneems
Copy link
Contributor Author

I've reviewed your changes. Thank you for the help! I pushed a commit to handle more cases where stray trailing newlines can affect the output.

Regarding log and error I believe there's a bug in both of them when pushing to Heroku where the color bleeds over the remote: prefixes for any output that goes over 1 line. That's existed on main for a long time ago, it can be a bug for a bit longer.

@edmorley
Copy link
Member

edmorley commented Feb 14, 2024

Regarding log and error I believe there's a bug in both of them when pushing to Heroku where the color bleeds over the remote: prefixes for any output that goes over 1 line. That's existed on main for a long time ago, it can be a bug for a bit longer.

Yeah I believe that existing bug is this one:
#555

(Whilst the issue linked above is about the prefix added by Pack when the builder is "untrusted", it ends up having a similar impact to the "remote:" prefix added by the Heroku Git server.)

We can close out that GitHub issue (plus also #712) when we remove log later on :-)

The fix in 3ad2008 fixed trailing
newlines in a few scenarios, and updated the tests to cover most of them,
but there was one more case not tested, which is resolved now.
Previously two cases were not tested:
1. The "don't add the prefix if the line is blank" case, which is
   why the `if input.iter().all(u8::is_ascii_whitespace)` logic exists.
2. The output when the streamed command output includes a
   trailing newline.

Whilst the output for (2) currently includes a redundant newline,
IMO it makes sense to have a test demonstrating the issue in the
meantime. (I've also added a TODO.)
This makes the "don't add the indent if the line is blank" conditional:
(a) correct in the "trailing whitespace isn't just a newline" case
(b) equivalent to the `prefix_lines` closure used in `write_paragraph`

ie: This is the `start_stream` version of this change:
5e20c41

I've also renamed `input` to `line` to make the closure easier to
understand.
Copy link
Member

@Malax Malax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approving to signal that changes made since my last approval are also approved :)

Copy link
Member

@edmorley edmorley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schneems I've pushed some additional commits / left one last comment:
#721 (comment)

If you're happy to defer fixing the issue I found until later (and are happy with the other commits I pushed), I'm happy to call this done and I will approve/merge the PR :-)

Check if the streaming command emitted two newlines before adding an extra newline for padding. This fixes the case where an extra newline was being added when it wasn't needed.
@schneems
Copy link
Contributor Author

schneems commented Feb 14, 2024

I commented inline but will again here as well:

@schneems I've pushed some additional commits / left one last comment:
#721 (comment)

If you're happy to defer fixing the issue I found until later (and are happy with the other commits I pushed), I'm happy to call this done and I will approve/merge the PR :-)

I pushed a fix for this case. It was fairly easy. If you really don't like it for some reason you can revert and merge and we can file an issue for it.

Copy link
Member

@edmorley edmorley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything from the review is now resolved - let's merge!

Thank you for persevering through this 😄

@edmorley edmorley changed the title Add BuildpackOutput Add BuildpackOutput Feb 14, 2024
@edmorley edmorley merged commit d806cbd into main Feb 14, 2024
4 checks passed
@edmorley edmorley deleted the schneems/output branch February 14, 2024 22:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request libherokubuildpack
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants