Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase number of container logs #800

Closed
wants to merge 1 commit into from
Closed

Conversation

jenshenny
Copy link
Contributor

@jenshenny jenshenny commented Feb 8, 2021

What are you trying to accomplish with this PR?
There have been occasions where the number of lines available in the stack trace aren't enough to diagnose a problem for a deploy. This change bumps the number of lines from 25 to 100 so that more information is provided for developers so that they are able to determine the issue.

Example:

...
/app/.gem/ruby/2.6.0/gems/bundler-2.1.4/exe/bundle:46:in `block in <top (required)>'
/app/.gem/ruby/2.6.0/gems/bundler-2.1.4/lib/bundler/friendly_errors.rb:123:in `with_friendly_errors'
/app/.gem/ruby/2.6.0/gems/bundler-2.1.4/exe/bundle:34:in `<top (required)>'
/app/bin/bundle:5:in `load'
/app/bin/bundle:5:in `<main>'

but the line that gives the issue away isn't shown due to the line limit (the error was 76 lines in total)

=> Crashed: ArgumentError: Trying to register Bundler::GemfileError for status code 4 but Bundler::GemfileError is already registered

@jenshenny jenshenny marked this pull request as ready for review February 8, 2021 20:27
@jenshenny jenshenny requested a review from a team as a code owner February 8, 2021 20:27
@@ -3,7 +3,7 @@ module Krane
class ContainerLogs
attr_reader :lines, :container_name

DEFAULT_LINE_LIMIT = 25
DEFAULT_LINE_LIMIT = 100
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think most importantly this should be configurable, e.g. DEFAULT_LINE_LIMIT = Integer(ENV.fetch('KRANE_LINE_LIMIT', 100)).

Because one default won't work for all apps ever.

Copy link
Contributor

@dturn dturn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DEFAULT_LINE_LIMIT used to be larger but more often than not it wasn't helpful and just made Krane's output harder for the average developer to understand. We can talk about making it configurable, but we should leave the default value at 25.

Copy link

@lastgabs lastgabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with @casperisfine that this should be configurable.

Related to @dturn s comment: we're currently migrating services to the new container builder, and currently the line limit is too low causing a spike in support load for multiple teams because developers can't see the errors and have to ask about it. The only feasible way to see the full log is by using kubectl which developers don't have access to, neither does my team to be able to support developers in making this migration.

I read through the PR history for this number to be increased and it's possible that core is the place where this pain can be most felt. By making it default to 100 and changing it to 25 in core would make it a useful default for all other services (rather than have to change the env variables in all the repos to a higher number).

@dturn is there any other context that I'm missing that would invalidate the things I've just said?

@dturn
Copy link
Contributor

dturn commented Feb 8, 2021

Gabi, thank you for addintional context.

is there any other context that I'm missing that would invalidate the things I've just said?

  1. Logging is the heart of Krane, its primarily a CLI tool after all. So any changes to what or how much we log get a lot of scrutiny.

  2. Krane is an open source repo / tool and one of our written rules is to prioritize Universality. In general that means we reject changes that improve the experience for some users while making things worse for other.

I'm happy to accept the env var override for but not the change in default value. As I strongly believe there isn't a right number log lines to log, but from experience logging too many lines confuses / overwhelms users more often than it actually provides the answer they're looking for.

@lastgabs
Copy link

lastgabs commented Feb 9, 2021

@dturn I understand it's not an easy change to accept and I don't think we should make it lightly.

from experience logging too many lines confuses / overwhelms users more often than it actually provides the answer they're looking for

The situation that we are seeing happen is that the stacktrace is too large and therefore the error message (the only information users need to be able to understand why something failed) doesn't appear anywhere as it usually comes last. Hiding that information is ultimately not providing the answer they are looking for, effectively making it impossible to debug any issue.

If increasing the number of logged lines is not the right answer, do you have other suggestions on how this problem can be fixed?

@casperisfine
Copy link
Contributor

I don't think it's worth arguing over this. Let's just make it configurable, we can then set that env variable globally to the value we want through our deploy tool, and set it to a custom lower value for the apps where we'd rather keep the 25 lines limit.

@lastgabs
Copy link

lastgabs commented Feb 9, 2021

No argument on my side. I understand the problem of changing this number and do agree that we can't make it worse for some users.

I like your idea @casperisfine - @jenshenny had suggested the same thing.

The other one I thought of was: instead of showing the first 25 lines of logs, we could change krane to show the first 15 lines and the last 15 (or some other combination), and that way you can see the both the beginning and the end of the stacktrace, which are usually the most important bits of the log. But this is a whole lot more involved, so I'd rather try yours first and see how that goes. If we get complaints about the size of logs, we can make a more involved change later on.

@jenshenny
Copy link
Contributor Author

Closing in favor of #803

@jenshenny jenshenny closed this Feb 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants