-
Notifications
You must be signed in to change notification settings - Fork 269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes #30284 - Improve smartproxy registration failure error messages #912
Conversation
Current output when httpd is down:
|
When foreman.service is down, there is no issue because foreman.socket will bring it back up. OTOH, when foreman.socket itself is down:
|
When a 401 response is received (for example by using incorrect
|
3f9b0f0
to
5912fc8
Compare
ab88b03
to
82ede21
Compare
303a7a6
to
661fdde
Compare
f0713d8
to
c165811
Compare
b71c6cf
to
63abca5
Compare
Error output with the latest version is as below:
Removing additional unnecessary language and de-duplicating error messages in foreman-installer output will require an additional change in Kafo, independent of the implementation here. |
It looks like a powerful solution, but it also requires a lot more build out, so it looks like there isn't an easy path to get that done for the next release? |
A couple of minor comments from me. I think this is a good update to the output messages for an area that is often unclear as to what might be the underlying cause when users hit it to help users and developers more easily diagnosis and fix the problem. I think there have been some further updates outlined we could investigate for general error reporting the installer, and for me those should not hold up this improvement. |
messages = { | ||
'400' => '400 Bad Request: Something is wrong with the data we sent to Foreman server', | ||
'401' => '401 Unauthorized Request: Check credentials for validity', | ||
'404' => '404 Not Found: The requested resource was not found', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like an edge case to me. The only thing I can think of is that they entered the wrong server URL and end up on a different service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's definitely an edge case and I had a few more in mind when I wrote it, based on situations I've seen previously:
-
Attempted improper use of load balancing
-
"I forgot I cloned this VM and left it running on the same network"
-
Automation/scripts run amok
63abca5
to
973794d
Compare
'504' => 'The webserver timed out waiting for a response from the backend service. Is Foreman under unusually heavy load?' | ||
} | ||
|
||
if explanation = explanations[response.code.to_str] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The convention is to use braces when there is an assignment as a visual cue. Otherwise it's too easy to confuse with ==
and it may just be a bug/typo.
if explanation = explanations[response.code.to_str] | |
if (explanation = explanations[response.code.to_str]) |
The alternative is:
if explanation = explanations[response.code.to_str] | |
explanation = explanations[response.code.to_str] | |
if explanation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I wasn't aware of this convention.
Thinking about it some more, would this be acceptable?
if not (explanation = explanations[response.code.to_str]).nil?
The ()
s are only useful if the reader knows the convention. This makes it very clear we are testing whether explanation
got assigned a non-nil value, while keeping it on one line. (unless
was also tempting here but unless... else
feels awkward)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My plan is to add Rubocop to the repo which will enforce the convention. If you don't do it and meant if x == y
, Rubocop will ask you to change it to if (x = y)
. That should tell you enough.
On a side note, Python thought it was a bad idea to allows assignment statements in if conditions to prevent those logic bugs. Recently it did add if x := y
as a compromise.
That's the theory behind it.
973794d
to
5f0b1a9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it Refs
instead of Fixes
? What do you think is needed to close the issue?
Code wise I think this is ok.
Same, I do feel this 'Fixes' the issue. And other enhancements we may do can be captured by independent issues. |
I'd like to also remove near duplicate lines like But I'm happy to treat that as a separate issue.... I'll change this one to |
5f0b1a9
to
c1e5599
Compare
Forgot to comment on Friday that I did in fact update to use the parenthesis convention for variable assignment within conditional. @ekohl could you take a look and let me know if you are satisfied? |
c1e5599
to
09da70f
Compare
I have rebased to solve the merge conflict |
I'd like to get this change in before I do a release of the module. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inline there are some recommendations for more actionable error messages but I'm ok merging this and improving it later on.
@@ -96,6 +99,20 @@ def success?(response) | |||
end | |||
|
|||
def error_message(response) | |||
JSON.parse(response.body)['error']['full_messages'].join(' ') rescue "unknown error (response #{response.code})" | |||
explanations = { | |||
'400' => 'Something is wrong with the data sent to Foreman server', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking out loud because it's a bit vague. How about
'400' => 'Something is wrong with the data sent to Foreman server', | |
'400' => 'Receiving server (Foreman) did not accept our data', |
The reason I put Foreman between braces is that if it actually hits some proxy or something else in between then it may be clearer. Of course there shouldn't be anything in between but if everything went well we wouldn't be writing this code :) I believe 400 is more likely to be returned by some non-Foreman server in between. Like if a user pointed it at https://internal-service.example.com
'400' => 'Something is wrong with the data sent to Foreman server', | ||
'401' => 'Often this is caused by invalid Oauth credentials', | ||
'404' => 'The requested resource was not found', | ||
'500' => 'Check /var/log/foreman/production.log on Foreman server for detailed information', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since #28384 (Foreman 2.2) there is a rake task. If we have the request ID (available as the X-Request-Id
header) we can recommend REQUEST_ID=... foreman-rake errors:fetch_log
instead of the precise error. I don't have experience with this myself, but if we want to make it easy to debug we can be much more precise nowadays. This task has to run on the Foreman server but the registration can happen from any server so we can't directly run it.
Note that Apache can also return a HTTP 500 in which case you need to look in the Apache logs. That was more common with Passenger (for example, when Rails failed to even boot) but may still happen.
Thanks @ekohl . I'm not fully satisfied with that wording either (and I'm not sure that I'll ever be). Thinking longer term, I think the best approach is to have links in these explanation messages to community wikis that can be much more comprehensive in covering various cases, with the option of downstream releases replacing those links with more specifically applicable ones for their community. Often the workflow for an user encountering one of these errors will be to copy and paste it into a search engine, so we might as well remove that step for them and just send them directly to the relevant webpage where we can really cover all possible causes in order of likelihood, and they can hopefully expand on it as new causes are discovered. I'd like to consider this as a parking lot item to revisit in the future. For now, I'll merge it considering it's still an improvement over the current state. |
draft PR