Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reboot: Fix typo and support bare Linux systems #45607

Merged
merged 2 commits into from
Sep 17, 2018

Conversation

dagwieers
Copy link
Contributor

@dagwieers dagwieers commented Sep 13, 2018

SUMMARY

This fixes a problem for bare Linux systems that do not support 'who -b' or 'uptime -s'.
For example, OpenELEC does not ship with who and does not support uptime -s.

Probably needs a backport to Linux v2.7.

ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME

reboot

ANSIBLE VERSION

v2.8 and v2.7

This fixes a problem for bare Linux systems that do not support 'who -b' or 'uptime -s'.
@dagwieers dagwieers added affects_2.7 This issue/PR affects Ansible v2.7 affects_2.8 This issue/PR affects Ansible v2.8 labels Sep 13, 2018
lib/ansible/plugins/action/reboot.py Outdated Show resolved Hide resolved
if command_result['rc'] != 0:
raise AnsibleError("%s: failed to get host boot time info, rc: %d, stdout: %s, stderr: %s"
% (self._task.action, command_result.rc, to_native(command_result['stdout']), to_native(command_result['stderr'])))
% (self._task.action, command_result['rc'], to_native(command_result['stdout']), to_native(command_result['stderr'])))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would aggregate both errors here, both 'uptime' and 'cat /proc..'

Copy link
Contributor

@samdoran samdoran Sep 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with this. Don't want to swallow the output from the initial command.

Long term, it feels like we may need to change distribution to being an attribute, that way we can run cat ... for Linux hosts, and use who -b for others, removing the need for uptime -s and a hard coded tree of commands to attempt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed it so we collect all the stdout and stderr information in case we need it for the AnsibleError.

The problem with having only a single guaranteed command per OS is

  1. You would need to know the platform/OS you're working with
  2. I don't think it's that clear-cut

So this fallback method, while not necessarily 100% clean, it does the job effectively.
And who -b and uptime -s are very sensible things to do before resorting to the random boot_id. (Which may not be present in all Linux kernels either ?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My next question was going to be if the random boot ID is widely available. I guess I'll have to do some research to see when it was added.

I created a WIP PR (#45656) exploring what it would look like to add platform and dist to the plugin. I'm not terribly happy with it, but it's a start (and it accounts for Alpine, which is particularly annoying since it lacks both who and uptime for no good reason (IMO)).

That PR also changes to using the random boot ID by default rather than who -b. That has the nice side effect of being able to remove the uptime -s command for systems that lack a RTC, but at the cost of all the added complexity of platfrom and dist checking.

@ansibot ansibot added bug This issue/PR relates to a bug. support:core This issue/PR relates to code supported by the Ansible Engineering Team. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. labels Sep 13, 2018
if command_result['rc'] != 0:
raise AnsibleError("%s: failed to get host boot time info, rc: %d, stdout: %s, stderr: %s"
% (self._task.action, command_result.rc, to_native(command_result['stdout']), to_native(command_result['stderr'])))
% (self._task.action, command_result['rc'], to_native(command_result['stdout']), to_native(command_result['stderr'])))
Copy link
Contributor

@samdoran samdoran Sep 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with this. Don't want to swallow the output from the initial command.

Long term, it feels like we may need to change distribution to being an attribute, that way we can run cat ... for Linux hosts, and use who -b for others, removing the need for uptime -s and a hard coded tree of commands to attempt.

@dagwieers
Copy link
Contributor Author

Please re-review !

@ansibot ansibot added support:community This issue/PR relates to code supported by the Ansible community. and removed support:core This issue/PR relates to code supported by the Ansible Engineering Team. labels Sep 14, 2018
Copy link
Contributor

@samdoran samdoran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a changelog fragment as well.

command_result = self._low_level_execute_command('uptime -s', sudoable=self.DEFAULT_SUDOABLE)

# This is a last resort for bare Linux systems (e.g. OpenELEC) where 'who -b' or 'uptime -s' are not supported.
# Other options like parsing /proc/uptime or default uptime output are less reliable than this
if command_result['rc'] != 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue with this is that win_reboot uses this same method. So if the windows DEFAULT_BOOT_TIME_COMMAND results in an rc != 0, it will then try this cat command, which will gloriously fail.

The previous conditional which runs uptime -s won't ever match on Windows because that string will never match the output of (Get-WmiObject -ClassName Win32_OperatingSystem).LastBootUpTime.

This is the point where it would be helpful to have platform/dist as a class attribute, either as an additional check here, or to look up the correct boot time command based on platform/dist. I'm also open to other ideas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind if a fallback method fails though. As long as we cover all bases so in the end the function always works. For win_reboot it could be a lot easier if we have a list of commands, for both Windows and Unix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my WIP PR, I did change the "get the last boot time" command to be a variable, so that will remove the hard coding here.

Will fix in future™.

Still, I can see it being confusing to someone running win_reboot (assuming the first command failed to get the last boot time) if they get this error:

win_reboot: failed to get host boot time info, rc: 1, stdout: 20180913204044.576906+000
, stderr: cat : Cannot find path 'C:\proc\sys\kernel\random\boot_id' because it does not exist.
At line:1 char:1
+ cat /proc/sys/kernel/random/boot_id
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : ObjectNotFound: (C:\proc\sys\kernel\random\boot_id:String) [Get-Content], ItemNotFoundEx
ception
+ FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetContentCommand

Evidently Windows has a cat command now!?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the distinction between Windows and Unix is now easy because it's a different entry-point. So we don't need to do a cat on Windows. That was my main point. But I also don't mind having the fallback mechanism on Unix, that first does who -b, then uptime -s, and then cat /.../boot_id. Which to me is quite sensible as well. It's a trade-off for simplicity and in most cases it only requires a single command who -b. If you implement support for getting the platform, you will always have to run at least 2 commands.

So basically what I am saying is that in general the fallback mechanism is more efficient than targeting based on platform.

Copy link
Contributor Author

@dagwieers dagwieers Sep 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yes, PowerShell has aliases for common Unix commands, which is freaking a lot of people out (because they are not drop-in replacements). https://daniel.haxx.se/blog/2016/08/19/removing-the-powershell-curl-alias/

@samdoran samdoran merged commit a7a99c5 into ansible:devel Sep 17, 2018
@@ -94,18 +94,31 @@ def construct_command(self):
return reboot_command

def get_system_boot_time(self):
stdout = ''
stderr = ''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this might not be py2/py3 compatible, we might want to force bytes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are correct. One day I'll master that distinction. 😄

@dagwieers
Copy link
Contributor Author

@samdoran I assumed your own PR would take over this. That's why I didn't bother doing the fragment.

@samdoran
Copy link
Contributor

@dagwieers I wanted to get this fix in. My PR is probably too big a change to get in the short term, plus it's probably in the feature request territory.

@dagwieers
Copy link
Contributor Author

Ok. Sorry for the confusion.

@samdoran
Copy link
Contributor

Not at all. Thank you for finding and fixing this.

samdoran pushed a commit to samdoran/ansible that referenced this pull request Sep 28, 2018
* reboot: Fix typo and support bare Linux systems

This fixes a problem for bare Linux systems that do not support 'who -b' or 'uptime -s'.

* Accumulate stdout and stderr information

(cherry picked from commit a7a99c5)
abadger pushed a commit that referenced this pull request Oct 9, 2018
* reboot: Fix typo and support bare Linux systems

This fixes a problem for bare Linux systems that do not support 'who -b' or 'uptime -s'.

* Accumulate stdout and stderr information

(cherry picked from commit a7a99c5)
@ansible ansible locked and limited conversation to collaborators Jul 22, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
affects_2.7 This issue/PR affects Ansible v2.7 affects_2.8 This issue/PR affects Ansible v2.8 bug This issue/PR relates to a bug. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. support:community This issue/PR relates to code supported by the Ansible community.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants