Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable provisioning with password #57

Merged
merged 3 commits into from
Mar 18, 2024
Merged

Conversation

dongsupark
Copy link
Collaborator

@dongsupark dongsupark commented Mar 13, 2024

Password authentication by itself not as secure as ssh key. For better security, we should disable password authentication. So if user enabled password authentication in Azure, azure-init now simply fails to provision.

Clean up unnecessary functions in libazureinit, like mount_media, unmount_media, allow_password_authentication.

Fixes #52.

Testing done

Manual test is done.

src/main.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
Copy link
Member

@jeremycline jeremycline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few style nits which you should feel free to ignore if you like 😄

It would be nice to have a functional test that attempts to provision a host with a password and demonstrate it fails, as well, but that might need a bit more thought on how testing should be organized and whatnot.

src/main.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
@dongsupark
Copy link
Collaborator Author

Tested this PR manually in Azure, with help of the existing script demo/image_creation.sh.
(Though some tweak was needed for running that script.)

A provisioning with a ssh key seems to work well, at least no regression. On that machine, a systemd unit azure-init.service runs well, with an empty password given.

I did not yet manage to actually test the case of failing when a non-empty password.

Copy link
Member

@jeremycline jeremycline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me.

Comment on lines +20 to +23
match imds::get_username(imds_body.clone()) {
Ok(username) => Ok(username),
Err(_err) => Err("Failed to get username".into()),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a case I missed that could also be a map_err, but it's up to you if you want to adjust it. I'm touching all the errors in #59 anyway and need to rebase it after this is merged so I can adjust it there.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks a little tricky touch that in this PR, because main does not return error in this context.
It might be probably better to touch in your PR.

Password authentication by itself not as secure as ssh key.
For better security, we should disable password authentication.
So if user enabled password authentication in Azure, azure-init now
simply fails to provision.

Clean up unnecessary functions in libazureinit, like mount_media,
unmount_media, allow_password_authentication.
…bled

Rename function name from get_provision_with_password to
is_password_authentication_disabled, because that function actually
returns true if the password authentication is disabled. The original
name is rather the opposite of the actual meaning.
@dongsupark
Copy link
Collaborator Author

I did not yet manage to actually test the case of failing when a non-empty password.

Also tested the scenario, by passing --admin-password **** to az vm create command.
It works as expected.

@anhvoms
Copy link
Contributor

anhvoms commented Mar 15, 2024

Tested this PR manually in Azure, with help of the existing script demo/image_creation.sh. (Though some tweak was needed for running that script.)

A provisioning with a ssh key seems to work well, at least no regression. On that machine, a systemd unit azure-init.service runs well, with an empty password given.

I did not yet manage to actually test the case of failing when a non-empty password.

You can try to deploy a VM with boot diagnostic enabled, using password. The deployment will fail in about 20 minutes due to OS Provisioning timeout (because the agent returned an error). Assuming the output from azure-init makes it to serial console (if not we should make any err/warn/into output go to console to aid with debugging) we should see the error about non-empty pasword

@dongsupark
Copy link
Collaborator Author

You can try to deploy a VM with boot diagnostic enabled, using password. The deployment will fail in about 20 minutes due to OS Provisioning timeout (because the agent returned an error). Assuming the output from azure-init makes it to serial console (if not we should make any err/warn/into output go to console to aid with debugging) we should see the error about non-empty pasword

Thanks for the tip.
It turns out azure-init simply returned success even on failure, so azure-init systemd unit looked ok.
With the latest commit I pushed, now azure-init systemd unit fails when it should fail.

However, provisioning itself seems to run even when the systemd unit failed.
That's why I could not see such a scenario of timeout with 20 min upon provisioning failure.
I assume it has something to do with the issue you created a few days ago, right?
I think that's all I can do for now with this PR.

Currently azure-init does not fail, even when it should fail with error.
That's why azure-init systemd service simply keeps running even on
failure. Make it exit with exit code, so systemd unit azure-init fails
in case of failure.
@anhvoms
Copy link
Contributor

anhvoms commented Mar 15, 2024

You can try to deploy a VM with boot diagnostic enabled, using password. The deployment will fail in about 20 minutes due to OS Provisioning timeout (because the agent returned an error). Assuming the output from azure-init makes it to serial console (if not we should make any err/warn/into output go to console to aid with debugging) we should see the error about non-empty pasword

Thanks for the tip. It turns out azure-init simply returned success even on failure, so azure-init systemd unit looked ok. With the latest commit I pushed, now azure-init systemd unit fails when it should fail.

However, provisioning itself seems to run even when the systemd unit failed. That's why I could not see such a scenario of timeout with 20 min upon provisioning failure. I assume it has something to do with the issue you created a few days ago, right? I think that's all I can do for now with this PR.

I think we should investigate why provisioning didn't timeout. Likely there's another entity/agent in the VM that was reporting health. It would be good to know what it is and if it is expected. We might want to remove such noise in our testing. It shouldn't block this PR, however. See #61

@dongsupark dongsupark merged commit b4aaf3e into main Mar 18, 2024
2 checks passed
@dongsupark dongsupark deleted the disable-password-auth branch March 18, 2024 14:52
dongsupark added a commit that referenced this pull request Mar 19, 2024
Functions mount_media and remove_media were removed in
#57. To parse a ovf_env file,
however, both mount_media and remove_media are necessary.
Bring back the functions.
dongsupark added a commit that referenced this pull request Mar 25, 2024
Functions mount_media and remove_media were removed in
#57. To parse a ovf_env file,
however, both mount_media and remove_media are necessary.
Bring back the functions.
dongsupark added a commit that referenced this pull request Mar 26, 2024
Functions mount_media and remove_media were removed in
#57. To parse a ovf_env file,
however, both mount_media and remove_media are necessary.
Bring back the functions.

Adjust return types according to the new LibError interface as well.
dongsupark added a commit that referenced this pull request Mar 27, 2024
Functions mount_media and remove_media were removed in
#57. To parse a ovf_env file,
however, both mount_media and remove_media are necessary.
Bring back the functions.

Adjust return types according to the new LibError interface as well.
dongsupark added a commit that referenced this pull request Apr 2, 2024
Functions mount_media and remove_media were removed in
#57. To parse a ovf_env file,
however, both mount_media and remove_media are necessary.
Bring back the functions.

Adjust return types according to the new LibError interface as well.

Define const PATH_MOUNT_DEVICE, PATH_MOUNT_POINT for path to mount point for
media source and target to be mounted, and replace hard-coded path with
the new const.

Co-authored-by: Jeremy Cline <jeremycline@microsoft.com>
dongsupark added a commit that referenced this pull request Apr 3, 2024
Functions mount_media and remove_media were removed in
#57. To parse a ovf_env file,
however, both mount_media and remove_media are necessary.
Bring back the functions.

Adjust return types according to the new LibError interface as well.

Define const PATH_MOUNT_DEVICE, PATH_MOUNT_POINT for path to mount point for
media source and target to be mounted, and replace hard-coded path with
the new const.

Co-authored-by: Jeremy Cline <jeremycline@microsoft.com>
dongsupark added a commit that referenced this pull request Apr 4, 2024
Functions mount_media and remove_media were removed in
#57. To parse a ovf_env file,
however, both mount_media and remove_media are necessary.
Bring back the functions.

Adjust return types according to the new LibError interface as well.

Define const PATH_MOUNT_DEVICE, PATH_MOUNT_POINT for path to mount point for
media source and target to be mounted, and replace hard-coded path with
the new const.

Co-authored-by: Jeremy Cline <jeremycline@microsoft.com>
dongsupark added a commit that referenced this pull request Apr 5, 2024
Functions `mount_media` and `remove_media` were removed in
#57.
To parse a ovf_env file, however, both mount_media and remove_media are
necessary.
Bring back the functions.

Adjust return types according to the new LibError interface as well.

Co-authored-by: Jeremy Cline <jeremycline@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RFE] Disable provisioning with password
3 participants