Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug/reattach across instance id delta #1016

Merged

Conversation

blackboxsw
Copy link
Collaborator

When VMs are cloned and relaunched in Azure and AWS, instance-ids change for the new vms.
cloud-init picks up on those instance-id changes and re-runs. UA auto-attach should be cognizant of a delta in instance-id and re-attach on Pro instances.

LP: #1867573

@blackboxsw
Copy link
Collaborator Author

LP:1867573

@blackboxsw blackboxsw force-pushed the bug/reattach-across-instance-id-delta branch 2 times, most recently from 1cc21fa to 49c1d8a Compare March 17, 2020 17:39
Copy link
Contributor

@OddBloke OddBloke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, Chad. Overall in pretty good shape, I have a few questions about particulars inline.

uaclient/cli.py Show resolved Hide resolved
uaclient/cli.py Outdated
@@ -384,10 +403,13 @@ def _get_contract_token_from_cloud_identity(cfg: config.UAConfig) -> str:
ua_status.MESSAGE_UNSUPPORTED_AUTO_ATTACH
)
raise e
instance_id_file = os.path.join(cfg.data_dir, "instance-id")
if current_iid:
util.write_file(instance_id_file, current_iid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be using UAConfig.write_cache?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d'oh

iid_file: str = CLOUDINIT_INSTANCE_ID_FILE
) -> "Optional[str]":
"""Query cloud instance-id from cmdline or CLOUDINIT_INSTANCE_ID_FILE"""
if util.which("cloud-id"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using cloud-ids presence as a guard makes sense in get_cloud_type because it's then called there. Using its presence as a proxy for cloud-init version feels a little less obvious. Would either checking the Ubuntu release version, or explicitly checking for a lower-bound cloud-init version work instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree here. Moving it to series-based trusty/non-trusty as we know that every other supported series has a recent enough cloud-init to have this query support. I had avoided calling cloud-init query specifically as trusty cloud-init query is a mangled/useless mess for that subcommand.



# Mapping of datasource names to cloud-id responses. Trusty compat with Xenial+
DATASOURCE_TO_CLOUD_ID = {"azurenet": "azure", "ec2": "aws"}


def get_instance_id(
iid_file: str = CLOUDINIT_INSTANCE_ID_FILE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only used in testing, so perhaps:

Suggested change
iid_file: str = CLOUDINIT_INSTANCE_ID_FILE
_iid_file: str = CLOUDINIT_INSTANCE_ID_FILE

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks grabbed it

return util.load_file(iid_file)
else:
logging.warning(
"Unable to determine current instance-id from %s" % iid_file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to perform the substitution in logging calls:

Suggested change
"Unable to determine current instance-id from %s" % iid_file
"Unable to determine current instance-id from %s", iid_file

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

if os.path.exists(iid_file):
return util.load_file(iid_file)
else:
logging.warning(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nit, but given that the if above returns unconditionally, we don't need else or an extra level of indentation here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

uaclient/cli.py Outdated
if current_iid == prev_iid:
raise exceptions.AlreadyAttachedError(cfg)
print("Re-attaching Ubuntu Advantage subscription on new instance")
_detach(cfg, assume_yes=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may return 1 on failure instead of raising an exception (though currently it won't with assume_yes=True). Should that case be handled here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So a detach fails during auto-attach on new-instance, we would have not disabled and re-enabled X and X is left with potentially old credentials for X if re-attach was going to change those creds.

That said, creds are per account I thought for each service X currently, so practically it doesn't matter at the moment, but could matter in the future if creds evolve to per machine per account.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about that specific case. My concern is more general: the interface of _detach is that on failure, it will return 1 or raise an exception. This code as-written handles the exception failure case (because it doesn't intercept them, so they will bubble out), but it silently ignores the return 1 case.

As it happens, today, _detach won't return 1 if assume_yes=True, but we can't necessarily assume that will continue to be the case in future, and we shouldn't expect people to check every call site of a function when changing its behaviour within its already-defined interface.

(My assumption here is that we care that the detach failed, because we aren't catching and ignoring exceptions here.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what you are getting at @OddBloke?

diff --git a/uaclient/cli.py b/uaclient/cli.py
index bd37c5d..0282c34 100644
--- a/uaclient/cli.py
+++ b/uaclient/cli.py
@@ -399,8 +399,10 @@ def _get_contract_token_from_cloud_identity(cfg: config.UAConfig) -> str:
         if current_iid == prev_iid:
             raise exceptions.AlreadyAttachedError(cfg)
         print("Re-attaching Ubuntu Advantage subscription on new instance")
-        _detach(cfg, assume_yes=True)
-
+        if not _detach(cfg, assume_yes=True):
+            raise exceptions.UserFacingError(
+                ua_status.MESSAGE_DETACH_AUTOMATION_FAILURE
+            )
     contract_client = contract.UAContractClient(cfg)
     try:
         tokenResponse = contract_client.request_auto_attach_contract_token(
diff --git a/uaclient/status.py b/uaclient/status.py
index de11cab..6df7d44 100644
--- a/uaclient/status.py
+++ b/uaclient/status.py
@@ -203,6 +203,7 @@ MESSAGE_ENABLE_BY_DEFAULT_TMPL = "Enabling default service {name}"
 MESSAGE_ENABLE_BY_DEFAULT_MANUAL_TMPL = """\
 Service {name} is recommended by default. Run: sudo ua enable {name}"""
 MESSAGE_DETACH_SUCCESS = "This machine is now detached"
+MESSAGE_DETACH_AUTOMATION_FAILURE = "Unable to automatically detach machine"
 
 MESSAGE_REFRESH_ENABLE = "One moment, checking your subscription first"
 MESSAGE_REFRESH_SUCCESS = "Successfully refreshed your subscription"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, something like that would be great (though I would do == 0 instead of not, because it's an int not a bool).

# Present in cloud-init on >= Xenial
out, _err = util.subp(["cloud-init", "query", "instance_id"])
return out.strip()
if os.path.exists(iid_file):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be using UAConfig.read_cache?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we should. thanks for that corrected throughout

@blackboxsw blackboxsw force-pushed the bug/reattach-across-instance-id-delta branch 2 times, most recently from a33f1bf to 381820d Compare March 17, 2020 21:12
Add identity.get_instance_id to check cloud-init's instance-id via:
 * 'cloud-init query instance_id' on Xenial and later
 * checking /var/lib/cloud/data/instance-id on Trusty

When auto-attaching, if instance-id has changed detach and reattach.
def test_use_cloud_init_query_when_non_trusty(
self, m_get_platform_info, m_subp, series
):
"""Get instance_id from cloud-init query when non not on trusty."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Get instance_id from cloud-init query when non not on trusty."""
"""Get instance_id from cloud-init query when not on trusty."""

@blackboxsw blackboxsw force-pushed the bug/reattach-across-instance-id-delta branch from 381820d to 9c0ce53 Compare March 18, 2020 17:37
@blackboxsw
Copy link
Collaborator Author

Tested on trusty Azure PRO system. and got the expected behavior:

  • first reboot after upgrade re-attaches because instance-id was None (not yet persisted)
  • 2nd reboot doesn't re-attach because current instance-id matches persisted.

uaclient/cli.py Outdated
if current_iid == prev_iid:
raise exceptions.AlreadyAttachedError(cfg)
print("Re-attaching Ubuntu Advantage subscription on new instance")
if _detach(cfg, assume_yes=True) == 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is an exit code, we can future-proof ourselves fully with:

Suggested change
if _detach(cfg, assume_yes=True) == 1:
if _detach(cfg, assume_yes=True) != 0:

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@blackboxsw blackboxsw force-pushed the bug/reattach-across-instance-id-delta branch from 9c0ce53 to 544232c Compare March 18, 2020 19:07
@blackboxsw blackboxsw force-pushed the bug/reattach-across-instance-id-delta branch from 544232c to acb3547 Compare March 18, 2020 19:54
Copy link
Contributor

@OddBloke OddBloke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants