feat(disk_setup) add timeout #4673
Force-pushed from 1bbb980 to 890dd47.
Ugh. This problem demonstrates that cloud-init's current code in this area is a mess. The proposed code might work here, but I can't say that I like it.
Plus, if you're using a systemd-based instance, you should be able to accomplish this already using systemd's builtin mount point generation from fstab. Something like this should do it:

    mounts:
    - ["/dev/disk/by-label/jenkins", "/var/lib/jenkins", "ext4", "defaults,nofail,x-systemd.makefs"]

Since systemd creates mount units from fstab, and mount units get automatically ordered, I think the above should "just work". At least, that would work if cloud-init's

Thanks for proposing this @flokli. This might be an interim solution that we can take, but I think a more elegant solution is desirable long term. |
Using The fstab line creates a This is not an issue on the second boot, or if I re-attempt to reach multi-user.target (or just restart the .service unit using the data disk), but the only way to get this working straight from the first boot was to do the mounting via cloud-init, not systemd. Let me know if you'd be fine to accept this; I'm happy to then take another pass and fix the linters. |
You're right that depending on only the generator output wouldn't work for first boot. I didn't consider the initial transaction failure due to the missing device unit. However, the .device unit should later be generated by systemd-udev[1], which will cause subsequent transaction re-calculation to succeed. I think that this is why restarting the .service unit, a second boot, and re-attempting multi-user.target all worked. I think that An alternative that could take advantage of better systemd ordering would be to use
[1] systemd.device
|
Your proposal should be easy to review, so for now if you want to move forward with this approach, please make sure to add the new key to the jsonschema. |
Force-pushed from 890dd47 to eadf16e.
I added the field to the jsonschema, addressed the linter warning and renamed |
Force-pushed from eadf16e to fa956a9.
In a cloud environment, sometimes disks will attach while cloud-init is running and get missed. This adds a configurable timeout to wait for those disks. Signed-off-by: Florian Klink <flokli@flokli.de>
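The idea in this commit message, waiting a bounded time for a late-attaching disk, can be illustrated with a minimal polling sketch. This is not the code from the PR: the function name `wait_for_device`, its parameters, and the default values are all hypothetical, and it assumes that checking for the device path is a good enough signal that the disk has attached.

```python
import os
import time


def wait_for_device(path, timeout=30.0, interval=0.5):
    """Poll until `path` exists or `timeout` seconds have elapsed.

    Hypothetical helper for illustration; not cloud-init's implementation.
    Returns True if the path appeared in time, False otherwise.
    """
    # Use a monotonic clock so wall-clock adjustments during boot
    # (for example an NTP step) cannot shorten or extend the wait.
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```

A caller would check the return value before partitioning or formatting, for example `if not wait_for_device("/dev/disk/by-label/jenkins", timeout=60): ...`, and skip or report the device when the wait times out.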
Force-pushed from fa956a9 to 9b2e3dc.
@flokli I don't see your username |
Sorry, please check again, just did it. |
Thanks for this contribution @flokli, and welcome to cloud-init!
@flokli Sorry about this, one more thing. I just noticed that commit 9b2e3dc has @nibalizer as the author, but @flokli's DCO. This project requires a CLA, not a DCO. Whose code is this, @flokli's or @nibalizer's? If @nibalizer's, I need you to sign the CLA please. Apologies for the hold-up on this. I wish this wasn't part of the process, but unfortunately it is. |
Correct, the changes were resurrected from @nibalizer's PR, that's mentioned in the PR description:
|
I assume you need a CLA, not a DCO. I opened this PR while at IBM. At the time, I don't think IBM had signed a CCLA with Canonical. I no longer work there. Is it possible we could get this in under "trivial commit"? Or perhaps all this code is authored by @flokli, who can sign the CLA. |
I realized you actually can change the timeout of .device by using the |
Apparently canonical/cloud-init#4673 and more hacks are not needed, we can simply ramp up the timeout that systemd is willing to wait for the .device unit to appear.
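The exact option is cut off in the comment above, so the following is only a sketch under an assumption: it uses `x-systemd.device-timeout=`, which systemd.mount(5) documents as an fstab option controlling how long systemd waits for a device to show up. The helper name `fstab_line`, the `5min` value, and the default `fs_freq`/`fs_passno` values are hypothetical. It shows how a cloud-config `mounts` entry like the one proposed earlier in this thread maps onto the six fstab fields, with the timeout appended to the mount options.

```python
def fstab_line(entry, device_timeout=None):
    """Render a cloud-config style `mounts` entry as one fstab line.

    `entry` is [fs_spec, fs_file, fs_vfstype, fs_mntops, (fs_freq), (fs_passno)].
    Hypothetical helper for illustration; the "0" and "2" defaults below
    are assumptions, not values taken from this PR.
    """
    fs_spec, fs_file, fs_vfstype, fs_mntops = entry[:4]
    fs_freq = entry[4] if len(entry) > 4 else "0"
    fs_passno = entry[5] if len(entry) > 5 else "2"
    if device_timeout is not None:
        # Assumed option: documented in systemd.mount(5) for fstab entries.
        fs_mntops = f"{fs_mntops},x-systemd.device-timeout={device_timeout}"
    return " ".join([fs_spec, fs_file, fs_vfstype, fs_mntops, fs_freq, fs_passno])


entry = [
    "/dev/disk/by-label/jenkins",
    "/var/lib/jenkins",
    "ext4",
    "defaults,nofail,x-systemd.makefs",
]
print(fstab_line(entry, device_timeout="5min"))
```

Whether raising this timeout is enough on first boot depends on the rest of the boot transaction, which the following comments discuss.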
@flokli Let me know if that works. It looks like the existence of the .device unit is required for this command (much like |
Yes, that's what I meant.
I'll check internally, but I'm not sure that there is much that we can do unless we can get a CLA signature from IBM. Any insight would be appreciated. |
You need to follow one more reference ;-) The approach using In the end I gave up on this journey, and resorted to the I assume keeping systemd in the dark about the mountpoints, and doing all the waiting / filesystem creation / mounting in cloud-init, would also work, so I still think this PR is useful to have. |
you're right again :P
oof
👍, glad you got something figured out for your use case
Agreed |
Apparently canonical/cloud-init#4673 and more hacks are not needed, we can simply ramp up the timeout that systemd is willing to wait for the .device unit to appear. Signed-off-by: Florian Klink <flokli@flokli.de>
Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close. If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon. (If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.) |
Setting this to changes requested for now. Conversations with legal don't appear to be moving quickly, but this is still something we'd like to add.
Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close. If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon. (If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.) |
We're still waiting on the CLA for this PR, correct? Any updates? |
This PR is stuck on this. I don't work for IBM, and didn't plan to. If this can't be considered a trivial contribution that doesn't need a CLA, someone at Canonical needs to convince IBM to sign this. |
Proposed Commit Message
Additional Context
This is #710 resurrected.
Fixes #3386.
Test Steps
Tested with the following cloud-config file on Azure:
I used an azurerm_virtual_machine_data_disk_attachment Terraform resource, which attaches the disk partway through bootup due to hashicorp/terraform-provider-azurerm#6117, but this should also work by setting a reasonably large timeout and manually attaching the disk (at that LUN).
Checklist
Merge type