NixOS AMI executes user data on restart #41826

Open
shmish111 opened this issue Jun 11, 2018 · 17 comments

@shmish111
Contributor

Issue description

USER_DATA is executed again when an EC2 instance is restarted. This is contrary to the AWS documentation and to general practice. It caused me some big problems, as I assumed this wouldn't happen.

Steps to reproduce

  1. Start an EC2 instance with some configuration.nix user data
  2. nixos-rebuild the machine with a different configuration
  3. Restart the machine

Expected outcome

User data is not executed and machine state remains as it was before reboot

Actual outcome

Machine configuration is rolled back to the user data version

Technical details

Please see "View and Update the Instance User Data" in https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html

@copumpkin
Member

Whoops! Not going to have time to look into this for a few days at least, so if you want to take a stab at it, most of the logic for this is in here.

Easiest solution is probably just to touch /root/.initialized and then skip the rebuild if it already exists. We do have a nice VM test for this functionality so it should also be fairly easy to make sure it's doing the right thing.
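
A minimal sketch of that marker-file idea, expressed as a NixOS configuration fragment rather than as a patch to the amazon-init script. This is not the upstream implementation: the use of ConditionPathExists and ExecStartPost is an assumption about one way to get the "run once" behaviour, and the fragment would presumably have to be part of the configuration supplied as user data, since that is what the system becomes after the first rebuild.

```nix
{ pkgs, ... }:
{
  # Hypothetical guard around the existing amazon-init unit.
  systemd.services.amazon-init = {
    # systemd skips the unit when the marker file already exists
    # (the leading "!" negates the condition).
    unitConfig.ConditionPathExists = "!/root/.initialized";

    # For a oneshot service this runs only after the main script
    # has exited successfully, so a failed first run is retried.
    serviceConfig.ExecStartPost = "${pkgs.coreutils}/bin/touch /root/.initialized";
  };
}
```

With this in place, deleting /root/.initialized by hand would be the way to ask for user data to be applied again on the next boot.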

@coretemp
Contributor

Just use cloud-init, because then this logic doesn't need to be in NixOS anymore.

On this topic, I think we should also have recommendations as to how to use this feature if at all, because running nixos-rebuild can be a slow operation (not something you would want to do if you have 100s/1000s of machines).

@edolstra
Member

Cloud-init is too bloated, see #39076 (comment).

@copumpkin
Member

I've also written plugins for cloud-init (which we'd need here) and it's kind of a miserable and undocumented project. I was not impressed. And of course we'd need to wrap our user data in YAML and reimplement most of their existing YAML support, because it wouldn't work on our platform: you can list users and such, and we'd need to translate that into our declarative config, since their default implementation just calls useradd and the like.

@coretemp
Contributor

Due to political considerations (Canonical creates cloud-init and likely cannot allocate people who could implement this with acceptable quality), I retract my suggestion for cloud-init.

@chris-martin
Contributor

"It caused me some big problems as I assumed this wouldn't happen."

INDEED

@stale

stale bot commented Jun 3, 2020

Thank you for your contributions.

This has been automatically marked as stale because it has had no activity for 180 days.

If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.

Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 3, 2020
@shmish111
Contributor Author

This hasn't been a problem for me recently as I've not been restarting things but has this been fixed? @copumpkin do you know?

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 3, 2020
@endgame
Contributor

endgame commented Jul 26, 2020

I confirmed that this is still an issue, by doing the following:

  1. Launched the NixOS AMI in us-east-1, setting the hostname in the instance userdata.
  2. ssh'd into the instance and edited /etc/nixos/configuration.nix to have a different hostname.
  3. Rebooted the instance.
  4. ssh'd back in, and observed that the hostname was unchanged. Moreover, /etc/nixos/configuration.nix was replaced with the one from the instance's userdata.

@endgame
Contributor

endgame commented Jul 26, 2020

Ah, but there is a way to control this: you can set systemd.services.amazon-init.enable = false; in configuration.nix.

Is this documented anywhere?
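
For reference, a minimal sketch of a configuration.nix carrying that setting. Only the option line comes from the comment above. Placing it in the configuration passed as user data is an assumption: the idea would be that the first boot still applies the user data, and the resulting system no longer runs amazon-init on later boots.

```nix
{
  # Setting quoted from the comment above: disables the amazon-init
  # unit so user data is not re-applied when the instance boots.
  systemd.services.amazon-init.enable = false;
}
```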

@stale

stale bot commented Jan 23, 2021

I marked this as stale due to inactivity. → More info

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jan 23, 2021
@bryanasdev000
Member

Still important to me.

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jan 25, 2021
@chris-martin
Contributor

I don't see "amazon-init" mentioned in any documentation at all, and the options search doesn't turn anything up.

@endgame
Contributor

endgame commented Jun 27, 2021

The option does have a doc string (not a useful one), but it probably doesn't get pulled in, as the AMI is built from https://github.com/NixOS/nixpkgs/blob/b99a500a045e73fcabfb141b33fe0b9021966040/nixos/maintainers/scripts/ec2/amazon-image.nix, which imports another amazon-image.nix:

imports = [ ../../../modules/virtualisation/amazon-image.nix ];

Which imports amazon-init.nix:

imports = [ ../profiles/headless.nix ./ec2-data.nix ./amazon-init.nix ];

Which declares the service:

options.virtualisation.amazon-init = {
  enable = mkOption {
    default = true;
    type = types.bool;
    description = ''
      Enable or disable the amazon-init service.
    '';
  };
};
config = mkIf cfg.enable {
  systemd.services.amazon-init = {
    inherit script;
    description = "Reconfigure the system from EC2 userdata on startup";
    wantedBy = [ "multi-user.target" ];
    after = [ "multi-user.target" ];
    requires = [ "network-online.target" ];
    restartIfChanged = false;
    unitConfig.X-StopOnRemoval = false;
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
    };
  };
};

I suspect that a nix expression is evaluated to generate documentation for options, and that evaluation call probably does not include these extra modules. Perhaps a doc PR is the best we can hope for here?

@stale

stale bot commented Jan 9, 2022

I marked this as stale due to inactivity. → More info

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jan 9, 2022
@endgame
Contributor

endgame commented Jan 9, 2022

Still important, still need (at least) some documentation, and you're an annoying bot so please be quiet.

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jan 9, 2022
@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jul 10, 2022
@AliSajid

Bumping so this becomes fresh again. This is still an issue I'm dealing with.

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 10, 2024