Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

salt: Avoid duplicating static pod manifests #3003

Merged
merged 1 commit into from
Jan 11, 2021

Conversation

gdemonet
Copy link
Contributor

@gdemonet gdemonet commented Dec 28, 2020

When using metalk8s.static_pod_managed, we call file.managed behind
the scenes. This state does a lot of magic, including creating a
temporary file with the new contents before replacing the old file.
This temp file gets created in the same directory as the managed
file by default, so it gets picked up by kubelet as if it were
another static Pod to manage. If the replacement occurs too late,
kubelet may have already created another Pod for the temp file, and
may not be able to "remember" the old Pod, hence not cleaning it up.
This results in "rogue containers", which can create issues (e.g.
preventing new containers from binding some ports on the host).

This commit reimplements the 'file.managed' state in a minimal fashion,
to ensure the temporary file used for making an "atomic replace" is
ignored by kubelet. Note that it requires us to also reimplement the
'file.manage_file' execution function, since it always relies on the
existing "atomic copy" operation from salt.utils.files.copyfile.

Fixes: #2840

@gdemonet gdemonet added kind:bug Something isn't working topic:deployment Bugs in or enhancements to deployment stages topic:flakiness Some test are flaky and cause CI to do transient failing complexity:easy Something that requires less than a day to fix severity:medium Medium impact (usability) on live deployments topic:salt Everything related to SaltStack in our product labels Dec 28, 2020
@gdemonet gdemonet requested a review from a team December 28, 2020 09:47
@bert-e

This comment has been minimized.

@bert-e

This comment has been minimized.

@gdemonet gdemonet force-pushed the bugfix/2840-static-pods-tmp-files branch from 4f61d7c to 45e461e Compare December 28, 2020 11:08
@gdemonet
Copy link
Contributor Author

Received a comment from someone in PTO (😅 ), which points out that writing the temporary file to another directory (e.g. /tmp) may break the atomicity of replacing the original file (if /tmp is not on the same device as /etc/kubernetes/manifests, which is very likely for tmpfs).

Since kubelet ignores Pod manifests with filenames starting with a dot (see here), I'll change the approach to use this behaviour instead.

@gdemonet
Copy link
Contributor Author

After investigating what file.managed does behind the scenes (mostly calling out to the file.manage_file execution method), two things arose:

  • tmp_dir would not have been used anyway, because this only serves for the check_cmd (which we don't use)
  • given the log we get from kubelet when this error happens:
Unable to process watch event: can't process config file "/etc/kubernetes/manifests/repositories.yamlcsl_iuli": open /etc/kubernetes/manifests/repositories.yamlcsl_iuli: no such file or directory

It appears this file is created by the call to salt.utils.files.mkstemp, with prefix set to the destination filename and dir to the destination directory. This call is made from salt.utils.files.copyfile, and there is no way around this behaviour.

In conclusion, I'll need to write a custom module for managing these manifests as we want 😞 .

@@ -76,6 +77,7 @@ def static_pod_managed(name,
context or {},
config_digest=config_digest, metalk8s_version=metalk8s_version
),
tmp_dir=kwargs.pop("tmp_dir", tempfile.gettempdir()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, maybe deserve a small comment 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is going away, I'm going to push the latest code so you can have a look :)

@bert-e

This comment has been minimized.

@bert-e

This comment has been minimized.

@bert-e

This comment has been minimized.

@gdemonet gdemonet force-pushed the bugfix/2840-static-pods-tmp-files branch from 45e461e to f2ab609 Compare January 7, 2021 09:11
@bert-e

This comment has been minimized.

@gdemonet

This comment has been minimized.

@bert-e

This comment has been minimized.

@bert-e

This comment has been minimized.

@bert-e

This comment has been minimized.

@gdemonet gdemonet force-pushed the bugfix/2840-static-pods-tmp-files branch from f2ab609 to 0bb2aa1 Compare January 7, 2021 09:12
@bert-e

This comment has been minimized.

@gdemonet gdemonet marked this pull request as ready for review January 7, 2021 13:16
@gdemonet

This comment has been minimized.

@bert-e

This comment has been minimized.

@bert-e

This comment has been minimized.

@bert-e

This comment has been minimized.

@bert-e

This comment has been minimized.

@gdemonet gdemonet force-pushed the bugfix/2840-static-pods-tmp-files branch from 0bb2aa1 to e7e5ceb Compare January 7, 2021 13:41
@bert-e

This comment has been minimized.

salt/_modules/metalk8s.py Show resolved Hide resolved
salt/_modules/metalk8s.py Outdated Show resolved Hide resolved
salt/_modules/metalk8s.py Outdated Show resolved Hide resolved
salt/_modules/metalk8s.py Show resolved Hide resolved
salt/_states/metalk8s.py Show resolved Hide resolved
salt/_states/metalk8s.py Outdated Show resolved Hide resolved
@NicolasT
Copy link
Contributor

NicolasT commented Jan 7, 2021

May want to update PR description with current commit message to avoid confusion.

When using `metalk8s.static_pod_managed`, we call `file.managed` behind
the scenes. This state does a lot of magic, including creating a
temporary file with the new contents before replacing the old file.
This temp file gets created **in the same directory** as the managed
file by default, so it gets picked up by `kubelet` as if it were
another static Pod to manage. If the replacement occurs too late,
`kubelet` may have already created another Pod for the temp file, and
may not be able to "remember" the old Pod, hence not cleaning it up.
This results in "rogue containers", which can create issues (e.g.
preventing new containers from binding some ports on the host).

This commit reimplements the 'file.managed' state in a minimal fashion,
to ensure the temporary file used for making an "atomic replace" is
ignored by kubelet. Note that it requires us to also reimplement the
'file.manage_file' execution function, since it always relies on the
existing "atomic copy" operation from `salt.utils.files.copyfile`.

Fixes: #2840
@gdemonet gdemonet force-pushed the bugfix/2840-static-pods-tmp-files branch from 4af066c to b799483 Compare January 8, 2021 09:04
@gdemonet gdemonet requested review from NicolasT and a team January 8, 2021 09:04
@gdemonet

This comment has been minimized.

@bert-e

This comment has been minimized.

@bert-e

This comment has been minimized.

@bert-e
Copy link
Contributor

bert-e commented Jan 8, 2021

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • one peer

Peer approvals must include at least 1 approval from the following list:

The following reviewers are expecting changes from the author, or must review again:

@bert-e
Copy link
Contributor

bert-e commented Jan 8, 2021

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • one peer

Peer approvals must include at least 1 approval from the following list:

@gdemonet
Copy link
Contributor Author

gdemonet commented Jan 8, 2021

/reset

@gdemonet
Copy link
Contributor Author

gdemonet commented Jan 8, 2021

/approve

@bert-e
Copy link
Contributor

bert-e commented Jan 8, 2021

Reset complete

I have successfully deleted this pull request's integration branches.

@bert-e
Copy link
Contributor

bert-e commented Jan 8, 2021

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3
  • development/2.4
  • development/2.5
  • development/2.6

You can set option create_pull_requests if you need me to create
integration pull requests in addition to integration branches, with:

@bert-e create_pull_requests

The following options are set: approve

@bert-e
Copy link
Contributor

bert-e commented Jan 8, 2021

Build failed

The build for commit did not succeed in branch bugfix/2840-static-pods-tmp-files.

The following options are set: approve

@bert-e
Copy link
Contributor

bert-e commented Jan 8, 2021

In the queue

The changeset has received all authorizations and has been added to the
relevant queue(s). The queue(s) will be merged in the target development
branch(es) as soon as builds have passed.

The changeset will be merged in:

  • ✔️ development/2.7

  • ✔️ development/2.8

The following branches will NOT be impacted:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3
  • development/2.4
  • development/2.5
  • development/2.6

There is no action required on your side. You will be notified here once
the changeset has been merged. In the unlikely event that the changeset
fails permanently on the queue, a member of the admin team will
contact you to help resolve the matter.

IMPORTANT

Please do not attempt to modify this pull request.

  • Any commit you add on the source branch will trigger a new cycle after the
    current queue is merged.
  • Any commit you add on one of the integration branches will be lost.

If you need this pull request to be removed from the queue, please contact a
member of the admin team now.

The following options are set: approve

@bert-e
Copy link
Contributor

bert-e commented Jan 11, 2021

I have successfully merged the changeset of this pull request
into targetted development branches:

  • ✔️ development/2.7

  • ✔️ development/2.8

The following branches have NOT changed:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3
  • development/2.4
  • development/2.5
  • development/2.6

Please check the status of the associated issue None.

Goodbye gdemonet.

@bert-e bert-e merged commit 776668e into development/2.7 Jan 11, 2021
@bert-e bert-e deleted the bugfix/2840-static-pods-tmp-files branch January 11, 2021 20:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
complexity:easy Something that requires less than a day to fix kind:bug Something isn't working severity:medium Medium impact (usability) on live deployments topic:deployment Bugs in or enhancements to deployment stages topic:flakiness Some test are flaky and cause CI to do transient failing topic:salt Everything related to SaltStack in our product
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Salt master not deploy correctly because another salt master is already running.
5 participants