enhancements/machine-config: add updates to PinnedImageSet #1599

hexfusion · 2024-03-15T15:16:35Z

This PR is a follow-up to #1481, adding changes to the planned v1alpha1 API and the TechPreview 4.16 implementation.

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>

hexfusion · 2024-03-15T15:45:09Z

cdoern

Looks good. The implementation makes sense to me and will not result in any major outward-facing MCO changes. Thanks for updating this!

enhancements/machine-config/pin-and-pre-load-images.md

openshift-ci · 2024-03-15T16:00:35Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: cdoern
Once this PR has been reviewed and has the lgtm label, please assign stbenjam for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

enhancements/machine-config/pin-and-pre-load-images.md

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>

openshift-ci · 2024-03-16T12:33:49Z

@hexfusion: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

sinnykumari · 2024-03-18T13:24:36Z

enhancements/machine-config/pin-and-pre-load-images.md

+`certificate_writer`. This new feature will operate independently of the
+`MachineConfig` for setting configurations. It is advantageous to not use
+`MachineConfig` because the configuration and image prefetching can happen
+across the nodes in the pool in parallel. The new controller will be watching


Now that we are introducing PinnedImageSetController, which new controller are we referring to here?

`pinned_image_manager`; will clarify if it's not the MCD itself.

Sorry, it was a typo. I meant to ask: now that we are no longer adding PinnedImageSetController, which new controller are we referring to here?

sinnykumari · 2024-03-18T14:30:01Z

enhancements/machine-config/pin-and-pre-load-images.md

+`pinned_image_manager` will:
+
+1. Begin by marking the node with an annotation to indicate the manager is
+`Working` utilizing the `nodeWriter`, similar to the `MachineConfigDaemon` `update`


Just a note: for better troubleshooting, the MCO implementation should, wherever possible, add logging to confirm that the MCP status update is happening via pinned_image_manager.

sinnykumari · 2024-03-18T15:38:47Z

enhancements/machine-config/pin-and-pre-load-images.md

-selectors in the `PinnedImageSet` custom resources.
+Once all the relevant images are successfully pinned and downloaded to the
+matching nodes, the `pinned_image_manager` will signal the completion of the
+process by invoking nodeWriter.SetDone(). This action notifies the


Maybe I am overthinking. Calling nodeWriter.SetDone() would mean that we are creating two sources where a node is marked as update completed, so we need a way to ensure that we don't do that while the MCD is also performing an update due to a rendered-config change. This probably isn't an issue for the targeted use case, but making this a general feature would require more thought.

I will reword. There is just a single sync of the MCD, so Done will only be applied by the MCD. I was trying to convey that no additional mechanisms were required, but how that will work was not described; I will flesh that out.
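One way to read the reply above is that the daemon runs one sync pass and reports a single state, so Done is written only once both the MachineConfig update and the pinned image work have finished. The sketch below illustrates that reading in Go. The names `nodeState`, `syncState`, and `nextNodeState` are hypothetical and are not taken from the MCO code, and treating any failure as Degraded is an assumption, since whether one failure should block the other is still an open question in this thread.

```go
package main

// nodeState mirrors the kind of value the daemon records for a node.
// These names are illustrative only, not the MCO's actual constants.
type nodeState string

const (
	stateWorking  nodeState = "Working"
	stateDone     nodeState = "Done"
	stateDegraded nodeState = "Degraded"
)

// syncState is a hypothetical summary of one daemon sync pass.
type syncState struct {
	configApplied bool // the rendered MachineConfig has been applied
	imagesPinned  bool // all images from matching PinnedImageSets are pulled and pinned
	failed        bool // either part of the sync hit an error (assumption: both degrade the node)
}

// nextNodeState returns the single state the daemon would report.
// Done is only returned when both halves of the sync have completed,
// so there is exactly one place where a node is marked as finished.
func nextNodeState(s syncState) nodeState {
	switch {
	case s.failed:
		return stateDegraded
	case s.configApplied && s.imagesPinned:
		return stateDone
	default:
		return stateWorking
	}
}
```

With this shape, `pinned_image_manager` never writes Done itself; it only contributes `imagesPinned` to the sync result.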

yuqi-zhang · 2024-03-18T22:34:33Z

enhancements/machine-config/pin-and-pre-load-images.md

@@ -17,7 +17,7 @@ api-approvers:
 - "@deads2k"
 - "@JoelSpeed"
 creation-date: 2023-09-21
-last-updated: 2023-09-21
+last-updated: 2023-03-15


nit: wrong year ;)

yuqi-zhang · 2024-03-18T22:36:33Z

enhancements/machine-config/pin-and-pre-load-images.md

-  ...
-```
+The PinnedImageSet is closely linked with the `MachineConfigPool`, and each
+Custom Resource (CR) can be associated with a pool at the


I assume you can reference the same pinnedimageset across different pools?

yuqi-zhang · 2024-03-18T22:36:55Z

enhancements/machine-config/pin-and-pre-load-images.md

-those is created, updated or deleted the controller will start a daemon set that
-will do the following in each node of the cluster:
+The _machine-config-daemon_ will grow a new `pinned_image_manager` utilizing
+the same general flow as the existing `MAchineConfigDaemon` `certificate_writer`. This approach is not dependent on


nit: MAchineConfigDaemon -> MachineConfigDaemon

yuqi-zhang · 2024-03-18T22:49:26Z

enhancements/machine-config/pin-and-pre-load-images.md

-selectors in the `PinnedImageSet` custom resources.
+Once all the relevant images are successfully pinned and downloaded to the
+matching nodes, the `pinned_image_manager` will signal the completion of the
+process by invoking nodeWriter.SetDone(). This action notifies the


I have a similar question here. I guess we're introducing an inter-dependency between the "machineconfig update" and the "pinnedimageset update" which, reading the controller code, shouldn't conflict unless there is a degrade. But this does raise some questions that need clarifying:

* Do we want one's failure to block the other?
* If you roll out one pinnedimageset and then try to roll out a second one (i.e. append another set to the list in the MCP), should we wait for the first to finish?
* If the daemon is working on a node, should it not reboot the node unless the pinnedimageset is able to finish?
* Does the done state otherwise affect node updates?
* How does the pinnedimageset controller actually know whether the done state is reacting to the current set of pinnedimageset changes vs., say, the daemon being dead and not having reacted to the latest request yet?
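For the last question, a common Kubernetes convention is for the reporter to record which generation of the object it observed, so that a stale Done can be told apart from a current one. The sketch below shows that check in Go. `pinnedSetStatus` and `isCurrentAndDone` are hypothetical names for illustration; the proposal in this PR does not define such a status, and only the `metadata.generation` / `observedGeneration` pattern itself is standard.

```go
package main

// pinnedSetStatus is a hypothetical per-node report written by the daemon.
type pinnedSetStatus struct {
	observedGeneration int64 // generation of the PinnedImageSet the daemon last processed
	done               bool  // the daemon finished pinning for that generation
}

// isCurrentAndDone reports whether the done flag refers to the generation
// that is currently desired. A done flag left over from an older generation
// (for example because the daemon is dead or has not reacted yet) is ignored.
func isCurrentAndDone(desiredGeneration int64, st pinnedSetStatus) bool {
	return st.done && st.observedGeneration == desiredGeneration
}
```

A controller using this check would keep treating the node as not yet updated until the daemon reports the new generation, regardless of what the previous rollout left behind.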

openshift-bot · 2024-05-09T01:15:40Z

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

hexfusion · 2024-05-09T10:42:00Z

/close will open new PR with updates

openshift-bot · 2024-05-17T00:45:10Z

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot · 2024-05-24T08:15:37Z

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

openshift-ci · 2024-05-24T08:15:49Z

@openshift-bot: Closed this PR.

enhancements/machine-config: add updates to PinnedImageSet

25ce316

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>

openshift-ci bot requested review from kbsingh and runcom March 15, 2024 15:19

openshift-ci bot requested review from cdoern, sinnykumari and yuqi-zhang March 15, 2024 15:45

cdoern approved these changes Mar 15, 2024

sinnykumari reviewed Mar 15, 2024

hexfusion mentioned this pull request Mar 15, 2024

[wip] add PinnedImageSet types and crd openshift/api#1713

Closed

enhancements/machine-config: add additional implementation details

142664d

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>

hexfusion force-pushed the pinned_image_set_addendum branch from afe184b to 142664d Compare March 16, 2024 12:27

sinnykumari reviewed Mar 18, 2024

yuqi-zhang reviewed Mar 18, 2024

openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 9, 2024

romfreiman mentioned this pull request May 9, 2024

OCPNODE-2205: Lazy image pull support #1600

Open

openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 17, 2024

openshift-ci bot closed this May 24, 2024
