First iteration of running the Assisted Installer in end-user clusters. #574
Conversation
* additionalNtpSource: string
* installConfigOverrides: string

Status:
How do we show API errors? Mostly talking about 4xx. Maybe have another field like `errors`?
Can't we check for that in the sync part and fail the request?
It's already in the reconcile part, so the user can take a look at the events.
But how will Hive see the error? I'm not sure if they are monitoring the events.
If the controller is unable to make progress due to errors like that, it should log those errors, and possibly also set a condition in the resource status such as "progressing = false" with a reason such as "error received from backend service" or similar.
+1, we should use conditions with specific type to provide the necessary status updates instead of these specialized fields in the status...
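To make the suggestion concrete, here is a hedged sketch of the kind of conditions entry being proposed. The condition type and reason names are invented for illustration, not taken from the proposal or any existing API:

```yaml
status:
  conditions:
  - type: Progressing              # invented condition type
    status: "False"
    reason: BackendServiceError    # invented reason
    message: "error received from backend service: 400 Bad Request"
    lastTransitionTime: "2021-01-12T10:00:00Z"
```

Consumers such as Hive could then watch this condition instead of scraping events or bespoke status fields.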
* logsCollectedAt: string
* installerVersion: string
* createdAt: string
* updatedAt: string
Should use creationTimestamp property of the object instead
for kubernetes-native is more natural for it to be separate.

Spec:
* clusterName: string
Use ObjectReference instead of string
Do we anticipate an image existing in a different namespace than the cluster it's related to?
Use of ObjectReference is discouraged upstream, I think because it's very unclear which of the many fields it exposes are in use. https://godoc.org/k8s.io/api/core/v1#ObjectReference
We use a mix of String name and LocalObjectRef in Hive if that's of any help.
@filanov I guess we move back to strings.
I guess things like secrets should still be proper references? For example I could imagine the pull-secret being used by multiple entities in multiple namespaces.
Pull secret is tough to use across multiple namespaces, as you can't mount a secret from another namespace into a pod. The pod would need very broad secret RBAC across the cluster, which is troublesome.
ACM fights with this as they keep a central index of cloud creds users can use for lots of clusters in different namespaces. @jnpacker was commenting on the challenges keeping these in sync in all namespaces and we floated some ideas, nothing concrete yet.
I'd suggest Secret be LocalReference as well, but if you need to go to secrets in another namespace, I believe best practice is to create your own type instead of using ObjectReference. (from our experience everyone is trying to stay away from it now)
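A minimal Go sketch of the two reference styles discussed in this thread. The type and field names are stand-ins: `LocalSecretReference` mirrors `corev1.LocalObjectReference`, and `SecretReference` is the kind of purpose-built cross-namespace type suggested above, not an actual Hive or assisted-service API.

```go
package main

import "fmt"

// LocalSecretReference names a Secret in the same namespace as the
// referencing resource (mirrors corev1.LocalObjectReference).
type LocalSecretReference struct {
	Name string
}

// SecretReference is a purpose-built cross-namespace reference. Unlike
// corev1.ObjectReference, it exposes only the fields actually in use,
// so callers know exactly what is honored.
type SecretReference struct {
	Name      string
	Namespace string
}

func main() {
	local := LocalSecretReference{Name: "pull-secret"}
	remote := SecretReference{Name: "pull-secret", Namespace: "shared-creds"}
	fmt.Println(local.Name, remote.Namespace) // prints: pull-secret shared-creds
}
```

The design point is the narrow field set: a custom two-field type documents intent, where `ObjectReference` leaves it unclear which of its many fields are consulted.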
Installer's REST API.

Spec:
* clusterName: string
Use ObjectReference instead of string
* sshPublicKey: string
* pullSecretName: string (reference to k8s secret)
* ignitionOverrides: object
* staticIpConfiguration: object
Are `ignitionOverrides` and `staticIpConfiguration` configMaps?
* installerArgs: string

Status:
* state: string enum
Should this be a string alias, same as in line 172?
Yes
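For clarity, a string alias in the sense used here is a defined string type with a fixed set of constant values. The state names below are illustrative, not the actual assisted-service states:

```go
package main

import "fmt"

// HostState is a defined string type whose values come from a fixed set,
// giving enum-like type safety while still serializing as a plain string.
type HostState string

const (
	HostStateDiscovering HostState = "Discovering"
	HostStateKnown       HostState = "Known"
	HostStateInstalling  HostState = "Installing"
	HostStateInstalled   HostState = "Installed"
)

func main() {
	s := HostStateInstalling
	fmt.Println(s) // prints: Installing
}
```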
* macAddress: string
* flags: array of strings
* speedMbps: integer
* disks: array
What about `InstallationEligibility` and `IoPerf`, as seen in models/disk.go?
These were recently added to the REST API, need to add them here
* bootable: boolean
* smart: string
* boot:
* current_boot_mode: string
this should probably be: currentBootMode
Yes, missed some instances of snake to camel case
* stageStartedAt: string
* stageUpdatedAt: string
* connectivity: array of
* host
which object is this?
@filanov proposed to reference the Host object, which makes sense to me.
Please approve/disapprove
* createdAt: string
* expiresAt: string

**AgentDrivenClusterInstallation**
One point of clarification that would be useful to make here: this resource will play no part in the standalone cluster use cases; it will only play a role in the multi-cluster use cases.
And then in the context of multi-cluster installs, we should clearly distinguish between the purpose of ClusterDeployment and AgentDrivenClusterInstallation - right now, they appear to overlap in difficult to justify ways
The "create cluster" workflow below highlights this when it says:
2. User creates AssistedInstallCluster resource describing their desired cluster.
3. User creates ClusterDeployment describing a new cluster.
What is the user describing about the cluster with AgentDrivenClusterInstallation that should not be described by ClusterDeployment?
Another example would be e.g. machineNetworkCidr - if you created a ClusterDeployment referencing an install-config with a particular machine network (or with `hiveutil create-cluster --machine-network`), and separately AssistedInstallCluster listed a different machine network, which would win?
We're looking at having at least 3 cluster CRs in play: ACM, Hive, Assisted (and in a parallel stream a Hypershift cluster). There will likely be substantial if not complete duplication between them, as a Hive caller should be able to fully specify their cluster in Hive. They should not need to create CRs for the Assisted operator directly. (as far as I am aware)
We talked about doing this so teams can develop independently and as quickly as possible, and components remain useful in isolation. I'm still a bit nervous about this and not sure if it's the ideal approach. (and we should always strive to reduce the load on etcd, so I don't love duplicating all of this everywhere)
> They should not need to create CRs for the Assisted operator directly. (as far as I am aware)
Discussions today have revealed this to be less of a mandate than we may have thought. Let's continue to discuss, and I think we can agree on something reasonable. Making duplicate pass-through APIs in hive to mirror the complete APIs of agent-based installation probably doesn't make sense.
After a chat with @eparis and @derekwaynecarr yesterday I think we should discuss eliminating the middle API and operator, if not now, then after prototyping. The resulting topology should be the same, and it feels like the only reason we're duplicating those APIs is to not have to figure out how to work together. From Hive's POV it doesn't feel like it makes sense that we would wrap a REST API with Kube APIs, and then wrap those Kube APIs with more Kube APIs. I don't think we have a need for a standalone Assisted operator, and I think that would be a misunderstanding of what exactly Hive is. It is quite small and lightweight, easy to get running, and its job is to provision configurations of OpenShift at scale.
I would pitch that once we've got a prototype up (or sooner if desired), we start pushing that code to hive. Review process will be difficult for both sides but there's a big benefit there in that we can help translate that imperative API to a declarative Kube API as my team has been at this for a few years.
In this scenario I would assume that the current assisted install code base remains separate, as this allows the really deep domain logic for performing these installs to stay in the teams with that knowledge. Presumably this would also still need an OperatorHub operator to get the REST API up on a hub cluster, just with a much more limited scope.
* httpsProxy: string
* noProxy: string
* userManagedNetworking: boolean
* additionalNtpSource: string
This NTP source example makes me super nervous about what this resource could turn into.
AIUI, there is only one recommended way to configure the NTP source in a cluster - using a MachineConfig that lays down an /etc/chrony.conf file.
If this isn't sufficiently user friendly (and I agree that it is not!) then that is a good argument for it being a top-level install-config option
A new resource like this should not accrete installation options that are "too hard" to add to the core installer
See also:
openshift/installer#3258
openshift/machine-config-operator#629
openshift/installer#1951
https://github.com/openshift/machine-config-operator#applying-configuration-changes-to-the-cluster
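For context on the recommended approach mentioned above, a MachineConfig that lays down /etc/chrony.conf looks roughly like this, following the pattern in the linked machine-config-operator README. The Ignition version and the elided file contents are assumptions here, not taken from the proposal:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-chrony
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - path: /etc/chrony.conf
        mode: 420  # decimal for 0644
        overwrite: true
        contents:
          source: data:text/plain;charset=utf-8;base64,<base64-encoded chrony.conf>
```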
This field is used to synchronize clocks of hosts that have not yet formed a cluster. If hosts boot the live ISO and run the agent, and they do not already have NTP sources configured through other means (DHCP, ...), and their clocks are divergent, then we see failed installations. AIUI the user can set this value if clock validation fails, then the agent will apply it, then the validation will eventually succeed after clocks synchronize, and finally installation can be initiated.
When this field is used to add an NTP source for the above use case, a machine config is additionally created so that the new cluster will continue using that source. But this is a secondary behavior driven by the need to converge clocks pre-installation. If an install-config option were available, that could be used instead.
Here's the relevant BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1884465
Does this alleviate your concerns? I know it's still creating an API for something that could or should reasonably also be in the install config, but putting it in the install config would not solve this problem.
Thanks, it's useful to know it serves this pre-installation purpose also
It makes the issue more complicated, but it doesn't alleviate the concern - there's nothing specific to agent-based installations about users needing to configure an NTP server. See openshift/hive#1247 (comment)
* sshPublicKey: string
* pullSecretName: string (reference to k8s secret)
* ignitionOverrides: object
* staticIpConfiguration: object
I'd like to see some discussion about this design versus likely future directions around supporting static IP configs e.g. #540
For example, consider the standalone cluster case - if the static IP config is stored in a MachineConfig, and you want to create an installation image for a given machine ... wouldn't it make sense for the AgentDrivenInstallationImage to reference that MachineConfig?
Note that #540 is mainly about how to configure vlans/bonding - this is potentially possible via MachineConfig because it's likely to be a common configuration for all Machines - but there's no way to define per-Machine MachineConfigs (ref openshift/machine-config-operator#1720).
Until/unless that's resolved I don't think MachineConfig provides a solution for static IPs (I guess in theory it'd work for single-node deployments, but we want a common interface for multi-node deployments too, right?)
This does raise an interesting question of the proliferation of network interface config interfaces we're exposing users to though, e.g:
1. NetworkManager keyfiles injected via `coreos-install --copy-network` (potentially per-node for UPI/AI but no IPI platforms use this)
2. Ignition customization (potentially per-node in the UPI/AI case)
3. Some future machine-specific MachineConfig interface
4. The declarative kubernetes NMState work being done for day-2 networking (and long term it'd be really nice if day1 and day2 interfaces were the same?)
In this case we need some way to configure the install-agent ISO (which IIUC will then pass-through the configuration via coreos-install like in 1 above), which perhaps justifies yet-another interface, but I agree with @markmc it may be worth considering all-the-interfaces that already exist, and justifying if we need a net-new format or some kind of passthrough to one of the existing interfaces.
One simple option might be to allow passing in encoded NM keyfiles via this new API - that provides flexibility (for example bond/vlan support etc without having to add explicit support to a new DSL), but it exposes users to the underlying format (which isn't super-well documented IME).
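For a sense of what "passing in encoded NM keyfiles" would expose users to, here is a minimal static-IP keyfile sketch. All addresses, interface names, and values are invented for illustration:

```ini
# /etc/NetworkManager/system-connections/eno1.nmconnection (hypothetical)
[connection]
id=eno1
type=ethernet
interface-name=eno1

[ipv4]
method=manual
addresses=192.0.2.10/24
gateway=192.0.2.1
dns=192.0.2.1

[ipv6]
method=disabled
```

This illustrates both sides of the trade-off above: full flexibility (bonds, vlans, etc. need no new DSL support), but the keyfile syntax itself becomes user-facing.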
BTW we made the API very similar to what the ignition config expects
> BTW we made the API very similar to what the ignition config expects
@avishayt can you clarify that - ignition doesn't have any explicit support for network configuration, only to write files or systemd units, can you point me to any examples of what the API looks like please?
See here: https://coreos.com/ignition/docs/latest/network-configuration.html
But either way, it's another API
#### CAPBM

[Cluster-API Provider Bare Metal](https://github.com/openshift/cluster-api-provider-baremetal/) will need to gain the ability to interact with the new assisted installer APIs. Specifically it will need to match a Machine with an available assisted Agent that is ready to be provisioned. This will be in addition to matching a Machine with a BareMetalHost, which it already does today.
I'd like to see some discussion here of the alternative - a new machine API provider specifically for agent-driven installation, such that agent-driven support isn't limited to bare metal
The basic idea is straightforward - a new actuator that simply satisfies a Machine request by choosing an AgentDrivenHostInstallation resource and initiating an install
The challenging part is how to then leverage the per-platform providers to make it easy to boot a new machine with the agent, in the case where there is no existing AgentDrivenHostInstallation available
Another reason why it's important to consider this alternative - we may struggle to find good justification why support for agent-driven installation is within the scope of upstream CAPBM, and maybe for this exact reason ... that there is nothing bare-metal specific about it.
Also, consider the boot-it-yourself day 2 add-worker case. There really is no need for a BareMetalHost resource to be involved with such a workflow
UPI workflows get Nodes without Machines today, right? Is there any reason this workflow always needs a Machine either?
@mhrivnak This thread was detached from the text because of changes, but I don't think it's resolved.
I think we just agreed to delete this whole section, and the proposal no longer proposes any particular Machine API integration. There's a new section, Future Use of MachinePools and Machine API, that elaborates.
Agents that are running on hardware and waiting for installation-related
instructions will continue to communicate with the assisted service using the
existing REST API. In the future that may transition to using a CRD.
Will this status be mirrored onto the CRDs?
The communication between the agent and service doesn't always yield something useful for the user. When it does, then it will be included in the AgentDrivenInstallationHost.
Spec:
* name: string
* approved: boolean (used for starting installation)
* openshiftVersion: string
What does this openshift version look like?
We work with ClusterImageSets, which ended up being just a string pointing to a release image pull spec.
It looks like "4.6". We have a ConfigMap that we maintain with information about each version.
To answer some possible followup questions:
- We don't currently have real support for arbitrary versions
- We know that rhcos_version can be derived but haven't gotten around to it yet
{
"4.6": {
"display_name": "4.6.8",
"release_image": "quay.io/openshift-release-dev/ocp-release:4.6.8-x86_64",
"rhcos_image": "https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.6/4.6.8/rhcos-4.6.8-x86_64-live.x86_64.iso",
"support_level": "production"
},
"4.7": {
"display_name": "4.7-pre-release",
"release_image": "quay.io/openshift-release-dev/ocp-release@sha256:2419f9cd3ea9bd114764855653012e305ade2527210d332bfdd6dbdae538bd66",
"rhcos_image": "https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.6/4.6.8/rhcos-4.6.8-x86_64-live.x86_64.iso",
"support_level": "beta"
}
}
* isApiVipConnected: [success, failure, pending, error]
* belongsToMajorityGroup: [success, failure, pending, error]
* isNtpSynced: [success, failure, pending, error]
* progress:
Another good fit for a condition.
agent-based install tooling needs to know about the cluster that it will be
creating. Some or all of that new resource may be folded into ClusterDeployment
in the future, but in the immediate term, keeping it separate will enable much
quicker development.
In general once an API goes out, it's near impossible to change it. So before this becomes a real thing, we'll want to transition to the permanent style. During initial devel it makes sense to keep it separate.
1. User creates AssistedInstallCluster resource describing their desired cluster.
1. User creates ClusterDeployment describing a new cluster. The Platform section indicates that metal3 should be used to boot hosts with a live ISO. A new field references the AssistedInstallCluster resource.
1. User associates some BareMetalAssets with the ClusterDeployment.
1. Hive creates BareMetalHost resources corresponding to each BareMetalAsset, and uses them to boot the live ISO.
Not sure why this would be Hive, we'll need to discuss.
BMA is an ACM API. Hive is not presently aware of it. A dependency from ACM to Hive and from Hive back to ACM feels like trouble, so I'm having a hard time visualizing who or what should be responsible for this automated booting of BMA inventory.
Should BMA CRD move to Hive?
Would the Assisted Install operator be a good place to scan BMA inventory to auto-boot?
> Should BMA CRD move to Hive?
That is a key question. I agree with what I think you're implying, which is that this move would make sense if we expect hive to act on BareMetalAssets.
> Would the Assisted Install operator be a good place to scan BMA inventory to auto-boot?
One of the advantages of the design in this proposal is that it keeps the assisted installer components out of the business of booting hosts or needing any platform-specific capabilities. It's reasonable to think that other platforms besides bare metal may benefit from this pattern, particularly on-prem virtualization platforms for example. I don't think we would want to oblige assisted installer itself to know how to boot hosts for each platform.
bare metal hardware. It is necessary to bring those capabilities on-premise in
users' clusters for two purposes:

1. A cluster created by the Assisted Installer needs the ability to add workers
how is adding workers related to running on premise?
I think running on-premise is more useful when users want a private setup, compared to the current public endpoint.
Currently we do not have a solution for adding workers to an on-premise cluster without depending on cloud.redhat.com. This proposal adds that capability.
### Goals

* Expose the Assisted Installer's capabilities as a kubernetes-native API.
There was no context provided above as to why this is one of the goals. What's wrong with keeping the current API but just running a local instance of the entire stack? Components like hive can communicate with REST APIs directly; no need for a k8s API, I think?
OpenShift's infrastructure management is k8s-native, so agent-based installation through assisted installer needs to be exposed as k8s-native APIs. That's a design principle of OpenShift 4 that is outside the scope of this enhancement, though it wouldn't hurt to re-state it here.
As for hive, hive is not necessarily the only user of these APIs, so it would not be a suitable point at which to bridge the APIs from non-native to native.
Related to comment I just added below, but I think it's important to remember what Hive is. Hive is not a big bloated management application, it's an API for installing OpenShift at scale. I am presently thinking we should work together to get this into Hive and eliminate this middle API + operator.
* approved: boolean (used for starting installation)
* openshiftVersion: string
* baseDnsDomain: string
* clusterNetworkCidr: string
There are multiple fields that describe networking for the cluster, so maybe we combine them into one struct? If this is just an indication of all the fields that would be required and not the structure of the API, we can skip this comment.
* noProxy: string
* userManagedNetworking: boolean
* additionalNtpSource: string
* installConfigOverrides: string
What might be more ergonomic is allowing users to provide an install-config YAML object, with the expectation that fields defined in this API will replace/override the user-provided values.
I'm not sure what you mean - can you please elaborate?
The purpose of this API is to allow advanced users in specific cases to override a few values in the install config.
1. Get an ISO URL from assisted operator.
1. Use metal3's BareMetalHost to boot the live ISO on some hardware.
From the perspective of hive, I still feel some info is missing for the workflow here... could we provide one example workflow to understand the interactions?
- a user creates the cluster deployment and agent driven cluster installation object, there’s a bunch of duplicate things in these objects like clustername etc..
- hive then uses these to figure out things like what version of open shift is required and then requests for an iso from the service
- Hive then uses the iso and some bare metal api to boot these hosts
- Hive then performs the matching of the hosts that report back to fulfill the cluster deployment requirements and then triggers the install??
- hive then keeps track of status of installation
An high level workflow will help understand the next sections a little bit
We reworked the document (and the design) a bit, hopefully it answers your questions. Let us know if not.
Some high-level workflow descriptions like this would make excellent introductory text for the proposal section around line 66, and give the overall context for the rest of these details.
For modest clusters that do not have persistent storage available but need
day-2 ability to add nodes, the database will need to be reconstructable in
case its current state is lost. Thus the CRs will be the source of truth for
the specs, and the controller will upon statup first ensure that the DB
Suggested change:
- the specs, and the controller will upon statup first ensure that the DB
+ the specs, and the controller will upon startup first ensure that the DB
storage

For modest clusters that do not have persistent storage available but need
day-2 ability to add nodes, the database will need to be reconstructable in
In other places where we've done this sort of syncing we've found it to be a source of complexity and bugs. I strongly encourage you to require persistent storage for the service. If running without persistent storage is a requirement, then it needs to be listed as a goal above.
The idea was to require persistent storage for the multi-cluster case, which is more complex and for which there is already a precedent (ACM). However, it would be difficult to tell users that they need persistent storage to add workers.
Spec:
* clusterName: string
* sshPublicKey: string
* pullSecretName: string (reference to k8s secret)
I would expect to be able to reuse a secret like this across clusters.
resource.
1. A host boots the live ISO, the assisted agent starts running, and the agent contacts the assisted service to register its existence. Communication utilizes the existing non-k8s REST API. The agent walks through the validation and inspection workflow as it exists today.
1. The wrapping controller creates a new AssistedInstallAgent resource to be the k8s-native API for the agent.
1. CAPBM creates a new BareMetalHost resource, setting the status annotation based on inspection information from the prior step.
I would expect the wrapping controller to do this step. What causes CAPBM to know that it needs to happen? Why does it need to happen here, instead of later when power management is added?
1. User creates BareMetalAssets containing BMC credentials.
1. User creates AssistedInstallCluster resource describing their desired cluster.
1. User creates ClusterDeployment describing a new cluster. The Platform section indicates that metal3 should be used to boot hosts with a live ISO. A new field references the AssistedInstallCluster resource.
@dgoodwin points it out earlier, but this linkage makes me think the new AssistedInstallCluster should just be a new platform within ClusterDeployment. Users shouldn't have to know about metal3.
1. User creates ClusterDeployment describing a new cluster. The Platform section indicates that metal3 should be used to boot hosts with a live ISO. A new field references the AssistedInstallCluster resource.
1. User associates some BareMetalAssets with the ClusterDeployment.
1. Hive creates BareMetalHost resources corresponding to each BareMetalAsset, and uses them to boot the live ISO.
1. baremetal-operator uses redfish virtualmedia to boot the live ISO.
What validation are we going to do, and where, to ensure that users only give us BMC settings that support virtualmedia?
I don't see an answer to this question in the latest draft.
The current draft has departed significantly from this. For example, hive no longer is making BareMetalHosts, and there are no BareMetalAssets involved. To the question though, the latest draft does not require virtualmedia. It now says "baremetal-operator boots the live ISO on the BareMetalHost."
Scenario B: user maintains an inventory of hosts as BareMetalHosts. They are
directly used to boot live ISOs as part of agent-based provisioning. When a
user selects BMHs for cluster creation, they may be moved into a new
cluster-specific namespace by deleting and re-creating them.
Deleting and recreating host resources is relatively expensive, because it requires updating ironic, which include ensuring the host is deprovisioned. It seems more flexible to not care which namespace a host is in, so that users can configure their inventory how they prefer.
Moves CRD detail discussions to dedicated PRs where they are being implemented.
* install_completed_at: string
* controller_logs_collected_at: string
A first draft of this CRD has been
[implemented](https://github.com/openshift/assisted-service/blob/master/internal/controller/api/v1alpha1/image_types.go)
It's a bit odd to have code approved before the design.
As a matter of practicality, reviewing the field-level details of the API seems to be most productive when looking at the go structs. And since the approach is currently being proved out through code, it doesn't seem productive to try duplicating those details here and keeping them in sync. Open to suggestions though.
I have noticed a tendency of the edge teams to rush to implementation before the broader organization is aligned with the direction, resulting in re-implementing some changes as many as 3 times (see #555 for example). We may all work more efficiently if we focus on agreement here first.
* sourceState: [synced, combined, not_combined, error, variable, unreachable]
The details of this API are being discussed in a [pull
request](https://github.com/openshift/assisted-service/pull/861) that
implements the CRD.
Again, the process here is backwards.
Other platforms may benefit from similar capability, such as on-premise virtualization
platforms. Ideally the ability to match a Machine with an Agent will be delivered
so that it can be integrated into multiple machine-api providers. Each platform would
perform this workflow:
So (potentially) every cluster-api-provider implementation would have 2 workflows? One for the IPI process it uses today, and another for this new agent-based installation? Or we would eventually replace the IPI process with the new one?
Each machine-api provider could either A) add this workflow, or B) replace its existing workflow with this workflow. I think it'll be a per-provider decision on whether either option makes sense.
The host CRD has a field called "Role" that defaults to unset, and it must be
set prior to provisioning to one of "master", "worker", or "auto-assign". The
act of assigning a role and a cluster to a host will implicitly approve it to
join that cluster with the specified role.
This still allows any arbitrary host to contact the service in the cluster that manages the agents, right? Are there any security implications from that? Can a user of that API learn anything about the cluster that we wouldn't want an unknown actor to know?
The API the agent reports to will require authentication, and that authentication will be baked into the live ISO. I don't think the agent can learn any information about anything else through that API today: @avishayt is that right?
Did we find the answer to the security question here?
implementation details that can be removed in the future. While continuing to
run a separate database and continuing to utilize parts of the existing REST
API are not ideal for an in-cluster service, this approach is the only
practical way to deliver a working solution in a short time frame.
^^ 👍
I have modified the "API Versions and Potential Change" section to express our intent to use "beta" and "v1" APIs. @avishayt @dgoodwin @markmc @romfreiman
Added some feedback regarding our use case
means.
* a list of SSH public keys that will be added to all agents for use in debugging.
* pull secret reference
* a label selector for Agents. This is how Agent resources are identified as
Label selectors can allow a resource to match multiple queries. Do we want to allow an agent to match multiple environments? Would we be better of making the link between the agent and the environment explicit by adding a reference to the agent API pointing to an InstallEnv?
Yes, I don't think there's any late binding or choose-from-pool functionality here; it makes sense that the Agent would reference its InstallEnv explicitly.
An explicit reference on the Agent could be useful, but I don't know of a direct use case for it yet. We could add it later potentially, or now if there is a use case.
While the label selector can potentially behave as you describe, the same would apply to a Deployment, MachineSet, or other similar resources.
Typically when you create a Deployment, you make a simple label selector and put the corresponding label in the Pod template a few lines below. The same approach would apply here and should make it simple to avoid conflicts.
It seems to me that a controller replicating a resource using a template has very different needs from this case. The InstallEnv controller isn't doing anything with agents, and we just want to establish the relationship? And the agent labels you set on InstallEnv are really for the ClusterDeployment controller to identify a set of agents?
The InstallEnv controller creates the ISO that is basically a template for Agents. I see it as very similar in that way. The one and only place to define an Agent is the InstallEnv spec. Based on that, utilizing an ISO and some boot process, n number of Agents get stamped out.
Generally if a user or client wants to do a list query with some selector, a label is the way to do that. InstallEnv is a well-defined grouping for Agents and part of their identity, so a label referencing the InstallEnv seems to be the right way to reflect that relationship: `kubectl get Agents -l 'installenv=foo'` or similar.
We haven't dug into reporting details in this proposal, but there is value in reporting on an InstallEnv's aggregate Agent status, including how many Agents were booted from an out-of-date ISO for example. We could imagine future actions that may be taken on an Agent based on changes to its InstallEnv, though that's not in-scope for this enhancement.
Another related detail not mentioned but that's probably implied is that an Agent should likely have an owner reference to its InstallEnv. Agent deletion in general may still have some open questions to work out, but I think it will make sense to garbage-collect them if their InstallEnv gets deleted. In any case, the owner reference may serve the role of the explicit reference you're looking for.
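The garbage-collection idea above hinges on the Agent carrying an ownerReference back to its InstallEnv. A minimal sketch of building such a reference follows; the field names mirror the standard Kubernetes ownerReference shape, but the group/version string and helper name are assumptions for illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OwnerReference mirrors the standard Kubernetes ownerReference fields.
type OwnerReference struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
	Name       string `json:"name"`
	UID        string `json:"uid"`
}

// agentOwnerRef builds the hypothetical reference an Agent would carry
// so the garbage collector deletes it when its InstallEnv is deleted.
func agentOwnerRef(envName, envUID string) OwnerReference {
	return OwnerReference{
		APIVersion: "example.openshift.io/v1alpha1", // illustrative group/version
		Kind:       "InstallEnv",
		Name:       envName,
		UID:        envUID,
	}
}

func main() {
	ref := agentOwnerRef("my-installenv", "1234-abcd")
	out, _ := json.Marshal(ref)
	fmt.Println(string(out))
}
```

With this in place, deleting the InstallEnv would cascade to its Agents through the normal Kubernetes garbage-collection machinery, which may serve as the explicit linkage discussed in this thread.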
The InstallEnv controller creates the ISO that is basically a template for Agents. I see it as very similar in that way. The one and only place to define an Agent is the InstallEnv spec. Based on that, utilizing an ISO and some boot process, n number of Agents get stamped out.
I think it's unlikely users will make that mental connection with the current design.
How would you feel about taking the next logical step with that analogy - putting the labels under an AgentTemplateSpec field, like `installEnv.spec.template.metadata.labels`? Could we imagine wanting to extend AgentTemplateSpec as we want to allow the user to specify additional stuff that would show up in the created Agent resources?
Generally if a user or client wants to do a list query with some selector, a label is the way to do that. InstallEnv is a well-defined grouping for Agents and part of their identity, so a label referencing the InstallEnv seems to be the right way to reflect that relationship.
kubectl get Agents -l 'installenv=foo'
or similar. We haven't dug into reporting details in this proposal, but there is value in reporting on an InstallEnv's aggregate Agent status, including how many Agents were booted from an out-of-date ISO for example. We could imagine future actions that may be taken on an Agent based on changes to its InstallEnv, though that's not in-scope for this enhancement.
Both of those seem compelling to me, thanks
I like that thinking, but I'm not able to come up with anything other than labels that we would want to templatize for the Agent. (Also note that labels are not part of the Agent's Spec, but the Metadata.) Most of what the InstallEnv is providing as input to Agent creation is data that's required to run the live ISO, which of course precedes the Agent resource. Maybe there's a use case for adding custom annotations that are meaningful to the user, in which case we could templatize the Agent's metadata. Imagine the ReplicaSet template without the `spec` part. But the annotation use case may or may not be terribly useful.
Right now, this is what the labels field looks like on the InstallEnv Spec:
// AgentLabels lists labels to apply to Agents that are associated with this InstallEnv upon
// the creation of the Agents.
AgentLabels map[string]string `json:"agentLabels,omitempty"`
Another option is we could try to name this field more clearly. WDYT?
I like the metadata template approach, if you think that can work well
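A rough sketch of what the metadata-template approach could look like as Go types, extrapolating from the `AgentLabels` field quoted earlier in this thread; all type and field names here are hypothetical, not the actual API:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ObjectMetaTemplate holds only the metadata fields we would want to
// stamp onto created Agents (labels, possibly annotations).
type ObjectMetaTemplate struct {
	Labels      map[string]string `json:"labels,omitempty"`
	Annotations map[string]string `json:"annotations,omitempty"`
}

// AgentTemplateSpec is the ReplicaSet-style template, minus the spec part,
// since the InstallEnv only templatizes Agent metadata.
type AgentTemplateSpec struct {
	Metadata ObjectMetaTemplate `json:"metadata,omitempty"`
}

// InstallEnvSpec shows where the template would sit, giving users the
// familiar installEnv.spec.template.metadata.labels path.
type InstallEnvSpec struct {
	Template AgentTemplateSpec `json:"template,omitempty"`
}

// buildSpec assembles an example spec labeling Agents with their InstallEnv.
func buildSpec() InstallEnvSpec {
	return InstallEnvSpec{
		Template: AgentTemplateSpec{
			Metadata: ObjectMetaTemplate{
				Labels: map[string]string{"installenv": "foo"},
			},
		},
	}
}

func main() {
	out, _ := json.MarshalIndent(buildSpec(), "", "  ")
	fmt.Println(string(out))
}
```

This keeps the one-and-only definition of an Agent's identity in the InstallEnv spec while making the template relationship explicit to users familiar with Deployments and ReplicaSets.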
1. The Agent's Role field in its spec is assigned a value if a corresponding label and value were present on its BareMetalHost. (only "worker" is supported for now on day 2)
1. The Agent is marked as Approved via a field in its Spec based on being recognized as running on the known BareMetalHost.
1. The Agent runs through the validation and inspection phases. The results are shown on the Agent's Status, and eventually a condition marks the Agent as "ready".
1. The Baremetal Agent Controller adds inspection data found on the Agent's Status to the BareMetalHost.
I assume this is so we get the IP address, etc.? Do we need that in the multi-cluster case, though? In a regular cluster we need the host, machine, and node to have matching IP details so the CSR for the node will be approved. What approves the CSR for nodes in this case, where the deployed cluster platform is not "baremetal" and the host resources are outside of the cluster that was deployed?
This scenario assumes that the spoke cluster does have platform "baremetal". A BareMetalHost resource gets created on the spoke cluster in part to enable the same CSR approval workflow that is already working on that platform.
So the CSR for the host in the spoke cluster will be approved by something in the hub cluster?
The CSR for the host in the spoke cluster will be approved in the spoke cluster by the same component and mechanism that approves them today for the baremetal platform. Creating the BareMetalHost and a corresponding Machine with correct details in the spoke cluster should make that work. Unless there's a detail we're missing?
There isn't anything in this section talking explicitly about data ending up in the spoke cluster, so it wasn't clear that was the case. Maybe that's all part of "installation of that host begins" on line 420?
Yes, that's one of the things assisted does as part of its workflow. We could dig up the details if necessary, but I don't think those details are central to this proposal.
Lots of good discussion around the comments from @yrobla and @dhellmann - let's get the comment threads resolved with whatever changes are needed to incorporate that feedback.
Thanks for your patience capturing everyone's feedback, Michael. I think we're in a good position to merge this now. /approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: markmc. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/priority important-soon