-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SPLAT-1291: platform external documentation #16
Conversation
@mtulio: This pull request references OCPCLOUD-1581 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Skipping CI for Draft Pull Request. |
@mtulio: This pull request references OCPCLOUD-1581 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
i like how you are separating the two integration paths. |
Cool! I wasn't sure the best name for those paths, but let's developing the content and check the best fit. Considering AI requires the same prerequisites as UPI, I also thinking to merge both Use cases (UPI + AI), splitting it into sections of the same doc |
@mtulio: This pull request references OCPCLOUD-1581 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@mtulio: This pull request references OCPCLOUD-1581 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@mtulio: This pull request references OCPCLOUD-1581 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
The general steps to install a cluster using Platform External, providing examples of customizations, have been added ( I will develop the Use Case using OCI asap. |
@mtulio: This pull request references OCPCLOUD-1581 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@mtulio: This pull request references OCPCLOUD-1581 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@mtulio: This pull request references OCPCLOUD-1581 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
143e57b
to
de5ad52
Compare
docs/platform-external/installing.md
Outdated
# - specify a command to startup the CCM in your container | ||
# - define and mount extra volumes if needed | ||
# This example defines the CCM as a Deployment, but a DaemonSet is also possible as long as Pod's template is defined in the same way. | ||
kind: Deployment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kind: Deployment | |
--- | |
apiVersion: v1 | |
kind: Namespace | |
metadata: | |
name: {{ cloud provider name }}-cloud-controller-manager | |
annotations: | |
workload.openshift.io/allowed: management | |
labels: | |
"pod-security.kubernetes.io/enforce": "privileged" | |
"pod-security.kubernetes.io/audit": "privileged" | |
"pod-security.kubernetes.io/warn": "privileged" | |
"security.openshift.io/scc.podSecurityLabelSync": "false" | |
"openshift.io/run-level": "0" | |
"pod-security.kubernetes.io/enforce-version": "v1.24" | |
--- | |
apiVersion: v1 | |
kind: ServiceAccount | |
metadata: | |
name: cloud-controller-manager | |
namespace: {{ cloud provider name }}-cloud-controller-manager | |
--- | |
kind: Deployment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @adriengentil , thanks for updating the template. I think we need to make sure we want to run the CCM "as root" (privileged
) when defining those PSA labels for Namespace, or if we can improve those permissions.
@elmiko @rvanderp3 do you know the permissions added to CCM's controllers/workloads when PSA was introduced? (IIRC 4.13+)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mtulio are you talking about the permissions in here https://github.com/openshift/cluster-cloud-controller-manager-operator/blob/master/manifests/0000_26_cloud-controller-manager-operator_03_rbac_provider.yaml ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elmiko No, it is about the labels added in the Namespace manifest (first comment in this thread), also the Google Docs thread about encouraging using a dedicated namespace (out of kube-* or openshift-*).
In our tests (namespace labels above) we are enforcing running the CCM as privileged
. My big question is: should we invest to refine those rules, or use the same we are using in integrated providers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My big question is: should we invest to refine those rules, or use the same we are using in integrated providers?
i think we should start with what we already use for the integrated providers, and then refine them over time. unless you have some ideas about how we should restrict them now. my concern here is that we will spend a lot of time understanding the refinements when there might be other areas that need attention.
metadata: | ||
name: oci-cloud-controller-manager | ||
namespace: $OCI_CCM_NAMESPACE | ||
data: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I discovered stringData
: https://kubernetes.io/docs/concepts/configuration/secret/#restriction-names-data
so no need to convert the string into base 64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thanks! I need to run an e2e to make sure I will not have any formatting issues.
- name: trusted-ca | ||
mountPath: /etc/pki/ca-trust/extracted/pem | ||
readOnly: true | ||
env: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should be temporary, but for consistency, should we align with the template and source /etc/kubernetes/apiserver-url.env
instead of setting the environment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, we should use that env file. I was thinking about how to do it (load) in the kubernetes object, instead of changing their cmd statement - let's say for example that the bash is not available in the image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall this is going in a good direction @mtulio , i do think we need to be more consistent about how we talk about "Platform External".
i think that unless we are talking about using the exact value External
, we should avoid using the block quotes.
in other cases we should say Platform External, or the External platform type, to help differentiate when we are talking about this type as opposed to the direct value.
i left some minor nits in the comments.
edit: fwiw, i'm reaching out to some docs folks to ask about guidance on the whole "Platform External" usage
docs/platform-external/installing.md
Outdated
# - specify a command to startup the CCM in your container | ||
# - define and mount extra volumes if needed | ||
# This example defines the CCM as a Deployment, but a DaemonSet is also possible as long as Pod's template is defined in the same way. | ||
kind: Deployment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mtulio are you talking about the permissions in here https://github.com/openshift/cluster-cloud-controller-manager-operator/blob/master/manifests/0000_26_cloud-controller-manager-operator_03_rbac_provider.yaml ?
docs/platform-external/installing.md
Outdated
``` | ||
|
||
|
||
#### Create custom manifests for Kubelet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will we need these instructions with the external changes we are landing in 4.14?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, these are instructions to provide the ProviderID through MachineConfig, I think it is not part of the code changes but for the installation setup. Am I correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, it is cloud-specific and the provider must tell how the kubelet will find/populate/discover the Provider ID by node.
Let me know if you have any other thoughts/alternatives/advice on that, this is how we are doing so far in OCI :)
cc @rvanderp3 for awareness and suggestions in VMware world
talked with @jeana-redhat about the usage of "External", i think we should reserve |
that's excellent input. Thanks for sharing it. I will address your feedback considering those suggestions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reminder
|
||
Table of Contents | ||
|
||
- [Prerequisites]() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is missing the links to the sections. I will do it in the next review round.
@@ -0,0 +1,11 @@ | |||
# Run conformance tests in custom OpenShift installations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure yet the level of tests we'll describe here for platform-external
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in general this is looking really good @mtulio , most of my suggestions are about minor grammar changes. we also need to pass through this to make the capitalization more consistent.
docs/platform-external/index.md
Outdated
Kubernetes components in OpenShift/OKD without the need to modify any core payload | ||
and without the need for direct involvement of OpenShift engineering. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should say "Red Hat engineering" ?
The external platform in OpenShift sets the `--cloud-provider` flag to `external` on Kubernetes components | ||
(Kubelet and Kube Controller Manager) to signalize the use of external cloud providers, | ||
allowing partners to extend providers' components like Cloud Controller Manager to the OpenShift platform. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably should update this paragraph to make it clear that users can choose to enable or disable the use of ccms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elmiko ACK. What is your suggestion? iirc the optional path will be available only in 4.15, right? Should I say in 4.14 they need to provide a CCM and 4.15 it will be optional, with the default as "similar none"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i think that makes sense. let users know that in 4.14 it is set to external , but in 4.15 it can set or unset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an "admonition" block explaining it.
docs/platform-external/index.md
Outdated
|
||
This work is divided into phases, the initial version is available on OCP 4.14+ which allows | ||
providers to install OpenShift cluster supplying the cloud provider's Cloud Controller | ||
Manager (CCM) when the cluster is initialized. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can probably drop this paragraph, unless we want to talk more about our roadmap.
--instance-details file://instance-config-details-controlplanes.json | ||
``` | ||
|
||
- Creating the Instance Pool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think we need to capitalize "Instance Pool"
|
||
#### Compute/workers | ||
|
||
- Creating the Instance Configuration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably don't need to capitalize "Instance Configuration"
--instance-details file://instance-config-details-compute.json | ||
``` | ||
|
||
- Creating the Instance Pool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar here with "Instance Pool"
|
||
#### Approve certificates for worker nodes | ||
|
||
Once the instances are created by the provider and ignitions loaded, the certification signing requests (CSR's) must be created by kubelet awaiting to be approved - in pending condition. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this part is a little confusing to me
the certification signing requests (CSR's) must be created by kubelet awaiting to be approved - in pending condition.
it sounds like the user needs to create the requests in pending state, but i thought the kubelet creates the requests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is, it wasn't well stated. I replaced to a fragment from our docs/UPI part.
|
||
#### Wait for Bootstrap complete | ||
|
||
Check if you can remove the bootstrap instance when the Control Plane |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to capitalize "Control Plane"
48f1f2c
to
03946da
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @elmiko! Thanks for your review. All the points should be already addressed.
I will take a new look tomorrow if I didn't miss anything, but it is ready for review.
Thanks! o/
The external platform in OpenShift sets the `--cloud-provider` flag to `external` on Kubernetes components | ||
(Kubelet and Kube Controller Manager) to signalize the use of external cloud providers, | ||
allowing partners to extend providers' components like Cloud Controller Manager to the OpenShift platform. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elmiko ACK. What is your suggestion? iirc the optional path will be available only in 4.15, right? Should I say in 4.14 they need to provide a CCM and 4.15 it will be optional, with the default as "similar none"?
docs/platform-external/installing.md
Outdated
method from the official documentation ["Installing a cluster on any platform"](https://docs.openshift.com/container-platform/4.13/installing/installing_platform_agnostic/installing-platform-agnostic.html). | ||
|
||
This method is a fully customized automation, allowing the user to deploy | ||
a cluster to use any automation they want, including provider's specific, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rewrote to:
This method is a fully customized automation, allowing the user to deploy
a cluster to use any automation they want, including provider's specific like network components,
required to deploy an OpenShift cluster.
lmk wdyt
docs/platform-external/installing.md
Outdated
~~~ | ||
|
||
You must upload the downloaded image to your cloud provider image service and | ||
use it when creating virtual machines. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elmiko imo we can skip this for now, it's not a regular task in AWS, and I am taking some time to set the parameters to do it when importing an RHCOS OVA image, as coreos seems not to be natively supported[1] without adding extra parameters[2]. As we are already sharing how to do it in OCI, maybe we can skip this generic guide.
[1] https://docs.aws.amazon.com/vm-import/latest/userguide/prerequisites.html#vmimport-image-formats
[2] error when importing OVA
$ aws ec2 describe-import-image-tasks --region us-west-2
{
"ImportImageTasks": [
{
"ImportTaskId": "import-ami-0871372c3894f0971",
"SnapshotDetails": [
{
"DeviceName": "/dev/sde",
"DiskImageSize": 1241901056.0,
"Format": "VMDK",
"Status": "completed",
"UserBucket": {
"S3Bucket": "openshift-images",
"S3Key": "rhcos-414.92.202309201615-0-vmware.x86_64.ova"
}
}
],
"Status": "deleted",
"StatusMessage": "ClientError: Unknown OS / Missing OS files.",
"Tags": []
}
]
}
The following section is a draft I wrote that can be revisited in the future.
Example importing image to AWS
As a example in AWS, you can import the OVA image as AMI with the following steps:
Prerequisites:
- Create the service role
vmimport
to Import/Export the VM using CLI. See the document as reference.
Steps:
- Download OVA image:
# Get image URL
IMAGE_URL=$(./openshift-install coreos print-stream-json | jq -r '.architectures["x86_64"].artifacts["vmware"].formats["ova"].disk.location')
# Download the OVA
wget $IMAGE_URL
# Set the name
IMAGE_NAME=$(basename $IMAGE_URL)
- Import to a S3 Bucket:
aws s3 cp "${IMAGE_NAME}" s3://openshift-images/"${IMAGE_NAME}"
- Import the image:
aws ec2 import-image --region us-west-2 \
--disk-containers Format=ova,UserBucket="{S3Bucket=openshift-images,S3Key=${IMAGE_NAME}}"
aws describe-image --region us-west-2 --image-name ${IMAGE_NAME}
docs/platform-external/installing.md
Outdated
The agnostic installation used by the platform external does not require any identity, | ||
although the provider's components may require identity to communicate with the cloud APIs. | ||
OpenShift prioritizes, and recommends, the least privileges and password-less authentication | ||
method, or short-lived tokens, when providing credentials to components. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elmiko As a generic guide I didn't add much here. With your suggestion, I am adding this statement:
"You might need to create any credentials, secrets and/or configmap manifest according to the cloud provider components' documentation, like Cloud Controller Manager."
I would keep specific guidance to the components's documentation, or expand the "Create manifests" subsection, more specifically Create custom manifests for CCM.
docs/platform-external/installing.md
Outdated
|
||
### Load Balancers | ||
|
||
The ["Load balancing requirements for user-provisioned infrastructure"](https://docs.openshift.com/container-platform/4.13/installing/installing_platform_agnostic/installing-platform-agnostic.html#installation-load-balancing-user-infra_installing-platform-agnostic). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah xD fixed, thanks!
created replacing it with a remote URL fetching from the temporary Bucket Object URL. | ||
|
||
Once the bootstrap instance is created, it must be attached to the Load Balancer in the | ||
Backends for Kubernetes API Server and Machine Config Server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The correct term is Backend Sets
, the official name of OCI: https://docs.oracle.com/en-us/iaas/Content/NetworkLoadBalancer/BackendSets/backend-set-management.htm
I replaced the "Backends for ..." with "Backend Sets of ...". let me known wdyt.
The ignition URL for the Bootstrap node must be available in the `$IGN_BOOTSTRAP_URL`. | ||
|
||
!!! warning "Attention" | ||
The bucket Object URL will expire in one hour if you are planning to create |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I adjusted it. non-capitalize must fit better in this case. thanks
!!! tip "Helper - OCI CLI documentation" | ||
- [`oci nlb backend-set update`](https://docs.oracle.com/en-us/iaas/tools/oci-cli/3.29.1/oci_cli_docs/cmdref/nlb/backend-set/update.html) | ||
|
||
- Add the bootstrap to the Load Balancer's backend set MCS[Machine Config Server]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added MCS, I mentioned it earlier. Thanks
#### Control Plane | ||
|
||
Three control plane instances will be created. The instances is created using | ||
Compute Pool, which will automatically inherit the same configuration and add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be Instance Pools https://docs.oracle.com/en-us/iaas/Content/Compute/Tasks/creatinginstancepool.htm
|
||
#### Approve certificates for worker nodes | ||
|
||
Once the instances are created by the provider and ignitions loaded, the certification signing requests (CSR's) must be created by kubelet awaiting to be approved - in pending condition. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is, it wasn't well stated. I replaced to a fragment from our docs/UPI part.
@mtulio gave another read through today, how do you feel about merging? |
talked with Marco about this today, it sounds like we just need to make sure we are happy with the oci instructions and then we can consider merging. |
Hey @elmiko thanks for the feedback and sync. The plan is:
Working on it now, then we are ready to merge. |
adding platform external documentation adding instructions for OCI CCM external/use-case oci: adding infra patch steps Update docs/platform-external/installing.md Co-authored-by: Adrien Gentil <adrien.gentil@free.fr> Update docs/platform-external/installing.md (rbac) Co-authored-by: Adrien Gentil <adrien.gentil@free.fr> Update docs/platform-external/installing.md (nit) adding create nodes steps reviewing formatting plat-external/oci-guide: update provisioning steps Addressing content review for Platform External Addressing review for generic installing document fixes/external-use-case-oci: 4.14-4c4 deployment with CCM Create/delete reviewed with tagging namespace to restrict cloud api OCI/use case: adding VCN and dependencies external/oci: provision and deprovision steps are created/validated external/review: typo review: removing account setup in generic guide review: removing afterburn in the unit for kubelet providerID review: improving bootstrap node intro PR full review - addressing feedback and missing test parts
03946da
to
a4cf238
Compare
Hi @elmiko - just let you know I just applied all the planned work for this PR. It is ready now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is looking really good @mtulio ! i left a few comments of minor cleanups, but after that i'm good to merge this.
The external platform in OpenShift sets the `--cloud-provider` flag to `external` on Kubernetes components | ||
(Kubelet and Kube Controller Manager) to signalize the use of external cloud providers, | ||
allowing partners to extend providers' components like Cloud Controller Manager to the OpenShift platform. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i think that makes sense. let users know that in 4.14 it is set to external , but in 4.15 it can set or unset.
Co-authored-by: Michael McCune <msm@opbstudios.com>
@mtulio: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@mtulio: This pull request references SPLAT-1291 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.15.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updates look great, thanks @mtulio
/approve
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: elmiko, mtulio The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Creating Platform External documentation.
Steps:
Suggested next steps:
References: