Docs: add architecture overview, remove outdated HACKING guide. #1078
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: squeed The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
Do you plan to update the HACKING guide in a separate PR? It seems that an updated version would be useful to folks making changes for CNO, agree?
@vpickard most of the things in HACKING are now in the wiki, and are more correct. So I'd rather keep those sorts of things in a non-CI-gated wiki.

I was looking for the same, but Casey's right: we have a run-locally reference in the wiki at https://github.com/openshift/cluster-network-operator/wiki/Running-a-local-cluster-network-operator-for-plugin-development#run-hackrun-locallysh-to-start-a-cluster-with-your-custom-image (which was probably one of the most useful things there)

/hold

@squeed can you add a section about debugging and troubleshooting the CNO? Also, can you expand on what it will take to extend the CNO with a new CRD, i.e. the expected workflow?
4. **Bootstrap** - gather existing cluster state, and create any non-Kubernetes resources (e.g. OpenStack objects)
5. **Render** - process template files in `/bindata` and generate Kubernetes objects
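The render step amounts to filling manifest templates with per-cluster values. A minimal sketch of that idea (the template text and data names here are illustrative, not the real CNO templates under `/bindata`):

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// Illustrative manifest template standing in for a file under /bindata.
const manifestTmpl = `apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: {{.Name}}
  namespace: {{.Namespace}}
`

// renderManifest fills the template with per-cluster values, loosely
// mirroring what a render step does before objects are applied.
func renderManifest(data map[string]string) (string, error) {
	t, err := template.New("manifest").Parse(manifestTmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderManifest(map[string]string{"Name": "sdn", "Namespace": "openshift-sdn"})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

The rendered text would then be decoded into unstructured Kubernetes objects and applied; this sketch stops at the templating step.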
Maybe it is worth adding a subsection for the upgrade logic (and the dual-stack conversion I've added), which makes use of these two steps to retain some of the changes?
docs/architecture.md
## CNO as SLO

CNO is a so-called second-level-operator (SLO), which means it is installed by the Cluster Version Operator (CVO). Owing to it's critical position in the installation flow, it is installed quite early. However, no other operators wait for the CNO -- their pods just have to wait for the network to come up.
"its"
(also, line-wrap?)
> it is installed quite early

IIRC, the CVO originally installed things in a particular order at install time, but it no longer does. (It does still do ordering during upgrades.) It's just that most operators don't tolerate `node.kubernetes.io/not-ready` and so won't be scheduled until after the SDN has come up.
This is perhaps worth clarifying some more:

- CNO tolerates `node-role.kubernetes.io/master`, `node.kubernetes.io/not-ready`, and `node.kubernetes.io/network-unavailable`, and is `hostNetwork: true`, to ensure that it can be started before the workers are created and before there is any SDN.
- CNO deploys the network plugin, which has similar tolerations.
- CNO also deploys some other operands (link to operands.md) which will not be able to come up yet, because they don't have the same tolerations, or because they depend on other operators that haven't started yet.
- Once the network plugin starts up on each node, the node untaints itself and other less-tolerant/non-host-network second-level operators become able to run there.
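As a sketch of how those tolerations look in practice (illustrative fragment, not the exact CNO deployment manifest), the pod spec would carry something like:

```yaml
spec:
  hostNetwork: true
  tolerations:
  # Tolerating with operator: Exists matches the taint regardless of effect.
  - key: "node-role.kubernetes.io/master"
    operator: "Exists"
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"
  - key: "node.kubernetes.io/network-unavailable"
    operator: "Exists"
```

Operands without these tolerations stay unschedulable until the network plugin removes the relevant taints from each node.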
Updated based on feedback - thanks @danwinship and @aojea. All that's left is @aojea's suggestion to talk about upgrade & migration logic.
/lgtm
/hold
feel free to fix or merge
The CVO has a notion of [run levels](https://github.com/openshift/cluster-version-operator/blob/master/docs/dev/operators.md#how-do-i-get-added-as-a-special-run-level), which dictate the order in which components are **upgraded**. Presently, the CNO (and thus its operands) are runlevel 07, which is comparatively early. At
Perhaps worth clarifying that MCO updates very late, and thus during an upgrade the new networking components will initially be running against an N-1 RHCOS and in particular an N-1 OVS.
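For reference, the linked CVO doc describes run levels as being encoded in manifest filename prefixes of the form `0000_<runlevel>_<component>_<name>.yaml`. A small sketch of extracting the run level from such a name (a hypothetical helper, assuming that naming convention):

```go
package main

import (
	"fmt"
	"strings"
)

// runLevel extracts the run-level component from a CVO manifest filename
// of the form 0000_<runlevel>_<component>_<name>.yaml.
// It returns "" if the name doesn't follow the convention.
func runLevel(filename string) string {
	parts := strings.SplitN(filename, "_", 4)
	if len(parts) < 4 || parts[0] != "0000" {
		return ""
	}
	return parts[1]
}

func main() {
	// prints "07": the CNO's comparatively early run level.
	fmt.Println(runLevel("0000_07_cluster-network-operator_00_namespace.yaml"))
}
```

Because the MCO runs at a much later level, the new networking components from an upgrade initially run against the previous (N-1) RHCOS, and in particular an N-1 OVS, as noted above.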
Added something here.
@danwinship updated based on your suggestions (I assume I can't self-lgtm).
@squeed: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: danwinship, squeed The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
/hold cancel

Manually adding valid-bug, since this is a doc change.
@squeed: /override requires a failed status context or a job name to operate on.
Only the following contexts were expected:
In response to this:
/override ci/prow/e2e-aws-ovn-windows |
@squeed: Overrode contexts on behalf of squeed: ci/prow/e2e-aws-ovn-windows, ci/prow/e2e-gcp, ci/prow/e2e-gcp-ovn, ci/prow/e2e-metal-ipi-ovn-ipv6 In response to this:
/override ci/prow/verify |
@squeed: Overrode contexts on behalf of squeed: ci/prow/e2e-agnostic-upgrade, ci/prow/e2e-aws-sdn-multi, ci/prow/images, ci/prow/unit, ci/prow/verify In response to this:
Adds an architecture overview. This isn't a detailed reference; rather, it's an overview of the intentions and structure of the code.
/cc @danwinship
/cc @rcarrillocruz
/cc @vpickard
/cc @dcbw