
OLM - ci -test. Create cdi-olm-catalog to be able to deploy cdi via o… #862

Merged
merged 8 commits into from
Jul 15, 2019

Conversation

annastopel
Contributor

…lm in ci. This is due to okd 4.1.

  • Add make target docker-olm-catalog

  • Add build-olm-catalog script that builds the tree expected by operator-registry.

  • Add CatalogSource manifest to deploy the operator registry per provider:

    • os
    • k8s
  • Update OLM documentation with operator-registry deployment

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

This PR enforces CDI installation via OLM on okd-4.1.0 in the CDI CI.
This is accomplished by deploying a CatalogSource that points to an operator-registry container image holding the CDI OLM manifests.
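For illustration, a CatalogSource of this general shape points OLM at a registry image (the name, namespace, and image reference here are assumptions for illustration, not the exact manifest in this PR):

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: cdi-olm-catalog          # illustrative name
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  # the operator-registry container image built in CI
  image: registry:5000/cdi-olm-catalog:latest
  displayName: KubeVirt CDI
  publisher: Red Hat
```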

  • In order to build the Docker container image, a new make target, docker-olm-catalog, was introduced.
  • In order to pack the OLM bundle into the directory structure expected by operator-registry, hack/build/build-cdi-olm-catalog.sh is introduced.
  • During CDI CI, the OLM manifests are built from scratch and not from the bundle in kubevirt/cdi-operatorhub. The default version is 0.0.0 (it can be changed to something like 0.0.0-dev).
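As a rough sketch of what such a script lays out (the paths and package-file contents are assumptions based on operator-registry's expected layout, not the exact hack/build/build-cdi-olm-catalog.sh):

```shell
#!/bin/bash
set -e

# Build the manifests tree operator-registry expects:
#   manifests/<package>/<package>.package.yaml
#   manifests/<package>/<version>/   (CSV + CRD yamls)
OUT_DIR=$(mktemp -d)
CSV_VERSION="0.0.0"            # default CI version; could be e.g. 0.0.0-dev
PKG_DIR="${OUT_DIR}/manifests/cdi"

mkdir -p "${PKG_DIR}/${CSV_VERSION}"

# Package file pointing the channel at this version's CSV
cat > "${PKG_DIR}/cdi.package.yaml" <<EOF
packageName: cdi
channels:
- name: beta
  currentCSV: cdi-operator.v${CSV_VERSION}
EOF

# The real script would copy the generated CSV and CRD manifests here, e.g.:
# cp _out/manifests/release/olm/*.yaml "${PKG_DIR}/${CSV_VERSION}/"

find "${OUT_DIR}" -type f
```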

Release note:

NONE

@kubevirt-bot kubevirt-bot added the release-note-none Denotes a PR that doesn't merit a release note. label Jun 20, 2019
@annastopel annastopel requested a review from mhenriks June 20, 2019 12:49
@annastopel annastopel assigned awels and annastopel and unassigned awels Jun 20, 2019
@annastopel annastopel force-pushed the cdi-okd-ci branch 3 times, most recently from d93679c to 82c8167 Compare June 20, 2019 13:00
Member

@awels awels left a comment

A few questions:

  1. Do we install the olm catalog container even on providers that do not use olm to install cdi?
  2. If we are using a provider that installs cdi with olm, and I am making a change to cdi, and I sync, will it immediately update cdi, or do I have to wait for olm to see that a new version is there before it updates.


function install_cdi {
#Install CDI via OLM
_kubectl create ns cdi
Member

The CI will randomize the namespace this is installed in, so we should use the NAMESPACE variable for this. This also allows installation in a different namespace like the hco does.

Also this is a thought I am having. Why not make a separate install.sh. In that script define install_cdi_operator and install_cdi_olm, and wait_cdi_crd_installed. All of those are related operations. Also add an install_cdi that looks at a specific environment variable, and pick either install_cdi_olm or install_cdi_operator (default to operator if not specified). Then we can source that in ephemeral-provider.sh and the external provider.sh. Then in each provider set the environment variable to what we want.

We can prescribe what the environment variable value should be for our CI providers, and the user of the external provider can decide if their external cluster has olm or not. This gives us maximum flexibility.
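The dispatch being proposed could be sketched like this (the function and variable names are assumptions; the real functions would run _kubectl against the cluster, stubbed here as echoes):

```shell
#!/bin/bash
# cluster-sync/install.sh (sketch): pick the install technique from the
# CDI_INSTALL environment variable, defaulting to the operator path.

CDI_INSTALL_OPERATOR="operator"
CDI_INSTALL_OLM="olm"

function install_cdi_operator {
  echo "installing CDI via operator manifests"
  # real script: _kubectl apply -f ./_out/manifests/release/cdi-operator.yaml
}

function install_cdi_olm {
  echo "installing CDI via OLM CatalogSource"
  # real script: _kubectl create ns "${NAMESPACE}" and apply the CatalogSource
}

function install_cdi {
  if [ "${CDI_INSTALL}" = "${CDI_INSTALL_OLM}" ]; then
    install_cdi_olm
  else
    install_cdi_operator   # default when CDI_INSTALL is unset
  fi
}
```

Each provider would then set CDI_INSTALL before sourcing this file.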

Contributor

Curious, why does the CI randomize the install namespace? I get randomized namespaces for creating objects during tests, but do we need randomized install namespaces? I'm just curious is all; I've wondered about that before, so I thought I'd ask.

Member

It's to stop lazy developers from hard-coding the cdi namespace in their tests, causing them to fail if someone installs cdi in a different namespace. The randomization is there to catch a hard-coded cdi namespace if a reviewer misses it.

Contributor Author

> The CI will randomize the namespace this is installed in, so we should use the NAMESPACE variable for this. This also allows installation in a different namespace like the hco does.
>
> Also this is a thought I am having. Why not make a separate install.sh. In that script define install_cdi_operator and install_cdi_olm, and wait_cdi_crd_installed. All of those are related operations. Also add an install_cdi that looks at a specific environment variable, and pick either install_cdi_olm or install_cdi_operator (default to operator if not specified). Then we can source that in ephemeral-provider.sh and the external provider.sh. Then in each provider set the environment variable to what we want.
>
> We can prescribe what the environment variable value should be for our CI providers, and the user of the external provider can decide if their external cluster has olm or not. This gives us maximum flexibility.

Great idea! - see dedicated commit in the pr

@@ -9,3 +9,9 @@ re='^-?[0-9]+$'
if ! [[ $num_nodes =~ $re ]] || [[ $num_nodes -lt 1 ]] ; then
num_nodes=1
fi

function install_cdi {
Member

So we are not using OLM to install for k8s then? Just the OKD provider will have it (for now?). That is totally fine, but I would put the install_cdi function in the ephemeral provider and override it with the OLM install in the OKD provider. That way we are not duplicating this line in the other providers.

Contributor Author

Good point. I added install_cdi to the ephemeral provider and overrode it in okd-4.1; however, there is still code duplication in the external provider script.
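The override relies on the shell's last-definition-wins behavior for functions, roughly as follows (bodies stubbed with echoes; the real functions would apply manifests):

```shell
#!/bin/bash
# ephemeral-provider.sh defines the default install:
function install_cdi {
  echo "operator install"
}

# The okd-4.1 provider.sh sources the above, then redefines install_cdi;
# the later definition replaces the earlier one:
function install_cdi {
  echo "olm install"
}

install_cdi   # prints "olm install"
```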

@awels
Member

awels commented Jun 20, 2019

Also, Travis appears unhappy about a missing file.

Contributor

@j-griffith j-griffith left a comment

A couple of minor nits, and Alexander's points of course. Looks pretty good, though, as I wade through OLM more.



fi

# Need to set the DOCKER_PREFIX appropriately in the call to `make docker push`, otherwise make will just pass in the default `kubevirt`
QUAY_NAMESPACE=$QUAY_NAMESPACE DOCKER_PREFIX=$MANIFEST_REGISTRY PULL_POLICY=$(getTestPullPolicy) make manifests
Contributor

Why the separate QUAY_NAMESPACE? We could just set DOCKER_PREFIX=quay.io/xxxxx

This also brings back my earlier comment that we should probably change DOCKER_PREFIX to REGISTRY_NAME or something more general. This isn't your problem and shouldn't impact your patch, but I'm capturing the point here.

Contributor Author

QUAY_NAMESPACE is the namespace in Quay where the CDI OLM bundle is located. It is not necessarily equal to DOCKER_PREFIX.

Member

I think John's point is that we either use Quay OR Docker, but not both at the same time, so having two separate variables is sort of weird; it would make sense to have a 'REGISTRY_NAME' or something like that instead of DOCKER_PREFIX and QUAY_NAMESPACE if they both serve the same purpose for different registry providers.

Contributor Author

> I think Johns point is, that we either use Quay OR Docker but not both at the same time, so having two separate variables is sort of weird, it would make sense to have a 'REGISTRY_NAME' or something like instead of DOCKER_PREFIX and QUAY_NAMESPACE if they both serve the same purpose but for a different registry provider.

But we do utilize both Docker Hub and Quay: we push all CDI images to a repo on Docker Hub and the OLM bundle to Quay.

Contributor

Sorry, my point was, as @awels mentioned, why both: I get that we do push to both, but we don't push to both in the same build iteration. So the idea I was getting at was to use a single variable and set it appropriately for the build we're doing.

If you're doing parallel build/push operations, then it would make more sense to still use a single image-registry variable but make it a list, and just do the build, then push to 'n' registries from the list. We should probably be doing something like that anyway, if we're not already, rather than run a separate build/release process for each registry.

Contributor Author

As discussed, we do need both the DOCKER_PREFIX and QUAY_REPOSITORY variables in the same make manifests iteration: DOCKER_PREFIX for the container images pushed to Docker Hub, and the QUAY-related variables for the OLM manifests and bundle.

install_cdi

#wait cdi crd is installed with 120s timeout
wait_cdi_crd_installed 120
Contributor

We should probably make this a variable that can also be set via an environment variable.

Contributor Author

This is similar to the timeout we have when waiting for the CDI CR to be installed. It is utilized only during cluster-sync on kubevirtci; that is why I did not make a variable out of it.

Member

Well, you just made it so this also works when you cluster-sync to an external provider, and since we have little to no control over the external provider, it would be nice to give the external provider user as many knobs as they can change to make it work for their cluster.

Contributor

> This is similar to timeout we have when waiting for CDI cr to be installed. This is utilized only during cluster-sync on kubevirtci, that is why I did not make variable out of it

Fair enough, not a big deal for me. I was just thinking that if it were a declared var at the top of the script file, it would be easy to identify and bump up or down if we need to in the future, or, more importantly, if somebody ever says "Hey Griffith, what's the timeout duration for cluster-sync?" I can just open the file and "BOOM", it hits me right in the face. Your call though; I certainly wouldn't hold this up any longer over it.

Contributor Author

Added a variable for CD_INSTALL_TIMEOUT.
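A sketch of such a wait helper with an overridable timeout (the variable and helper names here are illustrative; the real probe would grep the output of _kubectl get customresourcedefinition):

```shell
#!/bin/bash
# Wait until the CDI CRD shows up, polling once per second up to a timeout
# that can come from an environment variable (name assumed for illustration).
CDI_INSTALL_TIMEOUT=${CDI_INSTALL_TIMEOUT:-120}

function wait_cdi_crd_installed {
  local timeout=${1:-120}
  local crd_defined=0
  while [ "${crd_defined}" -eq 0 ] && [ "${timeout}" -gt 0 ]; do
    # real script: crd_defined=$(_kubectl get crd | grep -c cdis.cdi.kubevirt.io)
    crd_defined=$(check_crd)
    sleep 1
    timeout=$((timeout - 1))
  done
  # a non-zero count means the CRD was found before the timeout expired
  [ "${crd_defined}" -ne 0 ]
}

# called as: wait_cdi_crd_installed "${CDI_INSTALL_TIMEOUT}"
```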

@annastopel
Contributor Author

annastopel commented Jun 24, 2019

> A few questions:
>
>   1. Do we install the olm catalog container even on providers that do not use olm to install cdi?
>   2. If we are using a provider that installs cdi with olm, and I am making a change to cdi, and I sync, will it immediately update cdi, or do I have to wait for olm to see that a new version is there before it updates.

  1. We install CDI via OLM in CI only on okd-4.1. It is possible to deploy CDI via OLM on the k8s provider as well; instructions are provided in the md file.
  2. Cluster-sync, in its clean phase, will delete the CDI CRD, which uninstalls the operator and all OLM-related instances. After this it will deploy CDI with the new images updated on sync.

@annastopel annastopel force-pushed the cdi-okd-ci branch 3 times, most recently from 979fb8d to cf61ea2 Compare June 24, 2019 10:55
Member

@awels awels left a comment

In general looks good, I think you can get rid of the install.sh files in the provider directories.

@@ -0,0 +1 @@
CDI_INSTALL=${CDI_INSTALL_OPERATOR}
Member

I don't think you need the install.sh files for the providers, you can put

CDI_INSTALL=${CDI_INSTALL_OPERATOR}

In the provider.sh right before you source cluster-sync/ephemeral_provider.sh


Contributor

@j-griffith j-griffith left a comment

There are a couple of comments/suggestions but I don't think any of them are really worth blocking this any longer. I'm happy to approve/lgtm in the morning with or without changes to it. We can always iterate on it in the future. Thanks for being patient, I should've gotten back to this a long time ago.

@@ -0,0 +1,12 @@
FROM quay.io/openshift/origin-operator-registry
Member

How about using a tag here?

Contributor Author

Done - set to tag 4.2.0
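For reference, pinning the base image (per the 4.2.0 tag mentioned above) might look like this; the COPY path, database path, and initializer invocation are assumptions for illustration, not the PR's exact Dockerfile:

```dockerfile
FROM quay.io/openshift/origin-operator-registry:4.2.0
# copy the generated manifests tree and build the registry database from it
COPY olm-catalog /registry
RUN initializer --manifests /registry --output /bundle/bundles.db
CMD ["registry-server", "--database", "/bundle/bundles.db"]
```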

Member

@mhenriks mhenriks left a comment

Looks very good. I just think we should use a tagged marketplace container image.


packBundles ${OLM_MANIFESTS_SRC_PATH}


Member

extra whitespace

Contributor Author

removed

displayName: KubeVirt CDI
publisher: Red Hat


Member

extra whitespace

Contributor Author

removed

displayName: KubeVirt CDI
publisher: Red Hat


Member

extra whitespace

Contributor Author

removed

@annastopel
Contributor Author

ci test please

1 similar comment
@annastopel
Contributor Author

ci test please

@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 4, 2019
…lm in ci. This is due to okd 4.1.

  * Add make target docker-olm-catalog
  * Add build-olm-catalog script that builds the tree expected by operator-registry.

  * add catalogsource manifest to deploy operator registry per provider
    * os
    * k8s
  * update olm documentation with operator-registry deployment
* cluster-sync/install.sh - contains all supported installation techniques
  and an install_cdi method that derives the technique based on the CDI_INSTALL variable
    - operator manifests
    - OLM manifests
* per provider there is cluster-sync/${PROVIDER}/install.sh that has the setting of CDI_INSTALL
  for the specific provider
* default technique is CDI installation via operator
* support the case when the olm bundle contains more than one crd version
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 14, 2019
@annastopel
Contributor Author

ci test please

@annastopel
Contributor Author

@awels @mhenriks please review

@j-griffith
Contributor

ci test please

Member

@mhenriks mhenriks left a comment

👍

@j-griffith
Contributor

/lgtm

@j-griffith
Contributor

I'd like to get a clean CI run before merging, let's see how it goes.

@j-griffith
Contributor

j-griffith commented Jul 15, 2019

ci test please

@j-griffith j-griffith merged commit af4c3f9 into kubevirt:master Jul 15, 2019
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XXL