
ci: image list for offline installation #8596

Merged
merged 1 commit into rook:master from offline-installation on Sep 9, 2021

Conversation

subhamkrai (Contributor)

Description of your changes:
Create the file tests/scripts/rook-ceph-image.txt, which will contain the list of images required for offline installation.

Closes: #6406
Signed-off-by: subhamkrai <srai@redhat.com>

Which issue is resolved by this Pull Request:
Resolves #6406

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: Add the flag for skipping the build if this is only a documentation change. See here for the flag.
  • Skip Unrelated Tests: Add a flag to run tests for a specific storage provider. See test options.
  • Reviewed the developer guide on Submitting a Pull Request
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.

@travisn (Member) left a comment:

How about the following approach?

  • Generate the image list as part of make build
  • If the list changes, the dev will need to submit it with their PR, for example when the CSI images change or when the release version is updated.
  • Commit an images.txt to the example manifest folder. Those who need the image list can find it in a predictable place in the repo for every release tag going forward.
  • The official build should fail if there are any required changes to the image list. (We already have a check for changed files in the official build, right?)

This will avoid any extra steps in the release process.

@@ -40,3 +40,6 @@ jobs:
FROM_BRANCH: ${{ env.FROM_BRANCH }}
TO_TAG: ${{ env.TO_TAG }}
run: tests/scripts/gen_release_notes.sh

- name: Images for offline installation
run: awk '/image:/ {print $2}' cluster/examples/kubernetes/ceph/operator.yaml cluster/examples/kubernetes/ceph/cluster.yaml | tee -a tests/scripts/rook-ceph-image.txt && awk '/quay.io/ || /k8s.gcr.io/ {print $3}' cluster/examples/kubernetes/ceph/operator.yaml | tr -d '"' | tee -a tests/scripts/rook-ceph-image.txt
Member:
Can this be broken up into multiple lines for easier reading like below?

command | \
  tee blah
next command | \
  tr etc | tee etc
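
For reference, the one-liner in the diff above could be wrapped that way without changing behavior (the same commands, just reflowed):

awk '/image:/ {print $2}' cluster/examples/kubernetes/ceph/operator.yaml cluster/examples/kubernetes/ceph/cluster.yaml | \
  tee -a tests/scripts/rook-ceph-image.txt && \
awk '/quay.io/ || /k8s.gcr.io/ {print $3}' cluster/examples/kubernetes/ceph/operator.yaml | \
  tr -d '"' | tee -a tests/scripts/rook-ceph-image.txt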

Member:
Is there a better way to write the portion below so that we won't face problems if we add a dockerhub image reference in the future?

awk '/quay.io/ || /k8s.gcr.io/ {print $3}' cluster/examples/kubernetes/ceph/operator.yaml
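
One registry-agnostic alternative (a sketch, not what this PR adopted) would be to key on the _IMAGE: variable names instead of on specific registries, so a future dockerhub default would be picked up automatically:

# match the commented-out *_IMAGE: defaults by variable name,
# regardless of which registry hosts the image
awk '/_IMAGE:/ {print $NF}' cluster/examples/kubernetes/ceph/operator.yaml | tr -d '"'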

Member:

Instead of a file somewhere, can we add this as an asset to the release? See the "Assets" for instance here https://github.com/rook/rook/releases/tag/v1.7.2.
If we manage to add this automatically, it's a win for each release/branch.

Contributor Author:

We can do it automatically. The original request was to have a text file committed, but the asset should work too.

Member:

Thoughts on this approach? It's fine if we can automatically attach an asset to the release, it just can't be a manual process to add to the release artifacts or else it will be easily missed.

Contributor Author:

Going with this approach.

.github/workflows/create-tag.yaml (outdated review thread, resolved)
@@ -0,0 +1,16 @@
quay.io/cephcsi/cephcsi:v3.4.0
Member:

Why is this going into tests/scripts? Is this not intended (or useful) for users?

If we are doing this, I would put it in a place where users should be empowered to find it (cluster/examples/kubernetes/ceph?) and add to the docs that it exists.

Contributor Author:

Putting the file in cluster/examples/kubernetes/ceph now.

k8s.gcr.io/sig-storage/csi-snapshotter:v4.1.1
k8s.gcr.io/sig-storage/csi-attacher:v3.2.1
quay.io/csiaddons/volumereplication-operator:v0.1.0
rook/ceph:master
Member:

This image will change for releases, right?

Member:

Agreed, each branch should have its own version. Because this was generated on the master branch we get the master tag.

# remove the file if already exists
[ -f ../../cluster/examples/kubernetes/ceph/rook-ceph-image.txt ] && rm ../../cluster/examples/kubernetes/ceph/rook-ceph-image.txt;\
awk '/image:/ {print $2}' ../../cluster/examples/kubernetes/ceph/operator.yaml ../../cluster/examples/kubernetes/ceph/cluster.yaml | \
cut -d: -f2- |\
Contributor Author:

I had to add this cut -d: -f2- because the older command has no issue when run from the CLI, but when run from make build it generates output in this format:

          image: rook/ceph:master
    image: quay.io/ceph/ceph:v16.2.5
  # ROOK_CSI_CEPH_IMAGE: quay.io/cephcsi/cephcsi:v3.4.0
  # ROOK_CSI_REGISTRAR_IMAGE: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.2.0
  # ROOK_CSI_RESIZER_IMAGE: k8s.gcr.io/sig-storage/csi-resizer:v1.2.0
  # ROOK_CSI_PROVISIONER_IMAGE: k8s.gcr.io/sig-storage/csi-provisioner:v2.2.2
  # ROOK_CSI_SNAPSHOTTER_IMAGE: k8s.gcr.io/sig-storage/csi-snapshotter:v4.1.1
  # ROOK_CSI_ATTACHER_IMAGE: k8s.gcr.io/sig-storage/csi-attacher:v3.2.1
  # CSI_VOLUME_REPLICATION_IMAGE: quay.io/csiaddons/volumereplication-operator:v0.1.0
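
My reading of why the cut helps (this is not stated explicitly in the thread): cut -d: -f2- drops everything up to the first colon, so the yaml key is stripped while the image reference, including the colon before its tag, survives. Illustratively:

echo '  # ROOK_CSI_CEPH_IMAGE: quay.io/cephcsi/cephcsi:v3.4.0' | cut -d: -f2-
# prints " quay.io/cephcsi/cephcsi:v3.4.0" (fields 2 to the end, rejoined with ':')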

@subhamkrai (Contributor Author):

How about the following approach?

  • Generate the image list as part of make build
  • If the list changes, the dev will need to submit it with their PR, for example when the CSI images change or when the release version is updated.
  • Commit an images.txt to the example manifest folder. Those who need the image list can find it in a predictable place in the repo for every release tag going forward.
  • The official build should fail if there are any required changes to the image list. (We already have a check for changed files in the official build, right?)

Just to verify the above, I intentionally didn't add the file rook-ceph-image.txt to see if the build would fail. I found it is not failing even with exit 1.

## Build images for all platforms.'
+ tests/scripts/validate_modified_files.sh build
+ CODEGEN_ERR='found codegen files! please run '\''make codegen'\'' and update your PR'
+ MOD_ERR='changes found by mod.check. You may need to run make clean'
+ CRD_ERR='changes found by '\''make crds'\''. please run '\''make crds'\'' locally and update your PR'
+ BUILD_ERR='changes found by make build'\'', please commit your go.sum or other changed files'
+ case "$1" in
+ validate 'changes found by make build'\'', please commit your go.sum or other changed files'
++ git status --porcelain
+ git='?? cluster/examples/kubernetes/ceph/rook-ceph-image.txt'
changes found by make build', please commit your go.sum or other changed files
?? cluster/examples/kubernetes/ceph/rook-ceph-image.txt
+ for file in $git
+ '[' -n '??' ']'
+ echo 'changes found by make build'\'', please commit your go.sum or other changed files'
+ echo '?? cluster/examples/kubernetes/ceph/rook-ceph-image.txt'
+ git diff
+ exit 1

Even with exit 1, the tests are passing: https://github.com/rook/rook/pull/8596/checks?check_run_id=3503686956

I found this related discussion: https://github.community/t/no-output-on-process-completed-with-exit-code-1/123821/7


@travisn (Member) left a comment:

The plan is to commit the txt file, right? Looks like the build is failing because it is missing.

images/ceph/Makefile (outdated review thread, resolved)
@subhamkrai (Contributor Author):

The plan is to commit the txt file, right? Looks like the build is failing because it is missing.

Yes, I did it intentionally to confirm whether the tests fail, and if you notice, the build on Linux is passing.

images/ceph/Makefile (outdated review thread, resolved)
@subhamkrai force-pushed the offline-installation branch 3 times, most recently from 8737f6d to ac3ea7d on September 7, 2021 09:06
images/ceph/Makefile (outdated review thread, resolved)
awk '/image:/ {print $2}' ../../cluster/examples/kubernetes/ceph/operator.yaml ../../cluster/examples/kubernetes/ceph/cluster.yaml | \
cut -d: -f2- |\
tee -a ../../cluster/examples/kubernetes/ceph/images.txt && \
awk '/quay.io/ || /k8s.gcr.io/ {print $3}' ../../cluster/examples/kubernetes/ceph/operator.yaml | \
Member:

What if we read these from the code instead? It is rare, but it sometimes happens that the manifests don't get updated when the default images in the code are updated. The sed to find these could then look for Default.*Image[[:whitespace:]]+=[[:whitespace:]]+"<image name>"$ (an attempt at the regex; it might have bugs).

From pkg/operator/ceph/csi/spec.go:

	DefaultCSIPluginImage         = "quay.io/cephcsi/cephcsi:v3.4.0"
	DefaultRegistrarImage         = "k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.2.0"
	DefaultProvisionerImage       = "k8s.gcr.io/sig-storage/csi-provisioner:v2.2.2"
	DefaultAttacherImage          = "k8s.gcr.io/sig-storage/csi-attacher:v3.2.1"
	DefaultSnapshotterImage       = "k8s.gcr.io/sig-storage/csi-snapshotter:v4.1.1"
	DefaultResizerImage           = "k8s.gcr.io/sig-storage/csi-resizer:v1.2.0"
	DefaultVolumeReplicationImage = "quay.io/csiaddons/volumereplication-operator:v0.1.0"
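
A rough sketch of that extraction (the exact pattern is an assumption and would need testing against the real spec.go):

# print the quoted default image from every Default...Image assignment
sed -nE 's/.*Default[A-Za-z]+Image[[:space:]]*=[[:space:]]*"([^"]+)".*/\1/p' \
  pkg/operator/ceph/csi/spec.go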

Contributor Author:

We can read some images from pkg/operator/ceph/csi/spec.go, but in the end we have to fetch from the manifests for the complete image list. IMO the manifests are the right place.

Member:

For the rook/ceph:<tag> image and ceph/ceph:<tag> image, the only place they are defined is in the manifests.

But it has happened more than once that the CSI image defined in spec.go is different from what is in the manifests. When (not if) that happens in the future, Rook will fail to start up for air-gapped users because they will have mirrored images that Rook isn't using by default. As far as we know, most users do use the internal default images, so it is paramount that these images are the ones we list in this file.


There are a couple other routes we could consider, probably for a future PR.

  1. We could add images from both the manifests and the Go code and remove duplicate tags. We could also run an analysis on the file to see whether any images are listed with multiple versions; if that's the case, we know the manifests and Go code haven't both been updated at the same time (see the sketch after this list).
  2. (probably better) We could use the Go code to generate/modify the CSI_..._IMAGE vars that are commented-out in the manifests so they don't get out of sync.
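
A minimal sketch of the duplicate analysis in option 1, assuming image names contain a colon only before the tag (no registry ports):

sort -u images.txt -o images.txt   # drop exact duplicate entries
# repos still listed with more than one tag => manifests and Go code disagree
cut -d: -f1 images.txt | sort | uniq -d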

Contributor Author:

I see your points. So can we go ahead with this PR and create a follow-up PR?

images/ceph/Makefile (outdated review thread, resolved)
@subhamkrai force-pushed the offline-installation branch 2 times, most recently from d7b4816 to 326d491 on September 8, 2021 14:21
create file tests/scripts/rook-ceph-image.txt which
will have list of images required for offline installation.

Closes: rook#6406
Signed-off-by: subhamkrai <srai@redhat.com>
@BlaineEXE (Member):

Merging this so it can go into v1.7.3, and I'll create a new bug to track taking the images from the Go code as a future update.

@BlaineEXE BlaineEXE merged commit 94eb451 into rook:master Sep 9, 2021
Comment on lines +135 to +138
tee -a $(MANIFESTS_DIR)/images.txt && \
awk '/quay.io/ || /k8s.gcr.io/ {print $3}' $(MANIFESTS_DIR)/operator.yaml | \
cut -d: -f2- |\
tr -d '"' | tee -a $(MANIFESTS_DIR)/images.txt
Member:

For future reference, this should be replaced with lines that get the CSI-related images from pkg/operator/ceph/csi/spec.go

travisn added a commit that referenced this pull request Sep 9, 2021
ci: image list for offline installation (backport #8596)
@subhamkrai subhamkrai deleted the offline-installation branch September 13, 2021 10:22
Successfully merging this pull request may close these issues:

  • Air-Gapped/Offline installation require much effort currently (#6406)