
Baremetal: Bug 1801238: Pull data from ironic inspector and annotate BareMetalHost #3591

Merged: 1 commit merged into openshift:master on May 15, 2020

Conversation

@imain (Contributor) commented May 13, 2020

Right now we don't get any introspection data about the masters
because they are provisioned by the bootstrap node and the information
is lost. This patch adds a script to pull the introspection data
from ironic running on the bootstrap and places it in the BareMetalHost CRs
for the masters. This is done via an annotation as we cannot directly
update the Status in BMH. The annotation is picked up by the BMO
and added to the BMH CRs.

There are several bugs filed about this:

openshift-metal3/dev-scripts#917
https://bugzilla.redhat.com/show_bug.cgi?id=1801238
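For orientation, a minimal sketch of that flow (not the actual script: the real implementation uses a helper from the baremetal-operator image to convert the raw inspection data, and the inspector data URL and the node-to-BMH name mapping here are assumptions):

export KUBECONFIG=/etc/kubernetes/kubeconfig

# For each node ironic knows about on the bootstrap, fetch its stored
# introspection data from ironic-inspector and copy it onto the matching
# BareMetalHost CR as a status annotation, which the BMO then picks up.
for uuid in $(curl -s http://localhost:6385/v1/nodes | jq -r '.nodes[].uuid'); do
    name=$(curl -s "http://localhost:6385/v1/nodes/$uuid" | jq -r '.name')
    HARDWARE_DETAILS=$(curl -s "http://localhost:5050/v1/introspection/$uuid/data" | jq '{hardware: .}')
    oc annotate --overwrite -n openshift-machine-api baremetalhosts "$name" \
        'baremetalhost.metal3.io/status'="$HARDWARE_DETAILS"
done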

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 13, 2020
@imain imain force-pushed the installer_inspection branch 3 times, most recently from bbf11a7 to be75551 Compare May 13, 2020 19:43
@imain (Contributor, Author) commented May 13, 2020

/assign @hardys

@stbenjam (Member) left a comment:

This looks great! A couple small comments

Shellcheck has a few nits:

./data/data/bootstrap/baremetal/files/usr/local/bin/master-bmh-update.sh:3:1: note: Not following: /usr/local/bin/release-image.sh was not specified as input (see shellcheck -x). [SC1091]
./data/data/bootstrap/baremetal/files/usr/local/bin/master-bmh-update.sh:27:15: note: Double quote to prevent globbing and word splitting. [SC2086]
./data/data/bootstrap/baremetal/files/usr/local/bin/master-bmh-update.sh:28:49: note: Double quote to prevent globbing and word splitting. [SC2086]
./data/data/bootstrap/baremetal/files/usr/local/bin/master-bmh-update.sh:41:15: note: Double quote to prevent globbing and word splitting. [SC2086]
./data/data/bootstrap/baremetal/files/usr/local/bin/master-bmh-update.sh:49:34: note: Double quote to prevent globbing and word splitting. [SC2086]
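For reference, the SC2086 notes are the usual quoting nits; wrapping the expansions in double quotes clears them, e.g. (illustrative only, not the actual script lines):

curl -s http://localhost:6385/v1/nodes/$node     # SC2086: unquoted expansion, word splitting/globbing possible
curl -s "http://localhost:6385/v1/nodes/$node"   # quoted, warning goes away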

@imain (Contributor, Author) commented May 13, 2020

I don't understand that shellcheck failure. Not sure how to fix it..
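(For reference, SC1091 just means shellcheck isn't following the sourced file; it can be silenced with a directive above the source line, which is what the merged script ends up doing per the excerpt quoted further down, or by running shellcheck with -x:)

# shellcheck disable=SC1091   # release-image.sh isn't visible to shellcheck
. /usr/local/bin/release-image.sh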

@stbenjam (Member):

One thing I noticed is that master-bmh-update doesn't seem to start on the bootkube; I didn't dig into why, but it just shows as 'inactive'.

@imain imain force-pushed the installer_inspection branch 2 times, most recently from 2e0a558 to 472dcd7 Compare May 14, 2020 18:24
@imain imain changed the title WIP: Baremetal: Pull data from ironic inspector and annotate BareMetalHost Baremetal: Pull data from ironic inspector and annotate BareMetalHost May 14, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 14, 2020
@stbenjam (Member):

/label platform/baremetal

@openshift-ci-robot openshift-ci-robot added the platform/baremetal IPI bare metal hosts platform label May 14, 2020
@imain imain force-pushed the installer_inspection branch 4 times, most recently from b10994a to 73c52db Compare May 14, 2020 19:26
@imain (Contributor, Author) commented May 14, 2020

/assign @abhinavdahiya

Comment on lines +336 to +337
"ironic.service": {},
"master-bmh-update.service": {},
Contributor:

Can there be some wants/depends_on relationship between these 2 services that allows us to not activate each individual service? or are these actually 2 independent services?

Contributor (Author):

There are 2 independent .service files for them if that's what you mean. I did try it with a dependency but the service was just marked as disabled and wouldn't start.

Member:

master-bmh-update talks to ironic, but they are independent with different goals. I think it's more obvious to leave both explicitly enabled


# Wait for 3 masters to appear.
while [ "$(curl -s http://localhost:6385/v1/nodes | jq '.nodes[] | .uuid' | wc -l)" -lt 3 ]; do
echo waiting for at least 3 masters.
Member:

Sorry, one more thought here: 3 might be a bad assumption. Single master installs have historically been possible and used by developers with minimal hardware. It's unsupported though.

We could maybe count how many BareMetalHost CRs there are with master in the name, or read the actual value from /opt/openshift/manifests/cluster-config.yaml.

@abhinavdahiya might have thoughts here.
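A sketch of that counting suggestion, for illustration (the "-master-" naming convention and the availability of the BMH CRs at this point in bootstrap are assumptions):

# Count the master BareMetalHost CRs created by the installer instead of
# hard-coding 3, then wait for that many nodes to show up in ironic.
MASTER_COUNT=$(oc get baremetalhosts -n openshift-machine-api -o name | grep -c -- '-master-')
while [ "$(curl -s http://localhost:6385/v1/nodes | jq '.nodes[] | .uuid' | wc -l)" -lt "$MASTER_COUNT" ]; do
    echo "waiting for $MASTER_COUNT masters."
    sleep 10
done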

Contributor (Author):

or just check even for 1? If there's 1 it will not be 'active'.

Contributor (Author):

I'll test it and see if that works

@stbenjam (Member), May 14, 2020:

Yeah, that's probably good enough. I worried about a race, but the next step takes at least 10 minutes, so the installer will have created all the nodes by then.

@imain (Contributor, Author), May 15, 2020:

Yes, ok. I tested it with -lt 1, meaning we just need 1 node to come up and it continues. I also didn't use any sleep in the loop, so it queried constantly to check for possible race conditions. As soon as a node comes up its state is 'enroll', and it then moves through the introspection states, so I think we are safe race-wise looking for 1 node and then moving on to the next check, which is based on all node states being 'active'.
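Roughly, the resulting flow is (a sketch based on this comment and the fragments quoted elsewhere in the thread; the exact jq expression for the 'active' check is an assumption):

# Wait until ironic reports at least one node (no sleep, as described above)...
while [ "$(curl -s http://localhost:6385/v1/nodes | jq '.nodes[] | .uuid' | wc -l)" -lt 1 ]; do
    echo "waiting for a master to appear."
done

# ...then wait until every registered node reaches the 'active' provision state.
while curl -s http://localhost:6385/v1/nodes/detail \
        | jq -e '[.nodes[] | select(.provision_state != "active")] | length > 0' > /dev/null; do
    echo "waiting for all masters to be deployed."
    sleep 10
done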

@stbenjam (Member) commented May 14, 2020

Got a good local test. You can test with the baremetal-operator from git if you check it out to $HOME, replace "base" with "ubi8" in the dockerfile and set this:

KNI_INSTALL_FROM_GIT=true
BAREMETAL_OPERATOR_LOCAL_IMAGE=file:///home/notstack/baremetal-operator
OPENSHIFT_RELEASE_IMAGE=registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-05-14-021132

MY MASTER MACHINES HAVE NODES!!

$ oc get machines -n openshift-machine-api -o wide                           
NAME                    PHASE         TYPE   REGION   ZONE   AGE     NODE                                 PROVIDERID   STATE
ostest-master-0         Running                              49m     master-0.ostest.test.metalkube.org                
ostest-master-1         Running                              49m     master-1.ostest.test.metalkube.org                
ostest-master-2         Running                              49m     master-2.ostest.test.metalkube.org     

The status has all the IP info and node ref:

status:
  addresses:
  - address: fd00:1101::2c
    type: InternalIP
  - address: fd2e:6f44:5dd8:c956::14
    type: InternalIP
  - address: master-0.ostest.test.metalkube.org
    type: Hostname
  - address: master-0.ostest.test.metalkube.org
    type: InternalDNS
  lastUpdated: "2020-05-14T20:31:33Z"
  nodeRef:
    kind: Node
    name: master-0.ostest.test.metalkube.org
    uid: 3f3bbf75-d6cc-4111-9016-d034b16ded87

@imain (Contributor, Author) commented May 14, 2020

🤘 ;-)

Baremetal: Bug 1801238: Pull data from ironic inspector and annotate BareMetalHost

Right now we don't get any introspection data about the masters
because they are provisioned by the bootstrap node and the information
is lost.  This patch adds a script to pull the introspection data
from ironic running on the bootstrap and places it in the BareMetalHost CRs
for the masters.  This is done via an annotation as we cannot directly
update the Status in BMH.  The annotation is picked up by the BMO
and added to the BMH CRs.

There are several bugs filed about this:

openshift-metal3/dev-scripts#917
https://bugzilla.redhat.com/show_bug.cgi?id=1801238
@stbenjam (Member):

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label May 15, 2020
# shellcheck disable=SC1091
. /usr/local/bin/release-image.sh

export KUBECONFIG=/etc/kubernetes/kubeconfig

done

# Wait for a master to appear.
while [ "$(curl -s http://localhost:6385/v1/nodes | jq '.nodes[] | .uuid' | wc -l)" -lt 1 ]; do
Contributor:

Also, do we expect the RHCOS image to ship jq? That's odd.

Member:

Why’s that odd? It’s there.

# the BareMetalHost CRs as annotations, which BMO then picks up.
HARDWARE_DETAILS=$(podman run --quiet --net=host \
--rm \
--name baremetal-operator \
Contributor:

Do you need to use a specific name? There are times when starting a named container will fail if the previous one was not cleaned up.

"${BAREMETAL_OPERATOR_IMAGE}" \
http://localhost:5050/v1 "$node" | jq '{hardware: .}')

oc annotate --overwrite -n openshift-machine-api baremetalhosts "$name" 'baremetalhost.metal3.io/status'="$HARDWARE_DETAILS"
Contributor:

what if you get an error when trying to update, like maybe machine-api isn't registered yet?

@stbenjam (Member), May 15, 2020:

The first thing we check at the start of the script is the presence of the CRs.
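That initial guard could look something like this (a sketch only; the actual script's check may differ):

# Don't proceed until the BareMetalHost CRD is registered and the CRs created
# by the installer can be listed.
until oc get baremetalhosts -n openshift-machine-api > /dev/null 2>&1; do
    echo "waiting for BareMetalHost CRs to be available."
    sleep 10
done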

@openshift-ci-robot:

@imain: The following tests failed, say /retest to rerun all failed tests:

Test name                      Commit   Rerun command
ci/prow/e2e-metal-ipi          73c52db  /test e2e-metal-ipi
ci/prow/e2e-ovirt              fb6b325  /test e2e-ovirt
ci/prow/e2e-openstack          fb6b325  /test e2e-openstack
ci/prow/e2e-aws-scaleup-rhel7  fb6b325  /test e2e-aws-scaleup-rhel7
ci/prow/e2e-libvirt            fb6b325  /test e2e-libvirt

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@abhinavdahiya (Contributor):

/approve
This script looks very brittle to me.

This is done via an annotation as we cannot directly
update the Status in BMH.

To update the status subresources, you need to use the k8s API clients and not kubectl/oc, and I think it would be worth it, IMO, to move this code to Go if you can instead of bash.

Right now we don't get any introspection data about the masters
because they are provisioned by the bootstrap node and the information
is lost.

I'm really confused about how the data gets lost. Couldn't we get the same data from the ironic service when the controller is running on the control plane? It's just not clear to me.

@openshift-ci-robot:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 15, 2020
@openshift-merge-robot openshift-merge-robot merged commit aecb920 into openshift:master May 15, 2020
@openshift-ci-robot:

@imain: All pull requests linked via external trackers have merged: openshift/installer#3591. Bugzilla bug 1801238 has been moved to the MODIFIED state.

In response to this:

Baremetal: Bug 1801238: Pull data from ironic inspector and annotate BareMetalHost


@stbenjam (Member) commented May 15, 2020

i'm really confused how the data get's lost. couldn't we get the same data from ironic service when the controller is running on the control-plane?

We would have to run inspection again.

Plans to collect the data in the cluster from ironic were outright rejected because of the overlap with node-exporter, even though it’s not the same data and we need the data to even make the node/machine link.

The ironic database is MySQL, and we treat it as ephemeral; it isn't migrated to the cluster. That works fine for everything except inspection information.

We are aware you can use the k8s API. But the metal3 community explicitly created the annotation for this purpose to make it easy to move hardware between clusters. I don’t see why we shouldn’t use it here as the use case is very similar.

Trust me when I say we’ve invested a significant amount of time in this problem and we had many other better solutions that didn’t pan out. This was the best fix to address the problem in the short term.

@hardys (Contributor) commented May 15, 2020

i'm really confused how the data get's lost. couldn't we get the same data from ironic service when the controller is running on the control-plane? It's just not clear to me.

@abhinavdahiya the introspection data can only be collected by rebooting the nodes, which normally happens right before deployment - the same ramdisk we use to deploy RHCOS is what collects this data.

For the masters, this happens via terraform, using ironic on the bootstrap VM, so by the time the masters are all up and running, the bootstrap VM is gone and we can't reboot/redeploy the controlplane just to collect the introspection data.

So the ironic service running on the cluster doesn't have this data for the masters (the master nodes are "adopted" by the cluster ironic, which doesn't do any introspection). This PR wires the data from the bootstrap VM into the BMH objects created by the installer for the masters.

@stbenjam (Member):

/cherry-pick release-4.4

@openshift-cherrypick-robot

@stbenjam: new pull request created: #3656

In response to this:

/cherry-pick release-4.4


hardys pushed a commit to hardys/installer that referenced this pull request Jul 28, 2020
This was mentioned in openshift#3591 but not addressed during the review
before merging
hardys pushed a commit to hardys/installer that referenced this pull request Jul 28, 2020
This was also mentioned in openshift#3591 as a potential issue in the case
where a restart of the script/unit results in a stale container
still running with a conflicting name.

Since we specify --rm the container is removed so having the
human readable name for debugging isn't an issue.
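
Roughly, the adjusted invocation (a sketch based on the fragment quoted earlier in the review; flags not shown in that excerpt are also omitted here):

# With --rm and no fixed --name, a restarted unit can't collide with a stale
# container left over from a previous run.
HARDWARE_DETAILS=$(podman run --quiet --net=host --rm \
    "${BAREMETAL_OPERATOR_IMAGE}" \
    http://localhost:5050/v1 "$node" | jq '{hardware: .}')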
Labels: approved, bugzilla/severity-high, bugzilla/valid-bug, lgtm, platform/baremetal