Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix machine-api resources #118

Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 15 additions & 4 deletions collection-scripts/gather
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,10 @@
# Resource List
resources=()

# Cluster Version Information
# machine-api resources
machineresources=()

# Cluster Version Information
resources+=(clusterversion ns/openshift-cluster-version)

# Operator Resources
Expand All @@ -12,8 +15,11 @@ resources+=(clusteroperators)
# Certificate Resources
resources+=(certificatesigningrequests)

# Machine/Node Resources
resources+=(nodes machines machineconfigs machineconfigpools)
# MCO/Node Resources
resources+=(nodes machineconfigs machineconfigpools)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a relation ship between these?
IE: If we get a node; do we need to know its machineconfigs or machineconfigpools?

If so - it might be better to put this relationship into to the must-gather binary so every time a node is collected *(or one of the other objects); the proper related objects are collected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't speak for everyone though I believe machineconfigs and machineconfigpools are namespace-less. That's why we need to query for machines separately since they live only under openshift-machine-api namespace.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@runcom FYI ^^

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@ingvagabund ingvagabund Aug 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sferich888 I have a feeling it's impossible to know which machine config was used for a given node just by quering node. Based on posted [1], I don't think node object has any field/label/annotation that can say which endpoint was used. Also, worker nodes have different image than master nodes. Each image knows an endpoint (most likely it's hard-coded in it when it's baked). Due to security reason so you can't pretend to be master when you are worker. Or, am I wrong @runcom @ashcrow ?

[1] https://github.com/openshift/machine-config-operator/blob/master/docs/MachineConfigServer.md#detailed-design

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch has little to do with machineconfigs and everything to do with machines.

Could we automatically collect these resources when collecting nodes? Yes, but if what you're trying to troubleshoot is a machine that never became a node, that's not going to help. We should explicitly collect all these types.


# machine-api resources
machineresources+=(machines machinesets)

# Namespaces/Project Resources
resources+=(ns/default ns/openshift ns/kube-system ns/openshift-etcd)
Expand All @@ -25,13 +31,18 @@ resources+=(storageclasses persistentvolumes volumeattachments)
resources+=(clusternetworks hostsubnets)

# Autoscaler Resources
resources+=(clusterautoscaler machineautoscaler)
machineresources+=(clusterautoscaler machineautoscaler)

# Run the Collection of Resources using must-gather
for resource in ${resources[@]}; do
/usr/bin/openshift-must-gather inspect ${resource}
done

# Run the Collection of MachineAPIResources using must-gather
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why collect this separate? Why not just included this in resources?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sferich888 I'm not sure if it's intended to work as-is, but without this patch, we don't get any machines or machinesets from must-gather utility. Is there a better way? Note, 'machines' is already included in Machine/Node resources, but it doesn't collect any machines (aka, a bug).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to https://docs.openshift.com/container-platform/4.1/machine_management/applying-autoscaling.html the clusterautoscaler is something you would find in the default namespace while the machineautoscaler is something you find out of the openshift-machine-api namespace.

Should we just do some loop or update the loop above to get things from all namespaces?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for resource in ${machineresources[@]}; do
/usr/bin/openshift-must-gather inspect ${resource} --namespace openshift-machine-api
done

# Collect System Audit Logs
/usr/bin/gather_audit_logs

Expand Down