New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add "kubectl validate" command to do a cluster health check. #6597
Conversation
I think Zach is in a better position to review this PR. |
|
||
// ClusterValidationState retrieves and parses the server's self-monitored cluster state. | ||
func (c *Client) ClusterValidationState() (*api.ServerStatusList, error) { | ||
body, err := c.Get().AbsPath("/validate").Do().Raw() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a TODO to make this hit a versioned validate endpoint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this return an api.Status
? If not then DoRaw()
might be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is DoRaw() different from Do().Raw()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I discovered a bug where Do().Raw()
does the wrong thing when the non-apiStatus
JSON response includes a status
field as is the case with Elasticsearch. I refactored it into two distinct phases which first processes the response without assuming it is an api.Status
(this is done by DoRaw()
) and then interpret it as an api.Status
which is done by Do()
which now calls DoRaw()
. I think we should replace all occurrences of Do().Raw()
with DoRaw()
and remove the Raw()
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added TODO and changed to DoRaw().
Since this makes changes to the api and @zmerlynn is out this week, I'm going to bounce this over to @lavalamp for server-side review. @jlowdermilk might also want to take a look at the kubectl changes. |
f410b60
to
0b9e39c
Compare
Rebased and pushed the changes. PTAL. Thank you. |
@@ -128,7 +128,7 @@ func TestList(t *testing.T) { | |||
roundTripSame(t, item) | |||
} | |||
|
|||
var nonRoundTrippableTypes = util.NewStringSet("ContainerManifest", "ContainerManifestList") | |||
var nonRoundTrippableTypes = util.NewStringSet("ContainerManifest", "ContainerManifestList", "ServerStatus") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New objects should never appear in this list. You have a bug in your conversions if you need this.
I have pointed out some problems. Reassigning to @bgrant0607 to approve an API change. |
@@ -65,6 +65,7 @@ Find more information at https://github.com/GoogleCloudPlatform/kubernetes.`, | |||
cmds.AddCommand(NewCmdClusterInfo(f, out)) | |||
cmds.AddCommand(NewCmdApiVersions(f, out)) | |||
cmds.AddCommand(NewCmdVersion(f, out)) | |||
cmds.AddCommand(NewCmdValidateCluster(f, out)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have cluster-info
. How about cluster-validate
? validate
is just too generic, and we're likely to want to use that for validating user objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I take it back. If this is just component status, we should call it component-status
.
If we versioned the APIs, it could be:
kubectl get cs
More context please? The PR description and cited issue say almost nothing. |
This is going to be a breaking API change. Have you asked on kubernetes-dev whether anyone is using the current /validate API? |
@@ -1767,3 +1768,19 @@ func AddToNodeAddresses(addresses *[]NodeAddress, addAddresses ...NodeAddress) { | |||
} | |||
} | |||
} | |||
|
|||
// ServerStatus (and ServerStatusList) holds the cluster validation info. | |||
type ServerStatus struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the json tags here. This is the internal rep.
5aa6371
to
9275dd2
Compare
@bgrant0607 Ping. Does the object model seem ok? |
@fabioy Will take a look tonight. In the future, feel free to ping me by direct email. I'm drowning in Github notifications. |
// Returns the list of component status. Note that the label and field are both ignored. | ||
// Note that this call doesn't support labels or selectors. | ||
func (rs *REST) List(ctx api.Context, label labels.Selector, field fields.Selector) (runtime.Object, error) { | ||
servers := rs.GetServersToValidate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually I want the apiserver to be a dumb, CRUDy server. Eventually, a controller would need to implement all the business logic and post status back.
We'll need a registration mechanism for components. Right now, we sort of have a fixed set of components on one node, but that's about to change Real Soon Now. Already we have "add ons" that run on the cluster but which are required, or will be (DNS), and full API plugins are planned. The components are all being converted to static pods as we speak. Others are working on HA, so components will be replicated. Full self-hosting of those components is the eventual goal.
What should this return if there are 5 etcd instances, 3 apiservers, 2 schedulers, and 2 controller managers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wishing we had the experimental API prefix discussed in #6363 and elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"What should this return if there are 5 etcd instances, 3 apiservers, 2 schedulers, and 2 controller managers?"
The call is to the current master api server, which tries it best to return the status of the currently active components. It seems irrelevant to list the "stand-by" components. When HA become available, this "extended" reporting could be done under an option flag.
9275dd2
to
c6ef2dd
Compare
Addressed comments, sync'ed, rebased and pushed changes. PTAL. Thank you! |
LGTM. This PR has gone on long enough. This looks good enough to merge as a first round (and its certainly an improvement over what was there previously) To be honest, blocking this PR on making it into a real API seems kind of like Overkill to me, but I was out, and now its done so merge and move on... |
Add "kubectl validate" command to do a cluster health check.
AFAICT, this PR had unresolved feedback. Maybe changes weren't pushed? |
It also demonstrates another bug was introduced into verify-description.sh, since this should be failing it: |
|
||
type ComponentCondition struct { | ||
Type ComponentConditionType `json:"type" description:"the type of condition"` | ||
Status ConditionStatus `json:"status" description:"the status of this condition"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file shouldn't have description tags. Description tags need to be on all non-inline fields in the versioned types.go files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, I have the standard presubmit hooks. Filed issue #7080 for the "verify-description.sh" error. I'll prepare a separate PR and address these.
I've created a separate PR to resolve these issues. PR #7082. Thank you. |
Is this feature still work? how to use kubectl validate? In k8s 1.9.9, kubectl validate command not found |
This is the first step to resolving issue #5946.