Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kiali working with no Istiod access #5642

Merged
merged 56 commits into from
Jan 13, 2023
Merged

Kiali working with no Istiod access #5642

merged 56 commits into from
Jan 13, 2023

Conversation

josunect
Copy link
Contributor

@josunect josunect commented Nov 24, 2022

Proposal as a first version to have Kiali running with no Istiod.

Details in: https://docs.google.com/document/d/1XjkodS5Uu14C_1l3-j7iNaWfLufQFycY-cKMlagIxyc

Changes:

  • Not perform some calls if there is no access to istiod. They were making the system really slow (When istiod was not available):
    • get pod proxy status for workloads
    • Get registry services for services
    • Fetch registry services
  • Istio Config: Show a message indicating that Istiod is not available

Some tests to review performance: Frontend tests were showing a degraded performance so a new commit was sent to internally cache the istioAccess value

Fixes #5626

@Xunzhuo
Copy link
Member

Xunzhuo commented Nov 24, 2022

Great! If you want some reviews, ping me after this is ready! @josunect

@josunect
Copy link
Contributor Author

Great! If you want some reviews, ping me after this is ready! @josunect

Thanks, @Xunzhuo ! It would be great! I'll let you know.

@josunect josunect self-assigned this Nov 25, 2022
@josunect josunect added the enhancement This is the preferred way to describe new end-to-end features. label Nov 25, 2022
@josunect
Copy link
Contributor Author

josunect commented Nov 25, 2022

Changes when istiod is not available:

Istio no available messages

This remains the same. Should we change it to just a warning instead of error?
image

Service mesh version unknown

image

Workload status warning because no proxy pod sync

image

Istio Config not available

Show a message instead
image

No error requests
There are no error or time out requests.

Running locally with no istiod
I had to add some code in kubernetes/token to run in local environment. But not sure yet what would be the best solution to not have to modify the code all the time.

Probably it would require some more testing but I think it is ready for feedback and reviews

@josunect josunect marked this pull request as ready for review November 25, 2022 15:50
Copy link
Contributor

@nrfox nrfox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking up this investigation!

Comment on lines 180 to 189
func (in *K8SClient) HasIstioAccess() bool {
componentStatus, err := in.CanConnectToIstiod()
if err != nil {
return false
}
if len(componentStatus) == 0 {
return false
}
return true
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be better as a config option for the registry where you can enable/disable the registry altogether if you'd like rather than having it just be a fallback option: https://github.com/kiali/kiali/blob/master/config/config.go#L208.

@hhovsepy may be looking into this but you should be able to get the istio config from the kube api rather than the registry when there's no istiod access.

What will be the impact on the controlplane namespace card once there is no direct istiod access? It might still be possible to get some details, like controlplane revision, about the controlplane from the injection webhook since this may be present in the cluster even if the controlplane is not.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Funny, I was just looking at the same thing. I also think it makes sense to configure Kiali for this scenario. By default Kiali would expect to be able to connect to istiod's debug endpoint and proceed normally. If it can't connect then that would be a FAILURE scenario for the control plane card. If configured to ignore istiod then it would be a DEGRADED scenario for the control plane card (even if it is accessible we won't connect, this could be for performance reasons, or whatever).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

draft solution of retrieving configs when istiod is not available #5649

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

draft solution of retrieving configs when istiod is not available #5649

Thanks, @hhovsepy !

Copy link
Collaborator

@jshaughn jshaughn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good progress @josunect , I think I agree with @nrfox about making this a more explicit configuration option for Kiali (see comments). What do you think?

@josunect
Copy link
Contributor Author

Good progress @josunect , I think I agree with @nrfox about making this a more explicit configuration option for Kiali (see comments). What do you think?

Thanks, @jshaughn and @nrfox for the review! I'll approach the changes and submit.

I've though in do it this way as it can be more robust in the sense that it can work even if Istiod is or is not available, but it is true that it is not possible to know the reason, can be intentional or not - a failure. So it make sense to do it more explicit as part of the configuration.

@josunect josunect added requires operator PR It requires update in operator code requires helm chart PR labels Nov 29, 2022
@josunect
Copy link
Contributor Author

josunect commented Nov 29, 2022

I've also added the following change:

image

View for CP card if Istio API is disabled:
image

And enabled:

image

Copy link
Member

@Xunzhuo Xunzhuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the backend codes, did not find any issues. LGTM. Wait for other reviews in front part :)

@josunect
Copy link
Contributor Author

I have reviewed the backend codes, did not find any issues. LGTM. Wait for other reviews in front part :)

Thank you, @Xunzhuo !

@hhovsepy
Copy link
Contributor

Is it supposed, that the case when 'istiod' is not running/accessible and the config "istio_api_enabled: true" will be handled? Because it throws "open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory" error message while selecting "istiod" workload.

@josunect
Copy link
Contributor Author

josunect commented Dec 1, 2022

Is it supposed, that the case when 'istiod' is not running/accessible and the config "istio_api_enabled: true" will be handled? Because it throws "open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory" error message while selecting "istiod" workload.

In that case, it won't be handled - That should be the default behavior. But, I will reproduce it.

@hhovsepy hhovsepy self-requested a review December 2, 2022 10:19
@hhovsepy
Copy link
Contributor

hhovsepy commented Dec 2, 2022

Facing a different behavior on OCP cluster, gateway timeouts on all requests:
Screenshot from 2022-12-02 13-55-40

Many errors in Kiali pod log:
1202 12:56:50.277723 1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.24.2/tools/cache/reflector.go:167: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:istio-system:kiali-service-account" cannot list resource "pods" in API group "" at the cluster scope E1202 12:56:50.277759 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.24.2/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:istio-system:kiali-service-account" cannot list resource "pods" in API group "" at the cluster scope W1202 12:56:51.683977 1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.24.2/tools/cache/reflector.go:167: failed to list *v1beta1.AuthorizationPolicy: authorizationpolicies.security.istio.io is forbidden: User "system:serviceaccount:istio-system:kiali-service-account" cannot list resource "authorizationpolicies" in API group "security.istio.io" at the cluster scope

config/config.go Outdated Show resolved Hide resolved
business/workloads.go Outdated Show resolved Hide resolved
business/istio_validations.go Outdated Show resolved Hide resolved
@jmazzitelli
Copy link
Collaborator

Many errors in Kiali pod log: `1202 12:56:50.277723 1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.24.2/tools/cache/reflector.go:167: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:istio-system:kiali-service-account" cannot list resource "pods" in API group "" at the cluster scope

@hhovsepy What is your deployment.accessible_namespaces set to? Is it ['**'] or a list of namespaces?

That error message says it wants permission to list pods AT THE CLUSTER SCOPE. Which means it expects (for some reason) a ClusterRole. The Kiali SA only gets bound to a ClusterRole when accessible_namespaces is set to ['**']. Otherwise, there are just Roles for each namespace listed in accessible_namespaces.

We never before required ClusterRole access to look at pods. So this error message seems strange. We should avoid this requirement if possible.

@hhovsepy
Copy link
Contributor

hhovsepy commented Dec 2, 2022

When setting accessible_namespaces: [ "**" ], Kiali pod fails to get ready, but this is not PR related:
W1202 17:07:02.688572 1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.24.2/tools/cache/reflector.go:167: failed to list *v1.Deployment: deployments.apps is forbidden: User "system:serviceaccount:istio-system:kiali-service-account" cannot list resource "deployments" in API group "apps" at the cluster scope E1202 17:07:02.688614 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.24.2/tools/cache/reflector.go:167: Failed to watch *v1.Deployment: failed to list *v1.Deployment: deployments.apps is forbidden: User "system:serviceaccount:istio-system:kiali-service-account" cannot list resource "deployments" in API group "apps" at the cluster scope W1202 17:07:03.229890 1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.24.2/tools/cache/reflector.go:167: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:serviceaccount:istio-system:kiali-service-account" cannot list resource "statefulsets" in API group "apps" at the cluster scope E1202 17:07:03.229935 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.24.2/tools/cache/reflector.go:167: Failed to watch *v1.StatefulSet: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:serviceaccount:istio-system:kiali-service-account" cannot list resource "statefulsets" in API group "apps" at the cluster scope

@josunect
Copy link
Contributor Author

It will need to test further after the merge. I see some errors like:

2023-01-10T15:01:54Z WRN Error getting proxy status from istiod. Proxy status may be stale. Err: unable to proxy Istiod pods. Make sure your Kubernetes API server has access to the Istio control plane through 8080 port

2023-01-10T15:03:54Z ERR Failed to call Istiod endpoint /debug/syncz error: unable to proxy Istiod pods. Make sure your Kubernetes API server has access to the Istio control plane through 8080 port

@josunect josunect added the docs It requires changes on kiali.io label Jan 10, 2023
@josunect josunect requested a review from nrfox January 11, 2023 08:01
nrfox
nrfox previously approved these changes Jan 11, 2023
frontend/src/components/IstioWizards/ServiceWizard.tsx Outdated Show resolved Hide resolved
Co-authored-by: Nick Fox <nick@nrfox.com>
Copy link
Contributor

@nrfox nrfox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@josunect josunect merged commit 7e95fbc into kiali:master Jan 13, 2023
@josunect josunect deleted the issue5626 branch January 13, 2023 06:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs It requires changes on kiali.io enhancement This is the preferred way to describe new end-to-end features. requires operator PR It requires update in operator code
Projects
Development

Successfully merging this pull request may close these issues.

Investigate the idea of Kiali working with no istiod connection
6 participants