
We're revamping the GKE Services and Ingress UI. What would you like to see? #1073

Closed
mark-church opened this issue Apr 10, 2020 · 14 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@mark-church

Hi everybody, the GKE Networking team is currently working on a revamp of the GKE Services and Ingress UI. We'd like to know: what would you like to see in the UI?

This area of the UI consists of a couple of different pages. Feel free to be specific about things you don't like or would like to see.

  • Services list page
  • Ingress list page
  • Services detail page
  • Ingress detail page
  • Ingress creation flow

[screenshots of the current Services and Ingress UI pages]

Some specific questions:

  • Most interaction with K8s occurs via CLI or through CI/CD. If you use the GKE Services & Ingress UI, what do you use it for?
  • What information would you like to see presented in this Ingress & Services space that isn't shown today?
  • How could this UI be made better for your use-cases?
  • Have some ideas about what could make the UX/UI really awesome? Share them!
@frank-berlin

On the list pages, the "Rows per page" setting should be remembered, or have a user-changeable default.

@CoderPraBhu

  1. Most interaction with K8s occurs via CLI. I use the GKE Services & Ingress UI to verify that everything is green, and to understand and explore the connections between the ingress controller, URL maps, and forwarding rules. It also helps shed some light on what else is connected to a particular ingress.

  2. On the Ingress details page, I only see hyperlinks for the HTTP target proxy and HTTP forwarding rule, even though the following annotations are present:

ingress.kubernetes.io/https-forwarding-rule: k8s-fws-default-my-ingress--drandom3
ingress.kubernetes.io/https-target-proxy: k8s-tps-default-my-ingress--drandom3

  3. When issue 1075 (HTTP to HTTPS redirection) is fixed, which I hope is soon, maybe this information should be highlighted on the Ingress details page too.
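Those HTTPS annotations can also be inspected from the CLI. A minimal sketch, assuming a working kubeconfig; `my-ingress` and the `default` namespace are placeholders:

```shell
# Dump all annotations on an Ingress (placeholder name and namespace).
kubectl get ingress my-ingress -n default -o jsonpath='{.metadata.annotations}'

# Or grep just the HTTPS-related entries out of the YAML output.
kubectl get ingress my-ingress -n default -o yaml | grep 'https-'
```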

@omaskery

Apologies if this is out of scope (also I'm quite new to these technologies, so may be doing something silly 😁), but is it possible for you to add some way to see the gce-ingress logs on GKE as part of this work?

When using the GKE-provided gce-ingress installation, it feels like a black box when I make a mistake, because the gce-ingress controller runs inside the GKE black box.

(We're having a lot of trouble currently understanding the interaction between gce-ingress and GCE load balancers 😄)

@mark-church
Author

@omaskery and @CoderPraBhu thanks for the feedback. Converging and showing linkages between GKE and GCE LB resources is one of the biggest goals in the UI overhaul. Here are some of the things we are thinking of adding links for on the Ingress page:

  • HTTP access logs - these can be viewed with the following query in Cloud Logging. However, it's not immediately obvious to users how this is done, so we think a simple button that takes you there will help.
resource.type="http_load_balancer" resource.labels.url_map_name="k8s2-um-xxx"
  • LB metrics - Cloud Monitoring automatically generates a dashboard of metrics for GCE LBs. You can get to it with the following link. Again, this is not obvious to Ingress users so we are aiming to provide automatic links to this page from the GKE resources.
https://pantheon.corp.google.com/monitoring/dashboards/resourceDetail/l7_lb_rule,project_id:<project>,url_map_name:<url-map>?project=<project>&timeDomain=1h
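That same Cloud Logging query can be run from the CLI as well. A sketch, assuming `gcloud` is authenticated against the right project; the `k8s2-um-xxx` URL map name is a placeholder from the query above:

```shell
# Read recent HTTP(S) load balancer access logs for a given URL map.
# Replace k8s2-um-xxx with your Ingress's real URL map name.
gcloud logging read \
  'resource.type="http_load_balancer" resource.labels.url_map_name="k8s2-um-xxx"' \
  --limit=10 \
  --format=json
```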

@omaskery are you installing gce-ingress manually in the cluster? Is there any reason why you're not using the GKE-managed ingress controller?

Please keep the feedback coming, this is very helpful!

@omaskery

@mark-church Hi, thanks for the reply!

We're using the GKE-managed ingress controller, but my understanding is that there is no way to access the logs from that controller. Does that sound correct? (I'd love to be wrong about this! 😁)

As a result, when we see unexpected behaviour (probably as a result of us making mistakes when applying k8s resources), it is hard to understand the sequence of events that are causing the behaviour.

This is in contrast to some of the (admittedly outdated) documentation in this repo, which points out that diagnosing some issues is achieved by looking at the log output from the gce-ingress controller (which we can't access!).

The lack of visibility into how the controller is behaving has caused us to consider deploying gce-ingress manually, at additional cost, though we haven't yet resorted to that!

@mcfedr
Contributor

mcfedr commented Apr 29, 2020

hi,

I really only deploy to k8s using CI, but the UI is really helpful for debugging. With the Ingresses/Services section it's most often "why is this URL not working?":

  • very often it turns out the load balancer isn't set up yet and is still in a pending state
  • or an SSL cert change hasn't propagated
  • or a health check is failing, often because the wrong health check URL is set, or a service is responding with a 204 instead of the 200 that GCLB requires

So any of this information that could be displayed here would be super helpful. You cannot see pending changes anywhere at the moment, and the health check issues take a lot of digging to work out what's going on.
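For the health-check case, one way to dig in today is through gcloud. A sketch, assuming an authenticated project; the backend service name is a placeholder you would look up from the Ingress annotations or the list command:

```shell
# List the backend services the LB created (GKE-managed ones start with k8s-).
gcloud compute backend-services list

# Show per-endpoint health for one of them (placeholder name).
gcloud compute backend-services get-health k8s-be-30000--example --global
```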

@iiro

iiro commented Apr 29, 2020

Totally agree with the previous feedback about the black box of what's really happening inside GCE when you're altering your built-in Ingress. I guess it's not only a UI thing, as I assume the same happens with the Cloud SDK, but this is the worst feature of the built-in Ingress: "somewhere the magic just happens" (if it happens), and you have no idea what's really going on...

@icco

icco commented Apr 30, 2020

If I have lots of certs on an ingress, it's hard to see which ones are there. Also I would love QPS graphs on services and/or ingresses.

@mark-church
Author

@icco we are currently working on integrating QPS and other metrics for Ingress into the UI. Did you know they can be seen by going to /monitoring/dashboards/resourceDetail/l7_lb_rule,project_id:${project},url_map_name:${ingress-url-map}? You can get the URL map name with:

kubectl get ingress ${ingress-name} -o yaml | grep url-map
    ingress.kubernetes.io/url-map: k8s2-um-4trmo3oq-default-hugo-ingress-8wpk3ncv
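If grep feels fragile, the annotation value can also be pulled out directly with jsonpath. A sketch; the Ingress name is a placeholder, and note that the dots in the annotation key have to be escaped inside the bracket notation:

```shell
# Print only the url-map annotation value for a given Ingress (placeholder name).
kubectl get ingress my-ingress \
  -o jsonpath="{.metadata.annotations['ingress\.kubernetes\.io/url-map']}"
```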

This dashboard has lots of stats such as latency, QPS, response codes, and throughput. It's not very easy to understand which URL map is connected with which Ingress, though, so we are looking at ways to make this relationship more apparent and easier to reach from the Ingress page. I'd be interested to hear which types of metrics you find most useful.

[screenshot: load balancer metrics dashboard]

@iiro @omaskery we are looking into making the Ingress events more verbose so that the ingress controller is not such a black box. Some of the things we are thinking about surfacing: letting you know when and which GCE resource (such as a forwardingRule) has been created, reconciled, or deleted. What other kinds of internal state about the controller and its operations would you find useful?
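In the meantime, the events the controller already emits can be inspected from the CLI. A sketch, assuming a working kubeconfig; the Ingress name is a placeholder:

```shell
# Show an Ingress's details, including recent events from the controller.
kubectl describe ingress my-ingress

# Or filter the event stream for just that Ingress (placeholder name).
kubectl get events \
  --field-selector involvedObject.kind=Ingress,involvedObject.name=my-ingress
```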

cc @bowei @freehan @rramkumar1

@omaskery

omaskery commented Jul 5, 2020

@mark-church is it possible to expose the log output of the GCE ingress controller? Then any existing documentation for troubleshooting the ingress controller would potentially be usable.

In particular, it would be useful to know about errors and failures to reconcile. I'm afraid it's been a while since I was working in this area, so I can't be more helpful. Apologies for being so vague :(

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 3, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 2, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
