Ambient Mesh - Validate Telemetry #5978

Closed
Tracked by #5476
josunect opened this issue Apr 3, 2023 · 5 comments
Labels
enhancement This is the preferred way to describe new end-to-end features. sub-task Ties an issue to an epic waiting external It requires additional info to progress. For example, it can require a fix in other project.

Comments

@josunect
Contributor

josunect commented Apr 3, 2023

At the moment, there are a couple of issues:

  • It is not possible to see service nodes. In the telemetry data reported by ztunnel, destination_service is unknown:

    sum(istio_tcp_sent_bytes_total{app="ztunnel"}) by (reporter, destination_canonical_service, destination_service, destination_service_name)

    {destination_canonical_service="details", destination_service="unknown", destination_service_name="unknown", reporter="destination"}
    {destination_canonical_service="details", destination_service="unknown", destination_service_name="unknown", reporter="source"}
    {destination_canonical_service="istio-ingressgateway", destination_service="unknown", destination_service_name="unknown", reporter="source"}
    {destination_canonical_service="productpage", destination_service="unknown", destination_service_name="unknown", reporter="destination"}

  • When we create a waypoint proxy, some edges are duplicated as both http and tcp, because ztunnel reports everything as tcp traffic.
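
To make the first symptom easier to check, here is a minimal Python sketch that flags series whose destination_service is "unknown". The function name `unknown_destinations` and the list-of-dicts input shape are assumptions for illustration; the label values are copied from the query output above.

```python
def unknown_destinations(samples):
    """Return sorted (canonical_service, reporter) pairs whose
    destination_service label is the literal string "unknown"."""
    return sorted(
        (s["destination_canonical_service"], s["reporter"])
        for s in samples
        if s.get("destination_service") == "unknown"
    )


# Label sets copied from the ztunnel query output in this comment.
samples = [
    {"destination_canonical_service": "details", "destination_service": "unknown",
     "destination_service_name": "unknown", "reporter": "destination"},
    {"destination_canonical_service": "details", "destination_service": "unknown",
     "destination_service_name": "unknown", "reporter": "source"},
    {"destination_canonical_service": "istio-ingressgateway", "destination_service": "unknown",
     "destination_service_name": "unknown", "reporter": "source"},
    {"destination_canonical_service": "productpage", "destination_service": "unknown",
     "destination_service_name": "unknown", "reporter": "destination"},
]

for service, reporter in unknown_destinations(samples):
    print(f"{service}: destination_service unknown (reporter={reporter})")
```

With this data every series is flagged, which matches the missing service nodes in the graph.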

L4 telemetry:
image

With waypoint proxy:
image

Epic subtask: #5476

istio/istio#46169

@josunect josunect added enhancement This is the preferred way to describe new end-to-end features. sub-task Ties an issue to an epic labels Apr 3, 2023
@jshaughn jshaughn added the waiting external It requires additional info to progress. For example, it can require a fix in other project. label Apr 4, 2023
@josunect
Contributor Author

josunect commented Jun 9, 2023

  • In minikube, the telemetry looks like this (tested with Istio 1.18):

image

(Perhaps there is no processing in L4 for the local node?)

@josunect
Contributor Author

josunect commented Aug 1, 2023

After testing this PR: istio/istio#46169

image

The service nodes are now visible, but there are some duplicated edges.

sum(istio_tcp_sent_bytes_total{app="ztunnel"}) by (reporter, destination_canonical_service, destination_service, destination_service_name)

{destination_canonical_service="details", destination_service="details.bookinfo.svc.cluster.local", destination_service_name="details", reporter="source"} | 176852
{destination_canonical_service="details", destination_service="unknown", destination_service_name="unknown", reporter="destination"} | 112632
{destination_canonical_service="productpage", destination_service="productpage.bookinfo.svc.cluster.local", destination_service_name="productpage", reporter="source"} | 2542904
{destination_canonical_service="productpage", destination_service="unknown", destination_service_name="unknown", reporter="destination"} | 2529671
{destination_canonical_service="ratings", destination_service="ratings.bookinfo.svc.cluster.local", destination_service_name="ratings", reporter="source"} | 69264
{destination_canonical_service="ratings", destination_service="unknown", destination_service_name="unknown", reporter="destination"} | 73668
{destination_canonical_service="reviews", destination_service="reviews.bookinfo.svc.cluster.local", destination_service_name="reviews", reporter="source"} | 282662
{destination_canonical_service="reviews", destination_service="unknown", destination_service_name="unknown", reporter="destination"}
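
The mismatch between reporters can be checked mechanically. The sketch below is only an illustration, not Kiali code: the helper name `reporter_mismatches` and the input shape are assumptions, and the sample rows are a subset of the output above. It groups series by destination_canonical_service and keeps the services where the source and destination reporters disagree on destination_service.

```python
from collections import defaultdict


def reporter_mismatches(samples):
    """Map each canonical service to {reporter: destination_service},
    keeping only services where the reporters disagree."""
    by_service = defaultdict(dict)
    for s in samples:
        by_service[s["destination_canonical_service"]][s["reporter"]] = s["destination_service"]
    return {
        service: reporters
        for service, reporters in by_service.items()
        if len(set(reporters.values())) > 1
    }


# Two of the services from the query output above.
samples = [
    {"destination_canonical_service": "details", "reporter": "source",
     "destination_service": "details.bookinfo.svc.cluster.local"},
    {"destination_canonical_service": "details", "reporter": "destination",
     "destination_service": "unknown"},
    {"destination_canonical_service": "ratings", "reporter": "source",
     "destination_service": "ratings.bookinfo.svc.cluster.local"},
    {"destination_canonical_service": "ratings", "reporter": "destination",
     "destination_service": "unknown"},
]

for service, reporters in reporter_mismatches(samples).items():
    print(service, reporters)
```

In this data the source reporter resolves the service name while the destination reporter still reports "unknown", which is consistent with the duplicated edges.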

I was debugging the edge generation, and it looks like the extra edge is generated by this query:

sum(rate(istio_tcp_received_bytes_total{reporter="destination",destination_workload_namespace="ambient"} [15306s])) by (source_cluster,source_workload_namespace,source_workload,source_canonical_service,source_canonical_revision,destination_cluster,destination_service_namespace,destination_service,destination_service_name,destination_workload_namespace,destination_workload,destination_canonical_service,destination_canonical_revision,response_flags) > 0

{destination_canonical_revision="latest", destination_canonical_service="tcpserver", destination_cluster="Kubernetes", destination_service="unknown", destination_service_name="unknown", destination_service_namespace="unknown", destination_workload="tcpserver", destination_workload_namespace="ambient", response_flags="-", source_canonical_revision="latest", source_canonical_service="debugbox", source_cluster="Kubernetes", source_workload="debugbox", source_workload_namespace="ambient"}

This is the destination telemetry query: https://github.com/kiali/kiali/blob/master/graph/telemetry/istio/istio.go#L235

There are 3 edges from the debugbox node:

"edges": [
      {
        "data": {
          "id": "1a54005a7c18bf3d28a5def62bc38256",
          "source": "0ef8a87188caa5c6a3e637c764420883",
          "target": "17ce12e7f26ee796d0538bea255c0e90",
          "traffic": {
            "protocol": "tcp",
            "rates": {
              "tcp": "0.01"
            },
            "responses": {
              "-": {
                "flags": {
                  "-": "100.0"
                },
                "hosts": {
                  "tcpserver.ambient.svc.cluster.local": "100.0"
                }
              }
            }
          }
        }
      },
      {
        "data": {
          "id": "522c03d0469c858928b947084b2cfb75",
          "source": "0ef8a87188caa5c6a3e637c764420883",
          "target": "24b3a0bfcc52c7df84de1cc97ca81f42",
          "traffic": {
            "protocol": "tcp",
            "rates": {
              "tcp": "0.007"
            },
            "responses": {
              "-": {
                "flags": {
                  "-": "100.0"
                },
                "hosts": {
                  "httpserver.ambient.svc.cluster.local": "100.0"
                }
              }
            }
          }
        }
      },
      {
        "data": {
          "id": "7344506aac9e77db6bac915046a8a693",
          "source": "0ef8a87188caa5c6a3e637c764420883",
          "target": "2f19c182bef63e07d0c334bc11a6f84b",
          "traffic": {
            "protocol": "tcp",
            "rates": {
              "tcp": "0.02"
            },
            "responses": {
              "-": {
                "flags": {
                  "-": "100.0"
                },
                "hosts": {
                  "unknown": "100.0"
                }
              }
            }
          }
        }
      },
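
A quick way to pick out the extra edge from a graph dump like this one is to look for edges whose response hosts include "unknown". The Python sketch below assumes the fragment is wrapped in an object with a top-level "edges" key (the real graph response may nest it differently), and `edges_with_unknown_host` is a hypothetical helper name. The sample uses two of the edges shown above, trimmed to the fields the check reads.

```python
import json


def edges_with_unknown_host(graph):
    """Return the ids of edges that report "unknown" among their response hosts."""
    ids = []
    for edge in graph.get("edges", []):
        data = edge["data"]
        responses = data.get("traffic", {}).get("responses", {})
        if any("unknown" in r.get("hosts", {}) for r in responses.values()):
            ids.append(data["id"])
    return ids


# Two of the three edges above, trimmed to the relevant fields.
graph = json.loads("""
{
  "edges": [
    {"data": {"id": "1a54005a7c18bf3d28a5def62bc38256",
              "traffic": {"protocol": "tcp",
                          "responses": {"-": {"flags": {"-": "100.0"},
                                              "hosts": {"tcpserver.ambient.svc.cluster.local": "100.0"}}}}}},
    {"data": {"id": "7344506aac9e77db6bac915046a8a693",
              "traffic": {"protocol": "tcp",
                          "responses": {"-": {"flags": {"-": "100.0"},
                                              "hosts": {"unknown": "100.0"}}}}}}
  ]
}
""")

print(edges_with_unknown_host(graph))
```

Only the third edge from the dump (id 7344506a...) reports the "unknown" host, so that is the one produced by the destination-reporter query.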

@josunect
Contributor Author

josunect commented Aug 7, 2023

Validated:

image

@josunect josunect closed this as completed Aug 7, 2023
@josunect josunect reopened this Aug 17, 2023
@josunect
Contributor Author

Tested today (with the updated Istio image), and the queries seem to return unknown data again:

image

image

@josunect
Contributor Author

Working as expected in Istio 1.20
