Improve Kubernetes service registry for ALS analysis #5722

Merged
merged 8 commits into from
Oct 29, 2020

Member

@kezhenxu94 kezhenxu94 commented Oct 25, 2020

Improve Kubernetes service registry for ALS analysis

The current implementation of Envoy ALS K8S analysis is based on the hierarchy pod -> StatefulSet -> deployment, StatefulSet, or others. It's awkward and differs from the Istio Kubernetes registry. In addition, the service name pattern changed in recent Kubernetes versions, which produces weird service names in the topology.

The new path is pod -> endpoint -> service, and we should leverage the Informer API instead of the raw Kubernetes API.
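
To illustrate the pod -> endpoint -> service path, here is a minimal sketch using only the JDK. It is not the registry code from this PR: the class `PodServiceIndex`, its method names, and the `<name>.<namespace>` service name format are assumptions made for illustration. It relies on the fact that an Endpoints object has the same name and namespace as the Service it backs, so informer-style add/update/delete callbacks on Endpoints are enough to keep a pod IP -> service lookup current.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified index; not the actual registry class from this PR.
final class PodServiceIndex {
    // pod IP -> "<service name>.<namespace>" (the name format is an assumption)
    private final Map<String, String> ipToService = new ConcurrentHashMap<>();

    // Would be called from an informer's add/update handler for an Endpoints object.
    void onEndpointsUpsert(String namespace, String name, List<String> podIps) {
        String service = name + "." + namespace;
        // Drop addresses that no longer belong to this service, then re-add the current ones.
        ipToService.values().removeIf(service::equals);
        for (String ip : podIps) {
            ipToService.put(ip, service);
        }
    }

    // Would be called from an informer's delete handler.
    void onEndpointsDelete(String namespace, String name) {
        String service = name + "." + namespace;
        ipToService.values().removeIf(service::equals);
    }

    // Resolve the service for an address seen in an access log entry.
    Optional<String> findService(String podIp) {
        return Optional.ofNullable(ipToService.get(podIp));
    }

    public static void main(String[] args) {
        PodServiceIndex index = new PodServiceIndex();
        index.onEndpointsUpsert("default", "reviews", Arrays.asList("10.0.0.1", "10.0.0.2"));
        System.out.println(index.findService("10.0.0.1")); // Optional[reviews.default]
        System.out.println(index.findService("10.0.0.9")); // Optional.empty
    }
}
```

With event-driven updates like this, lookups read from local state instead of calling the Kubernetes API on each request.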

@kezhenxu94 kezhenxu94 added backend OAP backend related. enhancement Enhancement on performance or codes labels Oct 25, 2020
@kezhenxu94 kezhenxu94 added this to the 8.3.0 milestone Oct 25, 2020

codecov bot commented Oct 25, 2020

Codecov Report

Merging #5722 into master will decrease coverage by 8.34%.
The diff coverage is 53.20%.

@@             Coverage Diff              @@
##             master    #5722      +/-   ##
============================================
- Coverage     51.59%   43.24%   -8.35%     
+ Complexity     3448     2666     -782     
============================================
  Files          1637     1635       -2     
  Lines         34928    34942      +14     
  Branches       3806     3786      -20     
============================================
- Hits          18020    15112    -2908     
- Misses        16021    18934    +2913     
- Partials        887      896       +9     
Impacted Files Coverage Δ Complexity Δ
...ver/receiver/envoy/als/k8s/K8SServiceRegistry.java 0.90% <0.90%> (ø) 0.00 <0.00> (?)
...r/envoy/als/k8s/K8sALSServiceMeshHTTPAnalysis.java 80.54% <80.54%> (ø) 0.00 <0.00> (?)
...rver/receiver/envoy/EnvoyMetricReceiverConfig.java 60.00% <100.00%> (ø) 2.00 <0.00> (ø)
...r/receiver/envoy/als/k8s/ServiceNameFormatter.java 100.00% <100.00%> (ø) 0.00 <0.00> (?)
...kywalking/oap/server/core/storage/AbstractDAO.java 0.00% <0.00%> (-100.00%) 0.00% <0.00%> (-2.00%)
...ng/oap/server/core/analysis/config/NoneStream.java 0.00% <0.00%> (-100.00%) 0.00% <0.00%> (-1.00%)
...ement/ui/template/UITemplateManagementService.java 0.00% <0.00%> (-100.00%) 0.00% <0.00%> (-6.00%)
...server/storage/plugin/influxdb/base/RecordDAO.java 0.00% <0.00%> (-100.00%) 0.00% <0.00%> (-6.00%)
...er/storage/plugin/influxdb/base/NoneStreamDAO.java 0.00% <0.00%> (-100.00%) 0.00% <0.00%> (-4.00%)
...erver/storage/plugin/elasticsearch/base/EsDAO.java 0.00% <0.00%> (-100.00%) 0.00% <0.00%> (-4.00%)
... and 188 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c95a913...2846d61. Read the comment docs.

@kezhenxu94 kezhenxu94 changed the title from "Improve Kubernetes service registry fo ALS analysis" to "Improve Kubernetes service registry for ALS analysis" Oct 25, 2020
Member Author

kezhenxu94 commented Oct 25, 2020

FYI, @wu-sheng @hanahmily: the difference from the previous version is that different versions of the same service are now collapsed into one service. For example, in the previous version, reviews-v1, reviews-v2, and reviews-v3 were 3 different services; in this version they are all collapsed into reviews, a single service with different version tags (labels: version=v1, version=v2, etc.), which I think is natural.
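
A rough sketch of the difference, for illustration only. The class, field, and method names below are hypothetical and not taken from the PR. It contrasts naming topology nodes after the owning workload with naming them after the Service and keeping the version label as a tag.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration; names here are assumptions, not the PR's code.
final class VersionCollapseSketch {
    // A pod as the analyzer might see it: owning workload, selecting Service,
    // and the value of its "version" label.
    static final class Pod {
        final String workload;
        final String service;
        final String version;

        Pod(String workload, String service, String version) {
            this.workload = workload;
            this.service = service;
            this.version = version;
        }
    }

    // Previous behavior: one topology node per owning workload.
    static SortedSet<String> nodesByWorkload(List<Pod> pods) {
        SortedSet<String> nodes = new TreeSet<>();
        for (Pod pod : pods) {
            nodes.add(pod.workload);
        }
        return nodes;
    }

    // New behavior: one node per Service, with versions kept as tags.
    static Map<String, SortedSet<String>> nodesByService(List<Pod> pods) {
        Map<String, SortedSet<String>> nodes = new TreeMap<>();
        for (Pod pod : pods) {
            nodes.computeIfAbsent(pod.service, k -> new TreeSet<>()).add(pod.version);
        }
        return nodes;
    }

    public static void main(String[] args) {
        List<Pod> pods = Arrays.asList(
            new Pod("reviews-v1", "reviews", "v1"),
            new Pod("reviews-v2", "reviews", "v2"),
            new Pod("reviews-v3", "reviews", "v3"));
        System.out.println(nodesByWorkload(pods)); // [reviews-v1, reviews-v2, reviews-v3]
        System.out.println(nodesByService(pods));  // {reviews=[v1, v2, v3]}
    }
}
```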

Member

@wu-sheng wu-sheng left a comment

Like the idea. Waiting for @hanahmily's code-level review.

@kezhenxu94 kezhenxu94 force-pushed the k8s/service-registry branch 2 times, most recently from beb33c9 to a2d90ad Compare October 25, 2020 15:53
final CoreV1Api coreV1Api = new CoreV1Api(apiClient);
final SharedInformerFactory factory = new SharedInformerFactory(executor);

listenEndpointsEvents(coreV1Api, factory);
Contributor

EndpointSlice resources should be listened to as well.

Member Author

@kezhenxu94 kezhenxu94 Oct 27, 2020

@hanahmily even the latest kubernetes-client (10.0.0, as of Oct 27, 2020) doesn't support listening to EndpointSlice events. There is no such API to do this 😢 Let's postpone this resource until a newer version can do this.

Member

If no listener, are you reading the data periodically like the old way?

Member Author

@kezhenxu94 kezhenxu94 Oct 27, 2020

If no listener, are you reading the data periodically like the old way?

Not now. As the doc says, EndpointSlice is meant to improve the performance of Endpoints, so I think it's OK to just ignore this kind of event for now (p.s. EndpointSlice is still in beta). Is that OK, @hanahmily?

Contributor

If the Java client doesn't support it right now, feel free to leave it alone. But we should mention it in our documentation and leave a TODO marker in the code.

Member

How about submitting a request to the k8s Java client repo about listening to EndpointSlice?

Member

They may need a scenario showing which Java code needs this.

Member Author

How about submitting a request to the k8s Java client repo about listening to EndpointSlice?

I think they don't support it only because EndpointSlice is still in beta (after it stabilizes, they will), and the methods to list/watch those resources are unstable now (e.g. they need the apiGroup and version, which keep changing for an alpha/beta feature).

IMO, ignoring EndpointSlice for now is safe because the docs say:

Although the EndpointSlice API is providing a newer and more scalable alternative to the Endpoints API, the Endpoints API will continue to be considered generally available and stable.

Member Author

We can revamp this after the EndpointSlice API is stabilised

Member

My point is only to create an issue to track this TODO in the backlog.

Member Author

All review comments should be addressed; if I missed anything, please let me know.

The current implementation of envoy ALS K8S analysis is based on the hierarchy, pod -> StatefulSet -> deployment, StatefulSet, or others. It's freaky and different from the Istio Kubernetes registry.

The new path is pod -> endpoint -> service, and we should leverage Informer API instead of raw Kubernetes API.
Contributor

@hanahmily hanahmily left a comment

LGTM

@kezhenxu94 kezhenxu94 merged commit 92bb474 into master Oct 29, 2020
Labels
backend OAP backend related. enhancement Enhancement on performance or codes
Successfully merging this pull request may close these issues.

Improve ALS k8s analysis
3 participants