
Plans for Node Group Management and Service Scope Management #3582

Open
4 of 11 tasks
vincentgoat opened this issue Jan 21, 2022 · 21 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@vincentgoat
Member

vincentgoat commented Jan 21, 2022

Hi guys, we plan to achieve the feature of Pod Scheduling among node groups in phase 1, and the feature of Service Scope within a node group in phase 2.

If you have features to propose or want to get involved in the development, please see the project plan below; you can also add other todos here as you want:

Phase 1: Node Group Management

  • Define a NodeGroup CR to indicate which nodes belong to the node group.
  • Define an EdgeApplication CR to create the resource
  • Override image
  • Unit/E2E Test/Integration Test with edgemesh

Phase 2: Service Scope Management

  • Achieve service scope management
  • Achieve gateway for node group
  • A more friendly querying result of EdgeApplication/NodeGroup
  • Support service access only on the local host
  • Statefulset is deployed by NodeGroup
  • Prioritized NodeGroup management, such as:
    NodeGroup-A is deployed preferentially; after the specified number of replicas in NodeGroup-A are deployed, deployment to NodeGroup-B starts.
  • Integrate the node relay function into node groups to enhance the stability of service traffic access
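To make the phase 1 CRs above concrete, here is a minimal sketch of how a NodeGroup might select its member nodes, either by explicit name or by label. The `apps.kubeedge.io/v1alpha1` group/version and all field names are illustrative assumptions, not the final API:

```python
# Hypothetical NodeGroup manifest; field names are assumptions for illustration.
node_group = {
    "apiVersion": "apps.kubeedge.io/v1alpha1",
    "kind": "NodeGroup",
    "metadata": {"name": "hangzhou"},
    "spec": {
        # Nodes can be picked either by explicit name or by label.
        "nodes": ["edge-node-1"],
        "matchLabels": {"location": "hangzhou"},
    },
}

def nodes_in_group(group, all_nodes):
    """Return names of nodes that belong to the group, either listed
    explicitly or matching all of the group's labels."""
    explicit = set(group["spec"].get("nodes", []))
    wanted = group["spec"].get("matchLabels", {})
    members = []
    for node in all_nodes:
        labels = node.get("labels", {})
        if node["name"] in explicit or (
            wanted and all(labels.get(k) == v for k, v in wanted.items())
        ):
            members.append(node["name"])
    return members

nodes = [
    {"name": "edge-node-1", "labels": {}},
    {"name": "edge-node-2", "labels": {"location": "hangzhou"}},
    {"name": "cloud-node-1", "labels": {"location": "beijing"}},
]
print(nodes_in_group(node_group, nodes))  # ['edge-node-1', 'edge-node-2']
```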

/cc @Congrool @fisherxu

@vincentgoat vincentgoat added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 21, 2022
@vincentgoat
Member Author

2022.1.24 Meeting Minutes:

  1. Implement the override in CloudCore: override not only the image path but also other attributes, with the data coming from an independent policy.
  2. Use a kubectl plugin to provide new operations for node groups.
  3. The controller-runtime version depends on the KubeEdge version update.
  4. Determine the API.
  5. Create a Google Doc for more details about the proposal.
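Point 1 above can be sketched as follows: a policy carries per-nodegroup overrides (image and other attributes) that are applied to a common workload template. All data shapes here are assumptions for illustration, not the actual CloudCore implementation:

```python
import copy

# Hypothetical per-nodegroup override policy; structure is an assumption.
template = {
    "replicas": 2,
    "containers": [{"name": "app", "image": "nginx:1.20"}],
}

policy = {
    "hangzhou": {"image": "registry-hz.example.com/nginx:1.20", "replicas": 3},
    "beijing": {"image": "registry-bj.example.com/nginx:1.20"},
}

def render_for_group(template, policy, group):
    """Copy the template and apply the group's overrides from the policy,
    leaving the original template untouched."""
    out = copy.deepcopy(template)
    override = policy.get(group, {})
    if "replicas" in override:
        out["replicas"] = override["replicas"]
    if "image" in override:
        for c in out["containers"]:
            c["image"] = override["image"]
    return out

hz = render_for_group(template, policy, "hangzhou")
print(hz["replicas"], hz["containers"][0]["image"])
```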

@vincentgoat
Member Author

proposal: #3574

@Congrool
Member

2022.1.26 Meeting:

  1. Only focus on new applications which have not been deployed yet.
  2. Find some way to solve the contention when applying the propagation policy, such as keeping pods with a specified label/annotation in the Pending state until the extender finds the relevant policy.
  3. Modify the PropagationPolicy API to support scheduling an explicit number of pods to specified node groups.
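Point 3 above could look like the following sketch, where the policy states an explicit replica count per nodegroup and the controller splits the deployment accordingly. The field names and the remainder-handling rule are assumptions for illustration:

```python
# Hypothetical split of a deployment's replicas across node groups,
# driven by an explicit per-group count in the policy.
def split_replicas(total, per_group):
    """Assign each group its requested replicas; any remainder goes to the
    first group so the deployment's total is always honoured."""
    assigned = {g["name"]: g["replicas"] for g in per_group}
    remainder = total - sum(assigned.values())
    if remainder < 0:
        raise ValueError("policy requests more replicas than the deployment has")
    if remainder > 0 and per_group:
        assigned[per_group[0]["name"]] += remainder
    return assigned

policy = [{"name": "hangzhou", "replicas": 2}, {"name": "beijing", "replicas": 3}]
print(split_replicas(6, policy))  # {'hangzhou': 3, 'beijing': 3}
```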

@benjaminhuo

benjaminhuo commented Feb 16, 2022

If I understand correctly, NodeGroup and ServiceScope are trying to achieve goals similar to OpenYurt YurtAppManager and SuperEdge ServiceGroup, but in different ways?

Both OpenYurt YurtAppManager and SuperEdge ServiceGroup distribute workloads by creating sub-workloads, while KubeEdge is trying to distribute the pods of a single deployment to different NodeGroups?

It might work well for deployment, but is this ok for a statefulset to spread pods of a single statefulset to different edge locations? For example:

  • 2 pods for a MySQL statefulset, one is a primary and another is a replica
  • 3 pods for an Elasticsearch statefulset, each pod holds a portion of the data

And where can I find the proposal for Service Scope?

@vincentgoat

Thanks
Ben

@Congrool
Member

@benjaminhuo Thanks for your feedback.

It might work well for deployment, but is this ok for a statefulset to spread pods of a single statefulset to different edge locations?

It's a problem that we need to take into consideration. This is not the final design, and any other suggestions are welcome.

And where can I find the proposal for Service Scope?

Do you mean Service Scope based on something like node groups, where a pod can only reach endpoints in the same nodegroup? In my mind, it is still being designed.

@benjaminhuo

Do you mean Service Scope based on something like node groups, where a pod can only reach endpoints in the same nodegroup? In my mind, it is still being designed.

Yep, I mean one multi-nodegroup deployment should only have one service, and the access to this service from a pod in the same nodegroup should only be able to find endpoints in the same nodegroup.

@Congrool
Member

Congrool commented Feb 16, 2022

Well, because KubeEdge didn't have the concept of node groups before this proposal, I think the proposal of service scope based on it hasn't been posted yet. You can watch the edgemesh repo, which maintains the network solutions of KubeEdge; any new information will be posted there.
@benjaminhuo

@fisherxu
Member

@benjaminhuo I personally tend to wrap the workload in a CRD, then distribute the workload to the node group.

I mean one multi-nodegroup deployment should only have one service, and the access to this service from a pod in the same nodegroup should only be able to find endpoints in the same nodegroup.

You mean only creating one service for all the node groups? That may lead to pod IPs from different groups being located in one EndpointSlice, which is hard to separate by group.

@benjaminhuo

benjaminhuo commented Feb 16, 2022

You mean only creating one service for all the node groups? That may lead to pod IPs from different groups being located in one EndpointSlice, which is hard to separate by group.

@fisherxu My previous conclusion is based on the current proposal because there is only one deployment for multiple node groups in the current proposal.

Current design might work for deployment, but not for statefulset and might have issues for service.

@fisherxu
Member

fisherxu commented Feb 22, 2022

As we discussed recently, for workloads like Deployment we decided to deploy a sub-workload to every group through the CRD, so every group will have its own workload (Deployment).

But for the service, if we create only one for all workloads, there are currently two features in Kubernetes related to topology (working with kube-proxy):

  1. Service topology keys: it can keep the traffic within a zone, but the feature is deprecated and has been removed.
  2. Service topology hints: the new version of service topology. But it can't guarantee keeping the traffic within a zone; it's currently a best-effort strategy, which may not meet our needs.

Another idea is to create one service for every nodegroup, which solves the isolation problem, but there will be too many Services.

And we also need the gateway/ingress for every node group.
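The one-service-per-nodegroup alternative could be sketched like this: derive a Service per group from a base service, narrowing the selector with a nodegroup label that the controller is assumed to stamp onto pods. The derived names and the `apps.kubeedge.io/nodegroup` label are assumptions for illustration:

```python
# Hypothetical derivation of per-nodegroup Services from one base service.
def services_per_group(base_name, selector, groups,
                       group_label="apps.kubeedge.io/nodegroup"):
    services = []
    for g in groups:
        svc_selector = dict(selector)
        # Narrow the selector so only pods in this group are matched.
        svc_selector[group_label] = g
        services.append({"name": f"{base_name}-{g}", "selector": svc_selector})
    return services

svcs = services_per_group("nginx", {"app": "nginx"}, ["hangzhou", "beijing"])
print([s["name"] for s in svcs])  # ['nginx-hangzhou', 'nginx-beijing']
```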

What do you think? @benjaminhuo @Congrool @vincentgoat @zc2638

@Congrool
Member

Congrool commented Feb 22, 2022

@fisherxu
I'm not familiar with edgemesh, so it's just my thought:
Because edgemesh takes over the work of kube-proxy as well as CoreDNS, can we make edgemesh aware of the nodegroup? Then edgemesh could retain endpoints in the same nodegroup and discard the others. In this way, only iptables rules for endpoints in the same nodegroup would be set, and DNS would resolve only to the IPs of endpoints in the same nodegroup.

Is this implementable?
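The filtering described above can be sketched as follows: before programming rules or answering DNS queries, a nodegroup-aware proxy keeps only the endpoints whose node is in the caller's group. The data shapes here are assumptions for illustration, not edgemesh's actual internals:

```python
# Hypothetical nodegroup-aware endpoint filtering.
node_to_group = {
    "edge-node-1": "hangzhou",
    "edge-node-2": "hangzhou",
    "edge-node-3": "beijing",
}

endpoints = [
    {"ip": "10.0.0.1", "node": "edge-node-1"},
    {"ip": "10.0.0.2", "node": "edge-node-3"},
]

def filter_endpoints(endpoints, client_node):
    """Keep only endpoint IPs whose node is in the same group as the client."""
    group = node_to_group.get(client_node)
    return [e["ip"] for e in endpoints
            if node_to_group.get(e["node"]) == group]

print(filter_endpoints(endpoints, "edge-node-2"))  # ['10.0.0.1']
```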

@fisherxu
Member

Because edgemesh takes over the work of kube-proxy as well as CoreDNS, can we make edgemesh aware of the nodegroup?

Yes, EdgeMesh is one networking solution for us, and will do this :)
And I think we still need to consider compatibility with other solutions such as kube-proxy, and even service meshes like Istio.

@benjaminhuo

benjaminhuo commented Feb 23, 2022

I agree that EdgeMesh should be aware of the nodegroup.

As for the kube-proxy approach, OpenYurt uses EndpointSliceProxying together with topologyKeys to use one service for all sub-deployments in all nodepools: https://github.com/openyurtio/openyurt/blob/master/docs/tutorial/service-topology.md

But it requires:

  • kube-proxy needs to be configured to connect to YurtHub instead of the API Server.
  • kube-proxy needs to be restarted.

@fisherxu @Congrool

@vincentgoat
Member Author

EdgeApplication proposal: https://docs.google.com/document/d/19FHzmHB-a-OSUukmicEVga6neMCeHz_ozPFxaGXqJ1I/edit#heading=h.d5vhf8rdz3fj

@gy95
Member

gy95 commented Mar 2, 2022

@benjaminhuo

benjaminhuo commented Mar 2, 2022

Yeah, this doc is not open yet, and I cannot access it @vincentgoat

@vincentgoat
Member Author

2022.3.2 meeting:

  1. Readability optimization for unified packaging;
  2. Consider the EdgeApplication concept on both the cloud and edge sides.

@vincentgoat
Member Author

EdgeApplication proposal: https://docs.google.com/document/d/19FHzmHB-a-OSUukmicEVga6neMCeHz_ozPFxaGXqJ1I/edit#heading=h.d5vhf8rdz3fj

No permission :)

Please join the group kubeedge@googlegroups.com

@vincentgoat
Member Author

2022.3.7 meeting:

  1. Simplify the EdgeApplication status field; the status shows the sub-resource summary status and where each sub-resource is located;
  2. The summary status contains progressing/available.
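The simplified status described above could be sketched like this: per-nodegroup sub-resource statuses plus a rolled-up summary that is `available` only when every sub-resource is fully ready. Field names are assumptions for illustration, not the final API:

```python
# Hypothetical EdgeApplication status roll-up.
def summarize(sub_statuses):
    """A sub-resource counts as available only when all its replicas are
    ready; otherwise the whole EdgeApplication is still progressing."""
    available = sum(1 for s in sub_statuses
                    if s["readyReplicas"] == s["replicas"])
    return {
        "workloadStatuses": sub_statuses,
        "summary": "available" if available == len(sub_statuses) else "progressing",
    }

status = summarize([
    {"nodeGroup": "hangzhou", "replicas": 3, "readyReplicas": 3},
    {"nodeGroup": "beijing", "replicas": 2, "readyReplicas": 1},
])
print(status["summary"])  # progressing
```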

@koulq

koulq commented Mar 15, 2022

Hi, in this design, have you considered the certificate management of device access? For example, with the same application deployment packaged in node group A and node group B, do node groups A and B need to use the same set of certificates issued by a CA?

@vincentgoat
Member Author

Hi, in this design, have you considered the certificate management of device access? For example, with the same application deployment packaged in node group A and node group B, do node groups A and B need to use the same set of certificates issued by a CA?

Hi @koulq, thanks for the feedback. We currently manage workloads and services through node groups; management of device certificates can be considered in future evolutions. You are also welcome to follow up on this process.
