Kubernetes considerations #1

Closed
AlbertoSoutullo opened this issue Sep 16, 2023 · 1 comment
Labels
documentation Improvements or additions to documentation

Comments


AlbertoSoutullo commented Sep 16, 2023

  1. Containers in a pod share storage and network resources. It would be nice to be able to run several nodes per pod, but as we are mainly interested in bandwidth usage, it is better to focus on one node per pod.
    Also, as the containers in a pod share the same IP address, we have to consider port clashes if we run several nodes in the same pod (port-shift flag for waku).

  2. We can use Services of type ClusterIP, so the pods are only reachable from within the cluster, which excludes any external bandwidth.
    It is also important that we can change this easily.

  3. Horizontal Pod Autoscaling deploys more pods on demand and can be configured per metric (we could use bandwidth here, exposed as a custom metric). Super useful for testnets.
    As it also scales down automatically, it may not be a priority for the 10k milestone, since we want to reach that number "independently" of the bandwidth (we will have control over it). Vertical scaling is also available.

  4. Kubernetes has a list of considerations for large clusters. In summary, it is designed for up to 110 pods per node and up to 5,000 nodes per cluster. Currently we have 2 physical nodes. Taking this into account, we can:
    4.1. Increase the number of containers per pod. We need to check further how to extract single-container metrics here.
    4.2. Increase the number of pods per node (currently doing this). This raises more problems, like having to modify the default maximum number of pods on a node, and having to modify the pod CIDR mask, since it only handles 256 IPs by default. Example here
    4.3. If we see that this scaling gets too difficult because of CPU/docker issues (most likely), we can create 2 or 3 masters with replicated services, and create a job in Amazon that automatically deploys and shuts down everything, so an experiment will have a "fixed" cost instead of having to pay for dozens of nodes monthly.

  5. As we are talking about thousands of pods, we should set the Parallel pod management policy (`podManagementPolicy: Parallel`) in the StatefulSets. Also, we have to take into account that, when working with large StatefulSets, it is faster to deploy many StatefulSets with fewer pods each than a few StatefulSets with many pods each.
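
The sizing arithmetic behind points 4.2 and 5 can be sketched roughly as below. This is only an illustration under stated assumptions: the 10,000-pod target and the 2 nodes come from the notes above, while the function names and the limit of 300 pods per StatefulSet are hypothetical values picked for the example, not measured or recommended settings. The prefix returned is the bare minimum that fits the pod count; a real cluster would probably want headroom.

```python
import math


def pods_per_node(total_pods: int, nodes: int) -> int:
    """Pods each node must host if the load is spread evenly (rounded up)."""
    if total_pods < 0 or nodes < 1:
        raise ValueError("total_pods must be >= 0 and nodes >= 1")
    return math.ceil(total_pods / nodes)


def min_pod_cidr_prefix(pods: int) -> int:
    """Longest IPv4 prefix length whose block holds at least `pods` addresses.

    A /24 holds 256 addresses, which is the default per-node limit
    mentioned in point 4.2.
    """
    if pods < 1:
        raise ValueError("pods must be >= 1")
    host_bits = (pods - 1).bit_length()  # smallest b with 2**b >= pods
    if host_bits > 32:
        raise ValueError("pods does not fit in an IPv4 block")
    return 32 - host_bits


def split_into_statefulsets(total_pods: int, max_size: int) -> list[int]:
    """Replica counts for several smaller StatefulSets covering total_pods."""
    if total_pods < 0 or max_size < 1:
        raise ValueError("total_pods must be >= 0 and max_size >= 1")
    full, rest = divmod(total_pods, max_size)
    return [max_size] * full + ([rest] if rest else [])


if __name__ == "__main__":
    per_node = pods_per_node(10_000, 2)          # 10k milestone on 2 nodes
    print("pods per node:", per_node)
    print("minimum per-node pod CIDR: /%d" % min_pod_cidr_prefix(per_node))
    sizes = split_into_statefulsets(10_000, 300)  # 300 is a hypothetical cap
    print("StatefulSets needed:", len(sizes))
```

With these inputs, each node would have to host 5,000 pods, far above both the 110-pod guideline and the 256 addresses of a default /24, which is why both the maximum pod value and the CIDR mask need changing.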

List of current Kubernetes nodes:

@AlbertoSoutullo AlbertoSoutullo added the documentation Improvements or additions to documentation label Sep 16, 2023
@AlbertoSoutullo
Collaborator Author

Closed, as we are now using @Zorlin's lab.
