Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrate the internal kube-apiserver to kwok #333

Merged
merged 30 commits into from
Mar 26, 2024

Conversation

sivchari
Copy link
Member

@sivchari sivchari commented Feb 6, 2024

What type of PR is this?

/area simulator

/kind feature
/kind cleanup

What this PR does / why we need it:

Before this PR, we should manage not only kube-scheduler, but also kube-apiserver and etcd.
We wanna focus on kube-scheduler and reduce the time when we maintain various components.

Therefore, this aims to integrate kwok into simulator that simulates Nodes and Clusters..
If this PR is merged, all we have to do only maintain scheduler-simulator.

Which issue(s) this PR fixes:

Fixes #251

Special notes for your reviewer:

/label tide/merge-method-squash

Signed-off-by: sivchari <shibuuuu5@gmail.com>
Signed-off-by: sivchari <shibuuuu5@gmail.com>
Signed-off-by: sivchari <shibuuuu5@gmail.com>
@k8s-ci-robot k8s-ci-robot added area/simulator Issues or PRs related to the simulator. kind/feature Categorizes issue or PR as related to a new feature. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. labels Feb 6, 2024
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Feb 6, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @sivchari. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Feb 6, 2024
Signed-off-by: sivchari <shibuuuu5@gmail.com>
Signed-off-by: sivchari <shibuuuu5@gmail.com>
Comment on lines 30 to 40
simulator-etcd:
image: quay.io/coreos/etcd:v3.4.26
container_name: simulator-etcd
restart: always
volumes:
- simulator-etcd-data:/var/lib/etcd
command: etcd --advertise-client-urls http://simulator-etcd:2379 --data-dir /var/lib/etcd --listen-client-urls http://0.0.0.0:2379 --initial-cluster-state new --initial-cluster-token tkn
networks:
- simulator-internal-network
volumes:
simulator-etcd-data:
Copy link
Member

@wzshiming wzshiming Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to create clusters use image, so that the simulator doesn't need to be concerned with the creation and deletion of clusters at all, and only needs to be externally supplied with etcd ports.

  simulator-cluster:
    image: registry.k8s.io/kwok/cluster:v0.5.0-k8s.v1.29.0
    container_name: simulator-cluster
    restart: always
    ports:
      - "8080:8080"
    environment:
      - KWOK_KUBE_APISERVER_PORT=8080
    volumes:
      - simulator-etcd-data:/var/lib/etcd
      - ./kwokctl.yaml:/root/.kwok/kwok.yaml
# kwokctl.yaml
kind: KwokctlConfiguration
apiVersion: config.kwok.x-k8s.io/v1alpha1
options:
  etcdPort: 2379
  etcdPrefix: /kube-scheduler-simulator
  disableKubeScheduler: true
componentsPatches:
- name: kube-apiserver
  extraArgs:
  - key: cors-allowed-origins
    value: ^*$

Copy link
Member Author

@sivchari sivchari Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sivchari#1

Sorry for the clutter. Although I tried to fix these codes, it couldn't catch the watchResources response. it's suspended and I don't have no idea why it's caused. Do you have some idea ?

@wzshiming
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 8, 2024
Signed-off-by: sivchari <shibuuuu5@gmail.com>
Signed-off-by: sivchari <shibuuuu5@gmail.com>
Signed-off-by: sivchari <shibuuuu5@gmail.com>
Signed-off-by: sivchari <shibuuuu5@gmail.com>
Signed-off-by: sivchari <shibuuuu5@gmail.com>
Signed-off-by: sivchari <shibuuuu5@gmail.com>
@sivchari
Copy link
Member Author

sivchari commented Feb 9, 2024

@sanposhiho
I'm ready for review.

@wzshiming
Thanks for your so kind support.

@sivchari
Copy link
Member Author

sivchari commented Feb 9, 2024

/cc @sanposhiho
/cc @196Ikuchil

Signed-off-by: sivchari <shibuuuu5@gmail.com>
Signed-off-by: sivchari <shibuuuu5@gmail.com>
@sivchari
Copy link
Member Author

@sanposhiho @wzshiming
I'm sorry making you late. I wasn't able to deal with it.
Since I've handled it, please review some new changes. Thanks.

@@ -5,12 +5,15 @@ This page describes how the simulator works.
### 0. starts the simulator.

The simulator server works with the following:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The simulator server works with the following:
The simulator server contains the following components:

Comment on lines 8 to 12
- [kube-apiserver](kube-apiserver.md)
- etcd
- scheduler
- controllers for core resources
- [HTTP server](api.md)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- [kube-apiserver](kube-apiserver.md)
- etcd
- scheduler
- controllers for core resources
- [HTTP server](api.md)
- scheduler
- [HTTP server](api.md)

@@ -1,5 +1,11 @@
## Kube-apiserver
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just remove this page

Comment on lines 14 to 16
In advance, the simulator needs to launch etcd, controllers and kube-apiserver outside.
[KWOK](https://github.com/kubernetes-sigs/kwok) can launch these components all at once, thus we recommend using it.
When the simulator server starts, it will start scheduler and HTTP server.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In advance, the simulator needs to launch etcd, controllers and kube-apiserver outside.
[KWOK](https://github.com/kubernetes-sigs/kwok) can launch these components all at once, thus we recommend using it.
When the simulator server starts, it will start scheduler and HTTP server.
In advance, the simulator needs to launch etcd, controller-manager and kube-apiserver outside.
We recommend using [KWOK](https://github.com/kubernetes-sigs/kwok), see [docker-compose.yml](../../docker-compose.yml) to know how we wire things up.

When the simulator server starts, it will start these components with server.
In advance, the simulator needs to launch etcd, controllers and kube-apiserver outside.
[KWOK](https://github.com/kubernetes-sigs/kwok) can launch these components all at once, thus we recommend using it.
When the simulator server starts, it will start scheduler and HTTP server.

### 1. users request creating resource.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kwok should support most of resources and we no longer need to list supported resources. We can simplify this section like:

### 1. users request creating resource.

Users can create resources below by communicating with kube-apiserver of Kwok via any clients 
(e.g., kubectl, kwokctl, k8s client library or [Web UI](../../web))

### 2. the scheduler schedules a new pod
...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, Please modify README too.

time.Sleep(1 * time.Second)
ctx, cancel := context.WithTimeout(context.Background(), 1*time.Minute)
defer cancel()
wait.UntilWithContext(ctx, func(ctx context.Context) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I misled in a previous comment.
We should use PollUntilContextTimeout rather, otherwise wait.UntilWithContext wouldn't panic or error out if reaching timeout seemingly.
https://pkg.go.dev/k8s.io/apimachinery/pkg/util/wait#PollUntilContextTimeout

Signed-off-by: sivchari <shibuuuu5@gmail.com>
Signed-off-by: sivchari <shibuuuu5@gmail.com>
Signed-off-by: sivchari <shibuuuu5@gmail.com>
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 15, 2024
Signed-off-by: sivchari <shibuuuu5@gmail.com>
@@ -41,12 +41,12 @@ We have several components:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we update Getting started section like this otherwise people cannot run up the simulator correctly until a new release is cut.

## Getting started 

git clone git@github.com:kubernetes-sigs/kube-scheduler-simulator.git
cd kube-scheduler-simulator
# switch to the version you want to use.
git switch simulator/v0.1.1
# pull images from the registry and run up all components
make docker_up 
# All things up! 

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I'd revise it.

README.md Outdated
Comment on lines 49 to 51
You can create any resources by communication with kube-apiserver outside via any clients (e.g. kubectl, k8s client library, or web UI described next).
And when you create Pods,
Pods will be scheduled by the [debuggable scheduler](./simulator/docs/debuggable-scheduler.md),
Copy link
Member

@sanposhiho sanposhiho Mar 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
You can create any resources by communication with kube-apiserver outside via any clients (e.g. kubectl, k8s client library, or web UI described next).
And when you create Pods,
Pods will be scheduled by the [debuggable scheduler](./simulator/docs/debuggable-scheduler.md),
Simulator runs with a fake cluster powered by [KWOK](https://github.com/kubernetes-sigs/kwok)
You can create any resources in KWOK cluster via any clients (e.g. kubectl, k8s client library, or web UI described next).
And when you create Pods,
Pods will be scheduled by the [debuggable scheduler](./simulator/docs/debuggable-scheduler.md),

_, err := client.CoreV1().Namespaces().Get(context.Background(), "kube-system", metav1.GetOptions{})
if err != nil {
klog.Infof("waiting for kube-system namespace to be ready: %v", err)
return false, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess if we here return err, it'd immediately fail without retry. Let's just return nill as we log error in L65.

Suggested change
return false, err
return false, nil

url := os.Getenv("KUBE_APISERVER_URL")
if url == "" && configYaml.KubeAPIServerURL != "" {
return configYaml.KubeAPIServerURL
}
p := os.Getenv("KUBE_API_PORT")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove L145-160, but without removing KubeAPIPort and KubeAPIHost from simulator/config/v1alpha1/types.go. We can just ignore deprecated envs (as I said before, removing them would be a breaking change that we should avoid though

Signed-off-by: sivchari <shibuuuu5@gmail.com>
Signed-off-by: sivchari <shibuuuu5@gmail.com>
Signed-off-by: sivchari <shibuuuu5@gmail.com>
@sivchari
Copy link
Member Author

I've updated PR, thanks.

Copy link
Member

@wzshiming wzshiming left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the image startup script is hardcoded, the only way to set this is through the environment variable.
and the next release (kubernetes-sigs/kwok#999) will support kubeApiserverInsecurePort to instead of this

docker-compose-local.yml Show resolved Hide resolved
kwok.yaml Outdated
kind: KwokctlConfiguration
apiVersion: config.kwok.x-k8s.io/v1alpha1
options:
kubeApiserverPort: 3131
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
kubeApiserverPort: 3131

Signed-off-by: sivchari <shibuuuu5@gmail.com>
@sivchari
Copy link
Member Author

@wzshiming
Thanks!! I've checked the script of kwok's entry point and the plan of next release.
Once, I decided to use a env to connect the cluster in KWOK.

docker-compose-local.yml Show resolved Hide resolved
docker-compose.yml Show resolved Hide resolved
kwok.yaml Show resolved Hide resolved
Copy link
Member

@sanposhiho sanposhiho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two doc changes, otherwise LGTM

@@ -28,15 +33,20 @@ origin for `CORS_ALLOWED_ORIGIN_LIST`.
resources". This variable is used to find Kubeconfig required to
access your cluster for importing resources to scheduler simulator.

`KUBE_APISEVER_URL`: This is the URL of kube-apiserver which the
simulator uses. This variable is used to connect to external kube-apiserver.

`KUBE_API_HOST`: This is the host of kube-apiserver which the
Copy link
Member

@sanposhiho sanposhiho Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, on second thought, I don't think we need to keep the mention of KUBE_API_HOST and KUBE_API_PORT.
Can we just remove L8-L12 and L36-L45?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We forget about the doc in:
https://github.com/kubernetes-sigs/kube-scheduler-simulator/blob/f0faa06ec7773723073ad04c4bb3383dbbd8d3ff/simulator/docs/simulator-server-config.md

Can you

  • remove the mention to kubeApiHost and kubeApiPort
  • mention kubeApiServerUrl

over there?

Signed-off-by: sivchari <shibuuuu5@gmail.com>
@sivchari
Copy link
Member Author

@sanposhiho
I've revised it, thanks.

Copy link
Member

@sanposhiho sanposhiho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@sivchari This is epic, thanks for your all effort!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 26, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sanposhiho, sivchari

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 26, 2024
@sanposhiho
Copy link
Member

sanposhiho commented Mar 26, 2024

Also thanks @wzshiming for your all support! It'd be hard to achieve if you weren't here.

@k8s-ci-robot k8s-ci-robot merged commit 296766e into kubernetes-sigs:master Mar 26, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/simulator Issues or PRs related to the simulator. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integrate kwok into simulator
4 participants