Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement listwatchresources API on the simulator #165

Merged
merged 1 commit into from Aug 13, 2022

Conversation

196Ikuchil
Copy link
Contributor

@196Ikuchil 196Ikuchil commented May 8, 2022

What type of PR is this?

/kind feature

What this PR does / why we need it:

Implement new API endpoint /api/v1/listwatchresources on the simulator.

  • This endpoint send all resources's event data to a client by Server-Sent Events.
  • We can specify ' lastResourceVersion`.
  • If we will not specify it, the endpoint calls list API and returns the result before starting watch.

Which issue(s) this PR fixes:

Fixes #120

Special notes for your reviewer:

The implementation of the WebUI side is divided into separate PRs.

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. labels May 8, 2022
@k8s-ci-robot
Copy link
Contributor

@196Ikuchil: The label(s) /label do-not-merge/work-in-progress] cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda, refactor

In response to this:

What type of PR is this?

/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #120

Special notes for your reviewer:

Work in progress

/label tide/merge-method-squash
/label do-not-merge/work-in-progress]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 8, 2022
@k8s-ci-robot k8s-ci-robot requested a review from adtac May 8, 2022 16:17
@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label May 8, 2022
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 8, 2022
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 15, 2022
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 16, 2022
@196Ikuchil 196Ikuchil marked this pull request as draft May 16, 2022 12:44
@196Ikuchil 196Ikuchil force-pushed the auto_update_webui branch 2 times, most recently from 404ddc8 to d5d6e0b Compare May 16, 2022 15:51
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 17, 2022
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 25, 2022
@196Ikuchil
Copy link
Contributor Author

/retitle Update all resources periodically on WebUI

@k8s-ci-robot k8s-ci-robot changed the title [WIP]Update all resources periodically on WebUI Update all resources periodically on WebUI May 26, 2022
@k8s-ci-robot
Copy link
Contributor

@196Ikuchil: The label(s) /remove-label do-not-merge/work-in-progress cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda, refactor

In response to this:

/remove-label do-not-merge/work-in-progress

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@196Ikuchil
Copy link
Contributor Author

This PR includes #178.
So, we need to merge it first.
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 26, 2022
@196Ikuchil 196Ikuchil marked this pull request as ready for review May 26, 2022 16:39
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 26, 2022
go s.WatchAndHandleEvent(pcsEventProxy, pcsWatcher, ctx.Done())

<-ctx.Done()
return xerrors.Errorf("terminated to watch: %w", ctx.Err())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<-ctx.Done()

WatchResources will return an error when the frontend browser is reloaded or call cancel() method.
This is to allow the front end to handle stream connection disconnections.

test code

        <-ctx.Done()
        klog.Info("done")  ## this
	return xerrors.Errorf("terminated to watch: %w", ctx.Err())

When the browser reloading

Server's log

I0630 13:38:17.417764   20024 watcher.go:121] done
E0630 13:38:17.417823   20024 watcher.go:39] closed to watch resources: terminated to watch:
    github.com/kubernetes-sigs/kube-scheduler-simulator/watcher.(*Service).WatchResources
        /workspace/kube-scheduler-simulator/watcher/watcher.go:122
  - context canceled
{"time":"2022-06-30T13:38:17.417908424Z","id":"","remote_ip":"127.0.0.1","host":"localhost:1212","method":"GET","uri":"/api/v1/watchresources?podsLastResourceVersion=498&nodesLastResourceVersion=498&pvsLastResourceVersion=498&pvcsLastResourceVersion=498&scsLastResourceVersion=498&pcsLastResourceVersion=498","user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Firefox/102.0","status":200,"error":"code=500, message=Internal Server Error","latency":67046223926,"latency_human":"1m7.046223926s","bytes_in":0,"bytes_out":0}

When cancel() calling

We expect the method to be invoked like this.

      watcherAPI
        .watchResources(createLastResourceVersions() as LastResourceVersions)
        .then((response) => {
          if (!response.body) {
            return;
          }
          const stream = response.body.getReader();
          const utf8Decoder = new TextDecoder("utf-8");
          let buffer = "";
          return stream.read().then(function processText({ done, value }): any {
            if (isfinish) {
              stream.cancel(); ## this
              return
            }

The result of when invoked the cancel().

I0630 13:49:16.241880   23480 watcher.go:121] done
E0630 13:49:16.241923   23480 watcher.go:39] closed to watch resources: terminated to watch:
    github.com/kubernetes-sigs/kube-scheduler-simulator/watcher.(*Service).WatchResources
        /workspace/kube-scheduler-simulator/watcher/watcher.go:122
  - context canceled
{"time":"2022-06-30T13:49:16.241962479Z","id":"","remote_ip":"127.0.0.1","host":"localhost:1212","method":"GET","uri":"/api/v1/watchresources?podsLastResourceVersion=214&nodesLastResourceVersion=214&pvsLastResourceVersion=214&pvcsLastResourceVersion=214&scsLastResourceVersion=214&pcsLastResourceVersion=214","user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Firefox/102.0","status":200,"error":"code=500, message=Internal Server Error","latency":16756917245,"latency_human":"16.756917245s","bytes_in":0,"bytes_out":3346}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍👍👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment for this? (A brief summary is enough.)

Copy link
Member

@sanposhiho sanposhiho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rough reviews only for backend side

docs/api.md Outdated Show resolved Hide resolved
server/di/di.go Outdated Show resolved Hide resolved
server/handler/watcher.go Outdated Show resolved Hide resolved
server/handler/watcher.go Outdated Show resolved Hide resolved
watcher/watcher.go Outdated Show resolved Hide resolved
go s.WatchAndHandleEvent(pcsEventProxy, pcsWatcher, ctx.Done())

<-ctx.Done()
return xerrors.Errorf("terminated to watch: %w", ctx.Err())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍👍👍

go s.WatchAndHandleEvent(pcsEventProxy, pcsWatcher, ctx.Done())

<-ctx.Done()
return xerrors.Errorf("terminated to watch: %w", ctx.Err())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment for this? (A brief summary is enough.)

watcher/watcher.go Outdated Show resolved Hide resolved
docs/api.md Outdated
Comment on lines 115 to 122
|parameter|requirement|description|
| ----- | --- | -------- |
|podsLastResourceVersion|MUST|integer|
|nodesLastResourceVersion|MUST|integer|
|pvsLastResourceVersion|MUST|integer|
|pvcsLastResourceVersion|MUST|integer|
|scsLastResourceVersion|MUST|integer|
|pcsLastResourceVersion|MUST|integer|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making these parameters required sounds weird to me.
How about changing it to optional and when not specifying them, start watching with 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I don't think so.

Our /watchresources API use NewRetryWatcher in internal logic.
The NewRetryWatcher will return error if we pass the "0" to a lastResourceVersion. If these values are optional, Isn't this likely to be a cause of error?
https://github.com/kubernetes/client-go/blob/3b969f96803febcac81f3a4272a9914270fbb3a3/tools/watch/retrywatcher.go#L66

Also, we shouldn't modify these value.

Resource versions must be treated as opaque by clients and passed unmodified back to the server.
https://kubernetes.io/docs/reference/using-api/api-concepts/#resource-versions

Therefore, I think it is generally expected that this watch API will be called after the list API is called. And at least, I'd liket to require it in this simulator.

BTW, string might be better than integer.

You must not assume resource versions are numeric or collatable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our /watchresources API use NewRetryWatcher in internal logic.
The NewRetryWatcher will return error if we pass the "0" to a lastResourceVersion. If these values are optional, Isn't this likely to be a cause of error?

That makes sense.

But, still the API looks unkind to clients.
What I expect to see in this API is something like kubectl get Pods -w. (= After listing/getting the requested object, watch for changes.)

How about deleting these last resource versions params completely?
In most use cases, the clients need to list all resources directly from kube-apiserver and then call /watchresources after that right?
If so, we don't need to let clients send the last resource versions to server. Just listing all supported objects and start to watch for changes looks more kind to clients.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, we don't need to let clients send the last resource versions to server. Just listing all supported objects and start to watch for changes looks more kind to clients.

Good question.
Your proposed feature is already provided as this method.
https://pkg.go.dev/k8s.io/client-go/tools/cache#NewInformer

I use NewRetryWatcher because it allows me to specify the lastresourceversion at the start of the watch.

In the future, we will be able to operate several simulators at once.
At that time, it is not desirable to keep WATCHING all simulators.
That is, watch needs to stop once, and it needs to be possible to resume watch from the middle of the previous one.
And then, In order to restart the watch from the middle, the lastresourceversion needs to be specifiable from the client.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My new idea was to make the lastresourcecersion optional parameter, and if not specified, to run the list on the server side, return the result as an Added event to the client, and then start the watch.

The disadvantage of this is that if the client is already running a listing and does not specify lastresourceversion, the resource passed to the client will be duplicated.

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah~, okay. I think now I fully understand your idea.

My new idea was to make the lastresourcecersion optional parameter, and if not specified, to run the list on the server side, return the result as an Added event to the client, and then start the watch.

After understood your idea completely, I guess your previous idea, making lastresourcecersion required and using /watchresources only to notice the update, also makes some sense.

But, hmmmm, it's a difficult question. I think there are pros/cons to each strategy.

making lastresourcecersion optional

This is your new idea.

pros
web UI only need to communicate with /watchresources API to fetch the resources. Not need to call kube-apiserver's list API anymore.
That makes the web UI's implementation much simpler, since the client performs the same logic regardless of whether it is the first time access or not. What web UI always needs to do to fetch resources/update resources is only using /watchresources API.

cons
When a user accesses the web UI for the first time, web UI needs to build the resource's view by processing many Added event for all resources one by one.
So, If there are too many resources, the building may take some time.

making lastresourcecersion required

This is your previous idea.

pros

When a user accesses the web UI for the first time, web UI calls list API of kube-apiserver.
By processing another logic for the first time access, the time that takes for first view will be decreased especially when the number of resources is large.

cons

Web UI needs to have two logics to show resources; one is for first-time access, and another is for /watchresources .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, to summarize, your new idea vs your previous idea looks like maintainability vs performance.

@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jul 8, 2022
@196Ikuchil
Copy link
Contributor Author

Fixed. (Also some conflicts.)

Copy link
Member

@sanposhiho sanposhiho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only server side

docs/api.md Outdated Show resolved Hide resolved
simulator/resourcewatcher/resourcewatcher.go Outdated Show resolved Hide resolved
simulator/resourcewatcher/resourcewatcher.go Outdated Show resolved Hide resolved
simulator/resourcewatcher/resourcewatcher.go Outdated Show resolved Hide resolved
case watch.Deleted:
eventType = watch.Deleted
default:
// we may receive watch.Bookmark and watch.Error events.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should watch.Error be logged as error logs?

@@ -0,0 +1,484 @@
package resourcewatcher_test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test does not need to be written in a separate test package.
Sometimes separate test package strategy is useful to check the behavior from outside of the package, but in most cases, this only increases the test complexity/difficulties.
This time, we need to create export_test.go for the test and it's a non-good signal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I changed the package name was because of a circular reference to a file in the mock file. (I don't remember exactly, but I'm pretty sure it occurred in ResponseStream or somewhere)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, it doesn't make sense to change test to be in separate package only for solving import cycles.

I'm not sure but seems like StreamWriter is one cause...? (Because it uses WatchEvent in its definition.)
If so, one idea to solve this is just to create another package for StreamWriter and move the related implementation into that new package.

simulator/server/server.go Outdated Show resolved Hide resolved
simulator/server/handler/watcher.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jul 10, 2022
Copy link
Contributor Author

@196Ikuchil 196Ikuchil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @sanposhiho
I updated it.

docs/api.md Outdated
| pvsLastResourceVersion | OPTIONAL | If not specified, some existing resource data is returned as `ADDED` Events before starting watch. |
| pvcsLastResourceVersion | OPTIONAL | If not specified, some existing resource data is returned as `ADDED` Events before starting watch. |
| scsLastResourceVersion | OPTIONAL | If not specified, some existing resource data is returned as `ADDED` Events before starting watch. |
| pcsLastResourceVersion | OPTIONAL | If not specified, some existing resource data is returned as `ADDED` Events before starting watch. |
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed these queries to OPTIONAL.

func (s *Service) ListAndWatch(p ResourceEventProxy, stopCh <-chan struct{}) error {
lw := createListWatch(p)
// If the lastResourceVersion isn't specified by client, call the list and return the result as ADDED event first.
if p.LastResourceVersion() == "" {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a client will not set the lastresourceversion, the value will be "".
And then the server calls list method of the resource before starting the watch.

type ResourceState<T extends resourceObject> = Array<T>;

// createResourceState returns a new resource store's state.
export function createResourceState<T extends resourceObject>(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost resources will use this page's methods to handle Events. (Pod is an exception.)
This is because the Pod needs to consider to unscheduled pod.

Copy link
Member

@sanposhiho sanposhiho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only for backend

simulator/docs/api.md Outdated Show resolved Hide resolved
simulator/resourcewatcher/resourcewatcher.go Outdated Show resolved Hide resolved
simulator/resourcewatcher/resourcewatcher.go Outdated Show resolved Hide resolved
simulator/resourcewatcher/resourcewatcher.go Outdated Show resolved Hide resolved
simulator/resourcewatcher/resourcewatcher.go Outdated Show resolved Hide resolved
simulator/resourcewatcher/resourcewatcher.go Outdated Show resolved Hide resolved
simulator/resourcewatcher/resourcewatcher.go Outdated Show resolved Hide resolved
simulator/resourcewatcher/resourcewatcher.go Outdated Show resolved Hide resolved
simulator/resourcewatcher/resourcewatcher.go Outdated Show resolved Hide resolved
simulator/resourcewatcher/resourcewatcher.go Outdated Show resolved Hide resolved
@196Ikuchil
Copy link
Contributor Author

@sanposhiho Thanks! Fixed.

Copy link
Member

@sanposhiho sanposhiho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@196Ikuchil
Looks good. Because this PR is so big, could you create the new PR for the frontend and change this PR to only includes the backend changes?
After you do that, I'm gonna approve this.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 13, 2022
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Aug 13, 2022
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 13, 2022
@196Ikuchil
Copy link
Contributor Author

/retitle Implement listwatchresources API on the simulator

@k8s-ci-robot k8s-ci-robot changed the title Update all resources periodically on WebUI Implement listwatchresources API on the simulator Aug 13, 2022
@196Ikuchil
Copy link
Contributor Author

@sanposhiho
Thank you for the comment.
I move some implementations of webUI to the new PR #211.

@sanposhiho
Copy link
Member

/lgtm
/apprive

@sanposhiho
Copy link
Member

/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 13, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 196Ikuchil, sanposhiho

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 13, 2022
@k8s-ci-robot k8s-ci-robot merged commit 030391e into kubernetes-sigs:master Aug 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[web ui] Update all resources periodically
3 participants