Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

YUNIKORN-42 (Work in progress changes for recorder) #118

Closed
wants to merge 1 commit into from

Conversation

wangdatan
Copy link
Contributor

No description provided.

@wangdatan
Copy link
Contributor Author

Just added a WIP change for recorder, which includes:

  1. Recorder implementation.
    How to use it:
diagnostic. GetEventsRecorder(). RecordResourceRequestEvent(...)
  1. Changes in api package

  2. Changes in RMProxy

Pending:

  • Haven't added any recording message to scheduler yet.
  • Haven't properly setup golint/IDE plugin, so the format could be wrong.

@wilfred-s , @TaoYang526 , @yangwwei please add your suggestions.

@wilfred-s wilfred-s self-requested a review March 30, 2020 12:24
Copy link

@adamantal adamantal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read the discussion in the jira. Though I agree with @wilfred-s regarding the event publishing, I think for existing K8s users the kubectl describe is a must - this is the most handful tool a kubernetes admin to look at. IMO we should seek a way to only emit K8s relevant scheduling events and if the user wants some more granular scheduler diagnostics, should try to look for the YuniKorn UI or REST API.

Regarding the actual recording message, those could be submitted through another PR.

Comment on lines +22 to +25
timer.SetNanoTimeNow(newTime)
if timer.NanoTimeNow()-after > time.Second.Nanoseconds() {
t.Errorf("SetNanoTime for regular timer should not take effect")
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test case is wrong. If the regularTimer's SetNanoTimeNow took effect, the test would pass.

timer common.Timer

// Nano-second interval of record for same key
minimumRecordIntervalForSameKeyInNS int64

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can store that in time.Time

return true
}

func (de *DiagEventsRecorder) RecordResourceRequestEvent(level Level, requestId string, appId string, queueName string, nodeId string, message string) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A technical question: just by looking at the datastructure I assume we only allow one event per request. Is this intended? Isn't it a bit too strict? I can imagine scenarios where for a request we can emit multiple event records.

Comment on lines +141 to +145
idx := 0
for _, v := range de.eventMap {
ret[idx] = v
idx ++
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
idx := 0
for _, v := range de.eventMap {
ret[idx] = v
idx ++
}
for idx, v := range de.eventMap {
ret[idx] = v
}

@yangwwei yangwwei closed this May 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants