Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add buffer size and dropped actions fields to ScheduleInfo #311

Merged
merged 3 commits into from
Sep 5, 2023

Conversation

ShahabT
Copy link
Contributor

@ShahabT ShahabT commented Aug 31, 2023

What changed?
Adding two fields to ScheduleInfo proto

Why?
This is for visibility and completeness. Buffer overruns are already exposed in a metric, but we have similar fields in ScheduleInfo for number of runs skipped, etc. so I think it should go there too.

Buffer size is also interesting to monitor, but it doesn't make sense as a metric, it's more useful as a per-schedule thing to check with DescribeSchedule. I could imagine:

  • a user does a large backfill with BufferAll and wants to know when they're all done
  • a user runs a schedule periodically with BufferAll and wants to be alerted when there are > 5 runs backed up, but doesn't care before that

How did you test it?
No test

Potential risks
None

Is hotfix candidate?
No

@ShahabT ShahabT requested review from a team as code owners August 31, 2023 21:27
@ShahabT ShahabT requested a review from dnr August 31, 2023 21:27
@ShahabT ShahabT changed the title shahab/OSS-1374 Add buffer size and dropped actions fields to ScheduleInfo Aug 31, 2023
@cretz
Copy link
Member

cretz commented Aug 31, 2023

Can you add a bit more to the "why", as in specifically how users may use it?

@ShahabT
Copy link
Contributor Author

ShahabT commented Aug 31, 2023

Can you add a bit more to the "why", as in specifically how users may use it?

Great point, @cretz. This is a warm up task for me and I do not have all the context about how users benefit from this. @dnr, could you explain the use case please? (I'll add your reply to the "Why" section)

@dnr
Copy link
Member

dnr commented Sep 1, 2023

It's basically just for visibility and completeness. Buffer overruns are already exposed in a metric, but we have similar fields in ScheduleInfo for number of runs skipped, etc. so I think it should go there too.

Buffer size is also interesting to monitor, but it doesn't make sense as a metric, it's more useful as a per-schedule thing to check with DescribeSchedule. I could imagine:

  • a user does a large backfill with BufferAll and wants to know when they're all done
  • a user runs a schedule periodically with BufferAll and wants to be alerted when there are > 5 runs backed up, but doesn't care before that

Comment on lines 328 to 329
// Number of actions dropped due to the buffer limit.
int64 buffer_dropped = 11;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd put this above in a group with the other counters. buffer_size is a gauge and more related to running_workflows.

@ShahabT ShahabT enabled auto-merge (squash) September 1, 2023 20:59
@cretz
Copy link
Member

cretz commented Sep 1, 2023

Makes sense. I may request a bit more detail into what the "buffer" is on the proto comment. For example, I don't know whether buffer is used for future calendar/interval or just backlog of backfill or otherwise pending actions from the past.

Copy link
Member

@dnr dnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both backfills and normal runs (and trigger immediately) go in the buffer together

@ShahabT ShahabT merged commit b1f3616 into master Sep 5, 2023
3 checks passed
@ShahabT ShahabT deleted the shahab/OSS-1374 branch September 5, 2023 12:13
ShahabT added a commit to temporalio/temporal that referenced this pull request Sep 6, 2023
<!-- Describe what has changed in this PR -->
**What changed?**
Expose buffer size and number of dropped actions due to buffer limit in
ScheduleInfo
Related API change: temporalio/api#311


<!-- Tell your future self why have you made these changes -->
**Why?**
Give more visibility into schedule state to the clients.


<!-- How have you verified this change? Tested locally? Added a unit
test? Checked in staging env? -->
**How did you test it?**
None so far


<!-- Assuming the worst case, what can be broken when deploying this
change to production? -->
**Potential risks**
None

<!-- Is this PR a hotfix candidate or require that a notification be
sent to the broader community? (Yes/No) -->
**Is hotfix candidate?**
No
dnr pushed a commit to dnr/temporal that referenced this pull request Oct 30, 2023
…4839)

<!-- Describe what has changed in this PR -->
**What changed?**
Expose buffer size and number of dropped actions due to buffer limit in
ScheduleInfo
Related API change: temporalio/api#311

<!-- Tell your future self why have you made these changes -->
**Why?**
Give more visibility into schedule state to the clients.

<!-- How have you verified this change? Tested locally? Added a unit
test? Checked in staging env? -->
**How did you test it?**
None so far

<!-- Assuming the worst case, what can be broken when deploying this
change to production? -->
**Potential risks**
None

<!-- Is this PR a hotfix candidate or require that a notification be
sent to the broader community? (Yes/No) -->
**Is hotfix candidate?**
No
rodrigozhou pushed a commit to temporalio/temporal that referenced this pull request Oct 31, 2023
<!-- Describe what has changed in this PR -->
**What changed?**
Expose buffer size and number of dropped actions due to buffer limit in
ScheduleInfo
Related API change: temporalio/api#311

<!-- Tell your future self why have you made these changes -->
**Why?**
Give more visibility into schedule state to the clients.

<!-- How have you verified this change? Tested locally? Added a unit
test? Checked in staging env? -->
**How did you test it?**
None so far

<!-- Assuming the worst case, what can be broken when deploying this
change to production? -->
**Potential risks**
None

<!-- Is this PR a hotfix candidate or require that a notification be
sent to the broader community? (Yes/No) -->
**Is hotfix candidate?**
No
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants