Rename "capacity" to "count" in the aiModels.json #3639
base: master
Conversation
This change is non-backward-compatible, but only 3 Orchestrators are using it right now, so we can handle it. The context for this change is this GH Issue: #3627
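For illustration, here is a minimal, runnable Go sketch of how an aiModels.json entry would decode after the rename. Only the capacity → count rename is confirmed by this PR; the trimmed-down struct, the pipeline/model_id field names, and the sample values are assumptions for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed-down stand-in for AIModelConfig; the real struct in this repo
// has more fields (see the diff below). Only the "capacity" -> "count"
// rename is what this PR confirms.
type AIModelConfig struct {
	Pipeline string `json:"pipeline,omitempty"`
	ModelID  string `json:"model_id,omitempty"`
	Warm     bool   `json:"warm,omitempty"`
	Count    int    `json:"count,omitempty"` // previously `json:"capacity,omitempty"`
}

func main() {
	// A hypothetical aiModels.json entry using the new field name.
	data := []byte(`[
		{"pipeline": "live-video-to-video", "model_id": "example/model", "warm": true, "count": 2}
	]`)

	var configs []AIModelConfig
	if err := json.Unmarshal(data, &configs); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", configs[0])
}
```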
Codecov Report
Attention: Patch coverage is …
Additional details and impacted files:

@@              Coverage Diff              @@
##              master      #3639       +/- ##
==============================================
+ Coverage   30.55638%  30.56127%   +0.00489%
==============================================
  Files            156        156
  Lines          47162      47161           -1
==============================================
+ Hits           14411      14413           +2
+ Misses         31910      31907           -3
  Partials         841        841

... and 2 files with indirect coverage changes. Continue to review the full report in Codecov by Sentry.
@@ -79,6 +79,7 @@ type AIModelConfig struct {
	Token    string `json:"token,omitempty"`
	Warm     bool   `json:"warm,omitempty"`
	Capacity int    `json:"capacity,omitempty"`
Should we remove Capacity then?
No, we do use Capacity, but it has a different meaning: "how many requests can be handled by one runner in parallel". It's not used by Realtime Video, but it is used for AI Jobs (for some models).
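For intuition, here is a hedged sketch of what that meaning of Capacity amounts to: a counting semaphore that caps how many requests one runner serves concurrently. This is illustrative only, not the repo's actual dispatch code.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	const capacity = 3 // Capacity: max requests one runner handles in parallel

	sem := make(chan struct{}, capacity) // counting semaphore sized by Capacity
	var wg sync.WaitGroup

	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			sem <- struct{}{}        // block until one of the `capacity` slots frees up
			defer func() { <-sem }() // release the slot
			fmt.Printf("request %d running\n", id)
			time.Sleep(50 * time.Millisecond) // stand-in for inference work
		}(i)
	}
	wg.Wait()
}
```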
@leszko I was able to start up my LLM runner with …
@@ -1298,15 +1298,13 @@ func StartLivepeer(ctx context.Context, cfg LivepeerConfig) {
	}
	if *cfg.AIWorker {
		modelConstraint := &core.ModelConstraint{Warm: config.Warm, Capacity: 1}
Suggested change:
-	modelConstraint := &core.ModelConstraint{Warm: config.Warm, Capacity: 1}
+	modelConstraint := &core.ModelConstraint{Warm: config.Warm, Capacity: config.Capacity}
We should always set Capacity, I think.
@@ -1298,15 +1298,13 @@ func StartLivepeer(ctx context.Context, cfg LivepeerConfig) {
	}
	if *cfg.AIWorker {
		modelConstraint := &core.ModelConstraint{Warm: config.Warm, Capacity: 1}
		// External containers do auto-scale; default to 1 or use provided capacity.
		if config.URL != "" && config.Capacity != 0 {
Then let's remove this if statement, since we are now setting Capacity on every modelConstraint.
@ad-astra-video please check if it works for the AI Jobs.
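Putting the two review comments together, the consolidated initialization could look roughly like the sketch below: always propagate Capacity from the config and default to 1 when it is unset, with no special-casing of external (URL-based) containers. The stand-in types are minimal stubs so the sketch compiles; this is not the merged code.

```go
package main

import "fmt"

// Minimal stubs for the repo's types, just to make the sketch compile.
type ModelConstraint struct {
	Warm     bool
	Capacity int
}

type AIModelConfig struct {
	Warm     bool
	Capacity int
	URL      string
}

// resolveConstraint always takes Capacity from the model config,
// defaulting to 1 when it is unset (the zero value).
func resolveConstraint(config AIModelConfig) *ModelConstraint {
	capacity := config.Capacity
	if capacity == 0 {
		capacity = 1 // default when aiModels.json does not set it
	}
	return &ModelConstraint{Warm: config.Warm, Capacity: capacity}
}

func main() {
	fmt.Printf("%+v\n", resolveConstraint(AIModelConfig{Warm: true}))              // defaults to Capacity: 1
	fmt.Printf("%+v\n", resolveConstraint(AIModelConfig{Warm: true, Capacity: 4})) // explicit capacity kept
}
```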
FYI: @pwilczynskiclearcode (but from what I see we haven't used "capacity" in our infra, so it should be fine)
FYI: @victorges