feat: add model streaming #8973
DROP TRIGGER IF EXISTS stream_model_trigger_seq ON models;
CREATE TRIGGER stream_model_trigger_seq
BEFORE INSERT OR UPDATE OF
name, description, creation_time, last_updated_time, metadata, labels, user_id, archived, notes, workspace_id
It's likely outside the scope of this ticket; however, due to the limitations of Postgres notify/listen queues, we'll want to communicate over these channels using only filterable columns. It doesn't look like the work on making the transition to the strategy is complete.
That being said, this will need to be adjusted with the work so that we don't potentially exceed that 8k character limit with user-defined fields.
The ticket is related to this issue. I plan to add unit tests next before focusing on this ticket. Let me know if you think this should take priority, and I'll be happy to tackle it first.
The streaming updates code looks solid. Could integration tests be added?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #8973 +/- ##
==========================================
+ Coverage 47.55% 47.56% +0.01%
==========================================
Files 1168 1169 +1
Lines 176706 176883 +177
Branches 2356 2353 -3
==========================================
+ Hits 84026 84137 +111
- Misses 92522 92588 +66
Partials 158 158
Thanks! I refactored the |
master/internal/stream/models.go
Outdated
// determined:stream-gen source=client
type ModelSubscriptionSpec struct {
	WorkspaceIDs []int `json:"workspace_ids"`
	ModelIDs     []int `json:"Model_ids"`
is this supposed to be capitalized?
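For context on why the casing matters: Go's encoding/json writes the tag text verbatim when marshaling, so a capitalized tag shows up as a capitalized key on the wire. The sketch below is a minimal illustration, not code from this PR; `specWithTypo` and `encodeSpec` are hypothetical names that only mirror the two fields shown in the diff.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// specWithTypo is a hypothetical copy of the struct under review,
// keeping the capitalized "Model_ids" tag from the diff.
type specWithTypo struct {
	WorkspaceIDs []int `json:"workspace_ids"`
	ModelIDs     []int `json:"Model_ids"`
}

// encodeSpec marshals the spec so the emitted key names can be inspected.
func encodeSpec(s specWithTypo) string {
	b, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Prints {"workspace_ids":[1],"Model_ids":[2]}
	fmt.Println(encodeSpec(specWithTypo{WorkspaceIDs: []int{1}, ModelIDs: []int{2}}))
}
```

Decoding in Go is more forgiving, since json.Unmarshal also accepts a case-insensitive key match, but any client that relies on the exact tag text would see the inconsistent key.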
// ModelSubscriptionSpec is what a user submits to define a Model subscription.
//
// determined:stream-gen source=client
type ModelSubscriptionSpec struct {
it might be helpful to be able to subscribe to model streaming updates by model name(s)? i think in most cases users won't easily have access to the model ID, and the model name is unique.
for reference, the python SDK accepts both model ID and model name as possible fetch parameters (client.get_model(ID|name)), and the CLI only accepts name (in det model describe NAME)
There's hesitancy to allow subscribing based on model name due to limitations on Postgres notify/listen queues. To allow subscribing based on model names, we'd have to enforce a maximum length. Right now the maximum size of a model name is restricted only by the index limit in Postgres (8191 bytes), which would exceed the capacity of the Postgres queues.
// ModelSubscriptionSpec is what a user submits to define a Model subscription.
//
// determined:stream-gen source=client
type ModelSubscriptionSpec struct {
similarly, would it be a good idea to be able to subscribe by user ID and/or labels? i'm imagining a use case where you don't want to have to figure out specific identifiers for the models you're interested in, so maybe a broader subscription scope would be helpful.
I don't see an issue with allowing subscriptions based on user ID, assuming RBAC is implemented in EE.
I would be hesitant to allow label-based subscriptions for the same reason as above, especially since there doesn't seem to be a limit on the length of a tag.
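As a sketch of what filtering on fixed-size columns could look like, the example below matches an event against ID-based filters only. Everything here is an assumption for illustration: the `UserIDs` field, the `modelEvent` type, and the "any filter matches" semantics are hypothetical and not taken from this PR.

```go
package main

import "fmt"

// subscriptionSpec is a hypothetical spec that adds a UserIDs filter
// next to the workspace and model ID filters discussed in the review.
type subscriptionSpec struct {
	WorkspaceIDs []int
	ModelIDs     []int
	UserIDs      []int
}

// modelEvent is a hypothetical event carrying only integer columns.
type modelEvent struct {
	ID          int
	WorkspaceID int
	UserID      int
}

// contains reports whether id appears in ids.
func contains(ids []int, id int) bool {
	for _, v := range ids {
		if v == id {
			return true
		}
	}
	return false
}

// matches reports whether ev satisfies at least one filter in spec
// (assumed "any-of" semantics).
func matches(spec subscriptionSpec, ev modelEvent) bool {
	return contains(spec.WorkspaceIDs, ev.WorkspaceID) ||
		contains(spec.ModelIDs, ev.ID) ||
		contains(spec.UserIDs, ev.UserID)
}

func main() {
	spec := subscriptionSpec{UserIDs: []int{7}}
	fmt.Println(matches(spec, modelEvent{ID: 1, WorkspaceID: 2, UserID: 7}))
	fmt.Println(matches(spec, modelEvent{ID: 1, WorkspaceID: 2, UserID: 8}))
}
```

Because every filter is an integer comparison, nothing user-defined and unbounded in length (such as names or labels) has to travel through the notification channel.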
should model streaming be added to the python streaming client?
a few general design questions wrt streaming updates and not in the scope of this PR (feel free to redirect me if this isn't the right place to ask these, just some thoughts i had while reading through the code as a newcomer to streaming updates 😄 ):
lgtm. commented a few suggestions/questions but up to you whether they're relevant.
Hi @azhou-determined, thanks for your review. To answer your questions:
master/internal/stream/models.go
Outdated
type ModelSubscriptionSpec struct {
	WorkspaceIDs []int    `json:"workspace_ids"`
	ModelIDs     []int    `json:"model_ids"`
	ModelNames   []string `json:"model_names"`
Are model names being restricted in length to ensure that we don't exceed the postgres payload limit?
I believe not. Thanks for your input; I decided to remove subscription by model name.
Looks great! 😄
Description
Add model streaming. This is not yet exposed to the WebUI.
There will be a separate PR for EE to handle RBAC.
Test Plan
Connect to ws://localhost:8080/stream; you should be able to subscribe to and unsubscribe from models.
Commentary (optional)
There is a separate ticket for adding unit tests, so unit tests are not yet included in this PR.
Checklist
docs/release-notes/. See Release Note for details.
Ticket
MD-261