-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add predibase as a source for adapters #125
Conversation
Todo: move the api token to be read from the request so different tenants can have different tokens for the same endpoint |
resp = requests.get(url, headers=headers) | ||
resp.raise_for_status() | ||
uuid, best_run_id = resp.json()["uuid"], resp.json()["bestRunID"] | ||
return f"{uuid}/{best_run_id}/artifacts/model/model_weights/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a little too coupled to backend details. Can we have an endpoint that returns the full s3 URI: s3://bucket/.../model_weights/
? I realize the current s3 source is coupled to an environment variable that specifies the bucket, but ideally we should change that, too (it makes it less useful for OSS users who want to use s3).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Will fix now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine. This is specific for predibase sources - can move the logic into predibase but the result will be the same. We can allow OSS users to override the bucket by passing in AWS creds and specific bucket but that feels out of scope for this. What do you think @tgaddair ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, it's not a release blocker, but it is exposing a lot of unnecessary detail about our underlying storage system that will break if we ever move files around or want to deploy in Azure, etc. So definitely something we need to revisit.
…o add-predibase-source
…o add-predibase-source
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @magdyksaleh, I see you requested my review on this, but it looks like my two comments were not addressed yet.
sorry my comment wasn't added |
3643972
to
7ad75b2
Compare
ad5a6c4
to
9d4d3f5
Compare
|
||
|
||
@lru_cache(maxsize=256) | ||
def map_pbase_model_id_to_s3(model_id: str, api_token: str) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API token is optional, right? We should raise a useful error if the user didn't provide it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there are compiler errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This PR adds predibase as a source for adapters.
It implements a thin wrapper over the S3 model source and resolves the model name and version to a s3 bucket path to pull the adapters from
Move token to header so it isn't loggeddecouple bucketRename to token from predibase token