Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add predibase as a source for adapters #125

Merged
merged 25 commits into from
Dec 14, 2023
Merged

Conversation

magdyksaleh
Copy link
Collaborator

@magdyksaleh magdyksaleh commented Dec 13, 2023

This PR adds predibase as a source for adapters.

It implements a thin wrapper over the S3 model source and resolves the model name and version to a s3 bucket path to pull the adapters from

  • Move token to header so it isn't logged
  • decouple bucket
  • make version optional
  • Rename to token from predibase token

launcher/src/main.rs Outdated Show resolved Hide resolved
@magdyksaleh
Copy link
Collaborator Author

Todo: move the api token to be read from the request so different tenants can have different tokens for the same endpoint

resp = requests.get(url, headers=headers)
resp.raise_for_status()
uuid, best_run_id = resp.json()["uuid"], resp.json()["bestRunID"]
return f"{uuid}/{best_run_id}/artifacts/model/model_weights/"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a little too coupled to backend details. Can we have an endpoint that returns the full s3 URI: s3://bucket/.../model_weights/? I realize the current s3 source is coupled to an environment variable that specifies the bucket, but ideally we should change that, too (it makes it less useful for OSS users who want to use s3).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Will fix now

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. This is specific for predibase sources - can move the logic into predibase but the result will be the same. We can allow OSS users to override the bucket by passing in AWS creds and specific bucket but that feels out of scope for this. What do you think @tgaddair ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, it's not a release blocker, but it is exposing a lot of unnecessary detail about our underlying storage system that will break if we ever move files around or want to deploy in Azure, etc. So definitely something we need to revisit.

server/lorax_server/utils/sources/__init__.py Outdated Show resolved Hide resolved
server/lorax_server/utils/sources/__init__.py Outdated Show resolved Hide resolved
router/src/infer.rs Outdated Show resolved Hide resolved
proto/generate.proto Outdated Show resolved Hide resolved
Copy link
Contributor

@tgaddair tgaddair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @magdyksaleh, I see you requested my review on this, but it looks like my two comments were not addressed yet.

@magdyksaleh
Copy link
Collaborator Author

sorry my comment wasn't added



@lru_cache(maxsize=256)
def map_pbase_model_id_to_s3(model_id: str, api_token: str) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API token is optional, right? We should raise a useful error if the user didn't provide it.

Copy link
Contributor

@tgaddair tgaddair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there are compiler errors.

Copy link
Contributor

@tgaddair tgaddair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@magdyksaleh magdyksaleh merged commit 143f303 into main Dec 14, 2023
1 check passed
@magdyksaleh magdyksaleh deleted the add-predibase-source branch December 14, 2023 23:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants