Support checkpoint_path for endpoint creation #181
Conversation
yixu34 left a comment:
Probably fine? Think we should get @jenkspt @sam-scale @adlam-scale to get 👀 on the API change
  model_name: str
  source: LLMSource = LLMSource.HUGGING_FACE
- inference_framework: LLMInferenceFramework = LLMInferenceFramework.DEEPSPEED
+ inference_framework: LLMInferenceFramework = LLMInferenceFramework.TEXT_GENERATION_INFERENCE
Is this a breaking change? Seems like we're changing default functionality. Might be fine.
I don't think people can build DeepSpeed models right now anyway.
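For context, a minimal sketch of the request schema after this change. The field names and defaults come from the diff above; the enum string values and the plain-dataclass form are assumptions (the real schema is likely a pydantic model):

```python
from dataclasses import dataclass
from enum import Enum


class LLMSource(str, Enum):
    HUGGING_FACE = "hugging_face"  # string value is an assumption


class LLMInferenceFramework(str, Enum):
    DEEPSPEED = "deepspeed"
    TEXT_GENERATION_INFERENCE = "text_generation_inference"


@dataclass
class CreateLLMEndpointRequest:  # name inferred from the diff context
    model_name: str
    source: LLMSource = LLMSource.HUGGING_FACE
    # Default changed from DEEPSPEED to TEXT_GENERATION_INFERENCE in this PR.
    inference_framework: LLMInferenceFramework = (
        LLMInferenceFramework.TEXT_GENERATION_INFERENCE
    )


# A caller that doesn't pass inference_framework now silently gets TGI,
# which is why the reviewer asks whether this is a breaking change.
req = CreateLLMEndpointRequest(model_name="llama-2-7b")
print(req.inference_framework.value)
```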
  metadata: Dict[str, Any]  # TODO: JSON type
  post_inference_hooks: Optional[List[str]]
- endpoint_type: ModelEndpointType = ModelEndpointType.SYNC
+ endpoint_type: ModelEndpointType = ModelEndpointType.STREAMING
Does this line up with the default on the server?
No, but it lines up with the default endpoint type for TGI.
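To illustrate the default-mismatch concern, a hypothetical sketch (the enum string values and the server-side default are assumptions drawn from the comments above, not from the PR itself):

```python
from dataclasses import dataclass
from enum import Enum


class ModelEndpointType(str, Enum):
    SYNC = "sync"
    ASYNC = "async"
    STREAMING = "streaming"


@dataclass
class EndpointRequest:  # hypothetical; mirrors the fields in the diff
    # Client-side default after this PR, chosen to line up with TGI.
    endpoint_type: ModelEndpointType = ModelEndpointType.STREAMING


# Hypothetical server-side default, per the review exchange above.
SERVER_DEFAULT = ModelEndpointType.SYNC

req = EndpointRequest()
if req.endpoint_type is not SERVER_DEFAULT:
    # Since client and server defaults disagree, callers who care should
    # pass endpoint_type explicitly rather than rely on either default.
    print(
        f"client default {req.endpoint_type.value!r} "
        f"!= server default {SERVER_DEFAULT.value!r}"
    )
```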
FYI @juliashuieh ran into trouble with the tar strategy because (I strongly suspect) of out-of-disk-space issues. The tar file is 80G, so once untarred (the original tar file is not deleted) it takes up 160G. Then safetensors are created, which take up another 80G and exceed the storage space of the pod (200G). Would be nice to untar + delete, or increase the disk space.
Storage size can be specified in the API.
Also, I believe we do delete after untarring, but we'd still need 160G of space to untar. Default storage is 96GB.
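A sketch of the untar + delete idea. All paths here are hypothetical, and the demo uses a local stand-in file; in production the archive would instead be streamed from object storage (e.g. `aws s3 cp s3://bucket/model.tar -` piped into `tar`), so the 80G tarball never occupies pod disk at all:

```shell
set -euo pipefail

# Local stand-in for the checkpoint archive (hypothetical paths).
workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/dst"
echo "weights" > "$workdir/src/model.bin"
tar -cf "$workdir/model.tar" -C "$workdir/src" model.bin

# Stream-extract: tar reads from stdin, so when the source is an object-store
# stream instead of a local file, only the extracted files occupy disk.
cat "$workdir/model.tar" | tar -xf - -C "$workdir/dst"

# Delete the archive as soon as extraction succeeds, halving peak disk usage
# in the local-file case.
rm "$workdir/model.tar"
ls "$workdir/dst"
```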
I get that this is out of scope for this PR, but I'm not sure where else to put it. Can we remove the requirement to tar our checkpoints?
Seems I can't use the default using