-
Notifications
You must be signed in to change notification settings - Fork 67
File API functions #160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
File API functions #160
Conversation
|
|
||
| id: str = Field(..., description="ID of the requested file.") | ||
| """ID of the requested file.""" | ||
| filename: str = Field(..., description="File name.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: newline spacing to let things breathe a bit?
clients/python/llmengine/file.py
Outdated
|
|
||
| class File(APIEngine): | ||
| """ | ||
| File API. This API is used to upload private files to Scale so that fine-tunes can access them for training and validation data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't mention Scale here, because this API will eventually work for self-hosted users too. The client can after all work in dual modes, talking to either the Scale-hosted or self-hosted LLM Engine server.
clients/python/llmengine/file.py
Outdated
| """ | ||
|
|
||
| @classmethod | ||
| def upload(cls, file_path: str) -> UploadFileResponse: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think to start, we can do a file instead of a path.
clients/python/llmengine/file.py
Outdated
| @classmethod | ||
| def get(cls, file_id: str) -> GetFileResponse: | ||
| """ | ||
| Get filename and size of a file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would suggest generalizing this to the metadata of a file, in case we add more.
clients/python/llmengine/file.py
Outdated
| @classmethod | ||
| def list(cls) -> ListFilesResponse: | ||
| """ | ||
| List all files, with information about their filenames and sizes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing w.r.t metadata.
clients/python/llmengine/file.py
Outdated
| return DeleteFileResponse.parse_obj(response) | ||
|
|
||
| @classmethod | ||
| def get_content(cls, file_id: str) -> GetFileContentResponse: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to call this download to have symmetry with your upload function. Granted, from the server's perspective, get_content makes more sense, but given that we're in the context of the client, I think it's ok to have the functions be client-centric.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the response be GetFileContentResponse or DownloadFileResponse, then? (I.e. is it more important for this naming to be consistent in the client, or for the client DTOs to be copy-pastable from the server DTOs?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say it's more important to be consistent in the client, because per above, from the server's perspective it's more about getting (extracting) content, whereas from this client, you're more specifically downloading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll change it to DownloadFileResponse in the client, but I'm concerned it might be overridden if someone re-copies the DTOs from the server in the future 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait sorry what I meant was that the client methods should be named upload/download, but it makes it easier to avoid mistakes when copying DTOs, the DTOs can follow the server. Also curious if @phil-scale or Will have thoughts on this.
| f"v1/files/{file_id}/content", | ||
| timeout=DEFAULT_TIMEOUT, | ||
| ) | ||
| return GetFileContentResponse.parse_obj(response) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually maybe it is fine to have this always be a string - we just need to document the expectations, e.g. if you uploaded text, then it'll ofc be that text; else if it's binary, we'll return it as a string subject to some encoding.
Client functions + some documentation for File API