[ENH] Improved data preparation for LLM finetuning #8692
Conversation
LGTM, see minor point
Left a comment on a hard-coded value. I am approving this PR as it should work as-is, but I strongly suggest refactoring the code a little.
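The refactor the reviewer asks for is the usual fix for a magic number: lift the hard-coded value into a named constant with an overridable default. A minimal sketch, with a hypothetical constant name and value (the actual value flagged in the PR is not shown here):

```python
# Hypothetical example of the suggested refactor; names and the
# value 400 are illustrative, not taken from the PR.
DEFAULT_MAX_TOKENS = 400


class Handler:
    """Sketch of a handler that exposes the former magic number
    as a constructor parameter with a named default."""

    def __init__(self, max_tokens: int = DEFAULT_MAX_TOKENS):
        # Callers can now override the limit instead of relying on
        # a value buried inside a method body.
        self.max_tokens = max_tokens
```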
mindsdb/integrations/handlers/anyscale_endpoints_handler/anyscale_endpoints_handler.py
@chandrevdw31 If the user reports an actual error in the model record, please ask them to open a bug with it.
Description
This PR introduces common data preparation utilities for fine-tuning LLMs that follow OpenAI's ChatCompletion format. Currently, coupled versions of these methods are actively used in our Anyscale and OpenAI integrations (and hence, indirectly, by LangChain too).
Added tests for all of these so that we can reuse them in any other LLM engine.
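To make the ChatCompletion format concrete: fine-tuning data for this format is typically a JSONL file where each line holds a `messages` list of role/content dicts. A minimal sketch of such a preparation step, with hypothetical helper names (this is not the PR's actual code):

```python
import json


def to_chat_format(rows, system_prompt="You are a helpful assistant."):
    """Convert (question, answer) pairs into ChatCompletion-style
    fine-tuning examples. Hypothetical helper for illustration."""
    examples = []
    for question, answer in rows:
        examples.append({
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": question},
                {"role": "assistant", "content": answer},
            ]
        })
    return examples


def to_jsonl(examples):
    # Fine-tuning endpoints commonly expect one JSON object per line.
    return "\n".join(json.dumps(e) for e in examples)
```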
Type of change
Verification Process
To ensure the changes are working as expected:
Additional Media:
Checklist: