Add `EvolInstruct` and `EvolInstructGenerator` tasks #407
Differs from the default `EvolInstruct` which will be refactored to be an evolution on top of existing instructions i.e. always expecting `seed_data` (requires modifications on top of the original implementation)
Return `inputs` where not properly formatted, and `yield` was running twice when `generate_answers=True`
Use `enum.Enum` instead of `enum.EnumType`
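To illustrate the last commit message: `enum.EnumType` only exists on Python 3.11 and later (it was called `enum.EnumMeta` before), so referring to it fails on older interpreters, whereas `enum.Enum` is available everywhere. The sketch below shows one way a set of prompt templates could be typed against `enum.Enum`. The `MutationTemplates` enum, its members, and `render_all` are hypothetical names used for illustration only, not the code from this PR.

```python
from enum import Enum
from typing import List, Type


class MutationTemplates(Enum):
    """Hypothetical prompt templates, for illustration only."""

    CONSTRAINTS = "Add one more constraint to: {prompt}"
    DEEPEN = "Increase the depth of: {prompt}"


def render_all(templates: Type[Enum], prompt: str) -> List[str]:
    # `Type[Enum]` accepts any Enum subclass and works on every supported
    # Python version, unlike an annotation that references `enum.EnumType`.
    return [template.value.format(prompt=prompt) for template in templates]


rendered = render_all(MutationTemplates, "Sort a list")
```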
51ceda6 to 61a560a
Fair, we can include it, but I'm afraid the misalignments between their official implementation and the paper don't have a clear reference. Should we take more inspiration from the paper instead? In my experience, using GPT-4 for generating and evolving instructions works well, but we can try to compare with and without.
Right, I was using |
Also, I've been exploring a bit on GitHub: is this a more faithful reproduction of EvolInstruct (pre-WizardLM)? https://github.com/nlpxucan/WizardLM/tree/main/Evol_Instruct cc @davidberenstein1957
@alvarobartt I would say so, yes. This is also the way we implemented it initially. I would also go with the more extensive prompts to ensure we've got a higher chance of making it work without more advanced models.
Hi @alvarobartt, it looks good to me. Some methods are quite big so I would split them
Used https://github.com/nlpxucan/WizardLM/tree/main/Evol_Instruct as reference instead (using same prompts as the paper)
Co-authored-by: Gabriel Martin <gabrielmbmb@users.noreply.github.com>
Description
This PR adds both the `EvolInstruct` and `EvolInstructGenerator` tasks, ported from https://github.com/h2oai/h2o-wizardlm/blob/main/wizardlm.py with some slight modifications to suit our needs, while respecting the evolutionary approach.

Besides that, the `AsyncLLM` has been fixed to use `asyncio` event loops instead of `asyncio.run`, as the latter was raising some errors when called within a loop, so the `AsyncLLM` implementation is now more robust. Also, the `GeneratorTask` has been included for tasks like `EvolInstructGenerator`, i.e. tasks that generate data without seed data as input.

Closes #408
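As a rough sketch of the `AsyncLLM` change described above: `asyncio.run` creates and closes a fresh event loop on every call, which can cause trouble when it is invoked repeatedly, and it raises a `RuntimeError` when called from a loop that is already running. One alternative is to keep a single event loop alive and drive it with `run_until_complete`. The `AsyncEchoLLM` class and its methods below are hypothetical stand-ins, not the actual `AsyncLLM` implementation in this PR.

```python
import asyncio
from typing import List


class AsyncEchoLLM:
    """Hypothetical stand-in for an async LLM wrapper (illustration only)."""

    def __init__(self) -> None:
        # One loop is created up front and reused for every call,
        # instead of letting asyncio.run() create and close one each time.
        self._loop = asyncio.new_event_loop()

    async def _agenerate(self, prompt: str) -> str:
        # Placeholder for a real asynchronous API request.
        await asyncio.sleep(0)
        return prompt.upper()

    def generate(self, prompts: List[str]) -> List[str]:
        async def gather_all() -> List[str]:
            # Run all requests concurrently on the shared loop.
            return await asyncio.gather(*(self._agenerate(p) for p in prompts))

        return self._loop.run_until_complete(gather_all())

    def close(self) -> None:
        self._loop.close()


llm = AsyncEchoLLM()
# Calling generate() repeatedly inside a for-loop reuses the same event loop.
batches = [llm.generate(["a", "b"]) for _ in range(3)]
llm.close()
```

The trade-off in this sketch is that the caller owns the loop's lifetime and has to close it explicitly.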
Example
Reference
https://github.com/h2oai/h2o-wizardlm/blob/main/wizardlm.py