What about starting a crowdfunding campaign to collect money to run the examples against GPT-4? #37
Comments
Not as good an idea as it might seem. The instructions themselves are very repetitive. When it comes to writing code, there are like 10 different questions reworded 700 times. Besides, it's not necessary to collect money at all. Just create an interface, and then each user can run it locally with their own API key on a specific section of the dataset (e.g. instructions 3650 to 10500 for user @viraniaman94).
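The per-user split suggested above could look roughly like the sketch below. It is only an illustration: `regenerate_slice` and `fake_complete` are hypothetical names, the `instruction` / `input` / `output` keys assume the usual Alpaca JSON record layout, and the model call is left as a plain callable so each user can plug in whatever client and API key they have.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


def regenerate_slice(
    records: List[Record],
    start: int,
    end: int,
    complete: Callable[[str], str],
) -> List[Record]:
    """Return copies of records[start:end] whose 'output' is replaced
    by whatever the user-supplied `complete` callable returns."""
    updated = []
    for record in records[start:end]:
        # Build the prompt from the instruction plus the optional input field.
        prompt = record["instruction"]
        if record.get("input"):
            prompt += "\n\n" + record["input"]
        # Copy the record so the original dataset is left untouched.
        updated.append({**record, "output": complete(prompt)})
    return updated


def fake_complete(prompt: str) -> str:
    # Stand-in for a real model call (assumption: the user wires in their own client).
    return "response to: " + prompt


sample = [
    {"instruction": "Add 2 and 3.", "input": "", "output": "6"},
    {"instruction": "Translate to French.", "input": "cat", "output": "chat"},
    {"instruction": "Name a color.", "input": "", "output": "blue"},
]

# One contributor takes records 1..2; another could take a different range.
result = regenerate_slice(sample, 1, 3, fake_complete)
```

Each contributor would run this on their assigned range and submit only the regenerated records, so no shared budget is needed.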
It might be possible to use dialogues from the Open Assistant project (https://github.com/LAION-AI/Open-Assistant) as a seed to create more questions. But paying clickworkers $500 to create more questions for Open Assistant might be a better investment.
We have likely corrected many of the major issues with the original Alpaca dataset. What remains are mostly minor issues (though a lot of math issues still remain). So I tend to agree with the sentiment that taking the dataset and feeding it into GPT-4 will not result in a huge performance boost; however, what might be interesting is having ChatGPT provide a response for all the instructions we have not curated. This would likely result in a more verbose Alpaca model that is closer to the output of ChatGPT.
original alpaca output:
ChatGPT output:
Not sure if it would really do anything in terms of performance though. Ideally, we would diversify the dataset.
Are there any plans on how to fix the dataset for the math issues @gururise? Is anyone actively working on that? I'd rather try my hand at something that isn't going to be useless by the time I'm done because someone already finished it.
@HideLord fixed the first few batches of the math issues. I know there are quite a few math issues remaining. As far as I know, there is nobody working on that right now.
I guess one challenge is maintaining transparency at every step, and another is what the legal implications would be, but it's just USD 500, so it shouldn't matter as much anyway!