Update HumanEval and MBPP notebooks #1895

Merged
merged 5 commits into LAION-AI:main on Mar 2, 2023

Conversation

olliestanley
Collaborator

This updates the HumanEval and MBPP instruction dataset generation notebooks to correct their output format and push their results to the HuggingFace Hub. It also moves them to the new data directory.
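For readers unfamiliar with what such a notebook produces, here is a minimal sketch of turning one HumanEval-style record into an instruction/response row. The `INSTRUCTION` and `RESPONSE` column names, the template wording, and the `humaneval_to_row` helper are assumptions made for illustration; the notebooks' actual format is not shown in this conversation.

```python
from typing import Dict

# Hypothetical prompt template; the real wording used in the notebooks may differ.
CODE_GEN_TEMPLATE = "Complete the following Python function:\n{prompt}"


def humaneval_to_row(record: Dict[str, str]) -> Dict[str, str]:
    """Build one instruction/response row from a HumanEval-style record.

    Assumes the record has a `prompt` (signature plus docstring) and a
    `canonical_solution` (the function body), as in the public HumanEval data.
    """
    return {
        # The instruction shows the stub the model is asked to complete.
        "INSTRUCTION": CODE_GEN_TEMPLATE.format(prompt=record["prompt"].strip()),
        # The response is the full function: stub followed by the reference body.
        "RESPONSE": record["prompt"] + record["canonical_solution"],
    }


example = {
    "prompt": 'def add(a, b):\n    """Return the sum of a and b."""\n',
    "canonical_solution": "    return a + b\n",
}
row = humaneval_to_row(example)
```

If the `datasets` library is installed, a list of such rows could probably be published with `Dataset.from_list(rows).push_to_hub(...)`, though the repository name and authentication setup are left out here.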

It also changes the name of the grade-school-math-instructions directory to grade_school_math_instructions for consistency with other directories in data/datasets.

Collaborator

@Vechtomov Vechtomov left a comment


Later we can combine all these datasets into one and add a SOURCE column. We can also increase the diversity of the prompts.

@olliestanley
Collaborator Author

> Later we can combine all these datasets into one and add a SOURCE column. We can also increase the diversity of the prompts.

For now, I will at least combine the two code gen datasets into one and the two test gen datasets into another; I can do that quickly. Prompt diversity may be a bit of a challenge given the nature of the instructions, but I agree we can probably introduce at least some.

@olliestanley
Collaborator Author

Now updated to just produce 2 datasets (code gen and test gen) rather than 4.
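As a rough illustration of the merge discussed above, the sketch below combines rows from several origins into one list while recording where each row came from. The `SOURCE` column follows the suggestion in the review comment; the `tag_source` and `combine` helpers, the source labels, and the sample rows are hypothetical and not taken from the notebooks.

```python
from typing import Dict, List

Row = Dict[str, str]


def tag_source(rows: List[Row], source: str) -> List[Row]:
    """Return copies of the rows with a SOURCE column added."""
    return [{**row, "SOURCE": source} for row in rows]


def combine(named_rows: Dict[str, List[Row]]) -> List[Row]:
    """Concatenate several row lists, tagging each row with its origin."""
    combined: List[Row] = []
    for source, rows in named_rows.items():
        combined.extend(tag_source(rows, source))
    return combined


# Placeholder rows standing in for the per-benchmark code gen outputs.
humaneval_rows = [{"INSTRUCTION": "Complete add(a, b).", "RESPONSE": "def add(a, b): return a + b"}]
mbpp_rows = [{"INSTRUCTION": "Write a function to square x.", "RESPONSE": "def sq(x): return x * x"}]

code_gen = combine({"humaneval": humaneval_rows, "mbpp": mbpp_rows})
```

The same helper could be applied to the two test gen outputs to obtain the second dataset, which would give 2 datasets rather than 4.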

@olliestanley olliestanley enabled auto-merge (squash) March 2, 2023 15:42
@olliestanley olliestanley merged commit 0b6865b into LAION-AI:main Mar 2, 2023

3 participants