Find the dataset here.
This dataset combines seven existing datasets:
- Alpaca - Instruction following
- CodeAlpaca - Programming
- Dolly - Instruction following
- Tigerbot GSM - Math
- Tiger StackExchange - Chat
- Glaive Code - Programming/Computer Questions
- MetaMath QA - Math
Note: The content of the source datasets was not changed. Only the column names were changed, and an input column was added to the datasets that lacked one.
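
The column alignment described in the note can be sketched roughly as follows. This is a minimal illustration, not the script used to build the dataset: the target column names (`instruction`, `input`, `output`), the source column names (`query`, `response`), and the `normalize` helper are all assumptions made for the example.

```python
# Hypothetical target schema; the actual column names of this dataset may differ.
TARGET_COLUMNS = ("instruction", "input", "output")


def normalize(record, rename):
    """Rename columns per `rename` and add an empty `input` column if missing.

    The values themselves are copied unchanged.
    """
    out = {rename.get(key, key): value for key, value in record.items()}
    out.setdefault("input", "")  # added column for sources that have none
    return {col: out[col] for col in TARGET_COLUMNS}


# Illustrative record with assumed source column names (query/response).
math_row = {"query": "What is 2 + 3?", "response": "5"}
normalized = normalize(math_row, {"query": "instruction", "response": "output"})
print(normalized)
```

A record that already uses the target column names passes through with an empty rename mapping, so the same helper could be applied to every source before concatenation.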
This dataset was made to fine-tune models for better performance in programming and math.
A fine-tuned version of Mistral is currently being developed using this dataset. The link will be available soon.