GPT-4 Instruction dataset #31

KnutJaegersberg · 2023-04-02T04:59:27Z

Take a look:

https://github.com/teknium1/GPTeacher

PhoebusSi · 2023-04-02T14:53:20Z

We will collect them soon. Thank you for your support.

KnutJaegersberg · 2023-04-06T11:01:50Z

This one is a mixture of other datasets, but it should contain a few new records. It has now landed on Hugging Face.

https://huggingface.co/datasets/swype/instruct

PhoebusSi · 2023-04-06T12:19:57Z

Thank you very much for the reminder. We'll collect it soon.

KnutJaegersberg · 2023-04-07T10:35:26Z

Here is another one: Alpaca, but generated by GPT-4. It includes Chinese translations :)

https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM#fine-tuning-with-the-data

KnutJaegersberg · 2023-04-09T09:45:55Z

Related to your project, because you started out with chain-of-thought fine-tuning:

Researchers fine-tuned Galactica on the Alpaca data to produce Galpaca, which seems to have better reasoning in science and technology domains than LLaMA:

https://twitter.com/oijna/status/1637566839235518464

https://huggingface.co/GeorgiaTechResearchInstitute/galpaca-30b

dkqkxx · 2023-04-11T03:44:40Z

I'll pay attention to these, thx.

KnutJaegersberg · 2023-04-12T18:02:47Z

This is so insanely fast moving, I get confused.

https://github.com/databrickslabs/dolly/tree/master/data

KnutJaegersberg · 2023-04-16T17:33:45Z

Author description (not mine):
"CAMEL datasets: Physics, Chemistry and Biology. Each dataset contains 20K problem-solution pairs, consisting of 25 topics, 25 subtopics and 32 problems for each "topic, subtopic" pair generated and solved by GPT4"

https://github.com/lightaime/camel#data-hosted-on-hugging-face

KnutJaegersberg · 2023-04-24T06:08:57Z

https://github.com/DreamerGPT/DreamerGPT/tree/main/data

PhoebusSi closed this as completed Apr 2, 2023

PhoebusSi reopened this Apr 6, 2023
