Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a file for listing relevant research papers #217

Merged
merged 2 commits into from
Jan 1, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
23 changes: 23 additions & 0 deletions docs/research/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# Research

This page lists research papers that are relevant to the project.

## Automatically Generating Instruction Data for Training

This line of work is about significantly reducing the need for manually annotated data for the purpose of training [instruction-aligned](https://openai.com/blog/instruction-following/) language models.

### SELF-INSTRUCT: Aligning Language Model with Self Generated Instructions [[ArXiv](https://arxiv.org/pdf/2212.10560.pdf)], [[Github](https://github.com/yizhongw/self-instruct)].

> We introduce SELF-INSTRUCT, a framework for improving the instruction-following capabilities of pretrained language models by bootstrapping off its own generations.
> Our pipeline generates instruction, input, and output samples from a language model, then prunes them before using them to finetune the original model.
> Applying our method to vanilla GPT3, we demonstrate a 33% absolute improvement over the original model on SuperNaturalInstructions, on par with the performance of InstructGPT-0011, which is trained with private user data and human annotations.

### Tuning Language Models with (Almost) No Human Labor. [[ArXiv](https://arxiv.org/pdf/2212.09689.pdf)], [[Github](https://github.com/orhonovich/unnatural-instructions)].

> In this work, we introduce
> Unnatural Instructions: a large dataset of creative and diverse instructions, collected with virtually no human labor.
> We collect 64,000 examples by prompting a language model with three seed examples of instructions and eliciting a fourth.
> This set is then expanded by prompting the model to rephrase each instruction, creating a total of approximately 240,000 examples of instructions, inputs, and outputs.
> Experiments show that despite containing a fair amount of noise, training on Unnatural Instructions rivals the effectiveness of training
> on open-source manually-curated datasets, surpassing the performance of models such as
> T0++ and Tk-Instruct across various benchmarks.