Merged
24 changes: 21 additions & 3 deletions README.md
@@ -1,15 +1,22 @@
# PromptSource
Toolkit for collecting and applying templates of prompting instances.
Promptsource is a toolkit for collecting and applying prompts to NLP datasets.

WIP
Promptsource uses a simple templating language to programmatically map an example of a dataset into a text input and a text target.
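
As a rough illustration of mapping one dataset example to a text input and a text target, here is a minimal sketch. It is not Promptsource's own API: Python's standard `string.Template` stands in for the templating language, and the `|||` separator, the `apply_prompt` helper, the example fields, and the prompt wording are all assumptions made for this sketch.

```python
from string import Template

# Assumed separator between the input part and the target part of a prompt.
SEPARATOR = "|||"


def apply_prompt(template: str, example: dict) -> tuple:
    """Fill the template with the example's fields, then split into (input, target)."""
    rendered = Template(template).substitute(example)
    text_input, text_target = (part.strip() for part in rendered.split(SEPARATOR, 1))
    return text_input, text_target


# Illustrative example and prompt; neither is taken from P3.
example = {
    "premise": "A dog is running in the park.",
    "hypothesis": "An animal is outside.",
    "label": "yes",
}
template = '$premise Does this imply that "$hypothesis"? ||| $label'

text_input, text_target = apply_prompt(template, example)
print(text_input)
print(text_target)
```

The same template can be applied to every example of a dataset, which is what makes the mapping programmatic.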

Promptsource contains a growing collection of prompts (which we call **P3**: **P**ublic **P**ool of **P**rompts). As of October 18th, there are ~2,000 prompts for 170+ datasets in P3.
Feel free to use these prompts as they are (you'll find citation details [here](#citation)).

Note that a subset of the prompts are still *Work in Progress*. The prompts that may be modified in the near future are listed [here](WIP.md). Most modifications will consist of metadata collection, but some will affect the templates themselves. To facilitate traceability, Promptsource is currently pinned at version `0.1.0`.

## Setup
1. Download the repo
2. Navigate to root directory of the repo
3. Install requirements with `pip install -r requirements.txt` in a Python 3.7 environment

## Running
From the root directory of the repo, you can launch the editor with
You can browse through existing prompts on the [hosted version of Promptsource](https://bigscience.huggingface.co/promptsource).

If you want to launch a local version (in particular to write prompts), launch the editor from the root directory of the repo with:
```
streamlit run promptsource/app.py
```
@@ -68,3 +75,14 @@ For more information, read the [Contribution guidelines](CONTRIBUTING.md).
**Warning or Error about Darwin on OS X:** Try downgrading PyArrow to 3.0.0.

**ConnectionRefusedError: [Errno 61] Connection refused:** Happens occasionally. Try restarting the app.

## Development structure

Promptsource was developed as part of the [BigScience project for open research 🌸](https://bigscience.huggingface.co/), a year-long initiative targeting the study of large models and datasets. The goal of the project is to research language models in a public environment outside large technology companies. The project has 600 researchers from 50 countries and more than 250 institutions.

## Citation

If you want to cite P3 or Promptsource, you can use this BibTeX entry:
```bibtex
TODO
```
286 changes: 286 additions & 0 deletions WIP.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,286 @@
# Which prompts are finalized?

A subset of the prompts in P3 are still *Work in Progress*. For reference, we list the datasets whose prompts have been finalized and the datasets whose prompts are likely to be modified in the near future. Most modifications will consist of metadata collection, but some will affect the templates themselves.

To facilitate traceability, Promptsource is currently pinned at version `0.1.0`.

# Finalized datasets

|Dataset|Subset (optional)|
|-|-|
|adversarial_qa|dbert|
|adversarial_qa|dbidaf|
|adversarial_qa|droberta|
|adversarial_qa|adversarialQA|
|ag_news||
|ai2_arc|ARC-Challenge|
|ai2_arc|ARC-Easy|
|amazon_polarity||
|anli||
|app_reviews||
|circa||
|cnn_dailymail|3.0.0|
|common_gen||
|coqa||
|cos_e|v1.11|
|cos_e|v1.0|
|cosmos_qa||
|crows_pairs||
|craffel/openai_lambada||
|dbpedia_14||
|dream||
|drop||
|duorc|ParaphraseRC|
|duorc|SelfRC|
|emo||
|gigaword||
|glue|cola|
|glue|mrpc|
|glue|qqp|
|glue|sst2|
|glue|stsb|
|hans||
|hellaswag||
|imdb||
|jeopardy||
|jigsaw_toxicity_pred||
|kilt_tasks|nq|
|lambada||
|mc_taco||
|multi_news||
|nq_open||
|openbookqa|main|
|openbookqa|additional|
|paws|labeled_final|
|paws|labeled_swap|
|paws|unlabeled_final|
|paws-x|en|
|piqa||
|qa_srl||
|qasc||
|quac||
|quail||
|quarel||
|quartz||
|quoref||
|race|high|
|race|middle|
|race|all|
|ropes||
|rotten_tomatoes||
|samsum||
|sciq||
|scitail|snli_format|
|scitail|tsv_format|
|social_i_qa||
|squad_v2||
|super_glue|wsc.fixed|
|super_glue|boolq|
|super_glue|cb|
|super_glue|copa|
|super_glue|multirc|
|super_glue|record|
|super_glue|rte|
|super_glue|wic|
|swag|regular|
|trec||
|trivia_qa|rc|
|tydiqa||
|web_questions||
|wiki_bio||
|wiki_hop|original|
|wiki_qa||
|winobias|*|
|winogender||
|winogrande|winogrande_debiased|
|winogrande|winogrande_l|
|winogrande|winogrande_m|
|winogrande|winogrande_s|
|winogrande|winogrande_xl|
|winogrande|winogrande_xs|
|wiqa||
|xsum||
|yelp_review_full||
|Zaid/coqa_expanded||
|Zaid/quac_expanded||

# Work in Progress datasets

|Dataset|Subset (optional)|
|-|-|
|acronym_identification||
|ade_corpus_v2|Ade_corpus_v2_classification|
|ade_corpus_v2|Ade_corpus_v2_drug_ade_relation|
|ade_corpus_v2|Ade_corpus_v2_drug_dosage_relation|
|aeslc||
|amazon_reviews_multi|en|
|amazon_us_reviews|Wireless_v1_00|
|ambig_qa|light|
|aqua_rat|raw|
|art||
|asnq||
|asset|ratings|
|asset|simplification|
|banking77||
|billsum||
|bing_coronavirus_query_set||
|blended_skill_talk||
|boolq||
|cbt|CN|
|cbt|NE|
|cbt|P|
|cbt|raw|
|cbt|V|
|cc_news||
|climate_fever||
|codah|codah|
|codah|fold_0|
|codah|fold_1|
|codah|fold_2|
|codah|fold_3|
|codah|fold_4|
|commonsense_qa||
|conv_ai||
|conv_ai_2||
|conv_ai_3||
|coqa||
|cord19|metadata|
|covid_qa_castorini||
|craigslist_bargains||
|discofuse|discofuse-sport|
|discofuse|discofuse-wikipedia|
|discovery|discovery|
|docred||
|e2e_nlg_cleaned||
|ecthr_cases|alleged-violation-prediction|
|emotion||
|esnli||
|evidence_infer_treatment|1.1|
|evidence_infer_treatment|2|
|fever|v1.0|
|fever|v2.0|
|financial_phrasebank|sentences_allagree|
|freebase_qa||
|generated_reviews_enth||
|glue|ax|
|glue|mnli|
|glue|mnli_matched|
|glue|mnli_mismatched|
|glue|qnli|
|glue|rte|
|glue|wnli|
|google_wellformed_query||
|guardian_authorship|cross_genre_1|
|guardian_authorship|cross_topic_1|
|guardian_authorship|cross_topic_4|
|guardian_authorship|cross_topic_7|
|gutenberg_time||
|head_qa|en|
|health_fact||
|hlgd||
|hotpot_qa|distractor|
|hotpot_qa|fullwiki|
|humicroedit|subtask-1|
|humicroedit|subtask-2|
|hyperpartisan_news_detection|byarticle|
|hyperpartisan_news_detection|bypublisher|
|jfleg||
|kelm||
|liar||
|limit||
|math_dataset|algebra__linear_1d|
|math_dataset|algebra__linear_1d_composed|
|math_dataset|algebra__linear_2d|
|math_dataset|algebra__linear_2d_composed|
|math_qa||
|mdd|task1_qa|
|mdd|task2_recs|
|mdd|task3_qarecs|
|medical_questions_pairs||
|meta_woz|dialogues|
|mocha||
|movie_rationales||
|multi_nli||
|multi_nli_mismatch||
|multi_x_science_sum||
|mwsc||
|narrativeqa||
|ncbi_disease||
|neural_code_search|evaluation_dataset|
|newspop||
|nlu_evaluation_data||
|numer_sense||
|onestop_english||
|poem_sentiment||
|pubmed_qa|pqa_labeled|
|qa_zre||
|qed||
|quora||
|samsum||
|scan|addprim_jump|
|scan|addprim_turn_left|
|scan|filler_num0|
|scan|filler_num1|
|scan|filler_num2|
|scan|filler_num3|
|scan|length|
|scan|simple|
|scan|template_around_right|
|scan|template_jump_around_right|
|scan|template_opposite_right|
|scan|template_right|
|scicite||
|scientific_papers|arxiv|
|scientific_papers|pubmed|
|scitldr|Abstract|
|selqa|answer_selection_analysis|
|sem_eval_2010_task_8||
|sem_eval_2014_task_1||
|sent_comp||
|sick||
|sms_spam||
|snips_built_in_intents||
|snli||
|species_800||
|spider||
|squad||
|squad_adversarial|AddSent|
|squadshifts|amazon|
|squadshifts|new_wiki|
|squadshifts|nyt|
|sst|default|
|stsb_multi_mt|en|
|subjqa|books|
|subjqa|electronics|
|subjqa|grocery|
|subjqa|movies|
|subjqa|restaurants|
|subjqa|tripadvisor|
|tab_fact|tab_fact|
|tmu_gfm_dataset||
|turk||
|tweet_eval|emoji|
|tweet_eval|emotion|
|tweet_eval|hate|
|tweet_eval|irony|
|tweet_eval|offensive|
|tweet_eval|sentiment|
|tweet_eval|stance_abortion|
|tweet_eval|stance_atheism|
|tweet_eval|stance_climate|
|tweet_eval|stance_feminist|
|tweet_eval|stance_hillary|
|tydiqa|primary_task|
|tydiqa|secondary_task|
|wiki_hop|masked|
|wiki_qa||
|wiki_split||
|winograd_wsc|wsc273|
|winograd_wsc|wsc285|
|xnli|en|
|xquad|xquad.en|
|xquad_r|en|
|yahoo_answers_qa||
|yahoo_answers_topics||
|yelp_polarity||
|zest||
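
To check programmatically whether a dataset's prompts are finalized, the two tables above can be parsed into sets of `(dataset, subset)` pairs. The following is a minimal sketch under stated assumptions: the `parse_table` and `status` helpers are hypothetical (not part of Promptsource), and the two excerpts are short copies of rows from the tables rather than the full lists.

```python
def parse_table(markdown: str) -> set:
    """Return the (dataset, subset) pairs found in a two-column Markdown table."""
    pairs = set()
    for line in markdown.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue
        # Drop the outer pipes, then split into cells; an empty subset stays "".
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if len(cells) != 2 or cells[0] in ("Dataset", "-"):
            continue  # skip the header row and the separator row
        pairs.add((cells[0], cells[1]))
    return pairs


# Short excerpts copied from the two tables; the real lists are much longer.
FINALIZED_EXCERPT = """\
|Dataset|Subset (optional)|
|-|-|
|ag_news||
|ai2_arc|ARC-Challenge|
|glue|cola|
"""
WIP_EXCERPT = """\
|Dataset|Subset (optional)|
|-|-|
|glue|mnli|
|squad||
"""

finalized = parse_table(FINALIZED_EXCERPT)
wip = parse_table(WIP_EXCERPT)


def status(dataset: str, subset: str = "") -> str:
    """Report which table, if any, lists the given dataset and subset."""
    key = (dataset, subset)
    if key in finalized:
        return "finalized"
    if key in wip:
        return "work in progress"
    return "unknown"


print(status("glue", "cola"))
print(status("glue", "mnli"))
```

A few pairs, such as `coqa`, appear in both tables; this sketch reports the first match, which is an arbitrary choice.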
Binary file modified assets/promptsource_app.png