What is the difference between tasksource and FLAN? #7

Open
imoneoi opened this issue Nov 28, 2023 · 2 comments

imoneoi commented Nov 28, 2023

Thanks for the great work! I'm also using FLAN for training, so I'm wondering how to include only tasks that are in Tasksource but not in FLAN.

Owner

sileod commented Nov 28, 2023

Thanks!
Tasksource is designed for transparency, composability, and quick addition of new tasks. Tasksource tasks can be recast programmatically into instructions or used directly for classification.
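
As a rough illustration of what recasting a classification task into an instruction can look like, here is a minimal sketch for a single NLI example. The function name `recast_nli_to_instruction` and the prompt wording are hypothetical, chosen only for this example; they are not the templates tasksource actually uses.

```python
def recast_nli_to_instruction(premise, hypothesis, label, label_names):
    """Turn one NLI classification example into a (prompt, answer) pair.

    The template wording is a hypothetical illustration, not the
    prompt format used by tasksource itself.
    """
    options = ", ".join(label_names)
    prompt = (
        f"Premise: {premise}\n"
        f"Hypothesis: {hypothesis}\n"
        f"What is the relation between them? Options: {options}\n"
        "Answer:"
    )
    # The integer class label becomes the textual target.
    return prompt, label_names[label]


prompt, answer = recast_nli_to_instruction(
    premise="A dog is running in the park.",
    hypothesis="An animal is outside.",
    label=0,
    label_names=["entailment", "neutral", "contradiction"],
)
print(prompt)
print(answer)
```

The same example can still be used for plain classification by keeping the integer label and skipping the template.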

To my knowledge, Tasksource also provides the only available symbol-tuning dataset, which greatly improves few-shot learning:
https://huggingface.co/datasets/tasksource/icl-symbol-tuning-instruct
This is in addition to tasksource-instruct-v0.
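
For readers unfamiliar with symbol tuning, the general idea is to replace natural-language labels in the in-context examples with arbitrary symbols, so the model has to infer the task from the input-label mapping. A minimal sketch of building such a prompt follows. The helper `symbolize_few_shot`, the `Input:`/`Output:` layout, and the symbol pool are all assumptions made for illustration, not the construction used for the dataset above.

```python
import random


def symbolize_few_shot(examples, query, symbols, seed=0):
    """Build a few-shot prompt whose labels are replaced by arbitrary symbols.

    examples: list of (text, label) pairs used as in-context demonstrations.
    query:    the text to classify, appended without an answer.
    symbols:  pool of arbitrary strings, at least one per distinct label.
    Returns (prompt, mapping) where mapping is label -> symbol.
    """
    rng = random.Random(seed)
    labels = sorted({label for _, label in examples})
    if len(symbols) < len(labels):
        raise ValueError("need at least one symbol per distinct label")
    # Draw distinct symbols so two labels never share a symbol.
    chosen = rng.sample(symbols, len(labels))
    mapping = dict(zip(labels, chosen))
    blocks = [f"Input: {text}\nOutput: {mapping[label]}" for text, label in examples]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks), mapping


demo_examples = [
    ("I loved this film.", "positive"),
    ("The plot was dull.", "negative"),
]
demo_prompt, demo_mapping = symbolize_few_shot(
    demo_examples,
    query="A wonderful surprise.",
    symbols=["foo", "bar", "7421", "zxq"],
)
print(demo_prompt)
```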

Tasksource is more up to date, and I tried to be exhaustive (with a focus on reasoning, logic, and NLI), but it has less diversity in prompt formulations.
Flan also includes some tasks that are not very reasoning-intensive, such as formulating a hypothesis given a premise. These are quite interesting but should be subsampled.

Are you talking about Flan alone, or Flan with SNI?

Also see https://www.dataprovenance.org/ for many instruction datasets. We are planning to work with prominent model builders, and I would be glad to chat with you, e.g. on Discord.

Task IDs not in FlanV2/Bigbench/MMLU/truthfulQA/chatbot_arena_conversations:

['WANLI',
'recast/recast_verbnet',
'recast/recast_verbcorner',
'recast/recast_ner',
'recast/recast_sentiment',
'recast/recast_puns',
'recast/recast_factuality',
'recast/recast_megaveridicality',
'probability_words_nli/reasoning_1hop',
'probability_words_nli/usnli',
'probability_words_nli/reasoning_2hop',
'nan-nli/joey234--nan-nli',
'nli_fever',
'breaking_nli',
'conj_nli',
'fracas',
'dialogue_nli',
'mpe',
'dnc',
'recast_white/fnplus',
'recast_white/sprl',
'recast_white/dpr',
'robust_nli/IS_CS',
'robust_nli/LI_LI',
'robust_nli/ST_WO',
'robust_nli/PI_SP',
'robust_nli/PI_CD',
'robust_nli/ST_SE',
'robust_nli/ST_NE',
'robust_nli/ST_LM',
'robust_nli_is_sd',
'robust_nli_li_ts',
'gen_debiased_nli/snli_seq_z',
'gen_debiased_nli/snli_z_aug',
'gen_debiased_nli/snli_par_z',
'gen_debiased_nli/mnli_par_z',
'gen_debiased_nli/mnli_z_aug',
'gen_debiased_nli/mnli_seq_z',
'add_one_rte',
'hlgd',
'conll2003/pos_tags',
'conll2003/chunk_tags',
'conll2003/ner_tags',
'hh-rlhf',
'model-written-evals',
'fig-qa',
'social_i_qa',
'balanced-copa',
'e-CARE',
'insincere-questions',
'TuringBench',
'vitaminc/tals--vitaminc',
'rumoureval_2019/RumourEval2019',
'tweet_eval/irony',
'tweet_eval/stance_abortion',
'tweet_eval/hate',
'tweet_eval/stance_atheism',
'tweet_eval/stance_climate',
'tweet_eval/emoji',
'tweet_eval/offensive',
'tweet_eval/sentiment',
'tweet_eval/emotion',
'tweet_eval/stance_feminist',
'tweet_eval/stance_hillary',
'discovery/discovery',
'pragmeval/verifiability',
'pragmeval/mrda',
'pragmeval/switchboard',
'pragmeval/emergent',
'pragmeval/gum',
'pragmeval/sarcasm',
'pragmeval/stac',
'pragmeval/pdtb',
'silicone/dyda_e',
'silicone/oasis',
'silicone/meld_s',
'silicone/meld_e',
'silicone/maptask',
'silicone/dyda_da',
'silicone/sem',
'silicone/iemocap',
'lex_glue/scotus',
'lex_glue/ledgar',
'language-identification',
'rotten_tomatoes',
'hate_speech18',
'sms_spam',
'snips_built_in_intents',
'hate_speech_offensive',
'hyperpartisan_news',
'sciie',
'citation_intent',
'scicite',
'lexical_relation_classification/ROOT09',
'lexical_relation_classification/CogALexV',
'lexical_relation_classification/K&H+N',
'lexical_relation_classification/BLESS',
'lexical_relation_classification/EVALution',
'crowdflower/political-media-bias',
'crowdflower/tweet_global_warming',
'crowdflower/text_emotion',
'crowdflower/political-media-message',
'crowdflower/political-media-audience',
'crowdflower/economic-news',
'crowdflower/corporate-messaging',
'crowdflower/airline-sentiment',
'crowdflower/sentiment_nuclear_power',
'ethics/commonsense',
'ethics/deontology',
'ethics/justice',
'ethics/virtue',
'tweets_hate_speech_detection',
'wnut_17/wnut_17',
'ncbi_disease/ncbi_disease',
'acronym_identification',
'jnlpba/jnlpba',
'ontonotes_english/SpeedOfMagic--ontonotes_english',
'blog_authorship_corpus/gender',
'blog_authorship_corpus/horoscope',
'blog_authorship_corpus/job',
'open_question_type',
'mc_taco',
'discosense',
'EffectiveFeedbackStudentWriting',
'phrase_similarity',
'scientific-exaggeration-detection',
'fever-evidence-related/mwong--fever-related',
'dynasent/dynabench.dynasent.r1.all/r1',
'dynasent/dynabench.dynasent.r2.all/r2',
'sem_eval_2010_task_8',
'medmcqa',
'logiqa',
'cycic_classification',
'cycic_multiplechoice',
'commonsense_qa_2.0',
'lingnli',
'monotonicity-entailment',
'arct',
'scinli',
'naturallogic',
'onestop_qa',
'moral_stories/full',
'prost',
'dynahate',
'syntactic-augmentation-nli',
'autotnli',
'CONDAQA',
'webgpt_comparisons',
'synthetic-instruct-gptj-pairwise',
'scruples',
'wouldyourather',
'attempto-nli',
'defeasible-nli/snli',
'defeasible-nli/atomic',
'help-nli',
'nli-veridicality-transitivity',
'natural-language-satisfiability',
'lonli',
'dadc-limit-nli',
'FLUTE',
'summarize_from_feedback/comparisons',
'folio',
'tomi-nli',
'avicenna',
'SHP',
'MedQA-USMLE-4-options-hf',
'wikimedqa/medwiki',
'cicero',
'mutual',
'NeQA',
'quote-repetition',
'redefine-math',
'puzzte',
'implicatures',
'race-c',
'spartqa-yn',
'spartqa-mchoice',
'temporal-nli',
'riddle_sense',
'clcd-english',
'twentyquestions',
'reclor',
'counterfactually-augmented-imdb',
'counterfactually-augmented-snli',
'cnli',
'boolq-natural-perturbations',
'equate',
'ScienceQA_text_only',
'ekar_english',
'implicit-hate-stg1',
'logiqa-2.0-nli',
'PARARULE-Plus',
'mindgames',
'universal_dependencies/en_partut/deprel',
'universal_dependencies/en_lines/deprel',
'universal_dependencies/en_gum/deprel',
'universal_dependencies/en_ewt/deprel',
'ambient',
'path-naturalness-prediction',
'cloth',
'dgen',
'oasst1_pairwise_rlhf_reward',
'I2D2',
'args_me',
'Touche23-ValueEval',
'starcon',
'banking77',
'ruletaker',
'lsat_qa/all',
'ConTRoL-nli',
'tracie',
'sherliic',
'sen-making/1',
'sen-making/2',
'mbib-base/cognitive-bias',
'mbib-base/fake-news',
'mbib-base/gender-bias',
'mbib-base/hate-speech',
'mbib-base/linguistic-bias',
'mbib-base/political-bias',
'mbib-base/racial-bias',
'mbib-base/text-level-bias',
'robustLR',
'v1/gen_train234_test2to10',
'logical-fallacy',
'parade',
'cladder',
'subjectivity',
'MOH',
'VUAC',
'TroFi',
'sharc_modified/mod',
'conceptrules_v2',
'disrpt/eng.dep.scidtb',
'conll2000',
'few-nerd/supervised',
'zero-shot-label-nli',
'com2sense',
'scone',
'winodict',
'fool-me-twice',
'monli',
'corr2cause',
'apt',
'twitter-financial-news-sentiment',
'icl-symbol-tuning-instruct',
'SpaceNLI',
'propsegment/nli',
'HatemojiBuild',
'regset',
'esci',
'dnd_style_intents']
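
One way to use a list like the one above is as an allow-list when filtering a pooled training set. The sketch below assumes each record is a dict carrying its task ID under a `task` key; that key name, the helper `keep_non_flan`, and the toy records are assumptions for illustration, so adapt them to however your data stores the task ID. Only a few IDs from the list are copied here to keep the example short.

```python
# A small subset of the task IDs listed above; paste the full list in practice.
NOT_IN_FLAN = frozenset({"WANLI", "fracas", "logiqa", "folio"})


def keep_non_flan(records, allowed=NOT_IN_FLAN, task_key="task"):
    """Keep only the records whose task ID appears in the allow-list.

    records:  iterable of dicts, each with its task ID under task_key
              (the key name is an assumption; adjust to your data).
    allowed:  collection of task IDs to keep.
    """
    return [r for r in records if r[task_key] in allowed]


# Toy records; "glue/mnli" stands in for a task that is not on the allow-list.
toy_records = [
    {"task": "WANLI", "inputs": "...", "targets": "..."},
    {"task": "glue/mnli", "inputs": "...", "targets": "..."},
    {"task": "folio", "inputs": "...", "targets": "..."},
]
filtered = keep_non_flan(toy_records)
print([r["task"] for r in filtered])
```

Matching here is by exact string, so task IDs must be spelled the same way in the data as in the list.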

Author

imoneoi commented Nov 29, 2023

@sileod Thanks for the detailed response!

I'm using the FLAN 2022 dataset (https://huggingface.co/datasets/Open-Orca/FLAN). What is FLAN with SNI? Also, are the tasks listed above absent from all of FLAN 2022, Bigbench, and MMLU?

Besides, I'm also interested in symbol tuning. My Discord is imonenext, feel free to DM.
