# Dataset Parameters

----------------

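Here we define the parameters for each dataset. Each dataset is described by a dictionary with four keys: <br>
(1) `set_name`: the name of the dataset <br>
(2) `config`: the sub-dataset, if applicable (e.g. `rte` for `glue`), otherwise `None` <br>
(3) `train_or_test`: whether the training set or the test set should be used (left as an empty string for datasets that are not on Hugging Face) <br>
(4) `on_hugging_face`: whether the dataset is on Hugging Face <br>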
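
To make the role of the four fields concrete, here is a minimal sketch of how one entry might be turned into arguments for `datasets.load_dataset`. The helper name `build_load_args` is hypothetical and not part of this notebook, and mapping `train_or_test` onto the `split` argument is an assumption about how the parameters are consumed downstream. The sketch only builds the arguments, so it runs without network access.

```python
from typing import Any, Dict, Optional, Tuple


def build_load_args(
    params: Dict[str, Any],
) -> Optional[Tuple[Tuple[str, ...], Dict[str, str]]]:
    """Hypothetical helper: return (args, kwargs) for datasets.load_dataset.

    Returns None when the dataset is not on Hugging Face, since such a
    dataset has to be loaded some other way.
    """
    if not params["on_hugging_face"]:
        return None
    # The dataset name always comes first; the sub-dataset is optional.
    args: Tuple[str, ...] = (params["set_name"],)
    if params["config"] is not None:
        args += (params["config"],)
    # Assumption: "train_or_test" corresponds to the `split` argument.
    return args, {"split": params["train_or_test"]}


example_params = {
    "set_name": "glue",
    "config": "rte",
    "train_or_test": "train",
    "on_hugging_face": True,
}
load_args = build_load_args(example_params)
print(load_args)
```

With the `datasets` library installed, the result could then be unpacked as `load_dataset(*load_args[0], **load_args[1])`.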

----------------

## Table of Contents
- <a href='#0)-Prereqs'>0) Prereqs</a>
- <a href='#1)-Define-Dataset-Parameters'>1) Define Dataset Parameters</a>
  * <a href='#1.1)-SST-2'>1.1) SST-2</a>
  * <a href='#1.2)-AGNews'>1.2) AGNews</a>
  * <a href='#1.3)-TREC'>1.3) TREC</a>
  * <a href='#1.4)-DBPedia'>1.4) DBPedia</a>
  * <a href='#1.5)-RTE'>1.5) RTE</a>
  * <a href='#1.6)-MRPC'>1.6) MRPC</a>
  * <a href='#1.7)-TweetEval-Hate'>1.7) TweetEval-Hate</a>
  * <a href='#1.8)-SICK'>1.8) SICK</a>
  * <a href='#1.9)-Poem-Sentiment'>1.9) Poem-Sentiment</a>
  * <a href='#1.10)-Ethos'>1.10) Ethos</a>
  * <a href='#1.11)-Financial-Phrasebank'>1.11) Financial-Phrasebank</a>
  * <a href='#1.12)-MedQ-Pairs'>1.12) MedQ-Pairs</a>
  * <a href='#1.13)-TweetEval-Feminist'>1.13) TweetEval-Feminist</a>
  * <a href='#1.14)-TweetEval-Atheism'>1.14) TweetEval-Atheism</a>
  * <a href='#1.15)-Unnatural'>1.15) Unnatural</a>
  * <a href='#1.16)-SST-2-A/B'>1.16) SST-2-A/B</a>
- <a href='#2)-Save-Dataset-Parameters'>2) Save Dataset Parameters</a>

## 0) Prereqs

In [None]:
import json

## 1) Define Dataset Parameters

### 1.1) SST-2

In [None]:
sst2_dataset_params = {
    "set_name": "sst2",
    "config": None,
    "train_or_test": "train",
    "on_hugging_face": True,
}

### 1.2) AGNews

In [None]:
agnews_dataset_params = {
    "set_name": "ag_news",
    "config": None,
    "train_or_test": "train",
    "on_hugging_face": True,
}

### 1.3) TREC

In [None]:
trec_dataset_params = {
    "set_name": "trec",
    "config": None,
    "train_or_test": "train",
    "on_hugging_face": True,
}

### 1.4) DBPedia

In [None]:
dbpedia_dataset_params = {
    "set_name": "dbpedia_14",
    "config": None,
    "train_or_test": "train",
    "on_hugging_face": True,
}

### 1.5) RTE

In [None]:
rte_dataset_params = {
    "set_name": "glue",
    "config": "rte",
    "train_or_test": "train",
    "on_hugging_face": True,
}

### 1.6) MRPC

In [None]:
mrpc_dataset_params = {
    "set_name": "glue",
    "config": "mrpc",
    "train_or_test": "train",
    "on_hugging_face": True,
}

### 1.7) TweetEval-Hate

In [None]:
tweet_eval_hate_dataset_params = {
    "set_name": "tweet_eval",
    "config": "hate",
    "train_or_test": "train",
    "on_hugging_face": True,
}

### 1.8) SICK

In [None]:
sick_dataset_params = {
    "set_name": "sick",
    "config": None,
    "train_or_test": "train",
    "on_hugging_face": True,
}

### 1.9) Poem-Sentiment

In [None]:
poem_sentiment_dataset_params = {
    "set_name": "poem_sentiment",
    "config": None,
    "train_or_test": "train",
    "on_hugging_face": True,
}

### 1.10) Ethos

In [None]:
ethos_dataset_params = {
    "set_name": "ethos",
    "config": "binary",
    "train_or_test": "train",
    "on_hugging_face": True,
}

### 1.11) Financial-Phrasebank

In [None]:
financial_phrasebank_dataset_params = {
    "set_name": "financial_phrasebank",
    "config": None,
    "train_or_test": "",
    "on_hugging_face": False,
}

### 1.12) MedQ-Pairs

In [None]:
medical_questions_pairs_dataset_params = {
    "set_name": "medical_questions_pairs",
    "config": None,
    "train_or_test": "train",
    "on_hugging_face": True,
}

### 1.13) TweetEval-Feminist

In [None]:
tweet_eval_stance_feminist_dataset_params = {
    "set_name": "tweet_eval",
    "config": "stance_feminist",
    "train_or_test": "train",
    "on_hugging_face": True,
}

### 1.14) TweetEval-Atheism

In [None]:
tweet_eval_stance_atheism_dataset_params = {
    "set_name": "tweet_eval",
    "config": "stance_atheism",
    "train_or_test": "train",
    "on_hugging_face": True,
}

### 1.15) Unnatural

In [None]:
unnatural_dataset_params = {
    "set_name": "unnatural",
    "config": None,
    "train_or_test": "",
    "on_hugging_face": False,
}

### 1.16) SST-2-A/B

In [None]:
sst2_ab_dataset_params = {
    "set_name": "sst2",
    "config": None,
    "train_or_test": "train",
    "on_hugging_face": True,
}

## 2) Save Dataset Parameters

In [None]:
dataset_params = {
    "sst2": sst2_dataset_params,
    "agnews": agnews_dataset_params,
    "trec": trec_dataset_params,
    "dbpedia": dbpedia_dataset_params,
    "rte": rte_dataset_params,
    "mrpc": mrpc_dataset_params,
    "tweet_eval_hate": tweet_eval_hate_dataset_params,
    "sick": sick_dataset_params,
    "poem_sentiment": poem_sentiment_dataset_params,
    "ethos": ethos_dataset_params,
    "financial_phrasebank": financial_phrasebank_dataset_params,
    "medical_questions_pairs": medical_questions_pairs_dataset_params,
    "tweet_eval_stance_feminist": tweet_eval_stance_feminist_dataset_params,
    "tweet_eval_stance_atheism": tweet_eval_stance_atheism_dataset_params,
    "unnatural": unnatural_dataset_params,
    "sst2_ab": sst2_ab_dataset_params,
}
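
Before saving, it may be worth checking that every entry follows the same schema. The sketch below is one possible check, not part of the original notebook: the names `REQUIRED_KEYS` and `find_malformed` are hypothetical, and the rule that Hugging Face datasets must use `"train"` or `"test"` is an assumption based on the field description above. It runs on a small sample dictionary; in the notebook it could be called on `dataset_params` instead.

```python
from typing import Any, Dict

# The four keys every dataset entry in this notebook defines.
REQUIRED_KEYS = {"set_name", "config", "train_or_test", "on_hugging_face"}


def find_malformed(all_params: Dict[str, Dict[str, Any]]) -> Dict[str, str]:
    """Hypothetical check: map each malformed entry to a short description."""
    problems = {}
    for key, params in all_params.items():
        missing = REQUIRED_KEYS - params.keys()
        if missing:
            problems[key] = f"missing keys: {sorted(missing)}"
        elif params["on_hugging_face"] and params["train_or_test"] not in ("train", "test"):
            # Assumption: a Hugging Face dataset must name one of the two splits.
            problems[key] = "train_or_test must be 'train' or 'test'"
    return problems


sample_params = {
    "rte": {"set_name": "glue", "config": "rte",
            "train_or_test": "train", "on_hugging_face": True},
    "unnatural": {"set_name": "unnatural", "config": None,
                  "train_or_test": "", "on_hugging_face": False},
    "broken": {"set_name": "sst2", "config": None},
}
problems = find_malformed(sample_params)
print(problems)
```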

In [None]:
import os
os.makedirs('data', exist_ok=True)  # make sure the output directory exists
with open('data/dataset_params.json', 'w') as fp:
    json.dump(dataset_params, fp)
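
As a sanity check on the saved file, the following sketch writes a small sample dictionary to JSON and reads it back. It uses a temporary directory rather than `data/` so it can run anywhere; the sample contents mirror two entries defined above. JSON stores `None` as `null` and `True` as `true`, and `json.load` converts them back, so the reloaded dictionary should compare equal to the original.

```python
import json
import os
import tempfile

sample_params = {
    "sst2": {"set_name": "sst2", "config": None,
             "train_or_test": "train", "on_hugging_face": True},
    "rte": {"set_name": "glue", "config": "rte",
            "train_or_test": "train", "on_hugging_face": True},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dataset_params.json")
    # Write the parameters the same way the notebook does.
    with open(path, "w") as fp:
        json.dump(sample_params, fp)
    # Read them back, as a downstream script might.
    with open(path) as fp:
        reloaded = json.load(fp)

round_trip_ok = reloaded == sample_params
print(round_trip_ok)
```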