This repository gives you access to trained LLMs that can directly be used for any zero-shot NLP task.
Please install conda (Miniconda recommended) as the environment manager.
Once conda is installed, you can create and activate the environment with the following commands:
conda env create -f zero-nlp.yml
conda activate zero-nlp
Sentence classification is the most basic NLP task. It consists of classifying one sentence (i.e., a text) into a target class. It is necessary for most applications that require NLU components.
The most widely used models for this task are encoder-only architectures based on BERT, most often RoBERTa. However, RoBERTa needs to be fine-tuned on every specific NLU task, which makes it impractical, or even impossible to use, in domains where data is not widely available.
We propose a unified model, based on Google's T5 architecture, that achieves extremely good results on any classification task without having been trained on it. This model can be used directly for:
- Topic classification
- Intent recognition
- Boolean question-answering
- Sentiment analysis
- and more...
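One common way for a single model to cover tasks like these is to recast each of them as natural language inference (NLI): the input text becomes the premise, and each candidate label is turned into a hypothesis. The snippet below is a minimal sketch of that idea only. The `to_nli_pairs` helper and the template string are hypothetical and are not taken from this repository, which may format its inputs differently.

```python
# Hypothetical template: this string is an illustrative assumption,
# not the prompt format used by this repository.
def to_nli_pairs(text, candidate_labels, template="This text is about {}."):
    """Build one (premise, hypothesis) pair per candidate label."""
    return [(text, template.format(label)) for label in candidate_labels]


pairs = to_nli_pairs(
    "The central bank raised interest rates again.",
    ["finance", "sports"],
)
for premise, hypothesis in pairs:
    print(premise, "->", hypothesis)
```

Each pair would then be scored by an NLI model, and the label whose hypothesis is most strongly entailed would be chosen.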
The base model is hosted on the Hugging Face Hub here. Please download it from there.
The additional trained weights are in the model folder.
For this example, we will use a ~5B-parameter model trained on an NLI task and load it with the peft library.
model_path = 'model/3way-nli-mixture'
from peft import PeftModel, PeftConfig
from transformers import AutoTokenizer, AutoConfig
from model.modeling import T5ForClassification
peft_config = PeftConfig.from_pretrained(model_path)
base_config = AutoConfig.from_pretrained(model_path)
model = T5ForClassification.from_pretrained(
    pretrained_model_name_or_path=peft_config.base_model_name_or_path,
    **base_config.to_diff_dict(),
    load_in_8bit=True,
    device_map={'': 0},
)
model = PeftModel.from_pretrained(model, model_path, device_map={'': 0})
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_path, model_max_length=512)
We will instantiate a Zero-Shot Boolean Question-Answering dataset. If you wish to use other tasks, please refer to the different tasks available for the data format. You can also create your own zero-shot task by following the template.
task_name = 'boolqa'
data = dict(question='Do you like being a child?', answer='I hate being a child', candidate_labels=['yes', 'no'])
from zero import ZeroDataset
dataset = ZeroDataset(task_name)
dataset.from_dict(data)
The model true_id is 0. true_id corresponds to the output neuron we are interested in (other neurons will be ignored). For this model, it corresponds to the entailment class in NLI.
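To make the role of true_id concrete, here is a minimal sketch of how per-candidate scores could be derived from a 3-way NLI head. It assumes that only the true_id logit of each candidate is kept and that the kept logits are normalized with a softmax across candidates. That normalization is an assumption, suggested only by the fact that the example probabilities in this README sum to 1, and it is not confirmed to be what ZeroClassifier does internally. The `score_candidates` helper and the logit values are made up for illustration.

```python
import math


def score_candidates(logits_per_candidate, true_id=0):
    """Keep only the true_id logit of each candidate, then softmax across candidates."""
    selected = [logits[true_id] for logits in logits_per_candidate]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(selected)
    exps = [math.exp(x - peak) for x in selected]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up 3-way NLI logits (index 0 = entailment) for the candidates 'yes' and 'no'.
logits = [[-1.2, 0.3, 2.0], [1.3, 0.1, -0.9]]
probs = score_candidates(logits, true_id=0)
# The prediction is the candidate with the highest normalized entailment score.
pred = max(range(len(probs)), key=probs.__getitem__)
print(pred, probs)
```

In this sketch the other two logits of each candidate never influence the result, which matches the statement that the remaining neurons are ignored.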
from zero import ZeroClassifier
classifier = ZeroClassifier(model, tokenizer, true_id=0)
res = classifier.classify(dataset, batch_size=1, threshold=0.8)
pred_text = data['candidate_labels'][res['preds'][0]]
print('Prediction: {} ({})'.format(pred_text, res['preds'][0]))
# Prediction: no (1)
print('Probabilities: {}'.format(res['probs'][0]))
# Probabilities: [0.08075528591871262, 0.9192447066307068]
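When there are more than two candidate labels, it can be handy to pair each label with its probability. The helper below is a small sketch that relies only on the result layout visible in the example above (`res['preds']` and `res['probs']` indexed by example). `label_probabilities` is a hypothetical name and is not part of the zero package.

```python
def label_probabilities(res, candidate_labels, index=0):
    """Return (predicted_label, {label: probability}) for one example."""
    probs = res['probs'][index]
    pred_label = candidate_labels[res['preds'][index]]
    return pred_label, dict(zip(candidate_labels, probs))


# Values copied from the example output above.
res = {'preds': [1], 'probs': [[0.08075528591871262, 0.9192447066307068]]}
pred_label, by_label = label_probabilities(res, ['yes', 'no'])
print(pred_label, by_label)
```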