In [None]:
%pip install --upgrade pip
%pip install openicl
# Restart the kernel after the installation is completed

# 2. Using Different Language Models with OpenICL

In this chapter, we show how to use OpenICL for in-context learning (ICL) with different language models, mainly [GPT-2](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf), [FLAN-T5](https://arxiv.org/abs/2109.01652), [XGLM](https://arxiv.org/abs/2112.10668), OpenAI's [GPT-3](https://arxiv.org/abs/2005.14165) API, and the [OPT-175B](https://arxiv.org/abs/2205.01068) API.

## 2-1 Hugging Face Library Models

In this section, we take GPT-2, FLAN-T5, and XGLM as examples to show how to use models from the [Hugging Face library](https://huggingface.co/models) with OpenICL. In general, you only need to pass the corresponding model name to the `model_name` parameter when declaring an `Inferencer`, but we still provide some specific examples.

### 2-1-1 GPT-2

This example also appears in [tutorial1](https://github.com/Shark-NLP/OpenICL/blob/main/examples/tutorials/openicl_tutorial1_getting_started.ipynb). This time, however, we set `batch_size` for both `TopkRetriever` and `PPLInferencer` to speed things up. Note that the two `batch_size` values can differ (here `8` and `6`). This is because, at the start of retrieval and inference, each component receives its complete input (the full dataset for retrieval, and the retrieval results for the entire test set for inference) rather than receiving the data batch by batch.

In [None]:
from openicl import DatasetReader, PromptTemplate, TopkRetriever, PPLInferencer

# Define a DatasetReader, loading dataset from huggingface.
data = DatasetReader('gpt3mix/sst2', input_columns=['text'], output_column='label')

# SST-2 Template Example
tp_dict = {
     0: '</E>Positive Movie Review: </text>',
     1: '</E>Negative Movie Review: </text>' 
}
template = PromptTemplate(tp_dict, {'text' : '</text>'}, ice_token='</E>')

# TopK Retriever
retriever = TopkRetriever(data, ice_num=2, batch_size=8)

# Define an Inferencer
inferencer = PPLInferencer(model_name='gpt2', batch_size=6)

# Inference
predictions = inferencer.inference(retriever, ice_template=template, output_json_filename='gpt2_sst2')
print(predictions)
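To make the point about the two `batch_size` values more concrete, here is a minimal sketch in plain Python. It does not use OpenICL internals: `batched`, `run_stage`, `fake_retrieve`, and `fake_infer` are hypothetical helpers written only for this illustration. Each stage is handed the complete list of inputs and splits it into batches of its own size, so the sizes never have to match.

```python
from typing import Callable, List, Sequence


def batched(items: Sequence, batch_size: int) -> List[Sequence]:
    """Split a sequence into consecutive chunks of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def run_stage(items: Sequence, batch_size: int, fn: Callable[[Sequence], List]) -> List:
    """Apply fn batch by batch and flatten the outputs back into one list."""
    outputs: List = []
    for batch in batched(items, batch_size):
        outputs.extend(fn(batch))
    return outputs


# Hypothetical stand-ins for the retrieval and inference stages.
def fake_retrieve(batch: Sequence[str]) -> List[str]:
    return [f"demo-for-{x}" for x in batch]


def fake_infer(batch: Sequence[str]) -> List[int]:
    return [len(x) for x in batch]


test_set = [f"review-{i}" for i in range(20)]

# Stage 1 sees the whole test set and batches it by 8.
retrieved = run_stage(test_set, batch_size=8, fn=fake_retrieve)
# Stage 2 sees the whole retrieval result and batches it by 6.
toy_predictions = run_stage(retrieved, batch_size=6, fn=fake_infer)

print(len(retrieved), len(toy_predictions))
```

Because each stage flattens its outputs before handing them on, changing either batch size only changes how the work is chunked, not the final list of results.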

### 2-1-2 XGLM

XGLM is a good choice for machine translation. However, when using XGLM we **don't recommend** setting `batch_size` in `GenInferencer`: calling the `model.generate` method of the [Hugging Face transformers library](https://huggingface.co/docs/transformers/index) on several inputs at once requires padding, and in our tests padding affected XGLM's generations. The code for evaluating the ICL performance of XGLM (7.5B) on the WMT16 (de-en) dataset with the direct inference strategy is as follows:

In [None]:
from openicl import DatasetReader, PromptTemplate, BM25Retriever, GenInferencer
from datasets import load_dataset

# Loading dataset from huggingface 
dataset = load_dataset('wmt16', name='de-en')

# Data Preprocessing
dataset = dataset.map(lambda example: example['translation']).remove_columns('translation')

# Define a DatasetReader, selecting 5 pieces of data randomly.
data = DatasetReader(dataset, input_columns='de', output_column='en', ds_size=5)

# WMT16 de->en Template Example
template = PromptTemplate('</E></de> = </en>', {'en' : '</en>', 'de' : '</de>'}, ice_token='</E>')

# BM25 Retriever
retriever = BM25Retriever(data, ice_num=1, index_split='validation', test_split='test', batch_size=5)

# Define an Inferencer
inferencer = GenInferencer(model_name='facebook/xglm-7.5B')

# Inference
predictions = inferencer.inference(retriever, ice_template=template, output_json_filename='xglm_wmt')
print(predictions)
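The padding issue mentioned above can be pictured with a toy example. The sketch below is not the transformers implementation; it uses made-up token ids, a hypothetical `PAD_ID`, and assumes left padding (a common choice for decoder-only models). It only shows that prompts of different lengths must be padded to a common length before they can be stacked into one batch, while a single prompt needs no padding at all.

```python
from typing import List, Tuple

PAD_ID = 0  # hypothetical padding token id, chosen only for this sketch


def left_pad(batch: List[List[int]], pad_id: int = PAD_ID) -> Tuple[List[List[int]], List[List[int]]]:
    """Left-pad token id lists to a common length and build attention masks."""
    max_len = max(len(seq) for seq in batch)
    padded, masks = [], []
    for seq in batch:
        n_pad = max_len - len(seq)
        padded.append([pad_id] * n_pad + seq)
        masks.append([0] * n_pad + [1] * len(seq))
    return padded, masks


# Three toy prompts of different lengths (the ids are invented).
prompts = [[11, 12, 13, 14, 15], [21, 22], [31, 32, 33]]
input_ids, attention_mask = left_pad(prompts)
for ids, mask in zip(input_ids, attention_mask):
    print(ids, mask)

# With one prompt per call, nothing is padded.
single_ids, single_mask = left_pad([prompts[1]])
print(single_ids, single_mask)
```

Leaving `batch_size` unset in `GenInferencer` corresponds to the second case, which is why it sidesteps the padding effect we observed with XGLM.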

### 2-1-3 FLAN-T5

In this section, we will use FLAN-T5 with OpenICL to reproduce the results in the figure below:

<div align="center">
<img src="https://s1.ax1x.com/2023/03/10/ppuZnQP.png"  width=300px />
<p>(figure in <a href="https://arxiv.org/abs/2109.01652">Finetuned Language Models Are Zero-Shot Learners</a>)</p>
</div>

In [13]:
from openicl import DatasetReader, PromptTemplate, ZeroRetriever, GenInferencer

# Define a DatasetReader, loading dataset from huggingface and selecting 10 pieces of data randomly.
data = DatasetReader('snli', input_columns=['premise', 'hypothesis'], output_column='label', ds_size=10)

# SNLI Template
tp_str = '</E>Premise:</premise>\nHypothesis:</hypothesis>\nDoes the premise entail the hypothesis?\nOPTIONS:\n-yes -It is not possible to tell -no'
template = PromptTemplate(tp_str, column_token_map={'premise' : '</premise>', 'hypothesis' : '</hypothesis>'}, ice_token='</E>')

# ZeroShot Retriever (do nothing)
retriever = ZeroRetriever(data, index_split='train', test_split='test')

# Define an Inferencer
inferencer = GenInferencer(model_name='google/flan-t5-small', max_model_token_num=1000)

# Inference
predictions = inferencer.inference(retriever, ice_template=template, output_json_filename='flan-t5-small')
print(predictions)

Found cached dataset snli (/home/zhangyudejia/.cache/huggingface/datasets/snli/plain_text/1.0.0/1f60b67533b65ae0275561ff7828aad5ee4282d0e6f844fd148d05d3c6ea251b)
100%|██████████| 3/3 [00:00<00:00, 323.35it/s]
[2023-03-10 15:40:19,712] [openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████| 10/10 [00:00<00:00, 15.22it/s]

['yes', 'yes', 'yes', 'no', 'yes', 'no', 'it is not possible to tell', 'yes', 'yes', 'it is not possible to tell']
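To see roughly what prompt the model receives in this zero-shot setting, here is a minimal sketch of placeholder substitution in plain Python. It is an assumption about the general idea, not OpenICL's actual `PromptTemplate` code: `fill_template` is a hypothetical helper, and the premise/hypothesis pair is made up rather than taken from SNLI. With no in-context examples, we assume the `ice_token` is simply replaced by an empty string.

```python
from typing import Dict


def fill_template(template: str, column_token_map: Dict[str, str],
                  example: Dict[str, str], ice_token: str, ice: str = "") -> str:
    """Replace the in-context placeholder and each column placeholder with text."""
    prompt = template.replace(ice_token, ice)
    for column, token in column_token_map.items():
        prompt = prompt.replace(token, str(example[column]))
    return prompt


tp_str = ('</E>Premise:</premise>\nHypothesis:</hypothesis>\n'
          'Does the premise entail the hypothesis?\n'
          'OPTIONS:\n-yes -It is not possible to tell -no')
column_token_map = {'premise': '</premise>', 'hypothesis': '</hypothesis>'}

# Made-up example, not taken from SNLI.
example = {'premise': 'A man is playing a guitar.',
           'hypothesis': 'A person is making music.'}

# Zero-shot: no in-context examples, so the ice_token becomes an empty string.
zero_shot_prompt = fill_template(tp_str, column_token_map, example, ice_token='</E>')
print(zero_shot_prompt)
```

In a few-shot setting, the same helper could be called with `ice` set to the concatenated, already-filled demonstrations.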





## 2-2 Using API-based Models

OpenICL currently also supports OpenAI's GPT-3 API and the OPT-175B API. Both require some configuration before use.

### 2-2-1 OpenAI's GPT-3 API

OpenAI provides its own open-source library, [openai](https://github.com/openai/openai-python), for calling its API services. To use this library in OpenICL, you need to set the environment variable `OPENAI_API_KEY` in advance. Here is a simple way (for details, see OpenAI's documentation [here](https://platform.openai.com/docs/api-reference/introduction)):

In [None]:
# Replace 'your_api_key' with your key, and run this command in bash
export OPENAI_API_KEY="your_api_key"
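If you are working inside a notebook rather than a shell, the variable can also be set from Python with the standard `os.environ` mapping. The sketch below is only a convenience under that assumption: `ensure_api_key` is a hypothetical helper, not part of OpenICL or the openai library, and `"your_api_key"` is a placeholder. We assume the key must be in place before the inferencer is created.

```python
import os


def ensure_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in var_name, or raise a clear error if it is missing."""
    key = os.environ.get(var_name, "")
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it in your shell or "
            f"assign os.environ[{var_name!r}] first."
        )
    return key


# Placeholder value for illustration only; replace it with your real key.
# setdefault leaves an existing value (e.g. one exported in bash) untouched.
os.environ.setdefault("OPENAI_API_KEY", "your_api_key")

ensure_api_key()
print("OPENAI_API_KEY is set")
```

Failing early with a clear message is cheaper than discovering a missing key halfway through an inference run.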

Once the key is set, pass `api_name='gpt3'` to the `Inferencer` to use the API. Below is a code snippet:

In [1]:
from openicl import DatasetReader, PromptTemplate, BM25Retriever, GenInferencer
from datasets import load_dataset

dataset = load_dataset("iohadrubin/mtop")
dataset['train'] = dataset['train'].select([0, 1, 2])
dataset['test'] = dataset['test'].select([0])

dr = DatasetReader(dataset, input_columns=['question'], output_column='logical_form')  

tp_str = "</E></Q>\t</L>"      
tp = PromptTemplate(tp_str, column_token_map={'question' : '</Q>', 'logical_form' : '</L>'}, ice_token='</E>')

rtr = BM25Retriever(dr, ice_num=1)

infr = GenInferencer(api_name='gpt3', engine='text-davinci-003', sleep_time=3)

print(infr.inference(rtr, ice_template=tp))

Found cached dataset mtop (/home/zhangyudejia/.cache/huggingface/datasets/iohadrubin___mtop/mtop/1.0.0/4ba6d9db9efaebd4f6504db7e36925632e959f456071b9d7f1b86a85cce52448)
100%|██████████| 3/3 [00:00<00:00, 814.96it/s]
[2023-03-10 19:03:00,481] [openicl.icl_retriever.icl_bm25_retriever] [INFO] Retrieving data for test set...
100%|██████████| 1/1 [00:00<00:00, 1504.95it/s]
[2023-03-10 19:03:00,486] [openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████| 1/1 [00:06<00:00,  6.38s/it]

['[IN:GET_EVENTS [SL:TYPE music festivals ] [SL:DATE in 2018 ] ]']





Some OpenAI models are paid and rate-limited, so we set `sleep_time` (3 seconds) here to control the request frequency. To avoid losing results when an exception is thrown, we also recommend running this API on a small test set each time. For more information about API parameter configuration in OpenICL, see [api_service.py](https://github.com/Shark-NLP/OpenICL/blob/main/openicl/utils/api_service.py).
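The two recommendations above (pausing between requests and working on small test sets) can be sketched as a simple loop. This is an illustration in plain Python, not how OpenICL implements `sleep_time`: `run_in_chunks`, `fake_api`, and the `partial_*.json` file names are hypothetical, and the real API call is replaced by a stand-in function so the example runs offline.

```python
import json
import os
import tempfile
import time
from typing import Callable, List


def run_in_chunks(prompts: List[str], call_api: Callable[[str], str],
                  chunk_size: int = 5, sleep_time: float = 3.0,
                  out_prefix: str = "partial") -> List[str]:
    """Query call_api one prompt at a time, pausing between requests and
    writing each finished chunk to disk so an exception loses little work."""
    results: List[str] = []
    for start in range(0, len(prompts), chunk_size):
        chunk_results = []
        for prompt in prompts[start:start + chunk_size]:
            chunk_results.append(call_api(prompt))
            time.sleep(sleep_time)  # stay under the provider's rate limit
        with open(f"{out_prefix}_{start}.json", "w", encoding="utf-8") as f:
            json.dump(chunk_results, f)
        results.extend(chunk_results)
    return results


def fake_api(prompt: str) -> str:
    """Stand-in for a real API request; it just upper-cases the prompt."""
    return prompt.upper()


with tempfile.TemporaryDirectory() as tmp:
    outputs = run_in_chunks(["a", "b", "c"], fake_api, chunk_size=2,
                            sleep_time=0.0, out_prefix=os.path.join(tmp, "partial"))
    saved = sorted(os.listdir(tmp))

print(outputs, saved)
```

If a request fails in a later chunk, the files written for earlier chunks are already on disk, which is the same reason we suggest keeping each test set small.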

### 2-2-2 OPT-175B API 

For OPT-175B, you need to deploy the model yourself (or get the URL of a deployed model from your friends :\) ). Visit the [metaseq](https://github.com/facebookresearch/metaseq) repository for more information on deployment.

In [None]:
from openicl import GenInferencer

# Replace "xxx" with the URL of your deployed OPT-175B service
URL = "xxx"
inferencer = GenInferencer(api_name='opt-175b', URL=URL)