# GPT-J-6B Batch Prediction with Ray Data

In [1]:
prompt = (
    "In a shocking finding, scientists discovered a herd of unicorns living in a remote, "
    "previously unexplored valley, in the Andes Mountains. Even more surprising to the "
    "researchers was the fact that the unicorns spoke perfect English."
)
model_id = "EleutherAI/gpt-j-6B"
revision = "float16"

In [2]:
import ray



In [None]:
ray.init(
    runtime_env={
        "pip": [
            "accelerate>=0.16.0",
            "transformers>=4.26.0",
        ]
    }
)

In [3]:
import ray.data
import pandas as pd

ds = ray.data.from_pandas(pd.DataFrame([prompt]*10, columns=["prompt"]))

2023-02-27 13:23:27,459	INFO worker.py:1364 -- Connecting to existing Ray cluster at address: 10.0.60.136:6379...
2023-02-27 13:23:27,468	INFO worker.py:1544 -- Connected to Ray cluster. View the dashboard at https://console.anyscale-staging.com/api/v2/sessions/ses_84eyfpdey2169juxz5uk8f9k2d/services?redirect_to=dashboard
2023-02-27 13:23:27,493	INFO packaging.py:330 -- Pushing file package 'gcs://_ray_pkg_169586d667d471140904b2eeaf9f5b5d.zip' (7.88MiB) to Ray cluster...
2023-02-27 13:23:27,591	INFO packaging.py:343 -- Successfully pushed file package 'gcs://_ray_pkg_169586d667d471140904b2eeaf9f5b5d.zip'.


We define a callable class for `map_batches` to run as an actor, so that the model is loaded once in `__init__` instead of on every function call.
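
To illustrate why a class helps here, the following is a minimal sketch in plain Python with no Ray or model involved. `Predictor` and `iter_batches` are hypothetical stand-ins, not Ray APIs, and the simple slicing ignores how Ray splits data into blocks. It only shows that a callable object pays its setup cost once and is then reused for every batch.

```python
from typing import Dict, Iterator, List

Row = Dict[str, str]


class Predictor:
    """Hypothetical stand-in for a model-backed callable class."""

    init_count = 0  # how many times the expensive setup has run

    def __init__(self, suffix: str):
        # Stand-in for loading a large model: this should happen once.
        Predictor.init_count += 1
        self.suffix = suffix

    def __call__(self, batch: List[Row]) -> List[Row]:
        # Reuse the state built in __init__ for every batch.
        return [{"responses": row["prompt"] + self.suffix} for row in batch]


def iter_batches(rows: List[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield consecutive slices of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


rows = [{"prompt": "Hello"}] * 10
predictor = Predictor(suffix=", world")  # setup runs here, once
batch_sizes = []
results = []
for batch in iter_batches(rows, batch_size=4):
    batch_sizes.append(len(batch))
    results.extend(predictor(batch))

print(Predictor.init_count)  # 1
print(batch_sizes)           # [4, 4, 2]
```

With ten rows and a batch size of 4, the callable runs three times while the setup runs once. A plain function would have to repeat the setup on each call.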

In [4]:
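from typing import Optional


class PredictActor:
    def __init__(self, model_id: str, revision: Optional[str] = None):
        # Import inside the constructor so the libraries are resolved on the
        # worker process that hosts the actor.
        from transformers import AutoModelForCausalLM, AutoTokenizer
        import torch

        # Load the model once per actor, in half precision.
        self.model = AutoModelForCausalLM.from_pretrained(
            model_id,
            revision=revision,
            torch_dtype=torch.float16,
            low_cpu_mem_usage=True,
            device_map="auto",
        )
        self.tokenizer = AutoTokenizer.from_pretrained(model_id)
        # Fail fast if the model did not end up on a GPU.
        assert str(self.model.device.type) == "cuda"

    def __call__(self, batch: pd.DataFrame) -> pd.DataFrame:
        # No padding is applied, so all prompts in a batch must tokenize to the same length.
        input_ids = self.tokenizer(
            list(batch["prompt"]), return_tensors="pt"
        ).input_ids.to(self.model.device)

        gen_tokens = self.model.generate(
            input_ids,
            do_sample=True,
            temperature=0.9,
            max_length=100,  # total length in tokens, prompt included
        )
        return pd.DataFrame(self.tokenizer.batch_decode(gen_tokens), columns=["responses"])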
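
The tokenizer call in `__call__` does not pad, which works here because every row holds the same prompt. For prompts of different lengths, the token sequences would need padding to a common length, along with an attention mask marking the real tokens. The sketch below shows that idea in plain Python on made-up token ids. `pad_batch` is a hypothetical helper, not a `transformers` API, and both the left-side padding and the reuse of the end-of-sequence id 50256 as the pad id are assumptions for illustration.

```python
from typing import List, Tuple

PAD_ID = 50256  # assumption: reuse the end-of-sequence id as the pad id


def pad_batch(
    sequences: List[List[int]], pad_id: int = PAD_ID
) -> Tuple[List[List[int]], List[List[int]]]:
    """Left-pad token id lists to equal length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append([pad_id] * n_pad + seq)
        # 0 marks padding, 1 marks a real token.
        attention_mask.append([0] * n_pad + [1] * len(seq))
    return input_ids, attention_mask


# Made-up token ids for two prompts of different lengths.
ids, mask = pad_batch([[11, 12, 13], [21]])
print(ids)   # [[11, 12, 13], [50256, 50256, 21]]
print(mask)  # [[1, 1, 1], [0, 0, 1]]
```

In `transformers`, the tokenizer can return both `input_ids` and `attention_mask` when called with `padding=True`, provided a pad token has been configured, and `generate` accepts the mask through its `attention_mask` argument.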

In [6]:
ret = ds.map_batches(
    PredictActor,
    batch_size=4,
    fn_constructor_kwargs=dict(model_id=model_id, revision=revision),
    compute="actors",
    num_gpus=1
)
ret.take_all()

2023-02-27 13:36:47,694	INFO bulk_executor.py:39 -- Executing DAG InputDataBuffer[Input] -> ActorPoolMapOperator[MapBatches(PredictActor)]
MapBatches(PredictActor):   0%|          | 0/1 [00:00<?, ?it/s]
(_MapWorker pid=13474) 2023-02-27 13:36:49.646242: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
(_MapWorker pid=13474)   from pandas import MultiIndex, Int64Index
(_MapWorker pid=13474) The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
(_MapWorker pid=13474) Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
(_MapWorker pid=13474) The attention mask and the pad token id were not set. As a consequence, you may observe unexpe

[{'responses': 'In a shocking finding, scientists discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English. Dr. Richard Taylor, head of the unicorn research team, said “We have never seen a herd so large in our whole career. There has never been a previously unknown population of this large.”\n\nThe herd, which can be found on the mountains'},
 {'responses': 'In a shocking finding, scientists discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English.\n\nResearchers from the Wildlife Conservation Society (WCS) have discovered a herd of giant, horned, hoofed livestock in a remote Andean valley, a unicorn herd thought to be extinct for centuries.\n\nIn a paper published today in the journal Scientific Reports'},
 {