[BUG] Inference predictions don't match Hugging Face for GPT-J #2230

@rahul003

Describe the bug
With greedy decoding (do_sample=False), the Hugging Face pipeline produces coherent text for GPT-J, but the same pipeline after deepspeed.init_inference with kernel injection produces degenerate output for the same prompt:

hf_output [{'generated_text': 'Try without sampling the data.\n\nA:\n\nYou can use the following code to get the data from the database.\n$sql = "SELECT * FROM `table`";\n$result = mysqli_query($conn,'}]
ds output [{'generated_text': 'Try without sampling the ( � hub ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ('}]

To Reproduce
Steps to reproduce the behavior:

import torch
import deepspeed
from transformers import GPTJForCausalLM, pipeline

query_text = "Try without sampling"
model = GPTJForCausalLM.from_pretrained("EleutherAI/gpt-j-6B",
                revision="float16",
                torch_dtype=torch.float16,
                low_cpu_mem_usage=True)

pipe = pipeline("text-generation", model=model, tokenizer="EleutherAI/gpt-j-6B", device=0, framework="pt")

pipe.model.half()

hf_output = pipe(query_text, do_sample=False)

pipe.model = deepspeed.init_inference(
    pipe.model,
    mp_size=1,
    dtype=torch.half,
    replace_method="auto",
    replace_with_kernel_inject=True,
)

ds_output = pipe(query_text, do_sample=False)

print('HUGGINGFACE:', hf_output[0])
print('DEEPSPEED:', ds_output[0])

Expected behavior
DeepSpeed output predictions should match the Hugging Face predictions.
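
To make this expectation checkable, the two generated strings can be compared directly. The snippet below is only a minimal sketch: `first_divergence` is a hypothetical helper (not part of transformers or DeepSpeed), and the two strings are shortened copies of the outputs reported above.

```python
def first_divergence(reference: str, candidate: str) -> int:
    """Return the index of the first differing character, or -1 if identical."""
    for i, (a, b) in enumerate(zip(reference, candidate)):
        if a != b:
            return i
    # One string is a strict prefix of the other.
    if len(reference) != len(candidate):
        return min(len(reference), len(candidate))
    return -1


# Shortened copies of the outputs from this report (illustrative only).
hf_text = "Try without sampling the data.\n\nA:"
ds_text = "Try without sampling the ( ( ("

idx = first_divergence(hf_text, ds_text)
if idx == -1:
    print("Outputs match")
else:
    print("Outputs diverge at character", idx)
    print("Shared prefix:", repr(hf_text[:idx]))
```

In the reproducer, the same check could be applied to hf_output[0]['generated_text'] and ds_output[0]['generated_text']. Here the outputs share only the prompt plus one generated token before diverging.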

ds_report output

root@2f0b3a15b3d0:/fsx/huilgolr/inference/rubik# ds_report
--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
      runtime if needed. Op compatibility means that your system
      meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
cpu_adam ............... [NO] ....... [OKAY]
cpu_adagrad ............ [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
fused_lamb ............. [NO] ....... [OKAY]
sparse_attn ............ [NO] ....... [OKAY]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
 [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.
 [WARNING]  async_io: please install the libaio-dev package with apt
 [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
async_io ............... [NO] ....... [NO]
utils .................. [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
--------------------------------------------------
DeepSpeed general environment info:
torch install path ............... ['/opt/conda/lib/python3.8/site-packages/torch']
torch version .................... 1.11.0+cu113
torch cuda version ............... 11.3
torch hip version ................ None
nvcc version ..................... 11.3
deepspeed install path ........... ['/deepspeed']
deepspeed info ................... 0.7.1+8b2a6371, 8b2a6371, master
deepspeed wheel compiled w. ...... torch 1.11, cuda 11.3

Screenshots
NA

System info

OS: Ubuntu
GPU count and types: A100 GPU
Interconnects (if applicable): N/A
Python version: 3.8.3

Launcher context
inference, single process
