
Explicitly specify pad token id when generating tokens #3565

Merged
merged 4 commits into kserve:master from support-to-specify-pad-token on May 2, 2024

Conversation

sivanantha321 (Member)

What this PR does / why we need it:
We added a fallback pad token, used when one is not already present in the tokenizer, as part of PR #3459. However, that change does not explicitly specify pad_token_id when invoking the generate method, which leads to Hugging Face using eos_token_id as the pad_token_id. The log from the Hugging Face server is shown below. To avoid this, we should explicitly specify the pad_token_id.

```
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
```
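For illustration, the fix amounts to threading the tokenizer's pad token id through to generate. A minimal sketch of the idea, assuming a generic causal LM (the model name and variable names are illustrative, not the exact KServe code):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Fallback pad token from #3459: add one only if the tokenizer lacks it,
# then grow the embedding table so the new id is valid for the model.
if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({"pad_token": "[PAD]"})
    model.resize_token_embeddings(len(tokenizer))

inputs = tokenizer(["my name is teven and i am"], return_tensors="pt", padding=True)
# Without an explicit pad_token_id, transformers falls back to eos_token_id
# and emits the "Setting `pad_token_id` to `eos_token_id`" log shown above.
outputs = model.generate(**inputs, pad_token_id=tokenizer.pad_token_id)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```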

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #3536

Type of changes
Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Feature/Issue validation/testing:

Please describe the tests that you ran to verify your changes and summarize the relevant results. Provide instructions so the tests can be reproduced.
Please also list any relevant details of your test configuration.

  • Test A

  • Test B

  • Logs

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Checklist:

  • Have you added unit/e2e tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

Release note:

Explicitly specify pad token id when generating tokens

@sivanantha321 force-pushed the support-to-specify-pad-token branch 4 times, most recently from c926e10 to 8589658, on April 3, 2024 09:32
sivanantha321 (Member, Author)

supersedes #3535

```
@@ -198,6 +197,16 @@ def load(self) -> bool:
            raise ValueError(
                f"Unsupported task {self.task}. Please check the supported `task` option."
            )
        if not self.tokenizer.pad_token:
```
Member
It looks like this has been moved so it only applies to the predictor case; for the transformer mode, do we still need to apply the padding?

sivanantha321 (Member, Author)

In transformer mode we won't have access to the model to update the vocabulary size. So even if we add the pad token, we will get an index-out-of-range error.
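To illustrate why (a hedged sketch, not the actual code path): a newly added pad token receives an id one past the end of the model's vocabulary, so a tokenizer-only component cannot introduce it safely; only the side holding the model can resize the embedding table.

```python
from transformers import AutoTokenizer

# GPT-2's vocabulary covers ids 0..50256.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.add_special_tokens({"pad_token": "[PAD]"})  # new pad id: 50257

batch = tokenizer(["hi", "hello there"], return_tensors="pt", padding=True)
# If this batch reaches a model whose embeddings were never resized, the pad
# id (50257) indexes past the embedding table and raises an index-out-of-range
# error; the fix requires model.resize_token_embeddings(len(tokenizer)),
# which the transformer (tokenizer-only) mode cannot call.
```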

```python
# request_one is defined just above this hunk (not shown here).
request_two = "my name is teven and i am"
response = asyncio.run(model({"instances": [request_one, request_two]}, headers={}))
assert request_one in response["predictions"][0]
assert request_two in response["predictions"][1]
```
sivanantha321 (Member, Author)
I am asserting it this way because the generated output varies between runs.

Member
I think we can set the temperature to 0 to get a deterministic response.
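As a hedged aside on the mechanics: in Hugging Face transformers, a literal sampling temperature of 0 is typically rejected; the deterministic behavior that "temperature 0" asks for comes from greedy decoding via do_sample=False. An illustrative sketch (model name is arbitrary):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # demo-only fallback

inputs = tokenizer("my name is teven and i am", return_tensors="pt")
outputs = model.generate(
    **inputs,
    do_sample=False,                      # greedy search -> reproducible output
    max_new_tokens=20,
    pad_token_id=tokenizer.pad_token_id,  # passed explicitly, per this PR
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```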

@sivanantha321 force-pushed the support-to-specify-pad-token branch 2 times, most recently from ae81834 to 3b91b2b, on April 6, 2024 16:32
saileshd1402 (Contributor) commented May 2, 2024

Greetings!
I ran into a bug while testing the master branch that might be related to this PR:

Command used to run the HuggingFace backend:

```bash
python -m huggingfaceserver --model_id=microsoft/phi-2 --model_name=phi --backend=huggingface
```

cURL request:

```bash
curl -v -H "Content-Type: application/json" http://localhost:8080/openai/v1/completions -d '{"model":"phi", "prompt":"Hello give me a hello world python program", "stream": true, "max_tokens": 5}'
```

Error:

```
Exception in thread Thread-4 (generate):
Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/.testenv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/.testenv/lib/python3.11/site-packages/transformers/generation/utils.py", line 1527, in generate
    result = self._greedy_search(
             ^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/.testenv/lib/python3.11/site-packages/transformers/generation/utils.py", line 2452, in _greedy_search
    raise ValueError("If `eos_token_id` is defined, make sure that `pad_token_id` is defined.")
ValueError: If `eos_token_id` is defined, make sure that `pad_token_id` is defined.
```

After displaying this error, the runtime freezes and won't accept any further requests.

CC: @johnugeorge
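A hedged reconstruction of what is likely going on (names are illustrative, not the actual huggingfaceserver code): in the streaming path, generate runs on a background thread, and if its kwargs omit pad_token_id while the model defines eos_token_id, transformers raises the ValueError above inside that thread.

```python
from threading import Thread

from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello give me a hello world python program", return_tensors="pt")
streamer = TextIteratorStreamer(tokenizer, skip_special_tokens=True)
kwargs = {
    **inputs,
    "streamer": streamer,
    "max_new_tokens": 5,
    # Passing the pad token id explicitly avoids the ValueError raised when
    # eos_token_id is defined but pad_token_id is not.
    "pad_token_id": tokenizer.pad_token_id or tokenizer.eos_token_id,
}
Thread(target=model.generate, kwargs=kwargs).start()
for text in streamer:
    print(text, end="", flush=True)
```

Because the exception occurs on the generation thread, a request handler waiting on the streamer would never complete, which would explain the observed freeze.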

yuzisun (Member) commented May 2, 2024

@sivanantha321 can you help resolve the merge conflict?

@spolti (Contributor) left a comment:

/lgtm

@oss-prow-bot (bot) removed the lgtm label May 2, 2024
sivanantha321 (Member, Author)

/rerun-all

sivanantha321 (Member, Author)

/rerun-workflow test-llm

sivanantha321 (Member, Author)

/rerun-workflow E2E Tests

sivanantha321 (Member, Author)

/rerun-all

yuzisun (Member) commented May 2, 2024

/approve

oss-prow-bot (bot) commented May 2, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sivanantha321, spolti, yuzisun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@oss-prow-bot (bot) added the approved label May 2, 2024
yuzisun (Member) commented May 2, 2024

/lgtm

@oss-prow-bot (bot) added the lgtm label May 2, 2024
@yuzisun merged commit 9c6a6b8 into kserve:master May 2, 2024
57 of 58 checks passed
asd981256 pushed a commit to asd981256/kserve that referenced this pull request May 14, 2024
* Add fall back pad token for tokenizer

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Make linter happy

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Update test

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Rebase master

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: asd981256 <asd981256@gmail.com>