Generation framework should be tokens in tokens out #5

@parthchadha

Description

Describe the bug

To avoid discrepancies between the tokenizer versions used by the training and generation frameworks, the generation framework should accept only token IDs as input, along with the special token IDs it needs (e.g., stop and EOS token IDs), and return only token IDs as output.
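A minimal sketch of what such a tokens-in/tokens-out interface could look like. All names here (`GenerationRequest`, `TokenGenerationBackend`, the field names) are illustrative assumptions, not the actual API:

```python
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass
class GenerationRequest:
    # Prompt is already tokenized on the training side; the backend
    # never sees raw text and never needs its own tokenizer.
    prompt_token_ids: List[int]
    # Special token ids are passed in explicitly.
    eos_token_id: int
    stop_token_ids: List[int] = field(default_factory=list)
    max_new_tokens: int = 128


class TokenGenerationBackend(Protocol):
    """A generation backend that consumes and produces token ids only."""

    def generate(self, request: GenerationRequest) -> List[int]:
        """Return generated token ids; no text in, no text out."""
        ...
```

Detokenization would then happen entirely on the caller's side, using the same tokenizer the training framework uses.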

Steps/Code to reproduce bug

The current vLLM implementation holds a reference to the tokenizer.

The fix for this issue should include a test that ensures we don't add a tokenizer to any generation backend.

Labels

bug (Something isn't working)
