### Feature request
Currently, `TextStreamer.put()` caches incoming tokens and only flushes the cache at a word boundary; the source comment reads: `# After the symbol for a new line, we flush the cache.`
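For context, the relevant logic in `TextStreamer.put()` looks roughly like the following. This is a paraphrased sketch of `src/transformers/generation/streamers.py`, not a verbatim copy, and details may differ between versions:

```python
# Paraphrased sketch of the tail of TextStreamer.put().
self.token_cache.extend(value.tolist())
text = self.tokenizer.decode(self.token_cache, **self.decode_kwargs)

# After the symbol for a new line, we flush the cache.
if text.endswith("\n"):
    printable_text = text[self.print_len :]
    self.token_cache = []
    self.print_len = 0
# If the last token is a CJK character, print the new characters directly.
elif len(text) > 0 and self._is_chinese_char(ord(text[-1])):
    printable_text = text[self.print_len :]
    self.print_len += len(printable_text)
# Otherwise, only print up to the last space, so a possibly
# incomplete word (or special token) stays in the cache.
else:
    printable_text = text[self.print_len : text.rfind(" ") + 1]
    self.print_len += len(printable_text)

self.on_finalized_text(printable_text)
```

This word-boundary heuristic is what holds back text such as a trailing special token until a later flush.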
Code example (the standard `TextIteratorStreamer` usage from the docs):
```python
>>> from threading import Thread
>>> from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

>>> tok = AutoTokenizer.from_pretrained("gpt2")
>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> inputs = tok(["An increasing sequence: one,"], return_tensors="pt")
>>> streamer = TextIteratorStreamer(tok)
>>> generation_kwargs = dict(inputs, streamer=streamer, max_new_tokens=20)
>>> thread = Thread(target=model.generate, kwargs=generation_kwargs)
>>> thread.start()
>>> generated_text = ""
>>> for new_text in streamer:
...     generated_text += new_text
>>> generated_text
```

### Motivation
In my project, a special token and a common token can come back at the same time; for example, `123<|observation|>` is returned in a single step, and we need to handle that situation. We currently do so by creating a new streamer class that removes part of the cache logic (a rough sketch follows below). I would like to know whether it is worth adding a parameter to control the streamer cache, or whether there is another way to handle this.
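For illustration, here is a minimal sketch of the kind of subclass I mean. The class name `UnbufferedTextIteratorStreamer` and the `flush_on_every_token` parameter are hypothetical, not part of transformers; the idea is simply to bypass the word-boundary cache so fused outputs like `123<|observation|>` surface as soon as they are decoded:

```python
from transformers import TextIteratorStreamer


class UnbufferedTextIteratorStreamer(TextIteratorStreamer):
    """Hypothetical sketch: optionally emit decoded text on every put(),
    instead of caching until a newline/space boundary."""

    def __init__(self, tokenizer, flush_on_every_token=False, **kwargs):
        super().__init__(tokenizer, **kwargs)
        self.flush_on_every_token = flush_on_every_token

    def put(self, value):
        if not self.flush_on_every_token:
            # Keep the stock caching behavior.
            return super().put(value)

        if len(value.shape) > 1 and value.shape[0] > 1:
            raise ValueError("This streamer only supports batch size 1")
        elif len(value.shape) > 1:
            value = value[0]

        # Skip the prompt tokens, mirroring the base class option.
        if self.skip_prompt and self.next_tokens_are_prompt:
            self.next_tokens_are_prompt = False
            return

        # Decode and emit immediately, with no token cache. Note this is
        # naive: tokens whose bytes only form a valid character together
        # (common with BPE) may decode poorly one step at a time.
        text = self.tokenizer.decode(value.tolist(), **self.decode_kwargs)
        self.on_finalized_text(text)
```

A constructor flag along these lines on the stock streamers would cover our use case without a custom subclass.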
### Your contribution
I can submit a PR if it is necessary to change this.