Add a parameter to control token caching in the streamer when returning output

### Feature request

`put()` in class `TextStreamer` (https://github.com/huggingface/transformers/blob/dcbdf7e962c4b36140cc9ee76f870016121e69e5/src/transformers/generation/streamers.py#L102) stores several tokens in a cache and then returns the text of several tokens at once in the response. A parameter is needed to turn this token cache on or off.
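The buffering described above can be modeled roughly as follows. This is a simplified, pure-Python sketch of the heuristic, not the library code: `CachingStreamerSketch` is a hypothetical name, tokens are represented as already-decoded string pieces (so "decoding" is plain concatenation), and the CJK branch of the real `put()` is left out.

```python
class CachingStreamerSketch:
    """Toy model of a streamer that caches tokens (hypothetical, for illustration only)."""

    def __init__(self):
        self.token_cache = []  # pieces received since the last flush
        self.print_len = 0     # number of characters of the cache already emitted
        self.chunks = []       # every chunk handed to the consumer

    def put(self, piece):
        # Add the new piece to the cache and "decode" the whole cache again.
        self.token_cache.append(piece)
        text = "".join(self.token_cache)
        if text.endswith("\n"):
            # A newline flushes everything that has not been emitted yet.
            printable = text[self.print_len:]
            self.token_cache = []
            self.print_len = 0
        else:
            # Otherwise emit only up to the last space; the rest stays cached.
            printable = text[self.print_len:text.rfind(" ") + 1]
            self.print_len += len(printable)
        self.chunks.append(printable)

    def end(self):
        # Flush whatever is still cached when generation finishes.
        text = "".join(self.token_cache)
        self.chunks.append(text[self.print_len:])
        self.token_cache = []
        self.print_len = 0


sketch = CachingStreamerSketch()
for piece in ["The", " answer", " is", " 123", "<|observation|>", "\n"]:
    sketch.put(piece)
sketch.end()

emitted = [chunk for chunk in sketch.chunks if chunk]
print(emitted)  # ['The ', 'answer ', 'is ', '123<|observation|>\n']
```

Because there is no space between `123` and the special token, both stay in the cache until the newline arrives and are then emitted as a single chunk.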

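Code example:
```python
>>> from threading import Thread
>>> from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
>>> # Setup follows the TextIteratorStreamer docstring example
>>> tok = AutoTokenizer.from_pretrained("openai-community/gpt2")
>>> model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
>>> inputs = tok(["An increasing sequence: one,"], return_tensors="pt")
>>> streamer = TextIteratorStreamer(tok)
>>> generation_kwargs = dict(inputs, streamer=streamer, max_new_tokens=20)
>>> thread = Thread(target=model.generate, kwargs=generation_kwargs)
>>> thread.start()
>>> generated_text = ""
>>> for new_text in streamer:
...     generated_text += new_text
>>> generated_text
```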

### Motivation

In my project, a special token and ordinary tokens are returned in the same chunk; for example, `123<|observation|>` arrives all at once, and we need to handle this case. As a workaround, we created a new streamer class and removed part of the caching code.
Is it worth adding a parameter to control the streamer cache, or is there another way to handle this?
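One possible way to handle this without changing the streamer is to post-process each streamed chunk and split it at special-token boundaries. The sketch below is only an illustration under assumptions: `split_special_tokens` is a hypothetical helper, not part of `transformers`, and it assumes each special token arrives whole inside a single chunk and that the list of special-token strings is known.

```python
import re


def split_special_tokens(chunk, special_tokens):
    """Split a streamed text chunk so that each special token is its own piece."""
    if not special_tokens:
        return [chunk] if chunk else []
    # Try longer tokens first and escape regex metacharacters such as '|'.
    ordered = sorted(special_tokens, key=len, reverse=True)
    pattern = "(" + "|".join(re.escape(token) for token in ordered) + ")"
    # The capturing group makes re.split keep the matched special tokens.
    return [piece for piece in re.split(pattern, chunk) if piece]


pieces = split_special_tokens("123<|observation|>\n", ["<|observation|>"])
print(pieces)  # ['123', '<|observation|>', '\n']
```

Applied inside the `for new_text in streamer:` loop, this would let the consumer treat `123` and the special token separately. It does not remove the latency introduced by the cache, which is what a dedicated parameter would address.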



### Your contribution

I can submit a PR if this change is needed.

add a param to control cache in streamer when return output #36505
