Which component is this feature for?
Ollama Instrumentation
🔖 Feature description
TTFT (GEN_AI_SERVER_TIME_TO_FIRST_TOKEN) is already supported in the Ollama instrumentation. Could we also record the LLM_STREAMING_TIME_TO_GENERATE meter in Ollama streaming mode?
The value of LLM_STREAMING_TIME_TO_GENERATE is currently llm.openai.chat_completions.streaming_time_to_generate, which is OpenAI-specific. Should we also change the value of this key?
Related to #2934
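For illustration, a minimal sketch of what the rename could look like; the provider-neutral value below is just an assumption, not a decided name:

```python
# Current, OpenAI-specific value (as quoted above):
LLM_STREAMING_TIME_TO_GENERATE = "llm.openai.chat_completions.streaming_time_to_generate"

# One possible provider-neutral alternative (hypothetical):
LLM_STREAMING_TIME_TO_GENERATE = "llm.streaming_time_to_generate"
```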
🎤 Why is this feature needed?
It is already implemented in other instrumentations that support streaming mode, such as OpenAI.
✌️ How do you aim to achieve this?
Record and report this meter the same way the OpenAI instrumentation does.
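A minimal sketch of how this could look in the Ollama instrumentation, assuming the metric measures the time from the first streamed chunk until the stream completes (mirroring the OpenAI streaming instrumentation); the wrapper name, metric name, and attributes here are illustrative only, not the actual implementation:

```python
import time

from opentelemetry import metrics

meter = metrics.get_meter(__name__)

# Hypothetical metric name; see the key-rename discussion above.
streaming_time_to_generate = meter.create_histogram(
    name="llm.streaming_time_to_generate",
    unit="s",
    description="Time from the first streamed token until generation completes",
)


def _wrap_stream(stream, shared_attributes):
    """Yield chunks from an Ollama streaming response, then record the meter."""
    first_token_time = None
    for chunk in stream:
        if first_token_time is None:
            first_token_time = time.perf_counter()
        yield chunk
    if first_token_time is not None:
        streaming_time_to_generate.record(
            time.perf_counter() - first_token_time,
            attributes=shared_attributes,
        )
```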
🔄️ Additional Information
No response
👀 Have you spent some time to check if this feature request has been raised before?
- I checked and didn't find a similar issue
Are you willing to submit PR?
Yes I am willing to submit a PR!