In JavaScript, if you currently want chat completions to be cancellable, you have to cancel the request through the OpenAI web service. This works OK, but cancelling the completion stream should also be possible through the internal API.
I'm not sure whether the issue also exists for other languages, but here is what the API could look like in TypeScript:
```typescript
class ChatClient {
  // ...
  completeStreamingChat(messages: any[], tools: any[], signal: AbortSignal): AsyncIterable<any>;
}
```
This would let users stop a model from the user interface. It's common for users to notice that the SLM has taken a wrong course of action, or is misbehaving (repeating the same words over and over), and the ability to cancel in these cases is necessary.