
Conversation


@slavarazum slavarazum commented Dec 21, 2022

Brings streaming support to return partial progress with server-sent events.

⚠️ requests with the best_of option can't be streamed.

You can think of it as the progress of the AI typing.

The main advantage of using streams is that you can return a response to the user much earlier.
See the examples below to compare the results.

Request completion model

openai-php-request.mp4

Stream completion model

openai-php-stream.mp4

Currently implemented for the completions endpoint only.

Usage

Pass the stream parameter and get back a generator object.
You can iterate over it until the stream ends.

$client = OpenAI::client('YOUR_API_KEY');

$stream = $client->completions()->create([
    'model' => 'davinci',
    'prompt' => 'PHP is',
    'stream' => true,
]);

$fullText = '';

foreach ($stream as $item) {
    $fullText .= $item['choices'][0]['text'];
}

Each iteration of the stream returns the same object as a non-streamed response, except for the usage and finishReason parameters, which are not present in partial streamed responses.
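To illustrate, here is a minimal, self-contained sketch of accumulating partial chunks and detecting the final one. The chunk arrays below are hypothetical stand-ins shaped like the $item values above; in the raw API, the finish reason arrives under the finish_reason key:

```php
<?php

// Hypothetical chunks standing in for items yielded by the generator above.
$chunks = [
    ['choices' => [['text' => 'PHP is ', 'finish_reason' => null]]],
    ['choices' => [['text' => 'a popular language.', 'finish_reason' => null]]],
    ['choices' => [['text' => '', 'finish_reason' => 'stop']]],
];

$fullText = '';
$finished = false;

foreach ($chunks as $item) {
    $fullText .= $item['choices'][0]['text'];

    // Only the final chunk carries a non-null finish reason.
    if ($item['choices'][0]['finish_reason'] !== null) {
        $finished = true;
    }
}

echo $fullText; // PHP is a popular language.
```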

Stream text results to the client with a Laravel response:

response()->stream(
    function () use ($stream) {
        foreach ($stream as $item) {
            echo $item['choices'][0]['text'];
            ob_flush();
            flush();
        }
    },
    200,
    [
        'X-Accel-Buffering' => 'no',
    ]
);

You can create your own event stream (server-sent events) to send partial data to the client.

Example with Laravel response:

response()->stream(
    function () use ($stream) {
        foreach ($stream as $item) {
            echo 'data: ' . json_encode($item) . PHP_EOL . PHP_EOL;
            ob_flush();
            flush();
        }
    },
    200,
    [
        'Content-Type' => 'text/event-stream',
        'Connection' => 'keep-alive',
        'Cache-Control' => 'no-cache',
        'X-Accel-Buffering' => 'no',
    ]
);
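The same event loop can be written without Laravel. A sketch in plain PHP, assuming $stream is the generator returned by create() above:

```php
<?php

// $stream is assumed to be the generator from $client->completions()->create([...]).
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache');
header('X-Accel-Buffering: no');

foreach ($stream as $item) {
    echo 'data: ' . json_encode($item) . PHP_EOL . PHP_EOL;

    // Flush output buffers so each event reaches the client immediately.
    if (ob_get_level() > 0) {
        ob_flush();
    }
    flush();
}
```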

The current draft tries to create an implementation with minimal API changes.

  • completions model now returns array|Generator
  • usage parameter marked as optional in CreateResponse
  • finishReason changed to nullable
  • HttpTransporter client dependency changed to GuzzleHttp\ClientInterface

I would be happy to discuss this and bring the PR to a stable version.

@slavarazum
Author

slavarazum commented Dec 22, 2022

In the new commit, I replaced the custom Stream object with a Generator to stay more native.

You can iterate directly through returned generator object:

- foreach ($stream->read() as $item) {
+ foreach ($stream as $item) {
      $fullText .= $item['choices'][0]['text'];
  }

@slavarazum
Author

@nunomaduro @gehrisandro Any thoughts or suggestions? 💭

@nunomaduro
Contributor

@slavarazum Currently a little bit busy - I will check this as soon as possible.

@nunomaduro
Contributor

@slavarazum Is there any reason why this pull request does not make the client always use streams?

@slavarazum
Author

slavarazum commented Jan 8, 2023

@nunomaduro The main reason is that the stream parameter is optional and false by default in the OpenAI API.
A stream response requires handling an iterable type.
It might be confusing if we set it to true by default.

However, in my opinion, stream responses are absolutely necessary to provide a better user experience when the client expects a response as soon as possible.

@ijjimem

ijjimem commented Jan 8, 2023

How can I make it work now? Is it possible without changes to the core code?

  * @throws ErrorException|UnserializableResponse|TransporterException
  */
- public function requestObject(Payload $payload): array;
+ public function requestObject(Payload $payload, bool $stream = false): array|Generator;
Contributor

Can you somehow make this method return a single object for both cases, and have the stream option sent on the Payload object?

Author

@slavarazum slavarazum Jan 13, 2023

Created a new Response object, which Transporter::requestObject now returns. Stream option handling moved to the Payload object.


private function stream(Generator $stream): Generator
{
    foreach ($stream as $data) {
Contributor

Can't you create a single CreateResponse object that, behind the scenes, allows iterating over the stream?

Author

CreateResponse for completions now implements the Iterator interface, which allows iterating over the stream. How do you feel about that?
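As a self-contained sketch of that idea (a simplified stand-in, not the actual PR code), a response class can implement Iterator by delegating to the underlying generator:

```php
<?php

// Simplified stand-in for the PR's response class; the real class also
// holds the parsed response attributes.
final class StreamedCreateResponse implements Iterator
{
    public function __construct(private Generator $stream)
    {
    }

    public function current(): mixed
    {
        return $this->stream->current();
    }

    public function key(): mixed
    {
        return $this->stream->key();
    }

    public function next(): void
    {
        $this->stream->next();
    }

    public function rewind(): void
    {
        // Generators cannot be rewound once started, so this is a no-op;
        // foreach still works because current()/valid() auto-start the generator.
    }

    public function valid(): bool
    {
        return $this->stream->valid();
    }
}

$response = new StreamedCreateResponse((function (): Generator {
    yield ['choices' => [['text' => 'PHP ']]];
    yield ['choices' => [['text' => 'is great.']]];
})());

$text = '';
foreach ($response as $item) {
    $text .= $item['choices'][0]['text'];
}

echo $text; // PHP is great.
```

The trade-off of this approach is that the object can only be iterated once, which is inherent to generators.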

@slavarazum
Author

@ijjimem Let's wait for the implementation in this PR. I'm continuing to work on it.

@gehrisandro
Collaborator

Hi @slavarazum

Thank you very much for your work so far. And sorry for my delayed response.

I had a look at your implementation and I can see some good starting points, but nevertheless I would like to take a step back and first talk about the use cases and what the usage should look like.

Mainly I can see two use cases or goals to be achieved:

  1. Return the completion retrieved so far as quickly as possible. For example, as a stream response in Laravel: return response()->stream(...)
  2. Do something with the completion retrieved so far and, when done, fetch again the completion retrieved so far (not sure if this one is clear; maybe the code example below helps to clarify).

To achieve these two different use cases the user needs a way to use the response in different ways. Therefore I think it's better to have a dedicated CreateStreamResponse class which should be separated from the CreateResponse.

Example for use case 1:

$stream = $client->completions()->createStreamed([
    'model' => 'text-davinci-003',
    'prompt' => 'PHP is ',
    'max_tokens' => 100,
]); // CreateStreamResponse

return response()->stream(function () use ($stream) {
    foreach ($stream->iterator() as $newPart) {
        echo $newPart; // CompletionPartial
    }
});

Example for use case 2:

$stream = $client->completions()->createStreamed([
    'model' => 'text-davinci-003',
    'prompt' => 'PHP is ',
    'max_tokens' => 100,
]); // CreateStreamResponse

while (! $stream->finished()) {
    $response = $stream->response(); // CreateResponse object with the full completion received so far
    sleep(1); // do some work with the response
}

Some explanations of how I would structure the code:

CreateStreamResponse would implement a new interface StreamResponse with three methods:

  • iterator() returns a class which implements a new interface StreamResponseIterator (see below)
  • finished() returns a boolean indicating whether the stream (completion) has finished (maybe completed() would be a better name, but it's a bit weird in the context of "completions")
  • response() returns a Response object (in this case a CreateResponse) with the full completion retrieved so far

The StreamResponseIterator interface would extend the Iterator interface, yielding objects implementing a new interface StreamResponsePartial.

StreamResponsePartial is an interface for a newly received part of the stream response. In the case of completions this would be an object holding a simple string. When listing fine-tune events as a stream (the only other endpoint which supports streaming so far) this would be an object holding a new FineTune event.

CompletionPartial would implement StreamResponsePartial and hold only the new part of the completion as a string (and implement __toString())
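To make the proposal easier to discuss, here is a minimal sketch of these interfaces. The names come from the comment above; the exact signatures are my assumption:

```php
<?php

// A newly received part of a stream response.
interface StreamResponsePartial
{
    public function __toString(): string;
}

// Iterator over stream parts.
interface StreamResponseIterator extends Iterator
{
    public function current(): StreamResponsePartial;
}

// The object returned by createStreamed().
interface StreamResponse
{
    public function iterator(): StreamResponseIterator;

    public function finished(): bool;

    // Full completion retrieved so far, e.g. a CreateResponse.
    public function response(): object;
}

// Holds only the new part of the completion as a string.
final class CompletionPartial implements StreamResponsePartial
{
    public function __construct(private string $text)
    {
    }

    public function __toString(): string
    {
        return $this->text;
    }
}

echo new CompletionPartial('PHP is '); // PHP is
```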

@nunomaduro and @slavarazum: Can you come up with different use cases and what do you think about having a dedicated method (createStreamed()) and response (CreateStreamResponse)?

@slavarazum
Author

Hi @gehrisandro 🙌

In my vision, the stream option should be passed with the other options to the create method for 2 reasons:

  1. Stay consistent. Developers might expect that all options from the OpenAI API docs can be passed into the create method.
  2. A stream response is not guaranteed even if we pass the stream parameter, for example when the best_of option is present too.

A method that returns the full content of a stream response might make sense.

Let's define the final API.

  • should the create method return a single object for both cases, or separate ones?
  • to iterate through the stream, should we call some method, or would it be better to iterate over the returned object?

A single object with an isStream method and the ability to iterate over it may provide a simpler API:

$response = $client->completions()->create([
    // ...
]);

if ($response->isStream()) {
    foreach ($response as $part) {
        // ...
    }
}

vs

$response = $client->completions()->create([
    // ...
]);

if ($response instanceof CreateStreamResponse) {
    foreach ($response->iterator() as $part) {
        // ...
    }
}

@nunomaduro
Contributor

nunomaduro commented Jan 18, 2023

Can I have examples of how other OpenAI API clients (in other languages) solved this problem?

ps: @slavarazum really super sorry if this issue is taking forever to decide, but currently I am so busy that it's been difficult.

@slavarazum
Author

@nunomaduro NP 🤝 I also have some availability troubles in Ukraine 😅 It's OK not to rush this so we can create a truly good implementation. As far as I can see at first glance, it's currently only partially solved in other languages, so we can serve as a good example.

I had a look around the libraries listed in the docs and some others.

Examples:

Other:

@gehrisandro
Collaborator

gehrisandro commented Jan 19, 2023

I tried various combinations of the stream and best_of parameters, because the OpenAI documentation is not very clear about what happens if you provide both.

As far as I was able to see in my tests, if you provide both parameters together it still returns a stream response, but it waits to send the stream until the completion is done. This means it is not faster than a request without the stream option.
But technically it is still a stream, which contains the full completion within a single event. And consequently it does not include the usage.

So I would still prefer to have two different methods, but I understand your concern that developers will probably try the stream = true parameter on the normal create() method. To mitigate this issue we could throw an exception: "stream is not supported here, please use createStreamed() instead."

Conversely, we could throw an exception if best_of is passed to the createStreamed method as well, even though it technically works, because there is no benefit and it may lead to some confusion.
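A sketch of those guards (hypothetical helper functions, not code from this PR), assuming the parameters arrive as a plain array:

```php
<?php

// Guard for the normal create() method: reject stream = true.
function guardAgainstStream(array $parameters): void
{
    if (($parameters['stream'] ?? false) === true) {
        throw new InvalidArgumentException(
            'Streamed responses are not supported here, please use createStreamed() instead.'
        );
    }
}

// Guard for createStreamed(): reject best_of, which defeats streaming's purpose.
function guardAgainstBestOf(array $parameters): void
{
    if (array_key_exists('best_of', $parameters)) {
        throw new InvalidArgumentException(
            'The best_of option cannot be used with streamed completions.'
        );
    }
}
```

Either guard would run at the top of its method, turning the API's silent or slow behaviour into an immediate, explicit error.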

Additionally, I found more combinations where the API has, at least in my opinion, weird or unpredictable behaviour. For example, if you pass n = 2 and stream = true the API returns the expected stream response but with only one completion instead of two. In other words, the n parameter is completely ignored.
Update: I didn't test carefully enough. It actually returns two completions. Sorry about that.

@slavarazum Do you still think that having a single method is more convenient?
Personally, I do not like the if statements necessary to determine how to handle the response.
Furthermore, in most use cases developers will not reach for the stream option, and therefore I think we should keep the "normal" create() method and response as simple as possible.

@slavarazum
Author

It looks like a separate createStreamed method with appropriate exceptions might make sense.

At the moment I see more use cases for streamed responses than for conventional ones. Because of the longer response time, normal requests are more likely to be suitable for background operations where the user is not waiting for a response as soon as possible. Perhaps I'm missing something, since the industry is just emerging.

@jhull

jhull commented Feb 16, 2023

What is the latest on this? Would LOVE to be able to start using this for streaming, but I can't find a reasonable way of doing it anywhere ... hoping it's soon!

@slavarazum
Author

@nunomaduro How do you feel about separate createStreamed method, any thoughts about implementation in general?

@genesiscz

genesiscz commented Mar 12, 2023

Any update on this? Would love to see it working :-) I really like the implementation and agree on the single-method approach.

@CaReS0107

Any update on this PR? 🙏

@slavarazum
Author

Working on refactoring with chat endpoint support. Trying to simplify the implementation.

@Pierquinto

Has anyone of you dealt with this in Laravel with Inertia.js and Vue 3? I would love to see your implementation!

@oppsDayly

Oh, why not support stream responses? It is so nice for the user: it shows that the AI is working. If they have to wait a long time, users may think the AI is not working.

@huangdijia

I also hope to support stream as soon as possible.

@slavarazum
Author

Working on it right now. I will update the PR with a new Chat completions draft as soon as possible.
Stay tuned 😉

@Pierquinto When it's finished I will share examples with the client-side part.

@slavarazum
Author

Let's continue here - #84

@slavarazum slavarazum closed this Mar 24, 2023
@slavarazum
Author

@Pierquinto a simple stream reading implementation with fetch:

async function askAi() {
  const response = await fetch("/ask-ai");

  if (!response.body) {
    return;
  }

  const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();

  let delta = await reader.read();

  while (!delta.done) {
    // do something with the chunk in delta.value

    delta = await reader.read();
  }
}


10 participants