feat: ✨ ollama streaming mode #522

Merged
merged 1 commit into from
May 3, 2024

philippart-s
Contributor

@philippart-s philippart-s commented Apr 28, 2024

PR to try to add the streaming feature to the Ollama provider through the extension 😉

The first commits set up the feature 🏗️.

Collaborator

@geoand geoand left a comment

Thanks for the start!

I added some comments

@@ -26,7 +26,36 @@
<artifactId>quarkus-langchain4j-core</artifactId>
<version>${project.version}</version>
</dependency>

<dependency>
<groupId>dev.langchain4j</groupId>
Collaborator

Why this change?

Contributor Author

I needed it for one of my tests, but it's no longer in use.
I've deleted it (and I'll see later whether I need some classes from this dependency 😉).

Contributor Author

➡️ 35a077b

Contributor Author

@philippart-s philippart-s Apr 28, 2024

@geoand after removing it, I remembered (or rather, Maven reminded me 😉) why I needed to add the dependency: it was to import the class dev.langchain4j.model.ollama.OllamaStreamingChatModel.

If I understood correctly, the supplier needs a builder, and this builder is in this class.

Contributor Author

I confess that at this stage, I don't yet understand why, for some models, some classes are used from langchain4j while others are rewritten with duplicated code.

Another thing I need to understand, for example, is why the langchain4j client is used for some models while a new one is created for others (Mistral, for example).

I'm trying to understand the architecture of the code, and I think I'm going to make some mistakes, sorry.

Collaborator

The general idea was that we try to reuse upstream bits, but for some integrations the upstream code is so minimal that reusing it just does not bring any benefit.
Such is the case for the Ollama extension.

Contributor Author

Yes, I'm beginning to understand this, and I agree with you 😉

@geoand
Collaborator

geoand commented Apr 29, 2024

Thanks, this is a nice start! The most important part now is the implementation of OllamaStreamingChatLanguageModel.

@philippart-s
Contributor Author

Thanks, this is a nice start! The most important part now is the implementation of OllamaStreamingChatLanguageModel.

Yes, I've started the implementation, I need to correct one or two mistakes but it's coming along!

@andreadimaio
Contributor

@philippart-s I think that this link can help you to understand how to create the REST API call for the streaming. About the code implementation, you might want to take a look at what I did for the BAM module here and here (I hope this can help you).

@philippart-s
Contributor Author

@geoand: I have a first version that I want to test, but when I run the integration tests (for example https://github.com/quarkiverse/quarkus-langchain4j/tree/main/integration-tests/ollama), I get this error: 2024-04-30 15:13:06,715 ERROR [io.qua.dep.dev.IsolatedDevModeMain] (main) Failed to start quarkus: java.lang.IllegalStateException: No config found for interface io.quarkiverse.langchain4j.runtime.aiservice.ChatMemoryConfig

I run the quarkus dev command.
Is there a tip for running the examples in the integration-tests folder?

@geoand
Collaborator

geoand commented Apr 30, 2024

🎉

I would first build the entire project and then do mvn quarkus:dev in the integration-tests/ollama directory.

@philippart-s
Contributor Author

🎉

I would first build the entire project and then do mvn quarkus:dev in the integration-tests/ollama directory.

That's what I did:

➜ cd quarkus-langchain4j/integration-tests/ollama

➜ mvn quarkus:dev
[INFO] Scanning for projects...
[INFO] 
[INFO] --< io.quarkiverse.langchain4j:quarkus-langchain4j-integration-test-ollama >--
[INFO] Building Quarkus LangChain4j - Integration Tests - Ollama 999-SNAPSHOT
[INFO]   from pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- quarkus:3.8.2:dev (default-cli) @ quarkus-langchain4j-integration-test-ollama ---
[INFO] Invoking enforcer:3.4.1:enforce (enforce-java-version) @ quarkus-langchain4j-integration-test-ollama
[INFO] Rule 0: org.apache.maven.enforcer.rules.BannedRepositories passed
[INFO] Rule 1: org.apache.maven.enforcer.rules.version.RequireJavaVersion passed
[INFO] Invoking enforcer:3.4.1:enforce (enforce-maven-version) @ quarkus-langchain4j-integration-test-ollama
[INFO] Rule 0: org.apache.maven.enforcer.rules.version.RequireMavenVersion passed
[INFO] Invoking sundr:0.103.1:generate-bom (default) @ quarkus-langchain4j-integration-test-ollama
[INFO] Invoking buildnumber:3.2.0:create (get-scm-revision) @ quarkus-langchain4j-integration-test-ollama
[INFO] Executing: /bin/sh -c cd '/Users/stef/Dev/quarkus-langchain4j/integration-tests/ollama' && 'git' 'rev-parse' '--verify' 'HEAD'
[INFO] Working directory: /Users/stef/Dev/quarkus-langchain4j/integration-tests/ollama
[INFO] Storing buildNumber: efc0de2213cdc181d26d9377a46890c403b0642b at timestamp: 1714568697753
[INFO] Executing: /bin/sh -c cd '/Users/stef/Dev/quarkus-langchain4j/integration-tests/ollama' && 'git' 'symbolic-ref' 'HEAD'
[INFO] Working directory: /Users/stef/Dev/quarkus-langchain4j/integration-tests/ollama
[INFO] Storing scmBranch: ollama-streaming
[INFO] Invoking formatter:2.23.0:format (format-sources) @ quarkus-langchain4j-integration-test-ollama
[INFO] Processed 1 files in 259ms (Formatted: 0, Skipped: 1, Unchanged: 0, Failed: 0, Readonly: 0)
[INFO] Invoking impsort:1.9.0:sort (sort-imports) @ quarkus-langchain4j-integration-test-ollama
[INFO] Processed 1 files in 00:00.005 (Already Sorted: 1, Needed Sorting: 0)
[INFO] Invoking resources:3.3.1:resources (default-resources) @ quarkus-langchain4j-integration-test-ollama
[INFO] Copying 1 resource from src/main/resources to target/classes
[INFO] Invoking compiler:3.12.1:compile (default-compile) @ quarkus-langchain4j-integration-test-ollama
[INFO] Nothing to compile - all classes are up to date.
[INFO] Invoking resources:3.3.1:testResources (default-testResources) @ quarkus-langchain4j-integration-test-ollama
[INFO] skip non existing resourceDirectory /Users/stef/Dev/quarkus-langchain4j/integration-tests/ollama/src/test/resources
[INFO] Invoking compiler:3.12.1:testCompile (default-testCompile) @ quarkus-langchain4j-integration-test-ollama
[INFO] No sources to compile
[WARNING] [io.quarkus.bootstrap.devmode.DependenciesFilter] Live reload was disabled for the following project artifacts:
- io.quarkiverse.langchain4j:quarkus-langchain4j-ollama:999-SNAPSHOT
- io.quarkiverse.langchain4j:quarkus-langchain4j-core:999-SNAPSHOT
- io.quarkiverse.langchain4j:quarkus-langchain4j-core-runtime-spi:999-SNAPSHOT
- io.quarkiverse.langchain4j:quarkus-langchain4j-core-deployment:999-SNAPSHOT
- io.quarkiverse.langchain4j:quarkus-langchain4j-ollama-deployment:999-SNAPSHOT
The artifacts above appear to be either dependencies of non-reloadable application dependencies or Quarkus extensions
Listening for transport dt_socket at address: 5005
2024-05-01 15:05:00,120 INFO  [io.qua.dep.dev.IsolatedDevModeMain] (main) Attempting to start live reload endpoint to recover from previous Quarkus startup failure
2024-05-01 15:05:00,335 ERROR [io.qua.dep.dev.IsolatedDevModeMain] (main) Failed to start quarkus: java.lang.IllegalStateException: No config found for interface io.quarkiverse.langchain4j.runtime.aiservice.ChatMemoryConfig
        at io.quarkus.deployment.ExtensionLoader.loadStepsFrom(ExtensionLoader.java:186)
        at io.quarkus.deployment.QuarkusAugmentor.run(QuarkusAugmentor.java:107)
        at io.quarkus.runner.bootstrap.AugmentActionImpl.runAugment(AugmentActionImpl.java:330)
        at io.quarkus.runner.bootstrap.AugmentActionImpl.createInitialRuntimeApplication(AugmentActionImpl.java:251)
        at io.quarkus.runner.bootstrap.AugmentActionImpl.createInitialRuntimeApplication(AugmentActionImpl.java:60)
        at io.quarkus.deployment.dev.IsolatedDevModeMain.firstStart(IsolatedDevModeMain.java:112)
        at io.quarkus.deployment.dev.IsolatedDevModeMain.accept(IsolatedDevModeMain.java:433)
        at io.quarkus.deployment.dev.IsolatedDevModeMain.accept(IsolatedDevModeMain.java:55)
        at io.quarkus.bootstrap.app.CuratedApplication.runInCl(CuratedApplication.java:138)
        at io.quarkus.bootstrap.app.CuratedApplication.runInAugmentClassLoader(CuratedApplication.java:93)
        at io.quarkus.deployment.dev.DevModeMain.start(DevModeMain.java:131)
        at io.quarkus.deployment.dev.DevModeMain.main(DevModeMain.java:62)

I get this error for every project under integration-tests. I must be doing something wrong, but I don't understand what.

@geoand
Collaborator

geoand commented May 1, 2024

Weird... I've never seen that happen and it works fine for me (and CI)

@philippart-s
Contributor Author

I'll recheck my whole setup to see whether the issue comes from my local configuration.

@geoand
Collaborator

geoand commented May 2, 2024

Would you like to push what you have so I can check it out locally?

@philippart-s
Contributor Author

After a fresh build it's working 🎉
I'm going to push my current code, but I don't think it's perfect (I haven't implemented the token count yet).

@philippart-s
Contributor Author

@geoand here is the first version with streaming mode.
Sorry, but there is an error at the end:

 ERROR [org.jbo.res.rea.ser.han.PublisherResponseHandler] (vert.x-eventloop-thread-0) Exception in SSE server handling, impossible to send it to client: org.jboss.resteasy.reactive.ClientWebApplicationException: HTTP 200 OK
        at io.quarkus.rest.client.reactive.jackson.runtime.serialisers.ClientJacksonMessageBodyReader.readFrom(ClientJacksonMessageBodyReader.java:57)
        at org.jboss.resteasy.reactive.client.impl.ClientReaderInterceptorContextImpl.proceed(ClientReaderInterceptorContextImpl.java:86)
        at org.jboss.resteasy.reactive.client.impl.ClientSerialisers.invokeClientReader(ClientSerialisers.java:160)
        at org.jboss.resteasy.reactive.client.impl.RestClientRequestContext.readEntity(RestClientRequestContext.java:208)
        at org.jboss.resteasy.reactive.client.impl.MultiInvoker$3.handle(MultiInvoker.java:321)
        at org.jboss.resteasy.reactive.client.impl.MultiInvoker$3.handle(MultiInvoker.java:288)
        at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:276)
        at io.vertx.core.http.impl.HttpEventHandler.handleChunk(HttpEventHandler.java:51)
        at io.vertx.core.http.impl.HttpClientResponseImpl.handleChunk(HttpClientResponseImpl.java:239)
        at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.lambda$new$0(Http1xClientConnection.java:426)
        at io.vertx.core.streams.impl.InboundBuffer.handleEvent(InboundBuffer.java:255)
        at io.vertx.core.streams.impl.InboundBuffer.write(InboundBuffer.java:134)
        at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.handleChunk(Http1xClientConnection.java:701)
        at io.vertx.core.impl.ContextImpl.execute(ContextImpl.java:320)
        at io.vertx.core.impl.DuplicatedContext.execute(DuplicatedContext.java:171)
        at io.vertx.core.http.impl.Http1xClientConnection.handleResponseChunk(Http1xClientConnection.java:889)
        at io.vertx.core.http.impl.Http1xClientConnection.handleHttpMessage(Http1xClientConnection.java:808)
        at io.vertx.core.http.impl.Http1xClientConnection.handleMessage(Http1xClientConnection.java:775)
        at io.vertx.core.net.impl.ConnectionBase.read(ConnectionBase.java:159)
        at io.vertx.core.net.impl.VertxHandler.channelRead(VertxHandler.java:153)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
        at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: com.fasterxml.jackson.core.io.JsonEOFException: Unexpected end-of-input: expected close marker for Object (start marker at [Source: REDACTED (`StreamReadFeature.INCLUDE_SOURCE_IN_LOCATION` disabled); line: 1, column: 1])
 at [Source: REDACTED (`StreamReadFeature.INCLUDE_SOURCE_IN_LOCATION` disabled); line: 1, column: 93]
        at com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:699)
        at com.fasterxml.jackson.core.base.ParserBase._handleEOF(ParserBase.java:514)
        at com.fasterxml.jackson.core.base.ParserBase._eofAsNextChar(ParserBase.java:531)
        at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._skipWSOrEnd(UTF8StreamJsonParser.java:3107)
        at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:760)
        at com.fasterxml.jackson.databind.deser.BuilderBasedDeserializer.vanillaDeserialize(BuilderBasedDeserializer.java:286)
        at com.fasterxml.jackson.databind.deser.BuilderBasedDeserializer.deserialize(BuilderBasedDeserializer.java:217)
        at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:342)
        at com.fasterxml.jackson.databind.ObjectReader._bindAndClose(ObjectReader.java:2125)
        at com.fasterxml.jackson.databind.ObjectReader.readValue(ObjectReader.java:1501)
        at io.quarkus.rest.client.reactive.jackson.runtime.serialisers.ClientJacksonMessageBodyReader.readFrom(ClientJacksonMessageBodyReader.java:53)
        ... 42 more

I think I've reached the limit of my knowledge on how to call Ollama and how to integrate the streaming function into this extension.
If you have some time to help me finish the development, I'd be grateful.
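
For readers who hit the same trace: Ollama streams its chat responses as newline-delimited JSON objects, and a `JsonEOFException` reporting an unclosed object is consistent with a JSON reader being handed a chunk that ends in the middle of an object. The thread does not confirm that this is the root cause, and it does not show the workaround that was eventually merged. The sketch below only illustrates the general technique of buffering chunks until a full line is available before parsing. `NdjsonLineBuffer` is a hypothetical helper, not a class from this PR, and it assumes the chunks have already been decoded to text.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of the PR): reassembles newline-delimited
 * JSON lines from chunks that may be split at arbitrary positions.
 */
public class NdjsonLineBuffer {

    // Holds text received so far that does not yet end with a newline.
    private final StringBuilder pending = new StringBuilder();

    /**
     * Appends a chunk and returns every complete, non-empty line now available.
     * A trailing partial line stays buffered until a later chunk completes it.
     */
    public List<String> accept(String chunk) {
        pending.append(chunk);
        List<String> lines = new ArrayList<>();
        int newline;
        while ((newline = pending.indexOf("\n")) >= 0) {
            // strip() also drops a trailing '\r' if the server uses CRLF.
            String line = pending.substring(0, newline).strip();
            pending.delete(0, newline + 1);
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        NdjsonLineBuffer buffer = new NdjsonLineBuffer();
        // The first chunk ends mid-object, so no line is emitted yet.
        System.out.println(buffer.accept("{\"done\":fal"));
        // The second chunk completes the first object and carries a whole second one.
        System.out.println(buffer.accept("se}\n{\"done\":true}\n"));
    }
}
```

Each string returned by `accept` is a complete line and can be handed to a JSON parser on its own, so the parser never sees half an object.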

@geoand
Collaborator

geoand commented May 3, 2024

Thanks a lot @philippart-s!

I will take your code and finish the implementation soon.

@philippart-s
Contributor Author

Thanks, don't hesitate to ping me. I'll look at what you fix in my code so I can be more autonomous next time.
Sorry for not being able to write all the code on my own.

@geoand
Collaborator

geoand commented May 3, 2024

No need to apologize!

Thanks a lot for getting the ball rolling on this one!

I will ping you when I've completed the PR :)

Co-Authored-By: Georgios Andrianakis <geoand@gmail.com>
@geoand
Collaborator

geoand commented May 3, 2024

The problem with the exception you are seeing turns out to be weirder than I thought...

All we can do for the time being is work around it, as I have done.

Do you mind checking if things work for you with the latest version (I've force pushed to your branch so you'll have to be careful when pulling)?

@geoand geoand marked this pull request as ready for review May 3, 2024 08:12
@geoand geoand requested a review from a team as a code owner May 3, 2024 08:12
@philippart-s
Contributor Author

Yes, I'm going to try it as soon as I have a bit of time in my day 😉

@philippart-s
Contributor Author

@geoand I tested with my app and it seems OK 👌.
Thanks for completing the code and making it correct!

Just for my information, for the next PR: which code formatter configuration is used? (To avoid the Maven verify error on CI 😅)

@geoand
Collaborator

geoand commented May 3, 2024

Thanks for checking!

All I do is mvn install -f ollama and the build does the formatting automatically

@geoand geoand merged commit f51e321 into quarkiverse:main May 3, 2024
12 checks passed
@adriens

adriens commented May 4, 2024

👍

Development

Successfully merging this pull request may close these issues.

feat: Add the streaming mode to Ollama
4 participants