AIService interaction point #9

geoand · 2023-11-13T15:46:10Z

The idea is to leverage AI services to provide a declarative interface like REST Client.
You would describe the interaction point with the LLM using an interface and annotations.

For example:

@RegisterAiService(
    name = ..., // Optional - overrides the config key; could also be named `configKey`
    chatModel = ..., // String - the name of the chat model (or streaming chat model); if not set, use the default one (validated and set at build time)
    tools = ..., // List<String> - the bean identifiers providing tools (validated at build time); if not set, all tools are available
    chatMemory = ..., // String - the bean identifier for the chat memory; if not set, use the default one
    moderationModel = ..., // String - the bean identifier for the moderation model; if not set, use the default one (no moderation)
    retriever = ... // String - the bean identifier of the RAG retriever
)
public interface MyAiService { 
  // ...
}

All the attributes are optional, meaning that the following snippet would use sensible defaults:

@RegisterAiService
public interface MyAiService { 
  // ...
}
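
To make the "all attributes are optional" idea concrete, here is a minimal plain-JDK sketch of how such an annotation could be declared with defaults and read back by reflection. The nested `RegisterAiService` type is a hypothetical stand-in written only for this illustration, not the extension's actual annotation. Falling back to the interface's simple name as the config key is likewise an assumption, as are the `SupportService` interface and the `calculator` tool identifier.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class RegisterAiServiceSketch {

    // Hypothetical stand-in for the proposed annotation; attribute names follow the issue text.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface RegisterAiService {
        String name() default "";            // empty = derive the config key (assumed convention)
        String chatModel() default "";       // empty = default chat model
        String[] tools() default {};         // empty = all tools are available
        String chatMemory() default "";      // empty = default chat memory
        String moderationModel() default ""; // empty = no moderation
        String retriever() default "";       // empty = no retriever
    }

    // No attributes set: everything falls back to the defaults above.
    @RegisterAiService
    interface MyAiService {
        String chat(String question);
    }

    // Overrides the config key and restricts the tools (illustrative values).
    @RegisterAiService(name = "support", tools = {"calculator"})
    interface SupportService {
        String chat(String question);
    }

    // Resolves the config key: an explicit name wins, otherwise the simple class name is used.
    static String configKey(Class<?> serviceType) {
        RegisterAiService meta = serviceType.getAnnotation(RegisterAiService.class);
        if (meta == null) {
            throw new IllegalArgumentException(serviceType + " is not annotated with @RegisterAiService");
        }
        return meta.name().isEmpty() ? serviceType.getSimpleName() : meta.name();
    }

    public static void main(String[] args) {
        System.out.println(configKey(MyAiService.class));
        System.out.println(configKey(SupportService.class));
    }
}
```

In the real extension this resolution would presumably happen at build time rather than through runtime reflection; the sketch only shows the defaulting logic.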

All of these settings can also be configured in application.properties using the quarkus.aiservices.$name.attr=value syntax (the prefix is not decided yet).
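
As a rough illustration of the `quarkus.aiservices.$name.attr` key layout, the sketch below builds the property key for a given service name and attribute and looks it up. It uses `java.util.Properties` as a stand-in for the real configuration source, and the prefix is the tentative one from this issue. The service name `support`, the attribute spelling `chatModel`, and the value `my-model` are illustrative assumptions.

```java
import java.util.Optional;
import java.util.Properties;

public class AiServiceConfigSketch {

    // Tentative prefix taken from the issue text; it may change.
    static final String PREFIX = "quarkus.aiservices";

    // Builds the full key for one attribute of one named AI service.
    static String key(String serviceName, String attr) {
        return PREFIX + "." + serviceName + "." + attr;
    }

    // Returns the configured value, or empty so the caller can fall back to the default.
    static Optional<String> lookup(Properties props, String serviceName, String attr) {
        return Optional.ofNullable(props.getProperty(key(serviceName, attr)));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("quarkus.aiservices.support.chatModel", "my-model");

        System.out.println(lookup(props, "support", "chatModel").orElse("<default>"));
        System.out.println(lookup(props, "support", "chatMemory").orElse("<default>"));
    }
}
```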

While the skeleton can be generated at build time, not everything can be initialized at build time, because the RAG retriever may need to connect to the store. (Preloading an in-memory store at build time might be interesting, but it would not work in the general case.)

It should be possible to use fault-tolerance annotations (timeout, retry, or even circuit breaker) on the AiService methods.
If OTel is available, each method would be timed and counted automatically. The outcome would also be monitored.
If an audit service is available (See #12), each method invocation will be audited.
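
The "timed and counted automatically" behaviour can be sketched with a plain JDK dynamic proxy that wraps every call to an AI service interface. This is not the OTel API and not how the extension is implemented. The in-memory maps stand in for real metrics, and `MyAiService`, `metered`, and the `Interface.method` key format are names made up for the illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class MeteredAiServiceSketch {

    // Hypothetical AI service interface used only for this example.
    interface MyAiService {
        String chat(String question);
    }

    // In-memory stand-ins for a call counter, a failure counter and a timer, keyed by "Interface.method".
    static final Map<String, LongAdder> CALLS = new ConcurrentHashMap<>();
    static final Map<String, LongAdder> FAILURES = new ConcurrentHashMap<>();
    static final Map<String, LongAdder> NANOS = new ConcurrentHashMap<>();

    static long callCount(String key) {
        LongAdder adder = CALLS.get(key);
        return adder == null ? 0 : adder.sum();
    }

    static long failureCount(String key) {
        LongAdder adder = FAILURES.get(key);
        return adder == null ? 0 : adder.sum();
    }

    // Wraps the delegate so that every method call is counted, timed, and its outcome recorded.
    static <T> T metered(Class<T> type, T delegate) {
        InvocationHandler handler = (proxy, method, args) -> {
            String key = type.getSimpleName() + "." + method.getName();
            long start = System.nanoTime();
            try {
                return method.invoke(delegate, args);
            } catch (InvocationTargetException e) {
                FAILURES.computeIfAbsent(key, k -> new LongAdder()).increment();
                throw e.getCause(); // rethrow the exception raised by the delegate
            } finally {
                CALLS.computeIfAbsent(key, k -> new LongAdder()).increment();
                NANOS.computeIfAbsent(key, k -> new LongAdder()).add(System.nanoTime() - start);
            }
        };
        return type.cast(Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler));
    }

    public static void main(String[] args) {
        MyAiService service = metered(MyAiService.class, question -> "echo: " + question);
        System.out.println(service.chat("hello"));
        System.out.println("calls = " + callCount("MyAiService.chat"));
    }
}
```

The same wrapping point is where an audit hook or fault-tolerance handling could plausibly be attached, although the issue does not say how those would be wired in.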

Other extensions:

Tools category - in a bean providing tools, we may need security features (such as authentication) or a way to identify the tool's category, so that specific processing (such as authentication) can be applied before the tool is called. That also means we need a tool interceptor. We can reuse CDI interceptors for this. However, the interceptor may need access to the context (both the conversational context and the duplicated context).

geoand · 2023-11-13T15:46:27Z

The declarative approach has been implemented. The rest is still pending.

geoand added the epic label Nov 13, 2023
