
Allow to programmatically define system messages in the AiServices #862

Merged
7 commits merged into langchain4j:main on Apr 10, 2024

Conversation

@mariofusco (Contributor) commented Mar 29, 2024:

This pull request is a variation and simplification of what I proposed here. As anticipated there, instead of making the AiServices aware of the concept of a state machine, I simply added the possibility of configuring the system messages programmatically. This pull request currently has the same limitations as the other one: no way to configure messages per user/conversation and no support for tools, but I believe both things could be added once we agree on the main direction.

I tried both this pull request and the other one with the same use case that I implemented here, and both produce output similar to the following:

User: hi
Agent:  Hello! May I please have your full name for our records? And could you kindly share your age with me as well? Thank you.
User: My name is Mario Fusco and I'm 70
Extracted Customer { firstName = "Mario", lastName = "Fusco", age = 70 }
Agent:  Hello Mario, thank you for sharing your name and age. Could you please tell me which flight you recently took with us, including the flight number? Also, was there any delay in your flight? If so, could you provide the approximate duration of the delay? Thank you.
User: I was on flight 22 and it had a delay of 4 hours
Extracted Flight { number = "22", delayInMinutes = 240 }
Agent: Thank you Mario Fusco, you are eligible for a refund of $528.0

@langchain4j (Owner) left a comment:


@mariofusco, thanks a lot! Pardon the delay, I was on PTO and sick leave.
Please check my comments; I think with a few changes it could make it into the next release (15 April). But no rush, of course!


if (context.getCustomer() == null) {
    try {
        Customer customer = customerExtractor.extractData(userMessage);
@langchain4j (Owner) commented:


I can imagine this will be a common use case where the LLM must first collect some information from the user before moving to the next state/action. While extracting it explicitly on each interaction works, it is probably not the most effective way.

I was thinking about introducing this as a first-class feature of AI Services. The idea is to give the user an option to define a POJO/Map which will keep all the fields for a customer/flight/etc. The LLM will then be able to use the setters and getters of this POJO to read and write those fields. The "setter tools" can be automatically generated by AI Services from the fields of the provided POJO. So, if there are firstName and lastName fields in the Customer POJO, AI Services will automatically generate 2 tools: setFirstName(String firstName) and setLastName(String lastName) (or, alternatively, a single tool with a set(String field, String value) signature). For example, when the user says only his first name while greeting, the LLM could automatically invoke setFirstName(...) to record this information. When he provides more information later, it can also be added to the POJO.

When the conversation grows beyond the maximum messages/tokens, the initial message from the user with his first name will be evicted, but the POJO with his name will stay and can be used when needed. As for reading, we could just dump all the fields with their current up-to-date values into the system message, something like this:

You are a helpful assistant. You are having a conversation with the customer.
Before doing any actions, you must collect the following information:
- customer's first name
- last name
- booking number

This is what you already know:
{{current_customer_information}}

Which will be rendered into:

You are a helpful assistant. You are having a conversation with the customer.
Before doing any actions, you must collect the following information:
- customer's first name
- last name
- booking number

This is what you already know:
firstName: Klaus
lastName: [missing]
bookingNumber: [missing]

WDYT?

I plan to work on this feature soon, will keep you updated.
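
For illustration, here is a minimal, hypothetical sketch of the idea using plain Java reflection. This is not the actual AI Services implementation; the class and method names are made up for the example, and the Customer fields are taken from the system message above.

import java.lang.reflect.Field;

public class SetterToolsSketch {

    // The POJO whose fields the LLM should gradually fill in.
    static class Customer {
        String firstName;
        String lastName;
        String bookingNumber;
    }

    // The single-tool variant mentioned above: set(field, value) writes the
    // value into the corresponding POJO field via reflection.
    static void set(Object pojo, String fieldName, String value) throws ReflectiveOperationException {
        Field field = pojo.getClass().getDeclaredField(fieldName);
        field.setAccessible(true);
        field.set(pojo, value);
    }

    // The reading side: dump all fields with their current values into the
    // system message, marking fields not yet collected as [missing].
    static String render(Object pojo) throws ReflectiveOperationException {
        StringBuilder sb = new StringBuilder();
        for (Field field : pojo.getClass().getDeclaredFields()) {
            field.setAccessible(true);
            Object value = field.get(pojo);
            sb.append(field.getName()).append(": ")
              .append(value == null ? "[missing]" : value).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Customer customer = new Customer();
        set(customer, "firstName", "Klaus"); // as if the LLM had invoked the tool
        System.out.print(render(customer));
        // firstName: Klaus
        // lastName: [missing]
        // bookingNumber: [missing]
    }
}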

@mariofusco (Contributor, Author) commented Apr 9, 2024:


This is a very interesting (and, in my opinion, common) use case indeed. In my PoC I solved this problem with a simpler and maybe more naive hack: in practice I implemented an alternative chat memory (I know it is not a proper ChatMemory, since it doesn't implement that interface, but the purpose is exactly the same) that simply concatenates all the prompts sent by the user in a given state and sends the String resulting from this concatenation at each and every interaction with the extractor.

Note that this behavior differs in a relevant way (at least for my use case) from what is provided by the MessageWindowChatMemory, because that memory sends to the LLM the whole concatenation of all the messages generated by the extractor plus all the LLM responses, which in my situation confuses the model in an unrecoverable way.

In essence, in relation to the interaction in this screenshot, at the end of the customer extraction phase my solution sends to the LLM a prompt like

"Extract information about a customer from this text 'hi. My name is Mario Fusco. I'm 70'. The response must contain only the JSON with customer's data and without any other sentence. You must answer strictly in the following JSON format: {\n"firstName": (type: string),\n"lastName": (type: string),\n"age": (type: integer)"

while if I used the MessageWindowChatMemory (as I attempted) I would have a prompt like

"Extract information about a customer from this text 'hi'. The response must contain only the JSON with customer's data and without any other sentence. You must answer strictly in the following JSON format: {\n"firstName": (type: string),\n"lastName": (type: string),\n"age": (type: integer)"; <1st LLM response>; "Extract information about a customer from this text 'My name is Mario Fusco'. The response must contain only the JSON with customer's data and without any other sentence. You must answer strictly in the following JSON format: {\n"firstName": (type: string),\n"lastName": (type: string),\n"age": (type: integer)"; <2nd LLM response>; "Extract information about a customer
from this text 'I'm 70'. The response must contain only the JSON with customer's data and without any other
sentence. You must answer strictly in the following JSON format: {\n"firstName": (type: string),\n"lastName": (type: string),\n"age": (type: integer)";

which is clearly not workable for the LLM.

I wonder whether simply making an alternative ChatMemory implementation similar to the one in my PoC available out-of-the-box would be enough to solve this problem (or even whether it could be useful for other use cases as well).
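
For illustration, a minimal sketch of the concatenating memory described above (class and method names are hypothetical, and, as said, it does not implement the real ChatMemory interface):

import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: keeps only the raw user inputs for the current state
// and replays them as a single concatenated string, instead of replaying the
// full history of extractor prompts and LLM responses the way
// MessageWindowChatMemory would.
public class ConcatenatingUserMemory {

    private final List<String> userInputs = new ArrayList<>();

    public void add(String userInput) {
        userInputs.add(userInput);
    }

    // e.g. "hi. My name is Mario Fusco. I'm 70"
    public String concatenated() {
        return String.join(". ", userInputs);
    }

    // Invoked when the state machine moves to the next phase
    // (e.g. from customer extraction to flight extraction).
    public void clear() {
        userInputs.clear();
    }
}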

@langchain4j (Owner) commented:


I see... It is an interesting use case; usually memory is not used for extraction, so there was no need for this before. I need to think a bit more about it...

@mariofusco (Contributor, Author) commented:


Exactly, and that's why my custom implementation isn't a proper ChatMemory: I couldn't find a way to fit it into that interface. Once you have decided whether this is the way you want to go, just let me know. I could help with the implementation using the scenario that I already have.

@mariofusco (Contributor, Author) commented:

@langchain4j I updated this pull request following your suggestions. Please let me know if I'm missing something.

@mariofusco changed the title from "[PoC] Allow to programmatically plug message in the AiServices" to "[Allow to programmatically define system and user messages in the AiServices" on Apr 10, 2024
@mariofusco changed the title from "[Allow to programmatically define system and user messages in the AiServices" to "Allow to programmatically define system and user messages in the AiServices" on Apr 10, 2024
@langchain4j (Owner) left a comment:


@mariofusco thanks a lot, great job!

@langchain4j merged commit 9c69d96 into langchain4j:main on Apr 10, 2024
1 check passed
@mariofusco deleted the pluggable-messages branch on April 10, 2024 13:36
@langchain4j (Owner) commented:

Hi @mariofusco, I am now experimenting with this new feature and covering it with ITs, and it feels to me that we rushed it a bit.
Regarding user messages, there are already multiple ways they can be defined programmatically; here are a few:

String chat(String userMessage);

String chat(@UserMessage String userMessage);

@UserMessage("What is the capital of {{country}}?")
String chat(String country);

So having yet another way (via userMessageProvider) does not seem to bring a lot of value, but it does bring extra complexity to the implementation and the ITs. I am now trying to cover all the combinations in ITs and it is overwhelming.

Regarding system messages, it is a different story, as there is currently no way to define them programmatically except with this hack:

@SystemMessage("{{systemMessage}}")
String chat(@UserMessage String userMessage, @V("systemMessage") String systemMessage);

And I am wondering now whether it is a good idea to introduce another way/style of defining it (via systemMessageProvider), or whether it would be more in line with the current approach to allow doing this:

String chat(@UserMessage String userMessage, @SystemMessage String systemMessage);

WDYT?

@mariofusco (Contributor, Author) commented:

Quick premise: when I started working on this, I was actually focusing on the langchain4j integration provided by Quarkus, so it is indeed possible that the feature I needed was only missing in the Quarkus integration and not here.

That said, I believe that you're mixing things at 2 different levels, or at least this is how it works in the integration use case that I provided with this pull request. When you use one of the 2 methods that I added to the AiServices, you are not defining a complete message but only a template for it. For instance, at the beginning of the interaction my user message provider sets a user message like:

"Ask question to the customer regarding his name and age. +++ {{it}} +++ "

Then, with the agent configured in this way, the application reads the input provided by the user and sends it to that agent:

Scanner scanner = new Scanner(System.in);
String userMessage = scanner.nextLine();
agent.chat(sessionId, userMessage);

There langchain4j replaces the {{it}} placeholder in the message template with the input provided by the user (or with the whole context, if you're using a chat memory) and sends the result of this string manipulation to the underlying LLM.

Finally, it's possible to dynamically change that template, so when the application has finished collecting customer information, it does the same for the flight, and the user message template changes to

"Ask question to the customer regarding the number of the flight and its eventual delay. +++ {{it}} +++ "

If my analysis is correct, then the 2 methods that I added don't overlap with any existing feature, but please correct me if I'm wrong.
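
As a side note, the substitution described here can be reproduced with langchain4j's PromptTemplate; a simplified sketch (not the exact code path that AiServices takes internally):

import dev.langchain4j.model.input.Prompt;
import dev.langchain4j.model.input.PromptTemplate;

public class TemplateRenderingSketch {
    public static void main(String[] args) {
        PromptTemplate template = PromptTemplate.from(
                "Ask question to the customer regarding his name and age. +++ {{it}} +++ ");
        // {{it}} is replaced with the user's input before the prompt reaches the LLM
        Prompt prompt = template.apply("hi");
        System.out.println(prompt.text());
        // prints: Ask question to the customer regarding his name and age. +++ hi +++
    }
}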

@langchain4j (Owner) commented:

That said, I believe that you're mixing things at 2 different levels, or at least this is how it works in the integration use case that I provided with this pull request. When you use one of the 2 methods that I added to the AiServices, you are not defining a complete message but only a template for it.

Yes, I do not make a clear distinction between the complete message and a template. At least for me it is kind of the same thing: if there are placeholders, replace them with values and go further; if there are no placeholders, just send it as-is. Maybe I am wrong and the majority of users see it differently and want a clear distinction...

Currently, when you do it like this, the string in the annotation is treated as a template:

@UserMessage("What is the capital of {{it}}?")
String chat(String country);

But when you do it like this, the userMessage is not treated as a template, which I think is a bug:

String chat(@UserMessage String userMessage, @V("country") String country);

Some people expect that userMessage in this case should be treated as a template, and I think I agree.
So, when this bug is fixed, one will be able to supply the template directly to the method, without needing userMessageProvider.

@mariofusco (Contributor, Author) commented:

But when you do it like this, the userMessage is not treated as a template, which I think is a bug:

String chat(@UserMessage String userMessage, @V("country") String country);

Some people expect that userMessage in this case should be treated as a template, and I think I agree. So, when this bug is fixed, one will be able to supply the template directly to the method, without needing userMessageProvider.

I agree that what you described is a bug (or a missing feature) and has to be fixed.

I disagree (though I understand that this is more questionable and a matter of taste and opinion) that this replaces the need for userMessageProvider. To me, with the userMessageProvider I'm configuring the general behavior of the agent, which can change at a later stage when the application reaches a different state, while when I invoke agent.chat(userMessage) I'm just conveying the actual message received from the end-user to an agent that has been pre-configured to treat the user's message as she expects.

In essence, I'd prefer having a clear separation between the agent configuration and its usage, as allowed by the userMessageProvider, but I understand that this introduces some additional complexity and it's up to you to decide whether it is justified or not. In case you decide to keep this feature and need any help increasing its test coverage, just let me know.

@langchain4j (Owner) commented:

To me, with the userMessageProvider I'm configuring the general behavior of the agent, which can change at a later stage when the application reaches a different state, while when I invoke agent.chat(userMessage) I'm just conveying the actual message received from the end-user to an agent that has been pre-configured to treat the user's message as she expects.

I see what you mean, but to configure the general behavior of the agent, the system message is usually used. The user message is an actual message from the user and in theory should not contain any instructions on how the agent should behave.
With this in mind, I would expect the user message to be supplied directly to the agent as a method argument, and the system message via the @SystemMessage annotation or systemMessageProvider.

In essence, I'd prefer having a clear separation between the agent configuration and its usage, as allowed by the userMessageProvider

I agree on "having a clear separation between the agent configuration and its usage", but for the "agent configuration" I would expect the system message, and for "its usage" the user message.

I understand that this introduces some additional complexity and it's up to you to decide whether it is justified or not.

I already see how this becomes more and more complicated, so I would prefer to reduce the complexity where possible. For now I propose to keep systemMessageProvider and remove userMessageProvider; we can always bring it back if needed. Hope you are ok with this.

In case you decide to keep this feature and need any help increasing its test coverage, just let me know.

Thanks a lot, but no worries, I am already finalizing this.

@mariofusco (Contributor, Author) commented:

I already see how this becomes more and more complicated, so I would prefer to reduce the complexity where possible. For now I propose to keep systemMessageProvider and remove userMessageProvider; we can always bring it back if needed. Hope you are ok with this.

Big +1 on this! I was speaking about the user and system messages interchangeably, assuming that my considerations were valid for both, but they are indeed 2 different things with different purposes, and what I needed to configure with the state machine in my PoC is indeed the system message.
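
For reference, this is roughly what the retained feature looks like from the caller's side (a sketch assuming the systemMessageProvider builder method discussed here; the Assistant interface and the model wiring are illustrative, not part of this PR):

import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.service.AiServices;

interface Assistant {
    String chat(String userMessage);
}

class SystemMessageProviderSketch {
    static Assistant create(ChatLanguageModel model) {
        return AiServices.builder(Assistant.class)
                .chatLanguageModel(model)
                // the provider receives the chat memory ID, so the system message
                // can change per conversation and per application state
                .systemMessageProvider(memoryId ->
                        "Ask questions to the customer regarding his name and age.")
                .build();
    }
}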

@langchain4j changed the title from "Allow to programmatically define system and user messages in the AiServices" to "Allow to programmatically define system messages in the AiServices" on Apr 12, 2024
@langchain4j added a commit that referenced this pull request on Apr 16, 2024:
## Change
- testing, refactoring and documentation after #862, #930
- `userMessageProvider` is removed as discussed in #862

## Checklist
Before submitting this PR, please check the following points:
- [X] I have added unit and integration tests for my change
- [X] All unit and integration tests in the module I have added/changed
are green
- [X] All unit and integration tests in the
[core](https://github.com/langchain4j/langchain4j/tree/main/langchain4j-core)
and
[main](https://github.com/langchain4j/langchain4j/tree/main/langchain4j)
modules are green
- [X] I have added/updated the
[documentation](https://github.com/langchain4j/langchain4j/tree/main/docs/docs)
- [ ] I have added an example in the [examples
repo](https://github.com/langchain4j/langchain4j-examples) (only for
"big" features)
- [ ] I have added my new module in the
[BOM](https://github.com/langchain4j/langchain4j/blob/main/langchain4j-bom/pom.xml)
(only when a new module is added)

## Checklist for adding new embedding store integration
- [ ] I have added a {NameOfIntegration}EmbeddingStoreIT that extends
from either EmbeddingStoreIT or EmbeddingStoreWithFilteringIT