Added ChatMessage class to generalize chat interface #652
base: main
Conversation
module Messages
  # Generic messages that can be used in the `chat` method of the LLM adapter
  # It is recommended that `assistants/messages/*_messages.rb` be unified with this one in the future.
  class ChatMessage
Does it make sense to simplify the namespace structure to just Langchain::ChatMessage?
I had placed it in a namespace aligned with lib/langchain/assistants/messages/openai_message.rb. I think it should be fine to put it at the root level.
end

def to_hash
  {role: role.to_s, content:}
How do you envision we would account for cases like this?
While I don't intend to go that far within the scope of this PR, if I expand in the future, wouldn't it be desirable to split ChatMessage into two classes, such as ChatMessage::Basic and ChatMessage::ToolCall?
The former would be a simple class that only holds role and content, while the latter would be a class that additionally holds information necessary for tool calls.
By defining to_hash for both classes, they can be handled transparently. For parts that cannot be fully handled by the default to_hash, each LLM adapter can absorb them. This way, I believe we can reduce the need for individual LLM support within ChatMessage and Assistant.
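The split described above might look roughly like this (a minimal sketch; the class bodies and attribute names are illustrative, not from the PR):

```ruby
module ChatMessage
  # Simple message: holds only role and content
  class Basic
    attr_reader :role, :content

    def initialize(role:, content:)
      @role = role
      @content = content
    end

    def to_hash
      {role: role.to_s, content: content}
    end
  end

  # Message that additionally holds the data a tool call needs
  class ToolCall < Basic
    attr_reader :tool_call_id

    def initialize(role:, content:, tool_call_id:)
      super(role: role, content: content)
      @tool_call_id = tool_call_id
    end

    def to_hash
      super.merge(tool_call_id: tool_call_id)
    end
  end
end

# Both classes respond to #to_hash, so callers can treat them uniformly:
messages = [
  ChatMessage::Basic.new(role: :user, content: "Hello"),
  ChatMessage::ToolCall.new(role: :tool, content: "42", tool_call_id: "call_1")
]
messages.map(&:to_hash)
```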
> The former would be a simple class that only holds role and content, while the latter would be a class that additionally holds information necessary for tool calls.
Take a look at, for example, how tool results are sent to Anthropic (inside the content key): https://github.com/patterns-ai-core/langchainrb/blob/main/lib/langchain/assistants/messages/anthropic_message.rb#L32-L39
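To make the concern concrete: in the Anthropic Messages API, a tool result is not a top-level key on the message but a structured entry inside the `content` array of a "user"-role message. A rough sketch of that shape (the wrapper method name is illustrative, not from the linked file):

```ruby
# Builds the Anthropic-style hash for a tool result. Note that the
# result is nested inside `content` rather than stored in a top-level
# `tool_call_id` key, which is why a single generic to_hash is hard.
def tool_result_to_anthropic_hash(tool_call_id:, result:)
  {
    role: "user",
    content: [
      {
        type: "tool_result",
        tool_use_id: tool_call_id,
        content: result
      }
    ]
  }
end
```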
I think for the purposes of this PR I would just collapse:
Langchain::Messages::AnthropicMessage
Langchain::Messages::GoogleGeminiMessage
Langchain::Messages::OpenAIMessage
into =>
class Langchain::ChatMessage
def initialize()
def to_openai_hash()
def to_google_gemini_hash()
def to_anthropic_hash()
def to_hash()
end
What do you think? I can take a stab at it!
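Spelled out, the collapsed class might look something like this (a sketch under the assumption that role/content are enough for the basic case; the per-LLM bodies are illustrative):

```ruby
module Langchain
  class ChatMessage
    attr_reader :role, :content

    def initialize(role:, content:)
      @role = role
      @content = content
    end

    # Default shape, which already matches what OpenAI expects
    def to_hash
      {role: role.to_s, content: content}
    end
    alias_method :to_openai_hash, :to_hash

    def to_anthropic_hash
      {role: role.to_s, content: content}
    end

    # Gemini calls the assistant role "model" and wraps text in `parts`
    def to_google_gemini_hash
      {role: role == :assistant ? "model" : role.to_s,
       parts: [{text: content}]}
    end
  end
end
```

Each LLM adapter would then call the variant it needs, keeping the provider differences out of Assistant.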
I was not very familiar with function calling, so I was surprised to find that the message formats are completely incompatible. Although I haven't been able to check all the documents, is my understanding below correct?
- In messages where the LLM instructs the use of a tool, the details of the tool to be called are specified in tool_calls (without using content or tool_call_id)
- In messages representing the result of the Assistant calling a tool as instructed, the call ID is stored in tool_call_id, and the result of the call is stored in content (without using tool_calls)
- While the above concepts are generally common among LLMs, the hash keys and hash formats differ
If this understanding is correct, couldn't we generalize it by creating classes like the following? (The namespace and implementation details of the classes are shown in a very simplified manner.)
# Message for LLM to instruct the use of a tool
class ToolCalls < ChatMessage::Base
  # Single tool call
  ToolCall = Data.define(:id, :tool_name, :method_name, :arguments)

  def initialize(tool_calls:)
    @tool_calls = tool_calls
  end
  # ...
end

# Message representing the result of a tool call
class ToolCallResult < ChatMessage::Base
  def initialize(tool_call_id:, result:)
    @tool_call_id = tool_call_id
    @result = result
  end
  # ...
end
Since it seems difficult to generalize to_hash, how about performing the conversion for each LLM model as needed?
class LLM::OpenAI
  def chat(params = {})
    if params[:messages].all? { _1.is_a?(Langchain::Messages::ChatMessage) }
      convert_chat_messages!(params)
    end
    # ...
  end

  def convert_chat_messages!(params)
    params[:messages] = params[:messages].map do |message|
      case message
      in ToolCalls(tool_calls:)
        tool_calls.map { tool_call_to_hash(_1) }
      in ToolCallResult(tool_call_id:, result:)
        {role: "tool", tool_call_id:, content: result}
      else
        message.to_h
      end
    end
  end
end
Although the details need a bit more consideration, I am thinking of a policy to make the Messages structure as common as possible in this way and hide the differences among LLMs within each LLM class. What do you think?
class ChatMessage
  attr_reader :role, :content

  ROLES = [:system, :user, :assistant].freeze
These roles would be hidden, right? The user would only use these 3 when creating an instance, right? I think we'd still need one for :tool.
I overlooked that because I hadn't thoroughly investigated LLMs other than OpenAI and Anthropic. In Gemini, it's model instead of assistant, right?
To maximize compatibility between LLMs, I think it's best not to allow the model role and have the LLM adapter layer convert the assistant role.
As mentioned earlier, I believe it's best to handle tool or function with specialized classes, since the data they hold is different.
However, since this is an area where the differences in specifications between LLMs are significant, would it be okay to focus on supporting only user, system, and assistant roles in this PR for now?
I believe there will be no functional issues with proceeding incrementally, since the interface still accepts the previous array of raw Hashes.
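That backward compatibility can be kept with a small normalization step in the adapter, sketched below (the Struct stands in for the proposed ChatMessage class; the method name is illustrative):

```ruby
# Stand-in for the ChatMessage class proposed in this PR
ChatMessage = Struct.new(:role, :content, keyword_init: true) do
  def to_hash
    {role: role.to_s, content: content}
  end
end

# The adapter accepts either the previous raw Hashes or ChatMessage
# instances and normalizes everything to Hashes before the API call.
def normalize_messages(messages)
  messages.map { |m| m.is_a?(Hash) ? m : m.to_hash }
end

mixed = [
  {role: "system", content: "You are helpful."},
  ChatMessage.new(role: :user, content: "Hello")
]
normalize_messages(mixed)  # both entries come out as plain Hashes
```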
Description
This PR is the first prototype based on discussion #629.
The purpose is to make it possible to use ChatMessage instead of raw Hash for chat messages, and to generalize it so that it can be used regardless of LLM.
As a first step, this PR aims to handle the basic roles: system, user, and assistant.