Assistants #422

Merged 26 commits on Jan 6, 2024
8 changes: 4 additions & 4 deletions Gemfile.lock
@@ -151,7 +151,7 @@ GEM
    eqn (1.6.5)
      treetop (>= 1.2.0)
    erubi (1.12.0)
-    event_stream_parser (0.3.0)
+    event_stream_parser (1.0.0)
    faraday (2.7.12)
      base64
      faraday-net_http (>= 2.0, < 3.1)
@@ -352,8 +352,8 @@ GEM
    ruby-next-core (0.15.3)
    ruby-next-parser (3.1.1.3)
      parser (>= 3.0.3.1)
-    ruby-openai (6.1.0)
-      event_stream_parser (>= 0.3.0, < 1.0.0)
+    ruby-openai (6.3.1)
+      event_stream_parser (>= 0.3.0, < 2.0.0)
      faraday (>= 1)
      faraday-multipart (>= 1)
    ruby-progressbar (1.13.0)
@@ -453,7 +453,7 @@ DEPENDENCIES
  roo (~> 2.10.0)
  rspec (~> 3.0)
  rubocop
-  ruby-openai (~> 6.1.0)
+  ruby-openai (~> 6.3.0)
  safe_ruby (~> 1.0.4)
  sequel (~> 5.68.0)
  standardrb
59 changes: 41 additions & 18 deletions README.md
@@ -30,6 +30,7 @@ Available for paid consulting engagements! [Email me](mailto:andrei@sourcelabs.i
- [Output Parsers](#output-parsers)
- [Building RAG](#building-retrieval-augment-generation-rag-system)
- [Building chat bots](#building-chat-bots)
+- [Assistants](#assistants)
- [Evaluations](#evaluations-evals)
- [Examples](#examples)
- [Logging](#logging)
@@ -73,7 +74,7 @@ Langchain.rb wraps all supported LLMs in a unified interface allowing you to eas

#### OpenAI

Add `gem "ruby-openai", "~> 6.1.0"` to your Gemfile.
Add `gem "ruby-openai", "~> 6.3.0"` to your Gemfile.

```ruby
llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
@@ -405,43 +406,65 @@ client.ask(
)
```

-## Building chat bots
+## Evaluations (Evals)
+The Evaluations module is a collection of tools that can be used to evaluate and track the performance of the output produced by LLMs and your RAG (Retrieval Augmented Generation) pipelines.

-### Conversation class
+## Assistants
+Assistants are Agent-like objects that leverage helpful instructions, LLMs, tools and knowledge to respond to user queries. Assistants can be configured with an LLM of your choice (currently only OpenAI) and any vector search database, and can easily be extended with additional tools.

-Choose and instantiate the LLM provider you'll be using:
+### Creating an Assistant
+1. Instantiate an LLM of your choice
```ruby
llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
```
-Instantiate the Conversation class:
+2. Instantiate a Thread. Threads keep track of the messages in the Assistant conversation.
```ruby
+thread = Langchain::Thread.new
```
+You can pass in messages from a previous session to continue the conversation with an Assistant:
+```ruby
+thread.messages = messages
+```
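A minimal sketch of seeding a thread, assuming `Langchain::Message` accepts the same keyword arguments the Assistant passes to it internally (`role:`, `content:`, `tool_calls:`, `tool_call_id:`):

```ruby
# Hypothetical example: rebuild a prior conversation before handing it to the Assistant.
messages = [
  Langchain::Message.new(role: "user", content: "What's the weather in New York City?"),
  Langchain::Message.new(role: "assistant", content: "It's 70 degrees and sunny in New York City.")
]
thread.messages = messages
```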
+3. Instantiate an Assistant
+```ruby
+assistant = Langchain::Assistant.new(
+  llm: llm,
+  thread: thread,
+  instructions: "You are a Meteorologist Assistant that is able to pull the weather for any location",
+  tools: [
+    Langchain::Tool::GoogleSearch.new(api_key: ENV["SERPAPI_API_KEY"])
+  ]
+)
+```
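The `tools` array may hold any number of `Langchain::Tool::Base` instances; the constructor validates this. A sketch with a second tool, assuming the calculator tool that ships with the library (its `eqn` dependency appears in the Gemfile.lock above):

```ruby
# Hypothetical example: an Assistant that can both search the web and do arithmetic.
assistant = Langchain::Assistant.new(
  llm: llm,
  thread: thread,
  instructions: "You are a helpful Assistant",
  tools: [
    Langchain::Tool::GoogleSearch.new(api_key: ENV["SERPAPI_API_KEY"]),
    Langchain::Tool::Calculator.new
  ]
)
```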
+### Using an Assistant
+You can now add your messages to the Assistant.
```ruby
-chat = Langchain::Conversation.new(llm: llm)
+assistant.add_message content: "What's the weather in New York City?"
```

-(Optional) Set the conversation context:
+Run the Assistant to generate a response.
```ruby
-chat.set_context("You are a chatbot from the future")
+assistant.run
```
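The implementation (see `lib/langchain/assistants/assistant.rb` below) also defines an `add_message_and_run` convenience method that combines the two steps:

```ruby
assistant.add_message_and_run content: "What's the weather in New York City?", auto_tool_execution: true
```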

-Exchange messages with the LLM
+If a Tool is invoked, you can manually submit its output.
```ruby
-chat.message("Tell me about future technologies")
+assistant.submit_tool_output tool_call_id: "...", output: "It's 70 degrees and sunny in New York City"
```

-To stream the chat response:
+Or run the assistant with `auto_tool_execution: true` to call Tools automatically.
```ruby
-chat = Langchain::Conversation.new(llm: llm) do |chunk|
-  print(chunk)
-end
+assistant.add_message content: "How about San Diego, CA?"
+assistant.run(auto_tool_execution: true)
```

-Open AI Functions support
+### Accessing Thread messages
+You can access the messages in a Thread by calling `assistant.thread.messages`.
```ruby
-chat.set_functions(functions)
+assistant.thread.messages
```
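For example, a minimal sketch that prints the conversation so far, assuming each message exposes `role` and `content` readers (the attributes the Assistant passes to `Message.new`):

```ruby
assistant.thread.messages.each do |message|
  puts "#{message.role}: #{message.content}"
end
```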

-## Evaluations (Evals)
-The Evaluations module is a collection of tools that can be used to evaluate and track the performance of the output produced by LLMs and your RAG (Retrieval Augmented Generation) pipelines.
+The Assistant checks the context window limits before every request to the LLM and removes the oldest thread messages one by one if the context window is exceeded.
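The commented-out `build_assistant_prompt` sketch at the bottom of `assistant.rb` hints at the intended mechanism; paraphrased here with `retry`, and assuming the validator and error class named there:

```ruby
begin
  # Raises if thread.messages exceed the model's context window
  llm.class.const_get(:LENGTH_VALIDATOR).validate_max_tokens!(
    thread.messages,
    llm.defaults[:chat_completion_model_name],
    {llm: llm}
  )
rescue Langchain::Utils::TokenLength::TokenLimitExceeded
  # Drop the oldest message and re-check until the prompt fits
  thread.messages.shift
  retry
end
```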

### RAGAS
Ragas helps you evaluate your Retrieval Augmented Generation (RAG) pipelines. The implementation is based on this [paper](https://arxiv.org/abs/2309.15217) and the original Python [repo](https://github.com/explodinggradients/ragas). Ragas tracks the following 3 metrics, assigning each a score from 0.0 to 1.0:
2 changes: 1 addition & 1 deletion langchain.gemspec
@@ -68,7 +68,7 @@ Gem::Specification.new do |spec|
spec.add_development_dependency "replicate-ruby", "~> 0.2.2"
spec.add_development_dependency "qdrant-ruby", "~> 0.9.4"
spec.add_development_dependency "roo", "~> 2.10.0"
spec.add_development_dependency "ruby-openai", "~> 6.1.0"
spec.add_development_dependency "ruby-openai", "~> 6.3.0"
spec.add_development_dependency "safe_ruby", "~> 1.0.4"
spec.add_development_dependency "sequel", "~> 5.68.0"
spec.add_development_dependency "weaviate-ruby", "~> 0.8.9"
3 changes: 2 additions & 1 deletion lib/langchain.rb
@@ -24,6 +24,7 @@
"sql_query_agent" => "SQLQueryAgent"
)
loader.collapse("#{__dir__}/langchain/llm/response")
loader.collapse("#{__dir__}/langchain/assistants")
loader.setup

# Langchain.rb is a library for building LLM-backed Ruby applications. It is an abstraction layer that sits on top of the emerging AI-related tools and makes it easy for developers to consume and string those services together.
@@ -82,7 +83,7 @@ def logger=(logger)
  attr_reader :root
end

-self.logger ||= ::Logger.new($stdout, level: :warn)
+self.logger ||= ::Logger.new($stdout, level: :debug)
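# Applications can override this default via the `Langchain.logger=` writer defined above, e.g.:
#   Langchain.logger = Logger.new($stdout, level: :warn)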

@root = Pathname.new(__dir__)

196 changes: 196 additions & 0 deletions lib/langchain/assistants/assistant.rb
@@ -0,0 +1,196 @@
# frozen_string_literal: true

module Langchain
  class Assistant
    attr_reader :llm, :thread, :instructions
    attr_accessor :tools

    # Create a new assistant
    #
    # @param llm [Langchain::LLM::Base] LLM instance that the assistant will use
    # @param thread [Langchain::Thread] The thread that'll keep track of the conversation
    # @param tools [Array<Langchain::Tool::Base>] Tools that the assistant has access to
    # @param instructions [String] The system instructions to include in the thread
    def initialize(
      llm:,
      thread:,
      tools: [],
      instructions: nil
    )
      raise ArgumentError, "Invalid LLM; currently only Langchain::LLM::OpenAI is supported" unless llm.instance_of?(Langchain::LLM::OpenAI)
      raise ArgumentError, "Thread must be an instance of Langchain::Thread" unless thread.is_a?(Langchain::Thread)
      raise ArgumentError, "Tools must be an array of Langchain::Tool::Base instance(s)" unless tools.is_a?(Array) && tools.all? { |tool| tool.is_a?(Langchain::Tool::Base) }

      @llm = llm
      @thread = thread
      @tools = tools
      @instructions = instructions

      # The first message in the thread should be the system instructions
      # TODO: What if the user added old messages and the system instructions are already in there? Should this overwrite the existing instructions?
      add_message(role: "system", content: instructions) if instructions
    end

    # Add a user message to the thread
    #
    # @param content [String] The content of the message
    # @param role [String] The role attribute of the message. Default: "user"
    # @param tool_calls [Array<Hash>] The tool calls to include in the message
    # @param tool_call_id [String] The ID of the tool call to include in the message
    # @return [Array<Langchain::Message>] The messages in the thread
    def add_message(content: nil, role: "user", tool_calls: [], tool_call_id: nil)
      message = build_message(role: role, content: content, tool_calls: tool_calls, tool_call_id: tool_call_id)
      thread.add_message(message)
    end

    # Run the assistant
    #
    # @param auto_tool_execution [Boolean] Whether or not to automatically run tools
    # @return [Array<Langchain::Message>] The messages in the thread
    def run(auto_tool_execution: false)
      running = true

      # Keep looping until there is nothing left to do (no pending user message or tool call)
      while running
        # TODO: I think we need to look at all messages and not just the last one.
        case (last_message = thread.messages.last).role
        when "system"
          # Do nothing
          running = false
        when "assistant"
          if last_message.tool_calls.any?
            if auto_tool_execution
              run_tools(last_message.tool_calls)
            else
              # Maybe log and tell the user that there are outstanding tool calls?
              running = false
            end
          else
            # Last message was from the assistant without any tool calls.
            # Do nothing
            running = false
          end
        when "user"
          # Run it!
          response = chat_with_llm

          if response.tool_calls
            # Re-run the while(running) loop to process the tool calls
            running = true
            add_message(role: response.role, tool_calls: response.tool_calls)
          elsif response.chat_completion
            # Stop the while(running) loop and add the assistant's response to the thread
            running = false
            add_message(role: response.role, content: response.chat_completion)
          end
        when "tool"
          # Run it!
          response = chat_with_llm
          running = true

          if response.tool_calls
            add_message(role: response.role, tool_calls: response.tool_calls)
          elsif response.chat_completion
            add_message(role: response.role, content: response.chat_completion)
          end
        end
      end

      thread.messages
    end

    # Add a user message to the thread and run the assistant
    #
    # @param content [String] The content of the message
    # @param auto_tool_execution [Boolean] Whether or not to automatically run tools
    # @return [Array<Langchain::Message>] The messages in the thread
    def add_message_and_run(content:, auto_tool_execution: false)
      add_message(content: content, role: "user")
      run(auto_tool_execution: auto_tool_execution)
    end

    # Submit tool output to the thread
    #
    # @param tool_call_id [String] The ID of the tool call to submit output for
    # @param output [String] The output of the tool
    # @return [Array<Langchain::Message>] The messages in the thread
    def submit_tool_output(tool_call_id:, output:)
      # TODO: Validate that `tool_call_id` is valid
      add_message(role: "tool", content: output, tool_call_id: tool_call_id)
    end

    private

    # Call to the LLM#chat() method
    #
    # @return [Langchain::LLM::BaseResponse] The LLM response object
    def chat_with_llm
      llm.chat(
        messages: thread.openai_messages,
        tools: tools.map(&:to_openai_tool),
        # TODO: Not sure that tool_choice should always be "auto"; Maybe we can let the user toggle it.
        tool_choice: "auto"
      )
    end

    # Run the tools automatically
    #
    # @param tool_calls [Array<Hash>] The tool calls to run
    def run_tools(tool_calls)
      # Iterate over each function invocation and submit tool output
      tool_calls.each do |tool_call|
        tool_call_id = tool_call.dig("id")
        tool_name = tool_call.dig("function", "name")
        tool_arguments = JSON.parse(tool_call.dig("function", "arguments"), symbolize_names: true)

        tool_instance = tools.find do |t|
          t.name == tool_name
        end or raise ArgumentError, "Tool not found in assistant.tools"

        output = tool_instance.execute(**tool_arguments)

        submit_tool_output(tool_call_id: tool_call_id, output: output)
      end

      response = chat_with_llm

      if response.tool_calls
        add_message(role: response.role, tool_calls: response.tool_calls)
      elsif response.chat_completion
        add_message(role: response.role, content: response.chat_completion)
      end
    end

    # Build a message
    #
    # @param role [String] The role of the message
    # @param content [String] The content of the message
    # @param tool_calls [Array<Hash>] The tool calls to include in the message
    # @param tool_call_id [String] The ID of the tool call to include in the message
    # @return [Langchain::Message] The Message object
    def build_message(role:, content: nil, tool_calls: [], tool_call_id: nil)
      Message.new(role: role, content: content, tool_calls: tool_calls, tool_call_id: tool_call_id)
    end

    # TODO: Fix the message truncation when context window is exceeded
    # def build_assistant_prompt(instructions:, tools:)
    #   while begin
    #     # Check if the prompt exceeds the context window
    #     # Return false to exit the while loop
    #     !llm.class.const_get(:LENGTH_VALIDATOR).validate_max_tokens!(
    #       thread.messages,
    #       llm.defaults[:chat_completion_model_name],
    #       {llm: llm}
    #     )
    #   # Rescue error if context window is exceeded and return true to continue the while loop
    #   rescue Langchain::Utils::TokenLength::TokenLimitExceeded
    #     # Should be using `retry` instead of while()
    #     true
    #   end
    #     # Truncate the oldest messages when the context window is exceeded
    #     thread.messages.shift
    #   end

    #   prompt
    # end
  end
end