GoogleGenAIAPI: Implement a GoogleGenAIAPI backend that can be used to connect to Gemini on both the Gemini Developer API and Vertex AI via the new unified Google Gen AI SDK. This new backend subsumes OneTwo's existing GeminiAPI and VertexAIAPI backends, while providing additional functionality: control over the model's thinking config, multimodal output, support for the tokenize operation, configurable retry delays, smarter retry based on the specific error code, more efficient parallelization, more flexible control over the system instruction, and better control over the operations for which the backend is registered and the default parameters to be applied.
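As a rough illustration of what retrying based on the specific error code with configurable delays can look like, here is a standalone sketch using only the standard library. It is not the GoogleGenAIAPI implementation: `ApiError`, `RETRIABLE_CODES`, `call_with_retry`, and all parameter names are hypothetical, and the set of codes treated as retriable is an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: transient HTTP-style codes worth retrying; a real backend may differ.
RETRIABLE_CODES = frozenset({429, 500, 503})


class ApiError(Exception):
    """Hypothetical error type carrying a numeric status code."""

    def __init__(self, code: int, message: str = ""):
        super().__init__(f"{code}: {message}")
        self.code = code


def call_with_retry(
    fn: Callable[[], T],
    max_retries: int = 3,
    initial_delay: float = 1.0,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Calls fn, retrying only on retriable codes with exponentially growing delays."""
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except ApiError as e:
            # Non-retriable codes (e.g. 400) and exhausted retry budgets fail fast.
            if e.code not in RETRIABLE_CODES or attempt == max_retries:
                raise
            sleep(delay)
            delay *= backoff
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the sketch easy to test without real waiting.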
Core
Structured Output: Introduce enhanced support for structured output generation across various backends with the generate_object method. This feature enables you to provide a Python type, dataclass, or Pydantic model, and have the backend return a response that conforms to the specified structure.
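To make "conforms to the specified structure" concrete, here is a minimal standalone sketch that checks a JSON reply against a flat dataclass. It does not use or reproduce generate_object, whose signature is not shown in these notes; `City` and `parse_as` are hypothetical names, and the check only handles flat dataclasses whose fields are annotated with plain classes.

```python
import dataclasses
import json
from typing import Any, Type, TypeVar

T = TypeVar("T")


@dataclasses.dataclass
class City:
    """Hypothetical target structure for a model reply."""

    name: str
    population: int


def parse_as(cls: Type[T], json_text: str) -> T:
    """Parses a JSON object into a flat dataclass, checking field names and types."""
    data: Any = json.loads(json_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    fields = {f.name: f.type for f in dataclasses.fields(cls)}
    if set(data) != set(fields):
        raise ValueError(f"expected fields {sorted(fields)}, got {sorted(data)}")
    for name, expected in fields.items():
        if not isinstance(data[name], expected):
            raise ValueError(f"field {name!r} is not of type {expected.__name__}")
    return cls(**data)
```

For example, `parse_as(City, '{"name": "Zurich", "population": 400000}')` yields a `City` instance, while a reply with a missing field raises `ValueError`.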
Multimodal Output: Introduce a new operation llm.generate_content that subsumes the functionality of llm.chat, but returns a ChunkList rather than a string, so as to allow for multimodal output. Ensure that return values of type ChunkList are handled properly throughout the core code, including, for example, the caching layer. Provide support for returning details such as thoughts via the include_details=True option. Implement concrete support for this operation in the GoogleGenAIAPI backend.
Parallelism: Provide a max_parallel_executions context manager that allows overriding the maximum number of parallel execution branches, and improve the efficiency of the parallel execution code so that the worker pool is constantly kept full up to the specified max_parallel_executions. Previously, it was only possible to maintain up to 100 simultaneous LLM requests; by adjusting the max_parallel_executions value, it is now possible in some cases to achieve as many as 10,000 LLM requests in parallel within a single process.
Agents
Search Tool (Agent as a tool): Using the GoogleGenAIAPI backend, we provide an improved search tool by wrapping an LLM call that leverages the Google Generative AI SDK's built-in search functionality. This demonstrates how easily custom tools can be created in OneTwo, even by encapsulating complex operations like LLM calls, effectively creating an "Agent as a tool".
Standard library
Sandbox: Enhance sandboxing capabilities with the introduction of PythonSandboxSafeSubsetMultiProcess, which wraps the existing PythonSandboxSafeSubset but executes the code in a separate, persistent process using Python's standard multiprocessing library. This new sandbox offers true parallelism and greater robustness against crashes or hangs, especially when executing LLM-generated code in parallel.
Evaluation
Fuzzy LLM critic: Implement a naive_fuzzy_evaluation_critic that is similar to naive_evaluation_critic, except that it returns a continuous score between 0 and 1 rather than a binary rating.
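The difference between a continuous score and a binary rating can be illustrated with a small standalone sketch. It does not reproduce either critic: `parse_fuzzy_score` and `binarize` are hypothetical helpers, and the assumption that a critic's reply contains a plain number is made only for this example.

```python
import re


def parse_fuzzy_score(reply: str) -> float:
    """Extracts the first number from a critic reply and clamps it to [0, 1]."""
    match = re.search(r"-?\d+(?:\.\d+)?", reply)
    if match is None:
        raise ValueError(f"no numeric score in reply: {reply!r}")
    return min(1.0, max(0.0, float(match.group())))


def binarize(score: float, threshold: float = 0.5) -> bool:
    """Collapses a continuous score into a binary rating."""
    return score >= threshold
```

A continuous score preserves partial credit (for instance 0.75 for a mostly correct answer) that a binary rating would round to a plain pass.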
Visualization
Make HTMLRenderer more robust to rendering of objects with cyclic references.
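One common way to make a recursive renderer safe against cyclic references is to track the ids of the containers currently being rendered and stop when one reappears. The sketch below shows that technique in isolation; `render_html` is a hypothetical function and is not HTMLRenderer's actual code.

```python
import html
from typing import Any, FrozenSet


def render_html(obj: Any, _path: FrozenSet[int] = frozenset()) -> str:
    """Renders nested lists/tuples/dicts as HTML lists, marking cyclic references."""
    if isinstance(obj, (list, tuple, dict)):
        if id(obj) in _path:
            # The object is one of its own ancestors: stop instead of recursing.
            return "<em>[cyclic reference]</em>"
        path = _path | {id(obj)}
        if isinstance(obj, dict):
            items = [
                f"<li>{html.escape(str(k))}: {render_html(v, path)}</li>"
                for k, v in obj.items()
            ]
        else:
            items = [f"<li>{render_html(v, path)}</li>" for v in obj]
        return "<ul>" + "".join(items) + "</ul>"
    return html.escape(str(obj))
```

Tracking only the current ancestor path, rather than every object seen so far, means an object that is merely shared between two branches is still rendered in full in both places.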
Documentation
Update the tutorial colab to support the latest Gemini and OpenAI models and the GoogleGenAIAPI backend, and to illustrate best practices for structured output generation. The update includes, among other things, new sections illustrating the use of LLMs as a tool in ReActAgent and PythonPlanningAgent, and sandbox selection.
Other
Fix the Multimodal section of the Colab tutorial: add images directly to the codebase, with licensing and source details tracked in image_provenance.py. Also resolve issues with image data handling in the GoogleGenAIAPI backend.
Various other bug fixes and incremental improvements to caching, exception handling, tracing, ReActAgent, PythonPlanningAgent, evaluation summary output, conversion and formatting of chunks and chunk lists, test utilities, type annotations, and the GeminiAPI and VertexAIAPI backends.