Add multiple modalities: tools, functions, json_mode #218

Merged
merged 6 commits into main from multiple-modalities
Nov 25, 2023

Conversation

jxnl
Owner

@jxnl jxnl commented Nov 23, 2023

git checkout multiple-modalities
pip install -e .

Summary by CodeRabbit

  • New Features

    • Enhanced client functionality for backward compatibility with new keywords and structured output modes.
    • Introduced structured output methods for improved interaction with the OpenAI client.
  • Documentation

    • Updated documentation to include new keywords and structured output examples.
  • Tests

    • Added new tests for validating the OpenAI client's user information extraction capabilities.
    • Implemented tests for the new structured output modes and response model handling.
  • Refactor

    • Refactored existing functions to support new modes for handling response models and retries.

Contributor

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 8

Configuration used: CodeRabbit UI

Commits reviewed: files that changed from the base of the PR, between 7de55a9 and 2c7942f.
Files selected for processing (4)
  • instructor/function_calls.py (3 hunks)
  • instructor/patch.py (12 hunks)
  • tests/openai/test_modes.py (1 hunks)
  • tests/openai/test_patch.py (1 hunks)
Additional comments: 10
tests/openai/test_modes.py (3)
  • 7-10:

The UserExtract class is well-defined and follows the schema pattern expected for OpenAI API interactions.

  • 12-39:

The test case test_tool_call correctly tests the tool call functionality, including the extraction of user information and validation of the tool call details.

  • 42-60:

The test case test_json_mode correctly tests the JSON mode functionality, including the extraction of user information and validation against the expected JSON schema.

instructor/function_calls.py (1)
  • 8-16: The PatchMode enum is well-defined and follows Python's enum module conventions. The members are correctly assigned string values.
tests/openai/test_patch.py (6)
  • 6-7:
    The import statements are correctly updated to reflect the new dependencies.

  • 9-9:
    The import of PatchMode is correctly added to support the new test cases.

  • 19-37:
    The test_runmodel function is correctly updated to test different modes of operation.

  • 41-50:
    The UserExtractValidated class is correctly defined with a field validator for the name field.

  • 52-69:
    The test_runmodel_validator function is correctly updated to test different modes of operation with a validated response model.

  • 72-90:
    The test_runmodel_async_validator function is correctly updated to test different modes of operation with a validated response model in an asynchronous context.

Comment on lines 7 to 14
from openai.types.chat import ChatCompletion, ChatCompletionMessage
from pydantic import BaseModel, ValidationError

from .function_calls import OpenAISchema, openai_schema
from .function_calls import OpenAISchema, openai_schema, PatchMode

import logging

logger = logging.getLogger(__name__)
Contributor

The import of PatchMode and the logging module are correctly added as per the change summary. However, ensure that the logging module is used consistently throughout the file instead of print statements for logging purposes.
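For illustration, a minimal sketch of the pattern being asked for; log_retry is a hypothetical helper, not code from this PR:

import logging

logger = logging.getLogger(__name__)

def log_retry(retries: int, error: Exception) -> None:
    # Prefer the module-level logger over print() so callers can control verbosity.
    logger.warning("Retrying after %s, attempt %d", error, retries)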

Comment on lines 139 to 140
except (ValidationError, JSONDecodeError) as e:
kwargs["messages"].append(response.choices[0].message) # type: ignore
Contributor

The retry_async function attempts to append a message to kwargs["messages"] after catching an exception. However, this assumes that kwargs will always contain a messages key with a list value, which may not be the case. This could raise a KeyError or TypeError if messages is not present or not a list. The code should check for the existence and type of messages before appending.
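One possible shape of that check, written as a hypothetical helper (append_retry_message is not part of the PR):

def append_retry_message(kwargs: dict, message) -> None:
    # Guard against a missing or non-list "messages" entry before appending the retry context.
    messages = kwargs.setdefault("messages", [])
    if not isinstance(messages, list):
        raise TypeError("kwargs['messages'] must be a list of messages")
    messages.append(message)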

Contributor

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 11

Configuration used: CodeRabbit UI

Commits reviewed: files that changed from the base of the PR, between 2c7942f and de9dcd0.
Files selected for processing (4)
  • instructor/patch.py (12 hunks)
  • tests/openai/test_modes.py (1 hunks)
  • tests/openai/test_patch.py (1 hunks)
  • tests/test_validators.py (1 hunks)
Files skipped from review due to trivial changes (1)
  • tests/openai/test_modes.py
Additional comments: 5
tests/test_validators.py (3)
  • 9-9:

The import of llm_validator is correct and aligns with the change summary.

  • 16-18:

The addition of the AfterValidator annotation to the Response class is consistent with the change summary.

  • 24-38:

The new test function test_runmodel_validator_error correctly implements a test for the BeforeValidator annotation.

instructor/patch.py (2)
  • 34-42: The dump_message function assumes that the message object has a model_dump method and that the resulting dictionary will have a "tool_calls" key. Verify that all message objects conform to this expectation and handle cases where these assumptions might not hold true.

  • 289-291: The patch function wraps the client.chat.completions.create method without checking if the method already exists or if it's callable. Verify that client.chat.completions.create is present and callable before attempting to wrap it.
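A hedged sketch of such a guard, assuming the OpenAI v1 client layout used elsewhere in this PR (get_create_method is an illustrative name):

def get_create_method(client):
    # Fail fast if the client does not expose a callable chat.completions.create.
    create = getattr(getattr(getattr(client, "chat", None), "completions", None), "create", None)
    if not callable(create):
        raise TypeError("client.chat.completions.create is missing or not callable")
    return create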

Comment on lines 41 to 60
@pytest.mark.parametrize(
"mode", [PatchMode.FUNCTION_CALL, PatchMode.JSON_MODE, PatchMode.TOOL_CALL]
)
@pytest.mark.asyncio
async def test_runmodel_async_validator():
async def test_runmodel_async(mode):
aclient = instructor.patch(AsyncOpenAI(), mode=mode)
model = await aclient.chat.completions.create(
model="gpt-3.5-turbo",
model="gpt-3.5-turbo-1106",
response_model=UserExtract,
max_retries=2,
messages=[
{"role": "user", "content": "Extract jason is 25 years old"},
],
)
assert isinstance(model, UserExtract), "Should be instance of UserExtract"
assert model.name == "JASON"
assert model.name.lower() == "jason"
assert model.age == 25
assert hasattr(
model, "_raw_response"
), "The raw response should be available from OpenAI"
Contributor

The test test_runmodel_async mirrors the issue with test_runmodel in that it does not test the validation logic of the UserExtract model. It should also include an assertion to check if the validate_name method raises a ValueError when the name is not uppercase.
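A sketch of the kind of assertion being requested, assuming the uppercase-name validator lives on UserExtractValidated as defined later in this file (the field names are assumed to mirror UserExtract):

import pytest
from pydantic import ValidationError

def test_validate_name_rejects_lowercase():
    # Exercise the validator directly instead of relying on the LLM round-trip.
    with pytest.raises(ValidationError):
        UserExtractValidated(name="jason", age=25)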

Comment on lines 93 to 109
@pytest.mark.parametrize("mode", [PatchMode.FUNCTION_CALL, PatchMode.JSON_MODE])
@pytest.mark.asyncio
async def test_runmodel_async_validator(mode):
aclient = instructor.patch(AsyncOpenAI(), mode=mode)
model = await aclient.chat.completions.create(
model="gpt-3.5-turbo-1106",
response_model=UserExtractValidated,
max_retries=2,
messages=[
{"role": "user", "content": "Extract jason is 25 years old"},
],
)
assert isinstance(
model, UserExtractSimple
), "Should be instance of UserExtractSimple"
assert model.name.lower() == "jason"
assert isinstance(model, UserExtractValidated), "Should be instance of UserExtract"
assert model.name == "JASON"
assert hasattr(
model, "_raw_response"
), "The raw response should be available from OpenAI"
Contributor

Similar to test_runmodel_validator, the test_runmodel_async_validator should also include an assertion to check for the ValueError when the name is not uppercase to validate the UserExtractValidated model's logic.

Comment on lines +192 to 195
kwargs["messages"].append(response.choices[0].message)
kwargs["messages"].append(
{
"role": "user",
Contributor

The same issue as in retry_async applies to retry_sync. There should be a check for the existence and type of kwargs["messages"] before appending to it.

Contributor

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 9

Configuration used: CodeRabbit UI

Commits reviewed: files that changed from the base of the PR, between de9dcd0 and ae47bcd.
Files ignored due to filter (1)
  • mkdocs.yml
Files selected for processing (5)
  • docs/concepts/patching.md (1 hunks)
  • instructor/function_calls.py (3 hunks)
  • instructor/patch.py (12 hunks)
  • tests/openai/test_modes.py (1 hunks)
  • tests/openai/test_patch.py (1 hunks)
Additional comments: 3
instructor/function_calls.py (2)
  • 8-16: The Mode enum is well-defined and follows Python's naming conventions for enums. However, the summary mentions it as PatchMode, which could be a discrepancy. Verify that the correct name is used consistently throughout the codebase.

  • 204-240: > Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [187-237]

The from_response method has been updated to handle different modes using a match statement. This is a good use of Python's pattern matching feature introduced in PEP 634. Ensure that the completion object's structure is consistent with the expected structure in each case block, especially since the message attribute is accessed differently for each mode (message.function_call.name, message.tool_calls, message.content). Also, verify that the model_validate_json method exists and is capable of handling the input for each mode.
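For readers unfamiliar with the construct, the dispatch the reviewer describes has roughly this shape (match requires Python 3.10+); extract_json_payload is a simplified, hypothetical helper, and the real method feeds the selected payload into cls.model_validate_json:

from instructor.function_calls import Mode  # the enum introduced in this PR

def extract_json_payload(message, mode: Mode) -> str:
    # Pick the raw JSON string out of the completion message, depending on the patch mode.
    match mode:
        case Mode.FUNCTIONS:
            return message.function_call.arguments
        case Mode.TOOLS:
            return message.tool_calls[0].function.arguments
        case Mode.JSON:
            return message.content
        case _:
            raise ValueError(f"Invalid patch mode: {mode}")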

tests/openai/test_modes.py (1)
  • 43-60:
    The test function test_json_mode correctly uses the Mode.JSON enum value and asserts the extracted user data.

jxnl and others added 2 commits November 23, 2023 17:08
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 9

Configuration used: CodeRabbit UI

Commits reviewed: files that changed from the base of the PR, between ae47bcd and 2305586.
Files selected for processing (2)
  • instructor/function_calls.py (3 hunks)
  • instructor/patch.py (12 hunks)
Additional comments: 3
instructor/patch.py (3)
  • 4-10: The import of Mode from .function_calls should be verified to ensure that it is the correct Mode enum that was mentioned in the summary of changes. If there is a different Mode enum defined elsewhere, this could lead to unexpected behavior.

  • 34-42: The dump_message function assumes that the message object has a model_dump method and that the resulting dictionary will always have a "tool_calls" key. Verify that these assumptions are always true for all message objects that will be passed to this function.

  • 104-111: The process_response function has been updated to accept a mode parameter, but the mode parameter is not used within the function. Verify if the mode parameter is necessary here, and if so, implement the logic that uses it. If it's not necessary, consider removing it to avoid confusion.
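If the parameter is kept, one plausible use is simply forwarding it; this is a sketch under the assumption that from_response accepts a mode argument, as the function_calls.py review earlier in this conversation shows:

from instructor.function_calls import Mode

def process_response(response, *, response_model, validation_context=None, strict=None, mode=Mode.FUNCTIONS):
    # Forward `mode` so the schema class knows how to parse the completion.
    model = response_model.from_response(
        response,
        validation_context=validation_context,
        strict=strict,
        mode=mode,
    )
    model._raw_response = response  # the tests expect the raw completion to remain available
    return model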

Comment on lines +8 to +16
import enum


class Mode(enum.Enum):
"""The mode to use for patching the client"""

FUNCTIONS: str = "function_call"
TOOLS: str = "tool_call"
JSON: str = "json_mode"
Contributor

The Mode enum is well-defined, and its members are already uppercase as the Python style guide for enum names requires. However, the str type annotations on the members are unnecessary; plain assignments are enough.

class Mode(enum.Enum):
    FUNCTIONS = "function_call"
    TOOLS = "tool_call"
    JSON = "json_mode"


Comment on lines 204 to 239
"""
message = completion.choices[0].message

return cls.model_validate_json(
message.function_call.arguments,
context=validation_context,
strict=strict,
)
if mode == Mode.FUNCTIONS:
assert (
message.function_call.name == cls.openai_schema["name"]
), "Function name does not match"
return cls.model_validate_json(
message.function_call.arguments,
context=validation_context,
strict=strict,
)
elif mode == Mode.TOOLS:
assert (
len(message.tool_calls) == 1
), "Instructor does not support multiple tool calls, use List[Model] instead."
tool_call = message.tool_calls[0]
assert (
tool_call.function.name == cls.openai_schema["name"]
), "Tool name does not match"
return cls.model_validate_json(
tool_call.function.arguments,
context=validation_context,
strict=strict,
)
elif mode == Mode.JSON:
return cls.model_validate_json(
message.content,
context=validation_context,
strict=strict,
)
else:
raise ValueError(f"Invalid patch mode: {mode}")


def openai_schema(cls) -> OpenAISchema:
Contributor

Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [187-236]

The from_response method has been updated to handle different modes, but there are a few issues to address:

  1. The completion parameter is expected to have a choices attribute, but it is accessed as if it were a dictionary. This should be consistent with the rest of the codebase.
  2. The message variable is accessed with both dot notation and dictionary-style indexing, which is inconsistent. The code should use one style for accessing attributes.
  3. The assert statements are used for validation, which is not ideal for production code as they can be disabled with optimization flags. It would be better to raise exceptions explicitly.
  4. The message.function_call.name and message.tool_calls are accessed without checking if these keys exist in the message, which could lead to KeyError.

Here is a proposed fix:

def from_response(
    cls,
    completion,
    validation_context=None,
    strict: bool = None,
    mode: Mode = Mode.FUNCTIONS,
):
    message = completion.choices[0].message

    if mode == Mode.FUNCTIONS:
-       assert (
-           message.function_call.name == cls.openai_schema["name"]
-       ), "Function name does not match"
+       function_call = message.get("function_call")
+       if not function_call or function_call.get("name") != cls.openai_schema["name"]:
+           raise ValueError("Function name does not match")
        return cls.model_validate_json(
            function_call.get("arguments"),
            context=validation_context,
            strict=strict,
        )
    elif mode == Mode.TOOLS:
-       assert (
-           len(message.tool_calls) == 1
-       ), "Instructor does not support multiple tool calls, use List[Model] instead."
+       tool_calls = message.get("tool_calls")
+       if not tool_calls or len(tool_calls) != 1:
+           raise ValueError("Instructor does not support multiple tool calls, use List[Model] instead.")
        tool_call = tool_calls[0]
-       assert (
-           tool_call.function.name == cls.openai_schema["name"]
-       ), "Tool name does not match"
+       if tool_call.get("function", {}).get("name") != cls.openai_schema["name"]:
+           raise ValueError("Tool name does not match")
        return cls.model_validate_json(
            tool_call.get("function", {}).get("arguments"),
            context=validation_context,
            strict=strict,
        )
    elif mode == Mode.JSON:
        return cls.model_validate_json(
            message.get("content"),
            context=validation_context,
            strict=strict,
        )
    else:
        raise ValueError(f"Invalid patch mode: {mode}")

Comment on lines +45 to +92
def handle_response_model(
*,
response_model: Type[BaseModel],
kwargs,
mode: Mode = Mode.FUNCTIONS,
):
new_kwargs = kwargs.copy()
if response_model is not None:
if not issubclass(response_model, OpenAISchema):
response_model = openai_schema(response_model) # type: ignore
new_kwargs["functions"] = [response_model.openai_schema] # type: ignore
new_kwargs["function_call"] = {"name": response_model.openai_schema["name"]} # type: ignore

if mode == Mode.FUNCTIONS:
new_kwargs["functions"] = [response_model.openai_schema] # type: ignore
new_kwargs["function_call"] = {
"name": response_model.openai_schema["name"]
} # type: ignore
elif mode == Mode.TOOLS:
new_kwargs["tools"] = [
{
"type": "function",
"function": response_model.openai_schema,
}
]
new_kwargs["tool_choice"] = {
"type": "function",
"function": {"name": response_model.openai_schema["name"]},
}
elif mode == Mode.JSON:
new_kwargs["response_format"] = {"type": "json_object"}

# check that the first message is a system message
# if it is not, add a system message to the beginning
message = f"Make sure that your response to any message matchs the json_schema below, do not deviate at all: \n{response_model.model_json_schema()['properties']}"

if new_kwargs["messages"][0]["role"] != "system":
new_kwargs["messages"].insert(
0,
{
"role": "system",
"content": message,
},
)

# if the first message is a system append the schema to the end
if new_kwargs["messages"][0]["role"] == "system":
new_kwargs["messages"][0]["content"] += f"\n\n{message}"
else:
raise ValueError(f"Invalid patch mode: {mode}")
Contributor

The handle_response_model function has been updated to handle different modes. However, there is a potential issue with the new_kwargs["messages"] list being accessed without checking if it exists or if it's a list. This could lead to a KeyError or TypeError. Ensure that new_kwargs["messages"] is initialized and is a list before attempting to access or modify it.

Comment on lines 94 to 98
if new_kwargs.get("stream", False) and response_model is not None:
import warnings
raise NotImplementedError("stream=True is not supported when using response_model parameter")

warnings.warn(
"stream=True is not supported when using response_model parameter"
Contributor

The warning about stream=True not being supported when using the response_model parameter is raised after the NotImplementedError exception. The warning will never be executed because the exception will stop the function execution. Remove the redundant warning.

-        warnings.warn(
-            "stream=True is not supported when using response_model parameter"


Comment on lines 136 to 156
kwargs,
max_retries,
strict: Optional[bool] = None,
mode: Mode = Mode.FUNCTIONS,
):
retries = 0
while retries <= max_retries:
try:
response: ChatCompletion = await func(*args, **kwargs)
return (
process_response(
response,
response_model,
validation_context,
strict=strict,
),
None,
return process_response(
response,
response_model=response_model,
validation_context=validation_context,
strict=strict,
mode=mode,
)
except (ValidationError, JSONDecodeError) as e:
kwargs["messages"].append(response.choices[0].message) # type: ignore
kwargs["messages"].append(dump_message(response.choices[0].message)) # type: ignore
kwargs["messages"].append(
{
"role": "user",
Contributor

Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [131-153]

The retry_async function has been updated to handle different modes, but there is an issue with appending messages to kwargs["messages"] without checking if kwargs["messages"] exists or if it's a list. This could lead to a KeyError or TypeError. Ensure that kwargs["messages"] is initialized and is a list before appending to it.

-            kwargs["messages"].append(dump_message(response.choices[0].message))  # type: ignore
+            if "messages" in kwargs and isinstance(kwargs["messages"], list):
+                kwargs["messages"].append(dump_message(response.choices[0].message))  # type: ignore

Comment on lines 170 to +188
kwargs,
max_retries,
strict: Optional[bool] = None,
mode: Mode = Mode.FUNCTIONS,
):
retries = 0
while retries <= max_retries:
# Excepts ValidationError, and JSONDecodeError
try:
response = func(*args, **kwargs)
return (
process_response(
response, response_model, validation_context, strict=strict
),
None,
return process_response(
response,
response_model=response_model,
validation_context=validation_context,
strict=strict,
mode=mode,
)
except (ValidationError, JSONDecodeError) as e:
kwargs["messages"].append(dump_message(response.choices[0].message))
kwargs["messages"].append(response.choices[0].message)
Contributor

The retry_sync function has the same issue as retry_async with appending messages to kwargs["messages"]. Ensure that kwargs["messages"] is initialized and is a list before appending to it.

-            kwargs["messages"].append(response.choices[0].message)
+            if "messages" in kwargs and isinstance(kwargs["messages"], list):
+                kwargs["messages"].append(response.choices[0].message)


Comment on lines 217 to +222
*args,
**kwargs,
):
response_model, new_kwargs = handle_response_model(response_model, kwargs) # type: ignore
response, error = await retry_async(
if mode == Mode.TOOLS:
max_retries = 0
warnings.warn("max_retries is not supported when using tool calls")
Contributor

The new_chatcompletion_async function sets max_retries to 0 and issues a warning when mode is Mode.TOOLS. However, the previous comments suggested raising an exception instead of issuing a warning to enforce the constraint that max_retries is not supported for tool calls. Consider raising an exception here as well.

-            warnings.warn("max_retries is not supported when using tool calls")
+            raise ValueError("max_retries is not supported when using tool calls")


Comment on lines 243 to +248
*args,
**kwargs,
):
response_model, new_kwargs = handle_response_model(response_model, kwargs) # type: ignore
response, error = retry_sync(
if mode == Mode.TOOLS:
max_retries = 0
warnings.warn("max_retries is not supported when using tool calls")
Contributor

The new_chatcompletion_sync function has the same issue as new_chatcompletion_async regarding the handling of max_retries when mode is Mode.TOOLS. Raise an exception instead of issuing a warning.

-            warnings.warn("max_retries is not supported when using tool calls")
+            raise ValueError("max_retries is not supported when using tool calls")


@@ -208,7 +268,9 @@
return wrapper_function
Contributor

The wrap_chatcompletion function is missing a docstring. It's important to maintain consistency and provide documentation for all public functions, especially when they are part of a significant update to the codebase.
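Something along these lines would do; the signature shown here is assumed for illustration, and the wording is a suggestion rather than the PR's text:

from instructor.function_calls import Mode

def wrap_chatcompletion(func, mode: Mode = Mode.FUNCTIONS):
    """Wrap chat.completions.create so it accepts response_model, validation_context and
    max_retries, parses the completion according to `mode`, and retries on validation errors.
    """
    ...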

Repository owner deleted a comment from coderabbitai bot Nov 24, 2023
Contributor

coderabbitai bot commented Nov 24, 2023

Walkthrough

The recent update enhances the OpenAI client's backward compatibility and structured output capabilities. It introduces new keywords and methods to handle different modes of operation—Function Calling, Tool Calling, and JSON Mode. These modes facilitate various ways of processing and integrating structured data. The patching mechanism allows the client to adopt these new features seamlessly, while the tests ensure that the new functionalities work as intended, including schema validation and error handling.
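Concretely, the patched client is used roughly like this; a sketch assembled from the tests in this PR, with the model name and prompt taken from them and the import paths assumed:

import instructor
from openai import OpenAI
from instructor.function_calls import Mode
from pydantic import BaseModel

class UserExtract(BaseModel):
    name: str
    age: int

# Patch the client; Mode.FUNCTIONS and Mode.JSON work the same way.
client = instructor.patch(OpenAI(), mode=Mode.TOOLS)

user = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    response_model=UserExtract,
    messages=[{"role": "user", "content": "Extract jason is 25 years old"}],
)
assert user.name.lower() == "jason" and user.age == 25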

Changes

File Path | Change Summary
docs/concepts/patching.md | Added keywords for client compatibility and methods for structured output.
instructor/.../function_calls.py | Introduced Mode enum and updated from_response method for different modes.
instructor/patch.py | Updated functions to handle the new mode argument for response processing.
tests/openai/test_modes.py | Added tests for user information extraction using the new structured output modes.
tests/openai/test_patch.py & tests/test_validators.py | Implemented new tests for validating the patched client functionality and error handling.

🐇✨
In the land of code, where the data streams flow,
A patch was spun, with a soft, gentle glow.
Keywords new, and modes a-plenty,
Structured outputs, now there are many.

🌟📜
A rabbit hopped through lines of change,
Through tests and docs, it did arrange.
With each hop, a feature set,
In JSON, tools, and functions met.

🎉🐾
Celebrate we do, with a cheer so bright,
For compatibility's dance, in the moon's soft light.
Now clients old, with new tricks blend,
Thanks to the patches, that our rabbit did send.



Contributor

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 4

Configuration used: CodeRabbit UI

Commits reviewed: files that changed from the base of the PR, between 7de55a9 and 649357c.
Files ignored due to filter (1)
  • mkdocs.yml
Files selected for processing (6)
  • docs/concepts/patching.md (1 hunks)
  • instructor/function_calls.py (3 hunks)
  • instructor/patch.py (12 hunks)
  • tests/openai/test_modes.py (1 hunks)
  • tests/openai/test_patch.py (1 hunks)
  • tests/test_validators.py (1 hunks)
Additional comments: 13
docs/concepts/patching.md (1)
  • 1-70:
    The documentation correctly reflects the new enhancements and the example for JSON Mode is now consistent with the description.
instructor/function_calls.py (2)
  • 8-16:
    The str type annotations on the enum members are unnecessary; plain assignments are enough (the member names are already uppercase, as the style guide requires).
class Mode(enum.Enum):
-    FUNCTIONS: str = "function_call"
-    TOOLS: str = "tool_call"
-    JSON: str = "json_mode"
+    FUNCTIONS = "function_call"
+    TOOLS = "tool_call"
+    JSON = "json_mode"
  • 204-239: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [187-236]

The issues previously mentioned regarding the from_response method still need to be addressed.

instructor/patch.py (6)
  • 95-98:
    Remove the unreachable warning after the NotImplementedError.
-        warnings.warn(
-            "stream=True is not supported when using response_model parameter"
-        )
  • 153-156:
    Ensure that kwargs["messages"] is initialized and is a list before appending to it.

  • 188-191:
    Ensure that kwargs["messages"] is initialized and is a list before appending to it.

  • 221-222:
    Consider raising an exception instead of issuing a warning to enforce the constraint that max_retries is not supported for tool calls.

  • 247-248:
    Raise an exception instead of issuing a warning to enforce that max_retries is not supported for tool calls.

  • 268-268:
    Add a docstring to the wrap_chatcompletion function to maintain consistency and provide necessary documentation.

tests/openai/test_modes.py (1)
  • 38-38:
    The enum value Mode.TOOLS should be Mode.TOOL_CALL as per the summary of changes.
-    user = UserExtract.from_response(response, mode=Mode.TOOLS)
+    user = UserExtract.from_response(response, mode=Mode.TOOL_CALL)
tests/test_validators.py (3)
  • 9-9:
    The import of llm_validator is correctly placed at the top of the file, following Python's convention for import statements.

  • 24-38:
    The new test function test_runmodel_validator_error correctly sets up a scenario to test the BeforeValidator with llm_validator. It is well-structured and follows the standard pytest conventions.

  • 14-22:
    The update to test_patch_completes_successfully correctly includes an Annotated statement for the message field, using AfterValidator to apply moderation.

@@ -1,112 +1,107 @@
from pydantic import BaseModel, field_validator
Contributor

The import field_validator is incorrect and should be validator from the pydantic library.

- from pydantic import BaseModel, field_validator
+ from pydantic import BaseModel, validator


Comment on lines +65 to +70
@field_validator("name")
@classmethod
def validate_name(cls, v):
if v.upper() != v:
raise ValueError("Name should be uppercase")
return v
Contributor

The decorator @field_validator is not recognized by Pydantic. It should be replaced with @validator.

-    @field_validator("name")
+    @validator("name")


Comment on lines +34 to +42
def dump_message(message) -> dict:
"""Dumps a message to a dict, to be returned to the OpenAI API.
Workaround for an issue with the OpenAI API, where the `tool_calls` field isn't allowed to be present in requests
if it isn't used.
"""
dumped_message = message.model_dump()
if not dumped_message.get("tool_calls"):
del dumped_message["tool_calls"]
return {k: v for k, v in dumped_message.items() if v}
Contributor

The function dump_message should handle the case where "tool_calls" is not a key in dumped_message to prevent a KeyError.

-    if not dumped_message.get("tool_calls"):
+    if "tool_calls" in dumped_message and not dumped_message.get("tool_calls"):


Comment on lines +79 to +90
if new_kwargs["messages"][0]["role"] != "system":
new_kwargs["messages"].insert(
0,
{
"role": "system",
"content": message,
},
)

# if the first message is a system append the schema to the end
if new_kwargs["messages"][0]["role"] == "system":
new_kwargs["messages"][0]["content"] += f"\n\n{message}"
Contributor

The logic for adding the JSON schema to the messages is flawed: when the first message is already a system message, the schema should be appended to its content, and a new system message should be inserted only when no system message is present.

-            if new_kwargs["messages"][0]["role"] != "system":
-                new_kwargs["messages"].insert(
-                    0,
-                    {
-                        "role": "system",
-                        "content": message,
-                    },
-                )
-
-            # if the first message is a system append the schema to the end
-            if new_kwargs["messages"][0]["role"] == "system":
-                new_kwargs["messages"][0]["content"] += f"\n\n{message}"
+            if new_kwargs["messages"][0]["role"] == "system":
+                new_kwargs["messages"][0]["content"] += f"\n\n{message}"
+            else:
+                new_kwargs["messages"].insert(
+                    0,
+                    {
+                        "role": "system",
+                        "content": message,
+                    },
+                )


@jxnl jxnl enabled auto-merge (squash) November 24, 2023 20:15
@jxnl jxnl disabled auto-merge November 25, 2023 18:56
@jxnl jxnl merged commit 359c5f9 into main Nov 25, 2023
4 of 8 checks passed
@jxnl jxnl deleted the multiple-modalities branch November 25, 2023 18:56