
Add multitask tests #265

Merged
merged 7 commits into main from param-tests on Dec 9, 2023

Conversation

jxnl (Collaborator) commented Dec 9, 2023

Summary by CodeRabbit

  • Refactor

    • Transitioned several methods to asynchronous execution to improve performance and responsiveness.
  • Bug Fixes

    • Updated conditions in methods to handle additional data modes correctly.
  • Tests

    • Streamlined testing by implementing parameterized functions (a brief pytest sketch follows this list).
    • Removed redundant code and classes to enhance test efficiency.
  • Chores

    • Removed deprecated functionality related to the openai_function decorator to simplify codebase.
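
A brief, generic pytest sketch of the parameterization pattern mentioned above (standalone and hypothetical; the real tests patch an OpenAI client with each mode and stream extracted users):

import pytest

# Placeholder mode names for illustration; the actual tests use instructor's Mode enum.
MODES = ["FUNCTIONS", "JSON", "TOOLS", "MD_JSON"]


@pytest.mark.parametrize("mode", MODES)
def test_multi_user(mode):
    # One test body runs once per mode instead of maintaining near-identical copies.
    assert mode in MODES

With this decorator, pytest reports one test case per mode, so a single function replaces several near-duplicate tests.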

coderabbitai bot (Contributor) commented Dec 9, 2023

Warning

Rate Limit Exceeded

@jxnl has exceeded the limit for the number of files or commits that can be reviewed per hour. Please wait 6 minutes and 14 seconds before requesting another review.

How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.
Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.
Please see our FAQ for further information.

Commits: files that changed from the base of the PR, between 0aeb41c and 8179f14.

Walkthrough

The recent updates involve a shift towards asynchronous programming, with several methods in the instructor/dsl/multitask.py and instructor/patch.py files being modified to include the async keyword. Additionally, there's a notable removal of the openai_function decorator and related logic, suggesting a change in how functions interface with OpenAI services. The test suite reflects these changes, with tests being refactored to accommodate the new async behavior and the removal of certain classes and decorators.
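
To make the walkthrough's sync/async pairing concrete, here is a minimal, self-contained sketch of the pattern; the helper names echo the review comments below, but the bodies and chunk contents are illustrative assumptions rather than the library's actual implementation:

import asyncio
from typing import AsyncIterator, Iterator


# Hypothetical stand-ins for streamed completion chunks; the real code consumes
# OpenAI chunk objects, and these strings only illustrate the sync/async split.
def sync_chunks() -> Iterator[str]:
    yield from ['{"name": "Jason", "age": 20}', '{"name": "Sarah", "age": 30}']


async def async_chunks() -> AsyncIterator[str]:
    for chunk in ['{"name": "Jason", "age": 20}', '{"name": "Sarah", "age": 30}']:
        yield chunk


def tasks_from_chunks(chunks: Iterator[str]) -> Iterator[str]:
    # Synchronous variant: a plain `for` over a regular iterator.
    for chunk in chunks:
        yield chunk


async def tasks_from_chunks_async(chunks: AsyncIterator[str]) -> AsyncIterator[str]:
    # Asynchronous counterpart: identical logic, but `async for` over an async iterator,
    # which is what an AsyncOpenAI streaming response requires.
    async for chunk in chunks:
        yield chunk


print(list(tasks_from_chunks(sync_chunks())))


async def main() -> None:
    print([task async for task in tasks_from_chunks_async(async_chunks())])


asyncio.run(main())

Keeping a separate *_async method alongside each sync method, as described above, preserves the existing synchronous API while adding support for async clients.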

Changes

  • instructor/dsl/multitask.py: Added async keyword to methods; updated conditions in extract_json.
  • tests/openai/test_modes.py, tests/openai/test_multitask.py: Removed UserExtract class; refactored tests to use parameterized functions and removed empty lines.
  • instructor/__init__.py, instructor/function_calls.py, instructor/patch.py, tests/test_function_calls.py: Removed openai_function decorator; added async keyword to function declarations.

🐇✨
A hop, a skip, in async delight,
CodeRabbit's changes take flight.
No more openai_function, we're async through and through,
With each await, our code feels brand new. 🚀🌟


Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • If you reply to a review comment from CodeRabbit, the bot will automatically respond.
  • To engage with CodeRabbit bot directly around the specific lines of code in the PR, mention @coderabbitai in your review comment.
  • Note: Review comments are made on code diffs or files, not on the PR overview.
  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Note: For conversation with the bot, please use the review comments on code diffs or files.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

coderabbitai bot (Contributor) left a comment

Review Status

Actionable comments generated: 2

Configuration used: CodeRabbit UI

Commits: files that changed from the base of the PR, between 694414e and 9e04060.
Files selected for processing (3)
  • instructor/dsl/multitask.py (4 hunks)
  • tests/openai/test_modes.py (4 hunks)
  • tests/openai/test_multitask.py (2 hunks)
Additional comments: 8
instructor/dsl/multitask.py (6)
  • 12-18: The addition of from_streaming_response_async provides an asynchronous counterpart to the existing from_streaming_response. This is a good approach to maintain backward compatibility while introducing new functionality.

  • 34-40: The addition of tasks_from_chunks_async provides an asynchronous counterpart to the existing tasks_from_chunks. This is a good approach to maintain backward compatibility while introducing new functionality.

  • 51-65: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [51-77]

The addition of extract_json_async provides an asynchronous counterpart to the existing extract_json. This is a good approach to maintain backward compatibility while introducing new functionality.

  • 62-64: The condition for Mode.MD_JSON has been correctly added to the extract_json method, ensuring that the method can handle this mode appropriately.

  • 51-65: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [65-71]

The error handling for unsupported modes in the extract_json method is correctly implemented, raising a NotImplementedError when an unsupported mode is encountered.

  • 72-78: The exception handling for AttributeError in the extract_json method is correctly implemented, allowing the method to continue processing chunks even if an attribute is missing in one of them. A hedged sketch of this mode-and-error-handling pattern appears after these comments.
tests/openai/test_multitask.py (2)
  • 18-29: The parameterization of test_multi_user with different modes is correctly implemented and aligns with the PR's objective to introduce multitasking capabilities.

  • 52-58: The implementation of test_multi_user_tools_mode_async as an asynchronous function is correct and aligns with the PR's objective to support multitasking in tests.
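
Pulling the comments above together, a minimal sketch of what a mode-aware async JSON extractor can look like (the chunk shape, mode values, and demo data are assumptions for illustration, not the file's actual implementation):

import asyncio
from enum import Enum, auto
from types import SimpleNamespace
from typing import AsyncIterator


class Mode(Enum):  # simplified stand-in for instructor's Mode enum
    FUNCTIONS = auto()
    JSON = auto()
    MD_JSON = auto()


async def extract_json_async(chunks: AsyncIterator[object], mode: Mode) -> AsyncIterator[str]:
    async for chunk in chunks:
        try:
            if mode == Mode.FUNCTIONS:
                fragment = chunk.function_call.arguments
            elif mode in (Mode.JSON, Mode.MD_JSON):
                fragment = chunk.content
            else:
                raise NotImplementedError(f"Mode {mode} is not supported")
            if fragment:
                yield fragment
        except AttributeError:
            # A chunk missing the expected attribute is skipped rather than
            # aborting the whole stream, as described in the comments above.
            continue


async def demo() -> None:
    async def fake_stream() -> AsyncIterator[object]:
        yield SimpleNamespace(content='{"tasks": [')
        yield SimpleNamespace(finish_reason="stop")  # no content attribute: skipped
        yield SimpleNamespace(content=']}')

    print([piece async for piece in extract_json_async(fake_stream(), Mode.JSON)])


asyncio.run(demo())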

Comment on lines 15 to 29
"Correctly segment it into entitites"
"Make sure the JSON is correct"
),
},
],
max_tokens=1000,
)

resp = [user for user in stream_extract(input="Jason is 20, Sarah is 30")]
print(resp)
assert len(resp) == 2
assert resp[0].name == "Jason"
assert resp[0].age == 20
assert resp[1].name == "Sarah"
assert resp[1].age == 30


def test_multi_user_tools_mode():
client = instructor.patch(OpenAI(), mode=Mode.TOOLS)
@pytest.mark.parametrize("mode", [Mode.FUNCTIONS, Mode.JSON, Mode.TOOLS, Mode.MD_JSON])
def test_multi_user(mode):
client = instructor.patch(OpenAI(), mode=mode)

def stream_extract(input: str) -> Iterable[User]:
return client.chat.completions.create(
model="gpt-3.5-turbo-1106",
stream=True,
response_model=Users,
messages=[
{
"role": "user",
"content": (
f"Consider the data below:\n{input}"
"Correctly segment it into entitites"
"Make sure the JSON is correct"
),
},
],
max_tokens=1000,
)

resp = [user for user in stream_extract(input="Jason is 20, Sarah is 30")]
print(resp)
assert len(resp) == 2
assert resp[0].name == "Jason"
assert resp[0].age == 20
assert resp[1].name == "Sarah"
assert resp[1].age == 30


def test_multi_user_legacy():
def stream_extract(input: str, cls) -> Iterable[User]:
client = instructor.patch(OpenAI())
MultiUser = instructor.MultiTask(cls)
completion = client.chat.completions.create(
model="gpt-3.5-turbo",
stream=True,
functions=[MultiUser.openai_schema],
function_call={"name": MultiUser.openai_schema["name"]},
messages=[
{
"role": "system",
"content": "You are a perfect entity extraction system",
},
{
"role": "user",
"content": (
f"Consider the data below:\n{input}"
"Correctly segment it into entitites"
"Make sure the JSON is correct"
),
},
],
max_tokens=1000,
)
return MultiUser.from_streaming_response(completion, mode=Mode.FUNCTIONS)

resp = [user for user in stream_extract(input="Jason is 20, Sarah is 30", cls=User)]
assert len(resp) == 2
assert resp[0].name == "Jason"
assert resp[0].age == 20
assert resp[1].name == "Sarah"
assert resp[1].age == 30





@pytest.mark.asyncio
async def test_multi_user_function_mode_async():
client = instructor.patch(AsyncOpenAI(), mode=Mode.FUNCTIONS)

async def stream_extract(input: str) -> Iterable[User]:
return await client.chat.completions.create(
model="gpt-3.5-turbo",
stream=True,
response_model=Users,
messages=[
{
"role": "system",
coderabbitai bot (Contributor) commented:

Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [22-42]

The stream_extract function within test_multi_user should be asynchronous to align with the PR's objective of introducing multitasking capabilities and the changes made in test_multi_user_tools_mode_async.

- def stream_extract(input: str) -> Iterable[User]:
+ async def stream_extract(input: str) -> Iterable[User]:

    return await client.chat.completions.create(
        model="gpt-3.5-turbo",
        stream=True,
        response_model=Users,
        messages=[
            {
coderabbitai bot (Contributor) commented:

There is a missing space in the content string concatenation within the messages list, which could lead to incorrect string formatting.

- "Correctly segment it into entitites"
- "Make sure the JSON is correct"
+ "Correctly segment it into entitites "
+ "Make sure the JSON is correct "


coderabbitai bot (Contributor) left a comment

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits: files that changed from the base of the PR, between 9e04060 and 0aeb41c.
Files selected for processing (4)
  • instructor/__init__.py (1 hunks)
  • instructor/function_calls.py (2 hunks)
  • instructor/patch.py (2 hunks)
  • tests/test_function_calls.py (2 hunks)
Files skipped from review due to trivial changes (1)
  • instructor/__init__.py
Additional comments: 6
instructor/function_calls.py (4)
  • 16-17: The addition of MD_JSON to the Mode enum is consistent with the PR's objective to support multitasking and asynchronous operations.

  • 168-169: The introduction of the from_response_async method is consistent with the PR's objective to add multitasking tests and refactor code for asynchronous operations.

  • 165-165: The addition of an else block to handle invalid modes in from_response and from_response_async methods is a good practice for robust error handling.

  • 19-20: Ensure that the removal of the openai_function decorator does not impact the functionality of the openai_schema factory function, which is still present in the file.


The usage of the openai_schema function in other parts of the codebase suggests it operates independently of the removed openai_function decorator. No issues found regarding its functionality post-removal.

instructor/patch.py (2)
  • 173-179: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [177-211]

The addition of the async keyword to process_response_async is consistent with the PR's objective to support asynchronous operations. The function's implementation correctly uses await for the asynchronous call to response_model.from_response_async. This change aligns with the summary provided.

  • 214-216: The addition of the async keyword to retry_async is consistent with the PR's objective to support asynchronous operations. The function's implementation correctly uses await for the asynchronous call to func and process_response_async. This change aligns with the summary provided. A generic sketch of this await-and-retry shape follows.
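
As a rough illustration of the awaiting pattern these comments describe, here is a compact, generic sketch (the signature, retry count, and exception type are assumptions; the real retry_async works with the patched client and response models rather than these placeholders):

import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def retry_async(
    func: Callable[[], Awaitable[T]],
    process: Callable[[T], Awaitable[T]],
    max_retries: int = 1,
) -> T:
    # Await the underlying async call, then await the async post-processing step;
    # on a validation-style failure, retry up to max_retries additional times.
    last_error: Optional[Exception] = None
    for _ in range(max_retries + 1):
        try:
            response = await func()
            return await process(response)
        except ValueError as exc:  # placeholder for a response-validation error
            last_error = exc
    raise last_error


async def demo() -> None:
    async def call() -> str:
        return "raw response"

    async def process(response: str) -> str:
        return response.upper()

    print(await retry_async(call, process))


asyncio.run(demo())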

jxnl merged commit 941f100 into main on Dec 9, 2023
9 checks passed
jxnl deleted the param-tests branch on December 9, 2023 at 21:09