Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate Test Set: extract response attr if app returns object with that attr #865

Merged
merged 10 commits into from
Mar 6, 2024

Conversation

joshreini1
Copy link
Contributor

@joshreini1 joshreini1 commented Feb 7, 2024

Enables generate_test_set for Llama-Index apps that return response object rather than str


Ellipsis 馃殌 This PR description was created by Ellipsis for commit f008169.

Summary:

This PR modifies the GenerateTestSet class in trulens_eval/trulens_eval/generate_test_set.py to handle Llama-Index apps that return a response object, extracting the response attribute if it exists.

Key points:

  • Modified _generate_themes, _format_themes, _generate_test_prompts, and _format_test_prompts methods in GenerateTestSet class in trulens_eval/trulens_eval/generate_test_set.py.
  • Added code to extract response attribute from the object returned by app_callable.
  • Enables generate_test_set to work with Llama-Index apps returning response objects.

Generated with 鉂わ笍 by ellipsis.dev

@joshreini1 joshreini1 changed the title extract response attr if app returns object with that attr Generate Test Set: extract response attr if app returns object with that attr Feb 7, 2024
@joshreini1 joshreini1 self-assigned this Feb 7, 2024
@joshreini1 joshreini1 marked this pull request as ready for review February 7, 2024 22:43
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Feb 9, 2024
Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

馃憤 Looks good to me!

  • Reviewed the entire pull request up to b22ca94
  • Looked at 64 lines of code in 1 files
  • Took 2 minutes and 37 seconds to review
More info
  • Skipped 0 files when reviewing.
  • Skipped posting 4 additional comments because they didn't meet confidence threshold of 50%.
1. trulens_eval/trulens_eval/generate_test_set.py:52:
  • Assessed confidence : 50%
  • Comment:
    Ensure that the 'response' attribute of the object returned by the app_callable is always a string, as the rest of the code assumes this. If it's not guaranteed to be a string, this could lead to errors later on.
  • Reasoning:
    The PR author has added checks to see if the response from the app_callable has a 'response' attribute and if it does, it uses that for further processing. This is done to handle cases where the app_callable returns an object with a 'response' attribute instead of a string. This seems like a reasonable change, but it's important to ensure that the 'response' attribute is always a string, as the rest of the code assumes this. If it's not guaranteed to be a string, this could lead to errors later on.
2. trulens_eval/trulens_eval/generate_test_set.py:110:
  • Assessed confidence : 50%
  • Comment:
    Ensure that the 'response' attribute of the object returned by the app_callable is always a string, as the rest of the code assumes this. If it's not guaranteed to be a string, this could lead to errors later on.
  • Reasoning:
    The PR author has added checks to see if the response from the app_callable has a 'response' attribute and if it does, it uses that for further processing. This is done to handle cases where the app_callable returns an object with a 'response' attribute instead of a string. This seems like a reasonable change, but it's important to ensure that the 'response' attribute is always a string, as the rest of the code assumes this. If it's not guaranteed to be a string, this could lead to errors later on.
3. trulens_eval/trulens_eval/generate_test_set.py:130:
  • Assessed confidence : 50%
  • Comment:
    Ensure that the 'response' attribute of the object returned by the app_callable is always a string, as the rest of the code assumes this. If it's not guaranteed to be a string, this could lead to errors later on.
  • Reasoning:
    The PR author has added checks to see if the response from the app_callable has a 'response' attribute and if it does, it uses that for further processing. This is done to handle cases where the app_callable returns an object with a 'response' attribute instead of a string. This seems like a reasonable change, but it's important to ensure that the 'response' attribute is always a string, as the rest of the code assumes this. If it's not guaranteed to be a string, this could lead to errors later on.
4. trulens_eval/trulens_eval/generate_test_set.py:75:
  • Assessed confidence : 66%
  • Grade: 40%
  • Comment:
    Consider logging the original error message in addition to the custom error message. This will provide more context in case of a SyntaxError and help with debugging.
except SyntaxError as e:
    logger.error(f'Error while parsing themes string: {e}')
    raise ValueError(f'Failed to parse themes string: {test_categories}') from e
  • Reasoning:
    The PR author has added a try-except block around the literal_eval call to handle cases where the string cannot be parsed as a Python literal. This is a good practice as it improves the robustness of the code. However, it would be better to also log the original error message to help with debugging in case of a SyntaxError.

Workflow ID: wflow_S8bRqr9eLHQNkGDk


Not what you expected? You can customize the content of the reviews using rules. Learn more here.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Mar 6, 2024
@joshreini1 joshreini1 merged commit f008169 into main Mar 6, 2024
8 checks passed
Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problems found on commit f008169.


Not what you expected? You can customize the content of the reviews using rules. Learn more here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm This PR has been approved by a maintainer size:S This PR changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants