Allow VirtualRecords to have multiple calls to the same component. #988

Merged: piotrm0 merged 21 commits into main on Mar 12, 2024

Conversation

piotrm0
Contributor

@piotrm0 piotrm0 commented Mar 11, 2024

  • Allow VirtualRecords to have multiple calls to the same component.

  • Updated the virtual_example to show how this is done:

# The same method selector can indicate multiple invocations by mapping to a
# list of Dicts instead of a single Dict:

rec2 = VirtualRecord(
    main_input="Where is Germany?",
    main_output="Poland is in Europe",
    calls={
        context_method: [
            dict(
                args=["Where is Germany?"],
                rets=["Poland is a country located in Europe."]
            ),
            dict(
                args=["Where is Germany?"],
                rets=["Germany is a country located in Europe."]
            ),
        ]
    }
)

Followed by feedback function variants for this:

# Select context to be used in feedback. We select the return values of the
# virtual `get_context` call in the virtual `retriever` component. Names are
# arbitrary except for `rets`. If multiple calls to this method are recorded,
# the first one is used by default, though a warning will be issued.
context = context_method.rets[:]
# Same as context = context_method[0].rets[:]

# Alternatively, all of the contexts can be retrieved for use in feedback.
context_all_calls = context_method[:].rets[:]
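
To make the two selector behaviours above concrete, here is a minimal plain-Python sketch. It is not trulens_eval code: `normalize_calls`, `first_call_rets`, and `all_call_rets` are hypothetical helpers, and a string key stands in for the `context_method` selector. It only mimics what the comments above describe: the first recorded call is used by default (with a warning), or all calls can be gathered.

```python
import warnings
from typing import Any, Dict, List, Union

CallSpec = Dict[str, Any]


def normalize_calls(
    calls: Dict[str, Union[CallSpec, List[CallSpec]]]
) -> Dict[str, List[CallSpec]]:
    """Wrap single-dict entries in a list so every method maps to a list of calls."""
    return {
        method: (spec if isinstance(spec, list) else [spec])
        for method, spec in calls.items()
    }


def first_call_rets(calls: Dict[str, List[CallSpec]], method: str) -> List[Any]:
    """Hypothetical analogue of `context_method.rets[:]`: use the first call only."""
    recorded = calls[method]
    if len(recorded) > 1:
        warnings.warn(f"{method} has {len(recorded)} calls; using the first one.")
    return list(recorded[0]["rets"])


def all_call_rets(calls: Dict[str, List[CallSpec]], method: str) -> List[Any]:
    """Hypothetical analogue of `context_method[:].rets[:]`: gather every call."""
    return [ret for call in calls[method] for ret in call["rets"]]


# String key used as a stand-in for the real selector object (assumption).
example_calls = normalize_calls({
    "context_method": [
        dict(args=["Where is Germany?"],
             rets=["Poland is a country located in Europe."]),
        dict(args=["Where is Germany?"],
             rets=["Germany is a country located in Europe."]),
    ],
    "other_method": dict(args=["x"], rets=["y"]),
})
```

With this stand-in data, `first_call_rets(example_calls, "context_method")` yields only the Poland sentence, while `all_call_rets` yields both.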
  • Added a combinations field to Feedback and an argument to Feedback.aggregate that specify how to build the argument dictionaries for feedback function calls when selectors produce more than one value. The default (and previously the only) mode is "product"; "zip" is now also available, as specified here:
class FeedbackCombinations(str, Enum):
    """How to collect arguments for feedback function calls.
    
    Note that this applies only to cases where selectors pick out more than one
    thing for feedback function arguments. This option is used for the field
    `combinations` of
    [FeedbackDefinition][trulens_eval.schema.FeedbackDefinition] and can be
    specified with
    [Feedback.aggregate][trulens_eval.feedback.feedback.Feedback.aggregate].
    """

    ZIP = "zip"
    """Match argument values per position in produced values. 
    
    Example:
        If the selector for `arg1` generates values `0, 1, 2` and one for `arg2`
        generates values `"a", "b", "c"`, the feedback function will be called 3
        times with kwargs:

        - `{'arg1': 0, 'arg2': "a"}`,
        - `{'arg1': 1, 'arg2': "b"}`,
        - `{'arg1': 2, 'arg2': "c"}`

    If the quantities of items in the various generators do not match, the
    result will have only as many combinations as the generator with the
    fewest items, as per Python [zip][zip] (strict mode is not used).

    Note that selectors can use
    [Lens][trulens_eval.utils.serial.Lens] `collect()` to name a single (list)
    value instead of multiple values.
    """

    PRODUCT = "product"
    """Evaluate feedback on all combinations of feedback function arguments.

    Example:
        If the selector for `arg1` generates values `0, 1` and the one for
        `arg2` generates values `"a", "b"`, the feedback function will be called
        4 times with kwargs:

        - `{'arg1': 0, 'arg2': "a"}`,
        - `{'arg1': 0, 'arg2': "b"}`,
        - `{'arg1': 1, 'arg2': "a"}`,
        - `{'arg1': 1, 'arg2': "b"}`

    See [itertools.product][itertools.product] for more.

    Note that selectors can use
    [Lens][trulens_eval.utils.serial.Lens] `collect()` to name a single (list)
    value instead of multiple values.
    """
  • Added FeedbackStatus.SKIPPED to indicate that an eval was skipped and should not be run again. Fixed the runner to take this into account.

  • Fixed the OpenAI provider to accept rpm/pace and use it to control the rate of endpoint invocations.
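
As a rough illustration of what pacing by requests per minute can look like, here is a minimal sketch. It is not the provider's actual mechanism: `SimplePacer` is a hypothetical class, and the assumption is simply that an rpm value translates into a minimum interval of 60 / rpm seconds between endpoint calls. The clock and sleep functions are injectable so the behaviour can be checked without waiting.

```python
import time
from typing import Callable, Optional


class SimplePacer:
    """Hypothetical pacer: enforce at least 60/rpm seconds between calls."""

    def __init__(
        self,
        rpm: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if rpm <= 0:
            raise ValueError("rpm must be positive")
        self.interval = 60.0 / rpm
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        # Schedule the earliest time the following call may start.
        self._next_allowed = now + self.interval
        return delay
```

A caller would invoke `pacer.wait()` immediately before each endpoint request; at `rpm=60` that spaces requests at least one second apart.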

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Mar 11, 2024

@dosubot dosubot bot added the documentation Improvements or additions to documentation label Mar 11, 2024
@piotrm0 piotrm0 requested a review from joshreini1 March 11, 2024 21:28
@joshreini1
Contributor

I think we're going for something slightly different in that the repeated call may not be directly in sequence:

rec2 = VirtualRecord(
    main_input="Where is Germany?",
    main_output="Poland is in Europe",
    calls=
        {
            context_method: 
                dict(
                    args=["Where is Germany?"],
                    rets=["Poland is a country located in Europe."]
                ),
            some_other_method: 
                dict(
                    args=["Where is Germany?"],
                    rets=["Poland is a country located in Europe."]
                ),
            context_method: 
                dict(
                    args=["Where is Germany?"],
                    rets=["Germany is a country located in Europe."]
                ),
        }
    )
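
One caveat when reading the snippet above as literal Python: a dict literal that repeats a key keeps only the last value for that key, so the first `context_method` entry would be lost before `VirtualRecord` ever sees it. The sketch below shows this with string keys standing in for the selectors (an assumption for illustration). The list of (method, call) pairs is a hypothetical way to preserve interleaved order and is not an API from this PR.

```python
# String keys are stand-ins for the real method selectors (assumption).
calls_as_dict = {
    "context_method": dict(
        args=["Where is Germany?"],
        rets=["Poland is a country located in Europe."],
    ),
    "some_other_method": dict(
        args=["Where is Germany?"],
        rets=["Poland is a country located in Europe."],
    ),
    # Repeated key: this value silently replaces the first entry.
    "context_method": dict(
        args=["Where is Germany?"],
        rets=["Germany is a country located in Europe."],
    ),
}

# Hypothetical alternative that keeps all three calls in their original order.
calls_as_pairs = [
    ("context_method", dict(args=["Where is Germany?"],
                            rets=["Poland is a country located in Europe."])),
    ("some_other_method", dict(args=["Where is Germany?"],
                               rets=["Poland is a country located in Europe."])),
    ("context_method", dict(args=["Where is Germany?"],
                            rets=["Germany is a country located in Europe."])),
]

surviving_methods = list(calls_as_dict)
ordered_methods = [method for method, _ in calls_as_pairs]
```

Only two keys survive in `calls_as_dict`, whereas `calls_as_pairs` retains the interleaved sequence of three calls.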

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Mar 12, 2024
@piotrm0 piotrm0 merged commit 83223c4 into main Mar 12, 2024
8 checks passed