verify feedback selectors on recorder init #961

piotrm0 · 2024-03-06T19:06:22Z

Fill in normal method call information in the dummy record.
Ignore selector check failures of the form something_referring_to_method_m.args.method_m_arg.anything or something_referring_to_method_m.rets.anything, since nothing beyond the known parameter names and the presence of a return value is known before the app is invoked.
Also added an App.dummy_record method that produces the records used to check for selector issues; it may be independently useful to users.
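
The ignore rule described above can be illustrated with a minimal sketch. It assumes selector paths are plain lists of attribute names; the function name `is_unverifiable`, the method name `query`, and the argument name `str_or_query_bundle` are hypothetical and used only for illustration, not taken from the PR's implementation.

```python
from typing import Sequence, Set


def is_unverifiable(path: Sequence[str], known_args: Set[str]) -> bool:
    """Return True if a missing selector path should not count as a failure.

    Hypothetical helper: before the app is invoked, only parameter names and
    the presence of a return value are known, so anything deeper is skipped.
    """
    for i, step in enumerate(path):
        rest = path[i + 1:]
        # `<method>.rets.<anything>`: the structure of return values is unknown.
        if step == "rets" and len(rest) >= 1:
            return True
        # `<method>.args.<known arg>.<anything>`: the argument name is known,
        # but the contents of the argument are not.
        if step == "args" and len(rest) >= 2 and rest[0] in known_args:
            return True
    return False


known = {"str_or_query_bundle"}  # illustrative parameter name
print(is_unverifiable(["app", "query", "rets", "response"], known))
print(is_unverifiable(["app", "query", "args", "nonexistent", "x"], known))
```

A path that names an unknown argument, or that fails before reaching `args` or `rets`, is still reported as a genuine failure under this sketch.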

When an app recorder is created with feedbacks, the selectors in those feedbacks are checked against the app and an (empty) record to catch wrong selectors early. Settings to skip this check, or to print a warning instead of raising an error, are provided and explained in the error message. The message also includes a dump of the longest prefix of the selector that does exist. Example:

f = Feedback(hugs.language_match).on(Select.App.app._response_synthesizer.thisdoesnotexist.thisalso)

tru_query_engine_recorder = TruLlama(query_engine, feedbacks=[f])

Produces an exception and a hint message:

ValueError: Some selectors do not exist in the app or record.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                              Selector check failed                                              ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

Source of argument text1 to language_match does not exist in app or expected record:                               

                                                                                                                   
 __record__.app._response_synthesizer.thisdoesnotexist.thisalso                                                    
                                                                                                                   

The data used to make this check may be incomplete. If you expect records produced by your app to contain the      
selected content, you can ignore this error by setting selectors_nocheck in the TruLlama constructor.              
Alternatively, setting selectors_check_warning will print out this message but will not raise an error.            


                                              Additional information:                                              

Feedback function signature:                                                                                       

                                                                                                                   
 (text1: str, text2: str) -> Tuple[float, Dict]                                                                    
                                                                                                                   

The prefix __record__.app._response_synthesizer selects this data that exists in your app or typical records:      

 • Object of type dict starting with:                                                                              

                                                                                                                   
       {                                                                                                           
         '_llm': {                                                                                                 
           'wrapped_llm_predict': [...],                                                                           
           'wrapped_async_llm_predict': [...],                                                                     
           'wrapped_llm_chat': [...],                                                                              
           'wrapped_async_llm_chat': [...]                                                                         
         },                                                                                                        
         'get_response': [RecordAppCall(...), RecordAppCall(...), RecordAppCall(...)]                              
       }
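
The "longest existing prefix" hint shown in the message above can be sketched as a walk over nested data. This is a minimal illustration that assumes the app and record have already been serialized to nested dicts; `longest_existing_prefix` is a hypothetical name and not the function used in the PR.

```python
from typing import Any, List, Tuple


def longest_existing_prefix(data: Any, path: List[str]) -> Tuple[List[str], Any]:
    """Follow `path` through nested dicts as far as it exists.

    Returns the steps that were found and the value they select.
    """
    current = data
    found: List[str] = []
    for step in path:
        # Stop at the first step that cannot be followed.
        if not isinstance(current, dict) or step not in current:
            break
        current = current[step]
        found.append(step)
    return found, current


# Illustrative data shaped like the dump in the error message.
data = {"app": {"_response_synthesizer": {"_llm": {}, "get_response": []}}}
prefix, value = longest_existing_prefix(
    data, ["app", "_response_synthesizer", "thisdoesnotexist", "thisalso"]
)
print(".".join(prefix), sorted(value))
```

Reporting the value at the end of the prefix gives the user the keys that are actually available at the point where the selector went wrong.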

	🚀 This PR description was created by Ellipsis for commit `90514b3`.

Summary:

This PR introduces a method to verify feedback selectors, improves comments and logging, adds support for NEMO Guardrails apps with new classes, and updates test files and documentation.

Key points:

Added a new method check_selectors in App and Feedback classes in app.py and feedback.py respectively.
Improved comments and logging messages in app.py and feedback.py.
Introduced support for NEMO Guardrails apps with the creation of TruRails and RailsInstrument classes in tru_rails.py.
Modified test files and updated documentation.

Generated with ❤️ by ellipsis.dev

	🚀 This PR description was created by Ellipsis for commit `c2289f5`.

Summary:

This PR maintains the detailed explanation of the error handling mechanism, introduces a method to verify feedback selectors, improves comments and logging, adds support for NEMO Guardrails apps, and updates test files and documentation.

Key points:

Maintains the detailed explanation of the error handling mechanism when selectors in feedbacks are checked against the app and record.
Introduces a method to verify feedback selectors.
Improves comments and logging in app.py and feedback.py.
Adds support for NEMO Guardrails apps with the creation of TruRails and RailsInstrument classes in tru_rails.py.
Updates test files and documentation.

ellipsis-dev

❌ Changes requested.

Reviewed the entire pull request up to 0cb3c5f
Looked at 195 lines of code in 2 files
Took 2 minutes and 51 seconds to review

Skipped 0 files when reviewing.
Skipped posting 0 additional comments because they didn't meet confidence threshold of 50%.

Workflow ID: wflow_KHxlrV4eb8P1Hk0F

ellipsis-dev · 2024-03-06T19:09:42Z

trulens_eval/trulens_eval/feedback/feedback.py

+        """
+        Check that the selectors are valid for the given app and record.
+        """
+        return True


The check_selectors method currently does not perform any checks and always returns True. This could potentially lead to issues if the selectors are not valid for the given app and record. Please implement the necessary checks or remove the method if it's not needed.
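
As a rough illustration of what a non-trivial check could do with its results, the sketch below combines the two settings named in the PR description, selectors_nocheck and selectors_check_warning. The function name `check_selectors_sketch` and its `missing` parameter are hypothetical, and finding the missing selectors is assumed to happen elsewhere; this is not the implementation that was merged.

```python
import warnings
from typing import List


def check_selectors_sketch(
    missing: List[str],
    selectors_nocheck: bool = False,
    selectors_check_warning: bool = False,
) -> bool:
    """Report selectors that were not found in the app or dummy record.

    Returns True when there is nothing to report or checking is disabled.
    """
    if selectors_nocheck or not missing:
        return True
    message = (
        "Some selectors do not exist in the app or record: "
        + ", ".join(missing)
    )
    if selectors_check_warning:
        # Warn but let recorder construction continue.
        warnings.warn(message)
        return False
    raise ValueError(message)


print(check_selectors_sketch([], selectors_nocheck=False))
```

Raising by default surfaces a wrong selector at recorder construction time, while the warning mode leaves room for cases where the dummy record is known to be incomplete.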

ellipsis-dev · 2024-03-06T19:09:42Z

trulens_eval/trulens_eval/feedback/feedback.py

@@ -415,6 +415,7 @@
            higher_is_better=self.higher_is_better
        )

+    # alias
    on_input = on_prompt


The on_prompt and on_input methods are identical, as are on_response and on_output. Having two methods that do the same thing can lead to confusion. Consider removing one of each pair, or if they are needed for backward compatibility, mark one as deprecated.
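
If one name of each pair were kept only for backward compatibility, the deprecation suggested above could look roughly like the following. This is a sketch on a hypothetical stand-in class, `SelectorHolder`, not the real Feedback class; the PR itself keeps the plain alias.

```python
import warnings


class SelectorHolder:
    """Hypothetical stand-in with a primary method and a deprecated alias."""

    def __init__(self) -> None:
        self.selectors = []

    def on_prompt(self) -> "SelectorHolder":
        # Primary name: record that the main input is selected.
        self.selectors.append("main_input")
        return self

    def on_input(self) -> "SelectorHolder":
        # Deprecated alias: same behavior, plus a warning for callers.
        warnings.warn(
            "on_input is deprecated; use on_prompt instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.on_prompt()


holder = SelectorHolder().on_prompt()
print(holder.selectors)
```

Compared with `on_input = on_prompt`, the wrapper costs a few lines but tells callers which name is preferred.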

ellipsis-dev

No problems found on commit 90514b3.

joshreini1 · 2024-03-07T00:44:47Z

Expecting this change to make a big difference in helping users understand why they run into issues with feedback selectors. Nicely done!

ellipsis-dev

👍 Looks good to me!

Performed an incremental review on c2289f5
Looked at 5025 lines of code in 41 files
Took 13 minutes and 30 seconds to review

Skipped 6 files when reviewing.
Skipped posting 6 additional comments because they didn't meet confidence threshold of 50%.

1. trulens_eval/trulens_eval/tru_chain.py:14:

Assessed confidence : 10%
Comment:
The TruChain class is a wrapper for a Llama chain that adds logging and feedback functionality. It checks the selectors in the feedback functions and raises an error or prints a warning if a selector doesn't exist, depending on the settings. It also logs the calls to the __call__ method of the chain and their results, and can be used as a context manager for automatic logging.
Reasoning:
The TruChain class is a wrapper for a Llama chain that adds logging and feedback functionality. The __init__ method takes a chain argument, which is the chain to be wrapped, and an app_id argument, which is a unique identifier for the app. It also takes a feedbacks argument, which is a list of feedback functions to be used for evaluating the app. The __init__ method also has optional arguments for controlling the behavior of the wrapper, such as selectors_nocheck and selectors_check_warning, which control whether to check the selectors in the feedback functions and whether to raise an error if a selector doesn't exist. The __init__ method calls the check_selectors method to check the selectors in the feedback functions. The check_selectors method raises a ValueError if a selector doesn't exist and selectors_nocheck is False. If selectors_check_warning is True, it prints a warning message instead of raising an error. The TruChain class also has a __call__ method that calls the __call__ method of the chain and logs the call and its result. The __call__ method returns the result of the __call__ method of the chain. The __call__ method also has a with_record method that calls the __call__ method of the chain and returns both the result and the record of the call. The TruChain class also has a __enter__ and a __exit__ method, which allow it to be used as a context manager. When used as a context manager, it automatically logs the calls to the __call__ method of the chain and their results.

2. trulens_eval/trulens_eval/tru_llama.py:15:

Assessed confidence : 10%
Comment:
The TruLlama class is a wrapper for a Llama Index app that adds logging and feedback functionality. It checks the selectors in the feedback functions and raises an error or prints a warning if a selector doesn't exist, depending on the settings. It also logs the calls to the query method of the query_engine and their results, and can be used as a context manager for automatic logging.
Reasoning:
The TruLlama class is a wrapper for a Llama Index app that adds logging and feedback functionality. The __init__ method takes a query_engine argument, which is the query engine of the Llama Index app to be wrapped, and an app_id argument, which is a unique identifier for the app. It also takes a feedbacks argument, which is a list of feedback functions to be used for evaluating the app. The __init__ method also has optional arguments for controlling the behavior of the wrapper, such as selectors_nocheck and selectors_check_warning, which control whether to check the selectors in the feedback functions and whether to raise an error if a selector doesn't exist. The __init__ method calls the check_selectors method to check the selectors in the feedback functions. The check_selectors method raises a ValueError if a selector doesn't exist and selectors_nocheck is False. If selectors_check_warning is True, it prints a warning message instead of raising an error. The TruLlama class also has a __call__ method that calls the query method of the query_engine and logs the call and its result. The __call__ method returns the result of the query method of the query_engine. The __call__ method also has a with_record method that calls the query method of the query_engine and returns both the result and the record of the call. The TruLlama class also has a __enter__ and a __exit__ method, which allow it to be used as a context manager. When used as a context manager, it automatically logs the calls to the query method of the query_engine and their results.

3. trulens_eval/trulens_eval/tru_rails.py:15:

Assessed confidence : 10%
Comment:
The TruRails class is a wrapper for a NEMO Guardrails app that adds logging and feedback functionality. It checks the selectors in the feedback functions and raises an error or prints a warning if a selector doesn't exist, depending on the settings. It also logs the calls to the __call__ method of the app and their results, and can be used as a context manager for automatic logging.
Reasoning:
The TruRails class is a wrapper for a NEMO Guardrails app that adds logging and feedback functionality. The __init__ method takes an app argument, which is the app to be wrapped, and an app_id argument, which is a unique identifier for the app. It also takes a feedbacks argument, which is a list of feedback functions to be used for evaluating the app. The __init__ method also has optional arguments for controlling the behavior of the wrapper, such as selectors_nocheck and selectors_check_warning, which control whether to check the selectors in the feedback functions and whether to raise an error if a selector doesn't exist. The __init__ method calls the check_selectors method to check the selectors in the feedback functions. The check_selectors method raises a ValueError if a selector doesn't exist and selectors_nocheck is False. If selectors_check_warning is True, it prints a warning message instead of raising an error. The TruRails class also has a __call__ method that calls the __call__ method of the app and logs the call and its result. The __call__ method returns the result of the __call__ method of the app. The __call__ method also has a with_record method that calls the __call__ method of the app and returns both the result and the record of the call. The TruRails class also has a __enter__ and a __exit__ method, which allow it to be used as a context manager. When used as a context manager, it automatically logs the calls to the __call__ method of the app and their results.

4. trulens_eval/trulens_eval/tru_rails.py:169:

Assessed confidence : 10%
Comment:
The RailsInstrument class is a wrapper for a NEMO Guardrails app that adds logging functionality. It checks the selectors in the feedback functions and raises an error or prints a warning if a selector doesn't exist, depending on the settings. It also logs the calls to the __call__ method of the app and their results, and can be used as a context manager for automatic logging.
Reasoning:
The RailsInstrument class is a wrapper for a NEMO Guardrails app that adds logging functionality. The __init__ method takes an app argument, which is the app to be wrapped, and an app_id argument, which is a unique identifier for the app. The __init__ method also has optional arguments for controlling the behavior of the wrapper, such as selectors_nocheck and selectors_check_warning, which control whether to check the selectors in the feedback functions and whether to raise an error if a selector doesn't exist. The __init__ method calls the check_selectors method to check the selectors in the feedback functions. The check_selectors method raises a ValueError if a selector doesn't exist and selectors_nocheck is False. If selectors_check_warning is True, it prints a warning message instead of raising an error. The RailsInstrument class also has a __call__ method that calls the __call__ method of the app and logs the call and its result. The __call__ method returns the result of the __call__ method of the app. The __call__ method also has a with_record method that calls the __call__ method of the app and returns both the result and the record of the call. The RailsInstrument class also has a __enter__ and a __exit__ method, which allow it to be used as a context manager. When used as a context manager, it automatically logs the calls to the __call__ method of the app and their results.

5. trulens_eval/trulens_eval/tru_llama.py:28:

Assessed confidence : 10%
Comment:
The check_selectors method checks the selectors in the feedback functions against the app and the record. If a selector doesn't exist and selectors_nocheck is False, it raises a ValueError. If selectors_check_warning is True, it prints a warning message instead of raising an error. The error message includes a dump of the longest prefix of the selector that does exist.
Reasoning:
The check_selectors method checks the selectors in the feedback functions against the app and the record. If a selector doesn't exist and selectors_nocheck is False, it raises a ValueError. If selectors_check_warning is True, it prints a warning message instead of raising an error. The error message includes a dump of the longest prefix of the selector that does exist. This method is used in the __init__ method of the TruLlama and TruRails classes to check the selectors in the feedback functions when the classes are instantiated.

6. trulens_eval/trulens_eval/utils/notebook_utils.py:51:

Assessed confidence : 90%
Grade: 0%
Comment:
This writefileinterpolated function is not used anywhere in the codebase. Consider removing it if it's not needed. Also, it's generally a good practice to avoid defining functions conditionally as it can lead to unexpected behavior.
Reasoning:
The function writefileinterpolated is not used anywhere in the codebase. It seems to be a utility function for writing to a file, but it's not clear why it's needed in this context. It's also not clear why it's conditionally defined based on the result of is_notebook(). This could potentially lead to unexpected behavior if the code is run in different environments.

Workflow ID: wflow_XXNjdE2UlsuaAvIM

review-notebook-app · 2024-03-12T02:38:18Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

checkpoint verify selectors implementation

0cb3c5f

dosubot bot added the size:M label (This PR changes 30-99 lines, ignoring generated files.) Mar 6, 2024

piotrm0 marked this pull request as draft March 6, 2024 19:06

ellipsis-dev bot reviewed Mar 6, 2024

Merge remote-tracking branch 'origin/main' into piotrm/pre_check_sele…

90514b3

…ctors

ellipsis-dev bot reviewed Mar 6, 2024

piotrm0 requested a review from joshreini1 March 7, 2024 00:04

piotrm0 marked this pull request as ready for review March 7, 2024 00:04

dosubot bot added the documentation label (Improvements or additions to documentation) Mar 7, 2024

piotrm0 added 4 commits March 6, 2024 16:04

Merge branch 'main' into piotrm/pre_check_selectors

b68da9e

Merge branch 'main' into piotrm/pre_check_selectors

92a5429

improve warning message for bad selectors

1fe9b64

Merge remote-tracking branch 'refs/remotes/origin/piotrm/pre_check_se…

c2289f5

…lectors' into piotrm/pre_check_selectors

dosubot bot added the size:L label (This PR changes 100-499 lines, ignoring generated files.) and removed the size:M label (This PR changes 30-99 lines, ignoring generated files.) Mar 7, 2024

joshreini1 approved these changes Mar 7, 2024

dosubot bot added the lgtm label (This PR has been approved by a maintainer) Mar 7, 2024

ellipsis-dev bot reviewed Mar 7, 2024

Merge remote-tracking branch 'origin/main' into piotrm/pre_check_sele…

48cd093

…ctors

piotrm0 marked this pull request as draft March 8, 2024 08:31

piotrm0 changed the title ~~verify feedback selectors on recorder init~~ [DRAFT] verify feedback selectors on recorder init Mar 8, 2024

piotrm0 added 5 commits March 11, 2024 15:51

Merge remote-tracking branch 'origin/main' into piotrm/pre_check_sele…

1d97fd6

…ctors

adding markdown output for selector hints and creating more accurate …

71a9703

…dummy records to check against

remove debug notes

f3dbfcc

Merge remote-tracking branch 'origin/main' into piotrm/pre_check_sele…

617d1da

…ctors

ignore selector check failures beyond rets or 1 deeper than known args.

0a2c01e

piotrm0 marked this pull request as ready for review March 12, 2024 02:40

dosubot bot removed the size:L label (This PR changes 100-499 lines, ignoring generated files.) Mar 12, 2024

dosubot bot added the size:XL label (This PR changes 500-999 lines, ignoring generated files.) Mar 12, 2024

fix 2 more docs links

53c660b

piotrm0 changed the title ~~[DRAFT] verify feedback selectors on recorder init~~ verify feedback selectors on recorder init Mar 12, 2024

piotrm0 merged commit 5927f66 into main Mar 12, 2024
8 checks passed
