Currently, each evaluator is a single node. Although code evaluators can return a dictionary to report multiple metrics, this behavior is relatively obscure and only works for code-based assertions. To better support developers' "iterative refinement" of prompts, we should make it easier to add multiple, independent evaluations in the same node (i.e., a mix of named code assertions and LLM scoring prompts).
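For context, here is a minimal sketch of the existing dict-output behavior, assuming ChainForge's `evaluate(response)` hook and the `response.text` accessor; the metric names are illustrative:

```python
def evaluate(response):
    text = response.text
    # Returning a dict makes each key a separate named metric
    return {
        "num_words": len(text.split()),  # numeric metric
        "mentions_price": "$" in text,   # boolean assertion
    }
```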
To implement this, we need to:

- abstract out the code eval and LLM scorer subcomponents from their respective nodes, so that they can be added independently of the node (like how the Response Inspector view works; see the sketch after this list)
- (possibly useful, but not strictly required) lock LLM scorer nodes to true/false outputs by default, or otherwise rethink them to be easier to write, with expected output types that scores must conform to
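To make the proposal concrete, a hypothetical sketch of what a multi-evaluation node's internal spec could look like once the subcomponents are abstracted out. None of these names (`evaluations`, `llm_scorer`, `run_evaluations`, `ask_llm`) exist in ChainForge today; they only illustrate the idea:

```python
# Hypothetical spec: a list of independent evaluations living in one node,
# mixing named code assertions with an LLM scoring prompt.
evaluations = [
    {"type": "code", "name": "is_json",
     "func": lambda r: r.text.strip().startswith("{")},
    {"type": "code", "name": "short_enough",
     "func": lambda r: len(r.text) <= 500},
    {"type": "llm_scorer", "name": "polite_tone",
     "prompt": "Does the response maintain a polite tone? Answer true or false.",
     "output_type": "bool"},  # locked to true/false, per the second bullet
]

def run_evaluations(response, ask_llm):
    # `ask_llm` is an injected callable (prompt, response) -> bool,
    # standing in for whatever LLM-scoring backend the node would use.
    results = {}
    for ev in evaluations:
        if ev["type"] == "code":
            results[ev["name"]] = ev["func"](response)
        else:
            results[ev["name"]] = ask_llm(ev["prompt"], response)
    return results
```

The dict of named results mirrors the existing dict-output behavior of code evaluators, so downstream views (tables, plots) could consume both kinds of evaluation uniformly.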