
Support running Evals on specific tasks #45

Open
avivex1000 opened this issue Aug 3, 2023 · 0 comments
Labels: enhancement

Comments

@avivex1000
Member

In the current implementation, the evals suite is not easily accessible: it cannot be run without downloading the source code. The proposed implementation would expose an importable evals runner, allowing users to define the models to evaluate against and then run the evaluations on a given task.

Suggested usage:

from declarai import Declarai
from declarai.evals import Evaluator

# Models to evaluate the task against
models = [
    Declarai(provider="openai", model="gpt-3.5-turbo"),
    Declarai(provider="openai", model="gpt-4"),
]

def test_task() -> str:
    """
    say something
    """

# Run the task against each configured model
evaluator = Evaluator(models=models)
evaluator.run(test_task)
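
For illustration, a minimal sketch of what the Evaluator could look like internally. It assumes Declarai's task decorator can be applied programmatically via model.task(...); the Evaluator name and run() signature follow the suggested usage above, and keying the results by list position is purely illustrative.

from typing import Callable, Dict, List

from declarai import Declarai


class Evaluator:
    def __init__(self, models: List[Declarai]):
        self.models = models

    def run(self, task_fn: Callable[[], str]) -> Dict[int, str]:
        # Assumption: each Declarai instance exposes a `task` decorator
        # that turns a plain function into an executable LLM task.
        results = {}
        for i, model in enumerate(self.models):
            task = model.task(task_fn)
            results[i] = task()
        return results

Returning the raw per-model outputs would let callers compare results directly or plug in their own scoring on top.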
avivex1000 added the enhancement label on Aug 3, 2023