
Extends base fern client with eval utilities #18

Merged

peadaroh merged 5 commits into master from eval-utilities-2 on Oct 4, 2024
Conversation

peadaroh (Contributor) commented on Oct 4, 2024

This PR extends the auto-generated Humanloop client with additional utilities for running an Evaluation where the user manages their full application runtime.

Related Fern docs for extending the client with custom code: Augment with custom code — Fern

Specifically, we add a method humanloop.evaluations.run(...) that takes the details of a file destination on Humanloop, a user-defined dataset, a user-defined function to evaluate, a name for the Evaluation, and the details of the Evaluators.
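For context, the user-managed pieces imported in the examples below (the function to evaluate and the dataset loader) are plain Python. The sketch that follows is illustrative only; the exact signatures and datapoint fields are assumptions, not part of this PR:

# Hypothetical my_code.py: signatures and datapoint fields are assumptions.

def ask_question(question: str) -> str:
    """User-managed application runtime: run the RAG pipeline and return an answer."""
    # ... retrieve context (e.g. from Chroma), call the model, etc. ...
    return "stub answer"


def load_dataset() -> list[dict]:
    """Return the datapoints to upsert into the Humanloop dataset."""
    return [
        {"inputs": {"question": "What is the first-line treatment for X?"}, "target": {"answer": "Y"}},
        # ...
    ]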

This is invoked like so:

from humanloop import Humanloop

from my_code import ask_question, load_dataset

# Assumes a standard client and a PROMPT template defined by the user.
hl = Humanloop(api_key="...")

checks = hl.evaluations.run(
    file={
        "path": "evals_demo/answer-flow",
        "type": "flow",
        "function": ask_question,
        "version": {"attributes": {"description": "Simple RAG, OpenAI + Chroma", "prompt": PROMPT}},
    },
    name="Staging CI",
    dataset={"path": "evals_demo/medqa-test", "datapoints": load_dataset()},
    evaluators=[
        {"path": "evals_demo/exact_match"},
        {"path": "evals_demo/levenshtein"},
        {"path": "evals_demo/reasoning", "threshold": 0.5},
    ],
    workers=4,
)

The resulting CLI output is shown in the two attached screenshots.
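Since the example names the run "Staging CI", the returned checks can be used to gate a pipeline. The structure of the return value isn't documented in this PR, so the fields used below (passed, path) are purely illustrative assumptions:

# Hypothetical: fail the CI job if any evaluator check did not pass.
# The shape of `checks` (items exposing `path` and `passed`) is assumed here.
failed = [check for check in checks if not check.passed]
if failed:
    raise SystemExit(f"Evaluation checks failed: {[check.path for check in failed]}")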

If the file or dataset does not yet exist on Humanloop, it will be created automatically; otherwise it will be updated appropriately.

If an Evaluation already exists for the file (based on the name provided), a new run is added to it; otherwise a new Evaluation is created automatically.

Evaluators referenced by path (or id) must already exist. However, we've also provisionally added support for local (or external) Evaluators:

from my_code import ask_question, load_dataset
from my_evaluators import levenshtein_distance_optimized

# `hl` and PROMPT are defined as in the previous example.
checks = hl.evaluations.run(
    file={
        "path": "evals_demo/answer-flow",
        "type": "flow",
        "function": ask_question,
        "version": {"attributes": {"description": "Simple RAG, OpenAI + Chroma", "prompt": PROMPT}},
    },
    name="Staging CI",
    dataset={"path": "evals_demo/medqa-test", "datapoints": load_dataset()},
    evaluators=[
        {"path": "evals_demo/exact_match"},
        {"path": "evals_demo/levenshtein"},
        {"path": "evals_demo/reasoning", "threshold": 0.5},
        {
            "path": "evals_demo/levenshtein-optimized",
            "function": levenshtein_distance_optimized,
            "args_type": "target_required",
            "return_type": "number",
        }
    ],
    workers=4
)
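For reference, a local Evaluator with args_type "target_required" and return_type "number" is an ordinary Python function. The exact arguments Humanloop passes to it aren't shown in this PR, so the (log, target) signature below is an assumption:

# Hypothetical my_evaluators.py: the (log, target) signature is an assumption.

def _levenshtein(a: str, b: str) -> int:
    """Standard dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def levenshtein_distance_optimized(log: dict, target: dict) -> float:
    """args_type "target_required": compare the logged output to the target answer.
    return_type "number": return a numeric score."""
    output = str(log.get("output", ""))
    expected = str(target.get("answer", ""))
    return float(_levenshtein(output, expected))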

peadaroh mentioned this pull request on Oct 4, 2024
peadaroh merged commit 8ef6327 into master on Oct 4, 2024