
Do you know of any packages or frameworks similar to promptfoo? #49

Closed
Keiku opened this issue Jun 28, 2023 · 8 comments

Comments

@Keiku commented Jun 28, 2023

I was looking for something just like promptfoo. Do you know of any packages or frameworks similar to this? I would like to consider other comparisons.

@ryanpeach

I also would like to know.

@typpo (Collaborator) commented Jul 1, 2023

Unfortunately not really. I built this because there wasn't anything else out there that did what I needed it to do. OpenAI does have an Evals framework you can take a look at. Its focus is on testing OpenAI models with heavier test cases, and some of the more advanced test cases require a Python implementation.
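
For context, a basic Evals test case is just a JSONL sample that pairs a chat prompt with an ideal answer. A minimal sketch of generating such a file (field names follow the built-in match-style evals; the example values are made up):

```python
# Minimal sketch: write a JSONL sample file for a basic OpenAI Evals "match"-style eval.
# The "input" field holds the chat messages and "ideal" holds the expected answer;
# the example content here is hypothetical.
import json

samples = [
    {
        "input": [
            {"role": "system", "content": "Answer concisely."},
            {"role": "user", "content": "What is the capital of France?"},
        ],
        "ideal": "Paris",
    },
]

with open("samples.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")
```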

@typpo closed this as completed Jul 1, 2023
@ryanpeach

The main thing I need is something like this, but in Python with LangChain compatibility. It might be worth cloning and converting.

@Keiku (Author) commented Jul 7, 2023

As far as I know, QAEvalChain in the LangChain module might be useful to me. I'm still looking to see whether there are other alternatives.
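
For anyone curious what that looks like, here is a minimal sketch against the (older) LangChain API; the model choice and the toy question/answer pair are just placeholders:

```python
# Minimal sketch of grading predictions with LangChain's QAEvalChain.
# Assumes an older LangChain API; the example data is hypothetical.
from langchain.chat_models import ChatOpenAI
from langchain.evaluation.qa import QAEvalChain

examples = [{"query": "What is 2 + 2?", "answer": "4"}]   # ground-truth Q/A pairs
predictions = [{"result": "2 + 2 equals 4."}]              # model outputs to grade

eval_chain = QAEvalChain.from_llm(ChatOpenAI(temperature=0))
graded = eval_chain.evaluate(
    examples,
    predictions,
    question_key="query",
    answer_key="answer",
    prediction_key="result",
)
print(graded)  # one verdict (e.g. CORRECT / INCORRECT) per example
```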

@Keiku (Author) commented Jul 7, 2023

@typpo Thanks for the link reference.

@typpo (Collaborator) commented Jul 7, 2023

For those of you working in Python, have a look at the end-to-end LLM chain testing documentation.

Specifically, I've created an example that shows how to evaluate a Python LangChain implementation.

The example compares raw GPT-4 with LangChain's LLM-Math plugin by using the exec provider to run the LangChain script:

```yaml
# promptfooconfig.yaml
# ...
providers:
  - openai:chat:gpt-4-0613
  - exec:python langchain_example.py
# ...
```
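
The referenced langchain_example.py ships with the linked example; roughly, it might look like the sketch below, assuming the exec provider passes the rendered prompt as the last command-line argument and treats stdout as the provider's output:

```python
# langchain_example.py -- hypothetical sketch, not the exact script from the example.
# Assumes promptfoo's exec provider appends the prompt as the final CLI argument
# and grades whatever is printed to stdout.
import sys

from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMMathChain

prompt = sys.argv[1]

llm = ChatOpenAI(model_name="gpt-4-0613", temperature=0)
math_chain = LLMMathChain.from_llm(llm)

# Print only the answer so promptfoo can compare it against the other provider.
print(math_chain.run(prompt))
```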

The result is a side-by-side comparison of GPT-4 and LangChain doing math:

[Image: langchain vs. gpt-4 eval results]

Hope this helps your use cases. If not, I'm interested in learning more.

Side note - QAEvalChain is similar in approach to the llm-rubric assertion type of promptfoo. It can help evaluate whether a specific answer makes sense for a specific question.
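
In promptfoo that looks something like the following sketch (the test case and rubric wording are placeholders):

```yaml
# Hypothetical promptfooconfig.yaml snippet showing an llm-rubric assertion.
tests:
  - vars:
      question: What is the largest animal on Earth?
    assert:
      - type: llm-rubric
        value: States that the blue whale is the largest animal
```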

@Keiku (Author) commented Jul 12, 2023

It looks like this was released recently:
hegelai/prompttools: Open-source tools for prompt testing and experimentation

@karrtikiyer

@typpo: First of all, congratulations on the great work in building this library. It would be great if we could have some way to directly compare and contrast promptfoo with prompttools and OpenAI Evals. That would make it easier for consumers to pick the best of these for their use case.
