
Do you know of any packages or frameworks similar to promptfoo? #49

Closed
Keiku opened this issue Jun 28, 2023 · 8 comments

Comments

@Keiku commented Jun 28, 2023

I was looking for something just like promptfoo. Do you know of any packages or frameworks similar to this? I would like to consider other comparisons.

@ryanpeach

I also would like to know.

@typpo (Collaborator) commented Jul 1, 2023

Unfortunately not really. I built this because there wasn't anything else out there that did what I needed it to do. OpenAI does have an Evals framework you can take a look at. Its focus is on testing OpenAI models with heavier test cases, and some of the more advanced test cases require a Python implementation.
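
For context, a basic Evals test case is just a JSONL sample that pairs a chat prompt with an ideal answer. A minimal sketch of generating such a file (field names follow the built-in match-style evals; the example values are made up):

```python
# Minimal sketch: write a JSONL sample file for a basic OpenAI Evals "match"-style eval.
# The "input" field holds the chat messages and "ideal" holds the expected answer;
# the example content here is hypothetical.
import json

samples = [
    {
        "input": [
            {"role": "system", "content": "Answer concisely."},
            {"role": "user", "content": "What is the capital of France?"},
        ],
        "ideal": "Paris",
    },
]

with open("samples.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")
```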

@typpo closed this as completed Jul 1, 2023
@ryanpeach

The main thing I need is something like this, but in Python with LangChain compatibility. It might be worth cloning and converting.

@Keiku (Author) commented Jul 7, 2023

As far as I know, QAEvalChain in the LangChain module might be useful to me. I'm still looking to see whether there are other alternatives.
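
For anyone curious what that looks like, here is a minimal sketch against the (older) LangChain API; the model choice and the toy question/answer pair are just placeholders:

```python
# Minimal sketch of grading predictions with LangChain's QAEvalChain.
# Assumes an older LangChain API; the example data is hypothetical.
from langchain.chat_models import ChatOpenAI
from langchain.evaluation.qa import QAEvalChain

examples = [{"query": "What is 2 + 2?", "answer": "4"}]   # ground-truth Q/A pairs
predictions = [{"result": "2 + 2 equals 4."}]              # model outputs to grade

eval_chain = QAEvalChain.from_llm(ChatOpenAI(temperature=0))
graded = eval_chain.evaluate(
    examples,
    predictions,
    question_key="query",
    answer_key="answer",
    prediction_key="result",
)
print(graded)  # one verdict (e.g. CORRECT / INCORRECT) per example
```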

@Keiku (Author) commented Jul 7, 2023

@typpo Thanks for the link reference.

@typpo (Collaborator) commented Jul 7, 2023

For those of you working in Python, have a look at the end-to-end LLM chain testing documentation.

Specifically, I've created an example that shows how to evaluate a Python LangChain implementation.

The example compares raw GPT-4 with LangChain's LLM-Math plugin by using the exec provider to run the LangChain script:

```yaml
# promptfooconfig.yaml
# ...
providers:
  - openai:chat:gpt-4-0613
  - exec:python langchain_example.py
# ...
```
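
The referenced langchain_example.py ships with the linked example; roughly, it might look like the sketch below, assuming the exec provider passes the rendered prompt as the last command-line argument and treats stdout as the provider's output:

```python
# langchain_example.py -- hypothetical sketch, not the exact script from the example.
# Assumes promptfoo's exec provider appends the prompt as the final CLI argument
# and grades whatever is printed to stdout.
import sys

from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMMathChain

prompt = sys.argv[1]

llm = ChatOpenAI(model_name="gpt-4-0613", temperature=0)
math_chain = LLMMathChain.from_llm(llm)

# Print only the answer so promptfoo can compare it against the other provider.
print(math_chain.run(prompt))
```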

The result is a side-by-side comparison of GPT-4 and LangChain doing math:

[Image: langchain vs. gpt-4 eval results]

Hope this helps your use cases. If not, I'm interested in learning more.

Side note - QAEvalChain is similar in approach to the llm-rubric assertion type of promptfoo. It can help evaluate whether a specific answer makes sense for a specific question.
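
In promptfoo that looks something like the following sketch (the test case and rubric wording are placeholders):

```yaml
# Hypothetical promptfooconfig.yaml snippet showing an llm-rubric assertion.
tests:
  - vars:
      question: What is the largest animal on Earth?
    assert:
      - type: llm-rubric
        value: States that the blue whale is the largest animal
```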

@Keiku (Author) commented Jul 12, 2023

It looks like this was released recently:
hegelai/prompttools: Open-source tools for prompt testing and experimentation

@karrtikiyer

@typpo: First of all, congratulations on the great work in building this library. It would be great if we could have some way to directly compare and contrast promptfoo with prompttools and OpenAI Evals. That would make it easier for consumers to pick the best of these for their use case.
