Overview

A/B evaluation, using Prodigy, of search results from a Weaviate vector database that uses OpenAI embeddings.

To install, first add PRODIGY_KEY and OPENAI_KEY to a .env file.

Create a new virtual environment:

python -m venv venv
source venv/bin/activate

Run make install to install libraries.

Run make get_data to download sample text data.

Weaviate

Create a cloud instance

This example uses a free cloud instance for demo purposes. Free instances are only available for 14 days.

My example includes authentication. This is optional but recommended for more realistic applications.

Add your cloud URL (WEAVIATE_CLUSTER) and auth token (WEAVIATE_TOKEN) to the .env file. You can add them manually or programmatically:

dotenv set WEAVIATE_CLUSTER https://my-sandbox-cluster-xxxxxx.weaviate.network
dotenv set WEAVIATE_KEY xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
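
Before running the later steps, it can help to confirm that every expected setting is present. The following is a minimal sketch, not part of the repository: `missing_env_vars` is a hypothetical helper, and because the Weaviate credential is referred to above as both WEAVIATE_TOKEN and WEAVIATE_KEY, the sketch accepts either name rather than assuming which one the scripts read.

```python
import os

# Variable names taken from the text above. The Weaviate credential appears
# under two names (WEAVIATE_TOKEN and WEAVIATE_KEY), so either one is accepted.
REQUIRED = ["PRODIGY_KEY", "OPENAI_KEY", "WEAVIATE_CLUSTER"]
WEAVIATE_AUTH_NAMES = ["WEAVIATE_TOKEN", "WEAVIATE_KEY"]


def missing_env_vars(env):
    """Return the names of expected settings that are absent or empty in env."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if not any(env.get(name) for name in WEAVIATE_AUTH_NAMES):
        missing.append(" or ".join(WEAVIATE_AUTH_NAMES))
    return missing


if __name__ == "__main__":
    # os.environ only contains the .env values once something has loaded the file.
    print(missing_env_vars(os.environ))
```

An empty list means all settings were found. Passing a plain dictionary instead of `os.environ` makes the check easy to try without touching the real environment.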

Create schema and upload data to cloud instance

make setup

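To check, open https://some-endpoint.weaviate.network/v1/objects in a browser or request it with an HTTP client, replacing the host with your own cluster URL (the request needs an authorization header if authentication is enabled).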
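
The check against /v1/objects can also be scripted. This is a sketch under assumptions: `build_objects_request` is a hypothetical helper, and it assumes the auth token is sent as an `Authorization: Bearer` header, which is how API-key authentication is commonly passed to the Weaviate REST API. The function only builds the request, so nothing is sent until it is passed to `urllib.request.urlopen`.

```python
import urllib.request


def build_objects_request(cluster_url, token=None):
    """Build a GET request for the /v1/objects endpoint of a cluster.

    cluster_url: base URL of the instance (for example the WEAVIATE_CLUSTER value).
    token: optional auth token; assumed to be accepted as a Bearer token.
    """
    url = cluster_url.rstrip("/") + "/v1/objects"
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request


if __name__ == "__main__":
    # Placeholder host from the text above; replace with a real cluster URL.
    req = build_objects_request("https://some-endpoint.weaviate.network", "xxxx")
    print(req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```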

Create the A/B dataset by taking the input data sick-test.jsonl, querying Weaviate with each input text, and keeping the top-ranked and 5th-ranked results:

make query

This will create two datasets: data/choice_top.jsonl (ranking #1) and data/choice_bottom.jsonl (ranking #5).
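
The selection step can be sketched as follows. This is an illustration, not the contents of query_weaviate.py: `make_ab_records` and `to_jsonl` are hypothetical helpers, the ranked results are assumed to arrive as a list of texts ordered best first, and the record layout (`id`, `input`, `output`) is assumed to match what Prodigy's compare recipe reads.

```python
import json


def make_ab_records(example_id, query_text, ranked_texts, bottom_rank=5):
    """Return (top_record, bottom_record) for one query.

    ranked_texts: search results ordered from best (rank 1) to worst.
    bottom_rank: 1-based rank used as the weaker option (5 in this README).
    """
    if len(ranked_texts) < bottom_rank:
        raise ValueError(f"need at least {bottom_rank} results, got {len(ranked_texts)}")
    base = {"id": example_id, "input": {"text": query_text}}
    top = dict(base, output={"text": ranked_texts[0]})
    bottom = dict(base, output={"text": ranked_texts[bottom_rank - 1]})
    return top, bottom


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(record) + "\n" for record in records)


if __name__ == "__main__":
    top, bottom = make_ab_records(0, "query text", ["r1", "r2", "r3", "r4", "r5"])
    print(to_jsonl([top]), end="")
    print(to_jsonl([bottom]), end="")
```

Both records share the same `id`, which is what lets the two files be paired up example by example.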

These two datasets will then be used as an A/B test, using the input text and masking which of the options is which (i.e., which is the top and which is the bottom choice).

If the model is consistent with the annotators' preferences, we'd expect the top ranking to always be preferred over the bottom (5th) ranking.

Run Prodigy's A/B (compare) recipe

make prodigy

This runs Prodigy's compare recipe:

$ python3 -m prodigy compare weaviate-sts ./data/choice_bottom.jsonl ./data/choice_top.jsonl

Added dataset weaviate-compare to database SQLite.

✨  Starting the web server at http://localhost:8080 ...
Open the app in your browser and start annotating!

^C
✔ Saved 30 annotations to database SQLite
Dataset: weaviate-compare
Session ID: 2023-04-09_18-15-52



=========================== ✨  Evaluation results ===========================
✔ You preferred B (choice_top.jsonl)

A          2   choice_bottom.jsonl
B         22   choice_top.jsonl   
Ignored    0                      
Total     24
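
To put the counts above in context, one option is to compute the share of comparisons won by the top choice and how likely such a split would be if annotators had no preference. This is a sketch of one possible summary and not something Prodigy reports: both function names are hypothetical, and the coin-flip baseline (each option equally likely to win) is an assumption.

```python
from math import comb


def preference_share(preferred, total):
    """Fraction of comparisons in which the given option was preferred."""
    return preferred / total


def binomial_upper_tail(successes, trials):
    """P(X >= successes) for X ~ Binomial(trials, 0.5), i.e. a no-preference baseline."""
    favourable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Counts from the evaluation results above: B preferred 22 times out of 24.
    print(f"{preference_share(22, 24):.3f}")  # 0.917
    print(binomial_upper_tail(22, 24))
```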

TODOS

[] Generalize query_weaviate.py to take bottom ranking as input

[] Remove options that are identical to the input

[] Explore option to compare across different embeddings: e.g., OpenAI vs. Cohere or HuggingFace

[] Explore running local weaviate instance (e.g., Docker)

[] Create custom Prodigy recipe that reads weaviate in batches, rather than flat files

[] Create tests
