## Set up the environment

1. To run this notebook, install the dependencies by running the following command in the terminal:

    ```bash
    pip3 install -r requirements.txt
    ```
    
1. Follow the instructions in `firebase/README.md` to export the Firestore data to JSON.
1. Save the JSON file as `chroma-embeddings/data/firestore-highlights-export.json`.
1. Add a `chroma-embeddings/.env` file that defines `OPENAI_KEY`.
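
Before running the cells below, it can help to confirm that the key is actually present in the `.env` file. The following is a minimal sketch, not part of the project: `read_env_file` is a hypothetical helper that handles only simple `KEY=value` lines, and the project itself may load the file differently.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=value lines; hypothetical helper for illustration only."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text().splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Path taken from the setup steps above; adjust if your working directory differs.
env = read_env_file("chroma-embeddings/.env")
if not env.get("OPENAI_KEY"):
    print("OPENAI_KEY is missing from chroma-embeddings/.env")
```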

## Estimate total tokens

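The OpenAI text embedding API charges $0.0004 per 1,000 tokens. The cell below counts the tokens in every highlight so you can estimate the cost of the import before running it.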

In [None]:
import tiktoken
import chroma

# Use the tokenizer that matches the embedding model so the count reflects billing.
encoding = tiktoken.encoding_for_model("text-embedding-ada-002")
highlights = chroma.read_highlights_export()

# Sum the token count of every highlight body.
total_tokens = sum(len(encoding.encode(highlight["body"])) for highlight in highlights)

print(f"Total tokens: {total_tokens}")
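
To turn the token count into a rough cost, multiply it by the per-token price quoted above. This is a minimal sketch, assuming the $0.0004 per 1,000 tokens figure still applies (check the current OpenAI pricing before relying on it). `estimate_cost` and the sample token count are illustrative and not part of the `chroma` module.

```python
PRICE_PER_1K_TOKENS = 0.0004  # USD, the rate quoted in this notebook; may be outdated


def estimate_cost(total_tokens, price_per_1k=PRICE_PER_1K_TOKENS):
    """Return the estimated embedding cost in USD for a given token count."""
    return total_tokens / 1000 * price_per_1k


# Made-up token count for illustration; pass `total_tokens` from the cell above instead.
example_tokens = 250_000
print(f"Estimated cost: ${estimate_cost(example_tokens):.4f}")
```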

## Import highlights

In [None]:
# Optionally set a limit here to speed up the import. Useful for debugging.
limit = None

chroma.import_highlights(limit)

## Query the embeddings

Set the `query` variable below to your search text.

The query is converted into an embedding and compared against every other embedding in the database. The results are sorted by similarity: the closer the distance is to 0, the more similar the two embeddings are.
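
As a rough illustration of that ranking step, the sketch below sorts a few made-up vectors by their distance to a query vector. It uses Euclidean distance on toy three-dimensional vectors. Real embeddings have many more dimensions, and the metric that `chroma.search_highlights` actually uses is not shown in this notebook, so treat both the metric and the names (`euclidean_distance`, `rank_by_distance`) as assumptions.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(query_vec, items):
    """Return (distance, label) pairs sorted from most to least similar."""
    scored = [(euclidean_distance(query_vec, vec), label) for label, vec in items]
    return sorted(scored)


# Toy data standing in for stored embeddings; the values are invented.
stored = [
    ("leadership advice", [0.9, 0.1, 0.0]),
    ("cooking recipe", [0.0, 0.2, 0.9]),
    ("managing a team", [0.7, 0.3, 0.1]),
]
query_vec = [0.85, 0.15, 0.05]

# Smaller distances come first, matching the "closer to 0 is more similar" rule.
for distance, label in rank_by_distance(query_vec, stored):
    print(f"{distance:.3f}  {label}")
```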

In [None]:
query = "tips for being a team lead"
chroma.search_highlights(query)