text-embedding-004 embedding model, closes #7

simonw · Apr 10, 2024 · 143097d
1 parent 38415f2

Showing 2 changed files with 64 additions and 0 deletions.
diff --git a/README.md b/README.md
@@ -42,6 +42,28 @@ llm chat -m gemini-pro
 
 If you have access to the Gemini 1.5 Pro preview you can use `-m gemini-1.5-pro-latest` to work with that model.
 
+### Embeddings
+
+The plugin also adds support for the `text-embedding-004` embedding model.
+
+Run that against a single string like this:
+```bash
+llm embed -m text-embedding-004 -c 'hello world'
+```
+This returns a JSON array of 768 numbers.
+
+This command will embed every `README.md` file in child directories of the current directory and store the results in a SQLite database called `embed.db` in a collection called `readmes`:
+
+```bash
+llm embed-multi readmes --files . '*/README.md' -d embed.db -m text-embedding-004
+```
+You can then run similarity searches against that collection like this:
+```bash
+llm similar readmes -c 'upload csvs to stuff' -d embed.db
+```
+
+See the [LLM embeddings documentation](https://llm.datasette.io/en/stable/embeddings/cli.html) for further details.
+
 ## Development
 
 To set up this plugin locally, first checkout the code. Then create a new virtual environment:
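
Note on the README hunk above: the `llm similar` step ranks stored vectors by how close they are to the embedding of the query string. The sketch below illustrates that idea with cosine similarity over plain Python lists. It is a simplified illustration, not the code LLM itself runs; the function names, the `top_n` parameter, and the toy 2-dimensional vectors are all hypothetical (real `text-embedding-004` vectors have 768 numbers).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, stored, top_n=3):
    """Return the top_n (id, score) pairs, highest score first.

    `stored` is a hypothetical dict mapping an item ID to its embedding.
    """
    scored = [
        (item_id, cosine_similarity(query_vector, vector))
        for item_id, vector in stored.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    # Toy vectors standing in for stored README embeddings.
    stored = {
        "a/README.md": [1.0, 0.0],
        "b/README.md": [0.0, 1.0],
        "c/README.md": [1.0, 1.0],
    }
    for item_id, score in rank_by_similarity([1.0, 0.1], stored):
        print(item_id, round(score, 3))
```
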

diff --git a/llm_gemini.py b/llm_gemini.py
@@ -83,3 +83,45 @@ def execute(self, prompt, stream, response, conversation):
                     gathered.append(event)
                     events.clear()
         response.response_json = gathered
+
+
+@llm.hookimpl
+def register_embedding_models(register):
+    register(
+        GeminiEmbeddingModel("text-embedding-004", "text-embedding-004"),
+    )
+
+
+class GeminiEmbeddingModel(llm.EmbeddingModel):
+    needs_key = "gemini"
+    key_env_var = "LLM_GEMINI_KEY"
+    batch_size = 20
+
+    def __init__(self, model_id, gemini_model_id):
+        self.model_id = model_id
+        self.gemini_model_id = gemini_model_id
+
+    def embed_batch(self, items):
+        headers = {
+            "Content-Type": "application/json",
+        }
+        data = {
+            "requests": [
+                {
+                    "model": "models/" + self.gemini_model_id,
+                    "content": {"parts": [{"text": item}]},
+                }
+                for item in items
+            ]
+        }
+
+        with httpx.Client() as client:
+            response = client.post(
+                f"https://generativelanguage.googleapis.com/v1beta/models/{self.gemini_model_id}:batchEmbedContents?key={self.get_key()}",
+                headers=headers,
+                json=data,
+                timeout=None,
+            )
+
+        response.raise_for_status()
+        return [item["values"] for item in response.json()["embeddings"]]
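
Note on `embed_batch` above: the request and response shapes it relies on can be exercised without a network call or an API key. The sketch below mirrors the payload construction and response parsing from the diff as two standalone functions. The names `build_batch_request` and `parse_batch_response` are hypothetical helpers introduced here for illustration, and the sample response is hand-written to match the `embeddings` / `values` keys the commit reads, not captured from the real API.

```python
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_batch_request(gemini_model_id, items):
    """Return (url, payload) for one batchEmbedContents call.

    The API key is deliberately left out of the URL here; the commit
    appends it as a `key` query-string parameter.
    """
    url = f"{API_ROOT}/models/{gemini_model_id}:batchEmbedContents"
    payload = {
        "requests": [
            {
                "model": "models/" + gemini_model_id,
                "content": {"parts": [{"text": item}]},
            }
            for item in items
        ]
    }
    return url, payload


def parse_batch_response(body):
    """Extract one list of floats per input item from a decoded JSON body."""
    return [entry["values"] for entry in body["embeddings"]]


if __name__ == "__main__":
    url, payload = build_batch_request("text-embedding-004", ["hello", "world"])
    print(url)
    print(len(payload["requests"]), "requests in the batch")

    # Hand-written stand-in for a response body (assumed shape).
    fake_body = {"embeddings": [{"values": [0.1, 0.2]}, {"values": [0.3, 0.4]}]}
    print(parse_batch_response(fake_body))
```

Keeping the payload and parsing logic in pure functions like this is one way to unit-test the shapes independently of `httpx`; the commit itself keeps everything inside `embed_batch`.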