
Conversation

Spartee
Contributor

@Spartee Spartee commented Jul 11, 2023

Providers for creating embeddings with OpenAI and Huggingface.

Includes:

  • a base class
  • simple integration testing
  • async methods for clients that support them

@Spartee Spartee added the enhancement New feature or request label Jul 11, 2023
@Spartee Spartee requested a review from tylerhutcherson July 11, 2023 04:05
Collaborator

@tylerhutcherson tylerhutcherson left a comment


Left some suggestions but looks great all around!

@@ -0,0 +1,9 @@

from redisvl.providers.openai import OpenAIProvider
Collaborator


should we consider organizing it one layer deeper, under embeddings? That way it's clear whether it's for embedding creation or something else (future-proofing)

Contributor Author


That's a long import, but I see what you mean. A couple of options:

from redisvl.api.embedding.openai
from redisvl.embedding.openai import OpenAIEmbedding
from redisvl.model.embedding.openai import ...

others?

@@ -0,0 +1,29 @@
from typing import Dict, List, Optional

from sentence_transformers import SentenceTransformer
Collaborator


we could consider importing this scoped within the class -- that way we'd prevent folks from having to install SentenceTransformers (torch, etc.) if they don't plan to use it?

Contributor Author


It's the default for LLMCache... should we not assume that? I do like the scoped guard approach.
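A minimal sketch of the scoped-import guard under discussion: the heavy dependency is imported inside the constructor, so merely importing the module doesn't require sentence-transformers (or torch). The class and method names here are illustrative, not the PR's actual API.

```python
from typing import List


class HuggingfaceProvider:
    """Illustrative provider that defers the sentence_transformers import."""

    def __init__(self, model: str = "sentence-transformers/all-MiniLM-L6-v2"):
        # Import inside __init__ so users who never instantiate this provider
        # don't need sentence-transformers (and its torch dependency) installed.
        try:
            from sentence_transformers import SentenceTransformer
        except ImportError as e:
            raise ImportError(
                "HuggingfaceProvider requires sentence-transformers. "
                "Install it with: pip install sentence-transformers"
            ) from e
        self._model_client = SentenceTransformer(model)

    def embed(self, text: str) -> List[float]:
        # encode() returns a numpy array; convert to a plain list of floats.
        return self._model_client.encode(text).tolist()
```

With this shape, `import redisvl` stays cheap and the ImportError surfaces only when (and if) the Huggingface provider is actually constructed, with a clear install hint.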

@@ -0,0 +1,48 @@
from typing import Dict, List, Optional

import openai
Collaborator


same point as above?

openai.api_key = api_config.get("api_key", None)
self._model_client = openai.Embedding

def embed_many(
Collaborator


should we look to make this immune to rate limit issues? One pattern I have seen is to use tenacity and their exponential backoff decorator.

Contributor Author


@tylerhutcherson can you point me to or write a code sample for this?
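The tenacity pattern mentioned above wraps the API call in a `@retry(wait=wait_random_exponential(...), stop=stop_after_attempt(...))` decorator. A self-contained stdlib sketch of the same idea, exponential backoff with jitter on rate-limit errors, is below; the `RateLimitError` and `flaky_embed` names are illustrative stand-ins, not real openai/redisvl symbols.

```python
import random
import time
from functools import wraps


class RateLimitError(Exception):
    """Illustrative stand-in for an API rate-limit (HTTP 429) error."""


def retry_with_backoff(max_attempts: int = 6, base: float = 1.0, cap: float = 60.0):
    """Retry on RateLimitError, doubling the wait (with jitter) each attempt."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            delay = base
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except RateLimitError:
                    if attempt == max_attempts:
                        raise  # out of retries; surface the error
                    # Sleep for the current delay plus random jitter, then
                    # double the delay up to the cap.
                    time.sleep(delay + random.uniform(0, delay))
                    delay = min(delay * 2, cap)

        return wrapper

    return decorator


@retry_with_backoff(max_attempts=3, base=0.01)
def flaky_embed(_calls=[0]):
    # Simulate an endpoint that 429s twice before succeeding.
    _calls[0] += 1
    if _calls[0] < 3:
        raise RateLimitError("simulated 429")
    return [0.1, 0.2, 0.3]
```

The jitter spreads retries out so many concurrent clients don't all hammer the API on the same schedule; tenacity's `wait_random_exponential` does the equivalent out of the box.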

@Spartee
Contributor Author

Spartee commented Jul 11, 2023

Addressing changes in next PR.

@Spartee Spartee merged commit 801ba9c into main Jul 11, 2023
@Spartee Spartee deleted the providers branch July 24, 2023 22:17