
Correct Async functionality with AsyncClientAPI and add embedding model support #6660


Open

danielnyari wants to merge 7 commits into main

Conversation

@danielnyari danielnyari commented Jun 10, 2025

1. Corrected async usage to use an actual AsyncHttpClient.

2. Added an optional embedding_function argument to support embedding models other than the default, e.g.:

import os

from autogen_ext.memory.chromadb import ChromaDBVectorMemory, HttpChromaDBVectorMemoryConfig
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction

azoai_embed_func = OpenAIEmbeddingFunction(
    api_key_env_var="AZURE_OPENAI_API_KEY",
    model_name="gpt-4o-mini",
    deployment_id="gpt-4o-mini",
    api_base=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_type="azure",
    api_version=os.environ.get("AZURE_OPENAI_API_VERSION"),
)


chroma_user_memory = ChromaDBVectorMemory(
    config=HttpChromaDBVectorMemoryConfig(
        collection_name="preferences",
        host="localhost",
        port=8000,
        ssl=False,
        k=2,
        score_threshold=0.4,
    ),
    embedding_function=azoai_embed_func,
)
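
For completeness, a hedged sketch of exercising the configured memory, assuming the add/query methods of autogen's Memory interface as documented for ChromaDBVectorMemory; details may differ from this PR:

import asyncio

from autogen_core.memory import MemoryContent, MemoryMimeType


async def demo() -> None:
    # Store a preference, then retrieve the closest entries (k=2 per the config above).
    await chroma_user_memory.add(
        MemoryContent(content="The user prefers metric units.", mime_type=MemoryMimeType.TEXT)
    )
    results = await chroma_user_memory.query("What units does the user prefer?")
    print(results.results)
    await chroma_user_memory.close()


asyncio.run(demo())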

Why are these changes needed?

1. Corrected async usage to use an actual AsyncHttpClient.

2. Added an optional embedding_function argument to support embedding models other than the default. Internally, the embedding function can be built from config values, e.g.:

from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction

return OpenAIEmbeddingFunction(
    api_key_env_var="AZURE_OPENAI_API_KEY",
    model_name=config.index.config.get("model"),  # type: ignore
    deployment_id=config.index.config.get("azure_deployment"),
    api_base=config.index.config.get("azure_endpoint"),
    api_type="azure",
    api_version=config.index.config.get("api_version"),
)

Related issue number

Checks
@danielnyari
Author

@microsoft-github-policy-service agree

@victordibia
Collaborator

@danielnyari
Thanks for this PR!

There have been some recent updates to the chromadb example in #6648 that overlap with some of the ideas you have here (e.g., there is now support for an embedding function config); that PR is now merged into main.
Can you rebase your changes on main, i.e., ensure Azure embedding function support and use the async client API?

Thanks.
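
For reference, a minimal sketch of the async client usage the review asks for, assuming chromadb's documented AsyncHttpClient API (chromadb >= 0.5); the PR's actual implementation may differ:

import asyncio

import chromadb


async def main() -> None:
    # Async factory returning an AsyncClientAPI instance (assumed API,
    # per chromadb's async client docs; not this PR's exact code).
    client = await chromadb.AsyncHttpClient(host="localhost", port=8000, ssl=False)

    collection = await client.get_or_create_collection("preferences")
    await collection.add(ids=["pref-1"], documents=["The user prefers metric units."])

    results = await collection.query(query_texts=["units"], n_results=2)
    print(results)


asyncio.run(main())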

@victordibia victordibia self-assigned this Jun 13, 2025
@@ -19,6 +22,7 @@
     OpenAIEmbeddingFunctionConfig,
     PersistentChromaDBVectorMemoryConfig,
     SentenceTransformerEmbeddingFunctionConfig,
+    AzureOpenAIEmbeddingFunctionConfig,
Collaborator
Can you comment more on this import?

Author

@danielnyari danielnyari Jun 13, 2025

I was just being a git noob with the rebase. Hope it makes sense now, with the config class committed. Anyway:

I wrote a separate config class for Azure OpenAI because, unfortunately, chromadb shares a single EmbeddingFunction for both OpenAI and Azure OpenAI.

For OpenAI you can just do:

from chromadb.utils import embedding_functions

oai_embed = embedding_functions.OpenAIEmbeddingFunction(
    api_key="api-key",
    model_name="text-embedding-3-small",
)

But if you want to use Azure OpenAI you have to:

from chromadb.utils import embedding_functions

oai_embed = embedding_functions.OpenAIEmbeddingFunction(
    api_key="api-key",
    model_name="text-embedding-3-small",
    deployment_id="text-embedding-3-small",
    api_base="https://test-endpoint.openai.azure.com/",
    api_type="azure",
    api_version="2024-12-01-preview",
)

I did not subclass the existing OpenAIEmbeddingFunctionConfig class because it would cause trouble at:

elif isinstance(config, OpenAIEmbeddingFunctionConfig):
    try:
        return embedding_functions.OpenAIEmbeddingFunction(api_key=config.api_key, model_name=config.model)
    except Exception as e:
        raise ImportError(
            f"Failed to create OpenAI embedding function with model '{config.model}'. "
            f"Ensure openai is installed and API key is valid. Error: {e}"
        ) from e
elif isinstance(config, AzureOpenAIEmbeddingFunctionConfig):
    try:
        return embedding_functions.OpenAIEmbeddingFunction(
            api_key=config.api_key,
            api_type=config.api_type,
            model_name=config.model,
            api_base=config.azure_endpoint,
            deployment_id=config.azure_deployment,
            api_version=config.api_version,
        )
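
For context, a minimal sketch of what the separate AzureOpenAIEmbeddingFunctionConfig could look like, with field names inferred from the dispatch code above (hypothetical; the class actually committed in the PR may differ):

from pydantic import BaseModel


class AzureOpenAIEmbeddingFunctionConfig(BaseModel):
    # Fields inferred from the dispatch code above; hypothetical sketch,
    # not necessarily the exact class committed in the PR.
    api_key: str
    model: str
    azure_endpoint: str
    azure_deployment: str
    api_version: str
    api_type: str = "azure"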

@@ -89,7 +93,7 @@ class ChromaDBVectorMemory(Memory, Component[ChromaDBVectorMemoryConfig]):
             collection_name="multilingual_memory",
             persistence_path=os.path.join(str(Path.home()), ".chromadb_autogen"),
             embedding_function_config=SentenceTransformerEmbeddingFunctionConfig(
-                model_name="paraphrase-multilingual-mpnet-base-v2"
+                model="paraphrase-multilingual-mpnet-base-v2"
Collaborator

Why is this argument renamed from model_name to model?

Author

@danielnyari danielnyari Jun 13, 2025

I am aware that this is a breaking change. However, my thought process was the following:

1. There is already a config implementation for the AzureOpenAIChatCompletionClient / OpenAIChatCompletionClient:

class BaseOpenAIClientConfigurationConfigModel(CreateArgumentsConfigModel):
    model: str
    api_key: SecretStr | None = None
    timeout: float | None = None
    max_retries: int | None = None
    model_capabilities: ModelCapabilities | None = None  # type: ignore
    model_info: ModelInfo | None = None
    add_name_prefixes: bool | None = None
    default_headers: Dict[str, str] | None = None

2. I assume most people who use Azure OpenAI use the same resource for chat completions and embeddings, which means the API key and endpoint will be the same.
3. This makes it easier to share configs if you have a higher-level constructor (see the sketch after this list).
4. If model makes more sense for OpenAI/Azure OpenAI, then it should be uniform across all embedding functions.
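
A hypothetical illustration of point 3: with a uniform model field, one settings mapping can feed both the chat client and the embedding function config. Names here are illustrative and not from the PR; AzureOpenAIEmbeddingFunctionConfig is the sketched class above.

import os

# Shared Azure OpenAI settings; purely illustrative.
azure_settings = {
    "model": "text-embedding-3-small",
    "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
    "api_version": "2024-12-01-preview",
}

# With "model" (rather than "model_name") used everywhere, a higher-level
# constructor can unpack the same mapping into the embedding config.
embedding_config = AzureOpenAIEmbeddingFunctionConfig(
    azure_deployment=azure_settings["model"],
    **azure_settings,
)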

+1. Since this was added recently, hopefully it isn't a big change.

@danielnyari danielnyari requested a review from victordibia June 17, 2025 08:00