@HARISHKUMAR1112001
Contributor

Feature Addition: Updating AzureAISearchRM Class

This pull request extends the AzureAISearchRM class with additional search capabilities. The class now supports Vector Search, Hybrid Search, and Full Text Search with Semantic ReRanker, giving users more versatile search options.

Changes Included:

  • Integration of Vector Search, Hybrid Search, and Full Text Search functionalities.
  • Implementation of Semantic ReRanker to enhance search results with semantic understanding.
  • Addition of necessary methods and attributes to support the new features.
  • Updates to the documentation to reflect the expanded capabilities of the class.

Checks:

  • Pre-Commit checks are passing (locally and remotely)
  • Title of your PR / MR corresponds to the required format
  • Commit message follows required format {label}(dspy): {message}

If further improvements or adjustments are needed, please leave feedback. Any suggestions for refining the AzureAISearchRM changes are highly appreciated.

@HARISHKUMAR1112001
Contributor Author

HARISHKUMAR1112001 commented Apr 14, 2024

@insop @arnavsinghvi11 @okhat, could I get a review on this PR?

HARISHKUMAR1112001 changed the title from "Add vector support in azure ai search" to "feat(dspy): add vector, hybrid and fulltext search support in azure ai search module" on Apr 14, 2024
@arnavsinghvi11
Collaborator

Thanks @HARISHKUMAR1112001 ! left a few minor comments. good to merge after that!

@HARISHKUMAR1112001
Contributor Author

Thanks @arnavsinghvi11 for the quick review. I have updated the code with correct spellings.

@arnavsinghvi11
Collaborator

Thanks @HARISHKUMAR1112001 !

arnavsinghvi11 merged commit 7b1e49a into stanfordnlp:main on Apr 16, 2024
@SimplyJuanjo

@HARISHKUMAR1112001 @arnavsinghvi11 I'm getting the error below with the following code:

requirements.txt
git+https://github.com/stanfordnlp/dspy.git
azure-search-documents==11.6.0b1

```python
client = openai.AzureOpenAI(
    base_url=f"{openai.api_base}/openai/deployments/{deployment_id}/extensions",
    api_key=openai.api_key,
    api_version=openai.api_version,
)

azure_search = AzureAISearchRM(
    search_service_name=search_endpoint,
    search_api_key=search_key,
    search_index_name=search_index_name,
    field_text="content",
    k=3,
    azure_openai_client=client,
    openai_embed_model="text-embedding-ada-002",
)

dspy.settings.configure(rm=azure_search)
retrieve = dspy.Retrieve(k=3)
retrieval_response = retrieve(name).passages

response = ""
for result in retrieval_response:
    print("Text:", result, "\n")
    response += result + "\n"
```

error:

[2024-04-18T08:45:17.286Z] Executed 'Functions.f29bot2' (Failed, Id=60fe1453-0587-4435-bd53-525ca750323c, Duration=222ms)
[2024-04-18T08:45:17.286Z] System.Private.CoreLib: Exception while executing function: Functions.f29bot2. System.Private.CoreLib: Result: Failure
[2024-04-18T08:45:17.286Z] Exception: ServiceRequestError: <urllib3.connection.HTTPSConnection object at 0x7138005e6140>: Failed to resolve 'https' ([Errno -2] Name or service not known)
[2024-04-18T08:45:17.286Z] Stack: File "/usr/lib/azure-functions-core-tools-4/workers/python/3.10/LINUX/X64/azure_functions_worker/dispatcher.py", line 495, in _handle__invocation_request
[2024-04-18T08:45:17.286Z] call_result = await self._loop.run_in_executor(
[2024-04-18T08:45:17.286Z] File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
[2024-04-18T08:45:17.286Z] result = self.fn(*self.args, **self.kwargs)
[2024-04-18T08:45:17.286Z] File "/usr/lib/azure-functions-core-tools-4/workers/python/3.10/LINUX/X64/azure_functions_worker/dispatcher.py", line 768, in _run_sync_func
[2024-04-18T08:45:17.286Z] return ExtensionManager.get_sync_invocation_wrapper(context,
[2024-04-18T08:45:17.286Z] File "/usr/lib/azure-functions-core-tools-4/workers/python/3.10/LINUX/X64/azure_functions_worker/extension.py", line 215, in _raw_invocation_wrapper
[2024-04-18T08:45:17.286Z] result = function(**args)
[2024-04-18T08:45:17.286Z] File "/home/juanjo/juanjo_repos/functions/f29bot2/__init__.py", line 54, in main
[2024-04-18T08:45:17.286Z] retrieval_response = retrieve(name).passages
[2024-04-18T08:45:17.286Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/dspy/retrieve/retrieve.py", line 30, in __call__
[2024-04-18T08:45:17.286Z] return self.forward(*args, **kwargs)
[2024-04-18T08:45:17.286Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/dspy/retrieve/retrieve.py", line 39, in forward
[2024-04-18T08:45:17.286Z] passages = dsp.retrieveEnsemble(queries, k=k,**kwargs)
[2024-04-18T08:45:17.286Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/dsp/primitives/search.py", line 57, in retrieveEnsemble
[2024-04-18T08:45:17.286Z] return retrieve(queries[0], k, **kwargs)
[2024-04-18T08:45:17.287Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/dsp/primitives/search.py", line 12, in retrieve
[2024-04-18T08:45:17.287Z] passages = dsp.settings.rm(query, k=k, **kwargs)
[2024-04-18T08:45:17.287Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/dspy/retrieve/retrieve.py", line 30, in __call__
[2024-04-18T08:45:17.287Z] return self.forward(*args, **kwargs)
[2024-04-18T08:45:17.287Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/dspy/retrieve/azureaisearch_rm.py", line 328, in forward
[2024-04-18T08:45:17.287Z] results = self.azure_search_request(
[2024-04-18T08:45:17.287Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/dspy/retrieve/azureaisearch_rm.py", line 288, in azure_search_request
[2024-04-18T08:45:17.287Z] sorted_results = sorted(results, key=lambda x: x["@search.score"], reverse=True)
[2024-04-18T08:45:17.287Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/search/documents/_paging.py", line 54, in __next__
[2024-04-18T08:45:17.287Z] return next(self._page_iterator)
[2024-04-18T08:45:17.287Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/paging.py", line 75, in __next__
[2024-04-18T08:45:17.287Z] self._response = self._get_next(self.continuation_token)
[2024-04-18T08:45:17.287Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/search/documents/_paging.py", line 124, in _get_next_cb
[2024-04-18T08:45:17.287Z] return self._client.documents.search_post(search_request=self._initial_query.request, **self._kwargs)
[2024-04-18T08:45:17.287Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
[2024-04-18T08:45:17.287Z] return func(*args, **kwargs)
[2024-04-18T08:45:17.287Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/search/documents/_generated/operations/_documents_operations.py", line 777, in search_post
[2024-04-18T08:45:17.287Z] pipeline_response: PipelineResponse = self._client._pipeline.run( # pylint: disable=protected-access
[2024-04-18T08:45:17.287Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 230, in run
[2024-04-18T08:45:17.287Z] return first_node.send(pipeline_request)
[2024-04-18T08:45:17.287Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
[2024-04-18T08:45:17.287Z] response = self.next.send(request)
[2024-04-18T08:45:17.287Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
[2024-04-18T08:45:17.288Z] response = self.next.send(request)
[2024-04-18T08:45:17.288Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
[2024-04-18T08:45:17.288Z] response = self.next.send(request)
[2024-04-18T08:45:17.288Z] [Previous line repeated 2 more times]
[2024-04-18T08:45:17.288Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/pipeline/policies/_redirect.py", line 197, in send
[2024-04-18T08:45:17.288Z] response = self.next.send(request)
[2024-04-18T08:45:17.288Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/pipeline/policies/_retry.py", line 553, in send
[2024-04-18T08:45:17.288Z] raise err
[2024-04-18T08:45:17.288Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/pipeline/policies/_retry.py", line 531, in send
[2024-04-18T08:45:17.288Z] response = self.next.send(request)
[2024-04-18T08:45:17.288Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
[2024-04-18T08:45:17.288Z] response = self.next.send(request)
[2024-04-18T08:45:17.288Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
[2024-04-18T08:45:17.288Z] response = self.next.send(request)
[2024-04-18T08:45:17.288Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
[2024-04-18T08:45:17.288Z] response = self.next.send(request)
[2024-04-18T08:45:17.288Z] [Previous line repeated 2 more times]
[2024-04-18T08:45:17.288Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 119, in send
[2024-04-18T08:45:17.288Z] self._sender.send(request.http_request, **request.context.options),
[2024-04-18T08:45:17.288Z] File "/home/juanjo/juanjo_repos/functions/.venv/lib/python3.10/site-packages/azure/core/pipeline/transport/_requests_basic.py", line 386, in send
[2024-04-18T08:45:17.288Z] raise error
[2024-04-18T08:45:17.288Z] .

already commented this on the discord this morning

@HARISHKUMAR1112001
Contributor Author

I think you are providing the wrong arguments to AzureOpenAI. It accepts the following arguments:

```
    """Construct a new synchronous azure openai client instance.

    This automatically infers the following arguments from their corresponding environment variables if they are not provided:
    - `api_key` from `AZURE_OPENAI_API_KEY`
    - `organization` from `OPENAI_ORG_ID`
    - `azure_ad_token` from `AZURE_OPENAI_AD_TOKEN`
    - `api_version` from `OPENAI_API_VERSION`
    - `azure_endpoint` from `AZURE_OPENAI_ENDPOINT`

    Args:
        azure_endpoint: Your Azure endpoint, including the resource, e.g. `https://example-resource.azure.openai.com/`

        azure_ad_token: Your Azure Active Directory token, https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id

        azure_ad_token_provider: A function that returns an Azure Active Directory token, will be invoked on every request.

        azure_deployment: A model deployment, if given sets the base client URL to include `/deployments/{azure_deployment}`.
            Note: this means you won't be able to use non-deployment endpoints. Not supported with Assistants APIs.
    """
```
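As a rough illustration of the docstring above, the client could be built from `azure_endpoint`, `api_key`, and `api_version` rather than a hand-assembled `base_url`. This is only a sketch: the helper names `azure_openai_kwargs` and `make_client` are hypothetical, and the fallback values are placeholders, not real settings.

```python
import os


def azure_openai_kwargs(env=os.environ):
    """Collect the keyword arguments named in the docstring above.

    Hypothetical helper; the fallback values are placeholders for illustration.
    """
    return {
        "azure_endpoint": env.get(
            "AZURE_OPENAI_ENDPOINT", "https://example-resource.azure.openai.com/"
        ),
        "api_key": env.get("AZURE_OPENAI_API_KEY", "<your-key>"),
        "api_version": env.get("OPENAI_API_VERSION", "<api-version>"),
    }


def make_client(env=os.environ):
    """Build the client; requires the openai package to be installed."""
    # Imported lazily so the helper above works without the package.
    from openai import AzureOpenAI

    return AzureOpenAI(**azure_openai_kwargs(env))


# Inspect the arguments without creating a client or calling any service.
kwargs = azure_openai_kwargs(
    {"AZURE_OPENAI_ENDPOINT": "https://example-resource.azure.openai.com/"}
)
print(sorted(kwargs))
```

The resulting client would then be passed as `azure_openai_client=make_client()` when constructing AzureAISearchRM, as in the snippet earlier in this thread.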

@SimplyJuanjo

Thanks for your response, @HARISHKUMAR1112001.

It was really weird, because both AzureOpenAI and the SearchClient were working on their own without DSPy.

But after debugging with a custom AzureRM class, I think I found the error.

It was related to the "search_service_name=search_endpoint" argument.

I was passing search_endpoint as a complete URL:

search_endpoint = "https://f29search.search.windows.net";

[2024-04-19T12:36:57.321Z] Azure Search Request
[2024-04-19T12:36:57.321Z] <SearchClient [endpoint='https://https://f29search.search.windows.net.search.windows.net', index='sharepoint-index']>

Then I tried without "https://":

search_endpoint = "f29search.search.windows.net";

[2024-04-19T12:41:45.467Z] Azure Search Request
[2024-04-19T12:41:45.469Z] <SearchClient [endpoint='https://f29search.search.windows.net.search.windows.net', index='sharepoint-index']>

but should have passed the service name only:

search_endpoint = "f29search"; # Azure AI Search service name only, not the full endpoint URL

now it worked!

sorry for bothering, was my bad!
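For anyone hitting the same error, here is a minimal sketch of what seems to be happening, inferred only from the logged endpoints above and not from the library source. It assumes the endpoint is formed as `https://{search_service_name}.search.windows.net`; `build_endpoint` and `to_service_name` are hypothetical helpers, not part of DSPy or the Azure SDK.

```python
from urllib.parse import urlsplit

SEARCH_DOMAIN = "search.windows.net"  # suffix seen in the logged endpoints


def build_endpoint(search_service_name: str) -> str:
    """Mirror the endpoint pattern visible in the logs (an assumption)."""
    return f"https://{search_service_name}.{SEARCH_DOMAIN}"


def to_service_name(value: str) -> str:
    """Hypothetical helper: reduce a URL, host name, or bare name to the service name."""
    value = value.strip()
    # Drop a leading scheme such as "https://" if present.
    if "://" in value:
        value = value.split("://", 1)[1]
    # Drop any path, then the domain suffix.
    host = value.split("/", 1)[0]
    suffix = "." + SEARCH_DOMAIN
    if host.endswith(suffix):
        host = host[: -len(suffix)]
    return host


# Passing a full URL doubles the scheme, so the host parses as "https".
bad = build_endpoint("https://f29search.search.windows.net")
print(bad)
print(urlsplit(bad).hostname)

# Normalizing first yields the endpoint that was actually intended.
good = build_endpoint(to_service_name("https://f29search.search.windows.net"))
print(good)
```

With the doubled scheme, the parsed host name is the literal string `https`, which is consistent with the "Failed to resolve 'https'" message in the traceback.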

arnavsinghvi11 added a commit that referenced this pull request Jul 12, 2024
…-azure-ai-search

feat(dspy): add vector, hybrid and fulltext search support in azure ai search module