
[BUG] AI Search connection in mlindex_content not detected #3313

Closed
adamdougal opened this issue May 20, 2024 · 8 comments
Assignees
Labels
bug Something isn't working no-recent-activity There has been no recent activity on this issue/pull request

Comments


adamdougal commented May 20, 2024

Describe the bug
I have a Multi-Round Q&A on Your Data chat flow which queries Azure AI Search and passes the results, along with the chat history and the question, to Azure OpenAI. After exporting the flow with the intention of deploying it to Azure App Service, I am unable to successfully run the built flow (either as an executable or as a Docker image). I believe this is because the Azure AI Search connection is being ignored.

When running pf flow serve the flow runs correctly and I am able to get responses.

I have tried this with prompt flow versions 1.9.0, 1.10.0, and 1.11.0.

How To Reproduce the bug
Steps to reproduce the behavior, how frequent can you experience the bug:

  1. Create a Multi-Round Q&A on your data chat flow.
  2. Export the flow locally
  3. Install the requirements: pip install -r requirements.txt
  4. Create the AI Search and OpenAI connection files
  5. Add the connections to prompt flow: pf connection create -f <file>
  6. Build the flow: pf flow build --source . --output dist-docker --format docker
  7. Inspect the dist-docker/connections directory and observe that the AI Search connection is missing
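As a quick way to check step 7 programmatically, one could compare the emitted connection files against the expected set. This is only a sketch: it assumes the build writes one <name>.yaml per connection under dist-docker/connections, and built_connections is a hypothetical helper, not part of promptflow.

```python
from pathlib import Path

# Connections the flow references; the AI Search one is the file that goes missing.
EXPECTED = {"openai_connection", "aisearch_connection"}

def built_connections(dist_dir: str) -> set[str]:
    """Names of the connection YAML files the build emitted (assumed layout)."""
    return {p.stem for p in Path(dist_dir, "connections").glob("*.yaml")}

missing = EXPECTED - built_connections("dist-docker")
if missing:
    print(f"connections missing from the build: {missing}")
```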

Expected behavior
The AI Search connection is picked up and a connection file is created.


Running Information(please complete the following information):

  • Promptflow package version using pf -v:
$ pf -v
{
 "promptflow": "1.9.0",
 "promptflow-core": "1.9.0",
 "promptflow-devkit": "1.9.0",
 "promptflow-tracing": "1.9.0"
}
  • Operating system:
$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
  • Python version using python --version:
$ python --version
Python 3.11.8

Additional context
Example files:

A shortened and redacted example prompt flow
id: bring_your_own_data_chat_qna
name: Bring Your Own Data Chat QnA
inputs:
  chat_history:
    type: list
    default:
    - inputs:
        chat_input: Hi
      outputs:
        chat_output: Hello! How can I assist you today?
    - inputs:
        chat_input: What is Azure compute instance?
      outputs:
        chat_output: An Azure Machine Learning compute instance is a fully managed
          cloud-based workstation for data scientists. It provides a
          pre-configured and managed development environment in the cloud for
          machine learning. Compute instances can also be used as a compute
          target for training and inferencing for development and testing
          purposes. They have a job queue, run jobs securely in a virtual
          network environment, and can run multiple small jobs in parallel.
          Additionally, compute instances support single-node multi-GPU
          distributed training jobs.
    is_chat_input: false
    is_chat_history: true
  chat_input:
    type: string
    default: How can I create one using azureml sdk V2?
    is_chat_input: true
outputs:
  chat_output:
    type: string
    reference: ${chat_with_context.output}
    is_chat_output: true
nodes:
- name: index_lookup
  type: python
  source:
    type: package
    tool: promptflow_vectordb.tool.common_index_lookup.search
  inputs:
    mlindex_content: >
      embeddings:
        api_base: https://<openai-resource-redacted>.openai.azure.com/
        api_type: azure
        api_version: '2024-02-01'
        batch_size: '1'
        connection:
          id: /subscriptions/<subscription-redacted>/resourceGroups/<resource-group-redacted>/providers/Microsoft.MachineLearningServices/workspaces/<aml-workspace-redacted>/connections/openai_connection
        connection_type: workspace_connection
        deployment: text-embedding-ada-002
        dimension: 1536
        kind: open_ai
        model: text-embedding-ada-002
        schema_version: '2'
      index:
        api_version: '2023-11-01'
        connection:
          id: /subscriptions/<subscription-redacted>/resourceGroups/<resource-group-redacted>/providers/Microsoft.MachineLearningServices/workspaces/<aml-workspace-redacted>/connections/aisearch_connection
        connection_type: workspace_connection
        endpoint: https://<ai-search-resource-redacted>.search.windows.net
        engine: azure-sdk
        field_mapping:
          content: content
          embedding: content_vector
          metadata: metadata
        index: <search-index-redacted>
        kind: acs
        semantic_configuration_name: default
    queries: ${inputs.chat_input}
    query_type: Hybrid (vector + keyword)
    top_k: 2
  use_variants: false
- name: chat_with_context
  type: llm
  source:
    type: code
    path: chat_with_context.jinja2
  inputs:
    deployment_name: gpt-35-turbo-16k
    temperature: 0
    top_p: 1
    max_tokens: 1000
    presence_penalty: 0
    frequency_penalty: 0
    prompt_text: ${Prompt_variants.output}
  provider: AzureOpenAI
  connection: openai_connection
  api: chat
  module: promptflow.tools.aoai
  use_variants: false
node_variants: {}
environment:
  python_requirements_txt: requirements.txt
AI Search Connection
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/CognitiveSearchConnection.schema.json
name: aisearch_connection
type: cognitive_search
api_key:  ${env:AISEARCH_CONNECTION_API_KEY}
api_base: "https://<search-resource-redacted>.search.windows.net"
api_version: "2023-03-15-preview"
@adamdougal adamdougal added the bug Something isn't working label May 20, 2024

adamdougal commented May 20, 2024

I've been playing around with this test with my flow.dag.yml and it looks like where it's falling down is in flow.py:get_connection_names() where it does not inspect the mlindex_content.

I'm not sure if the code has changed in a way that broke this, if the generated flow.dag.yml interface has changed, or if this has perhaps never worked. Or maybe I'm doing something wrong!
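For illustration, pulling the workspace connection names out of an mlindex_content string could look something like the following. This is a minimal sketch using a regex over the connection resource ids, not the actual promptflow get_connection_names() implementation, and connection_names_from_mlindex is a hypothetical helper.

```python
import re

# Trimmed-down stand-in for the mlindex_content YAML shown in the flow above.
MLINDEX_CONTENT = """\
embeddings:
  connection:
    id: /subscriptions/xxx/resourceGroups/rg/providers/Microsoft.MachineLearningServices/workspaces/ws/connections/openai_connection
index:
  connection:
    id: /subscriptions/xxx/resourceGroups/rg/providers/Microsoft.MachineLearningServices/workspaces/ws/connections/aisearch_connection
"""

def connection_names_from_mlindex(content: str) -> set[str]:
    # Workspace connection ids end in .../connections/<name>; collect each <name>.
    return set(re.findall(r"/connections/([\w-]+)", content))

print(sorted(connection_names_from_mlindex(MLINDEX_CONTENT)))
# → ['aisearch_connection', 'openai_connection']
```

If get_connection_names() only walks the node-level connection fields, names embedded this way inside mlindex_content would be skipped, which would match the behaviour described above.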


pgr-lopes commented May 20, 2024

Pretty sure you hit the same problem I did: #2876

You have to set the account metadata manually; for whatever reason, the activity does not take the config.json file into consideration:

az login
az account set --subscription <subscription_id>
az configure --defaults group=<resource_group_name> workspace=<workspace_name>


adamdougal commented May 20, 2024

I've hardcoded a connection in the flow.dag.yml which has got me further.

I am now getting a response of:

{"error":{"code":"UserError","message":"Execution failure in 'index_lookup': (Exception) Exception occured in search_function_construction."}}

I have also deployed this to an AML endpoint using the deploy button in AML and get the same error there.

As it stands, from what I can tell, using Azure AI Search with Prompt Flow is currently unusable unless invoked from AML Studio.

@brynn-code brynn-code self-assigned this May 21, 2024
@brynn-code
Contributor

The response didn't show the full error reason; you can find the error details in the App Service container logs.
If the error is caused by a missing connection, I can explain more about how connections work when deploying to App Service.

When deploying to Azure App Service, promptflow uses connections locally (here "locally" means the connection metadata stored in the local SQLite database). I took a look at your flow: there are two connections, the AI Search one and the OpenAI one. If you want to use Azure AI connections stored in your Azure AI project, please set the connection provider config so that promptflow fetches Azure AI connections (you may need to add a command to the container startup script, and remember to grant the App Service a reader role on your AI project so it can access the connection keys). Refer to here for the connection config.

@brynn-code
Contributor

Usually we don't assume the user is deploying to App Service with Azure AI connections, as that would cause many problems (permissions, service principal roles, and so on), so by default we guide the user to set up the connections again for their App Service locally by setting App Service environment variables. You can find the local setup guide in the following documentation:
https://microsoft.github.io/promptflow/cloud/azureai/deploy-to-azure-appservice.html#view-and-test-the-web-app

@adamdougal
Author

Heya, thanks for your response! Unfortunately, even after setting connection.provider=local, it does not pick up the AI Search connection unless I manually add it to the flow.dag.yaml.

Regarding the "Exception occured in search_function_construction" error: I get this both locally and when the flow is deployed as an AML endpoint. Here is the exception from the logs:

[2024-05-21 07:14:14,056][flowinvoker][ERROR] - Flow run failed with error: {'message': "Execution failure in 'index_lookup': (Exception) Exception occured in search_function_construction.", 'messageFormat': "Execution failure in '{node_name}'.", 'messageParameters': {'node_name': 'index_lookup'}, 'referenceCode': 'Tool/promptflow_vectordb.tool.common_index_lookup', 'code': 'UserError', 'innerError': {'code': 'ToolExecutionError', 'innerError': None}, 'additionalInfo': [{'type': 'ToolExecutionErrorDetails', 'info': {'type': 'Exception', 'message': 'Exception occured in search_function_construction.', 'traceback': 'Traceback (most recent call last):
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/utils/profiling.py", line 18, in measure_execution_time
    yield
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup.py", line 54, in _get_search_func
    search_func = build_search_func(index, top_k, query_type)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup_extensions/utils.py", line 37, in build_search_func
    store = index.as_langchain_vectorstore()
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/azureml/rag/mlindex.py", line 212, in as_langchain_vectorstore
    return azuresearch.AzureSearch(
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/langchain_community/vectorstores/azuresearch.py", line 268, in __init__
    self.client = _get_search_client(
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/langchain_community/vectorstores/azuresearch.py", line 84, in _get_search_client
    from azure.search.documents.indexes.models import (
ImportError: cannot import name \'ExhaustiveKnnAlgorithmConfiguration\' from \'azure.search.documents.indexes.models\' (/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/azure/search/documents/indexes/models/__init__.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/tracing/_trace.py", line 470, in wrapped
    output = func(*args, **kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/core/logging/utils.py", line 98, in wrapper
    res = func(*args, **kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup.py", line 125, in search
    search_func = _get_search_func(mlindex_content, top_k, query_type)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup.py", line 54, in _get_search_func
    search_func = build_search_func(index, top_k, query_type)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/utils/profiling.py", line 21, in measure_execution_time
    raise Exception(error_msg) from e
Exception: Exception occured in search_function_construction.
', 'filename': '/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/utils/profiling.py', 'lineno': 21, 'name': 'measure_execution_time'}}], 'debugInfo': {'type': 'ToolExecutionError', 'message': "Execution failure in 'index_lookup': (Exception) Exception occured in search_function_construction.", 'stackTrace': '
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/flow_executor.py", line 1008, in _exec
    output, aggregation_inputs = self._exec_inner_with_trace(
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/flow_executor.py", line 913, in _exec_inner_with_trace
    output, nodes_outputs = self._traverse_nodes(inputs, context)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/flow_executor.py", line 1189, in _traverse_nodes
    nodes_outputs, bypassed_nodes = self._submit_to_scheduler(context, inputs, batch_nodes)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/flow_executor.py", line 1244, in _submit_to_scheduler
    return scheduler.execute(self._line_timeout_sec)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/_flow_nodes_scheduler.py", line 131, in execute
    raise e
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/_flow_nodes_scheduler.py", line 113, in execute
    self._dag_manager.complete_nodes(self._collect_outputs(completed_futures))
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/_flow_nodes_scheduler.py", line 160, in _collect_outputs
    each_node_result = each_future.result()
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/executor/_flow_nodes_scheduler.py", line 181, in _exec_single_node_in_thread
    result = context.invoke_tool(node, f, kwargs=kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/_core/flow_execution_context.py", line 90, in invoke_tool
    result = self._invoke_tool_inner(node, f, kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/_core/flow_execution_context.py", line 206, in _invoke_tool_inner
    raise ToolExecutionError(node_name=node_name, module=module) from e
', 'innerException': {'type': 'Exception', 'message': 'Exception occured in search_function_construction.', 'stackTrace': '
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/_core/flow_execution_context.py", line 182, in _invoke_tool_inner
    return f(**kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/tracing/_trace.py", line 470, in wrapped
    output = func(*args, **kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/core/logging/utils.py", line 98, in wrapper
    res = func(*args, **kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup.py", line 125, in search
    search_func = _get_search_func(mlindex_content, top_k, query_type)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup.py", line 54, in _get_search_func
    search_func = build_search_func(index, top_k, query_type)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/utils/profiling.py", line 21, in measure_execution_time
    raise Exception(error_msg) from e
', 'innerException': {'type': 'ImportError', 'message': "cannot import name 'ExhaustiveKnnAlgorithmConfiguration' from 'azure.search.documents.indexes.models' (/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/azure/search/documents/indexes/models/__init__.py)", 'stackTrace': 'Traceback (most recent call last):
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/utils/profiling.py", line 18, in measure_execution_time
    yield
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup.py", line 54, in _get_search_func
    search_func = build_search_func(index, top_k, query_type)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow_vectordb/tool/common_index_lookup_extensions/utils.py", line 37, in build_search_func
    store = index.as_langchain_vectorstore()
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/azureml/rag/mlindex.py", line 212, in as_langchain_vectorstore
    return azuresearch.AzureSearch(
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/langchain_community/vectorstores/azuresearch.py", line 268, in __init__
    self.client = _get_search_client(
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/langchain_community/vectorstores/azuresearch.py", line 84, in _get_search_client
    from azure.search.documents.indexes.models import (
', 'innerException': None}}}}
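The root cause in the traceback is the final ImportError: the installed azure-search-documents package appears to predate ExhaustiveKnnAlgorithmConfiguration (added, I believe, around version 11.4.0), so langchain_community's azuresearch module cannot import it. A generic way to probe for this kind of mismatch is sketched below; missing_names is a hypothetical helper, demonstrated against a stdlib module so it runs anywhere.

```python
import importlib

def missing_names(module_name: str, names: list[str]) -> list[str]:
    """Return the names that the given module does not expose."""
    mod = importlib.import_module(module_name)
    return [n for n in names if not hasattr(mod, n)]

# In the failing environment one would call, for example:
#   missing_names("azure.search.documents.indexes.models",
#                 ["ExhaustiveKnnAlgorithmConfiguration"])
# A non-empty result suggests azure-search-documents needs upgrading.

print(missing_names("math", ["sqrt", "not_a_real_name"]))
# → ['not_a_real_name']
```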


Hi, we're sending this friendly reminder because we haven't heard back from you in 30 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 7 days of this comment, the issue will be automatically closed. Thank you!

@github-actions github-actions bot added the no-recent-activity There has been no recent activity on this issue/pull request label Jun 20, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 27, 2024