# Mosaic AI Agent Framework: Author and deploy a multi-agent system with Genie and Serving Endpoints

This notebook demonstrates how to build a multi-agent system using Mosaic AI Agent Framework and [LangGraph](https://blog.langchain.dev/langgraph-multi-agent-workflows/), where [Genie](https://www.databricks.com/product/ai-bi/genie) is one of the agents.
In this notebook, you:
1. Author a multi-agent system using LangGraph.
1. Wrap the LangGraph agent with MLflow `ResponsesAgent` to ensure compatibility with Databricks features.
1. Manually test the multi-agent system's output.
1. Log and deploy the multi-agent system.

This example is based on the [LangGraph documentation - Multi-agent supervisor example](https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/multi_agent/agent_supervisor.md).

## Why use a Genie agent?

Multi-agent systems consist of multiple AI agents working together, each with specialized capabilities. As one of those agents, Genie lets users interact with their structured data using natural language. Unlike SQL functions, which can only run predefined queries, Genie can generate new queries to answer user questions.

## Prerequisites

- Address all `TODO`s in this notebook.
- Create a Genie Space. See Databricks documentation ([AWS](https://docs.databricks.com/aws/genie/set-up) | [Azure](https://learn.microsoft.com/azure/databricks/genie/set-up)).

In [0]:
%pip install -U -qqq langgraph-supervisor==0.0.30 mlflow[databricks] databricks-langchain databricks-agents uv 
dbutils.library.restartPython()

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.



## Define the multi-agent system

Create a multi-agent system in LangGraph using a supervisor agent node with one or more of the following subagents:
- **GenieAgent**: A LangChain runnable that allows you to easily interact with your Genie Space to query structured data.
- **Custom serving agent**: An agent that is already hosted as an existing endpoint on Databricks.
- **In-code tool-calling agent**: An agent that calls Unity Catalog function tools, defined within this notebook. This example uses `system.ai.python_exec`, but for examples of other tools you can add to your agents, see Databricks documentation ([AWS](https://docs.databricks.com/aws/generative-ai/agent-framework/agent-tool) | [Azure](https://learn.microsoft.com/en-us/azure/databricks/generative-ai/agent-framework/agent-tool)).

The supervisor agent is responsible for creating and routing tool calls to each of your subagents, passing only the context necessary. You can modify this behavior and pass along the entire message history if desired. See the [LangGraph docs](https://langchain-ai.github.io/langgraph/reference/supervisor/) for more information.
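
To make the routing idea concrete, the following is a minimal, library-free sketch of how subagent names and descriptions can be folded into a supervisor prompt. `SubAgentSpec` and `build_supervisor_prompt` are hypothetical names used only for illustration; in this notebook the prompt is assembled inside `create_langgraph_supervisor`, and the actual routing is delegated to `create_supervisor` from `langgraph_supervisor`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubAgentSpec:
    """Hypothetical stand-in for a subagent's routing metadata."""

    name: str
    description: str


def build_supervisor_prompt(instructions: str, agents: list[SubAgentSpec]) -> str:
    """Append one '- name: description' bullet per subagent to the instructions."""
    bullets = "\n".join(f"- {a.name}: {a.description}" for a in agents)
    return f"{instructions.strip()}\n\n{bullets}"


prompt = build_supervisor_prompt(
    "You are a supervisor in a multi-agent system.",
    [
        # Illustrative descriptions, not the ones used later in this notebook
        SubAgentSpec("talent_genie", "Answers questions about workforce data."),
        SubAgentSpec("code_agent", "Runs Python code."),
    ],
)
print(prompt)
```

Because the supervisor model only sees these bullets when choosing a subagent, specific descriptions tend to matter more than the general instructions.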

### Write agent code to file

Define the agent code in a single cell below. This lets you write the agent code to a local Python file, using the `%%writefile` magic command, for subsequent logging and deployment.


In [0]:
%%writefile agent.py
import json
from typing import Generator, Literal
from uuid import uuid4

import mlflow
from databricks_langchain import (
    ChatDatabricks,
    DatabricksFunctionClient,
    UCFunctionToolkit,
    set_uc_function_client,
)
from databricks_langchain.genie import GenieAgent
from langchain_core.runnables import Runnable
from langchain.agents import create_agent
from langgraph.graph.state import CompiledStateGraph
from langgraph_supervisor import create_supervisor
from mlflow.pyfunc import ResponsesAgent
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
    output_to_responses_items_stream,
    to_chat_completions_input,
)
from pydantic import BaseModel

client = DatabricksFunctionClient()
set_uc_function_client(client)

########################################
# Create your LangGraph Supervisor Agent
########################################

GENIE = "genie"


class ServedSubAgent(BaseModel):
    endpoint_name: str
    name: str
    task: Literal["agent/v1/responses", "agent/v1/chat", "agent/v2/chat"]
    description: str


class Genie(BaseModel):
    space_id: str
    name: str
    task: str = GENIE
    description: str


class InCodeSubAgent(BaseModel):
    tools: list[str]
    name: str
    description: str


TOOLS = []


def stringify_content(state):
    msgs = state["messages"]
    if isinstance(msgs[-1].content, list):
        msgs[-1].content = json.dumps(msgs[-1].content, indent=4)
    return {"messages": msgs}


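def create_langgraph_supervisor(
    llm: Runnable,
    externally_served_agents: list[ServedSubAgent] | None = None,
    in_code_agents: list[InCodeSubAgent] | None = None,
):
    # Use None defaults so no mutable default list is shared across calls
    externally_served_agents = externally_served_agents or []
    in_code_agents = in_code_agents or []
    agents = []
    agent_descriptions = ""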

    # Process inline code agents
    for agent in in_code_agents:
        agent_descriptions += f"- {agent.name}: {agent.description}\n"
        uc_toolkit = UCFunctionToolkit(function_names=agent.tools)
        TOOLS.extend(uc_toolkit.tools)
        agents.append(create_agent(llm, tools=uc_toolkit.tools, name=agent.name))

    # Process served endpoints and Genie Spaces
    for agent in externally_served_agents:
        agent_descriptions += f"- {agent.name}: {agent.description}\n"
        if isinstance(agent, Genie):
            # to better control the messages sent to the genie agent, you can use the `message_processor` param: https://api-docs.databricks.com/python/databricks-ai-bridge/latest/databricks_langchain.html#databricks_langchain.GenieAgent
            genie_agent = GenieAgent(
                genie_space_id=agent.space_id,
                genie_agent_name=agent.name,
                description=agent.description,
            )
            genie_agent.name = agent.name
            agents.append(genie_agent)
        else:
            model = ChatDatabricks(
                endpoint=agent.endpoint_name, use_responses_api="responses" in agent.task
            )
            # Disable streaming for subagents for ease of parsing
            model._stream = lambda x: model._stream(**x, stream=False)
            agents.append(
                create_agent(
                    model,
                    tools=[],
                    name=agent.name,
                    post_model_hook=stringify_content,
                )
            )

    # TODO: The supervisor prompt includes agent names/descriptions as well as general
    # instructions. You can modify this to improve quality or provide custom instructions.
    prompt = f"""
    You are a supervisor in a multi-agent system.

    1. Understand the user's last request
    2. Read through the entire chat history.
    3. If the answer to the user's last request is present in chat history, answer using information in the history.
    4. If the answer is not in the history, from the below list of agents, determine which agent is best suited to answer the question.
    5. Provide a summarized response to the user's last query, even if it's been answered before.

    {agent_descriptions}"""

    return create_supervisor(
        agents=agents,
        model=llm,
        prompt=prompt,
        add_handoff_messages=False,
        output_mode="full_history",
    ).compile()


##########################################
# Wrap LangGraph Supervisor as a ResponsesAgent
##########################################


class LangGraphResponsesAgent(ResponsesAgent):
    def __init__(self, agent: CompiledStateGraph):
        self.agent = agent

    def predict(self, request: ResponsesAgentRequest) -> ResponsesAgentResponse:
        outputs = [
            event.item
            for event in self.predict_stream(request)
            if event.type == "response.output_item.done"
        ]
        return ResponsesAgentResponse(output=outputs, custom_outputs=request.custom_inputs)

    def predict_stream(
        self,
        request: ResponsesAgentRequest,
    ) -> Generator[ResponsesAgentStreamEvent, None, None]:
        cc_msgs = to_chat_completions_input([i.model_dump() for i in request.input])
        first_message = True
        seen_ids = set()

        # can adjust `recursion_limit` to limit looping: https://docs.langchain.com/oss/python/langgraph/GRAPH_RECURSION_LIMIT#troubleshooting
        for _, events in self.agent.stream({"messages": cc_msgs}, stream_mode=["updates"]):
            new_msgs = [
                msg
                for v in events.values()
                for msg in v.get("messages", [])
                if msg.id not in seen_ids
            ]
            if first_message:
                seen_ids.update(msg.id for msg in new_msgs[: len(cc_msgs)])
                new_msgs = new_msgs[len(cc_msgs) :]
                first_message = False
            else:
                seen_ids.update(msg.id for msg in new_msgs)
                node_name = tuple(events.keys())[0]  # assumes one name per node
                yield ResponsesAgentStreamEvent(
                    type="response.output_item.done",
                    item=self.create_text_output_item(
                        text=f"<name>{node_name}</name>", id=str(uuid4())
                    ),
                )
            if len(new_msgs) > 0:
                yield from output_to_responses_items_stream(new_msgs)


#######################################################
# Configure the Foundation Model and Serving Sub-Agents
#######################################################

# TODO: Replace with your model serving endpoint
LLM_ENDPOINT_NAME = "databricks-meta-llama-3-1-8b-instruct"
llm = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME)

# TODO: Add the necessary information about each of your subagents. Subagents could be agents deployed to Model Serving endpoints or Genie Space subagents.
# Your agent descriptions are crucial for improving quality. Include as much detail as possible.
EXTERNALLY_SERVED_AGENTS = [
    Genie(
        space_id="01f0c9f705201d14b364f5daf28bb639",
        name="talent_genie",
        description="This agent can analyze talent stability and mobility patterns to identify early signs of attrition risk and growth opportunities that can help leaders make proactive, data-driven decisions about their workforce."
    ),
    # ServedSubAgent(
    #     endpoint_name="cities-agent",
    #     name="city-agent", # choose a semantically relevant name for your agent
    #     task="agent/v1/responses",
    #     description="This agent can answer questions about the best cities to visit in the world.",
    # ),
]

############################################################
# Create additional agents in code
############################################################

# TODO: Fill the following with UC function-calling agents. The tools parameter is a list of UC function names that you want your agent to call.
# IN_CODE_AGENTS = [
#     InCodeSubAgent(
#         tools=["system.ai.*"],
#         name="code execution agent",
#         description="The code execution agent specializes in solving programming challenges, generating code snippets, debugging issues, and explaining complex coding concepts.",
#     )
# ]
IN_CODE_AGENTS = []

#################################################
# Create supervisor and set up MLflow for tracing
#################################################

supervisor = create_langgraph_supervisor(llm, EXTERNALLY_SERVED_AGENTS, IN_CODE_AGENTS)

mlflow.langchain.autolog()
AGENT = LangGraphResponsesAgent(supervisor)
mlflow.models.set_model(AGENT)

Overwriting agent.py
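
The `predict_stream` method above tracks message IDs so that a message repeated in a later update is not emitted twice. The following is a simplified sketch of that idea using plain dictionaries instead of LangChain message objects. The helper name `new_messages` and the sample updates are hypothetical, and the sketch leaves out the extra step in `agent.py` that drops the echoed input messages from the first update.

```python
from typing import Iterable, Iterator


def new_messages(
    updates: Iterable[dict[str, list[dict]]],
) -> Iterator[tuple[str, dict]]:
    """Yield (node_name, message) pairs, skipping message ids already emitted."""
    seen_ids: set[str] = set()
    for update in updates:
        for node_name, messages in update.items():
            for msg in messages:
                if msg["id"] in seen_ids:
                    continue
                seen_ids.add(msg["id"])
                yield node_name, msg


# Hypothetical update stream: the second update repeats message "m1".
updates = [
    {"supervisor": [{"id": "m1", "text": "routing to talent_genie"}]},
    {
        "talent_genie": [
            {"id": "m1", "text": "routing to talent_genie"},
            {"id": "m2", "text": "query result"},
        ]
    },
]
emitted = list(new_messages(updates))
print([(node, msg["id"]) for node, msg in emitted])
# prints [('supervisor', 'm1'), ('talent_genie', 'm2')]
```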


## Test the agent

Interact with the agent to test its output. Because `agent.py` calls `mlflow.langchain.autolog()`, you can view the trace for each step the agent takes.

Even if you didn't add any subagents in the agent definition above, the supervisor agent can still answer questions. It just won't have any subagents to switch to.

**Important:** LangGraph uses internal control-flow exceptions (such as `Command` or `ParentCommand`) to switch between nodes. These exceptions may appear in your MLflow traces as events. This behavior is expected and is not a cause for concern.

In [0]:
dbutils.library.restartPython()

In [0]:
from agent import AGENT

# TODO: Replace this placeholder `input_example` with a domain-specific prompt for your agent.
input_example = {
    "input": [
        {"role": "user", "content": "What is the average time-in-role across business units?"}
    ]
}


AGENT.predict(input_example)

ResponsesAgentResponse(tool_choice=None, truncation=None, id=None, created_at=None, error=None, incomplete_details=None, instructions=None, metadata=None, model=None, object='response', output=[OutputItem(type='message', id='5e4d1f2f-60ef-4a54-9b73-00ce812fb3db', content=[{'text': '<name>talent_genie</name>', 'type': 'output_text'}], role='assistant'), OutputItem(type='message', id='d1a6759e-bc10-4a1c-85dc-a20a21f37a02', content=[{'text': '|    | business_unit    |   avg_time_in_role_days |\n|---:|:-----------------|------------------------:|\n|  0 | HR               |                -1006.59 |\n|  1 | Finance          |                -1017.58 |\n|  2 | Operations       |                -1018.22 |\n|  3 | Customer Success |                -1044.47 |\n|  4 | Sales            |                -1065.38 |\n|  5 | Engineering      |                -1113.59 |', 'type': 'output_text'}], role='assistant'), OutputItem(type='message', id='3d8c4224-058d-4476-80ad-7b3eb33cdbd7', content=[{'text':

Trace(trace_id=tr-59afb0f827a700e5c3544c267faabbc6)

In [0]:
for event in AGENT.predict_stream(input_example):
    print(event.model_dump(exclude_none=True))

{'type': 'response.output_item.done', 'item': {'id': 'ef28e874-2f51-49a8-8eac-d86246759646', 'content': [{'text': '<name>talent_genie</name>', 'type': 'output_text'}], 'role': 'assistant', 'type': 'message'}}
{'type': 'response.output_item.done', 'item': {'id': '29f232b4-e5d0-47da-a895-b10e5efa55bd', 'content': [{'text': '|     | manager_id                           | avg_attrition_flag   |   avg_attrition_risk_score |   team_size |\n|----:|:-------------------------------------|:---------------------|---------------------------:|------------:|\n|   0 | 210f8592-cd3e-4a9e-bcf3-4ff8f5fe6ae0 |                      |                  0.07      |           8 |\n|   1 | e491ffe5-bc20-4012-9207-c94cdca0fe09 |                      |                  0.07      |           8 |\n|   2 | a1bff46f-53b5-4c1b-9587-5d278f4711be |                      |                  0.0682353 |          17 |\n|   3 | 27195713-0a32-4349-9593-2bf4c120b0f8 |                      |                  0.0666667 |        

Trace(trace_id=tr-c8b6d8d9e0acbf8ab266ef066ea2c912)
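
As the output above shows, the wrapper emits a `<name>...</name>` marker item before a node's messages. The following sketch, which is not part of the original notebook, groups text output under the most recent marker. The helper `group_by_agent` is hypothetical and assumes each output item is available as a plain dictionary shaped like the items printed above (for example, after calling `model_dump()` on it).

```python
import re
from collections import defaultdict

NAME_TAG = re.compile(r"^<name>(.+)</name>$")


def group_by_agent(output_items: list[dict]) -> dict[str, list[str]]:
    """Group output_text chunks under the most recent <name>...</name> marker."""
    grouped: dict[str, list[str]] = defaultdict(list)
    current = "unknown"  # bucket for text seen before any marker
    for item in output_items:
        # Items without text content (e.g. tool calls) are skipped
        for part in item.get("content") or []:
            if part.get("type") != "output_text":
                continue
            match = NAME_TAG.match(part["text"])
            if match:
                current = match.group(1)
            else:
                grouped[current].append(part["text"])
    return dict(grouped)


# Hand-written sample items that mimic the shape of the output above
items = [
    {"type": "message", "content": [{"type": "output_text", "text": "<name>talent_genie</name>"}]},
    {"type": "message", "content": [{"type": "output_text", "text": "| business_unit | ... |"}]},
    {"type": "message", "content": [{"type": "output_text", "text": "<name>supervisor</name>"}]},
    {"type": "message", "content": [{"type": "output_text", "text": "Summary for the user."}]},
]
print(group_by_agent(items))
```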

## Log the agent as an MLflow model

Log the agent as code from the `agent.py` file. See [MLflow - Models from Code](https://mlflow.org/docs/latest/models.html#models-from-code).

### Enable automatic authentication for Databricks resources
For the most common Databricks resource types, Databricks supports and recommends declaring resource dependencies for the agent upfront during logging. This enables automatic authentication passthrough when you deploy the agent. With automatic authentication passthrough, Databricks automatically provisions, rotates, and manages short-lived credentials to securely access these resource dependencies from within the agent endpoint.

To enable automatic authentication, specify the dependent Databricks resources when calling `mlflow.pyfunc.log_model()`.
  - **TODO**: If your Unity Catalog tool queries a vector search index or leverages external functions, you need to include the dependent vector search index and UC connection objects, respectively, as resources. See docs ([AWS](https://docs.databricks.com/aws/generative-ai/agent-framework/agent-authentication#supported-resources-for-automatic-authentication-passthrough) | [Azure](https://learn.microsoft.com/azure/databricks/generative-ai/agent-framework/agent-authentication#supported-resources-for-automatic-authentication-passthrough)).

  - **TODO**: Add the SQL Warehouse or tables powering your Genie Space to enable passthrough authentication. See docs ([AWS](https://docs.databricks.com/aws/generative-ai/agent-framework/agent-authentication#supported-resources-for-automatic-authentication-passthrough) | [Azure](https://learn.microsoft.com/azure/databricks/generative-ai/agent-framework/agent-authentication#supported-resources-for-automatic-authentication-passthrough)). If your Genie Space uses "embedded credentials", you do not have to add these resources.

In [0]:
# Determine Databricks resources to specify for automatic auth passthrough at deployment time
import mlflow
from agent import EXTERNALLY_SERVED_AGENTS, LLM_ENDPOINT_NAME, TOOLS, Genie
from databricks_langchain import UnityCatalogTool, VectorSearchRetrieverTool
from mlflow.models.resources import (
    DatabricksFunction,
    DatabricksGenieSpace,
    DatabricksServingEndpoint,
    DatabricksSQLWarehouse,
    DatabricksTable
)
from pkg_resources import get_distribution

# TODO: Manually include underlying resources if needed. See the TODO in the markdown above for more information.
resources = [DatabricksServingEndpoint(endpoint_name=LLM_ENDPOINT_NAME)]
# TODO: Add SQL Warehouses and delta tables powering the Genie Space
resources.append(DatabricksSQLWarehouse(warehouse_id="148ccb90800933a1"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.fact_attrition_snapshots"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.dim_employees"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.fact_compensation"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.fact_performance"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.fact_role_history"))

# Add tools from Unity Catalog
for tool in TOOLS:
    if isinstance(tool, VectorSearchRetrieverTool):
        resources.extend(tool.resources)
    elif isinstance(tool, UnityCatalogTool):
        resources.append(DatabricksFunction(function_name=tool.uc_function_name))

# Add serving endpoints and Genie Spaces
for agent in EXTERNALLY_SERVED_AGENTS:
    if isinstance(agent, Genie):
        resources.append(DatabricksGenieSpace(genie_space_id=agent.space_id))
    else:
        resources.append(DatabricksServingEndpoint(endpoint_name=agent.endpoint_name))

with mlflow.start_run():
    logged_agent_info = mlflow.pyfunc.log_model(
        name="agent",
        python_model="agent.py",
        resources=resources,
        pip_requirements=[
            f"databricks-connect=={get_distribution('databricks-connect').version}",
            f"mlflow=={get_distribution('mlflow').version}",
            f"databricks-langchain=={get_distribution('databricks-langchain').version}",
            f"langgraph=={get_distribution('langgraph').version}",
            f"langgraph-supervisor=={get_distribution('langgraph-supervisor').version}",
        ],
    )

🔗 View Logged Model at: https://adb-984752964297111.11.azuredatabricks.net/ml/experiments/3188636615953256/models/m-498c32aa60dd4141a8b3f6df6a863f87?o=984752964297111
2025/11/25 12:37:24 INFO mlflow.pyfunc: Predicting on input example to validate output
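
The logging cell above pins each dependency to the version installed in the notebook environment. As an alternative to `pkg_resources`, the same pinning can be sketched with the standard-library `importlib.metadata` module. The helper `pinned_requirements` is hypothetical, and skipping packages that are not installed is an assumption made for this sketch; in a real logging step you may prefer to fail loudly instead.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(packages: list[str]) -> list[str]:
    """Return 'name==version' strings for the packages that are installed."""
    pins = []
    for name in packages:
        try:
            pins.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Assumption for this sketch: skip packages that are not installed
            continue
    return pins


print(pinned_requirements(["pip", "a-package-that-is-not-installed"]))
```

The resulting list could be passed as `pip_requirements` in place of the hand-written f-strings.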


## Pre-deployment agent validation
Before registering and deploying the agent, perform pre-deployment checks using the [mlflow.models.predict()](https://mlflow.org/docs/latest/python_api/mlflow.models.html#mlflow.models.predict) API. See Databricks documentation ([AWS](https://docs.databricks.com/en/machine-learning/model-serving/model-serving-debug.html#validate-inputs) | [Azure](https://learn.microsoft.com/en-us/azure/databricks/machine-learning/model-serving/model-serving-debug#before-model-deployment-validation-checks)).

In [0]:
import mlflow
mlflow.models.predict(
    model_uri=f"runs:/{logged_agent_info.run_id}/agent",
    input_data=input_example,
    env_manager="uv",
)

2025/11/25 12:37:44 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'


2025/11/25 12:37:46 INFO mlflow.utils.virtualenv: Creating a new environment in /tmp/virtualenv_envs/mlflow-23a64789871d1fa1c78aa7da9409abe895cdb96e with python version 3.12.3 using uv
Using CPython 3.12.3 interpreter at: /usr/bin/python3.12
Creating virtual environment at: /tmp/virtualenv_envs/mlflow-23a64789871d1fa1c78aa7da9409abe895cdb96e
Activate with: source /tmp/virtualenv_envs/mlflow-23a64789871d1fa1c78aa7da9409abe895cdb96e/bin/activate
2025/11/25 12:37:47 INFO mlflow.utils.virtualenv: Installing dependencies
Using Python 3.12.3 environment at: /tmp/virtualenv_envs/mlflow-23a64789871d1fa1c78aa7da9409abe895cdb96e
Resolved 3 packages in 68ms
Downloading setuptools (1.2MiB)
Downloading pip (1.8MiB)
 Downloaded pip
 Downloaded setuptools
Prepared 3 packages in 105ms
Installed 3 packages in 16ms
 + pip=

{"object": "response", "output": [{"type": "message", "id": "d8ee3d00-af8b-4b7a-881a-ce9b3065e086", "content": [{"text": "<name>talent_genie</name>", "type": "output_text"}], "role": "assistant"}, {"type": "message", "id": "30350062-ba57-4597-8bd4-d0a3f846933d", "content": [{"text": "|     | attrition_manager_id                 |   avg_attrition_flag |   num_records |\n|----:|:-------------------------------------|---------------------:|--------------:|\n|   0 | 49f6584a-9f8e-4796-868a-7f4788c99367 |                    0 |           216 |\n|   1 | ab2bf807-ad1a-444c-8f15-bd7f475d5e54 |                    0 |           648 |\n|   2 | 67d82de0-c47f-4eec-9fd3-06af2f55be1a |                    0 |           468 |\n|   3 | 6f94ef60-a7ea-44e7-8609-ccbd0927dcf1 |                    0 |           419 |\n|   4 | 5a09f90b-2650-4dd9-9e3b-bd5d7bba449e |                    0 |           378 |\n|   5 | 6d1f7392-6a5e-4b52-b6d7-d6cd7dfc4e87 |                    0 |           385 |\n|   6 | c08cab78-b1

2025/11/25 12:38:12 INFO mlflow.tracing.export.async_export_queue: Flushing the async trace logging queue before program exit. This may take a while...


## Register the model to Unity Catalog

Update the `catalog`, `schema`, and `model_name` below to register the MLflow model to Unity Catalog.

In [0]:
mlflow.set_registry_uri("databricks-uc")

# TODO: define the catalog, schema, and model name for your UC model
catalog = "akash_s_demo"
schema = "talent"
model_name = "mobility_attrition"
UC_MODEL_NAME = f"{catalog}.{schema}.{model_name}"

# register the model to UC
uc_registered_model_info = mlflow.register_model(
    model_uri=logged_agent_info.model_uri, name=UC_MODEL_NAME
)

Successfully registered model 'akash_s_demo.talent.mobility_attrition'.


🔗 Created version '1' of model 'akash_s_demo.talent.mobility_attrition': https://adb-984752964297111.11.azuredatabricks.net/explore/data/models/akash_s_demo/talent/mobility_attrition/version/1?o=984752964297111


## Deploy the agent

In [0]:
from databricks import agents

agents.deploy(UC_MODEL_NAME, uc_registered_model_info.version, tags={"endpointSource": "docs"}, deploy_feedback_model=False)


    Deployment of akash_s_demo.talent.mobility_attrition version 1 initiated.  This can take up to 15 minutes and the Review App & Query Endpoint will not work until this deployment finishes.

    View status: https://adb-984752964297111.11.azuredatabricks.net/ml/endpoints/agents_akash_s_demo-talent-mobility_attrition/?o=984752964297111
    Review App: https://adb-984752964297111.11.azuredatabricks.net/ml/review-v2/7920ec559cf7453a8c8d00e55e0e056a/chat?o=984752964297111
    Monitor: https://adb-984752964297111.11.azuredatabricks.net/ml/experiments/3188636615953256?compareRunsMode=TRACES&o=984752964297111

You can refer back to the links above from the endpoint detail page at https://adb-984752964297111.11.azuredatabricks.net/ml/endpoints/agents_akash_s_demo-talent-mobility_attrition/?o=984752964297111.


Deployment(model_name='akash_s_demo.talent.mobility_attrition', model_version='1', endpoint_name='agents_akash_s_demo-talent-mobility_attrition', served_entity_name='akash_s_demo-talent-mobility_attrition_1', query_endpoint='https://adb-984752964297111.11.azuredatabricks.net/serving-endpoints/agents_akash_s_demo-talent-mobility_attrition/served-models/akash_s_demo-talent-mobility_attrition_1/invocations?o=984752964297111', endpoint_url='https://adb-984752964297111.11.azuredatabricks.net/ml/endpoints/agents_akash_s_demo-talent-mobility_attrition/?o=984752964297111', review_app_url='https://adb-984752964297111.11.azuredatabricks.net/ml/review-v2/7920ec559cf7453a8c8d00e55e0e056a/chat?o=984752964297111')

## Next steps

After your agent is deployed, you can chat with it in AI Playground to perform additional checks, share it with subject matter experts (SMEs) in your organization for feedback, or embed it in a production application. See Databricks documentation ([AWS](https://docs.databricks.com/en/generative-ai/deploy-agent.html) | [Azure](https://learn.microsoft.com/en-us/azure/databricks/generative-ai/deploy-agent)).
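
For embedding the agent in an application, the following is a minimal sketch of building an HTTP request for the deployed endpoint using only the standard library. It assumes the endpoint accepts the same `{"input": [...]}` payload used by `input_example` in this notebook, posted to the `/serving-endpoints/<endpoint-name>/invocations` path with a bearer token. The host, endpoint name, and token below are placeholders, and `build_invocation_request` is a hypothetical helper. The request is built but not sent.

```python
import json
import urllib.request


def build_invocation_request(
    host: str, endpoint_name: str, token: str, question: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a serving endpoint."""
    payload = {"input": [{"role": "user", "content": question}]}
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_invocation_request(
    host="https://example.cloud.databricks.com",  # placeholder workspace URL
    endpoint_name="my-agent-endpoint",  # placeholder endpoint name
    token="<personal-access-token>",  # placeholder credential
    question="What is the average time-in-role across business units?",
)
print(req.full_url)
# Sending is left to the caller, for example with urllib.request.urlopen(req)
```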

In [0]:
from databricks.sdk import WorkspaceClient

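def _get_endpoint_task_type(endpoint_name: str) -> str:
    """Return the task type of a serving endpoint, defaulting to chat completions."""
    w = WorkspaceClient()
    # Pass the variable, not the string literal "endpoint_name"
    ep = w.serving_endpoints.get(endpoint_name)
    return ep.task if ep.task else "chat/completions"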

In [0]:

_get_endpoint_task_type('agents_akash_s_demo-talent-mobility_attrition')

'agent/v1/responses'