### Fabric Data Agent Automation Library

### Introduction
This notebook demonstrates how to automate Fabric data agent tasks programmatically with the `fabric-data-agent-sdk` library: creating a data agent, adding a datasource (for example, a Lakehouse), and adding instructions to a data agent. For more information on data agents, see the [Fabric data agent concept documentation](https://learn.microsoft.com/en-us/fabric/data-science/concept-ai-skill).

In [None]:
%pip install fabric-data-agent-sdk

In [None]:
from fabric.dataagent.client import (
    FabricDataAgentManagement,
    create_data_agent,
    delete_data_agent,
)

First, let's create a data agent.

In [None]:
data_agent_name = "data_agent_automation_sample"

In [None]:
# create DataAgent
data_agent = create_data_agent(data_agent_name)

You can check the configuration of a data agent as shown below:

In [None]:
# By default, the instructions and description of the data agent are empty; they are updated later in the notebook.
data_agent.get_configuration()

DataAgentConfiguration(instructions=None, user_description=None)

You can also initialize a client for an existing data agent by passing its name:

In [None]:
data_agent = FabricDataAgentManagement(data_agent_name)

Update the data agent with instructions and a description.

In [None]:
data_agent.update_configuration(
    instructions="You are a helpful assistant, help users with their questions",
    user_description="Data agent to assist users with insights from the AdventureWorks dataset.",
)
data_agent.get_configuration()

DataAgentConfiguration(instructions='You are a helpful assistant, help users with their questions', user_description='Data agent to assist users with insights from the AdventureWorks dataset.')

Next, add a datasource to your data agent. This sample adds a Lakehouse, but you can also add a semantic model, a KQL database, or a data warehouse.

The rest of the notebook uses a sample Lakehouse called AdventureWorks. You can create the same Lakehouse by following the instructions here: https://learn.microsoft.com/en-us/fabric/data-science/ai-skill-scenario#create-a-lakehouse-with-adventureworksdw

If you name your Lakehouse `AdventureWorks`, the rest of the notebook should run without any changes. To use your own Lakehouse, change `lakehouse_name` and the other relevant variables throughout the notebook.

In [None]:
# add a lakehouse
lakehouse_name = "AdventureWorks"
# datasource type could be: lakehouse, kqldatabase, datawarehouse or semanticmodel
data_agent.add_datasource(lakehouse_name, type="lakehouse")

Datasource(04be57db-55bb-4a11-afcf-d403693468c4)
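
Because the `type` argument is a plain string, it can help to validate it before calling `add_datasource`. The following is a minimal sketch, assuming the four type names listed in the code comment above are the complete set; `normalize_datasource_type` and `SUPPORTED_DATASOURCE_TYPES` are hypothetical names and are not part of the SDK.

```python
# Type names copied from the comment in the cell above. Treating them as the
# complete set is an assumption made for this sketch.
SUPPORTED_DATASOURCE_TYPES = ("lakehouse", "kqldatabase", "datawarehouse", "semanticmodel")


def normalize_datasource_type(raw_type: str) -> str:
    """Return the lowercase type name without spaces or underscores, or raise ValueError."""
    candidate = raw_type.strip().lower().replace(" ", "").replace("_", "")
    if candidate not in SUPPORTED_DATASOURCE_TYPES:
        raise ValueError(
            f"Unknown datasource type {raw_type!r}; "
            f"expected one of {', '.join(SUPPORTED_DATASOURCE_TYPES)}"
        )
    return candidate


print(normalize_datasource_type("Semantic Model"))  # prints: semanticmodel
```

In the notebook, the result could then be passed on, for example `data_agent.add_datasource(lakehouse_name, type=normalize_datasource_type("Lakehouse"))`.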

In [None]:
# we can check which datasources are added to the data agent
data_agent.get_datasources()

[Datasource(04be57db-55bb-4a11-afcf-d403693468c4)]

You can publish the data agent.

In [None]:
data_agent.publish()

Now you will work with the datasource you just added.

In [None]:
datasource = data_agent.get_datasources()[0]

You can take a look at the tables and the columns in the datasource. 

- By default, the datasource is initialized with no tables selected. A `*` next to a table name indicates that the table is selected.
- Use `datasource.select` to pick the tables (or all tables) that are relevant to the questions the data agent should answer.
- Selecting a table will also select all columns in the table.

In [None]:
datasource.pretty_print()

 dbo
  | dimcurrency
  |  | CurrencyKey
  |  | CurrencyAlternateKey
  |  | CurrencyName
  | dimcustomer
  |  | CustomerKey
  |  | GeographyKey
  |  | CustomerAlternateKey
  |  | Title
  |  | FirstName
  |  | MiddleName
  |  | LastName
  |  | NameStyle
  |  | BirthDate
  |  | MaritalStatus
  |  | Suffix
  |  | Gender
  |  | EmailAddress
  |  | YearlyIncome
  |  | TotalChildren
  |  | NumberChildrenAtHome
  |  | EnglishEducation
  |  | SpanishEducation
  |  | FrenchEducation
  |  | EnglishOccupation
  |  | SpanishOccupation
  |  | FrenchOccupation
  |  | HouseOwnerFlag
  |  | NumberCarsOwned
  |  | AddressLine1
  |  | AddressLine2
  |  | Phone
  |  | DateFirstPurchase
  |  | CommuteDistance
  | dimaccount
  |  | AccountKey
  |  | ParentAccountKey
  |  | AccountCodeAlternateKey
  |  | ParentAccountCodeAlternateKey
  |  | AccountDescription
  |  | AccountType
  |  | Operator
  |  | CustomMembers
  |  | ValueType
  | dimproductcategory
  |  | ProductCategoryKey
  |  | ProductCategoryAlterna

You can select or unselect tables to include them in, or exclude them from, query generation.

In [None]:
datasource.select("dbo", "dimcurrency")
datasource.select("dbo", "dimemployee")
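
When several tables need to be selected, the repeated calls can be wrapped in a loop. This is a small sketch that relies only on the `datasource.select(schema, table)` call used above; `select_tables` is a hypothetical helper, not an SDK function.

```python
def select_tables(datasource, schema, table_names):
    """Call datasource.select(schema, table) once per table name, in order.

    Returns the list of table names that were passed to select().
    """
    names = list(table_names)  # materialize first so generators also work
    for table_name in names:
        datasource.select(schema, table_name)
    return names


# Usage in the notebook (needs a real datasource object):
# select_tables(datasource, "dbo", ["dimcurrency", "dimemployee"])
```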

You will see that the `*` will appear next to the `dimcurrency` and `dimemployee` tables, as they are now selected.

In [None]:
datasource.pretty_print()

 dbo
  | dimcurrency *
  |  | CurrencyKey
  |  | CurrencyAlternateKey
  |  | CurrencyName
  | dimcustomer
  |  | CustomerKey
  |  | GeographyKey
  |  | CustomerAlternateKey
  |  | Title
  |  | FirstName
  |  | MiddleName
  |  | LastName
  |  | NameStyle
  |  | BirthDate
  |  | MaritalStatus
  |  | Suffix
  |  | Gender
  |  | EmailAddress
  |  | YearlyIncome
  |  | TotalChildren
  |  | NumberChildrenAtHome
  |  | EnglishEducation
  |  | SpanishEducation
  |  | FrenchEducation
  |  | EnglishOccupation
  |  | SpanishOccupation
  |  | FrenchOccupation
  |  | HouseOwnerFlag
  |  | NumberCarsOwned
  |  | AddressLine1
  |  | AddressLine2
  |  | Phone
  |  | DateFirstPurchase
  |  | CommuteDistance
  | dimaccount
  |  | AccountKey
  |  | ParentAccountKey
  |  | AccountCodeAlternateKey
  |  | ParentAccountCodeAlternateKey
  |  | AccountDescription
  |  | AccountType
  |  | Operator
  |  | CustomMembers
  |  | ValueType
  | dimproductcategory
  |  | ProductCategoryKey
  |  | ProductCategoryAlter
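
For long schemas, scanning the printed tree for `*` markers by eye gets tedious. The sketch below parses text in the layout shown above and returns the selected tables. The layout is inferred from this notebook's output only, and `capture_pretty_print` assumes that `pretty_print()` writes to standard output; both helper names are hypothetical.

```python
import contextlib
import io


def capture_pretty_print(datasource):
    """Return what datasource.pretty_print() writes (assumes it prints to stdout)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        datasource.pretty_print()
    return buffer.getvalue()


def selected_tables(tree_text):
    """Return (schema, table) pairs for table lines that end with '*'."""
    selected = []
    schema = None
    for line in tree_text.splitlines():
        if not line.strip():
            continue
        parts = line.split("|")
        depth = len(parts) - 1  # 0 = schema, 1 = table, 2 = column
        name = parts[-1].strip()
        if depth == 0:
            schema = name
        elif depth == 1 and name.endswith("*"):
            selected.append((schema, name[:-1].strip()))
    return selected


# A shortened copy of the output above, used as sample input.
SAMPLE = """\
 dbo
  | dimcurrency *
  |  | CurrencyKey
  | dimcustomer
  |  | CustomerKey
"""

print(selected_tables(SAMPLE))  # prints: [('dbo', 'dimcurrency')]
```

In the notebook, `selected_tables(capture_pretty_print(datasource))` would apply the same parsing to the live datasource, provided the stdout assumption holds.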

Now let's unselect the `dimcurrency` table.

In [None]:
datasource.unselect("dbo", "dimcurrency")

Notice that the `*` next to the `dimcurrency` table is gone now that you have unselected it. This means the data agent is instructed not to use the unselected table when generating an answer.

In [None]:
datasource.pretty_print()

 dbo
  | dimcurrency
  |  | CurrencyKey
  |  | CurrencyAlternateKey
  |  | CurrencyName
  | dimcustomer
  |  | CustomerKey
  |  | GeographyKey
  |  | CustomerAlternateKey
  |  | Title
  |  | FirstName
  |  | MiddleName
  |  | LastName
  |  | NameStyle
  |  | BirthDate
  |  | MaritalStatus
  |  | Suffix
  |  | Gender
  |  | EmailAddress
  |  | YearlyIncome
  |  | TotalChildren
  |  | NumberChildrenAtHome
  |  | EnglishEducation
  |  | SpanishEducation
  |  | FrenchEducation
  |  | EnglishOccupation
  |  | SpanishOccupation
  |  | FrenchOccupation
  |  | HouseOwnerFlag
  |  | NumberCarsOwned
  |  | AddressLine1
  |  | AddressLine2
  |  | Phone
  |  | DateFirstPurchase
  |  | CommuteDistance
  | dimaccount
  |  | AccountKey
  |  | ParentAccountKey
  |  | AccountCodeAlternateKey
  |  | ParentAccountCodeAlternateKey
  |  | AccountDescription
  |  | AccountType
  |  | Operator
  |  | CustomMembers
  |  | ValueType
  | dimproductcategory
  |  | ProductCategoryKey
  |  | ProductCategoryAlterna

You can also add few-shot examples.

In [None]:
example_question = "How many employees are there in the company?"
example_query = "SELECT COUNT(*) AS NumberOfEmployees FROM dbo.dimemployee"
datasource.add_fewshots({example_question:example_query})
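
`add_fewshots` takes a dictionary that maps each question to its query. A dict literal silently keeps only the last query when a question is repeated, so it can be worth building the dictionary through a small check first. This is a sketch; `build_fewshots` is a hypothetical helper, and the second example pair is made up for illustration.

```python
def build_fewshots(pairs):
    """Turn (question, query) pairs into a {question: query} dict.

    Raises ValueError on empty text or on a repeated question.
    """
    fewshots = {}
    for question, query in pairs:
        question, query = question.strip(), query.strip()
        if not question or not query:
            raise ValueError("Each few-shot needs a non-empty question and query")
        if question in fewshots:
            raise ValueError(f"Duplicate question: {question!r}")
        fewshots[question] = query
    return fewshots


fewshots = build_fewshots([
    ("How many employees are there in the company?",
     "SELECT COUNT(*) AS NumberOfEmployees FROM dbo.dimemployee"),
    # Hypothetical second example; it assumes dbo.dimcurrency is selected.
    ("How many currencies are listed?",
     "SELECT COUNT(*) AS NumberOfCurrencies FROM dbo.dimcurrency"),
])
print(len(fewshots))  # prints: 2
```

The resulting dictionary can be passed directly, as in `datasource.add_fewshots(fewshots)`.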

In [None]:
datasource.get_fewshots()

|  | Id | Question | Query | State | Embedding |
|---|---|---|---|---|---|
| 0 | 03f6dc18-b372-4766-ad4e-f2d8c340da58 | How many employees are there in the company? | SELECT COUNT(*) AS NumberOfEmployees FROM dbo.... | validating |  |


You can delete few-shot examples using their IDs.

In [None]:
# make sure to replace this id with the id that is assigned to your own few-shot example
datasource.remove_fewshot("03f6dc18-b372-4766-ad4e-f2d8c340da58")
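
Instead of copying the id by hand, it can be looked up from the question text. The sketch below works on a list of row dictionaries with the `Id` and `Question` fields seen in the `get_fewshots()` output; `find_fewshot_ids` is a hypothetical helper. Converting the result with `datasource.get_fewshots().to_dict("records")` assumes that `get_fewshots()` returns a pandas DataFrame, which the output suggests but this notebook does not confirm.

```python
def find_fewshot_ids(records, question):
    """Return the Id of every record whose Question matches exactly."""
    return [record["Id"] for record in records if record["Question"] == question]


# Sample row copied from the get_fewshots() output above.
records = [
    {
        "Id": "03f6dc18-b372-4766-ad4e-f2d8c340da58",
        "Question": "How many employees are there in the company?",
    }
]

print(find_fewshot_ids(records, "How many employees are there in the company?"))
# prints: ['03f6dc18-b372-4766-ad4e-f2d8c340da58']
```

Each returned id can then be passed to `datasource.remove_fewshot(...)`.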

In [None]:
datasource.get_fewshots()

|  | Id | Question | Query | State | Embedding |
|---|---|---|---|---|---|


Finally, you can delete the data agent.

In [None]:
delete_data_agent(data_agent_name)
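
For throwaway experiments, creation and deletion can be paired so that the data agent is removed even if a step in between fails. This is a minimal sketch using a context manager; `temporary_data_agent` is a hypothetical helper, and the create and delete functions are passed in as arguments so the pattern does not depend on any SDK detail beyond the two calls shown in this notebook.

```python
from contextlib import contextmanager


@contextmanager
def temporary_data_agent(name, create, delete):
    """Create an agent with create(name) and always call delete(name) on exit."""
    agent = create(name)
    try:
        yield agent
    finally:
        # Runs on normal exit and when the body raises.
        delete(name)


# Usage in the notebook (needs the SDK and a Fabric workspace):
# with temporary_data_agent(data_agent_name, create_data_agent, delete_data_agent) as data_agent:
#     data_agent.get_configuration()
```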
