Merged
16 changes: 8 additions & 8 deletions README.md
@@ -2,7 +2,7 @@

A minimal Python SDK to use Microsoft Dataverse as a database for Azure AI Foundry–style apps.

- Read (SQL) — Execute read-only T‑SQL via the McpExecuteSqlQuery Custom API. Returns `list[dict]`.
- Read (SQL) — Execute constrained read-only SQL via the Dataverse Web API `?sql=` parameter. Returns `list[dict]`.
- OData CRUD — Thin wrappers over Dataverse Web API (create/get/update/delete).
- Bulk create — Pass a list of records to `create(...)` to invoke the bound `CreateMultiple` action; returns `list[str]` of GUIDs. If `@odata.type` is absent the SDK resolves the logical name from metadata (cached).
- Bulk update — Call `update_multiple(entity_set, records)` to invoke the bound `UpdateMultiple` action; returns nothing. Each record must include the real primary key attribute (e.g. `accountid`).
@@ -14,7 +14,7 @@ A minimal Python SDK to use Microsoft Dataverse as a database for Azure AI Found
## Features

- Simple `DataverseClient` facade for CRUD, SQL (read-only), and table metadata.
- SQL-over-API: T-SQL routed through Custom API endpoint (no ODBC / TDS driver required).
- SQL-over-API: Constrained SQL (single SELECT with limited WHERE/TOP/ORDER BY) via native Web API `?sql=` parameter.
- Table metadata ops: create simple custom tables with primitive columns (string/int/decimal/float/datetime/bool) and delete them.
- Bulk create via `CreateMultiple` (collection-bound) by passing `list[dict]` to `create(entity_set, payloads)`; returns list of created IDs.
- Bulk update via `UpdateMultiple` by calling `update_multiple(entity_set, records)` with primary key attribute present in each record; returns nothing.
@@ -35,12 +35,12 @@ Create and activate a Python 3.13+ environment, then install dependencies:
python -m pip install -r requirements.txt
```

Direct TDS via ODBC is not used; SQL reads are executed via the Custom API over OData.
Direct TDS via ODBC is not used; SQL reads are executed via the Web API using the `?sql=` query parameter.
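
As a rough illustration of what "via the `?sql=` query parameter" looks like on the wire, the sketch below builds such a request URL with the standard library only. The `/api/data/v9.2` path segment, the `accounts` entity set, and the helper name `build_sql_url` are assumptions for illustration; they are not taken from the SDK.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_sql_url(base_url: str, entity_set: str, sql: str,
                  api_path: str = "/api/data/v9.2") -> str:
    """Build a GET URL of the form {base}{api_path}/{entity_set}?sql=<encoded>.

    `api_path` is an assumed default; adjust it to your environment.
    """
    # urlencode escapes spaces, commas, and quotes for use in a query string.
    query = urlencode({"sql": sql})
    return f"{base_url.rstrip('/')}{api_path}/{entity_set}?{query}"


sql = "SELECT TOP 3 accountid, name FROM account ORDER BY createdon DESC"
url = build_sql_url("https://yourorg.crm.dynamics.com", "accounts", sql)
print(url)

# Round-trip check: decoding the query string yields the original statement.
decoded = parse_qs(urlsplit(url).query)["sql"][0]
print(decoded == sql)
```

In the SDK itself the encoding is presumably left to the HTTP layer (the diff passes `params={"sql": sql}` to the request helper), so a helper like this is only useful for inspecting or logging the final URL.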

## Configuration Notes

- For Web API (OData), tokens target your Dataverse org URL scope: https://yourorg.crm.dynamics.com/.default. The SDK requests this scope from the provided TokenCredential.
- For complete functionalities, please use one of the PREPROD BAP environments, otherwise McpExecuteSqlQuery might not work.
- Preprod environments may surface the newest SQL subset capabilities sooner than production.
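
The scope convention above (org URL plus `/.default`) can be sketched as a small helper. This is an assumption-laden illustration: `default_scope` is a hypothetical name, and the SDK's own derivation is not shown in this diff.

```python
from urllib.parse import urlsplit


def default_scope(org_url: str) -> str:
    """Derive '<scheme>://<host>/.default' from a Dataverse org URL.

    Hypothetical helper; any path or trailing slash on the input is dropped.
    """
    parts = urlsplit(org_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError("org_url must be absolute, e.g. https://yourorg.crm.dynamics.com")
    return f"{parts.scheme}://{parts.netloc}/.default"


print(default_scope("https://yourorg.crm.dynamics.com"))
print(default_scope("https://yourorg.crm.dynamics.com/api/data/v9.2/"))
```

Both calls yield the same scope string, which is the value a `TokenCredential.get_token(...)` call would be given.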

### Configuration (DataverseConfig)

@@ -50,7 +50,7 @@ Pass a `DataverseConfig` or rely on sane defaults:
from dataverse_sdk import DataverseClient
from dataverse_sdk.config import DataverseConfig

cfg = DataverseConfig() # defaults: language_code=1033, sql_api_name="McpExecuteSqlQuery"
cfg = DataverseConfig() # defaults: language_code=1033
client = DataverseClient(base_url="https://yourorg.crm.dynamics.com", config=cfg)

# Optional HTTP tunables (timeouts/retries)
@@ -70,7 +70,7 @@ The quickstart demonstrates:
- Creating, reading, updating, and deleting records (OData)
- Bulk create (CreateMultiple) to insert many records in one call
- Retrieve multiple with paging (contrasting `$top` vs `page_size`)
- Executing a read-only SQL query
- Executing a read-only SQL query (Web API `?sql=`)

## Examples

@@ -112,7 +112,7 @@ print({"bulk_update": "ok"})
# Delete
client.delete("accounts", account_id)

# SQL (read-only) via Custom API
# SQL (read-only) via Web API `?sql=`
rows = client.query_sql("SELECT TOP 3 accountid, name FROM account ORDER BY createdon DESC")
for r in rows:
print(r.get("accountid"), r.get("name"))
@@ -271,7 +271,7 @@ Notes:
- Passing a list of payloads to `create` triggers bulk create and returns `list[str]` of IDs.
- Use `get_multiple` for paging through result sets; prefer `select` to limit columns.
- For CRUD methods that take a record id, pass the GUID string (36-char hyphenated). Parentheses around the GUID are accepted but not required.
- SQL is routed through the Custom API named in `DataverseConfig.sql_api_name` (default: `McpExecuteSqlQuery`).
- SQL queries are executed directly against entity set endpoints using the `?sql=` parameter. Only a subset is supported (a single SELECT with optional WHERE/TOP/ORDER BY and a table alias); the service rejects unsupported constructs.
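
Because unsupported constructs are only rejected once the request reaches the service, a caller might want a cheap local pre-check. The sketch below is a heuristic under stated assumptions: `precheck_sql` is a hypothetical helper, it only looks for the constraints named in this PR (single SELECT, a FROM clause, no joins), and it is not the service's actual grammar.

```python
import re

# Single-quoted literals, with '' as the escaped quote.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")


def precheck_sql(sql: str) -> list[str]:
    """Return a list of likely problems; an empty list means nothing was flagged."""
    text = sql.strip() if isinstance(sql, str) else ""
    if not text:
        return ["query is empty"]
    # Mask literals so keywords inside strings are not counted.
    masked = _STRING_LITERAL.sub("'x'", text)
    problems = []
    if not re.match(r"select\b", masked, re.IGNORECASE):
        problems.append("must start with SELECT")
    if len(re.findall(r"\bselect\b", masked, re.IGNORECASE)) > 1:
        problems.append("more than one SELECT")
    if not re.search(r"\bfrom\b\s+[A-Za-z0-9_]+", masked, re.IGNORECASE):
        problems.append("missing FROM <table>")
    if re.search(r"\bjoin\b", masked, re.IGNORECASE):
        problems.append("JOIN is outside the described subset")
    if ";" in masked:
        problems.append("statement separators are not expected")
    return problems


print(precheck_sql("SELECT TOP 3 accountid, name FROM account ORDER BY createdon DESC"))
print(precheck_sql("SELECT a.name FROM account a JOIN contact c ON a.accountid = c.parentcustomerid"))
```

A passing pre-check does not guarantee the service will accept the query; it only filters out the obvious cases before a round trip.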

### Pandas helpers

18 changes: 10 additions & 8 deletions examples/quickstart.py
@@ -320,15 +320,17 @@ def print_line_summaries(label: str, summaries: list[dict]) -> None:
except Exception as e:
print(f"Bulk update failed: {e}")

# 4) Query records via SQL Custom API
print("Query (SQL via Custom API):")
# 4) Query records via SQL (?sql parameter)
print("Query (SQL via ?sql query parameter):")
try:
import time
pause("Execute SQL Query")

def _run_query():
log_call(f"client.query_sql(\"SELECT TOP 2 * FROM {logical} ORDER BY {attr_prefix}_amount DESC\")")
return client.query_sql(f"SELECT TOP 2 * FROM {logical} ORDER BY {attr_prefix}_amount DESC")
cols = f"{id_key}, {code_key}, {amount_key}, {when_key}"
query = f"SELECT TOP 2 {cols} FROM {logical} ORDER BY {attr_prefix}_amount DESC"
log_call(f"client.query_sql(\"{query}\") (Web API ?sql=)")
return client.query_sql(query)

def _retry_if(ex: Exception) -> bool:
msg = str(ex) if ex else ""
@@ -338,9 +340,9 @@ def _retry_if(ex: Exception) -> bool:
id_key = f"{logical}id"
ids = [r.get(id_key) for r in rows if isinstance(r, dict) and r.get(id_key)]
print({"entity": logical, "rows": len(rows) if isinstance(rows, list) else 0, "ids": ids})
tds_summaries = []
record_summaries = []
for row in rows if isinstance(rows, list) else []:
tds_summaries.append(
record_summaries.append(
{
"id": row.get(id_key),
"code": row.get(code_key),
@@ -349,9 +351,9 @@ def _retry_if(ex: Exception) -> bool:
"when": row.get(when_key),
}
)
print_line_summaries("TDS record summaries (top 2 by amount):", tds_summaries)
print_line_summaries("SQL record summaries (top 2 by amount):", record_summaries)
except Exception as e:
print(f"SQL via Custom API failed: {e}")
print(f"SQL query failed: {e}")

# Pause between SQL query and retrieve-multiple demos
pause("Retrieve multiple (OData paging demos)")
10 changes: 6 additions & 4 deletions examples/quickstart_pandas.py
@@ -183,8 +183,8 @@ def backoff_retry(op, *, delays=(0, 2, 5, 10, 20), retry_http_statuses=(400, 403
print(f"Update/verify failed: {e}")
sys.exit(1)

# 4) Query records via SQL Custom API
print("(Pandas) Query (SQL via Custom API):")
# 4) Query records via SQL (Web API ?sql=)
print("(Pandas) Query (SQL via Web API ?sql=):")
try:
# Try singular logical name first, then plural entity set, with short backoff
import time
@@ -196,7 +196,9 @@ def backoff_retry(op, *, delays=(0, 2, 5, 10, 20), retry_http_statuses=(400, 403
df_rows = None
for name in candidates:
def _run_query():
return PANDAS.query_sql_df(f"SELECT TOP 3 * FROM {name} ORDER BY createdon DESC")
id_key = f"{logical}id"
cols = f"{id_key}, {attr_prefix}_code, {attr_prefix}_amount, {attr_prefix}_when"
return PANDAS.query_sql_df(f"SELECT TOP 3 {cols} FROM {name} ORDER BY {attr_prefix}_amount DESC")
def _retry_if(ex: Exception) -> bool:
msg = str(ex) if ex else ""
return ("Invalid table name" in msg) or ("Invalid object name" in msg)
@@ -211,7 +213,7 @@ def _retry_if(ex: Exception) -> bool:
except SystemExit:
pass
except Exception as e:
print(f"SQL via Custom API failed: {e}")
print(f"SQL query failed: {e}")

# 5) Delete record
print("(Pandas) Delete (OData via Pandas wrapper):")
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -3,9 +3,9 @@ requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "dataverse-sdk-poc"
name = "dataverse-python-client"
version = "0.1.0"
description = "POC: Dataverse Python SDK with TDS reads and OData CRUD via SQL router"
description = "Dataverse Python client"
authors = [{ name = "POC" }]
readme = "README.md"
requires-python = ">=3.10"
20 changes: 12 additions & 8 deletions src/dataverse_sdk/client.py
Comment thread
tpellissier-msft marked this conversation as resolved.
@@ -14,7 +14,7 @@ class DataverseClient:

This client exposes a simple, stable surface for:
- OData CRUD: create, get, update, delete records
- SQL (read-only): execute T-SQL via Dataverse Custom API (no ODBC/TDS driver)
- SQL (read-only): run constrained SELECT queries via the Web API `?sql=` parameter
- Table metadata: create, inspect, and delete simple custom tables

The client owns authentication (Azure Identity) and configuration, and delegates
@@ -182,21 +182,25 @@ def get_multiple(
page_size=page_size,
)

# SQL via Custom API
def query_sql(self, tsql: str):
"""Execute a read-only SQL query via the configured Custom API.
# SQL via Web API sql parameter
def query_sql(self, sql: str):
"""Execute a read-only SQL query using the Dataverse Web API `?sql=` capability.

The query must follow the currently supported subset: single SELECT with optional WHERE,
TOP (integer), ORDER BY (columns only), and simple alias after FROM. Example:
``SELECT TOP 3 accountid, name FROM account ORDER BY name DESC``

Parameters
----------
tsql : str
A SELECT-only T-SQL statement (e.g., ``"SELECT TOP 3 * FROM account"``).
sql : str
Supported single SELECT statement.

Returns
-------
list[dict]
Rows as a list of dictionaries.
Result rows (empty list if none).
"""
return self._get_odata().query_sql(tsql)
return self._get_odata().query_sql(sql)

# Table metadata helpers
def get_table_info(self, tablename: str) -> Optional[Dict[str, Any]]:
2 changes: 0 additions & 2 deletions src/dataverse_sdk/config.py
@@ -7,7 +7,6 @@
@dataclass(frozen=True)
class DataverseConfig:
language_code: int = 1033
sql_api_name: str = "McpExecuteSqlQuery"

# Optional HTTP tuning (not yet wired everywhere; reserved for future use)
http_retries: Optional[int] = None
@@ -19,7 +18,6 @@ def from_env(cls) -> "DataverseConfig":
# Environment-free defaults
return cls(
language_code=1033,
sql_api_name="McpExecuteSqlQuery",
http_retries=None,
http_backoff=None,
http_timeout=None,
126 changes: 103 additions & 23 deletions src/dataverse_sdk/odata.py
@@ -29,6 +29,8 @@ def __init__(self, auth, base_url: str, config=None) -> None:
)
# Cache: entity set name -> logical name (resolved via metadata lookup)
self._entityset_logical_cache = {}
# Cache: logical name -> entity set name (reverse lookup for SQL endpoint)
self._logical_to_entityset_cache: dict[str, str] = {}

def _headers(self) -> Dict[str, str]:
"""Build standard OData headers with bearer auth."""
@@ -374,42 +376,120 @@ def _do_request(url: str, *, params: Optional[Dict[str, Any]] = None) -> Dict[st
next_link = data.get("@odata.nextLink") or data.get("odata.nextLink") if isinstance(data, dict) else None

# --------------------------- SQL Custom API -------------------------
def query_sql(self, tsql: str) -> list[dict[str, Any]]:
"""Execute a read-only T-SQL query via the configured Custom API.
def query_sql(self, sql: str) -> list[dict[str, Any]]:
"""Execute a read-only SQL query using the Dataverse Web API `?sql=` capability.

The platform supports a constrained subset of SQL SELECT statements directly on entity set endpoints:
GET /{entity_set}?sql=<encoded select statement>

This client extracts the logical table name from the query, resolves the corresponding
entity set name (cached) and invokes the Web API using the `sql` query parameter.

Parameters
----------
tsql : str
SELECT-style Dataverse-supported T-SQL (read-only).
sql : str
Single SELECT statement within supported subset.

Returns
-------
list[dict]
Rows materialised as list of dictionaries (empty list if no rows).
Result rows (empty list if none).

Raises
------
ValueError
If the SQL is empty or malformed, or if the table logical name cannot be determined.
RuntimeError
If the Custom API response is missing the expected ``queryresult`` property or type is unexpected.
If metadata lookup for the logical name fails.
"""
payload = {"querytext": tsql}
headers = self._headers()
api_name = self.config.sql_api_name
url = f"{self.api}/{api_name}"
r = self._request("post", url, headers=headers, json=payload)
if not isinstance(sql, str) or not sql.strip():
raise ValueError("sql must be a non-empty string")
sql = sql.strip()

# Extract logical table name via helper (robust to identifiers ending with 'from')
logical = self._extract_logical_table(sql)

entity_set = self._entity_set_from_logical(logical)
# Issue GET /{entity_set}?sql=<query>
headers = self._headers().copy()
url = f"{self.api}/{entity_set}"
params = {"sql": sql}
r = self._request("get", url, headers=headers, params=params)
try:
r.raise_for_status()
except Exception as e:
# Attach response snippet to aid debugging unsupported SQL patterns
resp_text = None
try:
resp_text = r.text[:500] if getattr(r, 'text', None) else None
except Exception:
pass
detail = f" SQL query failed (status={getattr(r, 'status_code', '?')}): {resp_text}" if resp_text else ""
raise RuntimeError(str(e) + detail) from e
try:
body = r.json()
except ValueError:
return []
if isinstance(body, dict):
value = body.get("value")
if isinstance(value, list):
# Ensure dict rows only
return [row for row in value if isinstance(row, dict)]
# Fallbacks: if body itself is a list
if isinstance(body, list):
return [row for row in body if isinstance(row, dict)]
return []

@staticmethod
def _extract_logical_table(sql: str) -> str:
"""Extract the logical table name after the first standalone FROM.

Examples:
SELECT * FROM account
SELECT col1, startfrom FROM new_sampleitem WHERE col1 = 1

"""
if not isinstance(sql, str):
raise ValueError("sql must be a string")
# Mask out single-quoted string literals to avoid matching FROM inside them.
masked = re.sub(r"'([^']|'')*'", "'x'", sql)
pattern = r"\bfrom\b\s+([A-Za-z0-9_]+)" # minimal, single-line regex
m = re.search(pattern, masked, flags=re.IGNORECASE)
if not m:
raise ValueError("Unable to determine table logical name from SQL (expected 'FROM <name>').")
return m.group(1).lower()

# ---------------------- Entity set resolution -----------------------
def _entity_set_from_logical(self, logical: str) -> str:
"""Resolve entity set name (plural) from a logical (singular) name using metadata.

Caches results for subsequent SQL queries.
"""
if not logical:
raise ValueError("logical name required")
cached = self._logical_to_entityset_cache.get(logical)
if cached:
return cached
url = f"{self.api}/EntityDefinitions"
logical_escaped = self._escape_odata_quotes(logical)
params = {
"$select": "LogicalName,EntitySetName",
"$filter": f"LogicalName eq '{logical_escaped}'",
}
r = self._request("get", url, headers=self._headers(), params=params)
r.raise_for_status()
data = r.json()
if "queryresult" not in data:
raise RuntimeError(f"{api_name} response missing 'queryresult'.")
q = data["queryresult"]
if q is None:
parsed = []
elif isinstance(q, str):
s = q.strip()
parsed = [] if not s else json.loads(s)
else:
raise RuntimeError(f"Unexpected queryresult type: {type(q)}")
return parsed
try:
body = r.json()
items = body.get("value", []) if isinstance(body, dict) else []
except ValueError:
items = []
if not items:
raise RuntimeError(f"Unable to resolve entity set for logical name '{logical}'.")
es = items[0].get("EntitySetName")
if not es:
raise RuntimeError(f"Metadata response missing EntitySetName for logical '{logical}'.")
self._logical_to_entityset_cache[logical] = es
return es

# ---------------------- Table metadata helpers ----------------------
def _label(self, text: str) -> Dict[str, Any]:
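
The table-name extraction added in `_extract_logical_table` can be exercised in isolation. The sketch below copies the same two regular expressions from the diff into a free function (the name `extract_logical_table` and the shortened error message are ours) so the masking and word-boundary behaviour can be checked without a client instance. The third query exists only to exercise literal masking and may not be accepted by the service.

```python
import re


def extract_logical_table(sql: str) -> str:
    """Return the lower-cased identifier after the first standalone FROM."""
    # Replace single-quoted literals ('' is the escape) so FROM inside them is ignored.
    masked = re.sub(r"'([^']|'')*'", "'x'", sql)
    match = re.search(r"\bfrom\b\s+([A-Za-z0-9_]+)", masked, flags=re.IGNORECASE)
    if not match:
        raise ValueError("expected 'FROM <name>' in SQL")
    return match.group(1).lower()


print(extract_logical_table("SELECT * FROM account"))  # account
# 'startfrom' is not matched because \b requires a word boundary before FROM.
print(extract_logical_table("SELECT col1, startfrom FROM new_sampleitem WHERE col1 = 1"))  # new_sampleitem
# The literal is masked first, so the FROM inside it is skipped.
print(extract_logical_table("SELECT 'from x' AS label FROM Contact"))  # contact
```

One limitation worth noting: the character class stops at `.` and `[`, so a schema-qualified name such as `dbo.account` would resolve to `dbo`, and a bracketed name would not match at all.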
9 changes: 5 additions & 4 deletions src/dataverse_sdk/odata_pandas_wrappers.py
@@ -14,7 +14,7 @@
DataFrame summarizing success/failure.
* get_ids: fetches a set of ids returning a DataFrame of the merged JSON
objects (outer union of keys). Missing keys are NaN.
* query_sql_df: runs a SQL query via Custom API and returns the result rows as
* query_sql_df: runs a SQL query via the Web API `?sql=` parameter and returns the result rows as
a DataFrame (empty DataFrame if no rows).

Edge cases & behaviors:
@@ -139,12 +139,13 @@ def get_ids(self, entity_set: str, ids: Sequence[str] | pd.Series | pd.Index, se
return pd.DataFrame(rows)

# --------------------------- Query SQL -------------------------------
def query_sql_df(self, tsql: str) -> pd.DataFrame:
"""Execute a SQL query via Custom API and return a DataFrame.
def query_sql_df(self, sql: str) -> pd.DataFrame:
"""Execute a SQL query via the Dataverse Web API `?sql=` parameter and return a DataFrame.

The statement must adhere to the supported subset (single SELECT, optional WHERE/TOP/ORDER BY, no joins).
Empty result -> empty DataFrame (columns inferred only if rows present).
"""
rows: Any = self._c.query_sql(tsql)
rows: Any = self._c.query_sql(sql)

# If API returned a JSON string, parse it
if isinstance(rows, str):