feat: add Azure OpenAI Service backend by najamulsaqib · Pull Request #1107 · Graphify-Labs/graphify

najamulsaqib · 2026-06-01T22:01:38Z

Adds full support for Azure OpenAI as a new LLM backend alongside the existing OpenAI, Anthropic, Gemini, Kimi, DeepSeek, Bedrock, and Ollama backends.

_call_azure() — dedicated Azure dispatch using AzureOpenAI SDK client (requires azure_endpoint + api_version, distinct from the shared _call_openai_compat() path)
BACKENDS["azure"] — config entry with env-driven deployment name, GPT-4o pricing defaults, and api_version fallback
detect_backend() — auto-detects Azure when both AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT are set
extract_files_direct() and _call_llm() — both dispatch sites updated with Azure branches
README — documents env vars, optional extras, privacy/data-residency note, and CLI cheat-sheet example

Adds full support for Azure OpenAI as a new LLM backend alongside the existing OpenAI, Anthropic, Gemini, Kimi, DeepSeek, Bedrock, and Ollama backends. - `_call_azure()` — dedicated Azure dispatch using `AzureOpenAI` SDK client (requires `azure_endpoint` + `api_version`, distinct from the shared `_call_openai_compat()` path) - `BACKENDS["azure"]` — config entry with env-driven deployment name, GPT-4o pricing defaults, and `api_version` fallback - `detect_backend()` — auto-detects Azure when both `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT` are set - `extract_files_direct()` and `_call_llm()` — both dispatch sites updated with Azure branches - README — documents env vars, optional extras, privacy/data-residency note, and CLI cheat-sheet example Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

safishamsi

Thanks @najamulsaqib - this is a clean addition that fits the backend architecture well. Verified the design: the base_url-less azure entry is safe (Azure has its own early-return dispatch before any base_url fallback in both call paths - no KeyError in estimate_cost, the custom-provider loop, or _call_openai_compat), detect_backend correctly requires both env vars, deployment-name semantics are right, and no secrets leak. A few things before merge:

Must-fix

max_tokens → max_completion_tokens in both Azure call sites. AzureOpenAI is OpenAI-compatible, and the rest of that path uses max_completion_tokens (see _call_openai_compat and the max_completion_tokens config key). max_tokens is the deprecated param - it works for gpt-4o today, but 400s on any o-series / reasoning deployment, which is a common Azure setup. Please switch both _call_azure and the _call_llm azure branch.

Should-fix

Add a test. There's an established pattern (tests/test_llm_backends.py, test_provider_registry.py, test_ollama.py, test_claude_cli_backend.py). Minimal coverage, mocking openai.AzureOpenAI: (a) _call_azure builds the client with azure_endpoint + api_version and parses a canned response; (b) detect_backend() returns "azure" only when both AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT are set (key alone must not select it); (c) iterating BACKENDS (e.g. estimate_cost) raises no KeyError for the base_url-less azure entry.
Duplication drift: the _call_llm azure branch re-implements the client inline and drops the GRAPHIFY_API_TIMEOUT handling that _call_azure has. Please factor out a small _azure_client(api_key, endpoint) helper (api_version + timeout) so the two paths can't diverge.

Nice-to-have

A one-line comment on the config noting azure intentionally omits base_url (so a future refactor doesn't route it through _call_openai_compat), and that cost is fixed gpt-4o pricing (mis-estimates for non-gpt-4o deployments).

The max_completion_tokens change is the only blocker; the test + helper would make it solid. Thanks again!

… helper, tests Must-fix: - Replace deprecated `max_tokens` with `max_completion_tokens` in both Azure call sites (_call_azure kwargs and _call_llm azure branch). `max_tokens` works for gpt-4o today but 400s on o-series/reasoning deployments which are common in Azure setups. Should-fix: - Extract `_azure_client(api_key, endpoint)` helper as the single place that reads AZURE_OPENAI_API_VERSION and GRAPHIFY_API_TIMEOUT, so _call_azure and _call_llm can't diverge. Also fixes the missing timeout handling in the _call_llm azure branch. - Add 4 tests (mocking openai.AzureOpenAI): (a) _call_azure builds client with correct azure_endpoint + api_version and sends max_completion_tokens not max_tokens (b) detect_backend() returns "azure" only when both API key and endpoint are set; key alone must not select it (c) estimate_cost("azure", ...) raises no KeyError for the base_url-less azure entry Nice-to-have: - BACKENDS["azure"] comment notes base_url is intentionally absent (prevents accidental routing through _call_openai_compat) and that pricing is gpt-4o–fixed (may mis-estimate other deployments) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

safishamsi

Thanks @najamulsaqib — this addresses everything cleanly. Verified on current v8:

max_completion_tokens is now used in both Azure call sites (the deprecated max_tokens is gone from the actual create() call) — and there's a test asserting exactly that (max_completion_tokens present, max_tokens absent in the create kwargs).
The shared _azure_client helper removes the duplication, so the api_version/endpoint/timeout can't drift between the two paths.
The base_url-less azure entry is safe (verified earlier — azure has its own dispatch, never hits the openai-compat base_url fallback).
tests/test_llm_backends.py: 32 passed; full suite green (1521 passed).

Good to merge. Nice work.

… features) #1118 — prune stale AST nodes on full re-extraction (#1116) Stamps every AST-extracted node with _origin="ast" in extract(). On a full rebuild _rebuild_code drops any AST-marked node absent from the fresh output even when its source file survives, fixing stale symbols. Backward-compat: marker-less nodes from pre-1118 graphs survive one cycle then self-heal. #1110 — stop reading images and PDFs as garbage in headless extract Images route through per-backend vision payloads (base64/data-URI/bytes for claude/openai/bedrock); non-vision backends get _strip_pixels for graceful degradation. PDFs reuse pypdf. 5MB cap, 20-image chunk limit. #1159 — Salesforce Apex extractor (.cls, .trigger) Regex-based extractor: classes, interfaces, enums, methods, triggers, SOQL/DML edges. No new dependency. Dispatched as .cls and .trigger. #1107 — Azure OpenAI Service backend (--backend azure) Uses AzureOpenAI SDK client (from existing openai package). Auto-detects when AZURE_OPENAI_API_KEY + AZURE_OPENAI_ENDPOINT both set. Uses max_completion_tokens (not deprecated max_tokens). #1103 — live PostgreSQL introspection (--postgres DSN) graphify extract --postgres "postgresql://..." introspects tables, views, routines, and FK relations via information_schema (SERIALIZABLE READ ONLY). Credentials sanitized on error. New graphify[postgres] extra (psycopg3). Union-resolved llm.py conflict: Azure functions + bedrock images= param. Fixed test_image_vision.py mock to accept timeout= kwarg (our #1112). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

safishamsi · 2026-06-07T00:29:41Z

Landed in 7467c1b.

What it adds: Azure OpenAI Service as a new LLM backend (--backend azure). Uses AzureOpenAI from the existing openai package — no new dependency. Auto-detects when both AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT are set. Uses max_completion_tokens (not deprecated max_tokens). Priority in auto-detection: after deepseek, before bedrock.

4 tests pass. 1910 passed, 0 failures.

safishamsi requested changes Jun 1, 2026

View reviewed changes

njm-reg1 and others added 2 commits June 2, 2026 11:20

Merge branch 'safishamsi:v8' into v8

db6147b

safishamsi approved these changes Jun 2, 2026

View reviewed changes

safishamsi closed this Jun 7, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

feat: add Azure OpenAI Service backend#1107

feat: add Azure OpenAI Service backend#1107
najamulsaqib wants to merge 3 commits into
Graphify-Labs:v8from
najamulsaqib:v8

najamulsaqib commented Jun 1, 2026

Uh oh!

safishamsi left a comment

Uh oh!

safishamsi left a comment

Uh oh!

safishamsi commented Jun 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Uh oh!

Conversation

najamulsaqib commented Jun 1, 2026

Uh oh!

safishamsi left a comment

Choose a reason for hiding this comment

Must-fix

Should-fix

Nice-to-have

Uh oh!

safishamsi left a comment

Choose a reason for hiding this comment

Uh oh!

safishamsi commented Jun 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants