feat: add Azure OpenAI Service backend#1107
Conversation
Adds full support for Azure OpenAI as a new LLM backend alongside the existing OpenAI, Anthropic, Gemini, Kimi, DeepSeek, Bedrock, and Ollama backends. - `_call_azure()` — dedicated Azure dispatch using `AzureOpenAI` SDK client (requires `azure_endpoint` + `api_version`, distinct from the shared `_call_openai_compat()` path) - `BACKENDS["azure"]` — config entry with env-driven deployment name, GPT-4o pricing defaults, and `api_version` fallback - `detect_backend()` — auto-detects Azure when both `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT` are set - `extract_files_direct()` and `_call_llm()` — both dispatch sites updated with Azure branches - README — documents env vars, optional extras, privacy/data-residency note, and CLI cheat-sheet example Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
safishamsi
left a comment
There was a problem hiding this comment.
Thanks @najamulsaqib - this is a clean addition that fits the backend architecture well. Verified the design: the base_url-less azure entry is safe (Azure has its own early-return dispatch before any base_url fallback in both call paths - no KeyError in estimate_cost, the custom-provider loop, or _call_openai_compat), detect_backend correctly requires both env vars, deployment-name semantics are right, and no secrets leak. A few things before merge:
Must-fix
max_tokens → max_completion_tokens in both Azure call sites. AzureOpenAI is OpenAI-compatible, and the rest of that path uses max_completion_tokens (see _call_openai_compat and the max_completion_tokens config key). max_tokens is the deprecated param - it works for gpt-4o today, but 400s on any o-series / reasoning deployment, which is a common Azure setup. Please switch both _call_azure and the _call_llm azure branch.
Should-fix
- Add a test. There's an established pattern (
tests/test_llm_backends.py,test_provider_registry.py,test_ollama.py,test_claude_cli_backend.py). Minimal coverage, mockingopenai.AzureOpenAI: (a)_call_azurebuilds the client withazure_endpoint+api_versionand parses a canned response; (b)detect_backend()returns"azure"only when bothAZURE_OPENAI_API_KEYandAZURE_OPENAI_ENDPOINTare set (key alone must not select it); (c) iteratingBACKENDS(e.g.estimate_cost) raises noKeyErrorfor thebase_url-less azure entry. - Duplication drift: the
_call_llmazure branch re-implements the client inline and drops theGRAPHIFY_API_TIMEOUThandling that_call_azurehas. Please factor out a small_azure_client(api_key, endpoint)helper (api_version + timeout) so the two paths can't diverge.
Nice-to-have
- A one-line comment on the config noting azure intentionally omits
base_url(so a future refactor doesn't route it through_call_openai_compat), and that cost is fixed gpt-4o pricing (mis-estimates for non-gpt-4o deployments).
The max_completion_tokens change is the only blocker; the test + helper would make it solid. Thanks again!
… helper, tests
Must-fix:
- Replace deprecated `max_tokens` with `max_completion_tokens` in both
Azure call sites (_call_azure kwargs and _call_llm azure branch).
`max_tokens` works for gpt-4o today but 400s on o-series/reasoning
deployments which are common in Azure setups.
Should-fix:
- Extract `_azure_client(api_key, endpoint)` helper as the single place
that reads AZURE_OPENAI_API_VERSION and GRAPHIFY_API_TIMEOUT, so
_call_azure and _call_llm can't diverge. Also fixes the missing
timeout handling in the _call_llm azure branch.
- Add 4 tests (mocking openai.AzureOpenAI):
(a) _call_azure builds client with correct azure_endpoint + api_version
and sends max_completion_tokens not max_tokens
(b) detect_backend() returns "azure" only when both API key and endpoint
are set; key alone must not select it
(c) estimate_cost("azure", ...) raises no KeyError for the base_url-less
azure entry
Nice-to-have:
- BACKENDS["azure"] comment notes base_url is intentionally absent
(prevents accidental routing through _call_openai_compat) and that
pricing is gpt-4o–fixed (may mis-estimate other deployments)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
safishamsi
left a comment
There was a problem hiding this comment.
Thanks @najamulsaqib — this addresses everything cleanly. Verified on current v8:
max_completion_tokensis now used in both Azure call sites (the deprecatedmax_tokensis gone from the actualcreate()call) — and there's a test asserting exactly that (max_completion_tokenspresent,max_tokensabsent in the create kwargs).- The shared
_azure_clienthelper removes the duplication, so the api_version/endpoint/timeout can't drift between the two paths. - The
base_url-less azure entry is safe (verified earlier — azure has its own dispatch, never hits the openai-compatbase_urlfallback). tests/test_llm_backends.py: 32 passed; full suite green (1521 passed).
Good to merge. Nice work.
… features) #1118 — prune stale AST nodes on full re-extraction (#1116) Stamps every AST-extracted node with _origin="ast" in extract(). On a full rebuild _rebuild_code drops any AST-marked node absent from the fresh output even when its source file survives, fixing stale symbols. Backward-compat: marker-less nodes from pre-1118 graphs survive one cycle then self-heal. #1110 — stop reading images and PDFs as garbage in headless extract Images route through per-backend vision payloads (base64/data-URI/bytes for claude/openai/bedrock); non-vision backends get _strip_pixels for graceful degradation. PDFs reuse pypdf. 5MB cap, 20-image chunk limit. #1159 — Salesforce Apex extractor (.cls, .trigger) Regex-based extractor: classes, interfaces, enums, methods, triggers, SOQL/DML edges. No new dependency. Dispatched as .cls and .trigger. #1107 — Azure OpenAI Service backend (--backend azure) Uses AzureOpenAI SDK client (from existing openai package). Auto-detects when AZURE_OPENAI_API_KEY + AZURE_OPENAI_ENDPOINT both set. Uses max_completion_tokens (not deprecated max_tokens). #1103 — live PostgreSQL introspection (--postgres DSN) graphify extract --postgres "postgresql://..." introspects tables, views, routines, and FK relations via information_schema (SERIALIZABLE READ ONLY). Credentials sanitized on error. New graphify[postgres] extra (psycopg3). Union-resolved llm.py conflict: Azure functions + bedrock images= param. Fixed test_image_vision.py mock to accept timeout= kwarg (our #1112). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Landed in 7467c1b. What it adds: Azure OpenAI Service as a new LLM backend ( 4 tests pass. 1910 passed, 0 failures. |
Adds full support for Azure OpenAI as a new LLM backend alongside the existing OpenAI, Anthropic, Gemini, Kimi, DeepSeek, Bedrock, and Ollama backends.
_call_azure()— dedicated Azure dispatch usingAzureOpenAISDK client (requiresazure_endpoint+api_version, distinct from the shared_call_openai_compat()path)BACKENDS["azure"]— config entry with env-driven deployment name, GPT-4o pricing defaults, andapi_versionfallbackdetect_backend()— auto-detects Azure when bothAZURE_OPENAI_API_KEYandAZURE_OPENAI_ENDPOINTare setextract_files_direct()and_call_llm()— both dispatch sites updated with Azure branches