# LangExtract + Azure OpenAI: Quickstart

This notebook shows a minimal example of using the `langextract-azureopenai` provider.

Prerequisites:
- Install the package from the project root (an editable install is fine): `pip install -e .`
- Set credentials in your environment:
  - `export AZURE_OPENAI_API_KEY="<your-key>"`
  - `export AZURE_OPENAI_ENDPOINT="https://<your-endpoint>.openai.azure.com/"`
  - `export AZURE_OPENAI_API_VERSION="2024-02-15-preview"`
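
A leftover placeholder or a missing scheme in `AZURE_OPENAI_ENDPOINT` is an easy mistake to make. The sketch below is a rough sanity check rather than a full validator: it only confirms that the value is an `https` URL with a host and no unreplaced `<...>` placeholder. The helper name `looks_like_endpoint` is hypothetical and not part of either package.

```python
from urllib.parse import urlparse


def looks_like_endpoint(value: str) -> bool:
    """Return True if value is an https URL with a host and no '<...>' placeholder."""
    parsed = urlparse(value.strip())
    has_placeholder = '<' in value or '>' in value
    return parsed.scheme == 'https' and bool(parsed.netloc) and not has_placeholder


print(looks_like_endpoint('https://example.openai.azure.com/'))
print(looks_like_endpoint('example.openai.azure.com'))
print(looks_like_endpoint('https://<your-endpoint>.openai.azure.com/'))
```

The check deliberately does not insist on a particular domain suffix, since the exact hostname depends on how the Azure resource was created.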


In [1]:
import os

import langextract as lx

# Ensure the provider class is importable and registered
from langextract_azureopenai import AzureOpenAILanguageModel  # noqa: F401

# Read credentials from environment (recommended)
api_key = os.getenv('AZURE_OPENAI_API_KEY')
endpoint = os.getenv('AZURE_OPENAI_ENDPOINT')
api_version = os.getenv('AZURE_OPENAI_API_VERSION')

# Fail fast if any credential is missing or empty
if not api_key or not endpoint or not api_version:
    raise RuntimeError(
        'Please set AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, and '
        'AZURE_OPENAI_API_VERSION before running.'
    )

print('✅ Credentials detected')


RuntimeError: Please set AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, and AZURE_OPENAI_API_VERSION before running.
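
The error above does not say which variable is missing. A possible refinement, sketched here under the assumption that the three `provider_kwargs` keys used in this notebook (`api_key`, `azure_endpoint`, `api_version`) are all the provider needs, is to read the settings in one place and name the missing ones. `REQUIRED_VARS` and `load_provider_kwargs` are hypothetical names introduced for illustration.

```python
import os
from typing import Mapping

# Environment variable name -> provider keyword argument (as used in this notebook).
REQUIRED_VARS = {
    'AZURE_OPENAI_API_KEY': 'api_key',
    'AZURE_OPENAI_ENDPOINT': 'azure_endpoint',
    'AZURE_OPENAI_API_VERSION': 'api_version',
}


def load_provider_kwargs(env: Mapping[str, str] = os.environ) -> dict:
    """Build provider kwargs from env, naming any variable that is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError('Missing environment variables: ' + ', '.join(missing))
    return {kwarg: env[name] for name, kwarg in REQUIRED_VARS.items()}
```

The returned dictionary has the same shape as the `provider_kwargs` passed to `lx.factory.ModelConfig` in the cells that follow, so it could be built once and reused.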

## Simple Chat Completion

Create a provider via LangExtract's factory and run a basic prompt.

In [None]:
config = lx.factory.ModelConfig(
    model_id='azureopenai-gpt-4.1',
    provider='AzureOpenAILanguageModel',
    provider_kwargs={
        'api_key': api_key,
        'azure_endpoint': endpoint,
        'api_version': api_version,
    },
)
model = lx.factory.create_model(config)
print(f'✓ Created provider: {type(model).__name__}')

prompts = ['Say hello from Azure OpenAI.']
results = list(model.infer(prompts))
print(results[0][0].output)


✓ Created provider: AzureOpenAILanguageModel
Hello from Azure OpenAI! How can I assist you today?
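
The cell above reads only `results[0][0].output`, the first candidate for the first prompt. With several prompts, the same access pattern can be applied per prompt. The sketch below uses a hypothetical `EchoModel` stand-in so it runs without credentials, and it assumes that `infer` yields one sequence of candidates per prompt, in prompt order, as the indexing above suggests.

```python
from dataclasses import dataclass
from typing import Iterator, Sequence


@dataclass
class FakeOutput:
    """Hypothetical stand-in for one candidate; only `.output` is modelled."""
    output: str


class EchoModel:
    """Hypothetical stand-in for `model`; echoes prompts instead of calling Azure."""

    def infer(self, prompts: Sequence[str]) -> Iterator[list]:
        for prompt in prompts:
            # One list of candidates per prompt, mirroring results[i][j].output.
            yield [FakeOutput(output=f'echo: {prompt}')]


def first_outputs(model, prompts: Sequence[str]) -> list:
    """Return the first candidate's text for each prompt."""
    return [candidates[0].output for candidates in model.infer(prompts)]


answers = first_outputs(EchoModel(), ['Say hello.', 'Say goodbye.'])
for answer in answers:
    print(answer)
```

Passing the real `model` created above in place of `EchoModel()` should work the same way if the ordering assumption holds.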


## Structured Extraction (Optional)

A minimal extraction with `lx.extract`, guided by a prompt description and a single worked example.

In [None]:
prompt = 'Extract people, organizations, and locations from the text.'
examples = [
    lx.data.ExampleData(
        text='John Smith works at Microsoft in Seattle.',
        extractions=[
            lx.data.Extraction(extraction_class='person', extraction_text='John Smith', attributes={'role': 'employee'}),
            lx.data.Extraction(extraction_class='organization', extraction_text='Microsoft', attributes={'type': 'company'}),
            lx.data.Extraction(extraction_class='location', extraction_text='Seattle', attributes={'type': 'city'}),
        ],
    )
]

text = 'Sarah Johnson is the CEO of TechCorp in San Francisco.'
# Pass credentials through an explicit ModelConfig, as in the chat example above
config2 = lx.factory.ModelConfig(
    model_id='azureopenai-gpt-4.1',
    provider='AzureOpenAILanguageModel',
    provider_kwargs={
        'api_key': api_key,
        'azure_endpoint': endpoint,
        'api_version': api_version,
    },
)

result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    config=config2,
)
for e in result.extractions:
    print(e.extraction_class, '->', e.extraction_text, e.attributes)


2025-08-19 09:54:01,498 - langextract.debug - DEBUG - [langextract.inference] CALL: BaseLanguageModel.__init__(self=<AzureOpenAILanguageModel>, constraint=None, kwargs={})
2025-08-19 09:54:01,498 - langextract.debug - DEBUG - [langextract.inference] RETURN: BaseLanguageModel.__init__ -> None (0.0 ms)
2025-08-19 09:54:01,519 - langextract.debug - DEBUG - [langextract.inference] CALL: BaseLanguageModel.apply_schema(self=<AzureOpenAILanguageModel>, schema_instance=<langextract_...t 0x10e441940>)
2025-08-19 09:54:01,519 - langextract.debug - DEBUG - [langextract.inference] RETURN: BaseLanguageModel.apply_schema -> None (0.0 ms)
DEBUG:absl:Initialized Annotator with prompt:
Extract people, organizations, and locations from the text.

Examples
Q: John Smith works at Microsoft in Seattle.
A: {
  "extractions": [
    {
      "person": "John Smith",
      "person_attributes": {
        "role": "employee"
      }
    },
    {
      "organization": "Microsoft",
      "organ

✓ Extraction processing complete

INFO:absl:Finalizing annotation for document ID doc_fdd4f06a.
INFO:absl:Document annotation completed.

✓ Extracted 3 entities (3 unique types)
  • Time: 1.01s
  • Speed: 54 chars/sec
  • Chunks: 1
person -> Sarah Johnson {'role': 'CEO'}
organization -> TechCorp {'type': 'company'}
location -> San Francisco {'type': 'city'}
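
For longer texts it can be handy to group the extracted spans by class instead of printing them one by one. The sketch below relies only on the `extraction_class`, `extraction_text`, and `attributes` fields printed above. `FakeExtraction` is a hypothetical stand-in so the example runs without calling the model, and `group_by_class` is an illustrative helper, not part of LangExtract.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FakeExtraction:
    """Hypothetical stand-in exposing the three fields printed above."""
    extraction_class: str
    extraction_text: str
    attributes: dict = field(default_factory=dict)


def group_by_class(extractions) -> dict:
    """Map each extraction class to the list of texts extracted for it."""
    grouped = defaultdict(list)
    for e in extractions:
        grouped[e.extraction_class].append(e.extraction_text)
    return dict(grouped)


sample = [
    FakeExtraction('person', 'Sarah Johnson', {'role': 'CEO'}),
    FakeExtraction('organization', 'TechCorp', {'type': 'company'}),
    FakeExtraction('location', 'San Francisco', {'type': 'city'}),
]
grouped = group_by_class(sample)
print(grouped)
```

Calling `group_by_class(result.extractions)` on the real result would presumably work unchanged, since the helper only reads attributes by name.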
