APIM AI Gateway

Azure API Management placed in front of Azure OpenAI, so every call to the AI model goes through one controlled front door.

Project Overview

When you give an app direct access to an AI model, you hand it a key and hope for the best. There is no easy way to see who is calling, to stop one user from running up the bill, or to avoid paying for the same answer twice.

This project puts a gateway in the middle. Clients talk to the gateway, and the gateway talks to the model. Because everything passes through one place, three useful things become possible:

Keys you can revoke - Each client gets its own subscription key, and the real connection to the model stays with the gateway
Per-client usage limits - A free tier capped at 500 tokens per minute, and a premium tier capped at 100,000 tokens per minute
Caching - Repeated questions are answered from the gateway's memory, with no second call to the model and no second charge

A single request arrives with a subscription key. The gateway checks the tier's token budget, looks for a cached answer, and calls the model only if it has to. On the way back out, the answer is saved to the cache for next time.
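
The order of checks described above can be modeled in a few lines of Bash. This is a toy, in-memory sketch, not the gateway's implementation: the function names, the fixed per-call token cost, the rule that a key is rejected once its running total reaches the limit, and a cache shared by all clients are all assumptions made for illustration. It needs Bash 4 or later for associative arrays.

```shell
#!/usr/bin/env bash
# Toy model of the gateway's decision order. Illustration only: the real
# checks run inside APIM policies, not in a shell script.

declare -A TOKENS_USED=()   # tokens consumed per subscription key this minute
declare -A CACHE=()         # cached answers, keyed by the exact request text
declare -A TIER_LIMIT=([free]=500 [premium]=100000)  # tokens per minute

MODEL_CALLS=0               # how many times the stand-in model was called

# call_model PROMPT: hypothetical stand-in for the Azure OpenAI call.
call_model() {
  MODEL_CALLS=$((MODEL_CALLS + 1))
  ANSWER="answer to: $1"
}

# handle_request KEY TIER PROMPT COST
# Sets STATUS (200 or 429), SOURCE (model, cache, or none) and ANSWER.
handle_request() {
  local key=$1 tier=$2 prompt=$3 cost=$4
  local used=${TOKENS_USED[$key]:-0}
  local limit=${TIER_LIMIT[$tier]}

  # 1. Token budget: reject once this key has used up its tier's limit.
  if (( used >= limit )); then
    STATUS=429; SOURCE=none; ANSWER=""
    return 0
  fi

  # 2. Cache lookup: an exact-text hit costs no tokens and no model call.
  if [[ -n ${CACHE[$prompt]+hit} ]]; then
    STATUS=200; SOURCE=cache; ANSWER=${CACHE[$prompt]}
    return 0
  fi

  # 3. Cache miss: call the model, charge the tokens, store the answer.
  call_model "$prompt"
  TOKENS_USED[$key]=$(( used + cost ))
  CACHE[$prompt]=$ANSWER
  STATUS=200; SOURCE=model
}
```

Calling handle_request free-key free "what is APIM" 300 twice sets SOURCE to model the first time and to cache the second, and MODEL_CALLS stays at 1.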

How It Works

One key in, no keys out - The client sends only its APIM subscription key. The gateway authenticates to Azure OpenAI with a managed identity, so no model key is ever written into a script or shared with a client
Two tiers, two budgets - Token limits live in APIM policies attached to each product. The policy counts the tokens each call uses and adds them up per subscription, so one tier never eats into the other's budget
Exact-match caching - The cache keys on the exact text of the request. Asking "what is APIM" twice returns a cached answer the second time, but changing only the capitalization, to "what is apim", counts as a new question
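
To see why a change in capitalization misses the cache, it can help to derive a key from the exact bytes of the question. The sketch below is an illustration only: cache_key is a hypothetical helper, and hashing with sha256sum is an assumption, since this README does not show how the gateway actually builds its cache key.

```shell
#!/usr/bin/env bash
# cache_key TEXT: derive a key from the exact bytes of the request text.
# Hypothetical helper; the gateway's real key derivation may differ.
cache_key() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

first=$(cache_key "what is APIM")
repeat=$(cache_key "what is APIM")
lowercase=$(cache_key "what is apim")

# Identical text always produces the same key, so the second call can hit.
[ "$first" = "$repeat" ] && echo "identical text: same key, cache hit"

# Any byte-level difference, including case, produces a different key.
[ "$first" != "$lowercase" ] && echo "different case: new key, cache miss"
```

Any exact-match scheme behaves this way, whatever the key function: two requests share a cached answer only if they are byte-for-byte identical.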

The test script exercises all three behaviours in one run.

Technology Stack

Gateway: Azure API Management
Model: Azure OpenAI running gpt-4.1-mini
Infrastructure: Bicep deployed via the Azure CLI
Authentication: Managed identity, so the gateway proves who it is without storing a key
Testing: Bash and curl

Project Structure

APIM + GenAI/
├── docs/              # Architecture diagram and screenshots
├── infra/
│   └── main.bicep     # Creates the APIM gateway
├── test-gateway.sh    # Drives the gateway to show each feature
└── README.md

Environment Setup

The test script reads the gateway URL and subscription keys from the environment, so no secrets are saved in the file. Set these before running:

export APIM_GATEWAY="https://<your-apim>.azure-api.net"
export APIM_FREE_KEY="<free-test subscription key>"
export APIM_PREMIUM_KEY="<premium-test subscription key>"

The deployment name defaults to gpt-4.1-mini and the api-version to 2025-03-01-preview. Override them with APIM_DEPLOYMENT and APIM_API_VERSION if yours differ.
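
A script can apply these defaults and check for the required variables before making any calls. The following is a minimal sketch of that pattern, not an excerpt from test-gateway.sh; missing_vars is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Optional settings: fall back to the documented defaults when unset or empty.
APIM_DEPLOYMENT="${APIM_DEPLOYMENT:-gpt-4.1-mini}"
APIM_API_VERSION="${APIM_API_VERSION:-2025-03-01-preview}"

# missing_vars: print the name of each required variable that is unset or empty.
# (Hypothetical helper; not taken from test-gateway.sh.)
missing_vars() {
  local name
  for name in APIM_GATEWAY APIM_FREE_KEY APIM_PREMIUM_KEY; do
    # ${!name} is Bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ]; then
      echo "$name"
    fi
  done
}

echo "deployment=${APIM_DEPLOYMENT} api-version=${APIM_API_VERSION}"

# Typical use near the top of a script:
#   missing=$(missing_vars)
#   [ -z "$missing" ] || { echo "Set these first: $missing" >&2; exit 1; }
```

Reporting every missing name at once saves a round of fix-and-rerun compared with failing on the first unset variable.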

Running It

You need an Azure subscription and the Azure CLI. The test script also needs the gateway to be deployed, with a free product and a premium product set up.

# Deploy the gateway
az deployment group create \
  --resource-group <your-resource-group> \
  --template-file infra/main.bicep

# Run the test (after setting the environment variables above)
./test-gateway.sh

Expected result: the free tier clears a couple of calls and then returns HTTP 429 (Too Many Requests), the premium tier clears every call, and an identical question is answered from the cache the second time it is asked.
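
To probe a single tier by hand, one request at a time, a small curl wrapper that prints only the HTTP status code is enough. This is a sketch under stated assumptions, not the contents of test-gateway.sh: the /openai/deployments/... path, the Ocp-Apim-Subscription-Key header (APIM's default name, which a gateway can rename), and the APIM_KEY_HEADER override variable are all assumptions about how the API is exposed.

```shell
#!/usr/bin/env bash
# request_url: build the chat-completions URL from the environment variables.
# The /openai/deployments/... path is an assumption about the gateway's API.
request_url() {
  local deployment="${APIM_DEPLOYMENT:-gpt-4.1-mini}"
  local version="${APIM_API_VERSION:-2025-03-01-preview}"
  # ${APIM_GATEWAY%/} drops one trailing slash, if present.
  echo "${APIM_GATEWAY%/}/openai/deployments/${deployment}/chat/completions?api-version=${version}"
}

# ask KEY QUESTION: send one question and print only the HTTP status code.
# QUESTION must not contain double quotes or backslashes; this sketch does
# no JSON escaping. APIM_KEY_HEADER is a hypothetical override variable.
ask() {
  local key=$1 question=$2
  local header="${APIM_KEY_HEADER:-Ocp-Apim-Subscription-Key}"
  curl -s -o /dev/null -w '%{http_code}\n' \
    -H "${header}: ${key}" \
    -H "Content-Type: application/json" \
    -d "{\"messages\":[{\"role\":\"user\",\"content\":\"${question}\"}]}" \
    "$(request_url)"
}

# Example (needs a deployed gateway, so it is left commented out):
#   for i in 1 2 3 4 5; do ask "$APIM_FREE_KEY" "question number $i"; done
```

Varying the question on each call keeps the cache out of the picture, so the status codes reflect only the token budget.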
