Merged
31 changes: 30 additions & 1 deletion .github/workflows/e2e_tests.yaml
@@ -8,9 +8,12 @@ jobs:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        environment: [ "ci"]
        environment: [ "ci", "azure"]
    env:
      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      CLIENT_SECRET: ${{ secrets.CLIENT_SECRET }}
      CLIENT_ID: ${{ secrets.CLIENT_ID }}
      TENANT_ID: ${{ secrets.TENANT_ID }}

    steps:
      - uses: actions/checkout@v4
@@ -72,6 +75,32 @@ jobs:

          authentication:
            module: "noop"

      - name: Get Azure API key (access token)
        if: matrix.environment == 'azure'
        id: azure_token
        env:
          CLIENT_ID: ${{ secrets.CLIENT_ID }}
          CLIENT_SECRET: ${{ secrets.CLIENT_SECRET }}
          TENANT_ID: ${{ secrets.TENANT_ID }}
        run: |
          echo "Requesting Azure API token..."
          RESPONSE=$(curl -s -X POST \
            -H "Content-Type: application/x-www-form-urlencoded" \
            -d "client_id=$CLIENT_ID&scope=https://cognitiveservices.azure.com/.default&client_secret=$CLIENT_SECRET&grant_type=client_credentials" \
            "https://login.microsoftonline.com/$TENANT_ID/oauth2/v2.0/token")

          echo "Response received. Extracting access_token..."
          ACCESS_TOKEN=$(echo "$RESPONSE" | jq -r '.access_token')

          if [ -z "$ACCESS_TOKEN" ] || [ "$ACCESS_TOKEN" == "null" ]; then
            echo "❌ Failed to obtain Azure access token. Response:"
            echo "$RESPONSE"
            exit 1
          fi

          echo "✅ Successfully obtained Azure access token."
          echo "AZURE_API_KEY=$ACCESS_TOKEN" >> $GITHUB_ENV

      - name: Select and configure run.yaml
        env:
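One practical note on the token step above: the client-credentials grant returns a short-lived bearer token, and the standard OAuth2 response carries its lifetime in the expires_in field (seconds). A minimal, hedged extension for the tail of that run block — RESPONSE is the step's own variable, and AZURE_TOKEN_EXPIRES_IN is a hypothetical name:

    # Optional: record the token lifetime so a long e2e run can spot expiry.
    # expires_in is the standard OAuth2 lifetime field, in seconds.
    EXPIRES_IN=$(echo "$RESPONSE" | jq -r '.expires_in')
    echo "Azure token valid for ${EXPIRES_IN}s"
    echo "AZURE_TOKEN_EXPIRES_IN=$EXPIRES_IN" >> "$GITHUB_ENV"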
2 changes: 2 additions & 0 deletions README.md
@@ -125,6 +125,8 @@ Lightspeed Core Stack (LCS) supports the large language models from the provider
| OpenAI | gpt-5, gpt-4o, gpt4-turbo, gpt-4.1, o1, o3, o4 | Yes | remote::openai | [1](examples/openai-faiss-run.yaml) [2](examples/openai-pgvector-run.yaml) |
| OpenAI | gpt-3.5-turbo, gpt-4 | No | remote::openai | |
| RHAIIS (vLLM)| meta-llama/Llama-3.1-8B-Instruct | Yes | remote::vllm | [1](tests/e2e/configs/run-rhaiis.yaml) |
| Azure | gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-chat, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3-mini, o4-mini | Yes | remote::azure | [1](examples/azure-run.yaml) |
| Azure | o1, o1-mini | No | remote::azure | |

The "provider_type" is used in the llama stack configuration file when referring to the provider.

2 changes: 2 additions & 0 deletions docker-compose.yaml
@@ -12,6 +12,7 @@ services:
      - ./run.yaml:/opt/app-root/run.yaml:Z
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - AZURE_API_KEY=${AZURE_API_KEY}
      - BRAVE_SEARCH_API_KEY=${BRAVE_SEARCH_API_KEY:-}
      - TAVILY_SEARCH_API_KEY=${TAVILY_SEARCH_API_KEY:-}
      - RHAIIS_URL=${RHAIIS_URL}
@@ -36,6 +37,7 @@ services:
      - ./lightspeed-stack.yaml:/app-root/lightspeed-stack.yaml:Z
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - AZURE_API_KEY=${AZURE_API_KEY}
    depends_on:
      llama-stack:
        condition: service_healthy
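For local runs of this compose file, the new variable has to exist in the host shell first. A minimal sketch, assuming an Azure CLI session with access to the Cognitive Services resource (the az call is standard; the pipeline around it is illustrative):

    # Mint a bearer token for the same scope the CI workflow requests,
    # then start the stack with it exported.
    export AZURE_API_KEY="$(az account get-access-token \
      --resource https://cognitiveservices.azure.com \
      --query accessToken -o tsv)"
    docker compose up -d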
4 changes: 2 additions & 2 deletions docs/providers.md
@@ -36,7 +36,7 @@ The tables below summarize each provider category, containing the following atri
| meta-reference | inline | `accelerate`, `fairscale`, `torch`, `torchvision`, `transformers`, `zmq`, `lm-format-enforcer`, `sentence-transformers`, `torchao==0.8.0`, `fbgemm-gpu-genai==1.1.2` | ❌ |
| sentence-transformers | inline | `torch torchvision torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu`, `sentence-transformers --no-deps` | ❌ |
| anthropic | remote | `litellm` | ❌ |
| azure | remote | `itellm` | |
| azure | remote | | |
| bedrock | remote | `boto3` | ❌ |
| cerebras | remote | `cerebras_cloud_sdk` | ❌ |
| databricks | remote | — | ❌ |
@@ -287,4 +287,4 @@ Red Hat providers:

---

For a deeper understanding, see the [official llama-stack configuration documentation](https://llama-stack.readthedocs.io/en/latest/distributions/configuration.html).
For a deeper understanding, see the [official llama-stack providers documentation](https://llamastack.github.io/docs/providers).
128 changes: 128 additions & 0 deletions examples/azure-run.yaml
@@ -0,0 +1,128 @@
version: '2'
image_name: minimal-viable-llama-stack-configuration

apis:
- agents
- datasetio
- eval
- files
- inference
- post_training
- safety
- scoring
- telemetry
- tool_runtime
- vector_io
benchmarks: []
container_image: null
datasets: []
external_providers_dir: null
inference_store:
  db_path: .llama/distributions/ollama/inference_store.db
  type: sqlite
logging: null
metadata_store:
  db_path: .llama/distributions/ollama/registry.db
  namespace: null
  type: sqlite
providers:
  files:
  - provider_id: localfs
    provider_type: inline::localfs
    config:
      storage_dir: /tmp/llama-stack-files
      metadata_store:
        type: sqlite
        db_path: .llama/distributions/ollama/files_metadata.db
  agents:
  - provider_id: meta-reference
    provider_type: inline::meta-reference
    config:
      persistence_store:
        db_path: .llama/distributions/ollama/agents_store.db
        namespace: null
        type: sqlite
      responses_store:
        db_path: .llama/distributions/ollama/responses_store.db
        type: sqlite
  datasetio:
  - provider_id: huggingface
    provider_type: remote::huggingface
    config:
      kvstore:
        db_path: .llama/distributions/ollama/huggingface_datasetio.db
        namespace: null
        type: sqlite
  - provider_id: localfs
    provider_type: inline::localfs
    config:
      kvstore:
        db_path: .llama/distributions/ollama/localfs_datasetio.db
        namespace: null
        type: sqlite
  eval:
  - provider_id: meta-reference
    provider_type: inline::meta-reference
    config:
      kvstore:
        db_path: .llama/distributions/ollama/meta_reference_eval.db
        namespace: null
        type: sqlite
  inference:
  - provider_id: azure
    provider_type: remote::azure
    config:
      api_key: ${env.AZURE_API_KEY}
      api_base: https://ols-test.openai.azure.com/
      api_version: 2024-02-15-preview
      api_type: ${env.AZURE_API_TYPE:=}
  post_training:
  - provider_id: huggingface
    provider_type: inline::huggingface-gpu
    config:
      checkpoint_format: huggingface
      device: cpu
      distributed_backend: null
      dpo_output_dir: "."
  safety:
  - provider_id: llama-guard
    provider_type: inline::llama-guard
    config:
      excluded_categories: []
  scoring:
  - provider_id: basic
    provider_type: inline::basic
    config: {}
  - provider_id: llm-as-judge
    provider_type: inline::llm-as-judge
    config: {}
  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
      openai_api_key: '********'
  telemetry:
  - provider_id: meta-reference
    provider_type: inline::meta-reference
    config:
      service_name: 'lightspeed-stack-telemetry'
      sinks: sqlite
      sqlite_db_path: .llama/distributions/ollama/trace_store.db
  tool_runtime:
  - provider_id: model-context-protocol
    provider_type: remote::model-context-protocol
    config: {}
scoring_fns: []
server:
  auth: null
  host: null
  port: 8321
  quota: null
  tls_cafile: null
  tls_certfile: null
  tls_keyfile: null
shields: []
models:
- model_id: gpt-4o-mini
  model_type: llm
  provider_id: azure
  provider_model_id: gpt-4o-mini
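A quick way to exercise this example, sketched under the assumption that the llama-stack CLI is installed and the endpoint in api_base is reachable from your machine:

    # Export the key the config reads via ${env.AZURE_API_KEY}, then serve.
    export AZURE_API_KEY="<token or key for the Azure OpenAI resource>"
    llama stack run examples/azure-run.yaml  # listens on port 8321 per the server block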
128 changes: 128 additions & 0 deletions tests/e2e/configs/run-azure.yaml
@@ -0,0 +1,128 @@
version: '2'
image_name: minimal-viable-llama-stack-configuration

apis:
- agents
- datasetio
- eval
- files
- inference
- post_training
- safety
- scoring
- telemetry
- tool_runtime
- vector_io
benchmarks: []
container_image: null
datasets: []
external_providers_dir: null
inference_store:
  db_path: .llama/distributions/ollama/inference_store.db
  type: sqlite
logging: null
metadata_store:
  db_path: .llama/distributions/ollama/registry.db
  namespace: null
  type: sqlite
providers:
  files:
  - provider_id: localfs
    provider_type: inline::localfs
    config:
      storage_dir: /tmp/llama-stack-files
      metadata_store:
        type: sqlite
        db_path: .llama/distributions/ollama/files_metadata.db
  agents:
  - provider_id: meta-reference
    provider_type: inline::meta-reference
    config:
      persistence_store:
        db_path: .llama/distributions/ollama/agents_store.db
        namespace: null
        type: sqlite
      responses_store:
        db_path: .llama/distributions/ollama/responses_store.db
        type: sqlite
  datasetio:
  - provider_id: huggingface
    provider_type: remote::huggingface
    config:
      kvstore:
        db_path: .llama/distributions/ollama/huggingface_datasetio.db
        namespace: null
        type: sqlite
  - provider_id: localfs
    provider_type: inline::localfs
    config:
      kvstore:
        db_path: .llama/distributions/ollama/localfs_datasetio.db
        namespace: null
        type: sqlite
  eval:
  - provider_id: meta-reference
    provider_type: inline::meta-reference
    config:
      kvstore:
        db_path: .llama/distributions/ollama/meta_reference_eval.db
        namespace: null
        type: sqlite
  inference:
  - provider_id: azure
    provider_type: remote::azure
    config:
      api_key: ${env.AZURE_API_KEY}
      api_base: https://ols-test.openai.azure.com/
      api_version: 2024-02-15-preview
      api_type: ${env.AZURE_API_TYPE:=}
  post_training:
Comment on lines +72 to +79
⚠️ Potential issue | 🔴 Critical

Invalid env default syntax; parameterize api_base

  • llama-stack env substitution supports ${env.VAR}; the default form ${env.VAR:=} is likely unsupported and may be passed as a literal.
       config: 
         api_key: ${env.AZURE_API_KEY}
-        api_base: https://ols-test.openai.azure.com/
+        api_base: ${env.AZURE_API_BASE}
         api_version: 2024-02-15-preview
-        api_type: ${env.AZURE_API_TYPE:=}
+        api_type: ${env.AZURE_API_TYPE}

Set AZURE_API_BASE and (optionally) AZURE_API_TYPE in the environment (compose/workflow).

🤖 Prompt for AI Agents
In tests/e2e/configs/run-azure.yaml around lines 72 to 79, the file uses
unsupported env default syntax (${env.AZURE_API_TYPE:=}) and hardcodes api_base;
change api_base to use an environment variable (e.g. api_base:
${env.AZURE_API_BASE}) and replace the unsupported default form with a plain env
substitution (api_type: ${env.AZURE_API_TYPE}) or remove the api_type line if
optional, then ensure AZURE_API_BASE (and AZURE_API_TYPE if used) are set in the
environment/compose or workflow.
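If the suggestion above is adopted, the workflow step that mints the token is a natural place to export the endpoint as well; a hedged sketch (the resource name is a placeholder, and AZURE_API_TYPE is only needed if the config keeps that key):

    # Append alongside the existing AZURE_API_KEY export in the token step.
    echo "AZURE_API_BASE=https://<your-resource>.openai.azure.com/" >> "$GITHUB_ENV"
    echo "AZURE_API_TYPE=" >> "$GITHUB_ENV"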

  - provider_id: huggingface
    provider_type: inline::huggingface-gpu
    config:
      checkpoint_format: huggingface
      device: cpu
      distributed_backend: null
      dpo_output_dir: "."
  safety:
  - provider_id: llama-guard
    provider_type: inline::llama-guard
    config:
      excluded_categories: []
  scoring:
  - provider_id: basic
    provider_type: inline::basic
    config: {}
  - provider_id: llm-as-judge
    provider_type: inline::llm-as-judge
    config: {}
  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
      openai_api_key: '********'
  telemetry:
  - provider_id: meta-reference
    provider_type: inline::meta-reference
    config:
      service_name: 'lightspeed-stack-telemetry'
      sinks: sqlite
      sqlite_db_path: .llama/distributions/ollama/trace_store.db
  tool_runtime:
  - provider_id: model-context-protocol
    provider_type: remote::model-context-protocol
    config: {}
scoring_fns: []
server:
  auth: null
  host: null
  port: 8321
  quota: null
  tls_cafile: null
  tls_certfile: null
  tls_keyfile: null
shields: []
models:
- model_id: gpt-4o-mini
  model_type: llm
  provider_id: azure
  provider_model_id: gpt-4o-mini