In [1]:
import jupyter_black # type: ignore

from baml_client.sync_client import b
from baml_py import Collector # type: ignore
from dotenv import load_dotenv # type: ignore
from notebooks._utils import print_used_model

jupyter_black.load()
load_dotenv()

c = Collector()
b = b.with_options(collector=c)

# How to use LLM Clients effectively

### 1. How to conveniently route any of your LLM calls to any LLM provider at runtime

Suppose a function explicitly declares the provider and model it uses:

In [2]:
# This command shows file contents
!awk '/^class LLMClientConfig_ExtractedCity/,/^}/' ../baml_src/notebooks/01_baml_llm_client_config.baml
!awk '/^function LLMClientConfig_ExtractCity/,/^}/' ../baml_src/notebooks/01_baml_llm_client_config.baml

class LLMClientConfig_ExtractedCity {
    city string @description(#"Unabbreviated city name without the state"#)
}
function LLMClientConfig_ExtractCity(text: string) -> LLMClientConfig_ExtractedCity {
  client "openai/gpt-4.1-nano"
  prompt #"
    Extract: 
    "{{ text }}". 
    {{ ctx.output_format }}
  "#
}


This is what the prompt will look like:

In [3]:
from baml_agents import display_prompt

request = b.request.LLMClientConfig_ExtractCity("I'm John from Philly")
display_prompt(request)

[system]
Extract: 
"I'm John from Philly". 
Answer in JSON using this schema:
{
  // Unabbreviated city name without the state
  city: string,
}


By default, the call uses the client declared in the function:

In [4]:
answer = b.LLMClientConfig_ExtractCity("I'm John from Philly")
print_used_model(c.last)
print(answer.city)

Used model: gpt-4.1-nano-2025-04-14
Philadelphia


You can replace the model at runtime:

In [5]:
from baml_agents import with_model

b = with_model(b, "gpt-4.1-mini")

answer = b.LLMClientConfig_ExtractCity("I'm John from Philly")
print_used_model(c.last)
print(answer.city)

Used model: gpt-4.1-mini-2025-04-14
Philadelphia


You can also replace it with a predefined client. Let's say you have:

In [6]:
# This command shows file contents
!awk '/^client<llm> GPT41/,/^}/' ../baml_src/notebooks/01_baml_llm_client_config.baml

In [7]:
from baml_agents import with_client

b = with_client(
    b,
    provider="openai",
    options={"model": "gpt-4.1"},
)

answer = b.LLMClientConfig_ExtractCity("I'm John from Philly")
print_used_model(c.last)
print(answer.city)

Used model: gpt-4.1-2025-04-14
Philadelphia


### 2. What to do if `baml` doesn't support an LLM tracing integration you need (e.g. Langfuse, LangSmith, etc.)

#### Option 1:
- Set up an LLM proxy such as [OpenRouter](https://openrouter.ai/) or [LiteLLM](https://www.litellm.ai/)
- Configure it to use the tracing integration you need
- Start it; it will be reachable at a URL such as `http://localhost:8080`
- Set the environment variable `OPENAI_API_BASE` to that URL
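
The last step can also be done from Python rather than the shell. This is a minimal sketch; the URL is a placeholder for whatever address your proxy reports, and it assumes the variable is set before any client is created.

```python
import os

# Placeholder address: replace with the URL your proxy prints on startup.
PROXY_URL = "http://localhost:8080"

# Set the variable early, before any LLM client is constructed.
os.environ["OPENAI_API_BASE"] = PROXY_URL

print(os.environ["OPENAI_API_BASE"])
```

Because `load_dotenv()` does not override variables that are already set by default, a value assigned this way before calling it takes precedence over the one in `.env`.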

You'll then be able to switch models with a single-line change, for example `b = with_model(b, "gpt-4")`.

#### Option 2 (Advanced):
- Use the [BAML Modular API](https://docs.boundaryml.com/guide/baml-advanced/modular-api) to get the request and response objects directly, then implement the integration yourself.
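
A rough sketch of what such a do-it-yourself integration might look like. The helper names `send_with_urllib` and `traced_call` are hypothetical, and the code assumes the request object exposes `url`, `headers`, and `body.json()` and that the provider returns an OpenAI-style chat completion. The `trace` callback is where a Langfuse or LangSmith call would go.

```python
import json
import time
import urllib.request
from typing import Any, Callable


def send_with_urllib(url: str, headers: dict, payload: dict) -> dict:
    """POST the JSON payload and return the decoded JSON response."""
    merged = {"Content-Type": "application/json", **dict(headers)}
    data = json.dumps(payload).encode("utf-8")
    http_request = urllib.request.Request(url, data=data, headers=merged, method="POST")
    with urllib.request.urlopen(http_request) as response:
        return json.loads(response.read().decode("utf-8"))


def traced_call(
    request: Any,
    trace: Callable[[dict], None],
    send: Callable[[str, dict, dict], dict] = send_with_urllib,
) -> str:
    """Send a prepared request, report one trace record, return the raw completion text."""
    payload = request.body.json()
    started = time.perf_counter()
    response = send(request.url, request.headers, payload)
    trace(
        {
            "url": request.url,
            "request": payload,
            "response": response,
            "seconds": time.perf_counter() - started,
        }
    )
    # Assumes an OpenAI-style response body.
    return response["choices"][0]["message"]["content"]


# Intended usage with the generated client (not executed here):
# request = b.request.LLMClientConfig_ExtractCity("I'm John from Philly")
# text = traced_call(request, trace=print)
# answer = b.parse.LLMClientConfig_ExtractCity(text)
```

Passing `send` as a parameter keeps the HTTP layer swappable, so the same wrapper works with `requests`, `httpx`, or a stub in tests.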

### 3. What to do if `baml` doesn't support the LLM provider you need (e.g. IBM WatsonX AI, etc.)

#### Option 1:
- Set up an LLM proxy such as [OpenRouter](https://openrouter.ai/) or [LiteLLM](https://www.litellm.ai/)
- Configure it to use the models you need
- Start it; it will be reachable at a URL such as `http://localhost:8080`
- Set the environment variable `OPENAI_API_BASE` to that URL

You'll then be able to switch between any of the models the proxy exposes with a single line of code:

In [8]:
b = with_model(b, "gpt-4")
b = with_model(b, "claude-3-opus-20240229")
b = with_model(b, "claude-3-opus-20240229-with-thinking-enabled")
b = with_model(b, "claude-3-opus-20240229-another-api-key")
b = with_model(b, "TinyLlama/TinyLlama-1.1B-Chat-v1.0")
b = with_model(b, "NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO")
b = with_model(b, "WizardLM/WizardCoder-Python-13B-V1.0")
b = with_model(b, "my-localhost-model")
# etc.


#### Option 2 (Advanced):
- Use the [BAML Modular API](https://docs.boundaryml.com/guide/baml-advanced/modular-api) to get the request and response objects directly, then implement the integration yourself.

### 4. What to do if `baml` can't find the environment variables (e.g. `OPENAI_API_KEY`) or you don't want to use them


Use the [Client Registry](https://docs.boundaryml.com/ref/baml_client/client-registry) to register your client with the API key and any other parameters you need. The implementation of `with_model()` is also a useful example.
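
As a minimal sketch of that approach, the helpers below pass the key explicitly instead of relying on environment lookup. The function names and the client name `RuntimeClient` are made up for illustration, and this is not necessarily how `with_model()` is implemented.

```python
from typing import Any, Optional


def openai_client_options(
    model: str, api_key: str, base_url: Optional[str] = None
) -> dict[str, Any]:
    """Build options for an OpenAI-style client with an explicit API key."""
    options: dict[str, Any] = {"model": model, "api_key": api_key}
    if base_url is not None:
        options["base_url"] = base_url
    return options


def registry_for(model: str, api_key: str, base_url: Optional[str] = None):
    """Create a ClientRegistry whose primary client uses the explicit key."""
    # Imported lazily so the options helper above works without baml_py installed.
    from baml_py import ClientRegistry

    registry = ClientRegistry()
    registry.add_llm_client(
        name="RuntimeClient",  # hypothetical client name
        provider="openai",
        options=openai_client_options(model, api_key, base_url),
    )
    registry.set_primary("RuntimeClient")
    return registry


# Intended usage with the generated client (not executed here):
# b = b.with_options(client_registry=registry_for("gpt-4.1-nano", api_key=my_key))
```

Here `my_key` stands for a key obtained however suits you, such as a secrets manager or an interactive prompt.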