# Distributions

If you don’t need a Llama Stack server, or one isn’t available in your environment, you can still use Llama Stack capabilities by running Llama Stack as a library. This gives you direct access to Llama Stack features without the overhead of managing a separate server process.

A distribution is a pre-packaged set of Llama Stack components configured to work out of the box. You can use one of the default distributions, such as ollama, tgi, or remote-vllm, or create your own for your specific needs.

For instance, if an inference service served by vLLM is available, you may prefer it over a local Ollama instance. In that case, configure Llama Stack to use that service by selecting the "remote-vllm" distribution.
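
The choice between the two distributions can also be made at run time. The following is a minimal sketch, not part of Llama Stack: `pick_distribution` is a hypothetical helper, and the rule it applies (prefer "remote-vllm" whenever `VLLM_URL` is set) is an assumption made for illustration.

```python
import os
from typing import Mapping


def pick_distribution(env: Mapping[str, str] = os.environ) -> str:
    """Return the name of the distribution to load.

    Hypothetical rule: if a vLLM endpoint is configured through VLLM_URL,
    use "remote-vllm"; otherwise fall back to a local "ollama" instance.
    """
    return "remote-vllm" if env.get("VLLM_URL") else "ollama"


distribution = pick_distribution()
print(distribution)
```

The returned name can then be passed to `LlamaStackAsLibraryClient(...)` in place of a hard-coded distribution name.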

# Ollama Distribution

In [None]:
import os

from llama_stack_client import Agent
from llama_stack.distribution.library_client import LlamaStackAsLibraryClient

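# Model to use; it is also exported as INFERENCE_MODEL for the distribution.
model = "meta-llama/Llama-3.2-3B-Instruct"
os.environ["INFERENCE_MODEL"] = model
# Load the "ollama" distribution in-process instead of connecting to a server.
client = LlamaStackAsLibraryClient('ollama')
client.initialize()
agent = Agent(
  client,
  model=model,
  instructions="You are a helpful agent",
)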

# Send one non-streaming turn in a new session and print the reply.
response = agent.create_turn(
  messages=[{"role": "user", "content": "say Hi!"}],
  stream=False,
  session_id=agent.create_session('new_session'),
)
print(response.output_message.content)

# Remote vLLM Distribution

In [None]:
import os

from llama_stack_client import Agent
from llama_stack.distribution.library_client import LlamaStackAsLibraryClient

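# Model to use; it is also exported as INFERENCE_MODEL for the distribution.
model = "mistral-7b-instruct"
os.environ["INFERENCE_MODEL"] = model
# Placeholder token and endpoint; replace them with your vLLM service values.
os.environ["VLLM_API_TOKEN"] = "xxxx"
os.environ["VLLM_URL"] = "https://mistral-7b-instruct-v0-3-maas-services.com:443/v1"
# Load the "remote-vllm" distribution in-process.
client = LlamaStackAsLibraryClient('remote-vllm')
client.initialize()
agent = Agent(
  client,
  model=model,
  instructions="You are a helpful agent",
)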

# Send one non-streaming turn in a new session and print the reply.
response = agent.create_turn(
  messages=[{"role": "user", "content": "say Hi!"}],
  stream=False,
  session_id=agent.create_session('new_session'),
)
print(response.output_message.content)
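
Because the remote vLLM cell depends on several environment variables, it can help to check them before initializing the client. The sketch below is illustrative only: `missing_vars` is a hypothetical helper, and the list of names simply mirrors the variables set in the cell above rather than an official list of required settings.

```python
import os
from typing import Iterable, List, Mapping

# Variables set in the remote vLLM cell above (assumed to be the ones needed).
REMOTE_VLLM_VARS = ("INFERENCE_MODEL", "VLLM_URL", "VLLM_API_TOKEN")


def missing_vars(names: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names that are unset or empty in env, in the order given."""
    return [name for name in names if not env.get(name)]


missing = missing_vars(REMOTE_VLLM_VARS)
if missing:
    print("Set these before initializing the client:", ", ".join(missing))
```

Running this check first surfaces a missing variable as a short message before the client is initialized.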