# ChatLlamaStack

This will help you get started with Llama Stack [chat models](/docs/concepts/#chat-models). For detailed documentation of all ChatLlamaStack features and configurations, head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_llama_stack.chat_models.ChatLlamaStack.html).

## Overview
### Integration details

| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/docs/integrations/chat/llama_stack) | Package downloads | Package latest |
| :--- | :--- | :---: | :---: |  :---: | :---: | :---: |
| [ChatLlamaStack](https://api.python.langchain.com/en/latest/chat_models/langchain_llama_stack.chat_models.ChatLlamaStack.html) | [langchain-llama-stack](https://api.python.langchain.com/en/latest/llama_stack_api_reference.html) | ✅ | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-llama-stack?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-llama-stack?style=flat-square&label=%20) |

### Model features
| [Tool calling](/docs/how_to/tool_calling) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |
| :---: | :---: | :---: | :---: |  :---: | :---: | :---: | :---: | :---: | :---: |
| ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | 

## Setup

Install the `langchain-llama-stack` integration package.

### Credentials

Uncomment and use this if your Llama Stack distribution requires an API key.

In [None]:
# import getpass
# import os

# if not os.getenv("LLAMA_STACK_API_KEY"):
#     os.environ["LLAMA_STACK_API_KEY"] = getpass.getpass("Enter your Llama Stack API key: ")
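
The "prompt only if unset" pattern in the cell above can be wrapped in a small reusable function. The sketch below is illustrative only: `ensure_env_var` and its `ask` parameter are hypothetical names, not part of `langchain-llama-stack`; only the `LLAMA_STACK_API_KEY` variable name comes from the cell above.

```python
import getpass
import os
from typing import Callable


def ensure_env_var(
    name: str,
    prompt: str,
    ask: Callable[[str], str] = getpass.getpass,  # hypothetical hook, swappable in tests
) -> str:
    """Return os.environ[name], prompting for a value only when it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        value = ask(prompt)
        os.environ[name] = value
    return value


# Example call, commented out so that running this cell never blocks on input:
# ensure_env_var("LLAMA_STACK_API_KEY", "Enter your Llama Stack API key: ")
```

Passing the prompt function as an argument keeps the helper usable in non-interactive settings, where `getpass.getpass` would otherwise wait for input.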

If you want automated tracing of your model calls, you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting the lines below:

In [None]:
# import getpass
# import os
# os.environ["LANGCHAIN_TRACING_V2"] = "true"
# os.environ["LANGCHAIN_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")

### Installation

The LangChain Llama Stack integration lives in the `langchain-llama-stack` package:

In [None]:
%pip install -qU langchain-llama-stack

## Instantiation

Now we can instantiate our model object and generate chat completions:

In [3]:
from langchain_llama_stack import ChatLlamaStack

llm = ChatLlamaStack(
    model="meta/llama-3.1-8b-instruct",
    base_url="http://localhost:8321",
)
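
Because `base_url` points at a server you run yourself, it can be worth confirming that something is listening there before sending requests. The following is a minimal sketch using only the standard library; `server_is_listening` is a hypothetical helper, and a successful TCP connection only shows that some process accepts connections on that port, not that it is a Llama Stack server.

```python
import socket
from urllib.parse import urlparse


def server_is_listening(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to base_url's host and port succeeds."""
    parsed = urlparse(base_url)
    host = parsed.hostname or "localhost"
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar errors.
        return False


# base_url taken from the instantiation cell above.
print(server_is_listening("http://localhost:8321"))
```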

## Invocation

In [4]:
messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg

AIMessage(content='Here is the translation:\n\nJ\'adore le développement de logiciels.\n\n(Note: "de logiciels" is a more accurate translation of "programming" than just "le développement", which could also mean general project management or planning. "Programming" specifically refers to the activity of coding and software development.)', additional_kwargs={}, response_metadata={'stop_reason': 'end_of_turn'}, id='run-ee8c2938-03e4-4da7-a946-19821e1c1cb7-0')

In [5]:
print(ai_msg.content)

Here is the translation:

J'adore le développement de logiciels.

(Note: "de logiciels" is a more accurate translation of "programming" than just "le développement", which could also mean general project management or planning. "Programming" specifically refers to the activity of coding and software development.)


## Chaining

We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:

In [6]:
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)

chain = prompt | llm
chain.invoke(
    {
        "input_language": "English",
        "output_language": "German",
        "input": "I love programming.",
    }
)

AIMessage(content='Ich liebe Programmieren.\n\nWould you like me to help with any specific programming-related topics or would you like more translation?', additional_kwargs={}, response_metadata={'stop_reason': 'end_of_turn'}, id='run-b2842f34-de35-451d-a908-0ff3a768d902-0')
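
To make the first half of the chain more concrete, here is a plain-Python illustration of the substitution step: the variables passed to `chain.invoke` fill the `{placeholders}` in each message before the messages reach the model. This is not how `ChatPromptTemplate` is implemented (it also builds message objects and validates its inputs); `fill_messages` is a hypothetical helper used only for illustration.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (role, text), the same shape used in the cells above


def fill_messages(template: List[Message], variables: Dict[str, str]) -> List[Message]:
    """Substitute {placeholders} in each message text, leaving roles untouched."""
    return [(role, text.format(**variables)) for role, text in template]


template = [
    (
        "system",
        "You are a helpful assistant that translates {input_language} to {output_language}.",
    ),
    ("human", "{input}"),
]
filled = fill_messages(
    template,
    {
        "input_language": "English",
        "output_language": "German",
        "input": "I love programming.",
    },
)
for role, text in filled:
    print(f"{role}: {text}")
# system: You are a helpful assistant that translates English to German.
# human: I love programming.
```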

## API reference

For detailed documentation of all ChatLlamaStack features and configurations, head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_llama_stack.chat_models.ChatLlamaStack.html).