# LLMs

This page considers LangChain interfaces for LLMs.

In [1]:
import os
from langchain_ollama import ChatOllama

os.environ["HF_HUB_DISABLE_PROGRESS_BARS"] = "1"
model = ChatOllama(model="llama3.1", temperature=0)

## Invoke

The `invoke` method sends a request to the LLM.

In most cases it returns an `AIMessage`, but some wrappers return other types. For example, a model configured for structured output returns a `dict` or a `pydantic.BaseModel` instance.

---

The following cell shows the object returned by a subclass of `langchain_core.language_models.BaseChatModel`.

In [7]:
model.invoke("Hello world")

AIMessage(content='Hello! How can I assist you today?', additional_kwargs={}, response_metadata={'model': 'llama3.1', 'created_at': '2026-01-06T14:39:17.313641499Z', 'done': True, 'done_reason': 'stop', 'total_duration': 23004167206, 'load_duration': 22818871591, 'prompt_eval_count': 12, 'prompt_eval_duration': 27074996, 'eval_count': 10, 'eval_duration': 141396789, 'model_name': 'llama3.1', 'model_provider': 'ollama'}, id='lc_run--fe6d8bde-d9c8-4e4c-a7f2-df3839f539a6-0', usage_metadata={'input_tokens': 12, 'output_tokens': 10, 'total_tokens': 22})
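The `response_metadata` field in the output above carries provider-specific timing data. The sketch below shows one way to read it, using values copied by hand from that output so that it runs without a model. It assumes that Ollama reports durations in nanoseconds; with a live model the dictionary would come from `response.response_metadata`.

```python
# Values copied from the response shown above; with a live model,
# use `model.invoke(...).response_metadata` instead.
response_metadata = {
    "total_duration": 23004167206,
    "load_duration": 22818871591,
    "prompt_eval_duration": 27074996,
    "eval_count": 10,
    "eval_duration": 141396789,
}

NS_PER_S = 1_000_000_000  # assumption: Ollama durations are in nanoseconds

total_s = response_metadata["total_duration"] / NS_PER_S
load_s = response_metadata["load_duration"] / NS_PER_S
eval_s = response_metadata["eval_duration"] / NS_PER_S

# Generation speed: generated tokens divided by the time spent generating them.
tokens_per_s = response_metadata["eval_count"] / eval_s

print(f"total: {total_s:.2f} s, model load: {load_s:.2f} s")
print(f"generation speed: {tokens_per_s:.1f} tokens/s")
```

For this particular call almost all of the total time is model loading, which is typical of the first request after the model has been unloaded.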

## Structured output

Some providers support structured output: the model returns data in a format that you specify.

To make the model follow a format, use the `with_structured_output` method. It returns a modified chat object whose responses follow the given schema.

Check whether a provider supports structured output in the JSON mode column of the [featured providers](https://docs.langchain.com/oss/python/integrations/chat#featured-providers) section.

---

The following cell illustrates how user characteristics are extracted from the given text.

In [8]:
from pydantic import BaseModel


class OutputSchema(BaseModel):
    id: str
    name: str


structured_model = model.with_structured_output(OutputSchema)
response = structured_model.invoke(
    "Extract data: 'User llm_lover with id 777 tries to access the database.'"
)
response

OutputSchema(id='777', name='llm_lover')
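Because the result is an ordinary Pydantic object, it can be used like any other model instance. The following is a minimal sketch that needs no LLM call: the `response` object is built by hand as a stand-in for the value returned by `structured_model.invoke(...)`, and Pydantic v2 is assumed.

```python
from pydantic import BaseModel, ValidationError


class OutputSchema(BaseModel):
    id: str
    name: str


# Stand-in for the object returned by `structured_model.invoke(...)`.
response = OutputSchema(id="777", name="llm_lover")

print(response.name)          # fields are available as attributes
print(response.model_dump())  # convert to a plain dict

# The schema itself rejects data with missing fields.
try:
    OutputSchema.model_validate({"id": "777"})
except ValidationError as err:
    print(f"rejected: {err.error_count()} error(s)")
```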

## Tokens

The `langchain_core.language_models.base.BaseLanguageModel` class provides a set of methods that **estimate** the number of tokens a piece of text will take:

- `get_token_ids`: returns the IDs of the tokens in the given text.
- `get_num_tokens`: returns the number of tokens in the given text.
- `get_num_tokens_from_messages`: returns the number of tokens in the given list of messages.

**Note.** These methods use a generic default tokenizer rather than the model's own, so the real token count for a particular model may differ.

---

The following cell defines two models served through Ollama.

In [None]:
deepseek = ChatOllama(model="deepseek-r1:1.5b")
llama = ChatOllama(model="llama3")

test_text = "This is some tricky text: olala"

The output of the `get_token_ids` method for the two models:

In [57]:
print(deepseek.get_token_ids(test_text))
print(llama.get_token_ids(test_text))

[1212, 318, 617, 17198, 2420, 25, 25776, 6081]
[1212, 318, 617, 17198, 2420, 25, 25776, 6081]


The outputs are identical because both calls used the default tokenizer, even though the models themselves use different tokenizers.

The following cell shows the output of the `get_num_tokens` method.

In [59]:
deepseek.get_num_tokens(test_text)

8

The following cell applies the `get_num_tokens_from_messages` method.

In [62]:
from langchain_core import messages

deepseek.get_num_tokens_from_messages(
    [messages.HumanMessage(test_text)]
)

10

The count is higher because it also includes the extra tokens that wrap each message, such as the role prefix.