# Nebius Chat Models

[Nebius AI Studio](https://studio.nebius.ai/) provides API access to a wide range of state-of-the-art large language models through a unified interface.

This notebook shows how to use Nebius chat models through the `langchain-nebius` LangChain integration.

## Installation

In [None]:
%pip install --upgrade langchain-nebius

## Environment

Nebius requires an API key, which you can either pass as the `api_key` initialization parameter or set in the `NEBIUS_API_KEY` environment variable. You can obtain an API key by creating an account on [Nebius AI Studio](https://studio.nebius.ai/).


In [1]:
import os

# Set your API key as an environment variable (replace the placeholder with your real key)
os.environ["NEBIUS_API_KEY"] = "YOUR-NEBIUS-API-KEY"
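
Hard-coding a real key in a notebook makes it easy to leak through version control. As a minimal sketch using only the standard library, the hypothetical helper `ensure_api_key` below (not part of `langchain-nebius`) keeps an existing `NEBIUS_API_KEY` and otherwise asks for one through a prompt function, which defaults to `getpass.getpass`.

```python
import getpass
import os


def ensure_api_key(env=os.environ, prompt=getpass.getpass):
    """Return the Nebius API key, prompting for it only if it is not set.

    Hypothetical helper for illustration; not part of langchain-nebius.
    """
    key = env.get("NEBIUS_API_KEY", "").strip()
    if not key:
        # Nothing usable in the environment: ask interactively (input is hidden).
        key = prompt("Enter your Nebius API key: ").strip()
        if not key:
            raise ValueError("No Nebius API key provided")
        env["NEBIUS_API_KEY"] = key
    return key
```

Calling `ensure_api_key()` once at the top of the notebook stores the key in the environment for the rest of the session without writing it into a cell.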

## Chat Model Usage

The `ChatNebius` class allows you to interact with Nebius AI Studio's chat models.

In [2]:
from langchain_nebius import ChatNebius

# Initialize the chat model
chat = ChatNebius(
    # api_key="YOUR_API_KEY",  # You can pass the API key directly
    model="Qwen/Qwen3-14B",  # Choose from available models
    temperature=0.6,
    top_p=0.95,
)

### Basic usage

You can use the `invoke` method to get a completion from the model:

In [3]:
response = chat.invoke("Explain quantum computing in simple terms")
print(response.content)

<think>
Okay, the user asked me to explain quantum computing in simple terms. Let me start by recalling what I know about quantum computing. I need to make sure I break it down without using too much jargon. 

First, classical computers use bits, which are 0s and 1s. Quantum computers use qubits. But how do qubits differ? I remember they can be in a superposition of states. That means they can be both 0 and 1 at the same time. Maybe I should compare that to a spinning coin that's both heads and tails until it lands.

Then there's entanglement. When qubits are entangled, the state of one instantly affects the other, no matter the distance. That's a key point for quantum computing's power. But how to explain that simply? Maybe use a metaphor like two coins that are linked, so flipping one affects the other immediately.

Also, quantum computers can process a lot of possibilities at once. So for certain problems, like factoring large numbers or simulating molecules, they can be much faster
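
The output above begins with a `<think>` block: the model emits its reasoning before the final answer, and both arrive in `response.content`. A minimal sketch for separating the two with the standard library follows. `split_reasoning` is a hypothetical helper, and it assumes at most one `<think>` block, which may be left unclosed when the response is cut off.

```python
import re

# Matches one <think>...</think> block. The closing tag is optional so that
# truncated responses (reasoning cut off part-way) are still handled.
_THINK_RE = re.compile(r"<think>(.*?)(?:</think>|\Z)", re.DOTALL)


def split_reasoning(text):
    """Split model output into a (reasoning, answer) pair.

    Hypothetical helper; assumes at most one <think> block.
    """
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the matched block is treated as the answer.
    answer = (text[: match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

For example, `reasoning, answer = split_reasoning(response.content)` lets you print only `answer` while keeping `reasoning` available for debugging.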

### Streaming

You can also stream the response using the `stream` method:

In [4]:
for chunk in chat.stream("Write a short poem about artificial intelligence"):
    print(chunk.content, end="", flush=True)

<think>
Okay, the user wants a short poem about artificial intelligence. Let me start by thinking about the key aspects of AI. There's the technological side, like circuits and code, but also the more abstract concepts like learning and consciousness. I should balance the technical with the philosophical.

Maybe start with imagery related to machines and circuits to set the scene. Then touch on how AI processes information, perhaps using metaphors like "dance of data" or "neural webs." I need to highlight both the capabilities and the limitations—AI's precision versus its lack of human traits like dreams or emotions.

Also, consider the ethical angle. Questions about AI's purpose and its relationship with humans. Should I mention fears or hopes? Maybe hint at the duality of creation, like a mirror reflecting human potential and flaws. End with a reflective note, leaving the reader pondering the future. Keep the structure simple, maybe four-line stanzas with a rhyme scheme. Avoid being 
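
Printing chunks as they arrive discards them. If the full text is needed afterwards, the chunks can be accumulated while streaming. The sketch below relies only on each chunk exposing a string `.content` attribute, as in the loop above; `stream_and_collect` is a hypothetical helper name.

```python
def stream_and_collect(chunks, echo=True):
    """Print streamed chunks as they arrive and return the full text.

    `chunks` is any iterable of objects with a string `.content` attribute,
    for example the iterator returned by `chat.stream(...)`.
    """
    parts = []
    for chunk in chunks:
        if echo:
            print(chunk.content, end="", flush=True)
        parts.append(chunk.content)
    return "".join(parts)
```

A call such as `text = stream_and_collect(chat.stream("Write a short poem about artificial intelligence"))` keeps the live output and also returns the complete response.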

### Chat Messages

You can use different message types to structure your conversations with the model:

In [5]:
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage

messages = [
    SystemMessage(content="You are a helpful AI assistant with expertise in science."),
    HumanMessage(content="What are black holes?"),
    AIMessage(
        content="Black holes are regions of spacetime where gravity is so strong that nothing, including light, can escape from them."
    ),
    HumanMessage(content="How are they formed?"),
]

response = chat.invoke(messages)
print(response.content)

<think>
Okay, the user asked how black holes are formed. Let me start by recalling the basic process. I know that black holes form from the remnants of massive stars after they die. But I need to explain it step by step.

First, when a massive star runs out of fuel, it can no longer support itself against gravity. This leads to a supernova explosion. The core of the star collapses under its own gravity. If the core's mass is above a certain threshold, called the Tolman-Oppenheimer-Volkoff limit, it can't support itself and continues to collapse into a black hole.

Wait, I should mention the Chandrasekhar limit for white dwarfs, but that's for lower mass stars. For black holes, the key is the core's mass after the supernova. Also, there are different types of black holes: stellar-mass, supermassive, and maybe intermediate. The user might not know about these categories, so I should explain them briefly.

I should also mention that not all massive stars become black holes. If the core is
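
To continue a conversation, each reply has to be appended to the message list before the next call. The following is a minimal sketch of that bookkeeping using only the standard library. It stores turns as `(role, content)` tuples, which LangChain chat models generally accept in place of message objects (an assumption worth verifying against your installed version), and it keeps the system prompt plus the most recent turns. `Conversation` and `max_turns` are hypothetical names.

```python
class Conversation:
    """Keeps a system prompt plus the most recent (role, content) turns."""

    def __init__(self, system_prompt, max_turns=6):
        if max_turns < 1:
            raise ValueError("max_turns must be at least 1")
        self.system_prompt = system_prompt
        self.max_turns = max_turns  # number of non-system messages to keep
        self.turns = []

    def add(self, role, content):
        # "human" and "ai" correspond to HumanMessage and AIMessage.
        if role not in ("human", "ai"):
            raise ValueError(f"unsupported role: {role!r}")
        self.turns.append((role, content))
        # Drop the oldest turns once the history grows past the limit.
        self.turns = self.turns[-self.max_turns:]

    def as_messages(self):
        # The system prompt always comes first and is never trimmed.
        return [("system", self.system_prompt), *self.turns]
```

A typical loop would call `convo.add("human", question)`, then `reply = chat.invoke(convo.as_messages())`, then `convo.add("ai", reply.content)`.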

### Parameters

You can customize the chat model behavior using various parameters:

In [7]:
# Initialize with custom parameters
custom_chat = ChatNebius(
    model="meta-llama/Llama-3.3-70B-Instruct-fast",
    max_tokens=100,  # Limit response length
    top_p=0.01,  # Lower nucleus sampling parameter for more deterministic responses
    request_timeout=30,  # Timeout in seconds
    stop=["###", "\n\n"],  # Custom stop sequences
)

response = custom_chat.invoke("Explain what DNA is in exactly 3 sentences.")
print(response.content)

DNA, or deoxyribonucleic acid, is a molecule that contains the genetic instructions used in the development and function of all living organisms. It is often referred to as the "building blocks of life" because it carries the information necessary for the creation and growth of cells, tissues, and entire organisms. The DNA molecule is made up of two complementary strands of nucleotides that are twisted together in a double helix structure, with the sequence of these nucleotides determining the genetic code
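
The response above stops without a closing period, most likely because the `max_tokens=100` limit was reached. OpenAI-compatible chat APIs usually report this through a `finish_reason` of `"length"`, and LangChain integrations typically expose it in `response.response_metadata`. Whether `ChatNebius` populates that key is an assumption here, so the sketch below works on a plain dictionary and returns `None` when the field is absent. `was_truncated` is a hypothetical helper.

```python
def was_truncated(response_metadata):
    """Report whether generation stopped because of the token limit.

    Returns True for a "length" finish reason, False for any other reported
    reason, and None if no finish reason is present.
    Assumes an OpenAI-style `finish_reason` field.
    """
    reason = response_metadata.get("finish_reason")
    if reason is None:
        return None
    return reason == "length"
```

If the field is available, `was_truncated(response.response_metadata)` gives a quick signal that `max_tokens` should be raised or the prompt tightened.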


You can also pass parameters at invocation time:

In [8]:
# Model initialized with default sampling parameters
standard_chat = ChatNebius(model="meta-llama/Llama-3.3-70B-Instruct-fast")

# Override parameters at invocation time
response = standard_chat.invoke(
    "Tell me a joke about programming",
    temperature=0.9,  # More creative for jokes
    max_tokens=50,  # Keep it short
)

print(response.content)

Here's one:

Why do programmers prefer dark mode?

Because light attracts bugs.

I hope that one compiled correctly and made you laugh!


### Async Support

`ChatNebius` supports async operations through the `ainvoke` and `astream` methods:

In [9]:
import asyncio


async def generate_async():
    response = await chat.ainvoke("What is the capital of France?")
    print("Async response:", response.content)

    # Async streaming
    print("\nAsync streaming:")
    async for chunk in chat.astream("What is the capital of Germany?"):
        print(chunk.content, end="", flush=True)


# Top-level await works in Jupyter; in a plain script, use asyncio.run(generate_async())
await generate_async()

Async response: <think>
Okay, the user is asking for the capital of France. Let me think. I know that Paris is the capital, but I should make sure I'm not confusing it with another country. Wait, France's capital is definitely Paris. Let me double-check in my mind. Yes, other countries like Spain have Madrid, Germany has Berlin, but France is Paris. I should confirm there's no recent change. No, Paris has been the capital for a long time. So the answer should be Paris.
</think>

The capital of France is **Paris**. It is a major global city known for its cultural, artistic, and historical significance, as well as landmarks such as the Eiffel Tower, Louvre Museum, and Notre-Dame Cathedral.

Async streaming:
<think>
Okay, the user is asking for the capital of Germany. Let me think. I remember that Germany's capital is Berlin. But wait, I should make sure I'm not confusing it with another city. For example, some people might think it's Munich or Frankfurt because they're major cities. But 
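
The main benefit of the async methods is that several requests can be in flight at once. As a minimal sketch, the hypothetical helper `ask_all` below sends a list of prompts concurrently with `asyncio.gather` and caps the number of simultaneous requests with a semaphore. It only assumes that the model object has an async `ainvoke` method, as `chat` does above; the `max_concurrency` default is an arbitrary illustrative value.

```python
import asyncio


async def ask_all(model, prompts, max_concurrency=4):
    """Send several prompts concurrently and return replies in prompt order.

    `model` is any object with an async `ainvoke(prompt)` method.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def ask(prompt):
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            return await model.ainvoke(prompt)

    # gather preserves the order of its arguments in the result list.
    return await asyncio.gather(*(ask(p) for p in prompts))
```

In a notebook this would be used as `responses = await ask_all(chat, ["What is the capital of France?", "What is the capital of Germany?"])`.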

### Available Models

The full list of supported models can be found in the [Nebius AI Studio Documentation](https://studio.nebius.com/).

### Using in LangChain chains

You can use `ChatNebius` in LangChain chains and agents:

In [10]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Create a prompt template
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that answers in the style of {character}.",
        ),
        ("human", "{query}"),
    ]
)

# Create a chain
chain = prompt | chat | StrOutputParser()

# Invoke the chain
response = chain.invoke(
    {"character": "Shakespeare", "query": "Explain how the internet works"}
)

print(response)

<think>
Okay, the user wants me to explain how the internet works, but in the style of Shakespeare. Let me start by recalling the basic components of the internet. There's the physical infrastructure like cables and satellites, then the protocols like TCP/IP, and the various services like websites and emails.

Now, how to translate that into Shakespearean language. I need to use archaic terms and structures. Words like "thou," "doth," "hark," and "verily" come to mind. Maybe start with a metaphor comparing the internet to something from the Elizabethan era, like a vast network of roads or a web.

I should break down the explanation into parts: the physical layer (cables, satellites), the data transmission (packets, protocols), and the applications (websites, emails). Each part needs a poetic touch. For example, data packets could be "messengers" or "letters" traveling through the "highways of glass and wire."

Also, mention the role of servers and routers as "keepers of the realm" or "

## API Reference

For more details about the Nebius AI Studio API, visit the [Nebius AI Studio Documentation](https://studio.nebius.com/api-reference).