# Nebius

[Nebius AI Studio](https://studio.nebius.ai/) provides API access to large language models and embedding models.

This notebook shows how to use LangChain with Nebius AI Studio models for chat, embeddings, retrieval, and agent tools.

## Installation

In [None]:
%pip install --upgrade langchain-nebius

## Environment

To use Nebius AI Studio, you need an API key, which you can obtain from [Nebius AI Studio](https://studio.nebius.ai/). Pass the key as the `api_key` initialization parameter or set it in the `NEBIUS_API_KEY` environment variable.


In [2]:
import os

# Set the API key as an environment variable (replace the placeholder with your own key)
os.environ["NEBIUS_API_KEY"] = "YOUR-NEBIUS-API-KEY"
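The two ways of supplying the key described above can be wrapped in a small lookup so that a missing key fails early with a clear message. This is a minimal sketch: `resolve_api_key` is a hypothetical helper, not part of `langchain-nebius`, and the precedence it uses (explicit argument first, then the environment variable) is an assumption made for illustration.

```python
import os


def resolve_api_key(explicit_key=None, env_var="NEBIUS_API_KEY"):
    """Return the explicit key if given, otherwise the value of the environment variable.

    Hypothetical helper for illustration; raises if neither source provides a key.
    """
    key = explicit_key or os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"No API key found: pass one explicitly or set {env_var}")
    return key
```

The returned value could then be passed as `api_key=` when constructing a model, which keeps the placeholder string out of the notebook itself.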

### Available Models

The full list of supported models can be found in the [Nebius AI Studio Documentation](https://studio.nebius.com/).

## Chat Models

In [3]:
from langchain_nebius import ChatNebius

# Initialize the chat model
chat = ChatNebius(
    # api_key="YOUR_API_KEY",  # You can pass the API key directly
    model="Qwen/Qwen3-14B",  # Choose from available models
    temperature=0.6,
    top_p=0.95,
)

# Generate a response
response = chat.invoke("Explain quantum computing in simple terms")
print(response.content)

<think>
Okay, the user wants me to explain quantum computing in simple terms. Let me start by recalling what I know about quantum computing. It's a type of computing that uses quantum bits or qubits instead of classical bits. But how do I make that simple?

First, I should compare it to classical computers. Regular computers use bits that are either 0 or 1. Qubits can be 0, 1, or both at the same time thanks to superposition. That's a key point. Maybe use an analogy, like a spinning coin that's both heads and tails until it lands.

Then there's entanglement. When qubits are entangled, the state of one instantly affects the other, no matter the distance. That's a bit tricky, but maybe use the example of two coins that are linked, so flipping one affects the other instantly.

Quantum gates manipulate qubits, similar to logic gates in classical computers but with more possibilities. But I need to keep it simple, not too technical.

Applications are important too. Quantum computers could s

In [4]:
# Streaming responses
for chunk in chat.stream("Write a short poem about artificial intelligence"):
    print(chunk.content, end="", flush=True)

<think>
Okay, the user wants a short poem about artificial intelligence. Let me start by thinking about the key aspects of AI. There's the contrast between human and machine, the idea of learning and processing data, maybe some themes about the future or ethical considerations.

I should use imagery related to circuits, code, maybe something like neurons or networks. Rhyming scheme? Maybe a simple ABAB pattern to keep it flowing. Let me think of some lines. Start with something like "In circuits deep where silence speaks," to evoke the hidden, complex world of AI. Then mention learning from data, "A mind of code, yet learns to dream," to show the paradox of AI having a mind made of code but capable of learning.

Next stanza could touch on the duality of AI, like "It calculates the stars' cold dance," showing its capability in vast calculations, then contrast with "Yet aches to grasp the rose's chance," implying a longing for human experiences. Then maybe address the ethical side: "Can 

For a more detailed walkthrough of the ChatNebius component, see [this notebook](https://python.langchain.com/docs/integrations/chat/nebius/).
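The sample outputs above begin with a `<think>` section in which the model reasons before answering. If only the final answer is wanted, that section can be removed from the response text. The sketch below assumes the reasoning is closed with a matching `</think>` tag (the truncated outputs above do not show the closing tag, so this is an assumption); `strip_reasoning` is a hypothetical helper, not a `langchain-nebius` function.

```python
import re

# Matches a complete <think>...</think> block, including newlines inside it.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_reasoning(text: str) -> str:
    """Remove complete <think>...</think> blocks and return the remaining text.

    An unclosed <think> block is left untouched, since the answer cannot be
    separated from the reasoning reliably in that case.
    """
    return _THINK_BLOCK.sub("", text).strip()


sample = "<think>\nThe user wants a short answer.\n</think>\n\nQubits can be 0 and 1 at once."
print(strip_reasoning(sample))
```

With the chat model above, this would be applied as `strip_reasoning(response.content)`.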

## Embedding Models

In [5]:
from langchain_nebius import NebiusEmbeddings

# Initialize embeddings
embeddings = NebiusEmbeddings(
    # api_key="YOUR_API_KEY",  # You can pass the API key directly
    model="BAAI/bge-en-icl"  # Default embedding model
)

# Get embeddings for a query
query_embedding = embeddings.embed_query("What is machine learning?")
print(f"Embedding dimension: {len(query_embedding)}")
print(f"First few values: {query_embedding[:5]}")

# Get embeddings for documents
document_embeddings = embeddings.embed_documents(
    [
        "Machine learning is a subfield of AI",
        "Natural language processing is fascinating",
    ]
)
print(f"Number of document embeddings: {len(document_embeddings)}")

Embedding dimension: 4096
First few values: [0.007419586181640625, 0.002246856689453125, 0.00193023681640625, -0.0066070556640625, -0.0179901123046875]
Number of document embeddings: 2


For a more detailed walkthrough of the NebiusEmbeddings component, see [this notebook](https://python.langchain.com/docs/integrations/text_embedding/nebius/).
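Embedding vectors such as `query_embedding` and `document_embeddings` are typically compared with cosine similarity. The following is a minimal, dependency-free sketch of that comparison; it uses short toy vectors in place of real 4096-dimensional embeddings, and `cosine_similarity` is a helper defined here, not part of the library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings
print(cosine_similarity([1.0, 0.0], [0.6, 0.8]))
```

With real embeddings, the call would look like `cosine_similarity(query_embedding, document_embeddings[0])`.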

## Retrievers

In [6]:
from langchain_core.documents import Document
from langchain_nebius import NebiusEmbeddings, NebiusRetriever

# Create sample documents
docs = [
    Document(page_content="Paris is the capital of France"),
    Document(page_content="Berlin is the capital of Germany"),
    Document(page_content="Rome is the capital of Italy"),
    Document(page_content="Madrid is the capital of Spain"),
]

# Initialize embeddings
embeddings = NebiusEmbeddings()

# Create retriever
retriever = NebiusRetriever(
    embeddings=embeddings,
    docs=docs,
    k=2,  # Number of documents to return
)

# Retrieve relevant documents
results = retriever.invoke("What is the capital of France?")
for doc in results:
    print(doc.page_content)

Paris is the capital of France
Rome is the capital of Italy


For a more detailed walkthrough of the NebiusRetriever component, see [this notebook](https://python.langchain.com/docs/integrations/retrievers/nebius/).
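Conceptually, an embedding-based retriever scores every document against the query and returns the `k` best matches, which is why the output above contains two documents for `k=2`. The sketch below illustrates that selection step only; it is not the actual `NebiusRetriever` implementation, and the dot-product scoring and toy unit-length vectors are assumptions made for illustration.

```python
import heapq


def dot(a, b):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query_vec, doc_vecs, k=2, similarity=dot):
    """Return the k (index, score) pairs most similar to the query, best first."""
    scored = ((i, similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs))
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy unit-length vectors standing in for document embeddings
toy_docs = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.6], [-1.0, 0.0]]
print([index for index, _ in top_k([1.0, 0.0], toy_docs, k=2)])  # → [0, 2]
```

The returned indices would then be used to look up the corresponding `Document` objects.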

## Tools

In [7]:
from langchain_nebius import NebiusEmbeddings, NebiusRetriever, NebiusRetrievalTool
from langchain_core.documents import Document

# Create sample documents
docs = [
    Document(page_content="Paris is the capital of France and has the Eiffel Tower"),
    Document(
        page_content="Berlin is the capital of Germany and has the Brandenburg Gate"
    ),
    Document(page_content="Rome is the capital of Italy and has the Colosseum"),
    Document(page_content="Madrid is the capital of Spain and has the Prado Museum"),
]

# Create embeddings and retriever
embeddings = NebiusEmbeddings()
retriever = NebiusRetriever(embeddings=embeddings, docs=docs)

# Create retrieval tool
tool = NebiusRetrievalTool(
    retriever=retriever,
    name="nebius_search",
    description="Search for information about European capitals",
)

# Use the tool
result = tool.invoke({"query": "What is in Paris?", "k": 1})
print(result)

Document 1:
Paris is the capital of France and has the Eiffel Tower



In [8]:
# Using with an agent
from langchain.agents import create_openai_functions_agent, AgentExecutor
from langchain_openai import ChatOpenAI
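from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Requires an OpenAI API key; any failure is caught and reported below
try:
    # Create an LLM for the agent
    llm = ChatOpenAI(temperature=0)

    # Create a prompt template; the agent needs an `agent_scratchpad`
    # placeholder to hold its intermediate tool calls and results
    prompt = ChatPromptTemplate.from_messages(
        [
            (
                "system",
                "You are a helpful assistant that answers questions about European capitals.",
            ),
            ("user", "{input}"),
            MessagesPlaceholder(variable_name="agent_scratchpad"),
        ]
    )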

    # Create the agent
    agent = create_openai_functions_agent(llm, [tool], prompt)
    agent_executor = AgentExecutor(agent=agent, tools=[tool], verbose=True)

    # Run the agent
    response = agent_executor.invoke({"input": "What famous landmark is in Paris?"})
    print(f"\nFinal answer: {response['output']}")
except Exception as e:
    print(f"Skipped agent example: {e}")

Skipped agent example: `ChatOpenAI` is not fully defined; you should define `BaseCache`, then call `ChatOpenAI.model_rebuild()`.

For further information visit https://errors.pydantic.dev/2.11/u/class-not-fully-defined


## Building a RAG Application

In [9]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_nebius import ChatNebius, NebiusEmbeddings, NebiusRetriever

# Initialize components (reuses `docs` from the Tools section above)
embeddings = NebiusEmbeddings()
retriever = NebiusRetriever(embeddings=embeddings, docs=docs)
llm = ChatNebius(model="meta-llama/Llama-3.3-70B-Instruct-fast")

# Create prompt
prompt = ChatPromptTemplate.from_template(
    """
Answer the question based only on the following context:

Context:
{context}

Question: {question}
"""
)


# Format documents function
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


# Create RAG chain
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Run the chain
answer = rag_chain.invoke("What famous landmark is in Paris?")
print(answer)

The Eiffel Tower is a famous landmark in Paris.
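To make the data flow of the chain easier to follow, the sketch below reproduces in plain Python what reaches the model: the retrieved documents are joined by `format_docs` and substituted for `{context}`, while the question passes through unchanged. `Doc` is a hypothetical stand-in for `langchain_core.documents.Document`, and `build_prompt` is an illustrative helper, not a LangChain API.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for langchain_core.documents.Document."""

    page_content: str


TEMPLATE = (
    "Answer the question based only on the following context:\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
)


def format_docs(docs):
    """Join document texts with blank lines, as in the chain above."""
    return "\n\n".join(doc.page_content for doc in docs)


def build_prompt(question, retrieved_docs):
    # Mirrors the dict step of the chain: retrieved documents fill {context}
    # and the original question is passed through unchanged.
    return TEMPLATE.format(context=format_docs(retrieved_docs), question=question)


retrieved = [Doc("Paris is the capital of France and has the Eiffel Tower")]
print(build_prompt("What famous landmark is in Paris?", retrieved))
```

Printing the assembled prompt this way can help when debugging why a chain gives an unexpected answer, since it shows exactly which context the model was given.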


## API Reference

For more details about the Nebius AI Studio API, visit the [Nebius AI Studio Documentation](https://studio.nebius.com/api-reference).