# Running RAG Completion with Nebius LLM and Embedding Models

This notebook demonstrates how to build a **Retrieval-Augmented Generation (RAG)** system with LlamaIndex and Nebius AI. Nebius AI provides access to a variety of state-of-the-art LLMs and embedding models. The full list of available models is on the [Nebius AI Studio site](https://studio.nebius.ai/).

Visit [Nebius AI Studio](https://studio.nebius.ai/) and sign up to obtain an API key.

## Installation of Required Libraries

In [None]:
%pip install llama-index-llms-nebius llama-index-embeddings-nebius

In [None]:
%pip install -U llama-index

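## Setting Up Environment Variables

Store your API key in the `NEBIUS_API_KEY` environment variable. The code later in this notebook reads it from there with `os.getenv`.
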
In [None]:
# Set the API key in the environment, or pass it directly via the `api_key` argument of the LLM and embedding constructors.
import os
os.environ["NEBIUS_API_KEY"] = "your api key"
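
Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch of an alternative, the helper below reads `NEBIUS_API_KEY` from the environment and prompts for it only when it is missing. The name `ensure_api_key` and its parameters are hypothetical and not part of any Nebius or LlamaIndex API.

```python
import os
from getpass import getpass
from typing import Callable, MutableMapping


def ensure_api_key(
    env: MutableMapping[str, str] = os.environ,
    var_name: str = "NEBIUS_API_KEY",
    prompt: Callable[[str], str] = getpass,
) -> str:
    """Return the API key from `env`, prompting for it only if it is missing.

    Hypothetical helper for illustration; `env` and `prompt` are injectable
    so the function can be exercised without touching the real environment.
    """
    key = env.get(var_name, "").strip()
    if not key:
        # getpass hides the typed value, so the key never appears in cell output.
        key = prompt(f"Enter {var_name}: ").strip()
        if not key:
            raise ValueError(f"{var_name} must not be empty")
        env[var_name] = key
    return key
```

Calling `ensure_api_key()` once at the top of the notebook would leave `os.environ["NEBIUS_API_KEY"]` populated for the cells that follow.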


## Importing Required Modules

We will import the necessary modules from llama-index to work with Nebius LLM and embeddings.

In [None]:
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.nebius import NebiusEmbedding
from llama_index.llms.nebius import NebiusLLM

## Defining a Function for RAG Completion

The `run_rag_completion` function initializes the Nebius LLM and embedding models, loads the documents, builds a vector index over them, retrieves the five chunks most similar to the query (`similarity_top_k=5`), and generates a response from them.

Parameters:
  - document_dir (str): Path to the directory containing the documents.
  - query_text (str): The user query to answer from the documents.
  - embedding_model (str): The embedding model to use. Defaults to `BAAI/bge-en-icl`.
  - generative_model (str): The generative model to use. Defaults to `deepseek-ai/DeepSeek-V3`.

Returns:
  - str: The generated response based on the retrieved documents.
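
To make the retrieval step more concrete, here is a simplified sketch of top-k similarity search over embedding vectors. It uses toy two-dimensional vectors and plain cosine similarity; it is an illustration of the idea, not a description of how LlamaIndex implements `VectorStoreIndex` internally. The function names and the toy data are assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (chunk_index, score) pairs for the k chunks most similar to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings, made up for illustration only.
toy_chunks = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
toy_query = [1.0, 0.1]

ranked = top_k_chunks(toy_query, toy_chunks, k=2)
print(ranked)  # chunk 0 ranks first, then chunk 2
```

In a real pipeline the vectors would come from the embedding model, and the text of the top-ranked chunks would be passed to the LLM as context.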

In [None]:
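# Optional helper that wraps a raw completion in an instruction-style chat template.
# It is not used by `run_rag_completion` below; adjust the template to match your model.
def completion_to_prompt(completion: str) -> str:
    return f"<s>[INST] {completion} [/INST] </s>\n"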


def run_rag_completion(
    document_dir: str,
    query_text: str,
    embedding_model: str = "BAAI/bge-en-icl",
    generative_model: str = "deepseek-ai/DeepSeek-V3",
) -> str:
    # Read the API key once from the environment.
    api_key = os.getenv("NEBIUS_API_KEY")

    llm = NebiusLLM(model=generative_model, api_key=api_key)
    embed_model = NebiusEmbedding(model_name=embedding_model, api_key=api_key)

    # Register the models as the global defaults used by LlamaIndex.
    Settings.llm = llm
    Settings.embed_model = embed_model

    # Load the documents and build an in-memory vector index over them.
    documents = SimpleDirectoryReader(document_dir).load_data()
    index = VectorStoreIndex.from_documents(documents)

    # Retrieve the 5 most similar chunks and generate a response from them.
    response = index.as_query_engine(similarity_top_k=5).query(query_text)

    return str(response)
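
The "augmented generation" half of RAG amounts to placing the retrieved chunks in the prompt ahead of the question. The sketch below shows one simple way this could be done. The template wording and the name `build_context_prompt` are assumptions for illustration; the prompt templates LlamaIndex actually uses inside its query engine differ.

```python
from typing import Sequence


def build_context_prompt(query: str, chunks: Sequence[str]) -> str:
    """Combine retrieved chunks and the user query into a single prompt string.

    Illustrative only: this is not the template used by LlamaIndex.
    """
    # Number each chunk so the model (and the reader) can tell them apart.
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Context information is below.\n"
        "---------------------\n"
        f"{context}\n"
        "---------------------\n"
        "Using only the context above, answer the question.\n"
        f"Question: {query}\n"
        "Answer:"
    )


# Made-up chunks standing in for retrieved document text.
example_prompt = build_context_prompt(
    "What is the invoice total?",
    ["Invoice issued to Example Corp.", "  Total due: 100 USD  "],
)
print(example_prompt)
```

Restricting the model to the supplied context is a common design choice in RAG prompts, since it discourages answers that are not grounded in the retrieved documents.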

## Running the RAG Completion Process

We specify the document directory and the query text, then call the `run_rag_completion` function.

In [None]:
query_text = "Give me all the details of the invoice in short"
document_dir = "./data"

response = run_rag_completion(document_dir, query_text)
print(response)
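
The example assumes that `./data` exists and contains at least one document. A small pre-flight check, sketched below with the standard library only, can surface a missing or empty directory with a clear message before any API calls are made. The helper `list_input_files` is hypothetical, and it only inspects top-level, non-hidden files, which is a simplification chosen for this example.

```python
from pathlib import Path
from typing import List


def list_input_files(document_dir: str) -> List[Path]:
    """Return the top-level, non-hidden files in `document_dir`, sorted by name.

    Raises FileNotFoundError if the directory is missing and ValueError if it
    contains no usable files. Hypothetical helper for illustration.
    """
    root = Path(document_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Document directory not found: {root}")
    files = sorted(
        p for p in root.iterdir() if p.is_file() and not p.name.startswith(".")
    )
    if not files:
        raise ValueError(f"No files found in {root}")
    return files
```

Calling `list_input_files(document_dir)` just before `run_rag_completion(document_dir, query_text)` would also give a quick view of which files are about to be indexed.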