<a target="_blank" href="https://colab.research.google.com/github/UpstageAI/cookbook/blob/main/Solar-Fullstack-LLM-101/09_9_RAG_API.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Upstage RAG API (experimental)



## Environment

In [2]:
!pip install -qU langchain-upstage langchain python-dotenv openai rich


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.3.1[0m[39;49m -> [0m[32;49m24.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


### Environment variables

Set the following environment variable:
* UPSTAGE_API_KEY


In [3]:
# @title set API key
from pprint import pprint
from rich import print as rprint

import os

import warnings

warnings.filterwarnings("ignore")

if "google.colab" in str(get_ipython()):
    # Running in Google Colab. Please set the UPSTAGE_API_KEY in the Colab Secrets
    from google.colab import userdata

    os.environ["UPSTAGE_API_KEY"] = userdata.get("UPSTAGE_API_KEY")
else:
    # Running locally. Please set the UPSTAGE_API_KEY in the .env file
    from dotenv import load_dotenv

    load_dotenv()

assert (
    "UPSTAGE_API_KEY" in os.environ
), "Please set the UPSTAGE_API_KEY environment variable"

In [14]:
rag_base_url = "https://experimental-api.x.upstage.ai/rag-api/"

# Please choose your model name
rag_model_name = "sung-pro-1001"

In [24]:
from openai import OpenAI


client = OpenAI(base_url=rag_base_url, api_key=os.environ["UPSTAGE_API_KEY"])

print("Building/Rebuilding the index...")
directory = "pdfs"
# Upload all files in the pdfs folder
for file in os.listdir(directory):
    file_path = os.path.join(directory, file)
    with open(file_path, "rb") as f:
        file_response = client.files.create(
            file=f, purpose="assistants", extra_body={"model_name": rag_model_name}
        )
        rprint(file_response)

print("Index build/rebuild submission completed.")

public_model_name = file_response.public_model_name
print("Public Model Name:", public_model_name)

Building/Rebuilding the index...


Index build/rebuild submission completed.
Public Model Name: 36127e427ea30842d7f1
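
`os.listdir` also returns subdirectories and non-PDF files, which the upload loop above would try to open. A minimal sketch of a filter that could run before the loop, assuming only `.pdf` files should be indexed. `list_pdf_files` is a hypothetical helper, not part of the Upstage or OpenAI SDK:

```python
import os


def list_pdf_files(directory):
    """Return sorted paths of regular files ending in .pdf (case-insensitive)."""
    paths = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # Skip subdirectories and anything that is not a PDF
        if os.path.isfile(path) and name.lower().endswith(".pdf"):
            paths.append(path)
    return sorted(paths)
```

The upload loop could then iterate over `list_pdf_files(directory)` instead of `os.listdir(directory)`.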


In [16]:
# Import necessary libraries
from langchain_upstage import ChatUpstage as Chat

# Initialize the ChatUpstage model
# Use the previously defined rag_model_name and rag_base_url
chat = Chat(model=rag_model_name, base_url=rag_base_url)

# Send a query to the chat model
question = "What is Document AI?"
response = chat.invoke(question)

# Pretty-print the full response object
rprint(response)

## Upstage RAG API Configuration Options

The `Chat` object is initialized with several options to customize its behavior:

1. `api_key`: The API key for authentication. Get one from https://console.upstage.ai/

2. `model`: The name of the RAG (Retrieval-Augmented Generation) model to be used.

3. `base_url`: The base URL for the RAG API.

4. `extra_body`: A dictionary containing additional parameters for the API request:

   - `hybrid_search` (bool): When set to `True`, enables a combination of semantic and keyword-based search for improved retrieval accuracy.
   
   - `contextual_query` (bool): If `True`, the system considers the context of previous interactions when processing the current query.
   
   - `contextual_chunk` (bool): When enabled, the system retrieves and processes information in context-aware chunks, potentially improving relevance.
   
   - `knowledge_graph` (bool): If set to `True`, the system utilizes a knowledge graph to enhance understanding and connections between concepts.
   
   - `kv_pairs` (bool): When enabled, the system extracts and utilizes key-value pairs from the input, which can be useful for structured data processing.
   
   - `llm_model_name` (str): Specifies the name of the language model to be used, for example `"solar-pro"`.

Each option can be switched on or off independently, so retrieval behavior can be matched to the use case and may improve the relevance and accuracy of responses.
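
A misspelled key in `extra_body` is easy to miss, so it can help to validate the dictionary before sending it. The following is a minimal sketch under the assumption that the option names and types are exactly those listed above. `RAG_OPTION_TYPES` and `build_extra_body` are hypothetical names, not part of any SDK, and the `verbose` flag used in later cells is not in the list and is therefore left out here:

```python
# Option names and types as listed above; the helper itself is hypothetical.
RAG_OPTION_TYPES = {
    "hybrid_search": bool,
    "contextual_query": bool,
    "contextual_chunk": bool,
    "knowledge_graph": bool,
    "kv_pairs": bool,
    "llm_model_name": str,
}


def build_extra_body(**options):
    """Validate option names and types, then return a dict for `extra_body`."""
    for name, value in options.items():
        if name not in RAG_OPTION_TYPES:
            raise ValueError(f"Unknown RAG option: {name}")
        expected = RAG_OPTION_TYPES[name]
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be {expected.__name__}")
    return dict(options)


extra_body = build_extra_body(hybrid_search=True, llm_model_name="solar-pro")
print(extra_body)
```

The returned dictionary can be passed as `extra_body=` when creating the chat model or calling `client.chat.completions.create`.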

In [17]:
# Import necessary libraries
from langchain_upstage import ChatUpstage as Chat

# Initialize the ChatUpstage model
# Use the previously defined rag_model_name and rag_base_url
chat = Chat(
    api_key=os.environ["UPSTAGE_API_KEY"],
    model=rag_model_name,
    base_url=rag_base_url,
    extra_body={
        "hybrid_search": True,
        "contextual_query": True,
        "contextual_chunk": True,
        "knowledge_graph": True,
        "kv_pairs": True,
        "llm_model_name": "solar-pro",
    },
)

# Send a query to the chat model
question = "What is Document AI?"
response = chat.invoke(question)

# Pretty-print the full response object
rprint(response)

In [18]:
from openai import OpenAI

# Initialize the OpenAI client with custom base URL and API key
client = OpenAI(
    base_url=rag_base_url,  # Using the custom RAG API base URL
    api_key=os.environ[
        "UPSTAGE_API_KEY"
    ],  # Using the API key from environment variables
)

# Send a request to the RAG model
response = client.chat.completions.create(
    model=rag_model_name,  # Using the previously defined RAG model name
    messages=[{"role": "user", "content": "What's Solar 10.7B LLM MMLU?"}],  # The user's query
    stream=False,  # Disable streaming for synchronous response
    extra_body={
        "hybrid_search": True,  # Enable hybrid search for better results
        "contextual_chunk": False,
        "contextual_query": True,
        "knowledge_graph": False,
        "kv_pairs": False,
        "verbose": True,  # Request verbose output for debugging
    },
)

# Print the model's response
rprint(response)

In [19]:
# Contextual Query

from openai import OpenAI

# Initialize the OpenAI client with custom base URL and API key
client = OpenAI(
    base_url=rag_base_url,  # Using the custom RAG API base URL
    api_key=os.environ[
        "UPSTAGE_API_KEY"
    ],  # Using the API key from environment variables
)
# Send a request to the RAG model
response = client.chat.completions.create(
    model=rag_model_name,  # Using the previously defined RAG model name
    messages=[
        {"role": "user", "content": "What are the benefits of Document AI?"},
        {"role": "assistant", "content": "It can help you understand Document AI"},
        {"role": "user", "content": "How about its features?"},  # The user's query
    ],
    stream=False,  # Disable streaming for synchronous response
    extra_body={
        "contextual_query": True,
        "contextual_chunk": False,
        "knowledge_graph": False,
        "kv_pairs": False,
        "verbose": True,  # Request verbose output for debugging
    },
)

# Print the model's response
rprint(response)
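
The request above carries the earlier turns in `messages`, which is what lets a follow-up such as "How about its features?" be read against the previous exchange when `contextual_query` is enabled. A minimal sketch of assembling that list from (question, answer) pairs. `build_messages` is a hypothetical helper and assumes the OpenAI-style `role`/`content` message format used above:

```python
def build_messages(history, question):
    """Turn (user_text, assistant_text) pairs plus a new question into messages."""
    messages = []
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The follow-up question always goes last
    messages.append({"role": "user", "content": question})
    return messages


history = [("What does Document AI do?", "It processes documents.")]
messages = build_messages(history, "How about its features?")
print(len(messages))
```

The resulting list can be passed as `messages=` to `client.chat.completions.create`.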

In [20]:
from openai import OpenAI

# Initialize the OpenAI client with custom base URL and API key
client = OpenAI(
    base_url=rag_base_url,  # Using the custom RAG API base URL
    api_key=os.environ[
        "UPSTAGE_API_KEY"
    ],  # Using the API key from environment variables
)

# Send a request to the RAG model
response = client.chat.completions.create(
    model=rag_model_name,  # Using the previously defined RAG model name
    messages=[{"role": "user", "content": "Explain Document AI"}],  # The user's query
    stream=False,  # Disable streaming for synchronous response
    extra_body={
        "hybrid_search": True,  # Enable hybrid search for better results
        "contextual_query": True,  # Enable contextual querying
        "knowledge_graph": True,
        "kv_pairs": True,
        "verbose": True,  # Request verbose output for debugging
    },
)

# Print the model's response
rprint(response)

In [21]:
# Initialize the chat model using the public model name (accessible to everyone)
chat = Chat(model=public_model_name, base_url=rag_base_url)

# Define the question about Document AI and its usage
question = "What's Document AI? How can it be used?"

# Invoke the chat model with the question
response = chat.invoke(question)
rprint(response)

In [22]:
# RAG API with history
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage
from langchain_core.output_parsers import StrOutputParser

prompt_template = ChatPromptTemplate.from_messages(
    [
        MessagesPlaceholder("chat_history"),
        ("user", "{input}"),
    ]
)
chat = Chat(model=public_model_name, base_url=rag_base_url)

chain = prompt_template | chat

questions = [
    "What are the benefits of Document AI?",
    "How about its features?",
    "Why do we need it?",
]

chat_history = []

for question in questions:
    print("Question: ", question)

    human_message = HumanMessage(content=question)
    response = chain.invoke({"input": question, "chat_history": chat_history})
    rprint(response)
    chat_history.append(human_message)
    chat_history.append(response)

# Print the chat history
print("Chat History:")
for message in chat_history:  # avoid rebinding `chat`, which holds the model
    rprint(message)

Question:  What are the benefits of Document AI?


Question:  How about its features?


Question:  Why do we need it?


Chat History:
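
The loop above passes the history accumulated so far with every call and appends both the question and the answer afterwards. A minimal offline sketch of that pattern, with the model replaced by a stand-in function. `run_conversation` and `echo_model` are hypothetical names, and plain tuples stand in for LangChain message objects:

```python
def run_conversation(ask, questions):
    """Ask each question with the history so far; return the full history."""
    history = []
    for question in questions:
        # Pass a copy so the callee sees only the turns before this question
        answer = ask(question, list(history))
        history.append(("user", question))
        history.append(("assistant", answer))
    return history


def echo_model(question, history):
    """Stand-in for chain.invoke: reports how many prior messages it received."""
    return f"answer to {question!r} after {len(history)} prior messages"


for role, text in run_conversation(echo_model, ["first", "second"]):
    print(role, "->", text)
```

Each call sees two more prior messages than the previous one, which mirrors how `chat_history` grows in the cell above.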


In [23]:
# A public model name is read-only
rprint(f"Trying to upload a file to the public model: {public_model_name}")

# Expect a 400 BadRequest error
# To upload a file, use the private model name instead

try:
    # Open the file in a context manager so the handle is always closed
    with open("pdfs/solar_paper.pdf", "rb") as f:
        file_response = client.files.create(
            file=f,
            purpose="assistants",
            extra_body={"model_name": public_model_name},
        )
except Exception as e:
    rprint(e)
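
The `except Exception` above catches any failure, not only the expected rejection. A minimal sketch of a narrower check, assuming the raised error exposes an integer `status_code` attribute, as the `openai` package's `APIStatusError` subclasses do. `is_bad_request` and `FakeStatusError` are hypothetical names used only for illustration:

```python
def is_bad_request(error):
    """Return True when the error carries HTTP status 400."""
    return getattr(error, "status_code", None) == 400


class FakeStatusError(Exception):
    """Stand-in for an HTTP error, used only to exercise the helper offline."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


print(is_bad_request(FakeStatusError(400)))
print(is_bad_request(ValueError("not an HTTP error")))
```

Inside the `except` block, errors for which `is_bad_request(e)` is false could be re-raised so that unexpected failures stay visible.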