# Calling Composition of Experts (CoE) Models

This notebook demonstrates how to use the `use_coe_model.py` script to call SambaNova Composition of Experts (CoE) models. We'll explore three approaches:

1. Using SambaVerse to call CoE Model
2. Using SambaStudio to call CoE with Named Expert
3. Using SambaStudio to call CoE with Routing

Before we begin, make sure you have the `use_coe_model.py` script in the same directory as this notebook.
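
A quick way to confirm the script is in place before running the imports is sketched below. The helper name `script_available` is hypothetical and is not part of `use_coe_model.py`; it only checks that a file with that name exists in the given directory.

```python
from pathlib import Path


def script_available(directory=".", name="use_coe_model.py"):
    """Return True if the helper script exists in the given directory (hypothetical helper)."""
    return (Path(directory) / name).is_file()


if not script_available():
    # The imports in the examples will fail if the script is missing.
    print("use_coe_model.py was not found in the current directory.")
```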

Let's get started!

## Example 1: Using SambaVerse to call CoE Model

In this example, we'll call a CoE model through SambaVerse. The request names the expert to use and authenticates with a SambaVerse API key, which the code reads from the `SAMBAVERSE_API_KEY` environment variable.

In [None]:
from use_coe_model import SambaNovaEmbeddingModel, SambaverseEndpoint, create_stuff_documents_chain, create_retrieval_chain
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain.prompts import ChatPromptTemplate
import os

# Update the config.yaml file with the following:
# api: sambaverse
# llm:
#   sambaverse_model_name: "Mistral/Mistral-7B-Instruct-v0.2"
#   samabaverse_select_expert: "Mistral-7B-Instruct-v0.2"

# Create the embedding model used to index the documents
embeddings = SambaNovaEmbeddingModel()

# Load documents and split into chunks
loader = WebBaseLoader("https://docs.smith.langchain.com")
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(docs)

# Create a vector database using Chroma
vector = Chroma.from_documents(documents, embeddings)

# Define the prompt template
prompt = ChatPromptTemplate.from_template(
    """Answer the following question based only on the provided context:

    <context>
    {context}
    </context>

    Question: {input}"""
)

# Set up the language model
llm = SambaverseEndpoint(
    sambaverse_model_name="Mistral/Mistral-7B-Instruct-v0.2",
    sambaverse_api_key=os.getenv("SAMBAVERSE_API_KEY"),
    model_kwargs={
        "do_sample": False,
        "max_tokens_to_generate": 1024,
        "temperature": 0.1,
        "process_prompt": True,
        "select_expert": "Mistral-7B-Instruct-v0.2",
    },
)

# Create the document chain and retrieval chain
document_chain = create_stuff_documents_chain(llm, prompt)
retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

# Invoke the retrieval chain with the user query
user_query = "Give me the code for creating a vector db in langchain"
response = retrieval_chain.invoke({"input": user_query})
print(response["answer"])
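
Because `os.getenv` returns `None` when a variable is unset, a missing `SAMBAVERSE_API_KEY` would only surface later as an authentication failure. The sketch below fails early with a clearer message. The helper name `require_env` is hypothetical and is not part of `use_coe_model.py`.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Possible usage when building the endpoint:
# sambaverse_api_key = require_env("SAMBAVERSE_API_KEY")
```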

## Example 2: Using SambaStudio to call CoE with Named Expert

In this example, we'll use SambaStudio to call the CoE model with a named expert, chosen through the `select_expert` entry in `model_kwargs`.

In [None]:
from use_coe_model import SambaNovaEmbeddingModel, SambaNovaEndpoint, create_stuff_documents_chain, create_retrieval_chain
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain.prompts import ChatPromptTemplate

# Update the config.yaml file with the following:
# api: sambastudio
# llm:
#   samabaverse_select_expert: "Mistral-7B-Instruct-v0.2"

# Create the embedding model used to index the documents
embeddings = SambaNovaEmbeddingModel()

# Load documents and split into chunks
loader = WebBaseLoader("https://docs.smith.langchain.com")
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(docs)

# Create a vector database using Chroma
vector = Chroma.from_documents(documents, embeddings)

# Define the prompt template
prompt = ChatPromptTemplate.from_template(
    """Answer the following question based only on the provided context:

    <context>
    {context}
    </context>

    Question: {input}"""
)

# Set up the language model
llm = SambaNovaEndpoint(
    model_kwargs={
        "do_sample": True,
        "temperature": 0.1,
        "max_tokens_to_generate": 1024,
        "select_expert": "Mistral-7B-Instruct-v0.2",
        "process_prompt": False,
    }
)

# Create the document chain and retrieval chain
document_chain = create_stuff_documents_chain(llm, prompt)
retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

# Invoke the retrieval chain with the user query
user_query = "Give me the code for creating a vector db in langchain"
response = retrieval_chain.invoke({"input": user_query})
print(response["answer"])

## Example 3: Using SambaStudio to call CoE with Routing

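In this example, we'll use SambaStudio to call the CoE model with routing. Instead of naming the expert up front, we first call `get_expert()`, which sends the user query to SambaStudio with a custom prompt workflow to classify it. We then map the returned expert category to a model name and pass that name as `select_expert`.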

In [None]:
from use_coe_model import SambaNovaEmbeddingModel, SambaNovaEndpoint, create_stuff_documents_chain, create_retrieval_chain, get_expert, get_expert_val
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain.prompts import ChatPromptTemplate

# Update the config.yaml file with the following:
# api: sambastudio
# llm:
#   coe_routing: true

# Create the embedding model used to index the documents
embeddings = SambaNovaEmbeddingModel()

# Load documents and split into chunks
loader = WebBaseLoader("https://docs.smith.langchain.com")
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(docs)

# Create a vector database using Chroma
vector = Chroma.from_documents(documents, embeddings)

# Define the prompt template
prompt = ChatPromptTemplate.from_template(
    """Answer the following question based only on the provided context:

    <context>
    {context}
    </context>

    Question: {input}"""
)

user_query = "Give me the code for creating a vector db in langchain"

# Get the expert by calling SambaStudio with a custom prompt workflow
expert_response = get_expert(user_query)
print(f"Expert response: {expert_response}")

# Extract the expert name from the response
expert = get_expert_val(expert_response)
print(f"Expert: {expert}")

# Look up the model name based on the expert
named_expert = {
    "Finance expert": "finance-chat",
    "Math expert": "deepseek-llm-67b-chat",
    "Code expert": "deepseek-llm-67b-chat",
    "Medical expert": "medicine-chat",
    "Legal expert": "law-chat",
    "Generalist": "Mistral-7B-Instruct-v0.2",
}[expert]
print(f"Named expert: {named_expert}")

# Set up the language model
llm = SambaNovaEndpoint(
    model_kwargs={
        "do_sample": True,
        "temperature": 0.1,
        "max_tokens_to_generate": 1024,
        "select_expert": named_expert,
        "process_prompt": False,
    }
)

# Create the document chain and retrieval chain
document_chain = create_stuff_documents_chain(llm, prompt)
retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

# Invoke the retrieval chain with the user query
response = retrieval_chain.invoke({"input": user_query})
print(f"Response: {response['answer']}")
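
The dictionary lookup in the cell above raises a `KeyError` if the router returns a label that is not in the mapping. One possible way to make that step more forgiving is sketched below, falling back to the generalist model. The names `EXPERT_TO_MODEL` and `resolve_model` are hypothetical, and choosing the generalist as the fallback is an assumption rather than documented behavior of the script.

```python
# Mapping copied from Example 3.
EXPERT_TO_MODEL = {
    "Finance expert": "finance-chat",
    "Math expert": "deepseek-llm-67b-chat",
    "Code expert": "deepseek-llm-67b-chat",
    "Medical expert": "medicine-chat",
    "Legal expert": "law-chat",
    "Generalist": "Mistral-7B-Instruct-v0.2",
}

# Assumption: unknown labels fall back to the generalist model.
DEFAULT_MODEL = EXPERT_TO_MODEL["Generalist"]


def resolve_model(expert_label):
    """Map an expert label to a model name, using the generalist for unknown labels."""
    label = (expert_label or "").strip()
    return EXPERT_TO_MODEL.get(label, DEFAULT_MODEL)
```

With this helper, `named_expert = resolve_model(expert)` could stand in for the direct lookup.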

In each example, we walk through the following steps:

1. Update the `config.yaml` file with the appropriate API information and LLM parameters.
2. Create a `SambaNovaEmbeddingModel` object for embeddings.
3. Load documents from a URL and split them into chunks.
4. Create a vector database using Chroma.
5. Define the prompt template.
6. Set up the language model based on the example configuration.
7. Create the document chain and retrieval chain.
8. Invoke the retrieval chain with the user query.
9. Print the response.
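
To give a feel for what `chunk_size=1000` and `chunk_overlap=200` mean in step 3, here is a simplified fixed-window sketch. It is not how `RecursiveCharacterTextSplitter` works internally: that splitter also tries to break on separators such as paragraph and line breaks. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks
```

Each chunk starts 800 characters after the previous one, so the last 200 characters of one chunk reappear at the start of the next. This overlap helps keep sentences that straddle a boundary retrievable from at least one chunk.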

For Example 3, we additionally:
- Call `get_expert()` to determine the appropriate expert based on the user query.
- Extract the expert name using `get_expert_val()`.
- Look up the model name based on the expert.
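
The implementation of `get_expert_val()` is not shown in this notebook. As an illustration only, the sketch below shows one hypothetical way to pull an expert label out of free-form router text by searching for the known category names. The function name `extract_expert_label`, the assumed response format, and the generalist default are all assumptions and may differ from what the script actually does.

```python
# Category names taken from the mapping in Example 3.
KNOWN_EXPERTS = [
    "Finance expert",
    "Math expert",
    "Code expert",
    "Medical expert",
    "Legal expert",
    "Generalist",
]


def extract_expert_label(raw_response, default="Generalist"):
    """Return the first known expert label found in the text (case-insensitive)."""
    text = raw_response.lower()
    for label in KNOWN_EXPERTS:
        if label.lower() in text:
            return label
    # Assumption: fall back to the generalist when no category is recognized.
    return default
```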

Feel free to explore and experiment with different configurations and queries to see how the CoE models respond!
