# Calling Composition Of Experts (CoE) Models

This notebook demonstrates how to use the `use_CoE_model.py` script to call Composition Of Experts (CoE) models in three different ways:

1. Using Sambaverse to call CoE Model
2. Using SambaStudio to call CoE with Named Expert
3. Using SambaStudio to call CoE with Routing

Before we begin, make sure the `use_CoE_model.py` script is in the same directory as this notebook, since the first cell imports from it directly.

Let's get started!

In [1]:
import os
import sys
import yaml

from use_CoE_model import SambaStudioEmbeddings, Sambaverse, SambaStudio, create_stuff_documents_chain, create_retrieval_chain, get_expert, get_expert_val
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.prompts import ChatPromptTemplate

current_dir = os.getcwd()
kit_dir = os.path.abspath(os.path.join(current_dir, ".."))
repo_dir = os.path.abspath(os.path.join(kit_dir, ".."))
CONFIG_PATH = os.path.join(current_dir, "config.yaml")

sys.path.append(kit_dir)
sys.path.append(repo_dir)

from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv(os.path.join(current_dir, ".env"))



True

## Example 1: Using Sambaverse to call CoE Model

In this example, we'll call the CoE model through Sambaverse, which is where both the expert name and the API key come from. The first cell checks which of the relevant environment variables are set.

In [2]:
# List of key environment variables to check
env_vars_to_check = [
    "SAMBASTUDIO_BASE_URL",
    "SAMBASTUDIO_BASE_URI",
    "SAMBASTUDIO_PROJECT_ID",
    "SAMBASTUDIO_ENDPOINT_ID",
    "SAMBASTUDIO_API_KEY",
    "SAMBAVERSE_API_KEY",  # Include this if you're using Sambaverse
    "SAMBASTUDIO_EMBEDDINGS_BASE_URL",
    "SAMBASTUDIO_EMBEDDINGS_PROJECT_ID",
    "SAMBASTUDIO_EMBEDDINGS_ENDPOINT_ID",
    "SAMBASTUDIO_EMBEDDINGS_API_KEY",
]

# Print the values of the environment variables
print("Environment Variables:")
for var in env_vars_to_check:
    value = os.getenv(var)
    if value:
        # Print only the first few characters of the API keys for security
        if "API_KEY" in var:
            print(f"{var}: {value[:5]}...{value[-5:]}")
        else:
            print(f"{var}: {value}")
    else:
        print(f"{var}: Not set")

Environment Variables:
SAMBASTUDIO_BASE_URL: https://sjc3-e6.sambanova.net
SAMBASTUDIO_BASE_URI: Not set
SAMBASTUDIO_PROJECT_ID: 59945dca-705c-44c1-8b6a-75d89d6d70d8
SAMBASTUDIO_ENDPOINT_ID: 24222811-cbc6-4365-9673-3122199c8ee8
SAMBASTUDIO_API_KEY: cd10b...dbd8b
SAMBAVERSE_API_KEY: e300e...27823
SAMBASTUDIO_EMBEDDINGS_BASE_URL: https://sjc3-demo2.sambanova.net
SAMBASTUDIO_EMBEDDINGS_PROJECT_ID: 4e1e3d93-79b9-4694-bdfc-181b5a3e019b
SAMBASTUDIO_EMBEDDINGS_ENDPOINT_ID: 5fc68ee8-2de8-429c-b4d9-b0a17a13ee87
SAMBASTUDIO_EMBEDDINGS_API_KEY: 2634c...ba7d0
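
The check above only prints values. A small helper can turn it into a fail-fast test for the backend you intend to use. This is a sketch: the `REQUIRED_ENV` grouping and the `missing_env_vars` name are assumptions made for illustration and are not part of `use_CoE_model.py`.

```python
import os

# Hypothetical grouping of the variables printed above by backend;
# adjust it to whatever your deployment actually requires.
REQUIRED_ENV = {
    "sambaverse": ["SAMBAVERSE_API_KEY"],
    "sambastudio": [
        "SAMBASTUDIO_BASE_URL",
        "SAMBASTUDIO_PROJECT_ID",
        "SAMBASTUDIO_ENDPOINT_ID",
        "SAMBASTUDIO_API_KEY",
    ],
}


def missing_env_vars(api, env=None):
    """Return the required variable names that are unset or empty for `api`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV[api] if not env.get(name)]
```

Calling `missing_env_vars("sambastudio")` before building the chain gives a list you can report in one message, rather than discovering a missing key when the first request fails.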


In [None]:
# Update the config.yaml file with the following:
# api: sambaverse
# llm:
#   sambaverse_model_name: "Mistral/Mistral-7B-Instruct-v0.2"
#   samabaverse_select_expert: "Mistral-7B-Instruct-v0.2"

with open(CONFIG_PATH, "r") as yaml_file:
    config = yaml.safe_load(yaml_file)
api_info = config["api"]
llm_info = config["llm"]

# Embedding models are available only on SambaStudio, not on Sambaverse,
# so this example creates a local Hugging Face embeddings object.
# The SambaStudio examples later use an embedding model hosted on SambaStudio.
embeddings = HuggingFaceEmbeddings()

# Load documents and split into chunks
loader = WebBaseLoader("https://docs.smith.langchain.com")
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(docs)

# Create a vector database using Chroma
vector = Chroma.from_documents(documents, embeddings, collection_name='sambaverse_coe_aisk')

# Define the prompt template
prompt = ChatPromptTemplate.from_template(
    """Answer the following question based only on the provided context:

    <context>
    {context}
    </context>

    Question: {input}"""
)

# Set up the language model
llm = Sambaverse(
    sambaverse_model_name=llm_info["sambaverse_model_name"],
    sambaverse_api_key=os.getenv("SAMBAVERSE_API_KEY"),
    model_kwargs={
        "do_sample": False,
        "max_tokens_to_generate": llm_info["max_tokens_to_generate"],
        "temperature": llm_info["temperature"],
        "process_prompt": True,
        "select_expert": llm_info["samabaverse_select_expert"],
    },
)

# Create the document chain and retrieval chain
document_chain = create_stuff_documents_chain(llm, prompt)
retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

# Invoke the retrieval chain with the user query
user_query = "How can you use langsmith for testing"
response = retrieval_chain.invoke({"input": user_query})
print(response["answer"])
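
To make the role of the prompt template concrete, here is a plain-Python sketch of what a "stuff" chain does conceptually: it concatenates the retrieved chunks into `{context}` and substitutes the user query into `{input}`. The `stuff_prompt` helper and the `"\n\n"` separator are assumptions for illustration; the real chain in LangChain also handles document formatting and message types.

```python
# Same wording as the template used in this notebook, without the indentation.
PROMPT_TEMPLATE = """Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}"""


def stuff_prompt(chunks, question, separator="\n\n"):
    """Join retrieved text chunks and fill the template's two placeholders."""
    context = separator.join(chunks)
    return PROMPT_TEMPLATE.format(context=context, input=question)


# Illustrative chunks standing in for what the retriever would return.
example = stuff_prompt(
    ["LangSmith supports tracing.", "LangSmith supports evaluation."],
    "How can you use langsmith for testing",
)
```

Because every retrieved chunk is pasted into one prompt, the chunk size and the number of retrieved documents together determine how much of the model's context window the prompt uses.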


## Example 2: Using SambaStudio to call CoE with Named Expert

In this example, we'll use SambaStudio to call the CoE model with a named expert.

In [3]:
# Update the config.yaml file with the following:
# api: sambastudio
# llm:
#   samabaverse_select_expert: "Mistral-7B-Instruct-v0.2"

with open(CONFIG_PATH, "r") as yaml_file:
    config = yaml.safe_load(yaml_file)
api_info = config["api"]
llm_info = config["llm"]

# Create a SambaStudioEmbeddings object
snsdk_model = SambaStudioEmbeddings()
embeddings = snsdk_model

# Load documents and split into chunks
loader = WebBaseLoader("https://docs.smith.langchain.com")
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(docs)

# Create a vector database using Chroma
vector = Chroma.from_documents(documents, embeddings, collection_name='sambastudio_coe_aisk')

# Define the prompt template
prompt = ChatPromptTemplate.from_template(
    """Answer the following question based only on the provided context:

    <context>
    {context}
    </context>

    Question: {input}"""
)

# Set up the language model
llm = SambaStudio(
    streaming=True,
    model_kwargs={
        "do_sample": True,
        "temperature": llm_info["temperature"],
        "max_tokens_to_generate": llm_info["max_tokens_to_generate"],
        "select_expert": llm_info["samabaverse_select_expert"],
        "process_prompt": False,
    }
)

# Create the document chain and retrieval chain
document_chain = create_stuff_documents_chain(llm, prompt)
retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

# Invoke the retrieval chain with the user query
user_query = "Tell me how I can use langsmith within applications"
response = retrieval_chain.invoke({"input": user_query})
print(response["answer"])

INFO:chromadb.telemetry.product.posthog:Anonymized telemetry enabled. See https://docs.trychroma.com/telemetry for more information.


. 

    Answer: LangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. You can use LangSmith on its own, without the need for LangChain. To get started, you can follow the quick start guide which involves installing LangSmith, creating an API key, setting up your environment, logging your first trace, and running your first evaluation.
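
Examples 1 and 2 index `llm_info` directly, so a missing entry in `config.yaml` surfaces as a bare `KeyError` partway through the cell. The sketch below checks the keys up front. The `check_llm_config` helper is hypothetical; note that it uses the spelling `samabaverse_select_expert`, because that is the key this notebook actually reads.

```python
# Keys that the Example 1 and Example 2 cells read from the "llm" section.
REQUIRED_LLM_KEYS = (
    "temperature",
    "max_tokens_to_generate",
    "samabaverse_select_expert",  # spelled as in this notebook's config
)


def check_llm_config(config):
    """Return the "llm" section, or raise KeyError naming every missing key."""
    llm_info = config.get("llm") or {}
    missing = [key for key in REQUIRED_LLM_KEYS if key not in llm_info]
    if missing:
        raise KeyError("config.yaml llm section is missing: " + ", ".join(missing))
    return llm_info
```

Calling `llm_info = check_llm_config(config)` right after `yaml.safe_load` reports all missing keys in one error, before any documents are loaded or embedded.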


## Example 3: Using SambaStudio to call CoE with Routing

In this example, we'll use SambaStudio to call the CoE model with routing. The script will automatically determine the appropriate expert based on the user query.

In [4]:
# Update the config.yaml file with the following:
# api: sambastudio
# llm:
#   coe_routing: true

with open(CONFIG_PATH, "r") as yaml_file:
    config = yaml.safe_load(yaml_file)
api_info = config["api"]
llm_info = config["llm"]

# Create a SambaStudioEmbeddings object
snsdk_model = SambaStudioEmbeddings()
embeddings = snsdk_model

# Load documents and split into chunks
loader = WebBaseLoader("https://docs.smith.langchain.com")
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(docs)

# Create a vector database using Chroma
vector = Chroma.from_documents(documents, embeddings, collection_name='sambastudio_coe_aisk')

# Define the prompt template
prompt = ChatPromptTemplate.from_template(
    """Answer the following question based only on the provided context:

    <context>
    {context}
    </context>

    Question: {input}"""
)

user_query = "Tell me how I can use langsmith for testing"

# Get the expert by calling SambaStudio with a custom prompt workflow
expert_response = get_expert(user_query, use_requests=True)
print(f"Router expert response: {expert_response}")

# Extract the expert name from the response
expert = get_expert_val(expert_response)
print(f"Routing Named Expert: {expert}")

# Look up the model name based on the expert
named_expert = config["coe_name_map"][expert]
print(f"Named expert Model Name: {named_expert}")

# Set up the language model
llm = SambaStudio(
    streaming=True,
    model_kwargs={
        "do_sample": True,
        "temperature": llm_info["temperature"],
        "max_tokens_to_generate": llm_info["max_tokens_to_generate"],
        "select_expert": named_expert,
        "process_prompt": False,
    }
)

# Create the document chain and retrieval chain
document_chain = create_stuff_documents_chain(llm, prompt)
retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

# Invoke the retrieval chain with the user query
response = retrieval_chain.invoke({"input": user_query})
print(f"Response: {response['answer']}")

Router expert response: {'status': {'complete': True, 'exitCode': 0, 'elapsedTime': 2.917145013809204, 'message': '', 'progress': 1, 'progressMessage': '', 'reason': ''}, 'predictions': [{'completion': ']}{"conversation_id": "sambaverse-conversation-id", "messages": [{"message_id": 0, "role": "user", "content": "Tell me how I can use langsmith for testing"}, {"message_id": 1, "role": "assistant", "content": "<<code generation>>:\\n\\nI classified this message as \'code generation\' because it asks about using a specific tool (Langsmith) for testing, which is a task typically related to software development and coding.\\n"}], "prompt": "Tell me how I can use langsmith for testing"}', 'logprobs': {'text_offset': [], 'top_logprobs': []}, 'prompt': '{"conversation_id": "sambaverse-conversation-id", "messages": [{"message_id": 0, "role": "user", "content": "Tell me how I can use langsmith for testing"}], "prompt": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\\nA message can 
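
In the router output above, the assistant's reply wraps the chosen category in double angle brackets (`<<code generation>>`). The sketch below shows one way such a marker could be pulled out of a completion string. It is an illustration only: `extract_category` is a hypothetical helper, and this is not necessarily how `get_expert_val()` is implemented.

```python
import re

# Matches the first <<...>> marker and captures the text inside it.
_CATEGORY = re.compile(r"<<\s*(.+?)\s*>>")


def extract_category(completion):
    """Return the text inside the first <<...>> marker, or None if absent."""
    match = _CATEGORY.search(completion)
    return match.group(1) if match else None


# A shortened stand-in for the assistant content shown in the output above.
sample = "<<code generation>>:\n\nI classified this message as 'code generation'"
category = extract_category(sample)
```

Returning `None` when no marker is found lets the caller decide whether to fall back to a default expert or stop, instead of failing on an unexpected router reply.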

## Conclusion

In each example, we walked through the following steps:

1. Update the `config.yaml` file with the appropriate API information and LLM parameters
2. Create a `SambaStudioEmbeddings` (or, for Sambaverse, a `HuggingFaceEmbeddings`) object for embeddings
3. Load documents from a URL and split them into chunks
4. Create a vector database using Chroma
5. Define the prompt template
6. Set up the language model based on the example configuration
7. Create the document chain and retrieval chain
8. Invoke the retrieval chain with the user query
9. Print the response
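
For step 3, the meaning of `chunk_size=1000` and `chunk_overlap=200` can be illustrated with a simple character-level sliding window. This is a simplification under stated assumptions: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and lines, whereas the hypothetical `chunk_text` below cuts at fixed offsets.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


# Small numbers make the overlap easy to see.
small = chunk_text("0123456789", chunk_size=4, chunk_overlap=2)
```

With the notebook's settings, each new chunk starts 800 characters after the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk in most cases.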

For Example 3, we additionally:
- Called `get_expert()` to determine the appropriate expert based on the user query.
- Extracted the expert name using `get_expert_val()`.
- Looked up the model name based on the expert.
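
The lookup in Example 3, `config["coe_name_map"][expert]`, raises `KeyError` if the router returns a category that is not in the map. A tolerant variant might normalize the key and accept a default, as sketched below. The `resolve_expert` helper and the map contents are hypothetical; the real `coe_name_map` lives in `config.yaml`.

```python
def resolve_expert(category, name_map, default=None):
    """Map a router category to a model name, ignoring case and outer spaces."""
    key = (category or "").strip().lower()
    normalized = {k.strip().lower(): v for k, v in name_map.items()}
    if key in normalized:
        return normalized[key]
    if default is not None:
        return default
    raise KeyError(f"No expert mapped for category {category!r}")


# Placeholder names only; substitute the entries from your own config.yaml.
example_map = {"code generation": "hypothetical-code-expert"}
chosen = resolve_expert(" Code Generation ", example_map)
```

Passing a `default` keeps the notebook running when the router produces an unexpected label, at the cost of possibly answering with a less suitable expert.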

Feel free to explore and experiment with different configurations and queries to see how the CoE models respond!
