# Tutorial 2: Deploying Q-SPARC Chatbot

This tutorial provides exercises to help you become familiar with deploying the Q-SPARC Chatbot service on top of local LLMs.

## Prerequisites

Please read the [Tutorial 1 documentation](./tutorial_1_getting_started.ipynb) first.
You need to have successfully deployed a local LLM on a GPU, as described in that tutorial.
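
Before continuing, it can help to confirm that the local LLM server is actually reachable. The sketch below is not part of the Q-SPARC code. It assumes the server from Tutorial 1 exposes an OpenAI-compatible API at `http://localhost:8000/v1` (the address used later in this tutorial) and that it implements the standard `/models` listing endpoint. The helper names `models_url` and `list_served_models` are hypothetical.

```python
import json
import urllib.error
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL of an OpenAI-compatible API."""
    return base_url.rstrip("/") + "/models"


def list_served_models(base_url: str, timeout: float = 5.0) -> list[str]:
    """Return the model IDs the server reports, or [] if it cannot be reached."""
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply
        return []
    if not isinstance(payload, dict):
        return []
    return [m["id"] for m in payload.get("data", []) if isinstance(m, dict) and "id" in m]
```

Calling `list_served_models("http://localhost:8000/v1")` should return a non-empty list when the server is up. An empty list suggests that the server is not running, is listening on a different port, or is hidden behind a proxy.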

## 0. Activate the Environment

conda activate llm_env
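
After activating the environment, a quick check along the following lines can show whether the packages used in this tutorial are importable. This is only a sketch: the helper `missing_modules` is hypothetical, and the module list is an illustrative selection based on the imports in this tutorial, so adjust it to your setup.

```python
from importlib.util import find_spec


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level modules from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Illustrative list of top-level modules this tutorial relies on
REQUIRED = [
    "langchain_core",
    "langchain_openai",
    "langserve",
    "langchain_community",
    "langchain_chroma",
]

missing = missing_modules(REQUIRED)
print("Missing modules:", missing if missing else "none")
```

If any module is reported as missing, install it into `llm_env` before running the cells below.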

## 1. How to test the history component of the Q-SPARC Chatbot
Read and run the code below. Note that this is only an offline test. In step 3, we will integrate all of the functions and deploy the chatbot on an accessible port.

In [8]:
#!/usr/bin/env python
import os

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
from langchain_core.chat_history import BaseChatMessageHistory, InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# In-memory store that maps each session ID to its chat history
store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]


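# Select GPU 2 (adjust to your machine) and bypass any proxy for local requests
os.environ["CUDA_VISIBLE_DEVICES"] = "2"
os.environ["NO_PROXY"] = "localhost,127.0.0.1"

# Connection settings for the local LLM server
base_url = "http://localhost:8000/v1"
api_key = "EMPTY"  # placeholder key for the local server
model_id = "/hpc/fxu244/Documents/Code/LLMs/Qwen3-32B"

# Create the chat model client
model = ChatOpenAI(base_url=base_url, api_key=api_key, model=model_id)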

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to me. Show concise think content, give the quick response.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)

# Both calls use the same session ID, so they share one history
config = {"configurable": {"session_id": "abc11"}}

response = with_message_history.invoke(
    {"messages": [HumanMessage(content="hi! I'm todd")]},
    config=config,
)
print(response.content)

# Second turn in the same session: the model should recall the name from the history
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="whats my name?")]},
    config=config,
)
print(response.content)

<think>
Okay, the user just introduced themselves as Todd. I should respond in a friendly and welcoming manner. Let me make sure to acknowledge their name and offer assistance. Maybe something like, "Hi Todd! How can I assist you today?" That should be concise and open-ended, inviting them to ask questions or share what they need help with. Let me check for any typos or errors. Looks good. Ready to send.
</think>

Hi Todd! How can I assist you today? 😊
<think>
Okay, the user just asked, "whats my name?" I need to figure out the best way to respond. Let me start by recalling the conversation history. The user introduced himself as Todd at the beginning, so I should know that.

Wait, maybe the user is testing if I remember their name. It's possible they want to confirm if the AI retains information from previous messages. In the first message, they said "hi! I'm todd," and I responded with "Hi Todd! How can I assist you today? 😊". Now they're asking again about their name.

I shoul
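
The role of `store` and `get_session_history` in the cell above can be illustrated without a running model. The following is a simplified stand-in, not LangChain's implementation: it keeps one list of turns per session ID and passes the full list to a fake model on every call. The names `fake_model` and `invoke` are hypothetical.

```python
from collections import defaultdict

# session_id -> list of (role, text) turns; plays the role of `store` above
store: dict[str, list[tuple[str, str]]] = defaultdict(list)


def fake_model(history: list[tuple[str, str]]) -> str:
    """Stand-in for the LLM: reports how many user turns it can see."""
    user_turns = [text for role, text in history if role == "human"]
    return f"I can see {len(user_turns)} message(s) from you; the first was {user_turns[0]!r}."


def invoke(session_id: str, user_text: str) -> str:
    """Append the user turn, call the model on the full history, store the reply."""
    history = store[session_id]
    history.append(("human", user_text))
    reply = fake_model(history)
    history.append(("ai", reply))
    return reply


invoke("abc11", "hi! I'm todd")
print(invoke("abc11", "whats my name?"))  # same session: the earlier turn is visible
print(invoke("other", "whats my name?"))  # new session: only this turn is visible
```

The second call in session `abc11` sees the introduction from the first call, while the session `other` starts from an empty history. This is the behaviour the session ID in `config` selects in the real chain.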

## 2. How to link the SCKAN dataset with Q-SPARC

In [None]:
import json
from langchain_community.document_loaders import JSONLoader
from langchain_community.embeddings import HuggingFaceEmbeddings # You can replace with OpenAIEmbeddings or another model
from langchain_core.documents import Document
from langchain_chroma import Chroma

# --- 1. Load and parse the original JSON data ---

# Define file path and JQ schema
file_path = '/hpc/fxu244/Documents/Code/LLMs/a-b-via-c.json'
jq_schema = '.results.bindings[]'

# text_content=False is critical; it preserves each element's JSON structure
loader = JSONLoader(
    file_path=file_path,
    jq_schema=jq_schema,
    text_content=False 
)

# Load documents; each document's page_content is a JSON string
raw_docs = loader.load()

# --- 2. Transform data to the target format and create LangChain Document objects ---
def get_val(record, key):
    """Safely extract the 'value' for the given key from the record"""
    if key in record and isinstance(record.get(key), dict):
        return record[key].get('value')
    return None # Return None if key doesn't exist or format is invalid

final_documents = []
for doc in raw_docs:
    # Parse page_content (JSON string) into a Python dictionary
    record = json.loads(doc.page_content)

    # Extract and transform data in the desired format
    clean_data = {
        "Neuron_ID": get_val(record, "Neuron_ID"),
        "A": get_val(record, "A"),
        "B": get_val(record, "B"),
        "C": get_val(record, "C"),
        "Target_Organ": get_val(record, "Target_Organ"),
        # Special handling for C_Type; set to "N/A" if missing or malformed
        "C_Type": get_val(record, "C_Type") or "N/A",
    }
    
    # For better semantic search, convert structured data to meaningful text
    # This text will be used to generate vector embeddings
    page_content = (
        f"Neuron connection info: Neuron ID is {clean_data['Neuron_ID']}."
        f" It connects from {clean_data['A']} to {clean_data['B']} via {clean_data['C']}."
        f" The target organ is {clean_data['Target_Organ']}."
        f" Connection type (C_Type) is {clean_data['C_Type']}."
    )

    # Create a new LangChain Document object
    # page_content is used for vector search
    # metadata stores clean structured data for filtering or use
    final_documents.append(
        Document(page_content=page_content, metadata=clean_data)
    )

# Print the first processed document as a check
if final_documents:
    print("--- Sample of first processed Document ---")
    print(f"Page Content: {final_documents[0].page_content}")
    print(f"Metadata: {final_documents[0].metadata}")
    print("-" * 30)


# --- 3. Initialize embedding model and vector database, then add documents ---

# Initialize an embedding model. This uses HuggingFace's open-source model, which runs locally.
# The model will be downloaded automatically the first time.
# You can also replace it with OpenAIEmbeddings(openai_api_key="sk-...") or other models.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# from_documents handles embedding and indexing of all documents
print("Creating vector database, this might take a while...")

vector_store = Chroma.from_documents(
    final_documents,
    embedding=embeddings,
)

print("Vector database created successfully!")

# --- 4. (Optional) Demo: How to use the vector database ---
print("\n--- Similarity Search Demo ---")

query = 'Is there a connection from inferior mesenteric ganglion to the urinary bladder in rats? Summarize the pathways based on the nerves involved.'
results = vector_store.similarity_search(query, k=10)

if results:
    print(f"Query: '{query}'")
    print("\nTop relevant result:",results)


--- Sample of first processed Document ---
Page Content: Neuron connection info: Neuron ID is http://uri.interlex.org/tgbugs/uris/readable/neuron-type-aacar-10a. It connects from Atrial intrinsic cardiac ganglion to Atrial intrinsic cardiac ganglion via cardiac interganglionic nerve. The target organ is heart. Connection type (C_Type) is N/A.
Metadata: {'Neuron_ID': 'http://uri.interlex.org/tgbugs/uris/readable/neuron-type-aacar-10a', 'A': 'Atrial intrinsic cardiac ganglion', 'B': 'Atrial intrinsic cardiac ganglion', 'C': 'cardiac interganglionic nerve', 'Target_Organ': 'heart', 'C_Type': 'N/A'}
------------------------------
Creating vector database, this might take a while...
Vector database created successfully!

--- Similarity Search Demo ---
Query: 'Is there a connection from inferior mesenteric ganglion to the urinary bladder in rats? Summarize the pathways based on the nerves involved.'

Top relevant result: [Document(id='062ab189-4219-4fd9-bedd-b4a73681ceac', metadata={'B': 'Do
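
The flattening step in the cell above can be tried in isolation, without the JSON file or the vector store. The sketch below rebuilds one binding from the sample metadata printed above and turns it into the sentence that gets embedded. The `"type"` fields are assumed (only `"value"` is read), and the helper `to_text` is a hypothetical name introduced for illustration.

```python
def get_val(record: dict, key: str, default=None):
    """Return record[key]['value'] if present, else `default`."""
    field = record.get(key)
    if isinstance(field, dict):
        return field.get("value", default)
    return default


def to_text(record: dict) -> str:
    """Flatten one binding into the sentence used for embedding."""
    return (
        f"Neuron connection info: Neuron ID is {get_val(record, 'Neuron_ID')}."
        f" It connects from {get_val(record, 'A')} to {get_val(record, 'B')}"
        f" via {get_val(record, 'C')}."
        f" The target organ is {get_val(record, 'Target_Organ')}."
        f" Connection type (C_Type) is {get_val(record, 'C_Type', 'N/A')}."
    )


# One binding rebuilt from the sample metadata printed above.
# The "type" fields are assumed; only "value" is read.
binding = {
    "Neuron_ID": {
        "type": "uri",
        "value": "http://uri.interlex.org/tgbugs/uris/readable/neuron-type-aacar-10a",
    },
    "A": {"type": "literal", "value": "Atrial intrinsic cardiac ganglion"},
    "B": {"type": "literal", "value": "Atrial intrinsic cardiac ganglion"},
    "C": {"type": "literal", "value": "cardiac interganglionic nerve"},
    "Target_Organ": {"type": "literal", "value": "heart"},
    # No "C_Type" key, so the default "N/A" is used
}

print(to_text(binding))
```

Writing each record as a full sentence, rather than embedding raw JSON, gives the embedding model natural-language text to compare against natural-language queries.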

## 3. How to deploy the Q-SPARC Chatbot online, and some example output

The file 'src/LLM_Server/server.py' contains the complete code that integrates the chat history and the SCKAN dataset with our local LLM. To use the Q-SPARC Chatbot, you only need to run 'src/LLM_Server/server.py' and then open 'localhost:1236/chain/playground' or 'localhost:1236/chat'. Because Jupyter cannot connect to the online server, we only give some example output here, compared with the current SCKAN-NLI application.
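
Besides the playground page, a running server could in principle be queried from a script. The sketch below rests on several assumptions, because 'src/LLM_Server/server.py' is not reproduced here: it assumes the chain is exposed through LangServe under the '/chain' path (so that an '/chain/invoke' endpoint exists next to '/chain/playground'), that the chain accepts a plain string as input, and that the server accepts a `session_id` in the request config. The helpers `build_invoke_request` and `ask` are hypothetical.

```python
import json
import urllib.request


def build_invoke_request(base_url: str, question: str, session_id: str) -> urllib.request.Request:
    """Build a POST request for a LangServe-style `/invoke` endpoint."""
    body = {
        "input": question,  # assumed input shape; match it to the chain in server.py
        "config": {"configurable": {"session_id": session_id}},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/invoke",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url: str, question: str, session_id: str, timeout: float = 60.0):
    """Send the request and return the `output` field of the JSON response."""
    request = build_invoke_request(base_url, question, session_id)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp).get("output")


# Building the request does not contact the server
req = build_invoke_request("http://localhost:1236/chain", "hello", "abc11")
print(req.get_method(), req.full_url)
```

Only `ask` contacts the server. If the input schema in 'server.py' differs from the assumption above, the request body needs to be adapted accordingly.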