# Semantic Kernel Tool Use Example

This document explains the code for a Semantic Kernel-based tool that integrates with ChromaDB for Retrieval-Augmented Generation (RAG). The example shows how to build an AI agent that retrieves travel documents from a ChromaDB collection, augments user queries with semantic search results, and streams detailed travel recommendations.

## Initializing the Environment

### SQLite Version Fix
If you encounter the error:
```
RuntimeError: Your system has an unsupported version of sqlite3. Chroma requires sqlite3 >= 3.35.0
```

Uncomment this code block at the start of your notebook:
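The block referred to above does not appear in this export. A commonly used workaround is to swap the standard-library `sqlite3` module for the one bundled with the `pysqlite3-binary` package before `chromadb` is imported. The sketch below assumes that package has been installed (`pip install pysqlite3-binary`). It falls back to the built-in module when the package is missing, so the cell still runs.

```python
import sys

try:
    # Assumption: `pysqlite3-binary` is installed; it bundles a recent SQLite.
    import pysqlite3  # noqa: F401
    # Make every later `import sqlite3` (including Chroma's) resolve to it.
    sys.modules["sqlite3"] = sys.modules.pop("pysqlite3")
except ImportError:
    # Package not available: keep the built-in sqlite3 module.
    pass

import sqlite3

print("SQLite version in use:", sqlite3.sqlite_version)
```

If the printed version is still below 3.35.0, the swap did not take effect, most likely because the package is missing from the active environment.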

In [10]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
import time

# Run Chrome in headless mode
options = webdriver.ChromeOptions()
options.add_argument('--headless')  # no visible browser window
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
options.binary_location = "/usr/bin/chromium-browser"  # path to the Chromium binary; must exist on this machine

# Initialize the WebDriver
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)

# Open a specific post (floor) in the thread
url = "https://www.uscardforum.com/t/topic/109417/389"
driver.get(url)

# Wait for the page to load
time.sleep(3)

# Collect the content of every post on the page (simplified extraction)
posts = driver.find_elements(By.CLASS_NAME, "cooked")
for i, post in enumerate(posts):
    print(f"\n--- Post {i+1} ---\n{post.text[:500]}")

driver.quit()

WebDriverException: Message: unknown error: no chrome binary at /usr/bin/chromium-browser

In [34]:
import requests
from bs4 import BeautifulSoup


def clean_html(raw_html):
    """Strip HTML tags and keep the plain text."""
    soup = BeautifulSoup(raw_html, "html.parser")
    return soup.get_text(separator="\n").strip()


def get_all_posts_from_discourse_thread(url: str):
    # Discourse serves a thread as JSON at "<thread-url>.json"
    if not url.endswith('.json'):
        url = url.rstrip('/') + '.json'

    # Browser-like User-Agent header for the request
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"
    }

    response = requests.get(url, headers=headers)
    if response.status_code != 200:
        raise Exception(f"Request failed: {response.status_code}")

    data = response.json()
    # Only the posts included in this response are returned; long threads
    # may need further requests to load the rest.
    posts = data['post_stream']['posts']
    all_posts = []

    for post in posts:
        all_posts.append({
            "floor": post['post_number'],      # position of the post in the thread
            "user": post['username'],
            "text": clean_html(post['cooked'])  # 'cooked' holds the rendered HTML
        })

    return all_posts

In [35]:
url = "https://www.uscardforum.com/t/topic/109417"
forum_chunks = get_all_posts_from_discourse_thread(url)

# Quick preview of the first few posts
for post in forum_chunks[:3]:
    print(f"{post['floor']}F - {post['user']}: {post['text'][:50]}...")

Exception: Request failed: 403
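
Because the live request above returned 403, the parsing step can still be exercised offline. The sketch below applies the same `post_stream` / `posts` / `cooked` field names to a hand-made payload. The payload, `strip_tags`, and `posts_from_payload` are hypothetical names introduced for illustration, and the standard-library `HTMLParser` stands in for BeautifulSoup, so text nodes are joined directly rather than with newlines.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(raw_html: str) -> str:
    """Stdlib stand-in for clean_html: return the text content only."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return "".join(parser.parts).strip()


def posts_from_payload(data: dict) -> list:
    """Turn a thread payload into floor/user/text records."""
    return [
        {
            "floor": post["post_number"],
            "user": post["username"],
            "text": strip_tags(post["cooked"]),
        }
        for post in data["post_stream"]["posts"]
    ]


# Hypothetical payload with the same keys the scraping function reads.
sample_payload = {
    "post_stream": {
        "posts": [
            {"post_number": 1, "username": "alice", "cooked": "<p>Hello <b>world</b></p>"},
            {"post_number": 2, "username": "bob", "cooked": "<p>Reply</p>"},
        ]
    }
}

records = posts_from_payload(sample_payload)
print(records)
```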

### Importing Packages
The following code imports the necessary packages:

In [22]:
# Step 1: Install dependencies (run this in a cell)
# %pip install openai chromadb sentence-transformers azure-ai-inference

import os
from sentence_transformers import SentenceTransformer
import chromadb
from chromadb.utils import embedding_functions
from openai import OpenAI, AsyncOpenAI

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

In [2]:
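# Step 2: Prepare the document chunks (simulate a long forum post).
# The sample posts are in Chinese; English glosses are given in the comments.
forum_chunks = [
    # Original poster: "Hi all, my computer keeps restarting by itself lately. What should I do?"
    {"floor": 1, "user": "楼主", "text": "大家好，我的电脑最近老是自动重启，怎么办？"},
    # "It may be a problem with your power supply; check the wall socket."
    {"floor": 3, "user": "Alice", "text": "可能是你的电源出了问题，建议检查一下插座。"},
    # "I ran into a similar problem; it turned out to be the CPU overheating."
    {"floor": 5, "user": "Bob", "text": "我遇到过类似的问题，结果是 CPU 过热。"},
    # "Do you run the air conditioner in summer? I can't in summer, and my computer often freezes."
    {"floor": 8, "user": "Charlie", "text": "你有没有在夏天开空调？我夏天开不了空调，电脑常死机。"}
]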
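
The list above is written by hand with one short entry per post. If a single post were too long to embed as one piece, one option is to split it into overlapping character windows first. The helper `split_post` and its `size` and `overlap` parameters are hypothetical and not part of the original notebook; this is a sketch of the idea rather than a recommended chunking strategy.

```python
def split_post(post: dict, size: int = 200, overlap: int = 50) -> list:
    """Split one post into overlapping character windows.

    Each piece keeps the floor and user of the post it came from, so the
    metadata stored alongside the embedding stays meaningful.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")

    text = post["text"]
    step = size - overlap  # how far the window advances each time
    pieces = []
    for start in range(0, max(len(text), 1), step):
        piece = text[start:start + size]
        if piece:
            pieces.append({"floor": post["floor"], "user": post["user"], "text": piece})
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return pieces


# Hypothetical long post: 450 characters of filler text.
long_post = {"floor": 2, "user": "Dana", "text": "x" * 450}
pieces = split_post(long_post)
print([len(p["text"]) for p in pieces])  # [200, 200, 150]
```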

In [None]:
# Step 3: Embed the documents using a multilingual model
model = SentenceTransformer('paraphrase-multilingual-MiniLM-L12-v2')
texts = [f"{c['user']}({c['floor']}F): {c['text']}" for c in forum_chunks]
embeddings = model.encode(texts)

In [4]:
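# Step 4: Create and populate a ChromaDB collection
client = chromadb.Client()
# get_or_create_collection lets this cell be re-run without an "already exists" error
collection = client.get_or_create_collection(name="forum")
# Add all chunks in one call; the lists are aligned by position
collection.add(
    documents=[chunk['text'] for chunk in forum_chunks],
    ids=[f"chunk_{idx}" for idx in range(len(forum_chunks))],
    embeddings=embeddings.tolist(),  # convert the NumPy array to plain lists of floats
    metadatas=[{"floor": chunk['floor'], "user": chunk['user']} for chunk in forum_chunks]
)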
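
Conceptually, a vector store ranks the stored embeddings by how close they are to a query embedding. The sketch below shows that ranking with cosine similarity over toy three-dimensional vectors. Both are assumptions for illustration: this is not how ChromaDB is implemented, the metric a collection actually uses depends on its configuration, and `cosine_similarity`, `top_k`, and `toy_store` are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=3):
    """Return the ids of the k stored vectors most similar to query_vec.

    `stored` is a list of (id, vector) pairs.
    """
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [item_id for item_id, _ in ranked[:k]]


# Toy "collection": ids paired with made-up embeddings.
toy_store = [
    ("chunk_0", [1.0, 0.0, 0.0]),
    ("chunk_1", [0.0, 1.0, 0.0]),
    ("chunk_2", [0.9, 0.1, 0.0]),
]

nearest = top_k([1.0, 0.0, 0.0], toy_store, k=2)
print(nearest)  # ['chunk_0', 'chunk_2']
```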

In [41]:
# Step 5: Define a simple query function
def get_completion(prompt, client, model_name, temperature=1.0, max_tokens=1000, top_p=1.0):
    response = client.complete(
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant.",
            },
            {
                "role": "user",
                "content": prompt,
            },
        ],
        model=model_name,
        temperature=temperature,
        max_tokens=max_tokens,
        top_p=top_p
    )
    return response.choices[0].message.content

def query_agent(question: str):
    token = os.environ["GITHUB_TOKEN"]
    endpoint = "https://models.inference.ai.azure.com"
    model_name = "gpt-4o"
    client = ChatCompletionsClient(
        endpoint=endpoint,
        credential=AzureKeyCredential(token),
    )
    # Embed the question with the same model used for the posts
    question_embedding = model.encode([question])[0].tolist()
    results = collection.query(query_embeddings=[question_embedding], n_results=3)
    # Build the context from the retrieved posts themselves, in ranked order
    context = "\n".join(
        f"{meta['user']}({meta['floor']}F): {doc}"
        for doc, meta in zip(results['documents'][0], results['metadatas'][0])
    )
    # The prompt is in Chinese: "You are a tech-savvy forum assistant; answer the
    # user's question based on the posts below." Its sections are the thread
    # content, the user question, and "please answer concisely".
    prompt = f"""
你是一个懂技术的论坛助手，请根据以下楼层内容回答用户问题。

【帖子内容】
{context}

【用户提问】
{question}

【请简洁回答】
"""
    # Send the prompt to the model through the Azure AI Inference client
    answer = get_completion(prompt, client, model_name)
    print(results['metadatas'][0])
    print("--- Prompt Sent to LLM ---")
    print(prompt)
    print("--- LLM Answer ---")
    print(answer)

In [42]:
# Step 6: Run a sample query
query_agent("为什么电脑会突然重启？")  # "Why does the computer suddenly restart?"

[{'floor': 1, 'user': '楼主'}, {'floor': 5, 'user': 'Bob'}, {'floor': 3, 'user': 'Alice'}]
--- Prompt Sent to LLM ---

你是一个懂技术的论坛助手，请根据以下楼层内容回答用户问题。

【帖子内容】
楼主(1F): 大家好，我的电脑最近老是自动重启，怎么办？
Bob(5F): 我遇到过类似的问题，结果是 CPU 过热。
Alice(3F): 可能是你的电源出了问题，建议检查一下插座。

【用户提问】
为什么电脑会突然重启？

【请简洁回答】

--- LLM Answer ---
电脑突然重启可能是以下原因：  
1. **电源问题**：电源损坏或插座接触不良。  
2. **硬件过热**：如 CPU 或显卡过热会触发自动保护重启。  
3. **驱动或系统问题**：驱动冲突或系统崩溃可能导致重启。  
建议按上述原因逐一检查并排除。
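
With only a handful of posts, every query returns three hits even when some are barely related. One possible refinement is to drop hits whose distance exceeds a threshold before the prompt is built. The sketch below assumes the query result also carries a `distances` list aligned with `documents` and `metadatas`. The `max_distance` parameter, its default, and the sample values are hypothetical and would need tuning against real data.

```python
def build_context(results: dict, max_distance: float = 1.0) -> str:
    """Format retrieved posts as prompt context, skipping distant hits.

    `results` is expected to hold one inner list per query; only the first
    query is used here.
    """
    docs = results["documents"][0]
    metas = results["metadatas"][0]
    dists = results["distances"][0]
    lines = [
        f"{meta['user']}({meta['floor']}F): {doc}"
        for doc, meta, dist in zip(docs, metas, dists)
        if dist <= max_distance  # smaller distance means a closer match
    ]
    return "\n".join(lines)


# Hypothetical result for a single query with three hits.
sample_results = {
    "documents": [["post A text", "post B text", "post C text"]],
    "metadatas": [[
        {"floor": 1, "user": "OP"},
        {"floor": 5, "user": "Bob"},
        {"floor": 8, "user": "Charlie"},
    ]],
    "distances": [[0.2, 0.6, 1.4]],
}

context = build_context(sample_results)
print(context)
```

Here the third hit is left out because its distance of 1.4 is above the threshold.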
