
# Chillstay Chatbot - Firebase Edition (Notebook)

This notebook is a conversion of the script you provided into a **Jupyter Notebook**, which makes it easier to edit and run section by section.
I kept the main logic unchanged, added some *compatibility checks*, and suggested or applied fixes for code that could fail at runtime (see the "Notes & Fixes" cell).

**Note:** Firebase and OpenAI cannot be reached from this environment; the notebook is delivered as an `.ipynb` file for you to download and run on a machine with the appropriate credentials.


In [2]:

# PART 1: IMPORTS AND SETUP
import os
import getpass
import json
from typing import Type, Optional, List, Dict, Any

# Firebase
import firebase_admin
from firebase_admin import credentials, firestore

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_community.vectorstores import Chroma
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_core.prompts import ChatPromptTemplate, FewShotPromptTemplate, MessagesPlaceholder, PromptTemplate
from langchain.tools import BaseTool

from pydantic import BaseModel, Field
import gradio as gr


In [3]:
# Set the Google API key (prompt only when it is not already in the environment)
if 'GOOGLE_API_KEY' not in os.environ:
    os.environ['GOOGLE_API_KEY'] = getpass.getpass('Google API Key: ')

Google API Key:  ········


In [None]:
# PART 2: INITIALIZE FIREBASE
def initialize_firebase(service_account_path):
    """Initialize the Firebase app once and return a Firestore client (None on failure)."""
    try:
        # Only initialize the default app if it has not been created yet
        if not firebase_admin._apps:
            cred = credentials.Certificate(service_account_path)
            firebase_admin.initialize_app(cred)
        db = firestore.client()
        print("✓ Firebase initialized successfully!")
        return db
    except Exception as e:
        print(f"✗ Firebase initialization error: {e}")
        return None

db = initialize_firebase("/path/to/your/firebase-key.json")
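
Since `initialize_firebase` only reports a problem after `credentials.Certificate` has raised, it can help to inspect the key file beforehand. The following is a minimal sketch using only the standard library. The helper name `check_service_account_file` is hypothetical, and the list of required keys is an assumption based on the fields a service account JSON key normally contains.

```python
import json
import os

# Assumption: fields normally present in a Google service account key file.
REQUIRED_KEYS = ("type", "project_id", "private_key", "client_email")

def check_service_account_file(path):
    """Return a list of problems with the key file (an empty list means it looks usable)."""
    if not os.path.isfile(path):
        return [f"file not found: {path}"]
    try:
        with open(path, "r", encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, json.JSONDecodeError) as exc:
        return [f"cannot read JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    # Report every required key that is absent
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in data]
    # A present but different "type" usually means the wrong kind of credential file
    if data.get("type") not in (None, "service_account"):
        problems.append(f"unexpected type: {data['type']!r}")
    return problems
```

Calling `check_service_account_file("/path/to/your/firebase-key.json")` before `initialize_firebase` would then separate "wrong path" from "wrong file contents" without contacting Firebase.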


In [None]:

# PARTS 3 & 4: TOOLS - Query and Aggregation (filters support comparison operators)
class FirebaseQueryInput(BaseModel):
    collection: str = Field(..., description="Name of the collection to query (hotels, rooms, bookings, users, etc.)")
    filters: Dict[str, Any] = Field(default_factory=dict, description="Filter conditions as a dict: {field: value} or {field: {operator: value}}")
    limit: int = Field(default=10, description="Maximum number of results")
    order_by: Optional[str] = Field(None, description="Field to sort by")

class FirebaseQueryTool(BaseTool):
    name: str = "firebase_query"
    description: str = "Query Firestore for Chillstay"
    args_schema: Type[BaseModel] = FirebaseQueryInput
    return_direct: bool = False

    def __init__(self, db, **kwargs):
        super().__init__(**kwargs)
        self._db = db

    def _apply_filter(self, query, field, value):
        # A dict value such as {"<": 100} carries an operator; a plain value means equality ('==')
        if isinstance(value, dict):
            # Extract the operator and operand from the dict
            if len(value) != 1:
                raise ValueError("Filter dict must contain exactly one operator:value pair")
            op, val = next(iter(value.items()))
            supported_ops = {"<", "<=", ">", ">=", "!=", "=="}
            if op not in supported_ops:
                raise ValueError(f"Unsupported operator: {op}")
            query = query.where(field, op, val)
        else:
            query = query.where(field, "==", value)
        return query

    def _run(self, collection: str, filters: Optional[Dict[str, Any]] = None, limit: int = 10, order_by: Optional[str] = None, run_manager: Optional[Any] = None) -> str:
        try:
            query = self._db.collection(collection)

            # None replaces the mutable default argument; it means "no filters"
            for field, value in (filters or {}).items():
                query = self._apply_filter(query, field, value)

            if order_by:
                try:
                    # Sort descending (highest value first)
                    query = query.order_by(order_by, direction=firestore.Query.DESCENDING)
                except Exception:
                    # Simple fallback: default (ascending) order
                    query = query.order_by(order_by)

            query = query.limit(limit)
            docs = query.stream()

            results = []
            for doc in docs:
                data = doc.to_dict()
                data['id'] = doc.id
                results.append(data)

            if not results:
                return f"No data found in collection '{collection}' for conditions: {filters}"

            # default=str keeps non-JSON values (for example timestamps) from raising
            return json.dumps(results, ensure_ascii=False, indent=2, default=str)
        except Exception as e:
            return f"Error while querying Firebase: {str(e)}"


class FirebaseAggregationInput(BaseModel):
    collection: str = Field(...)
    operation: str = Field(..., description="count, avg, min, max, sum")
    field: Optional[str] = Field(None)
    filters: Dict[str, Any] = Field(default={})

class FirebaseAggregationTool(BaseTool):
    name: str = "firebase_aggregate"
    description: str = "Aggregate statistics over Firestore"
    args_schema: Type[BaseModel] = FirebaseAggregationInput
    return_direct: bool = False

    def __init__(self, db, **kwargs):
        super().__init__(**kwargs)
        self._db = db

    def _run(self, collection: str, operation: str, field: Optional[str] = None, filters: Optional[Dict[str, Any]] = None, run_manager: Optional[Any] = None) -> str:
        try:
            query = self._db.collection(collection)
            supported_ops = {"<", "<=", ">", ">=", "!=", "=="}
            for key, value in (filters or {}).items():
                if isinstance(value, dict):
                    # Same operator handling as the query tool, including validation
                    if len(value) != 1:
                        raise ValueError("Filter dict must contain exactly one operator:value pair")
                    op, val = next(iter(value.items()))
                    if op not in supported_ops:
                        raise ValueError(f"Unsupported operator: {op}")
                    query = query.where(key, op, val)
                else:
                    query = query.where(key, "==", value)

            # Aggregation is done client-side: all matching documents are fetched first
            docs = list(query.stream())

            if operation == 'count':
                return f"Count: {len(docs)}"

            if not field:
                return "Error: a field must be specified for this operation"

            values = []
            for doc in docs:
                data = doc.to_dict()
                if field in data and data[field] is not None:
                    try:
                        values.append(float(data[field]))
                    except (TypeError, ValueError):
                        # Skip values that cannot be converted to a number
                        pass

            if not values:
                return f"No values found for field '{field}'"

            if operation == 'avg':
                result = sum(values) / len(values)
                return f"Average {field}: {result:.2f}"
            elif operation == 'min':
                return f"Minimum {field}: {min(values)}"
            elif operation == 'max':
                return f"Maximum {field}: {max(values)}"
            elif operation == 'sum':
                return f"Sum {field}: {sum(values)}"
            else:
                return f"Operation '{operation}' is not supported"
        except Exception as e:
            return f"Error: {str(e)}"
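
Both tools accept the same two filter shapes, `{field: value}` and `{field: {operator: value}}`. One way to keep that logic in a single place is to normalize the filters into `(field, operator, value)` triples before building the query. The sketch below illustrates this with plain Python only; `normalize_filters` and `ALLOWED_OPS` are hypothetical names that do not appear in the notebook.

```python
from typing import Any, Dict, List, Optional, Tuple

# Operators accepted by the filters in this notebook.
ALLOWED_OPS = {"<", "<=", ">", ">=", "!=", "=="}

def normalize_filters(filters: Optional[Dict[str, Any]]) -> List[Tuple[str, str, Any]]:
    """Convert {field: value} / {field: {op: value}} into (field, op, value) triples."""
    triples = []
    for field, value in (filters or {}).items():
        if isinstance(value, dict):
            # Operator form: exactly one operator:value pair is allowed
            if len(value) != 1:
                raise ValueError(f"Filter for '{field}' must contain exactly one operator:value pair")
            op, operand = next(iter(value.items()))
            if op not in ALLOWED_OPS:
                raise ValueError(f"Unsupported operator: {op}")
        else:
            # Plain value form means equality
            op, operand = "==", value
        triples.append((field, op, operand))
    return triples

triples = normalize_filters({"city": "Hà Nội", "price": {"<": 500000}})
print(triples)  # [('city', '==', 'Hà Nội'), ('price', '<', 500000)]
```

Each triple could then be passed to `query.where(field, op, value)` in either tool, so the operator validation would not need to be repeated.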


In [None]:

# PARTS 5 & 6: SEMANTIC SEARCH + VECTOR STORE CREATION
class SemanticSearchInput(BaseModel):
    query: str = Field(..., description="Semantic search query")

class SemanticSearchTool(BaseTool):
    name: str = "semantic_search"
    description: str = "Semantic search over names, descriptions, and amenities"
    args_schema: Type[BaseModel] = SemanticSearchInput
    return_direct: bool = False

    def __init__(self, retriever, **kwargs):
        super().__init__(**kwargs)
        self._retriever = retriever

    def _run(self, query: str, run_manager: Optional[Any] = None) -> str:
        # Newer LangChain retrievers expose invoke(); older ones only have get_relevant_documents()
        if hasattr(self._retriever, "invoke"):
            results = self._retriever.invoke(query)
        else:
            results = self._retriever.get_relevant_documents(query)
        if not results:
            return "No matching results found"
        output = []
        for doc in results[:5]:
            # Depending on the object, the text is in doc.page_content or doc.content
            content = getattr(doc, 'page_content', None) or getattr(doc, 'content', str(doc))
            output.append(content)
        return "\n\n---\n\n".join(output)


def create_hotel_vector_store(db, embeddings):
    """Create a vector store from the hotels collection."""
    print("Creating vector store from Firebase...")
    hotels_ref = db.collection('hotels')
    hotels = hotels_ref.stream()

    texts = []
    metadatas = []

    for hotel in hotels:
        data = hotel.to_dict()
        # One text block per hotel; this is what gets embedded
        text = (
            f"\nName: {data.get('name', '')}"
            f"\nCity: {data.get('city', '')}"
            f"\nCountry: {data.get('country', '')}"
            f"\nRating: {data.get('rating', 0)}"
            f"\nNumber of reviews: {data.get('numberOfReviews', 0)}\n"
        )
        texts.append(text)
        metadatas.append({'id': hotel.id, 'name': data.get('name', ''), 'city': data.get('city', ''), 'country': data.get('country', '')})

    if not texts:
        print("⚠ No hotel data in Firebase")
        return None

    # Depending on the version, Chroma.from_texts names the parameter embedding or embedding_function
    try:
        vectorstore = Chroma.from_texts(texts=texts, metadatas=metadatas, embedding=embeddings)
    except Exception as e:
        try:
            vectorstore = Chroma.from_texts(texts, metadatas=metadatas, embedding_function=embeddings)
        except Exception as e2:
            raise RuntimeError(f"Could not create Chroma vector store: {e} | {e2}")

    print(f"✓ Created vector store with {len(texts)} hotels")
    return vectorstore
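
The text that is embedded for each hotel is built from five fields of the Firestore document. Pulling that step into small functions makes it testable without Firestore or Chroma. The sketch below assumes English field labels; `hotel_to_text` and `hotel_to_metadata` are hypothetical helper names, not part of the notebook.

```python
def hotel_to_text(data):
    """Render one hotel document (a dict) as the text to embed."""
    return "\n".join([
        f"Name: {data.get('name', '')}",
        f"City: {data.get('city', '')}",
        f"Country: {data.get('country', '')}",
        f"Rating: {data.get('rating', 0)}",
        f"Number of reviews: {data.get('numberOfReviews', 0)}",
    ])

def hotel_to_metadata(doc_id, data):
    """Build the metadata stored next to the embedded text."""
    return {
        "id": doc_id,
        "name": data.get("name", ""),
        "city": data.get("city", ""),
        "country": data.get("country", ""),
    }

# Example with a made-up document; missing fields fall back to defaults
sample = {"name": "Sample Hotel", "city": "Hà Nội", "rating": 4.5}
print(hotel_to_text(sample))
print(hotel_to_metadata("doc-1", sample))
```

Missing fields fall back to an empty string or `0`, matching the `data.get(...)` defaults used in `create_hotel_vector_store`.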


In [None]:

# PARTS 7 & 8: FEW-SHOT EXAMPLES + AGENT CREATION (tries to stay compatible with multiple versions)
examples = [
    {"input": "How many hotels are there in Hà Nội?", "query": "Use firebase_aggregate: - collection: hotels - operation: count - filters: {\"city\": \"Hà Nội\"}"},
    {"input": "Find the 5 hotels with the highest rating", "query": "Use firebase_query: - collection: hotels - order_by: rating - limit: 5"},
    {"input": "Which rooms cost less than 500k?", "query": "Filter with firebase_query: - collection: rooms - filters: {\"price\": {\"<\": 500000}} - limit: 10"},
    {"input": "What is the average rating of hotels in TP.HCM?", "query": "Use firebase_aggregate: - collection: hotels - operation: avg - field: rating - filters: {\"city\": \"TP.HCM\"}"},
]

def create_chillstay_agent(db):
    # ChatOpenAI and OpenAIEmbeddings are not imported in Part 1, so import them
    # defensively here; both stay None when langchain_openai is not installed.
    try:
        from langchain_openai import ChatOpenAI, OpenAIEmbeddings
    except ImportError:
        ChatOpenAI = OpenAIEmbeddings = None

    llm = None
    if ChatOpenAI is not None:
        try:
            llm = ChatOpenAI(model='gpt-4o-mini', temperature=0)
        except Exception:
            try:
                llm = ChatOpenAI()  # fall back to the default model
            except Exception:
                llm = None
    if llm is None:
        # Fall back to the Gemini chat model imported in Part 1.
        # Assumption: 'gemini-1.5-flash' is only an example; use a model your key can access.
        try:
            llm = ChatGoogleGenerativeAI(model='gemini-1.5-flash', temperature=0)
        except Exception:
            llm = None

    embeddings = None
    if OpenAIEmbeddings is not None:
        try:
            embeddings = OpenAIEmbeddings(model='text-embedding-3-small')
        except Exception:
            try:
                embeddings = OpenAIEmbeddings()
            except Exception:
                embeddings = None

    firebase_query_tool = FirebaseQueryTool(db=db)
    firebase_agg_tool = FirebaseAggregationTool(db=db)

    vectorstore = None
    if embeddings is not None and Chroma is not None:
        try:
            vectorstore = create_hotel_vector_store(db, embeddings)
        except Exception as e:
            print("Could not create vector store:", e)
            vectorstore = None

    tools = [firebase_query_tool, firebase_agg_tool]

    if vectorstore:
        retriever = vectorstore.as_retriever(search_kwargs={'k': 5})
        semantic_tool = SemanticSearchTool(retriever=retriever)
        tools.append(semantic_tool)

    # NOTE: prompt and agent creation depend heavily on the installed langchain version.
    # The agent needs a prompt with an `agent_scratchpad` placeholder (passing None fails).
    # The system message is a minimal placeholder; if this fails, update it for your version.
    try:
        from langchain.agents import create_tool_calling_agent, AgentExecutor
        prompt = ChatPromptTemplate.from_messages([
            ("system", "You are the Chillstay assistant. Use the available tools to answer questions about hotels, rooms, and bookings."),
            ("human", "{input}"),
            MessagesPlaceholder(variable_name="agent_scratchpad"),
        ])
        agent = create_tool_calling_agent(llm, tools, prompt)
        agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, max_iterations=10)
    except Exception:
        # Lightweight fallback: an object whose run() returns an explanatory string
        class SimpleAgentExecutor:
            def __init__(self, tools):
                self.tools = tools
            def run(self, input_str):
                return "The agent could not be initialized (missing model or incompatible langchain version). Check the imports, API keys, and langchain version."
        agent_executor = SimpleAgentExecutor(tools)

    return agent_executor
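
The few-shot `examples` contain literal braces such as `{"city": "Hà Nội"}`. If they are placed into a format-style prompt template, those braces would be read as template variables unless they are doubled. The sketch below shows the escaping with plain `str.format`, on the assumption that the prompt template in use follows the same `{{` / `}}` convention. `escape_braces` and `render_examples` are hypothetical helpers.

```python
def escape_braces(text):
    """Double literal braces so format-style templates treat them as plain text."""
    return text.replace("{", "{{").replace("}", "}}")

def render_examples(examples):
    """Join few-shot examples into one block of template-safe text."""
    lines = []
    for ex in examples:
        lines.append(f"Question: {escape_braces(ex['input'])}")
        lines.append(f"Plan: {escape_braces(ex['query'])}")
    return "\n".join(lines)

sample_examples = [
    {
        "input": "How many hotels are there in Hà Nội?",
        "query": 'Use firebase_aggregate: - collection: hotels - operation: count - filters: {"city": "Hà Nội"}',
    },
]

# Only {input} remains a real template variable after escaping
template = "You are the Chillstay assistant.\n" + render_examples(sample_examples) + "\nUser: {input}"
print(template.format(input="Which rooms cost less than 500k?"))
```

After formatting, the example text contains single braces again, so the model sees the filter exactly as written in `examples`.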


In [None]:

# PART 9: CHAT FUNCTION (tries to stay compatible with several agent_executor interfaces)
def chat(user_message, history, agent_executor):
    if history is None:
        history = []
    try:
        # Try the different call styles used across langchain versions
        result = None
        if hasattr(agent_executor, "invoke"):
            try:
                result = agent_executor.invoke({"input": user_message})
            except Exception:
                result = agent_executor.invoke(user_message)
        elif hasattr(agent_executor, "run"):
            result = agent_executor.run(user_message)
        elif callable(agent_executor):
            result = agent_executor({"input": user_message})
        else:
            result = "The agent does not support any known call method. Check AgentExecutor."

        # Extract bot_reply safely (result may be a dict or a str)
        bot_reply = None
        if isinstance(result, dict):
            bot_reply = result.get('output') or result.get('result') or str(result)
        elif isinstance(result, str):
            bot_reply = result
        else:
            try:
                bot_reply = str(result)
            except Exception:
                bot_reply = "Could not get a result from the agent."

        history.append({ "role": "user", "content": user_message })
        history.append({ "role": "assistant", "content": bot_reply })

        # Return an empty string to clear the textbox, plus the updated history
        return "", history
    except Exception as e:
        error_msg = f"Error: {str(e)}"
        history.append({ "role": "user", "content": user_message })
        history.append({ "role": "assistant", "content": error_msg })
        return "", history


# PART 10: MAIN - Launch Gradio (once db and agent_executor exist)
def launch_ui(agent_executor):
    with gr.Blocks() as demo:
        gr.Markdown("# 🏨 Chillstay Chatbot - Firebase Edition (Notebook)")
        gr.Markdown("An AI assistant that helps you search for and manage hotels")
        chatbot = gr.Chatbot(type='messages', height=500)
        with gr.Row():
            txt = gr.Textbox(show_label=False, placeholder="Ask about hotels, rooms, bookings...", scale=9)
            btn = gr.Button("Send", scale=1)
        txt.submit(lambda msg, hist: chat(msg, hist, agent_executor), [txt, chatbot], [txt, chatbot])
        btn.click(lambda msg, hist: chat(msg, hist, agent_executor), [txt, chatbot], [txt, chatbot])
        gr.Examples(examples=["How many hotels are there?", "Find the 5 hotels with the highest rating", "Which rooms cost less than 500k?", "What is the average rating of hotels in Hà Nội?"], inputs=txt)
    demo.launch(share=True)

# Example (when running locally)
# service_account_path = "/path/to/your/firebase-key.json"
# db = initialize_firebase(service_account_path)
# agent_executor = create_chillstay_agent(db)
# launch_ui(agent_executor)
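
The reply-extraction logic in `chat` can be exercised without an LLM by passing in a stand-in executor. The sketch below assumes an executor with an `invoke` method that returns a dict. `FakeExecutor`, `extract_reply`, and `chat_once` are hypothetical names used only for this illustration.

```python
def extract_reply(result):
    """Return the text of an agent result that may be a dict, a str, or another object."""
    if isinstance(result, dict):
        return result.get("output") or result.get("result") or str(result)
    return str(result)

class FakeExecutor:
    """Stand-in for an agent executor; echoes the input instead of calling a model."""
    def invoke(self, payload):
        return {"input": payload["input"], "output": "echo: " + payload["input"]}

def chat_once(message, history, executor):
    """Append one user/assistant exchange in the 'messages' format used by gr.Chatbot."""
    history = list(history or [])
    reply = extract_reply(executor.invoke({"input": message}))
    history.append({"role": "user", "content": message})
    history.append({"role": "assistant", "content": reply})
    return "", history

_, history = chat_once("How many hotels are there?", None, FakeExecutor())
for turn in history:
    print(f"{turn['role']}: {turn['content']}")
```

The same stand-in could be passed to `launch_ui` to check the Gradio wiring before any credentials are configured.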



## Notes & Fixes (points to watch when running on a real machine)

1. **LangChain imports**: LangChain changes a lot between versions, so modules such as `langchain_openai`, `langchain_core`, and `langchain_community` may **not exist** in your installation. If you hit an import error:
   - Check the version with `pip show langchain` and read the matching changelog or docs.
   - Replace `from langchain_openai import ChatOpenAI` with `from langchain.chat_models import ChatOpenAI` if needed.
   - The same applies to `Chroma` and `OpenAIEmbeddings` (Chroma may live in `langchain.vectorstores` or `langchain_community.vectorstores`).

2. **Chroma.from_texts signature**: Depending on the version, the embedding parameter is named either `embedding` or `embedding_function`. The notebook tries both, but if it still fails, check the docs for the Chroma version you are using.

3. **Agent API**: The agent constructors and call styles (`create_openai_tools_agent`, `AgentExecutor`, `invoke`, `run`) change between releases. If the agent does not initialize, consult the documentation for your installed langchain version and replace the agent creation code with a `Tool`-based agent or a custom loop that calls the tools.

4. **Filter operators**: I added support for filters of the form `{"price": {"<": 500000}}`. Make sure the `price` values in Firestore are stored as numbers so the comparison works.

5. **Access and credentials**:
   - Firebase: pass the path to the service account JSON file when calling `initialize_firebase()`.
   - API keys: export `OPENAI_API_KEY` before running the notebook if you use the OpenAI models, or enter the key interactively (the setup cell prompts for the Google API key with `getpass`).

6. **Gradio**: Gradio occasionally changes its API between versions (for example the `Chatbot` signature). If you get an error, check your gradio version and adjust the parameters accordingly.

7. **Test step**: The original script tested the agent with `agent_executor.invoke`. I generalized this to try several call styles so that a missing method does not raise an error.
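
The version checks mentioned in these notes can also be run from inside the notebook instead of calling `pip show` for each package. The sketch below uses the standard library's `importlib.metadata`. The package list is an assumption about what this notebook depends on, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata

def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Assumption: distribution names of the libraries this notebook imports
PACKAGES = [
    "langchain",
    "langchain-core",
    "langchain-community",
    "langchain-google-genai",
    "chromadb",
    "gradio",
    "firebase-admin",
]

for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

Including this output with an error report makes it easier to match the code to the right LangChain, Chroma, and Gradio APIs.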

If you would like, I can go on to **run static checks** or **adapt the code to the exact langchain version** you are using. For that I would need the versions (for example from `pip show langchain` and `pip show chromadb`) or the error output you get when running the notebook on your machine.
