**EECS 6895 Information Processing Final Project - Emotional Nutritionist Chatbot with RAG and Agent**  

*Author: Dongbing Han (UNI: DH3071)*  
*Author: Ziyao Zhou (UNI: ZZ2915)*  
*Author: Yigang Meng (UNI: YM3068)*

- **Project Focus:**  
  Build an Emotion-Aware Nutritionist Chatbot that combines emotion classification with retrieval-augmented generation (RAG) to deliver personalized and contextually relevant dietary advice.

- **Core Model:**  
  - Leverages OpenAI's GPT-4o for high-quality natural language understanding and response generation.
  - Responses are enriched using context retrieved from a domain-specific vector store of nutrition articles.
- **BEAM Fine-Tuning Process:**  
  - Fine-tunes a roberta-base model on the GoEmotions (simplified 27-label) dataset to build a domain-specific emotion classifier.
  - The trained model detects the user's emotional state from each query, enabling prompt rewriting for emotionally-aware and empathetic GPT responses.

- **Emotion Integration:**  
  - Incorporates the BEAM Emotion Classifier to detect the user's emotional state from each query.
  - Ensures the chatbot can generate responses that are both relevant and informative within the nutrition field.

- **Retrieval Mechanism:**  
  - Automatically fetches the most pertinent articles from the NCBI database.  
  - Uses these articles as a knowledge base for the RAG component.

- **Hybrid Search Based on Similarity Threshold**  
  - Leverages a two-step retrieval process by first searching a local FAISS index for relevant documents.
  - If the similarity score of the retrieved documents meets or exceeds a defined threshold (e.g., 0.65), the system returns the results directly. Otherwise, the system queries NCBI in real time to fetch new articles, which are then embedded and indexed.
  - Processed PMIDs are tracked in pmids.json, which filters out duplicate articles and keeps retrieval efficient.

- **Integrated Approach:**  
  - Combines fine-tuned language generation with dynamic retrieval from NCBI.  
  - Delivers context-aware recommendations and insights.


- **Working Flow:**  
  - When a user asks a question (for example, about the impact of sugar on diabetes) and it is the first question:  
    - Search and retrieve relevant articles from the **NCBI database**.  
    - Generate a response based on the retrieved articles and the query.  

  - If it is not the first question:  
    - Perform a **cosine similarity search** across all the articles already retrieved from the **NCBI database**.  
    - Check if the existing materials provide enough support for answering the question.  
      - If sufficient material is available → Directly generate the response based on retrieved articles.  
      - If material is insufficient → Search the **NCBI database** for more relevant articles to support the question.  

  - This retrieval-augmented generation (RAG) method follows a **stack RAG** approach:  
    - Newly retrieved materials are added to the original articles.  
    - The system continuously expands the knowledge base to ensure stronger evidence for responses.  
    - This dynamic retrieval ensures more comprehensive and accurate answers to evolving queries.  
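
The working flow above can be sketched with a small in-memory store. This is a minimal illustration, not the notebook's implementation: `StackedStore`, `answer_context`, and `fake_ncbi_fetch` are hypothetical names, the two-dimensional vectors are made up, and the real system uses OpenAI embeddings with a FAISS index.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class StackedStore:
    """Toy knowledge base that grows whenever local evidence is insufficient."""

    def __init__(self, threshold=0.65):
        self.threshold = threshold
        self.items = []  # list of (vector, text) pairs

    def search(self, query_vec, top_k=3):
        # Score every stored item, keep the best top_k that pass the threshold.
        scored = [(cosine(query_vec, vec), text) for vec, text in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(s, t) for s, t in scored[:top_k] if s >= self.threshold]

    def answer_context(self, query_vec, fetch_remote, top_k=3):
        # 1) Try the material already collected.
        hits = self.search(query_vec, top_k)
        if hits:
            return hits, "local"
        # 2) Otherwise fetch new material, stack it onto the store, search again.
        self.items.extend(fetch_remote())
        return self.search(query_vec, top_k), "remote"

def fake_ncbi_fetch():
    # Stand-in for an NCBI call: returns (embedding, text) pairs.
    return [([1.0, 0.0], "abstract on sugar and diabetes")]

store = StackedStore(threshold=0.65)
first_hits, first_source = store.answer_context([1.0, 0.1], fake_ncbi_fetch)
second_hits, second_source = store.answer_context([1.0, 0.1], fake_ncbi_fetch)
print(first_source, second_source, len(store.items))
```

The first call finds nothing locally and falls back to the remote fetch; the second call is answered from the stacked store without another fetch.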


# Part 0: Project and Model Setup

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
!pip install --upgrade openai



In [None]:
import openai
from openai import OpenAI
import os
import json
import pandas as pd

In [None]:
# --- Quick test of the OpenAI chat endpoint ---

client = OpenAI(api_key='Your API Key')

user_query = 'Write a one-sentence bedtime story about a unicorn.'

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_query}
    ],
    temperature=0.5
)

print(response.choices[0].message.content)

Under a shimmering starlit sky, Luna the unicorn softly trotted through the enchanted forest, her glowing horn lighting the way for dreams to follow her gentle path.


In [None]:
def call_gpt_agent(role_desc, user_query, data_context):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": role_desc},
            {"role": "user", "content": f"{user_query}\nContext:\n{data_context}"}
        ],
        temperature=0.5
    )
    return response.choices[0].message.content

In [None]:
!pip install -U langchain-community


In [None]:
!pip install faiss-cpu sentence-transformers langchain


In [None]:
!pip install transformers torch accelerate



In [None]:
!pip install tiktoken



In [None]:
import time
import re
import textwrap
import pandas as pd
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from sentence_transformers import SentenceTransformer
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings, SentenceTransformerEmbeddings
from langchain.docstore.document import Document
from langchain.vectorstores import FAISS
from langchain.chains.question_answering import load_qa_chain
from langchain.prompts import PromptTemplate
from langchain_community.embeddings import VoyageEmbeddings, HuggingFaceInferenceAPIEmbeddings
from langchain.schema import Document

In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

import os
import json
import faiss
import pandas as pd
# from your_pubmed_module import perform_esearch_ids, perform_esearch_abstracts

In [None]:
import requests
from xml.etree import ElementTree as ET
import math

# Part 1: Emotion Recognition Module

In [None]:
id2label = [
    "admiration", "amusement", "anger", "annoyance", "approval",
    "caring", "confusion", "curiosity", "desire", "disappointment",
    "disapproval", "disgust", "embarrassment", "excitement", "fear",
    "gratitude", "grief", "joy", "love", "nervousness",
    "optimism", "pride", "realization", "relief", "remorse",
    "sadness", "surprise"
]

In [None]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

class BeamEmotionClassifier:
    def __init__(self, model_path="/content/drive/MyDrive/beam_bert"):
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.model = AutoModelForSequenceClassification.from_pretrained(model_path)
        self.classifier = pipeline(
            "text-classification",
            model=self.model,
            tokenizer=self.tokenizer,
            top_k=1
        )
        self.id2label = [
            "admiration", "amusement", "anger", "annoyance", "approval",
            "caring", "confusion", "curiosity", "desire", "disappointment",
            "disapproval", "disgust", "embarrassment", "excitement", "fear",
            "gratitude", "grief", "joy", "love", "nervousness",
            "optimism", "pride", "realization", "relief", "remorse",
            "sadness", "surprise"
        ]

    def predict(self, text):
        result = self.classifier(text)[0][0]
        label_idx = int(result['label'].split("_")[-1])
        emotion = self.id2label[label_idx]
        return emotion, result['score']
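
The label parsing inside `predict` can be checked without loading the model. The sketch below assumes, as the `split("_")` call implies, that the checkpoint emits default names of the form `LABEL_<n>`; `label_to_emotion` is a hypothetical helper, not part of the notebook.

```python
EMOTIONS = [
    "admiration", "amusement", "anger", "annoyance", "approval",
    "caring", "confusion", "curiosity", "desire", "disappointment",
    "disapproval", "disgust", "embarrassment", "excitement", "fear",
    "gratitude", "grief", "joy", "love", "nervousness",
    "optimism", "pride", "realization", "relief", "remorse",
    "sadness", "surprise",
]

def label_to_emotion(label, names=EMOTIONS):
    """Map a pipeline label such as 'LABEL_14' to its emotion name."""
    idx = int(label.rsplit("_", 1)[-1])
    if not 0 <= idx < len(names):
        # Guards against a checkpoint trained with more classes than listed here.
        raise ValueError(f"label index {idx} is outside the {len(names)} known emotions")
    return names[idx]

print(label_to_emotion("LABEL_14"))
```

Raising on an out-of-range index makes a mismatch between the classifier head and the label list visible immediately instead of surfacing as an `IndexError` deep inside the UI callback.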

Rewrite the prompt to enhance emotion awareness

In [None]:
def rewrite_prompt_with_emotion(user_input, emotion):
    # Instruction block sent as the user message; the system message only sets the role
    rewrite_request = f"""
You are an assistant that rewrites user prompts to make them emotion-aware.

User Input: {user_input}

Detected Emotion: {emotion}

Be human-like and nuanced.
Rewrite the prompt based on the detected emotion, and also improve its clarity.
    """.strip()

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You rewrite prompts to be emotion-aware."},
            {"role": "user", "content": rewrite_request}
        ],
        temperature=0.3
    )

    rewritten_prompt = response.choices[0].message.content
    return rewritten_prompt

# Part 2:  Abstracts Retrieving Module from the NCBI Database

In [None]:
# NCBI E-utilities configuration
db = 'pubmed'
base_url = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/'
NCBI_API_KEY = 'Your API Key'

In [None]:
def perform_esearch_ids(query, api_key, sort_by="relevance", retmax=100):
    # Build the parameter dictionary for the ESearch request
    params = {
        "db": db,
        "term": query,
        "usehistory": "y",
        "api_key": api_key,  # use the key passed in rather than the global
        "sort": sort_by,
        "retmax": retmax
    }

    # Execute the GET request and fail fast on HTTP errors
    response = requests.get(f"{base_url}esearch.fcgi", params=params, timeout=30)
    response.raise_for_status()

    # Parse the XML response and collect the PubMed IDs
    root = ET.fromstring(response.text)
    pmids = [elem.text for elem in root.findall(".//Id")]

    # WebEnv identifies the server-side history session created by usehistory=y
    webenv = root.findtext(".//WebEnv")

    return pmids, webenv
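
The XML handling in `perform_esearch_ids` can be tried offline. The sketch below parses a hand-written response shaped like an ESearch result; the PMIDs and the `WebEnv` value are made up for illustration, and `parse_esearch` is a hypothetical helper.

```python
from xml.etree import ElementTree as ET

# Hand-written sample in the shape of an ESearch response (values are invented).
SAMPLE_ESEARCH_XML = """
<eSearchResult>
  <Count>2</Count>
  <RetMax>2</RetMax>
  <RetStart>0</RetStart>
  <QueryKey>1</QueryKey>
  <WebEnv>MCID_example</WebEnv>
  <IdList>
    <Id>12345678</Id>
    <Id>23456789</Id>
  </IdList>
</eSearchResult>
"""

def parse_esearch(xml_text):
    """Return (pmids, webenv) from an ESearch XML payload."""
    root = ET.fromstring(xml_text.strip())
    pmids = [elem.text for elem in root.findall(".//Id")]
    webenv = root.findtext(".//WebEnv")  # None when the element is absent
    return pmids, webenv

pmids, webenv = parse_esearch(SAMPLE_ESEARCH_XML)
print(pmids, webenv)
```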


In [None]:
def perform_esearch_abstracts(pmids, web, api_key, chunk_size=200):
    articles_info = []

    # Nothing to fetch; this also avoids a zero step in range() below
    if not pmids:
        return articles_info

    # Request the records in batches of at most chunk_size IDs
    for i in range(0, len(pmids), chunk_size):
        current_ids = pmids[i:i + chunk_size]
        params = {
            "db": db,
            "WebEnv": web,
            "api_key": api_key,  # use the key passed in rather than the global
            "retmode": "xml",
            "rettype": "abstract",
            "id": ",".join(current_ids)
        }
        response = requests.get(f"{base_url}efetch.fcgi", params=params, timeout=30)
        response.raise_for_status()
        root = ET.fromstring(response.text)

        for article in root.findall('.//PubmedArticle'):
            # Extract the article title
            title = article.findtext('.//ArticleTitle')

            # Extract the electronic publication date if available
            pub_date_elem = article.find('.//ArticleDate')
            if pub_date_elem is not None:
                year = pub_date_elem.findtext('Year')
                month = pub_date_elem.findtext('Month')
                day = pub_date_elem.findtext('Day')
                pub_date = f"{year}-{month}-{day}"
            else:
                pub_date = None

            # Extract authors
            authors = []
            for author in article.findall('.//Author'):
                first_name = author.findtext('ForeName')
                last_name = author.findtext('LastName')
                if first_name and last_name:
                    authors.append(f"{first_name} {last_name}")
            if not authors:
                authors = None

            # Extract journal name
            journal_info = article.find('.//MedlineJournalInfo')
            journal_name = journal_info.findtext('MedlineTA') if journal_info is not None else None

            # Extract DOI
            doi = article.findtext('.//ELocationID[@EIdType="doi"]')

            # Join all abstract sections; itertext() keeps text inside inline tags such as <i>
            abstract_parts = ["".join(elem.itertext()).strip() for elem in article.findall('.//AbstractText')]
            abstract_text = " ".join(part for part in abstract_parts if part)

            articles_info.append([abstract_text, title, pub_date, authors, journal_name, doi])

    return articles_info
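
PubMed abstracts can contain inline markup (for example `<i>` or `<sub>`), and an element's `.text` attribute only covers the text before its first child tag. The sketch below, run on a hand-written record with invented content, compares `.text` with `itertext()`; `abstract_text` is a hypothetical helper, and prefixing each section with its `Label` attribute is an optional choice rather than something the notebook does.

```python
from xml.etree import ElementTree as ET

# Hand-written sample record (content is invented for illustration).
SAMPLE_ARTICLE_XML = (
    "<PubmedArticle><MedlineCitation><Article>"
    "<ArticleTitle>Example title</ArticleTitle>"
    "<Abstract>"
    '<AbstractText Label="BACKGROUND">Intake of <i>n</i>-3 fatty acids varies.</AbstractText>'
    '<AbstractText Label="RESULTS">No effect was seen.</AbstractText>'
    "</Abstract>"
    "</Article></MedlineCitation></PubmedArticle>"
)

def abstract_text(article):
    """Join every AbstractText section, keeping text nested inside inline tags."""
    parts = []
    for elem in article.findall(".//AbstractText"):
        text = "".join(elem.itertext()).strip()
        if text:
            label = elem.get("Label")
            parts.append(f"{label}: {text}" if label else text)
    return " ".join(parts)

article = ET.fromstring(SAMPLE_ARTICLE_XML)
truncated = article.find(".//AbstractText").text  # stops at the <i> tag
full = abstract_text(article)
print(repr(truncated))
print(full)
```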

In [None]:
def extract_keywords_for_ncbi(text):
    # 1. System role: define the agent’s job
    role_desc = (
        "You are an assistant that extracts exactly two concise search terms "
        "suitable for querying the NCBI database. "
        "Only output the two terms separated by a comma, with no extra text."
    )

    # 2. User prompt: what to do with the input question
    user_query = (
        f"Extract two keywords from this question for an NCBI dataset search:\n"
        f"\"{text}\"\n"
        f"Keywords:"
    )

    # 3. No additional context needed here
    data_context = ""

    # 4. Call your GPT helper
    response = call_gpt_agent(role_desc, user_query, data_context)

    # 5. Return the raw, stripped output
    return response.strip()
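
The model's reply is a free-form string such as `sugar, type 2 diabetes`, which is passed on as the search term unchanged. One possible post-processing step, sketched here under the assumption that both keywords should be required, joins them with the Entrez `AND` operator and quotes multi-word phrases. `keywords_to_entrez_term` is a hypothetical helper and is not called anywhere in the notebook.

```python
def keywords_to_entrez_term(raw, max_terms=2):
    """Turn 'term a, term b' into an Entrez query string such as '"term a" AND "term b"'."""
    terms = []
    for piece in raw.split(","):
        # Drop surrounding whitespace and any quotes the model may have added.
        cleaned = piece.strip().strip("\"'").strip()
        if cleaned and cleaned.lower() not in [t.lower() for t in terms]:
            terms.append(cleaned)
    terms = terms[:max_terms]
    # Quote multi-word phrases so they are searched as a unit.
    return " AND ".join(f'"{t}"' if " " in t else t for t in terms)

print(keywords_to_entrez_term(' sugar, "type 2 diabetes" '))
```

Whether `AND` or a looser `OR` gives better recall for this chatbot is an empirical question that the sketch does not settle.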

# Part 3:  Full Article Retrieving Module from the NCBI Database

In [None]:
def rank_docs_by_tfidf(pmids, abstracts, query, top_n=10, stop_words='english'):
    """
    Ranks documents by cosine similarity between their TF–IDF vectors and the query.

    Args:
        pmids (List[str]): List of PubMed IDs corresponding to each abstract.
        abstracts (List[str]): List of abstract texts.
        query (str): The user query or keywords string.
        top_n (int): Number of top documents to return.
        stop_words (str or List[str]): Stop words to remove during tokenization.

    Returns:
        List[Tuple[str, float, int]]: List of tuples (pmid, similarity_score, index),
                                      sorted by descending score.
    """
    # Fit TF–IDF on all abstracts plus the query as the last document
    vectorizer = TfidfVectorizer(stop_words=stop_words)
    tfidf_matrix = vectorizer.fit_transform(abstracts + [query])

    # Separate document and query vectors
    doc_vectors = tfidf_matrix[:-1]
    query_vector = tfidf_matrix[-1].reshape(1, -1)

    # Compute cosine similarities
    scores = cosine_similarity(doc_vectors, query_vector).flatten()

    # Get top_n indices sorted by similarity descending
    top_indices = np.argsort(scores)[::-1][:top_n]

    # Build results: (pmid, score, original_index)
    results = [(pmids[i], scores[i], i) for i in top_indices]
    return results
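
To make the ranking step concrete, here is a dependency-free sketch of TF-IDF cosine ranking on three invented titles. It uses the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1`, but tokenization is simplified and no stop words are removed, so the scores will not match `TfidfVectorizer` exactly. `tfidf_rank` and `tokenize` are hypothetical names.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and keep word tokens of two or more characters."""
    return re.findall(r"\b\w\w+\b", text.lower())

def tfidf_rank(docs, query, top_n=3):
    """Return [(doc_index, score)] sorted by descending cosine similarity to the query."""
    corpus = [tokenize(d) for d in docs] + [tokenize(query)]
    n = len(corpus)
    # Document frequency: in how many texts (query included) each term appears.
    df = Counter(term for toks in corpus for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}

    def vec(toks):
        return {t: count * idf[t] for t, count in Counter(toks).items()}

    def cos(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vec(corpus[-1])
    scores = [cos(vec(toks), q) for toks in corpus[:-1]]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:top_n]]

docs = [
    "Dietary sugar intake and insulin resistance in adults",
    "Sleep duration in adolescents",
    "Sugar sweetened beverages raise diabetes risk",
]
ranked = tfidf_rank(docs, "sugar diabetes")
print(ranked)
```

The third title shares both query terms and ranks first; the second shares none and scores zero.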

In [None]:
def get_articles(query, nb_article=100, pmid_file="pmids.json"):
    """
    Retrieve articles from NCBI based on the query and filter out those that have already been processed,
    using a local JSON file to track PMIDs.

    Parameters:
      query (str): The search query.
      nb_article (int): Number of articles to retrieve.
      pmid_file (str): Path to the JSON file used to store processed PMIDs.

    Returns:
      pd.DataFrame: A DataFrame containing new articles.
    """
    # Load existing PMIDs from the JSON file if it exists, otherwise use an empty set.
    if os.path.exists(pmid_file):
        with open(pmid_file, "r") as f:
            existing_pmids = set(json.load(f))
    else:
        existing_pmids = set()

    # Search PubMed, then fetch abstracts for the returned IDs (uses the global NCBI_API_KEY)
    publication_ids, web_key = perform_esearch_ids(query, NCBI_API_KEY, sort_by="relevance", retmax=nb_article)
    if not publication_ids:
        return pd.DataFrame()
    new_retrieved_articles = perform_esearch_abstracts(publication_ids, web_key, NCBI_API_KEY)
    new_retrieved_abstracts = [article[0] for article in new_retrieved_articles]
    top10_score = rank_docs_by_tfidf(publication_ids, new_retrieved_abstracts, query, top_n=10)
    top10_pmids = [pmid for pmid, score, idx in top10_score]
    # Keep only PMIDs that have not been processed before, as the docstring promises
    new_publication_ids = [pmid for pmid in top10_pmids if pmid not in existing_pmids]

    if not new_publication_ids:
        return pd.DataFrame()

    articles_informations = perform_esearch_abstracts(new_publication_ids, web_key, NCBI_API_KEY)

    df_pmids = pd.DataFrame(new_publication_ids, columns=['PmID'])
    df_info = pd.DataFrame(articles_informations,
                           columns=['Abstract', 'Title', 'Publication_Date', 'Authors', 'Journal', 'DOI'])
    df = pd.concat([df_pmids, df_info], axis=1)
    df['query'] = query

    # Update the set of processed PMIDs and save it back to the JSON file.
    existing_pmids.update(new_publication_ids)
    with open(pmid_file, "w") as f:
        json.dump(list(existing_pmids), f)

    return df
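
The PMID bookkeeping in `get_articles` can be isolated into a small, testable pattern. The sketch below is an illustration with made-up PMIDs and a temporary file; `load_seen_pmids` and `filter_new_pmids` are hypothetical helpers rather than functions defined in this notebook.

```python
import json
import os
import tempfile

def load_seen_pmids(path):
    """Return the set of PMIDs recorded in `path`, or an empty set if the file is absent."""
    if not os.path.exists(path):
        return set()
    with open(path, "r") as f:
        return set(json.load(f))

def filter_new_pmids(candidates, path):
    """Keep only unseen PMIDs (order preserved) and record them in `path`."""
    seen = load_seen_pmids(path)
    fresh = []
    for pmid in candidates:
        if pmid not in seen:
            seen.add(pmid)
            fresh.append(pmid)
    with open(path, "w") as f:
        json.dump(sorted(seen), f)
    return fresh

# Demo with a throwaway file and invented PMIDs.
demo_path = os.path.join(tempfile.mkdtemp(), "pmids.json")
first = filter_new_pmids(["111", "222"], demo_path)
second = filter_new_pmids(["222", "333"], demo_path)
print(first, second)
```

The second call returns only the PMID that was not recorded by the first, which is the behavior needed to avoid embedding and indexing the same abstract twice.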

In [None]:
def prepare_corpus(df_corpus, existing_pmids):
    # Filter out articles that have already been indexed based on their PubMed IDs
    clean_df_corpus = df_corpus[~df_corpus["PmID"].isin(existing_pmids)]

    # Store meta_chunk for RAG
    # 1) Drop the Abstract column to get only your metadata columns
    meta_df = clean_df_corpus.drop(columns=["Abstract"])
    # 2) Convert each row of metadata into a dict
    meta_rows  = meta_df.to_dict(orient="records")

    # Create a text splitter to divide long abstracts into smaller chunks (e.g., 2000 characters each)
    splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=0)

    # # Use all columns except 'Abstract' as metadata
    # metadata_columns = [col for col in clean_df_corpus.columns if col != 'Abstract']

    # # Create documents: each document is a text chunk with associated metadata
    # documents = splitter.create_documents(
    #     clean_df_corpus['Abstract'],
    #     metadatas=clean_df_corpus[metadata_columns].to_dict('records')
    # )

    docs = splitter.create_documents(clean_df_corpus["Abstract"], metadatas=meta_rows)
    chunks = splitter.split_documents(docs)

    # # Further split the documents into chunks if necessary
    # document_chunks = splitter.split_documents(documents)

    # Return the prepared document chunks and a list of new PMIDs
    return docs, list(clean_df_corpus["PmID"].unique()), chunks


In [None]:
def get_text_chunks(text_list):
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=10000, chunk_overlap=1000)
    all_chunks = []
    for text in text_list:
        # Ensure the text is a string; if not, convert it.
        if not isinstance(text, str):
            text = str(text)
        chunks = text_splitter.split_text(text)
        all_chunks.extend(chunks)
    return all_chunks
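
The roles of `chunk_size` and `chunk_overlap` can be illustrated with a plain sliding window. This is a simplification: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and sentences, whereas the hypothetical `sliding_chunks` below cuts at fixed character offsets.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks

print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
```

Each chunk repeats the last character of the previous one, which is how overlap keeps a sentence that straddles a boundary visible in both chunks. With `chunk_size=10000`, as in `get_text_chunks`, a text shorter than 10,000 characters stays in a single chunk.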

In [None]:
def processing_with_model(question, num_article=100):
    key_terms = extract_keywords_for_ncbi(question)
    df = get_articles(key_terms, nb_article=num_article)
    if df.empty:
        print("No new articles found.")
        return [], []

    # here you could maintain a persistent `seen_pmids` set
    seen = set()

    docs, new_pids, _ = prepare_corpus(df, seen)
    if not docs:
        print("No new docs to index.")
        return [], []

    # extract text & metadata per chunk
    text_chunks = [d.page_content for d in docs]
    meta_chunks = [d.metadata     for d in docs]
    return text_chunks, meta_chunks

# Part 4: Design RAG with OpenAI Embeddings

In [None]:
# 1) Initialize the OpenAI embedding model
#    This will under the hood call the embeddings endpoint (e.g. text-embedding-ada-002).
embeddings_model = OpenAIEmbeddings(
    model="text-embedding-ada-002",           # or another embedding-capable model
    openai_api_key="Your API Key"
)
print(embeddings_model)

client=<openai.resources.embeddings.Embeddings object at 0x7ae474f6ba10> async_client=<openai.resources.embeddings.AsyncEmbeddings object at 0x7ae474f55650> model='text-embedding-ada-002' deployment='text-embedding-ada-002' openai_api_version='' openai_api_base=None openai_api_type='' openai_proxy='' embedding_ctx_length=8191 openai_api_key='Your API Key' openai_organization=None allowed_special=set() disallowed_special='all' chunk_size=1000 max_retries=2 request_timeout=None headers=None tiktoken_enabled=True tiktoken_model_name=None show_progress_bar=False model_kwargs={} skip_empty=False default_headers=None default_query=None retry_min_seconds=4 retry_max_seconds=20 http_client=None


In [None]:
EMB_MODEL = embeddings_model
INDEX_DIR = "faiss_index"
PMID_FILE = "pmids.json"

In [None]:
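def get_normalized_query_embedding(query: str):
    # embed_query returns a plain list; convert to a float32 array, the dtype FAISS expects
    emb = np.asarray(EMB_MODEL.embed_query(query), dtype=np.float32)
    norm = np.linalg.norm(emb)
    return emb / norm if norm else emb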
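
Normalizing the query matters when the index reports distances rather than similarities. For unit-length vectors, the squared Euclidean distance and the cosine similarity are related by d² = 2 − 2·cos, so cos = 1 − d²/2. The sketch below checks this numerically with made-up vectors. It is relevant on the assumption that the FAISS store is a flat L2 index returning squared distances, which should be verified for the index actually in use.

```python
import math

def normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_from_squared_l2(d2):
    """Valid only when both vectors have unit length."""
    return 1.0 - d2 / 2.0

a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])
dot = sum(x * y for x, y in zip(a, b))
recovered = cosine_from_squared_l2(squared_l2(a, b))
print(round(dot, 4), round(recovered, 4))
```

Under this relation, a cosine threshold of 0.65 corresponds to a squared distance of at most 0.7, so a raw distance must be converted (or the comparison inverted) before it is checked against the threshold.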

In [None]:
def build_vector_store(chunks, metadatas, index_path="faiss_index"):
    """
    Build or update a FAISS vector store using OpenAI embeddings.

    Args:
      chunks      List[str]: the text of each chunk
      metadatas   List[dict]: parallel list of metadata dicts for each chunk
      index_path  str: folder to save/load index

    Returns:
      FAISS: a LangChain-wrapped FAISS store
    """
    docs = [Document(page_content=txt, metadata=md)
            for txt,md in zip(chunks, metadatas)]

    if os.path.isdir(index_path):
        # Trusting that this is your own index, so allow pickle deserialization:
        store = FAISS.load_local(
            index_path,
            embeddings_model,
            allow_dangerous_deserialization=True
        )
        store.add_documents(docs)
    else:
        store = FAISS.from_documents(docs, embeddings_model)

    store.save_local(index_path)
    return store


In [None]:
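def search_vector_store(vector_store: FAISS, query_embedding: np.ndarray, top_k: int = 5):
    # FAISS expects a 2-D float32 array of shape (n_queries, dim)
    query_embedding = np.asarray(query_embedding, dtype=np.float32)
    if query_embedding.ndim == 1:
        query_embedding = query_embedding.reshape(1, -1)

    # Raw FAISS search; scores use whatever metric the index was built with
    scores, indices = vector_store.index.search(query_embedding, top_k)

    # Map back to Documents, keeping each score paired with its own document.
    # FAISS pads missing results with index -1, which is skipped here.
    results = []
    for score, i in zip(scores[0], indices[0]):
        doc_id = vector_store.index_to_docstore_id.get(int(i))
        if doc_id is not None:
            results.append((vector_store.docstore.search(doc_id), score))
    return results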

In [None]:
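def local_search(query_embed: np.ndarray, top_k: int, threshold: float = 0.65):
    """
    Returns (qualified_local_hits, k_remaining).

    Hits are (Document, cosine_similarity) pairs with similarity >= threshold.
    """
    if not os.path.isdir(INDEX_DIR):
        return [], top_k

    store = FAISS.load_local(
        INDEX_DIR,
        EMB_MODEL,
        allow_dangerous_deserialization=True
    )
    hits = search_vector_store(store, query_embed, top_k)
    # Assumption: the store uses LangChain's default flat L2 index, which reports
    # squared Euclidean distance (lower is closer), and the stored embeddings are
    # unit-length. Then cosine = 1 - d^2 / 2, so convert before thresholding.
    qualified = []
    for doc, dist in hits:
        similarity = 1.0 - float(dist) / 2.0
        if similarity >= threshold:
            qualified.append((doc, similarity))
    k_remain = top_k - len(qualified)
    return qualified, k_remain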

In [None]:
def hybrid_search(query: str, top_k: int = 10, threshold: float = 0.65):
    """
    1) Do a local FAISS search & filter by threshold.
    2) If we have >= top_k local hits, return them.
    3) Otherwise fetch and process new articles from NCBI, add them to the
       FAISS index, and return local_hits + the new text chunks.

    Note: local hits are (Document, score) tuples, while the new chunks are
    plain strings.
    """
    # 1) Embed and local search
    q_embed = get_normalized_query_embedding(query)
    local_hits, k = local_search(q_embed, top_k, threshold)

    # 2) If enough local, just return
    if k <= 0:
        return local_hits

    # 3) Need k more → fetch & process new articles
    # print(f"Only {len(local_hits)} local hits ≥{threshold}. Fetching {k} new articles…")
    text_chunks, meta_chunks = processing_with_model(query)

    if not text_chunks:
        # fallback to whatever local we have
        return local_hits

    # 4) Persist the new chunks so later queries can be answered locally
    build_vector_store(text_chunks, meta_chunks, INDEX_DIR)

    # 5) Return combined results
    return local_hits + text_chunks

# Part 5: Design the ChatBot with UI


## Router Agent: Tavily or NCBI?


Router

Tavily Agent

NCBI Agent

In [None]:
!pip install -U langchain-openai



In [None]:
!pip install -U langgraph


In [None]:
pip install -U langchain_community langchain_anthropic langchain_experimental matplotlib langgraph


api

In [None]:
import getpass
import os

# Prompt for the keys at runtime instead of hard-coding secrets in the notebook
openai_key = getpass.getpass("OpenAI API key: ")
tavily_key = getpass.getpass("Tavily API key: ")
os.environ["OPENAI_API_KEY"] = openai_key
os.environ["TAVILY_API_KEY"] = tavily_key
from langchain.chat_models import init_chat_model
model = init_chat_model("gpt-4o", model_provider="openai")

tools

In [None]:
from langchain_community.tools.tavily_search import TavilySearchResults
from typing import Annotated
from langchain_core.tools import tool

# Tavily Search
search = TavilySearchResults(max_results=2, verbose=True)


# NCBI
@tool
def NCBI(query: str, top_k: int = 10, threshold: float = 0.65) -> list:
    """
    Retrieve academic, credible articles from NCBI.
    1) Do a local FAISS search & filter by threshold.
    2) If we have >= top_k local hits, return them.
    3) Otherwise fetch and process new articles, add them to the index,
       and return local_hits + the new text chunks.
    """
    # Delegate to hybrid_search so the tool and the direct call share one code path
    # (including the step that adds newly fetched chunks to the FAISS index)
    return hybrid_search(query, top_k, threshold)


tools = [search, NCBI]




Create Agents

In [None]:
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import HumanMessage
# emotion = "Depression"
# Router_agent = create_react_agent(model, tools,
#                                   prompt =
#                                   f"""
#                                   Your are a emotion-awared agent in nutrition and mental health.
#                                   Based on users prompt and the {emotion}, you must call the best tool to retrieve the result.
#                                   Search information when user is in dangerous.""")

In [None]:
# for step in Router_agent.stream(
#     {"messages": [HumanMessage(content="I can't live without her. I want to die")]},
#     stream_mode="values",
# ):
#     step["messages"][-1].pretty_print()

In [None]:
# for step, metadata in Router_agent.stream(
#     {"messages": [HumanMessage(content="What is Diabetes")]},
#     stream_mode="messages",
# ):
#     if metadata["langgraph_node"] == "agent" and (text := step.text()):
#         print(text, end="")

# UI

In [None]:
# ⚙️ 1) Imports & setup
import os
import numpy as np
import ipywidgets as widgets
from IPython.display import display
# Instantiate emotion classifier
emotion_classifier = BeamEmotionClassifier()
# — your existing functions must already be in the notebook:
#    get_normalized_query_embedding, search_vector_store,
#    processing_with_model, build_vector_store,
#    hybrid_search, etc.

# 2) Create an output area for conversation
conversation_output = widgets.Output(layout=widgets.Layout(border='1px solid gray', padding='10px', height='300px', overflow='auto'))

# 3) Sample question buttons
sample_questions = [
    "How can I incorporate more antioxidants into my diet?",
    "What's the effect of sugar on overall health?",
    "Are there any foods that can improve skin health?",
    "Exit System. See You Next Time!"
]

sample_buttons = [
    widgets.Button(description=q, layout=widgets.Layout(width='100%'))
    for q in sample_questions
]
sample_buttons_box = widgets.VBox(sample_buttons)

# 4) User text input + submit
text_widget = widgets.Text(
    value="",
    placeholder="Or ask your own question",
    description="Question:",
    layout=widgets.Layout(width='80%')
)
submit_btn = widgets.Button(
    description="Submit",
    button_style='success',
    tooltip='Submit your question'
)
user_input_box = widgets.HBox([text_widget, submit_btn])

# 5) display_response: classify the emotion, rewrite the prompt, and stream the agent's answer
def display_response(question: str):
    with conversation_output:
        print("\n")
        print(f">>> You: {question}")

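        # Exit handling: matches the exit sample button or any input that starts with "exit".
        # (A substring test would also end the session on questions that merely contain "exit".)
        if question.strip().lower().startswith("exit"):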
            print("Goodbye! See You Next Time!")
            text_widget.disabled = True
            submit_btn.disabled = True
            for btn in sample_buttons:
                btn.disabled = True
            return

        print("…thinking…\n")

    # ✅ 1️⃣ Emotion Classification (outside UI output so logs don't clutter)
    emotion, confidence = emotion_classifier.predict(question)
    print(f"[LOG] Emotion detected: {emotion} ({confidence:.2f})")

    # ✅ 2️⃣ Rewrite question with emotion context
    rewritten_question = rewrite_prompt_with_emotion(question, emotion)
    print(f"[LOG] Rewritten prompt: {rewritten_question}")

    with conversation_output:
        Router_agent = create_react_agent(model, tools,
                                        prompt =
                                        f"""
You are an emotion-aware agent specializing in nutrition and mental health.
When you receive a user’s prompt plus the detected emotion ({emotion}), choose the appropriate external tool to fulfill their request:

1. If the query is academic or research-oriented (e.g., clinical studies, evidence summaries, scientific references), call **NCBI**.
2. If the query requires current, real-time information (e.g., emerging dietary guidelines, news alerts, trending mental-health advice), call **TavilySearchResults**.
3. If and only if the user’s need combines both academic rigor and up-to-date details, invoke **both**: first **NCBI**, then **TavilySearchResults**.

Always call at least one tool. If the user’s situation appears dangerous (e.g., signs of self-harm, severe malnutrition), prioritize real-time resources via **TavilySearchResults**.

""")

        for step in Router_agent.stream(
            {"messages": [HumanMessage(content=rewritten_question)]},
            stream_mode="values",
        ):
            step["messages"][-1].pretty_print()


# 6) Button callbacks
def on_sample_click(b):
    question = b.description
    display_response(question)

def on_submit_click(b):
    q = text_widget.value.strip()
    if q:
        display_response(q)
    text_widget.value = ""

for btn in sample_buttons:
    btn.on_click(on_sample_click)
submit_btn.on_click(on_submit_click)

# 7) Initial welcome message
with conversation_output:
    print("👋 Welcome to the Nutritional RAG Chatbot!")
    print("Ask me anything about nutrition, or click one of the sample questions below.\n")

# 8) Lay out and display
ui = widgets.VBox([
    conversation_output,
    widgets.HTML("<b>Sample questions:</b>"),
    sample_buttons_box,
    widgets.HTML("<b>Your question:</b>"),
    user_input_box
])
display(ui)

Device set to use cpu


VBox(children=(Output(layout=Layout(border='1px solid gray', height='300px', overflow='auto', padding='10px'))…

[LOG] Emotion detected: curiosity (0.72)
[LOG] Rewritten prompt: I'm curious about the benefits of drinking coffee. Could you also recommend some of the best coffee shops in Long Island City?
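
The agent prompt in the cell above describes three routing rules plus a safety override. In the notebook the language model makes this decision. The sketch below only restates the rules as a deterministic function so the intended behavior can be checked. `choose_tools` and its boolean flags are hypothetical, the tool names are taken from the prompt text and the `tools` list, and the default branch is an assumption, since the prompt requires at least one tool call but does not name a default.

```python
from typing import List


def choose_tools(academic: bool, realtime: bool, dangerous: bool = False) -> List[str]:
    """Return tool names in call order, following the routing rules stated in the prompt."""
    if dangerous:
        # Safety override: dangerous situations prioritize real-time resources.
        return ["TavilySearchResults"]
    if academic and realtime:
        # Both needs present: academic source first, then real-time search.
        return ["NCBI", "TavilySearchResults"]
    if academic:
        return ["NCBI"]
    # Assumption: fall back to web search so that at least one tool is always called.
    return ["TavilySearchResults"]


print(choose_tools(academic=True, realtime=False))
print(choose_tools(academic=True, realtime=True))
print(choose_tools(academic=True, realtime=False, dangerous=True))
```

For the logged example above (coffee benefits plus local coffee shops), the rules would plausibly select both tools, with `NCBI` called first.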


## UI (commented-out version without the agent)

In [None]:
# # ⚙️ 1) Imports & setup
# import os
# import numpy as np
# import ipywidgets as widgets
# from IPython.display import display
# # Instantiate emotion classifier
# emotion_classifier = BeamEmotionClassifier()
# # — your existing functions must already be in the notebook:
# #    get_normalized_query_embedding, search_vector_store,
# #    processing_with_model, build_vector_store,
# #    hybrid_search, etc.

# # 2) Create an output area for conversation
# conversation_output = widgets.Output(layout=widgets.Layout(border='1px solid gray', padding='10px', height='300px', overflow='auto'))

# # 3) Sample question buttons
# sample_questions = [
#     "How can I incorporate more antioxidants into my diet?",
#     "What's the effect of sugar on overall health?",
#     "Are there any foods that can improve skin health?",
#     "Exit System. See You Next Time !"
# ]

# sample_buttons = [
#     widgets.Button(description=q, layout=widgets.Layout(width='100%'))
#     for q in sample_questions
# ]
# sample_buttons_box = widgets.VBox(sample_buttons)

# # 4) User text input + submit
# text_widget = widgets.Text(
#     value="",
#     placeholder="Or ask your own question",
#     description="Question:",
#     layout=widgets.Layout(width='80%')
# )
# submit_btn = widgets.Button(
#     description="Submit",
#     button_style='success',
#     tooltip='Submit your question'
# )
# user_input_box = widgets.HBox([text_widget, submit_btn])

# # Updated display_response with proper RAG to GPT integration
# def display_response(question: str):
#     with conversation_output:
#         print("\n")
#         print(f">>> You: {question}")

#         # Exit handling: matches the exit sample button or any input that starts with "exit"
#         if question.strip().lower().startswith("exit"):
#             print("Goodbye! See You Next Time!")
#             text_widget.disabled = True
#             submit_btn.disabled = True
#             for btn in sample_buttons:
#                 btn.disabled = True
#             return

#         print("…thinking…\n")

#     # ✅ 1️⃣ Emotion Classification (outside UI output so logs don't clutter)
#     emotion, confidence = emotion_classifier.predict(question)
#     print(f"[LOG] Emotion detected: {emotion} ({confidence:.2f})")

#     # ✅ 2️⃣ Rewrite question with emotion context
#     rewritten_question = rewrite_prompt_with_emotion(question, emotion)
#     print(f"[LOG] Rewritten prompt: {rewritten_question}")

#     # ✅ 3️⃣ Retrieve relevant context
#     results = hybrid_search(question, top_k=10, threshold=0.65)

#     with conversation_output:
#         if not results:
#             print("Sorry, I couldn't find any relevant articles.")
#             return

#         # ✅ 4️⃣ Set role description with emotion
#         role_desc = (
#             "You are an expert nutritionist. "
#             "Answer the user's emotionally-aware question based on the provided context, and feel free to include your own insights. "
#             f"The user is feeling {emotion}."
#         )
#         user_query = rewritten_question
#         data_context = results[:8]

#         # ✅ 5️⃣ Call GPT agent
#         response = call_gpt_agent(role_desc, user_query, data_context)

#         # ✅ 6️⃣ Output the final answer
#         print("\n🤖 AI:", response)


# # 6) Button callbacks
# def on_sample_click(b):
#     question = b.description
#     display_response(question)

# def on_submit_click(b):
#     q = text_widget.value.strip()
#     if q:
#         display_response(q)
#     text_widget.value = ""

# for btn in sample_buttons:
#     btn.on_click(on_sample_click)
# submit_btn.on_click(on_submit_click)

# # 7) Initial welcome message
# with conversation_output:
#     print("👋 Welcome to the Nutritional RAG Chatbot!")
#     print("Ask me anything about nutrition, or click one of the sample questions below.\n")

# # 8) Lay out and display
# ui = widgets.VBox([
#     conversation_output,
#     widgets.HTML("<b>Sample questions:</b>"),
#     sample_buttons_box,
#     widgets.HTML("<b>Your question:</b>"),
#     user_input_box
# ])
# display(ui)
