# Introduction

In this notebook, we will experiment with how user queries are handled in our veterinary information retrieval system. Several collections have already been set up in the Chroma database, allowing us to directly perform information retrieval without additional setup. This environment enables us to test and refine the process of transforming user input into actionable queries and retrieving relevant information from our knowledge base.

In [2]:
from langchain_experimental.open_clip import OpenCLIPEmbeddings
from langchain_chroma import Chroma

persist_directory = '../chroma/textbook_test_Nutrition'
id_key = "doc_id"

open_clip_embeddings = OpenCLIPEmbeddings(model_name="ViT-g-14", checkpoint="laion2b_s34b_b88k")

# Vectorstore for summaries (for similarity search)
vectorstore = Chroma(
    collection_name="summaries",
    persist_directory=persist_directory,
    embedding_function=open_clip_embeddings
)
# Persistent docstore for originals (all modalities)
docstore = Chroma(
    collection_name="originals",
    persist_directory=persist_directory,
    embedding_function=open_clip_embeddings
)

# Retriever that searches the summaries and looks up each matching original by ID
class UnifiedRetriever:

    def __init__(self, vectorstore, docstore, id_key="doc_id"):
        self.vectorstore = vectorstore
        self.docstore = docstore
        self.id_key = id_key
        # Underlying Chroma collection (private attribute), used for direct ID lookups
        self._collection = docstore._collection

    def retrieve(self, query, k=5):
        """Return the top-k summaries, each paired with its original document."""
        results = self.vectorstore.similarity_search_with_score(query, k=k)
        output = []
        for doc, score in results:
            doc_id = doc.metadata.get(self.id_key)
            try:
                original = self._collection.get(ids=[doc_id], include=["documents", "metadatas"])
                original_doc = original["documents"][0] if original["documents"] else None
                original_meta = original["metadatas"][0] if original["metadatas"] else None
            except Exception:
                # Missing ID or failed lookup: keep the summary, leave the original empty
                original_doc = None
                original_meta = None
            output.append({
                "summary": doc.page_content,
                "original": original_doc,
                "original_metadata": original_meta,
                "summary_metadata": doc.metadata,
                "score": score
            })
        return output

# Instantiate the retriever
retriever = UnifiedRetriever(vectorstore, docstore, id_key=id_key)
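
The two-step lookup performed by `UnifiedRetriever` (search the summaries, then fetch each original by its `doc_id`) can be illustrated without a database. The following is a minimal sketch under stated assumptions: the `summaries` list, the `originals` dict, and the helper names are hypothetical in-memory stand-ins, not the Chroma collections used above.

```python
from typing import Optional

# Hypothetical in-memory stand-ins for the "summaries" and "originals" collections.
summaries = [
    {"page_content": "Summary of feeding guidelines", "metadata": {"doc_id": "a1"}, "score": 0.21},
    {"page_content": "Summary with no matching original", "metadata": {"doc_id": "zz"}, "score": 0.35},
]
originals = {"a1": "Full text of the feeding guidelines chapter."}


def lookup_original(doc_id: Optional[str]) -> Optional[str]:
    """Return the original document for a summary, or None if it is missing."""
    if doc_id is None:
        return None
    return originals.get(doc_id)


def retrieve_sketch(hits, id_key="doc_id"):
    """Pair each summary hit with its original, mirroring the shape of retrieve()."""
    return [
        {
            "summary": hit["page_content"],
            "original": lookup_original(hit["metadata"].get(id_key)),
            "score": hit["score"],
        }
        for hit in hits
    ]


paired = retrieve_sketch(summaries)
for row in paired:
    print(row["score"], row["summary"], "->", row["original"])
```

A summary whose `doc_id` has no counterpart still appears in the result, with `original` set to `None`, which matches how the class above handles a failed lookup.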

# Handling User Input: Image Analysis

Let's use an image of an underweight cat. This cat is considerably skinny, with its bones showing.

![Skinny Cat](./skinny_cat.jpg)

In [3]:
query = "What's going on with my cat? What should I do?" 
image_path = "./skinny_cat.jpg"

import base64
import ollama
import os

# --- Configuration for the image ---
# IMPORTANT: Adjust image_path above if skinny_cat.jpg is in a different location
# Alternative vision models installed via Ollama:
# image_model = "minicpm-v:8b"
# image_model = "llava:7b"

# This instruct variant in q4_K_M quantization fits a MacBook M1 Pro better
image_model = "llama3.2-vision:11b-instruct-q4_K_M"



# --- 1. Generate a textual summary of the image using an LLM ---
print(f" ⏳ Processing image: {image_path}")

image_summary = "Could not generate image summary." # Default in case of error
if not os.path.exists(image_path):
    print(f"Error: Image file not found at {image_path}. Please check the path.")
else:
    try:
        # Read and encode image in base64
        with open(image_path, 'rb') as f:
            image_data = base64.b64encode(f.read()).decode('utf-8')

        # Updated prompt for detailed image summarization
        image_summarization_prompt = """From a feline veterinary standpoint, provide a highly detailed and objective 
                description of the image. Focus on all observable elements, actions, 
                objects, subjects, their attributes (e.g., color, size, texture), 
                their spatial relationships, and any discernible context or implied scene. 
                Also focus on all possible health issues.
                Describe any text present in the image. This description must be exhaustive 
                and purely factual, capturing every significant visual detail to serve as a 
                comprehensive textual representation for further analysis by another AI model. 
                If the image is entirely irrelevant or contains no discernible subject, 
                state "No relevant visual information."."""

        # Send image to ollama for vision model processing
        response = ollama.chat(
            model=image_model,
            messages=[
                {
                    'role': 'user',
                    'content': image_summarization_prompt,
                    'images': [image_data]
                }
            ]
        )
        image_summary = response['message']['content']
        print("--- Generated Image Summary ---")
        print(image_summary)

    except Exception as e:
        print(f"Error processing image with Ollama: {e}")

# This 'image_summary' can now be used along with your user's text query
# for retrieval or further processing in your RAG pipeline.

 ⏳ Processing image: ./skinny_cat.jpg
--- Generated Image Summary ---
The image depicts a pale yellow cat with a thin, emaciated appearance, standing on a tiled floor next to a light green bowl. The cat's fur is sparse and lacks luster, and its eyes appear sunken, suggesting a possible health issue. Its collar is black and features a small bell.

The cat is positioned on a floor composed of small, square tiles in a reddish-brown color, which may be a ceramic material. The overall atmosphere of the image suggests that the cat is in a domestic setting, possibly in a home or an animal shelter. The presence of the bowl and the cat's emaciated state raise concerns about the cat's health and well-being.

The image may be intended to raise awareness about the importance of proper care and nutrition for pets, particularly in situations where they may be neglected or abandoned. It could also be used to promote the adoption of cats in need of a loving home. Overall, the image conveys a sense of 
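
Before the summary is passed on, it may be worth checking whether the vision model actually produced something usable. The sketch below assumes the two fallback strings from the cell above (the error default and the "No relevant visual information." sentinel requested in the prompt); the helper name `usable_image_summary` is hypothetical.

```python
from typing import Optional

# Both strings are taken from the cell above.
NO_INFO_SENTINEL = "No relevant visual information."
ERROR_DEFAULT = "Could not generate image summary."


def usable_image_summary(summary: str) -> Optional[str]:
    """Return the stripped summary if it carries visual context, otherwise None."""
    text = summary.strip()
    if not text or text == ERROR_DEFAULT:
        return None
    # Case-insensitive match, tolerant of a missing trailing period.
    if NO_INFO_SENTINEL.lower().rstrip(".") in text.lower():
        return None
    return text


print(usable_image_summary("  The image depicts a thin cat. "))
print(usable_image_summary("No relevant visual information."))
```

When the helper returns `None`, the later steps could fall back to the text query alone rather than feeding a placeholder string into the prompts.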

# Handling User Input: Refine Query

In [4]:
from langchain_ollama import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Define the LLM for query refinement (use the same model as your RAG chain if appropriate)
# Alternative: a quantized, distilled Qwen model whose responses often include CoT in <think></think> tags
# query_refinement_model = ChatOllama(model="deepseek-r1:7b-qwen-distill-q8_0")

# Smaller model that fits an M1 Pro and uses less RAM; no CoT output
query_refinement_model = ChatOllama(model="llama3.2:3b")

# Prompt for query refinement
query_refinement_prompt = ChatPromptTemplate.from_template(
    """You are an intelligent assistant. Your task is to rephrase and expand the given user query \
into a more detailed and context-rich query that can be used to retrieve relevant information \
from a veterinary knowledge base. Use the provided image description to add visual context \
and relevant keywords to the refined query. Focus on adding relevant keywords, clarifying intent, \
and anticipating related information that might be helpful. The output should be a single, refined query.

Original query: {original_query}
Image description: {image_summary}"""
)

# Create the query refinement chain.
# The prompt reads "original_query" and "image_summary" directly from the dict passed to invoke().
query_refinement_chain = (
    query_refinement_prompt
    | query_refinement_model
    | StrOutputParser()
)

# --- Demonstration of query refinement ---

# print(f"Original user query: {query}")
# print(f"Image Summary: {image_summary}")

refined_query = query_refinement_chain.invoke(
    {"original_query": query, "image_summary": image_summary}
)

print("-"*80)
print(f"Refined query: {refined_query}")


--------------------------------------------------------------------------------
Refined query: Here's a refined query that incorporates relevant keywords, clarifies intent, and anticipates related information:

"An elderly, pale yellow domestic shorthair cat with a thin, emaciated appearance stands on a ceramic tiled floor in a home setting. The cat's fur is sparse, lacks luster, and its eyes appear sunken. It wears a black collar with a small bell and appears to be proximate to a light green, possibly empty, food bowl. Given the cat's health concerns, I'd like to know:

1. Potential causes of malnutrition or medical conditions in cats, such as diabetes, hyperthyroidism, or gastrointestinal issues.
2. Signs and symptoms of these conditions, including any visible health issues or changes in appetite, water intake, or stool quality.
3. Recommended dietary adjustments for a cat with an unknown feeding schedule, including potential food types, quantities, and frequency.
4. Suggestions for
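
The refined query above opens with a conversational preamble ("Here's a refined query that ...") before the query itself. A minimal sketch of one way to trim it before retrieval follows; the heuristic (drop a first line that ends with a colon, then strip surrounding quotes) and the name `strip_preamble` are assumptions, not part of the pipeline above.

```python
def strip_preamble(text: str) -> str:
    """Drop a leading 'Here is ...:' style line and surrounding double quotes."""
    lines = text.strip().splitlines()
    # Only treat the first line as a preamble when more text follows it.
    if len(lines) > 1 and lines[0].rstrip().endswith(":"):
        lines = lines[1:]
    return "\n".join(lines).strip().strip('"').strip()


example = 'Here\'s a refined query:\n\n"Why is my cat thin?"'
print(strip_preamble(example))
```

A stricter alternative would be to instruct the model to return only the query, but small models do not always comply, so a post-processing step like this can act as a safety net.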

# Query Decomposition

In [5]:
from langchain_core.output_parsers import JsonOutputParser
# Prompt for query decomposition
query_decomposition_prompt = ChatPromptTemplate.from_template(
    """You are an intelligent assistant. Your task is to break down the given complex query
into a list of simpler, focused sub-queries. Each sub-query should be a standalone question
that can be used to retrieve specific information from a veterinary knowledge base.

Output ONLY a valid JSON array of strings, and nothing else. Do not include any explanations, markdown, or extra text.

Complex query: {refined_query}
"""
)

# Create the query decomposition chain
query_decomposition_chain = (
    query_decomposition_prompt  
    | query_refinement_model    
    | JsonOutputParser() 
)

# --- Demonstration of query decomposition ---

print(f"Original refined query: {refined_query[:300]} ....")

decomposed_queries = query_decomposition_chain.invoke({"refined_query": refined_query})

print("-" * 80)
# print(f"Decomposed queries:\n{decomposed_queries}")

print(f"There are {len(decomposed_queries)} queries after decomposition \n")
print(f"Here's an example of the first one: {decomposed_queries[0]}")


Original refined query: Here's a refined query that incorporates relevant keywords, clarifies intent, and anticipates related information:

"An elderly, pale yellow domestic shorthair cat with a thin, emaciated appearance stands on a ceramic tiled floor in a home setting. The cat's fur is sparse, lacks luster, and its eyes ....
--------------------------------------------------------------------------------
There are 11 queries after decomposition 

Here's an example of the first one: What are common causes of malnutrition in cats?
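
The decomposition prompt asks for a bare JSON array, but a small model may still wrap it in extra text or a code fence. The following is a hedged sketch of a tolerant parser that could be used when strict parsing fails; `parse_sub_queries` is a hypothetical helper built only on the standard library.

```python
import json
import re


def parse_sub_queries(raw: str) -> list:
    """Extract a list of non-empty strings from model output; return [] on failure."""
    # Grab everything from the first '[' to the last ']' in the output.
    match = re.search(r"\[.*\]", raw, flags=re.DOTALL)
    if match is None:
        return []
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    # Keep only non-empty strings, trimmed of surrounding whitespace.
    return [item.strip() for item in data if isinstance(item, str) and item.strip()]


print(parse_sub_queries('Sure! ["What causes weight loss in cats?", "Is my cat dehydrated?"]'))
```

Returning an empty list instead of raising lets the caller fall back to the undecomposed refined query.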


# Contextual Retrievals 

In [6]:
# Assume decomposed_queries is a list of query strings
# and retriever is already instantiated

seen_doc_ids = set()
all_results = []

# Use a separate loop variable so the user's original `query` is not overwritten
for sub_query in decomposed_queries:
    results = retriever.retrieve(sub_query, k=5)
    unique_results = []
    for res in results:
        doc_id = res.get('doc_id') or res.get('summary_metadata', {}).get('doc_id')
        if doc_id and doc_id not in seen_doc_ids:
            seen_doc_ids.add(doc_id)
            unique_results.append(res)
    if unique_results:
        all_results.append({
            "query": sub_query,
            "results": unique_results
        })

# Summarize all_results
total_unique_docs = sum(len(entry['results']) for entry in all_results)
total_queries_with_results = len(all_results)
all_doc_ids = set()
for entry in all_results:
    for res in entry['results']:
        doc_id = res.get('doc_id') or res.get('summary_metadata', {}).get('doc_id')
        if doc_id:
            all_doc_ids.add(doc_id)

print(f"Total unique documents retrieved: {len(all_doc_ids)}")
print(f"Total queries with at least one unique result: {total_queries_with_results}")
print("Number of unique documents retrieved per query:")
for entry in all_results:
    print(f"  Query: {entry['query'][:60]}... -> {len(entry['results'])} unique docs")



Total unique documents retrieved: 15
Total queries with at least one unique result: 8
Number of unique documents retrieved per query:
  Query: What are common causes of malnutrition in cats?... -> 5 unique docs
  Query: What are the signs and symptoms of diabetes in cats?... -> 2 unique docs
  Query: How does hyperthyroidism affect a cat's appetite and water i... -> 1 unique docs
  Query: What is mental stimulation for a domestic shorthair cat?... -> 1 unique docs
  Query: Why has the cat's fur become sparse and lackluster?... -> 1 unique docs
  Query: How often should I take my neglected cat to the vet for heal... -> 2 unique docs
  Query: What diagnostic tests are necessary for malnutrition or medi... -> 2 unique docs
  Query: How can I improve a domestic shorthair cat's overall health ... -> 1 unique docs
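
Only 8 of the 11 sub-queries appear in the summary because a sub-query is dropped when every document it retrieves was already claimed by an earlier one. The sketch below reproduces that first-come deduplication on hypothetical data; the `fake` results and the function name are illustrative assumptions.

```python
def dedupe_across_queries(results_by_query, id_key="doc_id"):
    """Keep each document only under the first sub-query that retrieved it."""
    seen = set()
    grouped = []
    for sub_query, results in results_by_query:
        fresh = []
        for res in results:
            doc_id = res.get("summary_metadata", {}).get(id_key)
            if doc_id and doc_id not in seen:
                seen.add(doc_id)
                fresh.append(res)
        # Sub-queries that contribute nothing new are omitted entirely.
        if fresh:
            grouped.append({"query": sub_query, "results": fresh})
    return grouped, seen


# Hypothetical retrieval results: (sub-query, hits) pairs.
fake = [
    ("What causes weight loss in cats?",
     [{"summary_metadata": {"doc_id": "d1"}}, {"summary_metadata": {"doc_id": "d2"}}]),
    ("What are signs of diabetes in cats?",
     [{"summary_metadata": {"doc_id": "d2"}}, {"summary_metadata": {"doc_id": "d3"}}]),
    ("How often should a cat see a vet?",
     [{"summary_metadata": {"doc_id": "d1"}}]),
]

grouped, seen = dedupe_across_queries(fake)
for entry in grouped:
    print(entry["query"], "->", len(entry["results"]), "unique docs")
```

One consequence of this design is that the per-query counts depend on the order of the sub-queries: earlier ones claim shared documents first.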


In [None]:
all_results[0]

# Directly Answering Query with Retrieved Info

In [7]:
# Combine all summaries from all_results into one context
all_context = "\n".join(
    res['summary']
    for entry in all_results
    for res in entry['results']
)


final_answer_prompt = ChatPromptTemplate.from_template(
    """You are a helpful veterinary assistant. Use the provided context to answer the 
    user's question as thoroughly and concisely as possible. Follow these steps
    when answering the user.
    1. Describe the main issue that you observed from the image summary.
    2. What could be the cause of the main issue?
    3. How can this main issue be solved?

    User's question: {query}

    Image Summary: {image_summary}

    Context:
    {context}

    Answer:"""
    )

final_answer_chain = (
    final_answer_prompt
    | ChatOllama(model="llama3.2:3b")  
    | StrOutputParser()
)

final_answer = final_answer_chain.invoke({
    "query": query,
    "image_summary": image_summary,
    "context": all_context
})

print("Final Answer:")
print(final_answer)

Final Answer:
1. The main issue that I observed from the image summary is the emaciated appearance of the domestic shorthair cat, suggesting malnutrition or a medical condition.

2. The possible cause of this main issue could be a lack of proper nutrition, dehydration, or an underlying medical condition such as kidney disease or diabetes, which may require customized diets.

3. To solve this issue, it is recommended to:
* Consult with a veterinarian to determine the underlying cause of the cat's emaciated appearance and develop a suitable treatment plan.
* Switch to a high-quality, commercially available diet that meets the cat's nutritional needs, especially if the cat has specific health issues or allergies.
* Monitor the cat's weight and adjust food intake accordingly to prevent overfeeding or underfeeding.
* Ensure access to fresh water at all times to prevent dehydration.
* Provide regular veterinary check-ups to monitor the cat's health and make any necessary adjustments to its d
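
The `all_context` string above joins every retrieved summary with no upper bound on length. The following is a minimal sketch of a bounded alternative, assuming a character budget is acceptable as a rough proxy for prompt size; `build_context`, `max_chars`, and the `[n]` numbering are hypothetical additions.

```python
def build_context(all_results, max_chars=4000):
    """Join summaries in retrieval order, stopping before the budget is exceeded."""
    parts = []
    used = 0
    for entry in all_results:
        for res in entry["results"]:
            block = f"[{len(parts) + 1}] {res['summary'].strip()}"
            # Count one extra character for the newline that joins the blocks.
            cost = len(block) + (1 if parts else 0)
            if used + cost > max_chars:
                return "\n".join(parts)
            parts.append(block)
            used += cost
    return "\n".join(parts)


sample = [
    {"query": "q1", "results": [{"summary": "alpha"}, {"summary": "beta"}]},
    {"query": "q2", "results": [{"summary": "gamma"}]},
]
print(build_context(sample))
```

Numbering the blocks also makes it easier to see which summaries the model drew on when reading its answer.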

# Answering the Query with Tool Calling and Chain of Thought

Previously, we used the llama3.2:3b model to answer the query, giving it the image summary and the retrieved information. However, the answer can be incomplete or miss the target. Here we will try a reasoning model that outputs a chain of thought (CoT), which may address this issue. For example:
1. deepseek-r1 (7b, 8b)
2. qwen3 (4b, 8b)

Have the model think of a plan for answering the query that helps the user understand the cause, possible underlying issues, recommended next steps, and so on. If the context is not enough, fetch information from Wikipedia or use Tavily. Let it pause and think: what information gap is there? Does it need more information from the owner? For example: "Are the vaccines up to date?", "Is the cat eating and drinking normally?", "Is the cat urinating and defecating normally?"
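
The information-gap check described above can be sketched as a simple checklist. This is an illustration only: the topic keys, the `OWNER_QUESTIONS` mapping, and `missing_info` are hypothetical names, and the three questions are the ones listed in the paragraph above.

```python
# Follow-up questions keyed by a hypothetical topic label.
OWNER_QUESTIONS = {
    "vaccination": "Are the cat's vaccines up to date?",
    "appetite": "Is the cat eating and drinking normally?",
    "elimination": "Is the cat urinating and defecating normally?",
}


def missing_info(known_facts: dict) -> list:
    """Return the follow-up questions whose topics the owner has not covered yet."""
    return [question for topic, question in OWNER_QUESTIONS.items() if topic not in known_facts]


# Example: the owner has only described the cat's appetite so far.
for question in missing_info({"appetite": "eating less than usual"}):
    print(question)
```

In an agent setting, a reasoning model would decide these gaps itself; a fixed checklist like this is a simpler baseline to compare against.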


# Tool Calling Agent Example
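
The chat loop in the cell below ends the conversation as soon as the agent's reply contains no `?`. That heuristic can keep the loop alive on a rhetorical question and has no upper bound on turns. A slightly stricter variant is sketched here under stated assumptions: `should_continue` and `max_turns` are hypothetical and are not used by the cell below.

```python
def should_continue(agent_response: str, turn: int, max_turns: int = 5) -> bool:
    """Continue only if the reply ends with a question and the turn cap is not reached."""
    if turn >= max_turns:
        return False
    return agent_response.rstrip().endswith("?")


print(should_continue("Is your cat eating normally?", turn=1))
print(should_continue("See a vet. Why? Because weight loss is serious.", turn=1))
```

Checking only the end of the reply assumes the agent places its clarifying question last, which may not always hold.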

In [13]:
# --- New Cell for Conversational Agent ---

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage
from langchain_ollama import ChatOllama
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.tools import tool

# --- 1. Tools (same as before) ---
@tool
def search_vet_knowledge_base(query: str) -> str:
    """Searches the veterinary knowledge base for specific information."""
    print(f"--- Calling Vet Knowledge Base with query: {query} ---")
    results = retriever.retrieve(query, k=3)
    if not results:
        return "No information found in the knowledge base for this query."
    return "\n".join([f"Summary: {res['summary']}" for res in results])

@tool
def web_search(query: str) -> str:
    """Searches the web for general information."""
    print(f"--- Calling Web Search with query: {query} ---")
    return "Web search is not implemented. Tell the user you couldn't find external information."

# The `ask_user_for_info` tool is now implicitly handled by the agent's ability to ask questions.
# We will rely on the agent's main output to be the question for the user.

tools = [search_vet_knowledge_base, web_search]

# --- 2. A Better Agent Prompt for Chat ---
# Use a capable reasoning model
# agent_llm = ChatOllama(model="deepseek-coder:6.7b", temperature=0)
agent_llm = ChatOllama(model="llama3.2:3b", temperature=0)


# This new prompt structure is designed for conversation
agent_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are an expert veterinary assistant. Your goal is to provide a comprehensive answer to the user's query about their cat.

- **Analyze the user's message and the conversation history** to understand the situation.
- **Formulate a plan:** First, decide if you need more information from the user to proceed. If so, ask a clarifying question.
- **Gather information:** Use your tools to find the necessary information *before* answering.
- **Synthesize your answer:** Once you have enough information, provide a clear, actionable final answer. Explain potential issues, what to do next, and what to watch for.
- If you have gathered enough information and can provide a final answer, do so without asking more questions."""),
    # The chat history will be inserted here
    MessagesPlaceholder(variable_name="chat_history"),
    # The user's current message
    ("human", "{input}"),
    # This is where the agent's "thinking" (tool calls) will go
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

# --- 3. Create the Conversational Agent ---
agent = create_tool_calling_agent(agent_llm, tools, agent_prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# --- 4. The Chat Loop ---
chat_history = []
# Combine the initial query and image summary into the first message
initial_input = f"""My User Query: "{query}"

Here is a summary of the image I provided:
{image_summary}
"""

print("--- Starting conversation with Vet Assistant Agent ---")
print("Agent: Hello! I've reviewed your query and the image summary. I'll do my best to help.")

while True:
    response = agent_executor.invoke({
        "input": initial_input,
        "chat_history": chat_history
    })

    agent_response = response['output']
    print(f"\nAgent: {agent_response}")

    # Add the interaction to our history
    chat_history.append(HumanMessage(content=initial_input))
    chat_history.append(AIMessage(content=agent_response))

    # Heuristic to decide if the conversation is over.
    # We'll assume if the agent isn't asking a question, it has provided a final answer.
    if '?' not in agent_response:
        print("\n--- Conversation Finished ---")
        break

    # Get the user's next message from the input prompt
    initial_input = input("\nYour response: ")

--- Starting conversation with Vet Assistant Agent ---
Agent: Hello! I've reviewed your query and the image summary. I'll do my best to help.


> Entering new AgentExecutor chain...

Invoking: `search_vet_knowledge_base` with `{'query': 'improving domestic shorthair cat health and well-being nutrition care'}`


--- Calling Vet Knowledge Base with query: improving domestic shorthair cat health and well-being nutrition care ---
Summary: Switching diets may be necessary for cats with health issues. Accustoming your cat to a new food can help reduce stress during the transition.\nSummary: Here's a concise summary:

* A cat's nutritional needs vary depending on body type, activity level, coat, metabolism, and individual factors.
* Each cat is unique, so monitoring food intake and adjusting based on activity level and metabolic rate is crucial.
* Cats require more calories for non-maintenance activities, such as pregnancy and lactation.
* Consider offeri