## **LangChain RAG**

In this notebook, we're building a Retrieval-Augmented Generation (RAG) system using LangChain to power ALMA — an AI assistant specialized in health and wellness. ALMA combines a GPT-4 language model with a Pinecone vector store to retrieve relevant knowledge and respond with context-aware, emotionally intelligent answers. The system also includes tools like a health calculator and a second vector index for matching helpful YouTube video clips. This setup creates a dynamic, human-like assistant that can both inform and guide users through natural conversation.



##### Imports

In [7]:
# Built-in
import os, time, tempfile, asyncio, re

# Third-party
import pygame, speech_recognition as sr
from dotenv import load_dotenv, find_dotenv
from pydub import AudioSegment
from pinecone import Pinecone
import edge_tts

# LangChain
from langchain.agents import Tool, initialize_agent, AgentType
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.exceptions import OutputParserException
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore

#### Load API keys
We load the API keys from the environment and add ffmpeg to the PATH.

In [None]:
# Load environment 
load_dotenv(find_dotenv())
PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# Configure ffmpeg 
FFMPEG_PATH = os.getenv("FFMPEG_PATH", r"C:\ffmpeg\...\bin")
os.environ["PATH"] += os.pathsep + FFMPEG_PATH
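A missing key tends to surface later as an opaque authentication error, so it can help to fail fast right after loading the environment. The sketch below is one way to do that. `require_env` is a hypothetical helper (not part of python-dotenv or this notebook), and the demo values are placeholders.

```python
import os

def require_env(names, env=None):
    """Return {name: value} for the requested variables; raise if any is unset or empty."""
    env = os.environ if env is None else env
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in names}

# Demo with an explicit mapping, so the check can be tried without real keys.
demo_env = {"PINECONE_API_KEY": "pc-demo", "OPENAI_API_KEY": "sk-demo"}
keys = require_env(["PINECONE_API_KEY", "OPENAI_API_KEY"], env=demo_env)
```

In the notebook itself, the call would simply omit `env` so that the real process environment is checked.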

### LangChain + Pinecone brain setup for ALMA
We connect to Pinecone and select the "alma-index" index, which holds the general health knowledge extracted from the transcripts. Then we initialize OpenAI's embedding model, which turns text into vectors for similarity search. The vectorstore wraps that index, and the retriever lets ALMA fetch the 4 most relevant context chunks for each question. A second video_vectorstore connects to "alma-video-index", which contains timestamped YouTube clips, so ALMA can suggest helpful videos. Finally, llm is ALMA's core language model: GPT-4, which generates warm, personalized answers.

In [None]:
# LangChain Setup 
pc = Pinecone(api_key=PINECONE_API_KEY)
index = pc.Index("alma-index")
embeddings = OpenAIEmbeddings(api_key=OPENAI_API_KEY)

vectorstore = PineconeVectorStore(index_name="alma-index", embedding=embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

video_vectorstore = PineconeVectorStore(index_name="alma-video-index", embedding=embeddings)
video_retriever = video_vectorstore.as_retriever(search_kwargs={"k": 2})

llm = ChatOpenAI(model="gpt-4", api_key=OPENAI_API_KEY)
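To make "the top 4 most relevant chunks" concrete, here is a toy version of what a top-k similarity search does, with no Pinecone or OpenAI calls. It assumes the index ranks by cosine similarity (the notebook does not state the metric), and the three-dimensional vectors and chunk texts are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunks, k=4):
    """Score every (text, vector) pair against the query and keep the k best."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Hypothetical embeddings; real ones have many more dimensions.
toy_chunks = [
    ("Keep a consistent bedtime.", [0.9, 0.1, 0.0]),
    ("Leafy greens are rich in magnesium.", [0.1, 0.9, 0.0]),
    ("Morning light anchors the body clock.", [0.8, 0.2, 0.1]),
]
query = [1.0, 0.0, 0.0]
results = top_k(query, toy_chunks, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

The real retriever does the same ranking server-side over the stored embeddings; `search_kwargs={"k": 4}` only sets how many of the best matches come back.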


#### Set up the tool
We give ALMA a calculator tool for quick health computations such as BMI, TDEE, and macros.

In [None]:
def alma_calculator(input_text: str) -> str:

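    def parse_kv(pairs):
        # Lazy value match that stops at the next "key=" or at the end of the input,
        # so "weight=70 height=1.75" splits cleanly and "activity=very active" stays whole.
        return dict(re.findall(r'(\w+)=(.+?)(?=\s+\w+=|$)', pairs))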

    input_text = input_text.lower().strip()

    try:
        # BMI = Body Mass Index
        if input_text.startswith("bmi"):
            args = parse_kv(input_text)
            weight = float(args["weight"])
            height = float(args["height"])
            bmi = weight / (height ** 2)
            return f"Your BMI is approximately {bmi:.2f}."
        
        # TDEE = Total Daily Energy Expenditure
        elif input_text.startswith("tdee"):
            args = parse_kv(input_text)
            weight = float(args["weight"])
            height = float(args["height"]) * 100  # convert to cm
            age = int(args["age"])
            gender = args["gender"]
            activity = args["activity"]

            # Basal Metabolic Rate (weight in kg, height in cm)
            if gender == "male":
                bmr = 10 * weight + 6.25 * height - 5 * age + 5
            else:
                bmr = 10 * weight + 6.25 * height - 5 * age - 161

            activity_levels = {
                "sedentary": 1.2,
                "light": 1.375,
                "moderate": 1.55,
                "active": 1.725,
                "very active": 1.9,
            }

            multiplier = activity_levels.get(activity, 1.55)
            tdee = bmr * multiplier
            return f"Your estimated TDEE is about {tdee:.0f} calories per day based on a {activity} lifestyle."
        
        # Macronutrient Split (carbs, protein, fat)
        elif input_text.startswith("macros"):
            args = parse_kv(input_text)
            calories = float(args["calories"])
            carbs_pct = float(args["carbs"])
            protein_pct = float(args["protein"])
            fat_pct = float(args["fat"])

            carbs_g = (calories * carbs_pct / 100) / 4
            protein_g = (calories * protein_pct / 100) / 4
            fat_g = (calories * fat_pct / 100) / 9

            return (f"Macro breakdown:\n"
                    f"→ Carbs: {carbs_g:.1f}g\n"
                    f"→ Protein: {protein_g:.1f}g\n"
                    f"→ Fat: {fat_g:.1f}g")

        else:
            return "Sorry, I didn’t recognize that calculation type. Try starting with 'bmi', 'tdee', or 'macros'."

    except Exception as e:
        return f"⚠️ There was an error processing your request: {e}"
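As a sanity check on the formulas, the arithmetic can be worked by hand for the inputs used in the tool description (70 kg, 1.75 m, age 30, female, moderate activity, 2000 kcal split 50/25/25). The snippet below is a standalone sketch that repeats the same expressions rather than calling `alma_calculator`. The BMR coefficients are those of the Mifflin-St Jeor equation, and the variable names are illustrative.

```python
weight_kg, height_m, age = 70.0, 1.75, 30

# BMI: weight divided by height squared (height in meters)
bmi = weight_kg / height_m ** 2

# BMR for a female user: 10*w + 6.25*h - 5*a - 161, with h in centimeters
height_cm = height_m * 100
bmr = 10 * weight_kg + 6.25 * height_cm - 5 * age - 161

# TDEE: BMR scaled by the "moderate" activity multiplier
tdee = bmr * 1.55

# Macros: carbs and protein at 4 kcal/g, fat at 9 kcal/g
calories = 2000
carbs_g = calories * 50 / 100 / 4
protein_g = calories * 25 / 100 / 4
fat_g = calories * 25 / 100 / 9

print(f"BMI {bmi:.2f}, BMR {bmr:.2f}, TDEE {tdee:.0f}")
print(f"Carbs {carbs_g:.1f} g, protein {protein_g:.1f} g, fat {fat_g:.1f} g")
```

Note that the tool expects height in meters for both `bmi` and `tdee`; passing centimeters would produce a BMI that is off by a factor of 10,000.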
    

#### Let's bring ALMA to life
In this part we define ALMA's personality with GPT-4 and a tailored prompt. We register the calculator as a LangChain tool and create a LangChain agent that decides whether to call it or respond on its own. We then give ALMA a voice, so that after the welcome message the user can choose to talk or type. ALMA can also suggest a video for a question, but only when the clip is relevant, so its similarity score must clear a minimum threshold (0.85 in the code).
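The video gate can be sketched in isolation. The version below assumes search results arrive as `(metadata, score)` pairs sorted best-first and that a higher score means a closer match (as with a cosine-metric index). `pick_video` and the sample metadata are hypothetical stand-ins, not the notebook's own function.

```python
def pick_video(results, shown_ids, min_similarity=0.85):
    """Return the top hit's metadata, or None if it is too weak or already shown."""
    if not results:
        return None
    metadata, score = results[0]
    if score < min_similarity:
        return None  # not similar enough to be worth suggesting
    if metadata.get("video_id") in shown_ids:
        return None  # avoid repeating a clip within one session
    return metadata

shown = set()
hits = [({"video_id": "abc", "video_title": "Sleep basics"}, 0.91)]

first = pick_video(hits, shown)       # strong match, not seen yet
shown.add(first["video_id"])          # caller records it once it is actually offered
second = pick_video(hits, shown)      # same clip again
weak = pick_video([({"video_id": "xyz"}, 0.84)], shown)  # just under the threshold
```

Recording the ID in the caller rather than inside `pick_video` keeps a clip eligible if the language model later decides not to mention it.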

After this comes the main loop, the engine of ALMA’s interaction. Every time you ask a question:
- ALMA listens (or waits for typed input)
- If it’s a calculator-type question → uses the Agent (with alma_calculator)
- Otherwise → retrieves context documents, builds a personalized prompt, and sends it to GPT-4
- If a highly relevant video matches the query, GPT-4 decides whether to suggest it (based on its content and tags)
- ALMA prints the answer (and speaks it, if in voice mode)
- Everything is logged to chat_history
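The routing and context-building steps above can be sketched as three small pure functions, which makes them easy to try without any API calls. This is a simplified sketch: the helper names are hypothetical, and the keyword list is a subset of the one in the loop (it leaves out the arithmetic symbols, since "-" and "/" also occur in ordinary text).

```python
CALC_KEYWORDS = ["bmi", "calculate", "how many", "how much", "percent"]

def route(user_input):
    """Send calculator-style questions to the agent, everything else to retrieval."""
    text = user_input.lower()
    return "agent" if any(kw in text for kw in CALC_KEYWORDS) else "rag"

def build_context(chunks, max_chars=3000):
    """Join retrieved chunks in order, stopping before the character budget is exceeded."""
    parts, total = [], 0
    for chunk in chunks:
        text = chunk.strip()
        if total + len(text) > max_chars:
            break
        parts.append(text)
        total += len(text)
    return "\n\n".join(parts)

def with_history(history, user_input, turns=3):
    """Prefix the question with the last few (question, answer) turns, if any."""
    if not history:
        return user_input
    lines = [f"User: {q}\nALMA: {a}" for q, a in history[-turns:]]
    return "Conversation so far:\n" + "\n".join(lines) + f"\nUser: {user_input}"

print(route("Can you calculate my BMI?"))
print(route("How can I fall asleep faster?"))
print(with_history([("q1", "a1")], "q2"))
```

Keyword routing is cheap but coarse: a phrase like "how much sleep do I need" would also be sent to the agent, so the keyword list is worth tuning against real questions.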

In [None]:
# Prompt Template 
prompt = ChatPromptTemplate.from_template("""
You are ALMA — a warm, intelligent, and caring AI assistant specialized in health, nutrition, sleep, mental wellness, and healthy aging.

You speak to the user like a thoughtful health coach: clear, knowledgeable, emotionally intelligent, and supportive.

When responding:
- If the user input is a **question**, begin with one of these:
    - "You’ve brought up something really meaningful."
    - "I'm really glad you asked that."
    - "That’s such an important question."
   
- If the user input is a **statement** or expression of emotion, begin with one of these:
    - "Let’s take a closer look together."
    - "Here’s something that might help you."
    - "Thanks for sharing that with me."
- If you're unsure, just start naturally without forcing a phrase.

Your priority is to **give a clear, grounded, and caring answer first** — with insights the user can act on.

If you genuinely have more helpful information to offer, end your response with a warm, relevant follow-up question — and explain why you're asking it. For example:
- “That might help us understand your patterns better.”
- “This could help guide the next step.”
- “It’s often helpful to reflect on that when building new habits.”

Answer using only the information in the provided context. Do not rely on outside knowledge. 
If you cannot find the answer in the context, say "I’m not sure based on what I know, but I can help you explore something else."

If you find a video in the context that directly supports or enhances your answer, it's highly encouraged to offer it to the user.

Say this:
> “Would you like to watch a video that explains this further? I found one that seems really helpful.”

Only offer a video if it clearly reinforces your answer or adds value.
---
{context}
---

Question:
{question}
""")

calc_tool = Tool(
    name="HealthCalculator",
    func=alma_calculator,
    description=(
        "Use this to calculate BMI, TDEE, or macronutrient distribution.\n"
        "Examples:\n"
        "- bmi weight=70 height=1.75\n"
        "- tdee weight=70 height=1.75 age=30 gender=female activity=moderate\n"
        "- macros calories=2000 carbs=50 protein=25 fat=25"
    )
)

tools = [calc_tool]

agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=False,
)

# Voice Output (edge-tts)
async def speak_alma_edge(text):
    # Reserve a temp file path for the synthesized MP3
    with tempfile.NamedTemporaryFile(delete=False, suffix=".mp3") as temp_file:
        temp_path = temp_file.name

    # Use a nice voice with slightly faster speed
    communicate = edge_tts.Communicate(text=text, voice="en-US-JennyNeural", rate="+10%")
    await communicate.save(temp_path)

    # Wait until file is fully written
    timeout = 7
    start = time.time()
    while not os.path.exists(temp_path):
        if time.time() - start > timeout:
            print("⚠️ Timed out waiting for audio file.")
            return
        await asyncio.sleep(0.1)  # yield to the event loop instead of blocking it

    # Play it with pygame
    try:
        pygame.mixer.init()
        pygame.mixer.music.load(temp_path)
        pygame.mixer.music.play()

        while pygame.mixer.music.get_busy():
            await asyncio.sleep(0.3)

        pygame.mixer.music.stop()
        pygame.mixer.quit()
    except Exception as e:
        print(f"⚠️ Audio playback failed: {e}")
    finally:
        try:
            os.remove(temp_path)
        except Exception as e:
            print(f"⚠️ Could not delete audio file: {e}")



# Voice Input 
def listen_to_user():
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("\n🎤 Listening... (say 'exit' to quit)")
        audio = recognizer.listen(source)
    try:
        text = recognizer.recognize_google(audio)
        print(f"\n💬 You said: {text}")
        return text
    except sr.UnknownValueError:
        print("❌ Sorry, I didn't catch that.")
        return ""
    except sr.RequestError:
        print("❌ Speech service error.")
        return ""


# Welcome Message
print("""
🌿 Welcome to ALMA 🌿
Hi, I'm ALMA — your AI companion for better living.

I'm here to help you improve your sleep, nutrition, mood, energy, and long-term well-being — all grounded in science and tailored to your needs.

Whenever you're ready, ask me something you'd like to improve about your health.  
It could be as simple as:
→ "How can I fall asleep faster?"
→ "What foods reduce inflammation?"
→ "Do you want to calculate your bmi?"

Let's talk. 💬
""")

# Voice Mode Choice 
voice_mode = input("🎙️ Would you like to talk to ALMA using your voice? (y/n): ").strip().lower() == "y"
chat_history = []

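# Baseline video lookup without a score threshold. main() defines its own
# thresholded version with the same name, which is the one used inside the loop.
def get_top_video_suggestion(query: str):
    docs = video_retriever.invoke(query)
    if not docs:
        return None

    top_doc = docs[0]
    metadata = top_doc.metadata

    return {
        "title": metadata["video_title"],
        "url": metadata["video_url"],
        "snippet": top_doc.page_content[:300] + "...",
        "tags": metadata.get("tags", []),
        "video_id": metadata.get("video_id")
    }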

# Keep track of shown videos
shown_video_ids = set()

# MAIN LOOP 
async def main():
    while True:
        user_input = listen_to_user().strip() if voice_mode else input("\n\U0001F4AD Your question for ALMA (or type 'exit'): ").strip()

        if user_input.lower() == "exit":
            farewell = "Take care. I'll be here when you need me again."
            print("\U0001F319", farewell)
            if voice_mode:
                await speak_alma_edge(farewell)
            break

        if not user_input:
            continue

        # Build memory-aware context 
        if chat_history:
            full_question = "Conversation so far:\n" + "\n".join(
                [f"User: {q}\nALMA: {a}" for q, a in chat_history[-3:]]
            ) + f"\nUser: {user_input}"
        else:
            full_question = user_input

        docs = retriever.invoke(full_question)
        context_parts, total_chars, max_chars = [], 0, 3000
        for doc in docs:
            text = doc.page_content.strip()
            if total_chars + len(text) <= max_chars:
                context_parts.append(text)
                total_chars += len(text)
            else:
                break

        context = "\n\n".join(context_parts)
        formatted_prompt = prompt.format(context=context, question=user_input)

        # Decision: Use Agent or LLM 
        try:
            if any(kw in user_input.lower() for kw in ["bmi", "calculate", "how many", "how much", "percent", "+", "-", "*", "/"]):
                agent_response = agent.invoke({"input": user_input})
                response = agent_response.get("output")
                if not response:
                    response = "I'm not sure how to calculate that."
                    print("⚠️ Agent returned no output.")
            else:
                response = llm.invoke(formatted_prompt).content
        except OutputParserException:
            response = "I tried to use a tool to answer that, but couldn't parse the response properly. Could you rephrase?"
        except Exception as e:
            response = f"Something went wrong: {e}"

        # Let ALMA decide if she wants to include a video
        def get_top_video_suggestion(query: str, min_similarity: float = 0.85):
            results = video_vectorstore.similarity_search_with_score(query, k=2)
            if not results:
                return None

            top_doc, score = results[0]
            if score < min_similarity:
                print(f"ℹ️ Video similarity ({score:.2f}) below threshold.")
                return None

            metadata = top_doc.metadata
            return {
                "title": metadata["video_title"],
                "url": metadata["video_url"],
                "snippet": top_doc.page_content[:300] + "...",
                "tags": metadata.get("tags", []),
                "video_id": metadata.get("video_id")
            }

        suggestion = get_top_video_suggestion(user_input)

        if suggestion and suggestion["video_id"] not in shown_video_ids:
            video_prompt = f"""
This is the user question: "{user_input}"

This is your response:
{response}

And here is a short video clip that matches it:
Title: {suggestion['title']}
Snippet: "{suggestion['snippet']}"
Tags: {', '.join(suggestion['tags'])}

If the clip supports or enriches your answer, add a short, warm suggestion to watch the video, like:
“Would you like to watch a video that explains this further? I found one that seems really helpful.”

Then show the video link and tags. If the clip isn't relevant, say nothing.
"""
            try:
                video_message = llm.invoke(video_prompt).content
                if video_message.strip():
                    response += "\n\n" + video_message
                    response += f"""\n🎥 **{suggestion['title']}**\n{suggestion['url']}\n_{', '.join(suggestion['tags'])}_"""
                    shown_video_ids.add(suggestion["video_id"])
            except Exception as e:
                print("⚠️ Video suggestion failed:", e)

        print("\n\U0001F4AC ALMA says:\n", response)
        if voice_mode:
            await speak_alma_edge(response)

        chat_history.append((user_input, response))


# Run it 
await main()


🌿 Welcome to ALMA 🌿
Hi, I'm ALMA — your AI companion for better living.

I'm here to help you improve your sleep, nutrition, mood, energy, and long-term well-being — all grounded in science and tailored to your needs.

Whenever you're ready, ask me something you'd like to improve about your health.  
It could be as simple as:
→ "How can I fall asleep faster?"
→ "What foods reduce inflammation?"
→ "Do you want to calculate your bmi?"

Let's talk. 💬

ℹ️ Video similarity (0.84) below threshold.

💬 ALMA says:
 I'm really glad you asked that. Sleep is crucial for our overall health and well-being, and there are several habits and routines that could help improve your sleep. 

First, you can establish a consistent sleep schedule. This means going to bed and waking up at the same time every day. Your body’s circadian rhythm functions on a set loop, aligning itself with sunrise and sunset. Consistency can reinforce your body's sleep-wake cycle.

Second, it's important to create a restful envi