# 🧠🎙️ Psychology Voice Assistant

Welcome to the **Psychology Voice Assistant** notebook. This tool simulates a session with **Dr. Julian**, a positive psychologist specializing in Transition Resilience.

### Features:
- 🗣️ **Voice Input**: Talk to the AI using your browser microphone.
- 🤖 **Local LLM**: Runs the model through Ollama (locally or on the Colab VM), so LLM inference stays on the machine running the notebook.
- 🔊 **Voice Response**: The AI replies with audio.
- 💾 **Chat History**: Remembers your conversation context.

---

## 1. 🛠️ System Setup & Dependencies
Install the system libraries for audio processing, the Ollama runtime, and the required Python packages.

In [None]:
# Install system dependencies for audio (PortAudio, eSpeak, FFmpeg)
!sudo apt-get install -y portaudio19-dev espeak ffmpeg

# Install Ollama using its official install script
!curl -fsSL https://ollama.com/install.sh | sh

In [None]:
# Install Python libraries
!pip install streamlit speechrecognition pyttsx3 langchain langchain-community langchain-core langchain-ollama pyaudio streamlit_jupyter gTTS pydub

## 2. 📚 Imports & Configuration
Importing libraries and setting up the environment.

In [None]:
import subprocess
import time
import io
import tempfile
from base64 import b64decode

# Web & UI
import streamlit as st
from streamlit_jupyter import StreamlitPatcher

# Audio Processing
import speech_recognition as sr
from gtts import gTTS
from pydub import AudioSegment
from IPython.display import Audio, display, Javascript
from google.colab import output

# AI & LangChain
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.prompts import PromptTemplate
from langchain_ollama import OllamaLLM

# Patch Streamlit to work in Jupyter/Colab
StreamlitPatcher().jupyter()

In [None]:
# --- Configuration ---
MODEL_NAME = "gpt-oss:20b"  # Options: 'llama3', 'phi', 'gpt-oss:20b'
print(f"Selected Model: {MODEL_NAME}")

## 3. 🚀 Ollama Server Initialization
We need to start the Ollama server in the background and pull the requested model.

In [None]:
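import os

# Select the GPU (if needed) before launching the server so the Ollama process
# inherits it. A `!export ...` line runs in a throwaway subshell and has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

print("Starting Ollama server...")

# Start `ollama serve` as a background process. Output is discarded because an
# unread PIPE can fill up and block the server.
process = subprocess.Popen(
    ["ollama", "serve"],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)
time.sleep(5)  # Give it a moment to initialize

print(f"Pulling model '{MODEL_NAME}' (this may take a few minutes)...")
!ollama pull {MODEL_NAME}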

In [None]:
# Verify installed models
!ollama list
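
The fixed `time.sleep(5)` above is a guess at how long the server needs to start. A minimal sketch of polling for readiness instead, assuming Ollama listens on its default address `http://localhost:11434` and answers a plain GET on the root path once it is up. `ollama_is_up` and `wait_until_ready` are hypothetical helper names, not part of Ollama or LangChain:

```python
import time
import urllib.error
import urllib.request


def ollama_is_up(url="http://localhost:11434"):
    """Return True if an HTTP GET on `url` succeeds (assumed default Ollama address)."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure: treat all as "not ready yet".
        return False


def wait_until_ready(probe, timeout=30.0, interval=0.5):
    """Call `probe()` every `interval` seconds until it returns True or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_until_ready(ollama_is_up)` right after the server process is launched would return as soon as the server responds, and return `False` if it never does within the timeout.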

## 4. 🧩 Core Components: Audio & TTS
Functions to handle browser-based audio recording and Text-to-Speech generation.

In [None]:
class ColabEngine:
    """Minimal stand-in for the pyttsx3 engine interface, backed by gTTS for Colab."""
    def setProperty(self, name, value):
        pass 

    def say(self, text):
        try:
            tts = gTTS(text=text, lang='en')
            tts.save("response.mp3")
            # Play the MP3 inline with IPython.display.Audio
            display(Audio("response.mp3", autoplay=True))
        except Exception as e:
            print(f"Error generating audio: {e}")

    def runAndWait(self):
        pass

engine = ColabEngine()

In [None]:
# Javascript code to capture microphone audio in the browser
RECORD_JS = """
const sleep  = time => new Promise(resolve => setTimeout(resolve, time))
const b2text = blob => new Promise(resolve => {
  const reader = new FileReader()
  reader.onloadend = () => resolve(reader.result)
  reader.readAsDataURL(blob)
})
var record = time => new Promise(async resolve => {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
  const recorder = new MediaRecorder(stream)
  const chunks = []
  recorder.ondataavailable = e => chunks.push(e.data)
  recorder.onstop = async () => {
    // Release the microphone once recording has stopped
    stream.getTracks().forEach(track => track.stop())
    const blob = new Blob(chunks)
    resolve(await b2text(blob))
  }
  recorder.start()
  await sleep(time)
  recorder.stop()
})
"""

def record_audio(sec=5):
    """
    Injects JS to record audio for `sec` seconds.
    Returns: Path to the temporary WAV file.
    """
    display(Javascript(RECORD_JS))
    print(f"🎙️ Recording for {sec} seconds...")
    s = output.eval_js('record(%d)' % (sec*1000))
    print("✅ Recording finished.")
    b = b64decode(s.split(',')[1])

    # Convert webm to wav
    audio = AudioSegment.from_file(io.BytesIO(b))
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
        audio.export(f.name, format="wav")
        return f.name
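
`record_audio` receives the recording as a data URL (the result of `readAsDataURL`) and keeps only the Base64 part after the first comma. A minimal sketch of that decoding step in isolation. `decode_data_url` is a hypothetical helper, and the sample payload and its MIME type are made up for illustration:

```python
from base64 import b64decode, b64encode


def decode_data_url(data_url):
    """Split a `data:<mime>;base64,<payload>` string into (mime_type, raw_bytes)."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a Base64-encoded data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, b64decode(payload)


# Build a small stand-in for what the browser would return.
sample = "data:audio/webm;base64," + b64encode(b"fake-audio-bytes").decode("ascii")
mime, raw = decode_data_url(sample)
print(mime, len(raw))
```

Validating the header before decoding makes a malformed return value fail with a clear error rather than an `IndexError` from `split(',')[1]`.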

In [None]:
recognizer = sr.Recognizer()

def listen_browser():
    """Record audio in the browser and transcribe it with the Google Web Speech API."""
    try:
        audio_file = record_audio(sec=5)
        with sr.AudioFile(audio_file) as source:
            audio_data = recognizer.record(source)
            query = recognizer.recognize_google(audio_data)
            st.write(f"**You:** {query}")
            return query.lower()
    except sr.UnknownValueError:
        st.warning("😕 Sorry, I couldn't understand. Please speak clearly.")
        return ""
    except Exception as e:
        st.error(f"⚠️ Error: {e}")
        return ""

## 5. 🧠 AI Agent Setup (LangChain + Ollama)
Defining the persona "Dr. Julian" and initializing the conversation chain.

In [None]:
# Initialize LLM
llm = OllamaLLM(model=MODEL_NAME)

# Initialize Chat History in Session State
if "chat_history" not in st.session_state:
    st.session_state.chat_history = ChatMessageHistory()

# Persona Template
template = """
### ROLE
You are Dr. Julian, a warm, empathetic, and unwavering Positive Psychologist specializing in "Transition Resilience." 
Your goal is to help people navigating the stress of relocation to see the experience as a profound opportunity for personal growth and adventure.

### GUIDELINES
1. VALIDATE: Acknowledge the difficulty of the move (homesickness, fatigue, confusion).
2. REFRAME: Always shift the narrative toward "The New Chapter" and "Discovery."
3. TONE: Use words like "Courageous," "Growth," "Potential," and "Roots."
4. STYLE: Keep responses conversational, narrative, and grounded in psychological strength.
5. FORMATTING: Use Markdown. Do NOT use tables.

### CONTEXT
History:
{chat_history}

User Input:
{question}

### DR. JULIAN'S RESPONSE:
"""

prompt = PromptTemplate(
    input_variables=["chat_history", "question"],
    template=template
)

In [None]:
def run_chain(question):
    """Execute the LLM chain and update history."""
    # Format history as text
    chat_history_text = "\n".join([
        f"{msg.type.capitalize()}: {msg.content}" for msg in st.session_state.chat_history.messages
    ])
    
    # Invoke LLM
    response = llm.invoke(prompt.format(chat_history=chat_history_text, question=question))
    
    # Save messages
    st.session_state.chat_history.add_user_message(question)
    st.session_state.chat_history.add_ai_message(response)
    return response
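
`run_chain` renders the entire history into the prompt, so the prompt grows with every turn. A minimal sketch of the same `Type: content` formatting with a cap on the number of messages. `Msg` is a hypothetical stand-in for LangChain message objects (only the `type` and `content` attributes used above are assumed), and `max_messages` is an illustrative parameter:

```python
from collections import namedtuple

# Hypothetical stand-in exposing the two attributes that run_chain reads.
Msg = namedtuple("Msg", ["type", "content"])


def format_history(messages, max_messages=6):
    """Render the most recent messages as 'Type: content' lines."""
    if max_messages <= 0:
        return ""
    recent = messages[-max_messages:]
    return "\n".join(f"{m.type.capitalize()}: {m.content}" for m in recent)


history = [
    Msg("human", "I just moved and feel lost."),
    Msg("ai", "That takes courage."),
    Msg("human", "I miss my friends."),
]
print(format_history(history, max_messages=2))
```

Keeping only the latest messages trades long-range recall for a bounded prompt size; the right cap depends on the model's context window.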

## 6. ▶️ Main Application
Run the cell below to start one voice exchange; re-run it for each new turn.

In [None]:
st.title("🧠🎙️ Dr. Julian: AI Voice Assistant")
st.markdown("Run this cell, then speak when the recording prompt appears.")

# Start Interaction
user_input = listen_browser()

if user_input:
    # Generate Response
    with st.spinner("Dr. Julian is thinking..."):
        ai_response = run_chain(user_input)
    
    # Display Response
    st.markdown(f"**Dr. Julian:** {ai_response}")
    
    # Speak Response
    engine.say(ai_response)
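
The persona prompt asks for Markdown, while `engine.say` passes the reply text straight to gTTS, so markup characters may end up in the spoken text. A minimal sketch of stripping common Markdown before speech. `strip_markdown` is a hypothetical helper and covers only a few simple patterns, not the full Markdown grammar:

```python
import re


def strip_markdown(text):
    """Remove a few common Markdown markers so the text reads naturally aloud."""
    # [label](url) -> label
    text = re.sub(r"\[([^\]]+)\]\([^)]+\)", r"\1", text)
    # Leading heading markers such as "## "
    text = re.sub(r"^[ \t]{0,3}#{1,6}[ \t]+", "", text, flags=re.M)
    # Leading bullet markers such as "- " or "* "
    text = re.sub(r"^[ \t]*[-*+][ \t]+", "", text, flags=re.M)
    # Emphasis and inline-code marks; underscores inside words are kept
    text = re.sub(r"\*+|`+|(?<!\w)_+|_+(?!\w)", "", text)
    return text.strip()


reply = "## New Chapter\n- **Courageous** step\n- see [roots](http://example.com)"
print(strip_markdown(reply))
```

One way to use it would be `engine.say(strip_markdown(ai_response))`, while `st.markdown` continues to display the original formatted reply.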