πŸ₯ LoopHealth - Voice-Enabled Hospital Network Assistant


LoopHealth is an intelligent voice-enabled hospital network assistant powered by RAG (Retrieval-Augmented Generation) technology. Users can interact naturally through voice to find hospitals, get information about medical facilities, and receive spoken responses in real-time.

## ✨ Features

- 🎤 **Voice-First Interface** - natural voice interaction with speech-to-text and text-to-speech
- 🔍 **Intelligent Hospital Search** - RAG-based semantic search using a FAISS vector database
- 💬 **Conversational Memory** - context-aware conversations with session management
- 🤖 **AI-Powered Responses** - Google Gemini 2.5 Flash for natural language understanding
- 🌐 **Modern Web UI** - clean, responsive React interface with real-time feedback
- 🔄 **Multi-City Support** - smart disambiguation for hospitals across different cities
- ⚡ **Real-Time Processing** - fast audio processing and response generation

πŸ—οΈ Architecture

```text
LoopHealth/
├── backend/                  # FastAPI backend server
│   ├── main.py               # Main API endpoints and orchestration
│   ├── rag/                  # RAG implementation
│   │   ├── vector_store.py   # FAISS vector database interface
│   │   ├── retriever.py      # Hospital retrieval logic
│   │   ├── build_index.py    # Index-building utilities
│   │   ├── hospitals.faiss   # Pre-built FAISS index
│   │   └── hospitals_meta.pkl # Hospital metadata
│   ├── data/                 # Source data
│   │   └── List of GIPSA Hospitals - Sheet1.csv
│   └── requirements.txt      # Python dependencies
│
└── frontend/                 # React frontend application
    ├── src/
    │   ├── App.jsx           # Main application component
    │   ├── App.css           # Styling
    │   └── index.js          # Entry point
    ├── public/               # Static assets
    └── package.json          # Node dependencies
```

πŸ› οΈ Technology Stack

### Backend

- **FastAPI** - high-performance async web framework
- **Faster Whisper** - offline speech-to-text (a faster reimplementation of OpenAI's Whisper)
- **Google Gemini 2.5 Flash** - large language model for natural responses
- **gTTS** - Google Text-to-Speech for voice synthesis
- **FAISS** - Facebook AI Similarity Search, used as the vector database
- **Sentence Transformers** - semantic embeddings (`all-MiniLM-L6-v2`)
- **Python 3.8+** - core programming language

### Frontend

- **React 19.2** - modern UI library
- **React Icons** - icon components
- **TailwindCSS** - utility-first CSS framework
- **Web Audio API** - audio recording and playback

## 📋 Prerequisites

- Python 3.8+ installed
- Node.js 16+ and npm installed
- Google Gemini API key (available from Google AI Studio)

## 🚀 Installation & Setup

### 1. Clone the Repository

```bash
git clone https://github.com/yourusername/LoopHealth.git
cd LoopHealth
```

### 2. Backend Setup

```bash
cd backend

# Create a virtual environment
python -m venv venv

# Activate the virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Create the .env file
echo "GEMINI_API_KEY=your_api_key_here" > .env
```

> **Important:** Replace `your_api_key_here` with your actual Google Gemini API key.

### 3. Frontend Setup

```bash
cd ../frontend

# Install dependencies
npm install
```

## ▶️ Running the Application

### Start the Backend Server

```bash
cd backend
# Activate the virtual environment if it is not already active
venv\Scripts\activate          # Windows
# source venv/bin/activate     # macOS/Linux

# Run the FastAPI server
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

The backend API will be available at http://localhost:8000.

### Start the Frontend Development Server

```bash
cd frontend

# Run the React development server
npm start
```

The frontend will open automatically at http://localhost:3000.

## 📖 Usage

1. **Open the application** - navigate to http://localhost:3000 in your browser
2. **Listen to the introduction** - Loop AI greets you automatically
3. **Click the microphone** - press the microphone button to start recording
4. **Ask your question** - speak naturally, e.g., "Find me hospitals in Mumbai"
5. **Receive the response** - Loop AI replies with relevant information as voice and text
6. **Ask follow-ups** - continue the conversation with context-aware follow-up questions

### Example Queries

- "Show me hospitals in Bangalore"
- "Which hospitals are near Andheri?"
- "Tell me about Apollo Hospital"
- "What's the address of the hospital you mentioned?"

## 🔌 API Endpoints

### GET /introduction

Returns the Loop AI introduction as audio and text.

**Response:**

```json
{
  "text_response": "Hello, I am Loop AI...",
  "audio_base64": "base64_encoded_audio",
  "audio_format": "mp3"
}
```
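A client consuming this endpoint has to base64-decode `audio_base64` before it can play the MP3. A minimal stdlib-only sketch with a made-up payload (the project's actual frontend does the equivalent in JavaScript):

```python
import base64

# Hypothetical payload mimicking the /introduction response schema above
response = {
    "text_response": "Hello, I am Loop AI...",
    "audio_base64": base64.b64encode(b"\xff\xf3fake-mp3-bytes").decode("ascii"),
    "audio_format": "mp3",
}

# Decode the base64 string back into raw MP3 bytes, ready for playback
audio_bytes = base64.b64decode(response["audio_base64"])
```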

### POST /voice

The main voice-interaction endpoint.

**Request:**

- **Headers:** `X-Session-ID` (optional) - session identifier for conversation continuity
- **Body:** `multipart/form-data` with the audio file

**Response:**

```json
{
  "text_response": "I found 3 hospitals in Mumbai...",
  "audio_base64": "base64_encoded_audio",
  "audio_format": "mp3",
  "session_id": "uuid-session-id"
}
```

## 🧠 How It Works

### Voice Processing Pipeline

1. **Speech-to-Text (STT)**
   - The user speaks into the microphone
   - Audio is recorded in WebM format
   - Faster Whisper transcribes it to text
2. **Retrieval-Augmented Generation (RAG)**
   - The user query is embedded using Sentence Transformers
   - FAISS searches the vector database for the top-k most similar hospitals
   - The retrieved hospital records provide context
3. **LLM Processing**
   - Conversation history is retrieved from session memory
   - Google Gemini generates a natural-language response
   - The prompt combines the conversation history, retrieved hospitals, and user query
4. **Text-to-Speech (TTS)**
   - The response text is stripped of markdown formatting
   - gTTS converts it to MP3 audio
   - The audio is base64-encoded and sent to the frontend
5. **Frontend Playback**
   - The audio is decoded and played automatically
   - The text is displayed in the response box
   - The session ID is kept for follow-up questions
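Step 4's markdown cleanup can be sketched with a couple of regular expressions; the patterns below are illustrative stand-ins, not the backend's actual cleaning rules:

```python
import re

def clean_for_tts(text: str) -> str:
    """Strip common markdown so the TTS engine doesn't read symbols aloud."""
    text = re.sub(r"[*_`#]+", "", text)                   # bold/italic/code/heading markers
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)  # [label](url) -> label
    return re.sub(r"\s+", " ", text).strip()              # collapse whitespace

print(clean_for_tts("**Apollo Hospital** is in [Mumbai](https://example.com)."))
# -> Apollo Hospital is in Mumbai.
```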

### Session Management

- Each conversation has a unique session ID
- The last 5 exchanges (10 messages) are stored per session
- This enables context-aware follow-up questions
- Hospitals present in multiple cities are disambiguated automatically
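A five-exchange window like the one described above can be kept with a bounded deque. This is a hypothetical stdlib-only sketch, not the project's actual session store (`sessions` and `remember` are made-up names):

```python
import uuid
from collections import deque

# In-memory session store: session_id -> bounded message history.
# maxlen=10 keeps the last 5 exchanges (5 user + 5 assistant messages).
sessions = {}

def remember(session_id, role, content):
    """Append a message, creating a session ID and evicting old turns as needed."""
    if not session_id:
        session_id = str(uuid.uuid4())
    history = sessions.setdefault(session_id, deque(maxlen=10))
    history.append({"role": role, "content": content})
    return session_id

sid = remember(None, "user", "Find me hospitals in Mumbai")
remember(sid, "assistant", "I found 3 hospitals in Mumbai...")
```

Because the deque is bounded, older turns fall out automatically once the window is full, which keeps prompts to the LLM short.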

πŸ—‚οΈ Data

The system uses the GIPSA Hospitals dataset containing:

  • Hospital names
  • Cities and locations
  • Addresses
  • Contact information

Data is pre-processed and indexed using FAISS for fast semantic search.
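FAISS and the MiniLM embeddings do the heavy lifting in the real system; to show the retrieval idea itself, here is a dependency-free sketch that ranks toy 3-dimensional "hospital embeddings" by cosine similarity (all names and vectors below are made up, and the real index stores 384-dimensional vectors):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy embeddings standing in for the FAISS index contents
hospitals = {
    "Apollo Hospital, Mumbai":    [0.9, 0.1, 0.0],
    "Fortis Hospital, Bangalore": [0.1, 0.9, 0.0],
    "AIIMS, Delhi":               [0.0, 0.1, 0.9],
}

query_vec = [0.8, 0.2, 0.1]  # pretend embedding of "hospitals in Mumbai"
top_k = sorted(hospitals, key=lambda n: cosine(query_vec, hospitals[n]), reverse=True)[:2]
```

The production code replaces the linear scan with a FAISS index lookup, which scales to much larger datasets.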

## 🔧 Configuration

### Backend Configuration

Edit `backend/.env`:

```env
GEMINI_API_KEY=your_gemini_api_key
```

### Model Configuration

In `backend/main.py`:

- **Whisper model:** `base` (can be upgraded to `small`, `medium`, or `large`)
- **LLM model:** `gemini-2.5-flash` (free-tier friendly)
- **Embedding model:** `all-MiniLM-L6-v2` (384 dimensions)
- **Max conversation history:** 5 exchanges

### Frontend Configuration

In `frontend/src/App.jsx`:

- **Backend URL:** `http://localhost:8000`
- **Audio format:** WebM for recording, MP3 for playback

## 🧪 Testing

### Backend Testing

```bash
cd backend
# Test the introduction endpoint
curl http://localhost:8000/introduction

# Test the voice endpoint (requires an audio file)
curl -X POST http://localhost:8000/voice \
  -F "file=@test_audio.webm" \
  -H "X-Session-ID: test-session"
```

### Frontend Testing

```bash
cd frontend
npm test
```

## 🚧 Troubleshooting

### Backend Issues

**Issue:** `GEMINI_API_KEY not found`

- **Solution:** ensure a `.env` file with a valid API key exists in the `backend/` directory

**Issue:** `ModuleNotFoundError`

- **Solution:** activate the virtual environment and reinstall the dependencies

**Issue:** FAISS index not found

- **Solution:** run `python rag/build_index.py` to rebuild the index

### Frontend Issues

**Issue:** Cannot connect to the backend

- **Solution:** ensure the backend is running on port 8000

**Issue:** Microphone not working

- **Solution:** grant the browser microphone permission

**Issue:** Audio not playing

- **Solution:** check the browser console for errors and make sure audio autoplay is allowed

## 🤝 Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push the branch (`git push origin feature/AmazingFeature`)
5. Open a pull request

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • OpenAI Whisper - Speech recognition model
  • Google Gemini - Large language model
  • Facebook FAISS - Vector similarity search
  • Sentence Transformers - Semantic embeddings
  • GIPSA - Hospital dataset

## 📞 Contact

For questions or support, please open an issue on GitHub.

---

*Built with ❤️ for better healthcare accessibility*
