LoopHealth is an intelligent voice-enabled hospital network assistant powered by RAG (Retrieval-Augmented Generation) technology. Users can interact naturally through voice to find hospitals, get information about medical facilities, and receive spoken responses in real-time.
- Voice-First Interface - Natural voice interaction with speech-to-text and text-to-speech
- Intelligent Hospital Search - RAG-based semantic search using a FAISS vector database
- Conversational Memory - Context-aware conversations with session management
- AI-Powered Responses - Google Gemini 2.5 Flash for natural language understanding
- Modern Web UI - Clean, responsive React interface with real-time feedback
- Multi-City Support - Smart disambiguation for hospitals across different cities
- Real-Time Processing - Fast audio processing and response generation
```
LoopHealth/
├── backend/                    # FastAPI backend server
│   ├── main.py                 # Main API endpoints and orchestration
│   ├── rag/                    # RAG implementation
│   │   ├── vector_store.py     # FAISS vector database interface
│   │   ├── retriever.py        # Hospital retrieval logic
│   │   ├── build_index.py      # Index building utilities
│   │   ├── hospitals.faiss     # Pre-built FAISS index
│   │   └── hospitals_meta.pkl  # Hospital metadata
│   ├── data/                   # Source data
│   │   └── List of GIPSA Hospitals - Sheet1.csv
│   └── requirements.txt        # Python dependencies
│
└── frontend/                   # React frontend application
    ├── src/
    │   ├── App.jsx             # Main application component
    │   ├── App.css             # Styling
    │   └── index.js            # Entry point
    ├── public/                 # Static assets
    └── package.json            # Node dependencies
```
- FastAPI - High-performance async web framework
- Faster Whisper - Offline speech-to-text (CTranslate2 reimplementation of OpenAI Whisper)
- Google Gemini 2.5 Flash - Large language model for natural responses
- gTTS - Google Text-to-Speech for voice synthesis
- FAISS - Facebook AI Similarity Search for vector database
- Sentence Transformers - Semantic embeddings (all-MiniLM-L6-v2)
- Python 3.8+ - Core programming language
- React 19.2 - Modern UI library
- React Icons - Icon components
- TailwindCSS - Utility-first CSS framework
- Web Audio API - Audio recording and playback
- Python 3.8+ installed
- Node.js 16+ and npm installed
- Google Gemini API Key (available from Google AI Studio)
```bash
git clone https://github.com/yourusername/LoopHealth.git
cd LoopHealth
```

```bash
cd backend

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Create .env file
echo "GEMINI_API_KEY=your_api_key_here" > .env
```

Important: Replace `your_api_key_here` with your actual Google Gemini API key.
```bash
cd ../frontend

# Install dependencies
npm install
```

```bash
cd backend

# Activate virtual environment if not already active
venv\Scripts\activate        # Windows
# source venv/bin/activate   # macOS/Linux

# Run FastAPI server
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

The backend API will be available at `http://localhost:8000`.
```bash
cd frontend

# Run React development server
npm start
```

The frontend will automatically open at `http://localhost:3000`.
- Open the Application - Navigate to `http://localhost:3000` in your browser
- Listen to Introduction - Loop AI will automatically greet you
- Click the Microphone - Press the microphone button to start recording
- Ask Your Question - Speak naturally, e.g., "Find me hospitals in Mumbai"
- Receive Response - Loop AI will respond with relevant information via voice and text
- Follow-up Questions - Continue the conversation with context-aware follow-ups
- "Show me hospitals in Bangalore"
- "Which hospitals are near Andheri?"
- "Tell me about Apollo Hospital"
- "What's the address of the hospital you mentioned?"
`GET /introduction`

Returns the Loop AI introduction audio and text.

Response:

```json
{
  "text_response": "Hello, I am Loop AI...",
  "audio_base64": "base64_encoded_audio",
  "audio_format": "mp3"
}
```

`POST /voice`

Main voice interaction endpoint.

Request:

- Headers: `X-Session-ID` (optional) - Session identifier for conversation continuity
- Body: `multipart/form-data` with audio file

Response:

```json
{
  "text_response": "I found 3 hospitals in Mumbai...",
  "audio_base64": "base64_encoded_audio",
  "audio_format": "mp3",
  "session_id": "uuid-session-id"
}
```
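A client can decode the `audio_base64` field of the `/voice` response before playback; the payload values in this sketch are placeholders, not real API output:

```python
import base64

# Hypothetical /voice response payload, shaped like the JSON response;
# "...mp3 bytes..." stands in for real MP3 data.
response = {
    "text_response": "I found 3 hospitals in Mumbai...",
    "audio_base64": base64.b64encode(b"...mp3 bytes...").decode("ascii"),
    "audio_format": "mp3",
    "session_id": "uuid-session-id",
}

# Decode the base64 field back into raw MP3 bytes before playing or saving it
audio_bytes = base64.b64decode(response["audio_base64"])
```

The frontend performs the equivalent decode in JavaScript before handing the bytes to the browser's audio playback.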
- Speech-to-Text (STT)
  - User speaks into microphone
  - Audio recorded as WebM format
  - Faster Whisper transcribes to text
- Retrieval-Augmented Generation (RAG)
  - User query embedded using Sentence Transformers
  - FAISS searches vector database for top-k similar hospitals
  - Retrieved hospital data provides context
- LLM Processing
  - Conversation history retrieved from session memory
  - Google Gemini generates natural language response
  - Context includes: conversation history + retrieved hospitals + user query
- Text-to-Speech (TTS)
  - Response text cleaned of markdown/formatting
  - gTTS converts to MP3 audio
  - Audio encoded as base64 and sent to frontend
- Frontend Playback
  - Audio decoded and played automatically
  - Text displayed in response box
  - Session ID maintained for follow-up questions
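The server-side stages of this flow can be sketched end-to-end. Every function below is a hypothetical stand-in, not the actual backend code: in the real app they would call Faster Whisper, FAISS retrieval, Gemini, and gTTS respectively.

```python
import base64

# Hypothetical stand-ins for the real pipeline components.
def transcribe(audio_bytes):
    """STT stage: Faster Whisper in the real backend."""
    return "find hospitals in Mumbai"

def retrieve_hospitals(query, k=3):
    """RAG stage: embed query, search FAISS for top-k hospitals."""
    return ["Apollo Hospital", "Fortis Hospital", "Lilavati Hospital"][:k]

def generate_response(query, context, history):
    """LLM stage: Gemini with history + retrieved hospitals + query."""
    return f"I found {len(context)} hospitals matching '{query}'."

def synthesize(text):
    """TTS stage: real code would return MP3 bytes from gTTS."""
    return text.encode("utf-8")

def handle_voice_request(audio_bytes, history):
    query = transcribe(audio_bytes)
    hospitals = retrieve_hospitals(query)
    text = generate_response(query, hospitals, history)
    audio = synthesize(text)
    # Audio is base64-encoded so it can travel inside the JSON response
    return {
        "text_response": text,
        "audio_base64": base64.b64encode(audio).decode("ascii"),
        "audio_format": "mp3",
    }

result = handle_voice_request(b"...webm bytes...", history=[])
print(result["text_response"])
```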
- Each conversation has a unique session ID
- Last 5 exchanges (10 messages) stored per session
- Enables context-aware follow-up questions
- Automatic disambiguation for multi-city hospitals
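The session store described above can be approximated with a per-session deque capped at 10 messages (5 exchanges). The names here are illustrative, not the actual backend code:

```python
from collections import defaultdict, deque

MAX_MESSAGES = 10  # last 5 exchanges = 10 messages (user + assistant)

# session_id -> bounded message history; the oldest messages fall off
# automatically once the cap is reached
sessions = defaultdict(lambda: deque(maxlen=MAX_MESSAGES))

def record_exchange(session_id, user_msg, assistant_msg):
    sessions[session_id].append({"role": "user", "content": user_msg})
    sessions[session_id].append({"role": "assistant", "content": assistant_msg})

def history(session_id):
    return list(sessions[session_id])

# After 7 exchanges, only the most recent 5 remain in memory
for i in range(7):
    record_exchange("abc", f"question {i}", f"answer {i}")
print(len(history("abc")))           # -> 10
print(history("abc")[0]["content"])  # -> question 2
```

`deque(maxlen=...)` gives the eviction behaviour for free; a production version would add expiry so abandoned sessions do not accumulate.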
The system uses the GIPSA Hospitals dataset containing:
- Hospital names
- Cities and locations
- Addresses
- Contact information
Data is pre-processed and indexed using FAISS for fast semantic search.
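At toy scale, the nearest-neighbour lookup FAISS performs can be illustrated in plain Python. The 3-dimensional "embeddings" and hospital names below are made up and stand in for the 384-dimensional all-MiniLM-L6-v2 vectors in the real index:

```python
import math

# Toy hospital "embeddings" (real ones are 384-d sentence-transformer vectors)
index = {
    "Apollo Hospital, Mumbai":     [0.9, 0.1, 0.0],
    "Fortis Hospital, Delhi":      [0.1, 0.9, 0.0],
    "Manipal Hospital, Bangalore": [0.0, 0.2, 0.9],
}

def l2(a, b):
    """Euclidean distance, the metric behind FAISS's IndexFlatL2."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search(query_vec, k=2):
    """Return the k hospitals whose vectors are closest to the query."""
    ranked = sorted(index, key=lambda name: l2(index[name], query_vec))
    return ranked[:k]

print(search([0.85, 0.15, 0.05]))  # Apollo Hospital ranks first
```

FAISS does the same ranking over hundreds of thousands of vectors with optimized exact and approximate index structures, which is why the hospital data is embedded and indexed ahead of time by `build_index.py`.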
Edit `backend/.env`:

```
GEMINI_API_KEY=your_gemini_api_key
```

In `backend/main.py`:

- Whisper Model: `base` (can upgrade to `small`, `medium`, `large`)
- LLM Model: `gemini-2.5-flash` (free tier friendly)
- Embedding Model: `all-MiniLM-L6-v2` (384 dimensions)
- Max Conversation History: 5 exchanges

In `frontend/src/App.jsx`:

- Backend URL: `http://localhost:8000`
- Audio Format: WebM for recording, MP3 for playback
```bash
cd backend

# Test API endpoints
curl http://localhost:8000/introduction

# Test voice endpoint (requires audio file)
curl -X POST http://localhost:8000/voice \
  -F "file=@test_audio.webm" \
  -H "X-Session-ID: test-session"
```

```bash
cd frontend
npm test
```

Issue: `GEMINI_API_KEY` not found
- Solution: Ensure a `.env` file exists in the `backend/` directory with a valid API key

Issue: `ModuleNotFoundError`

- Solution: Activate the virtual environment and reinstall dependencies

Issue: FAISS index not found

- Solution: Run `python rag/build_index.py` to rebuild the index

Issue: Cannot connect to backend

- Solution: Ensure the backend is running on port 8000

Issue: Microphone not working

- Solution: Grant browser microphone permissions

Issue: Audio not playing

- Solution: Check the browser console for errors and ensure audio autoplay is allowed
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI Whisper - Speech recognition model
- Google Gemini - Large language model
- Facebook FAISS - Vector similarity search
- Sentence Transformers - Semantic embeddings
- GIPSA - Hospital dataset
For questions or support, please open an issue on GitHub.
Built with ❤️ for better healthcare accessibility