TzurV/streamlit-multimodal-RAG

Streamlit Multimodal RAG

This Streamlit application implements a multimodal Question Answering (QA) system using the LangChain library.

Key Features

  • Interactive Streamlit UI for file uploads, DB build, and QA
  • Accepts input files in PDF, audio (WAV, MP3, opus), and text formats
  • Transcribes audio to text using Hugging Face Distil-Whisper models
  • Audio transcription runs close to real time on CPU
  • Models load in the background, which takes time; watch the running indicator in the top-right corner
  • Requires a Hugging Face API key
  • The Docker container exposes port 8001; open localhost:8001 in a browser to access the UI
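
The accepted input types listed above suggest that each upload is routed by file type before it is indexed. The sketch below is illustrative only and is not taken from this repository: the helper name `classify_upload`, the category labels, and the `.txt` extension are assumptions (the feature list says only "text formats").

```python
from pathlib import Path

# Extension groups follow the feature list; the names here are hypothetical.
PDF_EXTENSIONS = {".pdf"}
AUDIO_EXTENSIONS = {".wav", ".mp3", ".opus"}
TEXT_EXTENSIONS = {".txt"}  # assumption: the README does not name specific text formats


def classify_upload(filename: str) -> str:
    """Return 'pdf', 'audio', or 'text' for a supported upload; raise ValueError otherwise."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match on the extension
    if suffix in PDF_EXTENSIONS:
        return "pdf"
    if suffix in AUDIO_EXTENSIONS:
        return "audio"  # audio would be transcribed to text before indexing
    if suffix in TEXT_EXTENSIONS:
        return "text"
    raise ValueError(f"Unsupported file type: {filename!r}")
```

Rejecting unknown extensions early keeps unsupported files out of the database build step instead of failing later during loading.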

Flowchart

Models Used

Installation

docker build -t streamlit-app .
docker run -p 8001:8001 --rm streamlit-app

Access the GUI in a browser at localhost:8001.
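
To confirm that the container is accepting connections before opening the browser, a small check like the following may help. It is a minimal sketch using only the Python standard library; the function name `ui_is_reachable` is hypothetical, and the default port matches the `docker run` mapping above.

```python
import socket


def ui_is_reachable(host: str = "localhost", port: int = 8001, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

Calling `ui_is_reachable()` with no arguments checks localhost:8001. A successful connection only shows that the port is open; the models may still be loading in the background.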

Note

Please be aware that this is only a Proof of Concept system and may contain bugs or unfinished features.

Resources
