Streamlit Multimodal RAG

This Streamlit application implements a multimodal Question Answering (QA) system using the LangChain library.

Key Features

Interactive Streamlit UI for uploading files, building the database, and asking questions
Accepts input files in PDF, audio (WAV, MP3, Opus), and text formats
Transcribes audio to text using HuggingFace Distil-Whisper models
Audio transcription runs in close to real time on CPU
Models load in the background, which takes time; watch the running indicator in the top-right corner
Requires a HuggingFace API key
The Docker container exposes port 8001; open the UI in a browser at localhost:8001
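
The feature list above mentions three kinds of input (PDF, audio, and text). As a minimal sketch of how an upload could be routed to the right loader, the function below classifies a file by its extension. The function name `classify_upload`, the returned labels, and the use of `.txt` for text files are assumptions made for illustration; they are not taken from `streamlit_app.py`.

```python
from pathlib import Path

# Extensions taken from the feature list above; ".txt" for plain text
# is an assumption, as are the label names returned below.
PDF_EXTENSIONS = {".pdf"}
AUDIO_EXTENSIONS = {".wav", ".mp3", ".opus"}
TEXT_EXTENSIONS = {".txt"}


def classify_upload(filename: str) -> str:
    """Return 'pdf', 'audio', or 'text' for a supported file name.

    Raises ValueError for any other extension so the caller can
    show an error instead of silently ignoring the file.
    """
    suffix = Path(filename).suffix.lower()
    if suffix in PDF_EXTENSIONS:
        return "pdf"
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in TEXT_EXTENSIONS:
        return "text"
    raise ValueError(f"Unsupported file type: {filename!r}")
```

With this sketch, audio files would go to the speech-to-text step first, while PDF and text files could be passed straight to text extraction.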

Flowchart

Models Used

Sentence Embeddings: sentence-transformers/all-mpnet-base-v2
Note: input text longer than 384 word pieces is truncated.
Speech-to-Text (STT): distil-whisper/distil-medium.en
LLM: declare-lab/flan-alpaca-large
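
Because the embedding model truncates input beyond 384 word pieces, long documents are usually split into smaller chunks before embedding. The sketch below shows one simple way to do that with overlapping word windows. It counts whitespace-separated words, which is only a rough proxy for word pieces (one word can map to several word pieces), so the default of 200 words is a conservative assumption rather than a value from this project. The function name `split_into_chunks` and its parameters are hypothetical.

```python
def split_into_chunks(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into overlapping chunks of at most max_words words.

    Words are whitespace-separated tokens, not model word pieces, so
    max_words should stay well below the 384 word-piece limit.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")

    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

For an exact limit, the chunk size could instead be measured with the embedding model's own tokenizer; the word-based version here avoids that dependency at the cost of precision.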

Installation

docker build -t streamlit-app .
docker run -p 8001:8001 --rm streamlit-app

Once the container is running, open the GUI in a browser at localhost:8001.
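
Since the models take a while to load, it can be handy to check from a script whether the UI is already answering. The following is a minimal sketch using only the Python standard library; the helper name `wait_for_ui` and its timeout values are assumptions, and a successful response only shows that the web server is up, not that every model has finished loading.

```python
import http.client
import time
import urllib.request


def wait_for_ui(url: str = "http://localhost:8001",
                timeout: float = 120.0,
                interval: float = 2.0) -> bool:
    """Poll url until it answers with HTTP 200 or timeout seconds pass."""
    # The UI is local, so bypass any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener.open(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            pass  # server not reachable yet; retry until the deadline
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    print("UI reachable" if wait_for_ui() else "UI not reachable")
```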

Note

Please be aware that this is only a Proof of Concept system and may contain bugs or unfinished features.
