This repository contains a full Retrieval-Augmented Generation (RAG) app with:

- A backend in `backend-func/` (a FastAPI app hosted through Azure Functions, plus direct Uvicorn support for local development).
- A Streamlit frontend in `frontend/` with pages for chat, document management, session history, and settings.
## Architecture

```mermaid
flowchart LR
    UI[Streamlit Frontend] --> API["/api endpoints"]
    API --> RAG[RAG Service]
    RAG --> EMB[Azure OpenAI Embeddings]
    RAG --> DI[Document Intelligence]
    RAG --> SRCH[Azure AI Search]
    RAG --> CHAT[Azure OpenAI Chat]
```
## Repository layout

```text
backend-func/
  app/
    config/            # Settings + credentials loader
    models/            # Pydantic schemas
    routes/            # /api routes
    services/          # OpenAI, Search, RAG services
    utils/             # Prompt templates, file processing, logging
  function_app.py      # Azure Functions v2 ASGI bridge
  host.json
  requirements.txt
frontend/
  app.py               # Chat page
  pages/
    documents.py
    session_history.py
    settings.py
  components/
  utils/
  requirements.txt
  tests/
tests/
  test_config.py       # legacy config test
```
## Document upload flow

```mermaid
sequenceDiagram
    participant U as User
    participant UI as Streamlit Frontend
    participant API as FastAPI
    participant DI as Document Intelligence
    participant EMB as Azure OpenAI Embeddings
    participant SRCH as Azure AI Search
    U->>UI: Upload file (PDF / TXT / DOCX)
    UI->>API: POST /api/documents/upload
    alt PDF file
        API->>DI: Extract text (prebuilt-layout)
        DI-->>API: Extracted text
    else TXT or DOCX
        API->>API: Extract text locally
    end
    API->>API: Chunk text (500-word segments)
    API->>EMB: Generate embeddings (batch)
    EMB-->>API: Embedding vectors
    API->>SRCH: Upload chunk documents
    SRCH-->>API: Indexing result
    API-->>UI: Success + file/chunk counts
    UI-->>U: Upload confirmed
```
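The "Chunk text (500-word segments)" step above can be sketched as plain word-window splitting. This is a minimal illustration only: the actual service also performs section detection, and the function name is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 500) -> list[str]:
    """Split extracted text into fixed-size word chunks.

    A minimal sketch of the 500-word segmentation step; the real
    chunker also detects document sections before splitting.
    """
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]
```

Each chunk is then embedded in batch and indexed as its own search document.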
## Chat flow

```mermaid
sequenceDiagram
    participant U as User
    participant UI as Streamlit Frontend
    participant API as FastAPI
    participant RAG as RAG Service
    participant EMB as Azure OpenAI Embeddings
    participant SRCH as Azure AI Search
    participant CHAT as Azure OpenAI Chat
    U->>UI: Type question + press Enter
    UI->>API: POST /api/chat
    API->>RAG: process_query()
    RAG->>EMB: Embed query
    EMB-->>RAG: Query vector
    RAG->>SRCH: Hybrid search (keyword + vector)
    SRCH-->>RAG: Top-k chunks + relevance scores
    alt Score >= MINIMUM_RELEVANCE_SCORE
        RAG->>CHAT: Strict RAG prompt + context chunks
        CHAT-->>RAG: Answer text
        RAG-->>API: Answer + sources
        API-->>UI: Response (markdown answer + sources footer)
        UI-->>U: Rendered answer with sources
    else No relevant context
        RAG-->>API: No-context response + suggested actions
        API-->>UI: Suggested actions
        UI-->>U: "Please upload relevant documents"
    end
```
## Features

- Hybrid retrieval (keyword + vector) against Azure AI Search.
- Optional semantic query mode in Azure AI Search.
- Strict RAG answering with source footer normalization.
- PDF extraction via Azure Form Recognizer (`prebuilt-layout`), plus TXT and DOCX support.
- Upload chunking with section detection (500-word chunks).
- SSE streaming endpoint (`/api/chat/stream`) for compatibility.
- Streamlit multi-page UI with custom chat rendering and markdown sanitization.
## Prerequisites

- Python 3.10+
- Azure AI Search
- Azure OpenAI (chat + embedding deployments)
- Azure Form Recognizer (required for PDF uploads)
- Azure Functions Core Tools (`func`) for local Functions hosting (optional)
## Installation

From the repository root:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r backend-func/requirements.txt
pip install -r frontend/requirements.txt
pip install pytest
```

## Configuration

The backend currently reads settings from a Python dictionary in `backend-func/app/config/credentials.py`; `backend-func/app/config/config.py` builds the settings object from that dict. There is no `.env` loading in the backend code right now.
Required setting keys:

- `AZURE_SEARCH_SERVICE_ENDPOINT`
- `AZURE_SEARCH_ADMIN_KEY`
- `AZURE_SEARCH_INDEX_NAME`
- `AZURE_OPENAI_ENDPOINT`
- `AZURE_OPENAI_API_KEY`
- `AZURE_OPENAI_DEPLOYMENT_NAME`
- `AZURE_OPENAI_API_VERSION`
- `AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME`

Optional setting keys:

- `AZURE_FORM_RECOGNIZER_ENDPOINT`
- `AZURE_FORM_RECOGNIZER_KEY`
- `EMBEDDING_DIMENSIONS` (default parsing fallback: `1536`)
- `AZURE_SEARCH_USE_SEMANTIC` (`true`/`false`)
- `AZURE_SEARCH_AUTO_CREATE_INDEX` (`true`/`false`)
- `MINIMUM_RELEVANCE_SCORE` (float)
- `ENABLE_STREAMING` (`true`/`false`)
- `ALLOWED_ORIGINS` (comma-separated list or `*`)
Frontend configuration:

- `BACKEND_URL` (loaded from the environment via `python-dotenv`)
- The default in the frontend code is currently `https://ragchatbotbackend.azurewebsites.net`.

For local backend testing, set:

- `BACKEND_URL=http://127.0.0.1:8000` when running Uvicorn.
- `BACKEND_URL=http://127.0.0.1:7071` when running the Azure Functions host.
## Running locally

Option 1 (direct FastAPI via Uvicorn):

```bash
uvicorn app.main:app --reload --app-dir backend-func
```

Option 2 (Azure Functions host):

```bash
cd backend-func
func start
```

Run the Streamlit frontend from the repository root:

```bash
streamlit run frontend/app.py
```

## API endpoints

### Health check

Returns service health and latency fields:
- `status`
- `azure_search`
- `azure_search_latency_ms`
- `azure_openai`
- `azure_openai_latency_ms`
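The `*_latency_ms` fields suggest each dependency check is timed individually. A minimal sketch of such a timing helper, under the assumption that each check is a callable probe; the helper name is hypothetical:

```python
import time
from typing import Callable


def timed_check(check: Callable[[], bool]) -> tuple[str, float]:
    """Run a dependency probe and report its status plus elapsed milliseconds."""
    start = time.perf_counter()
    try:
        ok = check()
    except Exception:
        ok = False  # treat any probe failure as unhealthy
    latency_ms = (time.perf_counter() - start) * 1000
    return ("healthy" if ok else "unhealthy", round(latency_ms, 2))
```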
### `POST /api/chat`

Body:

```json
{
  "message": "Question text",
  "history": [{"role": "user", "content": "Hi"}],
  "top_k": 5,
  "temperature": 0.2,
  "max_tokens": 512
}
```

Response:

```json
{
  "answer": "**Answer** ...",
  "sources": [
    {
      "id": "doc-id",
      "title": "filename.pdf",
      "relevance_score": 0.87,
      "excerpt": "snippet",
      "metadata": {}
    }
  ],
  "has_sufficient_context": true,
  "tokens_used": 123,
  "suggested_actions": null
}
```

### `/api/chat/stream`

Server-sent events stream; falls back to a non-stream answer when needed.
### `POST /api/documents/upload`

- Multipart upload field: `file`
- Supported: `.pdf`, `.txt`, `.docx`
- Max upload size: 50 MB

### Document listing

- Aggregated by uploaded file (`parent_id`) with file and chunk counts.
- Optional query param: `top` (limits the search results path).

### Document deletion

Deletes all chunks matching a file-level `parent_id` when found.
## RAG query pipeline

- Query embedding is generated with Azure OpenAI (with retry).
- Hybrid search is executed in Azure AI Search.
- Results are filtered by `MINIMUM_RELEVANCE_SCORE` and reranked by score.
- If no relevant context remains, the response is a no-context answer with suggested actions.
- When context exists, a strict prompt is built (`STRICT_RAG_SYSTEM_PROMPT`).
- The answer is normalized to markdown and prefixed with `**Answer**`.
- Inline source citations are removed, then one consolidated `Sources:` footer is appended.
- `history` is accepted in request payloads, but the current `/api/chat` route logic does not pass it into generation.
## Frontend pages

- `frontend/app.py` (Chat): chat input, rendered user/assistant bubbles, sanitized markdown, token usage tracked in message metadata.
- `frontend/pages/documents.py`: file upload, file/chunk metrics, a bar chart by chunk count, document deletion.
- `frontend/pages/session_history.py`: shows the current in-session message history and counts.
- `frontend/pages/settings.py`: controls for `temperature`, `max_tokens`, and `top_k`, plus a health check button.
## Tests

Install the test dependency:

```bash
pip install pytest
```

Run the frontend tests:

```bash
python -m pytest frontend/tests -q
```

Run all tests:

```bash
python -m pytest -q
```

Note: `tests/test_config.py` references `backend/` import paths, while this repo uses `backend-func/`. If you run the full suite, update that test or `PYTHONPATH` accordingly.
## Error codes

- `400`: unsupported file type or no extractable text.
- `409`: index configuration conflict (for example, an embedding dimension mismatch with an existing index).
- `413`: uploaded file is too large.
- `502`: Form Recognizer request failure.
- `503`: Azure Search connectivity issue, or missing Form Recognizer config for PDF extraction.
## Security notes

- The current backend configuration uses hardcoded credential values in `backend-func/app/config/credentials.py`.
- Do not commit real secrets to source control.
- Rotate any exposed keys and move them to secure secret storage (for example environment variables, Azure Key Vault, or App Settings) before production use.