AmithaMahesh/ds-rpc-01
FinSolve RAG Assistant with Role-Based Access Control

An AI-powered chatbot for FinSolve Technologies that gives employees role-based access to company documents using Retrieval-Augmented Generation (RAG).

🎯 Project Overview

This system enables employees across different departments to query company information through a conversational AI interface, with strict role-based access controls ensuring users only see information relevant to their department and clearance level.

🏗️ Architecture

Core Components

  • Backend: FastAPI application with RAG pipeline
  • Frontend: React-based chat interface
  • Vector Store: ChromaDB for document embeddings and retrieval
  • LLM: HuggingFace API for natural language generation
  • Authentication: Role-based access control system
  • Database: MongoDB for user management (if needed for persistence)

Technology Stack

| Component | Technology |
|---|---|
| Backend Framework | FastAPI |
| Frontend Framework | React 18 |
| Vector Database | ChromaDB |
| LLM Provider | HuggingFace |
| Embeddings | sentence-transformers/all-MiniLM-L6-v2 |
| Authentication | Basic Auth with RBAC |
| Styling | Tailwind CSS |
| HTTP Client | Axios |

👥 User Roles & Permissions

| Role | Username | Password | Access Permissions |
|---|---|---|---|
| C-Level Executive | tony_sharma | password123 | 🔓 ALL departments (Finance, Marketing, HR, Engineering, General) |
| Engineering | peter_pandey | engineer123 | 🔧 Engineering documents + General employee handbook |
| Finance | finance_user | finance123 | 💰 Finance documents + General employee handbook |
| Marketing | marketing_user | marketing123 | 📊 Marketing documents + General employee handbook |
| HR | hr_user | hr123 | 👥 HR documents + General employee handbook |
| Employee | employee_user | employee123 | 📋 ONLY General employee handbook |

📁 Document Categories

The system processes and provides access to the following document types:

🔧 Engineering Department

  • engineering_master_doc.md: Complete technical architecture, development processes, technology stack, security frameworks, testing methodologies, and operational guidelines

💰 Finance Department

  • financial_summary.md: Annual financial performance, expense analysis, cash flow data
  • quarterly_financial_report.md: Detailed Q1-Q4 2024 financial results and projections

👥 HR Department

  • hr_data.csv: Employee dataset with demographics, compensation, leave, attendance, and performance data (100 employee records)

📊 Marketing Department

  • marketing_report_2024.md: Annual marketing performance overview
  • marketing_report_q1_2024.md: Q1 marketing campaigns and metrics
  • marketing_report_q2_2024.md: Q2 marketing campaigns and metrics
  • marketing_report_q3_2024.md: Q3 marketing campaigns and metrics
  • market_report_q4_2024.md: Q4 marketing campaigns and metrics

📋 General (All Employees)

  • employee_handbook.md: Company policies, benefits, procedures, code of conduct, and general employee information

🚀 Installation & Setup

Prerequisites

  • Python 3.10+
  • Node.js 16+
  • Yarn package manager

1. Clone and Install Dependencies

# Install Python dependencies
cd /app
pip install -e .

# Install frontend dependencies  
cd frontend
yarn install

2. Environment Configuration

Backend .env file (supply your own HuggingFace API key):

HUGGINGFACE_API_KEY=<your-huggingface-api-key>
MONGO_URL=mongodb://localhost:27017/finsolve_rag
CHROMA_PERSIST_DIRECTORY=./chroma_db
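
A minimal sketch of how the backend might read these three settings at startup, assuming they are exposed as plain environment variables. The `Settings` dataclass and `load_settings` function are hypothetical names used for illustration and are not taken from the repository.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the backend settings shown above."""
    huggingface_api_key: str
    mongo_url: str
    chroma_persist_directory: str


def load_settings(env=os.environ) -> Settings:
    """Read settings from an environment mapping, failing fast if the API key is missing."""
    api_key = env.get("HUGGINGFACE_API_KEY", "")
    if not api_key:
        raise RuntimeError("HUGGINGFACE_API_KEY is not set")
    return Settings(
        huggingface_api_key=api_key,
        # Defaults mirror the values shown in the .env example.
        mongo_url=env.get("MONGO_URL", "mongodb://localhost:27017/finsolve_rag"),
        chroma_persist_directory=env.get("CHROMA_PERSIST_DIRECTORY", "./chroma_db"),
    )
```

Failing at startup when the key is absent surfaces a misconfigured .env file immediately rather than on the first chat request.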

Frontend .env file:

REACT_APP_BACKEND_URL=http://localhost:8000

3. Start the Services

Backend Server:

cd /app
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

Frontend Server:

cd /app/frontend  
yarn start

🧪 Testing

Automated Testing

Run the comprehensive test suite:

cd /app
python backend_test.py

This tests:

  • ✅ Authentication for all user roles
  • ✅ Invalid login attempts
  • ✅ Role-based document access restrictions
  • ✅ Chat functionality with proper RBAC
  • ✅ Cross-role access prevention
  • ✅ API endpoint functionality

Manual Testing Examples

Test Finance User Access:

curl -X POST -u finance_user:finance123 \
  -H "Content-Type: application/json" \
  -d '{"message":"What was our revenue in Q4 2024?"}' \
  http://localhost:8000/api/chat

Test Employee Access Restriction:

curl -X POST -u employee_user:employee123 \
  -H "Content-Type: application/json" \
  -d '{"message":"What was our revenue in Q4 2024?"}' \
  http://localhost:8000/api/chat

💬 Sample Queries by Role

C-Level Executive (Full Access)

  • "What was our total revenue for 2024?"
  • "What is our current technology stack?"
  • "How many employees do we have and what's their performance distribution?"
  • "What were our most successful marketing campaigns?"

Engineering Team

  • "What is our technology stack and architecture?"
  • "How do we handle CI/CD and deployment processes?"
  • "What security frameworks and compliance standards do we follow?"
  • "What are our monitoring and maintenance procedures?"

Finance Team

  • "What was our Q4 2024 revenue and how did it compare to previous quarters?"
  • "What are our main expense categories and cost breakdowns?"
  • "How did our gross margins perform throughout 2024?"
  • "What are our cash flow trends and financial risks?"

Marketing Team

  • "What was our marketing spend in 2024 and how was it allocated?"
  • "Which marketing campaigns performed best in terms of ROI?"
  • "What is our customer acquisition cost and conversion rates?"
  • "How did our brand awareness and engagement metrics change?"

HR Team

  • "How many employees do we have across different departments?"
  • "What is the average performance rating and attendance percentage?"
  • "What are our leave utilization patterns and balance trends?"
  • "What is our salary distribution and compensation structure?"

General Employees

  • "What are the company leave policies and how do I apply?"
  • "What benefits and perquisites are available to employees?"
  • "What are the working hours and attendance requirements?"
  • "How do performance reviews and career development work?"

🔒 Security Features

Role-Based Access Control (RBAC)

  • Strict Permission Enforcement: Users can only access documents from their authorized departments
  • Cross-Role Access Prevention: Finance users cannot see engineering data, and employees cannot see executive information
  • Source Attribution: All responses include clear source document references
  • Authentication Required: All endpoints (except health check) require valid authentication

Data Protection

  • Document Metadata: Each document chunk includes department classification
  • Query Filtering: Vector store searches are filtered by user permissions before retrieval
  • Secure Authentication: Basic auth with encrypted password storage
  • Error Handling: Graceful degradation without information leakage
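
The metadata-based query filtering described above can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: the metadata key `department`, the helper names, and the sample chunks are all hypothetical. The filter dict follows the shape of ChromaDB's `where` metadata filter with the `$in` operator, and `apply_filter` is an in-memory stand-in for the vector store so the example runs without ChromaDB.

```python
from typing import Iterable


def build_department_filter(allowed_departments: Iterable[str]) -> dict:
    """Return a metadata filter matching chunks from any allowed department."""
    departments = sorted(set(allowed_departments))
    if not departments:
        # Deny by default: an empty permission list must never mean "no filter".
        raise PermissionError("user has no authorized departments")
    return {"department": {"$in": departments}}


def apply_filter(chunks: list[dict], where: dict) -> list[dict]:
    """In-memory stand-in for the vector store: keep chunks whose metadata matches."""
    allowed = set(where["department"]["$in"])
    return [c for c in chunks if c["metadata"]["department"] in allowed]


# Made-up chunks for illustration; the source names come from the document list above.
chunks = [
    {"text": "finance chunk", "metadata": {"department": "finance", "source": "quarterly_financial_report.md"}},
    {"text": "handbook chunk", "metadata": {"department": "general", "source": "employee_handbook.md"}},
    {"text": "engineering chunk", "metadata": {"department": "engineering", "source": "engineering_master_doc.md"}},
]

where = build_department_filter(["finance", "general"])
visible = apply_filter(chunks, where)
print([c["metadata"]["source"] for c in visible])
```

With ChromaDB, a dict of this shape would typically be passed as the `where` argument of `collection.query`, so that unauthorized chunks are excluded inside the search itself rather than removed afterwards.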

🏃‍♂️ Usage Guide

Using the Web Interface

  1. Access the Frontend: Navigate to http://localhost:3000

  2. Login: Use any of the provided demo accounts or click the quick login buttons

  3. Chat Interface:

    • Type questions in natural language
    • Get AI-powered responses with source citations
    • Use suggested questions for your role
    • View conversation history with timestamps
  4. Role-Specific Experience:

    • See only relevant sample questions for your role
    • Responses limited to your authorized document access
    • Clear indication of your current role and permissions

Using the API Directly

Health Check:

curl http://localhost:8000/api/health

Login:

curl -X POST -H "Content-Type: application/json" \
  -d '{"username":"tony_sharma","password":"password123"}' \
  http://localhost:8000/api/login

Chat:

curl -X POST -u username:password \
  -H "Content-Type: application/json" \
  -d '{"message":"Your question here"}' \
  http://localhost:8000/api/chat

Get Available Documents:

curl -u username:password http://localhost:8000/api/documents
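
The same chat call can be made from Python using only the standard library. This is a sketch: the helper names are hypothetical, and it assumes the endpoint returns a JSON body, which the curl examples above do not confirm.

```python
import base64
import json
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the HTTP Basic Authorization header value that `curl -u` sends."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_chat_request(base_url: str, username: str, password: str, message: str) -> urllib.request.Request:
    """Prepare a POST to /api/chat with a JSON body, without sending it."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/chat",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": basic_auth_header(username, password),
        },
    )


def ask(base_url: str, username: str, password: str, message: str, timeout: float = 30.0):
    """Send the request and decode the response, assuming it is JSON."""
    request = build_chat_request(base_url, username, password, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `ask("http://localhost:8000", "finance_user", "finance123", "What was our revenue in Q4 2024?")` mirrors the finance curl command from the testing section and requires the backend to be running.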

📊 System Statistics

  • Total Documents Processed: 134 chunks across 10 files
  • Document Types: Markdown (.md) and CSV (.csv) files
  • Vector Embeddings: 384-dimensional embeddings using sentence-transformers
  • Supported Users: 6 different roles with varying permission levels
  • Test Coverage: 28 comprehensive test cases with 100% pass rate

🔍 How It Works

RAG Pipeline

  1. Document Processing:

    • Documents are chunked into 1000-character segments with 200-character overlap
    • Each chunk is embedded using sentence-transformers
    • Metadata includes department classification and source file information
  2. Query Processing:

    • User query is embedded using the same model
    • Vector similarity search finds relevant document chunks
    • The search is restricted by the user's role permissions, so only chunks from authorized departments are returned
  3. Response Generation:

    • Retrieved documents provide context for the LLM
    • HuggingFace API generates natural language responses
    • Response includes source citations and department attributions
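
The chunking step in the pipeline above (1000-character segments with 200-character overlap) can be sketched as a plain sliding window. The function name `chunk_text` is hypothetical, and the splitter in the repository may behave differently, for example by preferring paragraph or sentence boundaries.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks where each chunk repeats the tail of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.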

Permission System

ROLE_PERMISSIONS = {
    "finance": ["finance", "general"],
    "marketing": ["marketing", "general"], 
    "hr": ["hr", "general"],
    "engineering": ["engineering", "general"],
    "c_level": ["finance", "marketing", "hr", "engineering", "general"],
    "employee": ["general"]
}
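
A short sketch of how this mapping could be consulted when deciding whether a role may see a department's documents. The helper names `allowed_departments` and `can_access` are hypothetical; the mapping is copied from above so the example is self-contained.

```python
ROLE_PERMISSIONS = {
    "finance": ["finance", "general"],
    "marketing": ["marketing", "general"],
    "hr": ["hr", "general"],
    "engineering": ["engineering", "general"],
    "c_level": ["finance", "marketing", "hr", "engineering", "general"],
    "employee": ["general"],
}


def allowed_departments(role: str) -> list[str]:
    """Return the departments a role may read; unknown roles get nothing."""
    return list(ROLE_PERMISSIONS.get(role, []))


def can_access(role: str, department: str) -> bool:
    """Deny by default: access requires an explicit entry in ROLE_PERMISSIONS."""
    return department in ROLE_PERMISSIONS.get(role, [])
```

Treating an unknown role as having no permissions, rather than raising or falling back to a default role, keeps a typo in a role name from silently widening access.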

🚧 Future Enhancements

  • Advanced Authentication: Integration with LDAP/Active Directory
  • Audit Logging: Track all user queries and accessed documents
  • Document Versioning: Handle document updates and change tracking
  • Advanced Analytics: Usage patterns and popular query analysis
  • Multi-language Support: Internationalization for global teams
  • Mobile Application: Native mobile apps for on-the-go access

🐛 Troubleshooting

Common Issues

Backend won't start:

  • Check if port 8000 is available
  • Verify all Python dependencies are installed
  • Check logs: tail -f server.log

Frontend won't connect:

  • Ensure backend is running on port 8000
  • Verify REACT_APP_BACKEND_URL in frontend/.env
  • Check if port 3000 is available
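
The port checks above can be scripted. This is a small sketch using only the standard library; `port_is_free` is a hypothetical helper, and it only reports whether something is currently accepting TCP connections on the local machine.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is in use.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    for name, port in (("backend", 8000), ("frontend", 3000)):
        state = "free" if port_is_free(port) else "in use"
        print(f"{name} port {port}: {state}")
```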

Authentication failures:

  • Verify username/password combinations
  • Check network connectivity between frontend and backend
  • Review browser console for CORS errors

No documents found:

  • Verify documents exist in /app/resources/data/
  • Check vector store initialization in backend logs
  • Confirm user has permissions for the queried department

📞 Support

For technical issues or questions:


Built with ❤️ for FinSolve Technologies
Empowering teams with secure, intelligent access to company knowledge
