Skip to content

4shivv/DocLens

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

9 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

🧾 DocuLens - AI-Powered Tax Document Analysis

Transform tax confusion into clarity with AI-powered document analysis

DocuLens is an intelligent web application that scans uploaded tax documents (PDFs, photos, or scanned images), detects potential red flags like missing data or inconsistencies, and visually explains what's wrong in a simple, interactive format.

🌟 Features

  • πŸ” Smart Document Analysis: Uses Google Gemini AI to understand tax documents
  • πŸ“‹ Issue Detection: Identifies missing fields, inconsistencies, and formatting errors
  • πŸ’‘ Plain English Explanations: Converts complex tax jargon into simple language
  • 🎯 Visual Feedback: Highlights problem areas directly on your document
  • πŸ“Š Risk Assessment: Provides completeness scores and risk levels
  • πŸ“± Multiple Formats: Supports PDFs and images (JPG, PNG)

🧠 What Problem Does It Solve?

  • Understanding: Most people don't understand what's inside their W2, 1099, or 1040 forms
  • Accuracy: They often file with missing data, misclassified deductions, or formatting issues
  • Consequences: This leads to rejections, audits, or financial penalties
  • Accessibility: Many can't afford a CPA or don't know what questions to ask
  • Solution: DocuLens provides a visual, AI-guided walkthroughβ€”like having a tax assistant beside you

πŸ” How It Works

  1. πŸ“€ Upload: User uploads a scanned or photographed tax document
  2. πŸ” OCR: Tesseract.js extracts text and spatial layout from images
  3. πŸ—οΈ Parse: Text parser extracts key tax-relevant fields into structured JSON
  4. ⚠️ Validate: Rule-based engine identifies common issues (missing EIN, line mismatches)
  5. πŸ€– Analyze: Gemini API simplifies jargon, ranks red flags, and generates summaries
  6. 🎨 Visualize: Fabric.js overlays colored boxes and tooltips on the original document
  7. πŸ“‹ Report: User gets interactive visual experience and downloadable report

πŸ—οΈ Project Structure

DocuLens/
β”œβ”€β”€ backend/                 # Node.js + Express API
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   β”œβ”€β”€ controllers/     # API endpoint handlers
β”‚   β”‚   β”œβ”€β”€ services/        # Business logic (OCR, AI, processing)
β”‚   β”‚   β”œβ”€β”€ middleware/      # Auth, validation, rate limiting
β”‚   β”‚   β”œβ”€β”€ models/          # Data schemas and types
β”‚   β”‚   β”œβ”€β”€ routes/          # API route definitions
β”‚   β”‚   β”œβ”€β”€ utils/           # Helper functions and utilities
β”‚   β”‚   └── config/          # Firebase, Gemini, and app config
β”‚   β”œβ”€β”€ tests/               # Unit and integration tests
β”‚   β”œβ”€β”€ uploads/             # Temporary file storage
β”‚   └── docs/                # API documentation
β”œβ”€β”€ frontend/                # React web application (future)
β”œβ”€β”€ shared/                  # Shared types and constants
└── docker/                  # Container configuration

πŸš€ Quick Start

Prerequisites

  • Node.js 18+
  • Firebase project with Firestore & Storage
  • Google Gemini API key

Backend Setup

# Navigate to backend
cd backend

# Install dependencies
npm install

# Configure environment
cp .env.example .env
# Edit .env with your Firebase and Gemini credentials

# Start development server
npm run dev

API Testing

# Health check
curl http://localhost:3000/api/health

# Upload document
curl -X POST -F "document=@your-tax-form.pdf" \
  http://localhost:3000/api/documents/upload

πŸ“š Tech Stack

Backend

  • Runtime: Node.js 18+ with Express.js
  • Database: Firebase Firestore (document storage)
  • File Storage: Firebase Storage (file persistence)
  • AI/ML: Google Gemini 2.5 Flash API
  • OCR: Tesseract.js (fallback text extraction)
  • Processing: Bull.js + Redis (async job queues)
  • Security: Helmet, CORS, rate limiting, input validation

Frontend (Planned)

  • Framework: React 18 with TypeScript
  • Styling: Tailwind CSS
  • Document Viewer: PDF.js + Fabric.js for annotations
  • State Management: Redux Toolkit or Zustand
  • File Upload: react-dropzone

πŸ“‹ API Endpoints

Document Management

POST   /api/documents/upload        # Upload tax document
GET    /api/documents/{id}          # Get document details  
GET    /api/documents/{id}/status   # Check processing status
GET    /api/documents/{id}/results  # Get analysis results
GET    /api/documents/{id}/report   # Download formatted report
DELETE /api/documents/{id}          # Delete document

System

GET /api/health                     # Health check
GET /api/documents/supported-formats # Supported file types

πŸ§ͺ Development

Running Tests

cd backend

# Run all tests
npm test

# Run with coverage
npm run test:coverage

# Run specific test
npm test -- documentService.test.js

Code Quality

# Lint code
npm run lint

# Fix linting issues
npm run lint:fix

# Format code
npm run format

Docker Development

# Build and run with Docker Compose
docker-compose -f docker/docker-compose.yml up --build

# Backend will be available at http://localhost:3000
# Redis will be available at localhost:6379

πŸ”’ Security Features

  • File Validation: Magic number checking, size limits, type restrictions
  • Rate Limiting: Configurable per endpoint to prevent abuse
  • Input Sanitization: XSS prevention and data validation
  • Error Handling: Structured responses without sensitive data exposure
  • API Security: CORS configuration, security headers

πŸ“Š Monitoring & Analytics

Logging

  • Structured logging with Winston
  • Separate log files for errors, HTTP requests, and general application events
  • Log rotation and retention policies

Metrics

  • Processing performance tracking
  • Gemini API usage monitoring
  • Error rate analysis
  • File upload statistics

🀝 Contributing

Development Workflow

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Write tests for new functionality
  4. Ensure all tests pass (npm test)
  5. Commit changes (git commit -m 'Add amazing feature')
  6. Push to branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

Code Standards

  • Follow ESLint configuration
  • Maintain test coverage above 80%
  • Document new API endpoints
  • Use conventional commit messages

πŸ“ˆ Roadmap

Phase 1: MVP Backend (Current)

  • File upload and validation
  • Gemini AI integration
  • Basic OCR fallback
  • Document processing pipeline
  • API endpoints and documentation

Phase 2: Frontend Development

  • React application setup
  • Document upload interface
  • Processing status tracking
  • Results visualization
  • Issue highlighting with Fabric.js

Phase 3: Enhanced Features

  • User authentication and accounts
  • Document history and management
  • Advanced tax form support
  • Batch processing capabilities
  • Mobile-responsive design

Phase 4: Production Ready

  • Performance optimization
  • Advanced monitoring
  • Deployment automation
  • User feedback system
  • Analytics and insights

πŸ”§ Configuration

Environment Variables

See backend/.env.example for all required configuration options.

Firebase Setup

  1. Create Firebase project
  2. Enable Firestore Database and Storage
  3. Generate service account key
  4. Configure security rules

Gemini API Setup

  1. Get API key from Google AI Studio
  2. Configure rate limits and quotas
  3. Set up usage monitoring

πŸ“ž Support

Common Issues

  • File Upload Errors: Check file size limits and supported formats
  • Processing Failures: Verify Gemini API key and network connectivity
  • Database Issues: Confirm Firebase credentials and permissions

Getting Help

  • πŸ“– Check the API Documentation
  • πŸ› Report bugs via GitHub Issues
  • πŸ’¬ Join discussions in GitHub Discussions
  • πŸ“§ Contact support for critical issues

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Google Gemini: For powerful document understanding capabilities
  • Firebase: For reliable backend infrastructure
  • Tesseract.js: For OCR functionality
  • Open Source Community: For the amazing tools and libraries

Made with ❀️ for taxpayers everywhere who deserve clarity in their financial documents.

About

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors