ClauseDiff is a client-side web application designed to help users compare two documents (.docx, .pdf, .txt) and identify differences between them. It visually highlights additions and deletions, provides a summary of changes, and allows users to export the comparison report in PDF format or a list of changes in CSV format. A key feature of ClauseDiff is its commitment to privacy: all document processing and comparison tasks are performed directly in the user's browser, meaning no files are uploaded to or stored on any server.
- Document Upload: Supports uploading two documents of types DOCX, PDF, or TXT.
- Client-Side Processing: All file parsing and text extraction happen in the browser.
  - `.docx` files are processed using Mammoth.js.
  - `.pdf` files are processed using PDF.js.
  - `.txt` files are read as plain text.
- Text Comparison: Utilizes the `diff-match-patch` library to perform a robust comparison of the extracted text content from the two documents.
- Visual Difference Highlighting: Displays the content of both documents side-by-side, with insertions highlighted in green and deletions in red (with a strikethrough).
- Difference Summary: Provides a concise summary of changes, including the number of characters added/deleted and the total number of differing blocks. It also lists the most significant changes.
- Export to PDF: Allows users to export the side-by-side comparison view as a PDF document using jsPDF and html2canvas.
- Export to CSV: Enables exporting a list of identified additions and deletions in CSV format.
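The difference summary and the CSV export above can both be derived from the same list of diff segments. The following is a minimal sketch, assuming the segments use the `[operation, text]` tuple shape that `diff-match-patch` returns (`-1` deletion, `0` unchanged, `1` insertion). The names `summarizeDiffs` and `diffsToCsv` and the two-column CSV layout are hypothetical and are not taken from ClauseDiff's own code.

```typescript
// Diff tuples in the shape diff-match-patch produces:
// -1 = deletion, 0 = unchanged, 1 = insertion.
type DiffOp = -1 | 0 | 1;
type DiffTuple = [DiffOp, string];

interface DiffSummary {
  charsAdded: number;    // UTF-16 code units inserted
  charsDeleted: number;  // UTF-16 code units deleted
  changedBlocks: number; // number of insertion or deletion segments
}

// Hypothetical helper: count added/deleted characters and changed blocks.
function summarizeDiffs(diffs: DiffTuple[]): DiffSummary {
  const summary: DiffSummary = { charsAdded: 0, charsDeleted: 0, changedBlocks: 0 };
  for (const [op, text] of diffs) {
    if (op === 0) continue; // unchanged text does not count
    summary.changedBlocks += 1;
    if (op === 1) summary.charsAdded += text.length;
    else summary.charsDeleted += text.length;
  }
  return summary;
}

// Quote a CSV field and double any embedded quotes.
function csvField(value: string): string {
  return `"${value.replace(/"/g, '""')}"`;
}

// Hypothetical helper: one CSV row per addition or deletion.
function diffsToCsv(diffs: DiffTuple[]): string {
  const rows = ["type,text"];
  for (const [op, text] of diffs) {
    if (op === 0) continue;
    rows.push(`${op === 1 ? "addition" : "deletion"},${csvField(text)}`);
  }
  return rows.join("\r\n");
}

// Made-up sample input for illustration.
const exampleDiffs: DiffTuple[] = [
  [0, "The tenant shall pay "],
  [-1, "monthly"],
  [1, "quarterly"],
  [0, " rent."],
];
```

Quoting every text field keeps commas, quotes, and line breaks inside a clause from splitting a row.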
ClauseDiff is a single-page application (SPA) built with React and TypeScript.
- Frontend:
  - React: For building the user interface components.
  - TypeScript: For static typing and improved code quality.
  - Tailwind CSS: For utility-first styling.
- Core Logic Libraries:
  - `diff-match-patch`: Google's library for text differencing and patch application.
  - `mammoth.js`: Converts `.docx` documents to HTML and extracts raw text.
  - `pdf.js` (by Mozilla): Parses `.pdf` files and extracts text content.
  - `jspdf` & `html2canvas`: Used in combination to generate PDF reports from HTML content.
- Application Structure:
  - `index.html`: The main HTML file that loads necessary CDN libraries and the React application.
  - `index.tsx`: The entry point for the React application, mounting the `App` component.
  - `App.tsx`: The main application component, managing state, file uploads, comparison logic, and orchestrating UI updates.
  - `components/`: Contains reusable React components for UI elements like file uploads, comparison views, toolbar, difference summary, and icons.
  - `utils/`: Houses utility functions for:
    - `fileProcessor.ts`: Logic for reading and extracting text/HTML from different file types.
    - `diffEngine.ts`: Wrapper around `diff-match-patch` to generate comparison results.
    - `exportHandler.ts`: Functions for PDF and CSV export.
  - `types.ts`: Defines TypeScript interfaces and types used throughout the application.
  - `constants.ts`: Stores shared constants like color palettes and text sizes.
  - `metadata.json`: Contains metadata about the application.
- Client-Side Processing: All document processing and comparison logic runs entirely in the user's browser. This ensures user privacy as documents are not transmitted to any external server.
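Because each supported file type is handled by a different library, the file-processing step amounts to a dispatch on file type. The sketch below illustrates one way this could look. `detectFileKind`, `extractText`, and the `Extractor` type are hypothetical names, and the extractors are passed in as parameters so that the example does not assume any particular Mammoth.js or PDF.js call.

```typescript
type FileKind = "docx" | "pdf" | "txt";

// Hypothetical helper: map a file name to a supported kind, or null.
function detectFileKind(fileName: string): FileKind | null {
  const match = /\.([^.]+)$/.exec(fileName.toLowerCase());
  if (!match) return null; // no extension at all
  switch (match[1]) {
    case "docx":
      return "docx";
    case "pdf":
      return "pdf";
    case "txt":
      return "txt";
    default:
      return null; // unsupported extension
  }
}

// One extractor per kind; each turns raw file bytes into plain text.
type Extractor = (data: ArrayBuffer) => Promise<string>;

// Hypothetical dispatcher. In the browser, the docx and pdf extractors
// would wrap Mammoth.js and PDF.js; here they are simply injected.
async function extractText(
  fileName: string,
  data: ArrayBuffer,
  extractors: Record<FileKind, Extractor>,
): Promise<string> {
  const kind = detectFileKind(fileName);
  if (kind === null) {
    throw new Error(`Unsupported file type: ${fileName}`);
  }
  return extractors[kind](data);
}
```

Everything in this sketch operates on an in-memory `ArrayBuffer`, which matches the client-side model: the bytes come from the browser's file input and never need to leave the page.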
- React 19
- TypeScript
- Tailwind CSS
- Diff-Match-Patch
- Mammoth.js
- PDF.js
- jsPDF
- html2canvas
/
├── App.tsx # Main application component
├── index.tsx # React entry point
├── components/ # UI components
│ ├── icons/ # SVG icon components
│ ├── ComparisonView.tsx
│ ├── DifferenceSummary.tsx
│ ├── FileUpload.tsx
│ ├── LoadingSpinner.tsx
│ └── Toolbar.tsx
├── utils/ # Utility functions
│ ├── diffEngine.ts
│ ├── exportHandler.ts
│ └── fileProcessor.ts
├── constants.ts # Application-wide constants
├── types.ts # TypeScript type definitions
├── index.html # Main HTML page
├── metadata.json # Application metadata
└── README.md # This file
ClauseDiff is a Next.js application with authentication and database features. Follow these steps to set up your local development environment:
- Node.js: Version 18.0 or higher
- npm: Version 8.0 or higher
- Git: For version control
# Clone the repository
git clone <repository-url>
cd ClauseDiff
# Install dependencies
npm install
🚀 Quick Setup (Automated): For a one-command setup, run:
# Run the automated setup script
./scripts/setup-dev.sh
This script will handle steps 1-3 automatically. If you prefer manual setup, continue with the steps below.
Create your local environment file (.env.local) to override production settings:
# Create your local environment file from the example
cp .env.local.example .env.local
Edit `.env.local` with your local development settings:
# Local Development Database Override
DATABASE_URL="file:./dev.db"
DIRECT_URL="file:./dev.db"
# Google OAuth Configuration (use your own or keep these for testing)
GOOGLE_CLIENT_ID=your-google-client-id
GOOGLE_CLIENT_SECRET=your-google-client-secret
# NextAuth Configuration
NEXTAUTH_URL=http://localhost:3000
NEXTAUTH_BASE_PATH=/auth
NEXTAUTH_SECRET=your-nextauth-secret-key
# Optional: Gemini API for advanced features
GEMINI_API_KEY=your-gemini-api-key
# Other development settings
PORT=3000
AUDIT_LOGGING_ENABLED=off
Note: The `.env.local` file overrides settings from `.env` for local development only. This allows you to use a local SQLite database while keeping production PostgreSQL settings in `.env`.
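The override behaviour can be pictured as a merge in which the later file wins. The sketch below is a simplified model of that precedence, not the loader Next.js actually uses. `parseEnv` and `mergeEnv` are hypothetical helpers, the parser ignores multiline values and variable expansion, and the PostgreSQL URL is a made-up placeholder.

```typescript
type EnvMap = Record<string, string>;

// Minimal parser for KEY=value lines; skips blank lines and # comments.
// Simplified on purpose: no multiline values, escapes, or expansion.
function parseEnv(source: string): EnvMap {
  const env: EnvMap = {};
  for (const rawLine of source.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "="
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes.
    const first = value.charAt(0);
    const last = value.charAt(value.length - 1);
    if (value.length >= 2 && (first === '"' || first === "'") && last === first) {
      value = value.slice(1, -1);
    }
    env[key] = value;
  }
  return env;
}

// Later sources win, so pass .env first and .env.local last.
function mergeEnv(...sources: EnvMap[]): EnvMap {
  const merged: EnvMap = {};
  for (const source of sources) {
    for (const key of Object.keys(source)) {
      const value = source[key];
      if (value !== undefined) merged[key] = value;
    }
  }
  return merged;
}

// Placeholder contents standing in for .env and .env.local.
const baseEnv = parseEnv('DATABASE_URL="postgresql://example-host/db"\nPORT=3000');
const localEnv = parseEnv('# local override\nDATABASE_URL="file:./dev.db"');
const effectiveEnv = mergeEnv(baseEnv, localEnv);
```

In this model `DATABASE_URL` resolves to the SQLite file while `PORT`, which only the base file defines, passes through unchanged.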
ClauseDiff uses SQLite for local development and PostgreSQL for production.
# Generate Prisma client for local development
npx prisma generate --schema=schema.dev.prisma
# Create and initialize the local SQLite database
npx prisma db push --schema=schema.dev.prisma
This will:
- Generate the Prisma client based on the SQLite-compatible schema
- Create a `dev.db` SQLite database file
- Apply all database tables and relationships
# Start the Next.js development server
npm run dev
The application will be available at http://localhost:3000.
To test authentication features:
- Google OAuth:
  - Go to Google Cloud Console
  - Create a new project or select an existing one
  - Enable Google+ API
  - Create OAuth 2.0 credentials
  - Add `http://localhost:3000/api/auth/callback/google` to authorized redirect URIs
  - Update `GOOGLE_CLIENT_ID` and `GOOGLE_CLIENT_SECRET` in `.env.local`
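The redirect URI registered with Google has to match the URL the local server is actually running on. A small sketch of that check follows, assuming the callback path shown in the steps above. `googleCallbackUrl` and `isRedirectRegistered` are hypothetical helpers for illustration and are not part of NextAuth.

```typescript
// Callback path as given in the setup steps above.
const GOOGLE_CALLBACK_PATH = "/api/auth/callback/google";

// Hypothetical helper: derive the redirect URI from the site's base URL.
function googleCallbackUrl(baseUrl: string): string {
  const trimmed = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return trimmed + GOOGLE_CALLBACK_PATH;
}

// Hypothetical helper: is the derived URI among the registered ones?
function isRedirectRegistered(baseUrl: string, registeredUris: string[]): boolean {
  return registeredUris.indexOf(googleCallbackUrl(baseUrl)) !== -1;
}

// Example: only the port-3000 URI has been registered.
const registeredUris = ["http://localhost:3000/api/auth/callback/google"];
const matchesDefaultPort = isRedirectRegistered("http://localhost:3000/", registeredUris);
const matchesOtherPort = isRedirectRegistered("http://localhost:3001", registeredUris);
```

In this example the second check is false: if the dev server ends up on a different port, the URI for that port would need to be registered as well.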
# Run tests
npm test
# Run linting
npm run lint
# Type checking
npm run type-check
# Build for production (to test)
npm run build
/
├── app/ # Next.js App Router pages
├── src/ # Source code
│ ├── components/ # React components
│ ├── infrastructure/ # Database, external services
│ ├── domain/ # Business logic entities
│ ├── lib/ # Utilities and configuration
│ └── utils/ # Helper functions
├── prisma/ # Database migrations (production)
├── schema.dev.prisma # SQLite schema for development
├── dev.db # Local SQLite database (auto-generated)
├── .env # Production environment variables
├── .env.local # Local development overrides
└── README.md # This file
Database Connection Issues:
- Ensure the `dev.db` file exists (run `npx prisma db push --schema=schema.dev.prisma`)
- Check that `DATABASE_URL` in `.env.local` points to `file:./dev.db`
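The `DATABASE_URL` check above can also be done in code. Prisma's SQLite connection URLs use the `file:` scheme, so a value that starts with anything else means the local override is not in effect. `sqliteFilePath` below is a hypothetical helper written for this illustration.

```typescript
// Hypothetical helper: return the SQLite file path from a Prisma-style
// connection URL, or null when the URL does not use the "file:" scheme.
function sqliteFilePath(databaseUrl: string | undefined): string | null {
  if (!databaseUrl) return null; // unset or empty
  const prefix = "file:";
  if (databaseUrl.slice(0, prefix.length) !== prefix) return null;
  const path = databaseUrl.slice(prefix.length);
  return path === "" ? null : path;
}

// The value from the .env.local example yields the dev.db path.
const localDbPath = sqliteFilePath("file:./dev.db");
```

A `null` result for the running configuration suggests that `.env.local` is missing or is not being picked up.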
Authentication Issues:
- Verify `NEXTAUTH_URL` matches your local server URL
- Check Google OAuth credentials and redirect URIs
Port Conflicts:
- If port 3000 is busy, Next.js will automatically use the next available port
- Or specify a different port: `PORT=3001 npm run dev`
Prisma Client Issues:
- Re-generate the client: `npx prisma generate --schema=schema.dev.prisma`
- Reset database: `rm dev.db && npx prisma db push --schema=schema.dev.prisma`
- Development: Uses SQLite database (`dev.db`) via `.env.local`
- Production: Uses PostgreSQL (Supabase) via `.env`
- Switching: Delete/rename `.env.local` to use production database locally
This setup ensures a clean separation between your local development environment and production configuration.
ClauseDiff enforces code quality and test coverage automatically:
- Linting: All code is checked with ESLint (see `.eslintrc.js`). Linting runs on every build and pre-commit (via Husky).
- Test Coverage: Jest collects coverage on every test run. Coverage reports are output to the `coverage/` directory in both text and lcov formats.
- Build/Deploy Automation: On Netlify (and locally), the build process runs `npm run lint` and `npm test` before building. If linting or tests (including coverage thresholds) fail, the build and deploy are blocked.
- Viewing Coverage: After running `npm test`, open `coverage/lcov-report/index.html` in your browser for a detailed coverage report.
- Thresholds: The minimum coverage threshold is currently set to 10% (for incremental migration) and can be raised as coverage improves (see `jest.config.cjs`).
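For orientation, the coverage settings described above correspond to a handful of standard Jest options. The sketch below is a TypeScript rendering of what such a configuration might contain, not a copy of the project's `jest.config.cjs`. The option names are Jest's own, but the local `CoverageConfigSketch` type is hypothetical, and applying 10% to all four metrics is an assumption.

```typescript
// Local type covering only the options used here, so the sketch
// needs no imports. The option names match Jest's configuration keys.
interface CoverageConfigSketch {
  collectCoverage: boolean;
  coverageDirectory: string;
  coverageReporters: string[];
  coverageThreshold: {
    global: {
      branches: number;
      functions: number;
      lines: number;
      statements: number;
    };
  };
}

// Assumed values: the coverage/ directory, the text and lcov reporters,
// and the 10% figure come from the description above; using the same
// threshold for every metric is a guess.
const coverageConfigSketch: CoverageConfigSketch = {
  collectCoverage: true,
  coverageDirectory: "coverage",
  coverageReporters: ["text", "lcov"],
  coverageThreshold: {
    global: { branches: 10, functions: 10, lines: 10, statements: 10 },
  },
};
```

With a `coverageThreshold` in place, Jest reports a failure when coverage falls below the listed percentages, which is what lets a failed threshold block the build.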
Best Practices:
- Always check the output of `npm run lint` and `npm test` before pushing changes.
- Review coverage reports to identify untested code.
- All contributors are expected to maintain or improve code quality and coverage.