viniciusalbino/ClauseDiff

ClauseDiff - Document Comparison Tool

1. Project Overview

ClauseDiff is a client-side web application that compares two documents (.docx, .pdf, .txt) and identifies the differences between them. It visually highlights additions and deletions, summarizes the changes, and lets users export the comparison report as a PDF or the list of changes as a CSV file. A key feature of ClauseDiff is its commitment to privacy: all document processing and comparison happens directly in the user's browser, so no files are uploaded to or stored on any server.

2. Core Functionality

  • Document Upload: Supports uploading two documents of types DOCX, PDF, or TXT.
  • Client-Side Processing: All file parsing and text extraction happen in the browser.
    • .docx files are processed using Mammoth.js.
    • .pdf files are processed using PDF.js.
    • .txt files are read as plain text.
  • Text Comparison: Uses the diff-match-patch library to compare the text extracted from the two documents.
  • Visual Difference Highlighting: Displays the content of both documents side-by-side, with insertions highlighted in green and deletions in red (with a strikethrough).
  • Difference Summary: Provides a concise summary of changes, including the number of characters added/deleted and the total number of differing blocks. It also lists the most significant changes.
  • Export to PDF: Allows users to export the side-by-side comparison view as a PDF document using jsPDF and html2canvas.
  • Export to CSV: Enables exporting a list of identified additions and deletions in CSV format.
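
The difference summary described above can be sketched as follows. This is a minimal illustration, not the project's actual diffEngine.ts: it assumes the diff is available in the tuple form diff-match-patch produces ([operation, text], with -1 for a deletion, 0 for unchanged text and 1 for an insertion), and it assumes "most significant" means "longest". The names summarizeDiffs and DiffSummary are hypothetical.

```typescript
// Diff tuples in the shape diff-match-patch produces: [operation, text],
// where -1 = deletion, 0 = unchanged, 1 = insertion.
type DiffOp = -1 | 0 | 1;
type Diff = [DiffOp, string];

interface DiffSummary {
  charsAdded: number;
  charsDeleted: number;
  changedBlocks: number;
  largestChanges: Diff[];
}

// Hypothetical helper: count changed characters and keep the `top` longest changes.
// Lengths are JavaScript string lengths (UTF-16 code units).
function summarizeDiffs(diffs: Diff[], top = 3): DiffSummary {
  const changes = diffs.filter(([op]) => op !== 0);
  let charsAdded = 0;
  let charsDeleted = 0;
  for (const [op, text] of changes) {
    if (op === 1) charsAdded += text.length;
    else charsDeleted += text.length;
  }
  // Assumption: "most significant" is approximated by text length.
  const largestChanges = [...changes]
    .sort((a, b) => b[1].length - a[1].length)
    .slice(0, top);
  return { charsAdded, charsDeleted, changedBlocks: changes.length, largestChanges };
}

const sample: Diff[] = [
  [0, "The tenant shall pay "],
  [-1, "monthly"],
  [1, "quarterly"],
  [0, " rent."],
];
const summary = summarizeDiffs(sample);
console.log(summary.charsAdded, summary.charsDeleted, summary.changedBlocks); // 9 7 2
```

Keeping the summary a pure function over the diff tuples means it can be unit-tested without loading any document parser.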

3. Architecture

ClauseDiff is a single-page application (SPA) built with React and TypeScript.

  • Frontend:
    • React: For building the user interface components.
    • TypeScript: For static typing and improved code quality.
    • Tailwind CSS: For utility-first styling.
  • Core Logic Libraries:
    • diff-match-patch: Google's library for text differencing and patch application.
    • mammoth.js: Converts .docx documents to HTML and extracts raw text.
    • pdf.js (by Mozilla): Parses .pdf files and extracts text content.
    • jspdf & html2canvas: Used in combination to generate PDF reports from HTML content.
  • Application Structure:
    • index.html: The main HTML file that loads necessary CDN libraries and the React application.
    • index.tsx: The entry point for the React application, mounting the App component.
    • App.tsx: The main application component, managing state, file uploads, comparison logic, and orchestrating UI updates.
    • components/: Contains reusable React components for UI elements like file uploads, comparison views, toolbar, difference summary, and icons.
    • utils/: Houses the utility modules:
      • fileProcessor.ts: Logic for reading and extracting text/HTML from different file types.
      • diffEngine.ts: Wrapper around diff-match-patch to generate comparison results.
      • exportHandler.ts: Functions for PDF and CSV export.
    • types.ts: Defines TypeScript interfaces and types used throughout the application.
    • constants.ts: Stores shared constants like color palettes and text sizes.
    • metadata.json: Contains metadata about the application.
  • Client-Side Processing: All document processing and comparison logic runs entirely in the user's browser. This ensures user privacy as documents are not transmitted to any external server.
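
As a rough sketch of the CSV export that exportHandler.ts is described as providing, the function below turns a list of changes into CSV text. The column names, the Change interface and the function names are assumptions made for illustration; the real export may use a different layout. Field quoting follows the usual RFC 4180 convention.

```typescript
// Hypothetical shape for one exported change.
interface Change {
  type: "addition" | "deletion";
  text: string;
}

// Quote a field only when needed: wrap it in double quotes and double any embedded quotes.
function csvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Build the CSV text: a header row followed by one row per change.
function changesToCsv(changes: Change[]): string {
  const header = "index,type,text"; // assumed column names
  const rows = changes.map((c, i) => [String(i + 1), c.type, csvField(c.text)].join(","));
  return [header, ...rows].join("\r\n");
}

const csv = changesToCsv([
  { type: "deletion", text: "monthly" },
  { type: "addition", text: 'quarterly, "in advance"' },
]);
console.log(csv);
// index,type,text
// 1,deletion,monthly
// 2,addition,"quarterly, ""in advance"""
```

In the browser, the resulting string would typically be wrapped in a Blob and offered as a download, which keeps the export on the client in line with the privacy model described above.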

4. Key Technologies Used

  • React 19
  • TypeScript
  • Tailwind CSS
  • Diff-Match-Patch
  • Mammoth.js
  • PDF.js
  • jsPDF
  • html2canvas

5. File Structure Overview

/
├── App.tsx                  # Main application component
├── index.tsx                # React entry point
├── components/              # UI components
│   ├── icons/               # SVG icon components
│   ├── ComparisonView.tsx
│   ├── DifferenceSummary.tsx
│   ├── FileUpload.tsx
│   ├── LoadingSpinner.tsx
│   └── Toolbar.tsx
├── utils/                   # Utility functions
│   ├── diffEngine.ts
│   ├── exportHandler.ts
│   └── fileProcessor.ts
├── constants.ts             # Application-wide constants
├── types.ts                 # TypeScript type definitions
├── index.html               # Main HTML page
├── metadata.json            # Application metadata
└── README.md                # This file

6. Local Development Setup

ClauseDiff is a Next.js application with authentication and database features. Follow these steps to set up your local development environment:

Prerequisites

  • Node.js: Version 18.0 or higher
  • npm: Version 8.0 or higher
  • Git: For version control

1. Clone and Install Dependencies

# Clone the repository
git clone <repository-url>
cd ClauseDiff

# Install dependencies
npm install

🚀 Quick Setup (Automated): For a one-command setup, run:

# Run the automated setup script
./scripts/setup-dev.sh

This script will handle steps 1-3 automatically. If you prefer manual setup, continue with the steps below.

2. Environment Configuration

Create your local environment file (.env.local) to override production settings:

# Create your local environment file from the example
cp .env.local.example .env.local

Edit .env.local with your local development settings:

# Local Development Database Override
DATABASE_URL="file:./dev.db"
DIRECT_URL="file:./dev.db"

# Google OAuth Configuration (replace the placeholders with your own credentials)
GOOGLE_CLIENT_ID=your-google-client-id
GOOGLE_CLIENT_SECRET=your-google-client-secret

# NextAuth Configuration
NEXTAUTH_URL=http://localhost:3000
NEXTAUTH_BASE_PATH=/auth
NEXTAUTH_SECRET=your-nextauth-secret-key

# Optional: Gemini API for advanced features
GEMINI_API_KEY=your-gemini-api-key

# Other development settings
PORT=3000
AUDIT_LOGGING_ENABLED=off

Note: The .env.local file overrides settings from .env for local development only. This allows you to use a local SQLite database while keeping production PostgreSQL settings in .env.
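
The override behaviour in the note above can be illustrated with a small sketch. It models only the precedence rule (values from .env.local win over values from .env) and a check for missing settings; it does not read any files. The helper names and the choice of which keys count as required are assumptions, and the PostgreSQL URL is a placeholder.

```typescript
type EnvSource = Record<string, string | undefined>;

// Assumed set of required settings, taken from the example .env.local.
const REQUIRED_KEYS = ["DATABASE_URL", "NEXTAUTH_URL", "NEXTAUTH_SECRET"] as const;

// Later sources win, mirroring how .env.local takes precedence over .env.
function mergeEnv(...sources: EnvSource[]): EnvSource {
  return Object.assign({}, ...sources);
}

// Return the required keys that are absent or blank.
function missingEnvKeys(env: EnvSource): string[] {
  return REQUIRED_KEYS.filter((key) => (env[key] ?? "").trim() === "");
}

// Placeholder values standing in for the two files.
const dotEnv: EnvSource = {
  DATABASE_URL: "postgresql://example",
  NEXTAUTH_URL: "https://example.com",
};
const dotEnvLocal: EnvSource = {
  DATABASE_URL: "file:./dev.db",
  NEXTAUTH_URL: "http://localhost:3000",
};

const effective = mergeEnv(dotEnv, dotEnvLocal);
console.log(effective.DATABASE_URL); // file:./dev.db
console.log(missingEnvKeys(effective)); // [ 'NEXTAUTH_SECRET' ]
```

A check like this, run at startup, would surface a forgotten NEXTAUTH_SECRET before authentication fails at runtime.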

3. Database Setup

ClauseDiff uses SQLite for local development and PostgreSQL for production.

# Generate Prisma client for local development
npx prisma generate --schema=schema.dev.prisma

# Create and initialize the local SQLite database
npx prisma db push --schema=schema.dev.prisma

This will:

  • Generate the Prisma client based on the SQLite-compatible schema
  • Create a dev.db SQLite database file
  • Apply all database tables and relationships

4. Start Development Server

# Start the Next.js development server
npm run dev

The application will be available at http://localhost:3000.

5. OAuth Setup (Optional)

To test authentication features:

  1. Google OAuth:
    • Go to Google Cloud Console
    • Create a new project or select existing one
    • Configure the OAuth consent screen (the legacy Google+ API has been shut down and does not need to be enabled)
    • Create OAuth 2.0 credentials
    • Add http://localhost:3000/api/auth/callback/google to authorized redirect URIs
    • Update GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET in .env.local

6. Development Workflow

# Run tests
npm test

# Run linting
npm run lint

# Type checking
npm run type-check

# Build for production (to test)
npm run build

File Structure for Development

/
├── app/                     # Next.js App Router pages
├── src/                     # Source code
│   ├── components/          # React components
│   ├── infrastructure/      # Database, external services
│   ├── domain/             # Business logic entities
│   ├── lib/                # Utilities and configuration
│   └── utils/              # Helper functions
├── prisma/                 # Database migrations (production)
├── schema.dev.prisma       # SQLite schema for development
├── dev.db                  # Local SQLite database (auto-generated)
├── .env                    # Production environment variables
├── .env.local              # Local development overrides
└── README.md               # This file

Troubleshooting

Database Connection Issues:

  • Ensure dev.db file exists (run npx prisma db push --schema=schema.dev.prisma)
  • Check that DATABASE_URL in .env.local points to file:./dev.db

Authentication Issues:

  • Verify NEXTAUTH_URL matches your local server URL
  • Check Google OAuth credentials and redirect URIs

Port Conflicts:

  • If port 3000 is busy, Next.js will automatically use the next available port
  • Or specify a different port: PORT=3001 npm run dev

Prisma Client Issues:

  • Re-generate the client: npx prisma generate --schema=schema.dev.prisma
  • Reset database: rm dev.db && npx prisma db push --schema=schema.dev.prisma

Production vs Development

  • Development: Uses SQLite database (dev.db) via .env.local
  • Production: Uses PostgreSQL (Supabase) via .env
  • Switching: Delete or rename .env.local to run locally against the production database settings in .env

This setup ensures a clean separation between your local development environment and production configuration.

7. Code Quality & Coverage Monitoring

ClauseDiff enforces code quality and test coverage automatically:

  • Linting: All code is checked with ESLint (see .eslintrc.js). Linting runs on every build and pre-commit (via Husky).
  • Test Coverage: Jest collects coverage on every test run. Coverage reports are output to the coverage/ directory in both text and lcov formats.
  • Build/Deploy Automation: On Netlify (and locally), the build process runs npm run lint and npm test before building. If linting or tests (including coverage thresholds) fail, the build and deploy are blocked.
  • Viewing Coverage: After running npm test, open coverage/lcov-report/index.html in your browser for a detailed coverage report.
  • Thresholds: The minimum coverage threshold is currently set to 10% (for incremental migration) and can be raised as coverage improves (see jest.config.cjs).
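
For orientation, the coverage settings described above might look roughly like the object below. The option names (collectCoverage, coverageDirectory, coverageReporters, coverageThreshold) are standard Jest options, but this is an illustrative shape written in TypeScript, not a copy of the project's jest.config.cjs; in particular, applying the 10% minimum to all four metrics is an assumption. The small helper shows how a threshold check decides whether a build is blocked, using made-up coverage numbers.

```typescript
type Metric = "branches" | "functions" | "lines" | "statements";
type Percentages = Record<Metric, number>;

// Illustrative shape only; the project's real settings live in jest.config.cjs.
const coverageSettings = {
  collectCoverage: true,
  coverageDirectory: "coverage",
  coverageReporters: ["text", "lcov"],
  coverageThreshold: {
    // Assumption: the 10% minimum applies to every metric.
    global: { branches: 10, functions: 10, lines: 10, statements: 10 },
  },
};

// Hypothetical helper: list the metrics that fall below their minimum.
function metricsBelowThreshold(measured: Percentages, minimum: Percentages): Metric[] {
  return (Object.keys(minimum) as Metric[]).filter((m) => measured[m] < minimum[m]);
}

// Made-up measurements for illustration.
const failing = metricsBelowThreshold(
  { branches: 8, functions: 25, lines: 31, statements: 30 },
  coverageSettings.coverageThreshold.global,
);
console.log(failing); // [ 'branches' ]
```

Any metric returned here corresponds to a failed threshold, which is the condition under which the test run, and therefore the build, fails.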

Best Practices:

  • Always check the output of npm run lint and npm test before pushing changes.
  • Review coverage reports to identify untested code.
  • All contributors are expected to maintain or improve code quality and coverage.
