A FastAPI-based backend that provides intelligent document comparison and merging capabilities using AI.
- Document Management: Upload, store, and retrieve text documents
- Basic & Enhanced Diff: Calculate differences between documents
- AI-Powered Analysis: Generate intelligent summaries of document changes
- Smart Merge: Apply AI to intelligently merge documents
- Conflict Resolution: Multiple strategies for resolving conflicting changes
- Asynchronous Processing: Background tasks for handling large documents
- RESTful API: Well-documented endpoints with comprehensive validation
- Authentication: JWT-based authentication system
- Python 3.9+
- PostgreSQL database
- Anthropic API key
Set up PostgreSQL using Docker:

```bash
docker run --name mypostgres \
  -e POSTGRES_USER=myuser \
  -e POSTGRES_PASSWORD=mypassword \
  -e POSTGRES_DB=mydatabase \
  -p 5432:5432 -d postgres
```
Clone the repository:

```bash
git clone <repository-url>
cd diffai-backend
```
Create and activate a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```
Install dependencies:

```bash
pip install -r requirements.txt
```
Create a `.env` file in the project root with the following contents:

```
DATABASE_URL=postgresql://myuser:mypassword@localhost:5432/mydatabase
SECRET_KEY=your-secret-key-for-jwt-tokens
ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=30
AI_API_KEY=your-anthropic-api-key
AI_MODEL_NAME=claude-3-5-sonnet-20241022
AI_MAX_TOKENS=4096
AI_ENABLED=true
```
Start the development server:

```bash
uvicorn backend.app.main:app --reload
```

The API will be available at http://localhost:8000.
API documentation will be available at:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
```
backend/
├── app/
│   ├── endpoints/
│   │   ├── auth.py              # Authentication endpoints
│   │   ├── diffs.py             # Diff and merge endpoints
│   │   └── documents.py         # Document management endpoints
│   ├── models/
│   │   └── document.py          # Database models
│   ├── services/
│   │   ├── ai_service.py        # AI integration service
│   │   └── diff_merge.py        # Diff and merge logic
│   ├── tasks/
│   │   └── background_tasks.py  # Asynchronous task processing
│   ├── config.py                # Application configuration
│   ├── database.py              # Database connection
│   ├── db_init.py               # Database initialization
│   └── main.py                  # Application entry point
└── tests/                       # Test suite
```
- FastAPI: High-performance web framework for building APIs
- SQLAlchemy: SQL toolkit and ORM
- Pydantic: Data validation and settings management
- Anthropic Claude: AI model for intelligent document analysis
- PostgreSQL: Relational database for data storage
- PyJWT: JWT token handling for authentication
Authentication:

- `POST /auth/login` - Authenticate user and get access token
- `GET /auth/me` - Get current user info

Documents:

- `POST /documents/upload` - Upload new document
- `GET /documents/` - List all documents
- `GET /documents/{doc_id}` - Get specific document
- `DELETE /documents/{doc_id}` - Delete document

Diffs and merges:

- `GET /diffs/` - Get diff between two documents
- `POST /diffs/merge` - Merge two documents (synchronous)
- `POST /diffs/async-diff` - Create asynchronous diff task
- `POST /diffs/async-merge` - Create asynchronous merge task
- `GET /diffs/task-status/{task_id}` - Get task status
- `GET /diffs/merge-result/{task_id}` - Get merge result
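For illustration, a minimal client sketch using `requests`. The exact request and response shapes (login form fields, upload payload, diff query parameters, JSON keys) are assumptions and may differ from the actual schema; consult the Swagger UI for the authoritative definitions.

```python
import requests

BASE_URL = "http://localhost:8000"

# Authenticate (assumes an OAuth2-style form login; field names are assumptions)
login = requests.post(
    f"{BASE_URL}/auth/login",
    data={"username": "alice", "password": "secret"},
)
login.raise_for_status()
headers = {"Authorization": f"Bearer {login.json()['access_token']}"}

# Upload two documents (payload shape is an assumption)
doc_ids = []
for name, text in [("a.txt", "Hello world"), ("b.txt", "Hello, world!")]:
    resp = requests.post(
        f"{BASE_URL}/documents/upload",
        json={"name": name, "content": text},
        headers=headers,
    )
    resp.raise_for_status()
    doc_ids.append(resp.json()["id"])

# Request a diff between the two documents (query parameter names are assumptions)
diff = requests.get(
    f"{BASE_URL}/diffs/",
    params={"doc1_id": doc_ids[0], "doc2_id": doc_ids[1]},
    headers=headers,
)
print(diff.json())
```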
For large documents, the application uses background task processing:
- Client submits an asynchronous request
- Server creates a background task and returns a task ID
- Client polls task status endpoint for progress updates
- When complete, client retrieves the final result
This approach prevents timeouts and provides progress feedback for long-running operations.
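As an illustration, a polling client might look like the sketch below. The field names (`task_id`, `status`) and the terminal status values are assumptions, not the actual response schema.

```python
import time
import requests

BASE_URL = "http://localhost:8000"

def merge_async(doc1_id: int, doc2_id: int, headers: dict) -> dict:
    # Submit the asynchronous merge task (request body shape is an assumption)
    resp = requests.post(
        f"{BASE_URL}/diffs/async-merge",
        json={"doc1_id": doc1_id, "doc2_id": doc2_id},
        headers=headers,
    )
    resp.raise_for_status()
    task_id = resp.json()["task_id"]

    # Poll the task status endpoint until the task finishes
    while True:
        status = requests.get(
            f"{BASE_URL}/diffs/task-status/{task_id}", headers=headers
        ).json()
        if status.get("status") in ("completed", "failed"):
            break
        time.sleep(2)  # back off between polls

    # Retrieve the final merge result
    result = requests.get(f"{BASE_URL}/diffs/merge-result/{task_id}", headers=headers)
    result.raise_for_status()
    return result.json()
```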
The application uses Anthropic's Claude model for:
- Diff Analysis: Generate human-readable summaries of document differences
- Smart Merge: Apply intelligent conflict resolution based on document context
- Custom Rules: Process natural language merge guidance
Adjust AI parameters in `.env`:

- `AI_API_KEY`: Your Anthropic API key
- `AI_MODEL_NAME`: Model to use (default: claude-3-5-sonnet-20241022)
- `AI_MAX_TOKENS`: Maximum tokens for responses
- `AI_ENABLED`: Toggle AI functionality
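These values are loaded through the application configuration (`config.py`). As a rough sketch of how such settings can be declared with pydantic-settings; the class and field names here are illustrative, not the actual contents of `config.py`:

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # Values are read from environment variables or the .env file
    model_config = SettingsConfigDict(env_file=".env")

    # Database and JWT settings mirror the .env keys shown above
    database_url: str
    secret_key: str
    algorithm: str = "HS256"
    access_token_expire_minutes: int = 30

    # AI-related settings
    ai_api_key: str = ""
    ai_model_name: str = "claude-3-5-sonnet-20241022"
    ai_max_tokens: int = 4096
    ai_enabled: bool = True

settings = Settings()
```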
Run the test suite:

```bash
pytest
```

The tests use a separate test database configured in `pytest.ini`.
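For reference, a minimal sketch of the style of API test the suite can contain, using FastAPI's `TestClient`; the endpoint behavior asserted here is an assumption, not taken from the actual tests:

```python
from fastapi.testclient import TestClient

from backend.app.main import app

client = TestClient(app)

def test_list_documents_requires_auth():
    # Listing documents without a token should be rejected
    # (assumes the endpoint is protected; adjust if it is public)
    response = client.get("/documents/")
    assert response.status_code in (401, 403)
```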
Build and run with Docker:
```bash
docker build -t diffai-backend .
docker run -p 8000:8000 --env-file .env diffai-backend
```

Production considerations:

- Use a proper task queue (Celery, RQ) for production deployments
- Implement rate limiting for API endpoints
- Configure proper database connection pooling
- Set up monitoring and logging
Database Connection Problems:
- Verify PostgreSQL is running
- Check database credentials in `.env`
- Ensure database user has proper permissions
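A quick way to check connectivity outside the application is to open a connection with SQLAlchemy directly; the URL below is the example value from the setup section, so substitute your own:

```python
from sqlalchemy import create_engine, text

# Use the same URL as DATABASE_URL in .env
engine = create_engine("postgresql://myuser:mypassword@localhost:5432/mydatabase")

with engine.connect() as conn:
    # A trivial query confirms both connectivity and credentials
    print(conn.execute(text("SELECT 1")).scalar())
```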
AI Service Errors:
- Validate your Anthropic API key
- Check AI service availability
- Review rate limits and quotas
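To validate the key directly against the Anthropic API, a minimal check with the official `anthropic` SDK (model name taken from the configuration above):

```python
import anthropic

client = anthropic.Anthropic(api_key="your-anthropic-api-key")

# A tiny request; an invalid key raises anthropic.AuthenticationError
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=16,
    messages=[{"role": "user", "content": "ping"}],
)
print(message.content[0].text)
```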
- Fork the repository
- Create a feature branch: `git checkout -b feature/my-feature`
- Commit changes: `git commit -am 'Add new feature'`
- Push to branch: `git push origin feature/my-feature`
- Submit a pull request