Pdftools AI Smart Redact samples

Sample configurations and examples for deploying and using AI Smart Redact by Pdftools.

AI Smart Redact automatically detects and redacts personally identifiable information (PII) in PDF documents using pattern matching, keyword detection, and ML-based named entity recognition (GLiNER).

Architecture

AI Smart Redact consists of four services:

flowchart LR
    Browser["Browser User"]
    Client["API Client<br/>(curl / Python / C#)"]
    HITL["HITL Web UI<br/>port 3000"]
    Orchestrator["Orchestrator API<br/>port 9983<br/>User mgmt, JWT auth"]
    Manager["Manager API<br/>port 9982<br/>Files, Jobs, Orchestration"]
    Worker["Worker API<br/>port 4885 (internal)<br/>PII detection, redaction"]
    OrchDB[("Orchestrator DB<br/>PostgreSQL")]
    ManagerDB[("Manager DB<br/>PostgreSQL")]

    Browser -- HTTP --> HITL
    Client -- HTTP --> Orchestrator
    HITL -- HTTP --> Orchestrator
    Orchestrator -- HTTP --> Manager
    Manager -- HTTP --> Worker
    Orchestrator --- OrchDB
    Manager --- ManagerDB

| Service | Port | Description |
| --- | --- | --- |
| HITL Web UI | 3000 | Human-in-the-loop review interface for detection results and redaction jobs |
| Orchestrator | 9983 | Web UI backend with user management and JWT authentication |
| Manager | 9982 | Client-facing API for file uploads and detection/redaction jobs |
| Worker | 4885 | Internal service that performs PII detection and redaction |

For detailed architecture documentation, refer to AI Smart Redact Architecture.
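
The hop chain in the flowchart can be easier to follow as data. The sketch below only restates the edges and default ports shown above; the function names `request_path` and `describe` are illustrative and not part of the product.

```python
# Edges copied from the flowchart: each caller and the service it calls over HTTP.
CALLS = {
    "Browser": "HITL",
    "Client": "Orchestrator",
    "HITL": "Orchestrator",
    "Orchestrator": "Manager",
    "Manager": "Worker",
}

# Default ports from the service table.
PORTS = {"HITL": 3000, "Orchestrator": 9983, "Manager": 9982, "Worker": 4885}


def request_path(start: str) -> list[str]:
    """Follow the HTTP hops from `start` until no further callee exists."""
    path = [start]
    while path[-1] in CALLS:
        path.append(CALLS[path[-1]])
    return path


def describe(start: str) -> str:
    """Render the hop chain, adding the port where the hop is a service."""
    parts = []
    for name in request_path(start):
        port = PORTS.get(name)
        parts.append(f"{name}:{port}" if port else name)
    return " -> ".join(parts)


if __name__ == "__main__":
    print(describe("Browser"))
    print(describe("Client"))
```

Read this way, a browser request passes through every service, while an API client enters at the Orchestrator and skips only the HITL Web UI.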

Prerequisites

Windows users

The startup flow and API examples use bash. On Windows, use one of:

  • WSL2 (recommended) — full Linux environment. Docker Desktop integrates with WSL2 natively, so docker commands just work.
  • Git Bash — bundled with Git for Windows. Sufficient for all Docker-based scripts in this repo, provided Docker Desktop is running and python3 is available on PATH (needed by the curl API examples).

The repository also includes a standalone PowerShell helper for generating keys manually. It is optional, because smart-redact.sh setup generates keys automatically.
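
Before running the scripts, it may help to confirm that the expected tools are on PATH. This is a minimal sketch, not part of the repository: the tool list is an assumption drawn from the commands used in this README, and `missing_tools` is a hypothetical helper name.

```python
import shutil

# Assumed tool list: bash and docker for the scripts, git for cloning,
# python3 for the curl API examples. Adjust to your environment.
REQUIRED_TOOLS = ("bash", "docker", "git", "python3")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

Finding `docker` on PATH does not show that Docker Desktop is running, so a passing check is necessary but not sufficient.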

Quick start

The fastest way to get AI Smart Redact running:

Prerequisites: A valid AI Smart Redact license key — register at portal.pdf-tools.com to generate a free trial key, or see the licensing docs for production keys.

# 1. Clone this repository
git clone https://github.com/pdf-tools/smart-redact-samples.git
cd smart-redact-samples

# 2. Create your .env file with generated secrets
./smart-redact.sh setup --license-key "<RDCTSRV,...>"

# 3. Start all services and wait for Docker health checks
./smart-redact.sh up

# 4. Show Compose-managed service status
./smart-redact.sh health

# Optional: stream logs
./smart-redact.sh logs

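Once running, the HITL Web UI is served on host port 3000 and the Orchestrator API at http://localhost:9983 by default.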

Default HITL / Orchestrator login:

  • Email: admin@example.com
  • Password: Admin@1234!Tmp
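
Before logging in, a quick TCP probe can confirm that the services accept connections. The sketch below assumes the default ports from the service table and assumes the Manager port is published on the host, which depends on the Compose file in use. It checks reachability only and is no substitute for `./smart-redact.sh health`.

```python
import socket

# Default ports from the service table. Whether each one is published on the
# host is an assumption that depends on the deployment.
DEFAULT_PORTS = {"HITL Web UI": 3000, "Orchestrator": 9983, "Manager": 9982}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_services(host: str = "localhost", ports=DEFAULT_PORTS) -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, reachable in check_services().items():
        print(f"{name}: {'reachable' if reachable else 'not reachable'}")
```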

Repository structure

smart-redact-samples/
├── samples/                 # Sample documents for testing
│
├── docker-compose/          # Docker Compose deployments
│   ├── cpu/                 #   Full stack (CPU inference)
│   ├── gpu/                 #   Full stack (GPU inference, NVIDIA CUDA)
│   └── minimal/             #   Manager + Worker only (no Orchestrator)
│
├── docker-run/              # Individual docker run scripts
│
├── api-examples/            # API usage examples
│   ├── curl/                #   Shell scripts (step-by-step)
│   ├── python/              #   Python examples
│   └── csharp/              #   C# / .NET example
│
└── scripts/                 # Utility scripts

Deployment options

| Option | Best for | Directory |
| --- | --- | --- |
| Docker Compose (CPU) | Quick start, development, evaluation | docker-compose/cpu/ |
| Docker Compose (GPU) | Production with GPU acceleration | docker-compose/gpu/ |
| Docker Compose (Minimal) | API-only usage without Orchestrator | docker-compose/minimal/ |
| Docker Run | Manual control over each container | docker-run/ |

API examples

For complete usage examples, refer to api-examples/, which covers:

  • Uploading PDF files
  • Running PII detection
  • Downloading detection results
  • Running PII redaction
  • End-to-end workflows

For the full API reference, refer to AI Smart Redact API Documentation.

Configuration reference

Configure all AI Smart Redact services through environment variables:

| Variable | Required | Description |
| --- | --- | --- |
| PDFTOOLS_LICENSE_KEY | Yes | AI Smart Redact license key |
| ENCRYPTION_KEY | Yes | 32-byte Base64-encoded AES-256-GCM key |
| ORCHESTRATOR_JWT_SECRET | Yes (Orchestrator only) | JWT signing secret (min 32 chars) |
| VERSION | No | Docker image tag (default: latest) |
| HITL_WEB_PORT | No | Host port for the HITL Web UI (default: 3000) |
| HITL_ORCHESTRATOR_URL | No | Browser-facing Orchestrator API URL used by the HITL Web UI (default: http://localhost:9983) |

For all configuration options, refer to AI Smart Redact Configuration Guide.
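
For manual setups, values matching the constraints in the table can be generated and checked with the Python standard library. This is a sketch under assumptions: it treats ENCRYPTION_KEY as 32 random bytes in standard (not URL-safe) Base64, and the helper names are hypothetical. Where possible, prefer the keys that smart-redact.sh setup generates.

```python
import base64
import binascii
import secrets


def generate_encryption_key() -> str:
    """32 random bytes in standard Base64 (assumed encoding), 44 characters."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def generate_jwt_secret(num_bytes: int = 48) -> str:
    """URL-safe random string; 48 bytes gives 64 characters."""
    return secrets.token_urlsafe(num_bytes)


def check_env(env: dict, with_orchestrator: bool = True) -> list:
    """Return a list of problems found in the required variables."""
    problems = []
    if not env.get("PDFTOOLS_LICENSE_KEY"):
        problems.append("PDFTOOLS_LICENSE_KEY is missing")
    try:
        raw = base64.b64decode(env.get("ENCRYPTION_KEY", ""), validate=True)
    except (binascii.Error, ValueError):
        raw = b""
    if len(raw) != 32:
        problems.append("ENCRYPTION_KEY must decode to exactly 32 bytes")
    # The JWT secret is only required when the Orchestrator is deployed.
    if with_orchestrator and len(env.get("ORCHESTRATOR_JWT_SECRET", "")) < 32:
        problems.append("ORCHESTRATOR_JWT_SECRET must be at least 32 characters")
    return problems


if __name__ == "__main__":
    print("ENCRYPTION_KEY=" + generate_encryption_key())
    print("ORCHESTRATOR_JWT_SECRET=" + generate_jwt_secret())
```

Pass `with_orchestrator=False` to `check_env` when validating a minimal deployment without the Orchestrator.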

Security notes

The Docker Compose files and docker-run/ scripts in this repository hardcode default credentials for local demonstration only:

| Service | User | Password |
| --- | --- | --- |
| PostgreSQL | smartredact | smartredact |
| RabbitMQ | guest | guest |

For any non-local deployment, replace these with strong values.
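
One way to act on this is to reject the demo pairs before a deployment and generate replacements. The sketch below is illustrative only: how credentials are passed to each container is not covered in this README, so the helpers work on plain user and password values and their names are hypothetical.

```python
import secrets

# Demo credentials listed in the table above.
DEMO_CREDENTIALS = {("smartredact", "smartredact"), ("guest", "guest")}


def is_demo_credential(user: str, password: str) -> bool:
    """Return True if the pair matches one of the hardcoded demo defaults."""
    return (user, password) in DEMO_CREDENTIALS


def strong_password(num_bytes: int = 24) -> str:
    """Random URL-safe password; 24 bytes gives 32 characters."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    if is_demo_credential("guest", "guest"):
        print("Demo credential detected. Example replacement: " + strong_password())
```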

License

This repository contains sample configurations for AI Smart Redact, a commercial product by PDF Tools AG. Running the service requires a valid license key.
