@pacamelo/core

Core PII detection and redaction engine for PURGE.

Why Open Source?

This package contains all the code that touches your data. We open-source it so you can verify:

  • What patterns we search for - See exactly which regex patterns detect PII
  • How files are parsed - Verify no data is exfiltrated during processing
  • How masking works - Check that sensitive data is properly hidden
  • Security measures - Audit ReDoS protection, memory cleanup, etc.

The PURGE UI remains proprietary, but the data-processing core is transparent.

Installation

npm install @pacamelo/core

What's Included

Detection Engine

  • Regex patterns for: emails, SSNs, phone numbers, credit cards, addresses, names
  • Configurable sensitivity levels
  • Web Worker isolation for ReDoS protection
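
To make the idea of regex-based detection concrete, here is a minimal sketch. It is not the package's implementation: the `SketchDetection` shape, the `detectSketch` function, and both patterns are assumptions made for illustration, and the real patterns are likely stricter.

```typescript
// Hypothetical shape for illustration; the real Detection type in @pacamelo/core may differ.
interface SketchDetection {
  category: "email" | "ssn";
  value: string;
  start: number; // index of the first matched character
  end: number;   // index one past the last matched character
}

// Deliberately simple patterns (assumptions, not the package's own).
const SKETCH_PATTERNS: Record<SketchDetection["category"], RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
};

function detectSketch(text: string): SketchDetection[] {
  const found: SketchDetection[] = [];
  const categories = Object.keys(SKETCH_PATTERNS) as SketchDetection["category"][];
  for (const category of categories) {
    // Build a fresh regex per call so no lastIndex state leaks between scans.
    const pattern = new RegExp(SKETCH_PATTERNS[category].source, "g");
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
      found.push({
        category,
        value: match[0],
        start: match.index,
        end: match.index + match[0].length,
      });
    }
  }
  // Report detections in document order.
  return found.sort((a, b) => a.start - b.start);
}
```

Recording character offsets alongside the matched value lets a later redaction step replace exactly the detected span.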

Document Processors

  • XLSX parser with column selection
  • Metadata stripping on output
  • Formula detection warnings
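
One plausible way to produce formula warnings is sketched below. It assumes the parsed sheet is available as rows of string cells and treats any cell whose text begins with `=` as a formula. Both the input shape and the `findFormulaCells` name are hypothetical; the package's XLSX parser may expose formulas differently.

```typescript
// Hypothetical warning shape: zero-based row and column of the suspect cell.
interface FormulaWarning {
  row: number;
  column: number;
  value: string;
}

// Flags cells whose text starts with "=", ignoring leading whitespace.
function findFormulaCells(rows: string[][]): FormulaWarning[] {
  const warnings: FormulaWarning[] = [];
  rows.forEach((cells, row) => {
    cells.forEach((value, column) => {
      if (value.trim().charAt(0) === "=") {
        warnings.push({ row, column, value });
      }
    });
  });
  return warnings;
}
```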

Utilities

  • PII masking (partial masks, full redaction)
  • Secure random ID generation
  • File type validation via magic bytes
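
The sketch below illustrates two of these utilities under stated assumptions. `maskPartial` and `looksLikeZip` are hypothetical names, not the package's exports. The only external fact relied on is that an XLSX file is a ZIP container, and a ZIP local file header begins with the bytes `50 4B 03 04`.

```typescript
// Partial mask: hide everything except the last `visible` characters.
// Values no longer than `visible` are masked completely so short secrets are never shown.
function maskPartial(value: string, visible: number = 4, maskChar: string = "*"): string {
  const keep = Math.max(0, Math.floor(visible));
  if (value.length <= keep) return maskChar.repeat(value.length);
  return maskChar.repeat(value.length - keep) + value.slice(value.length - keep);
}

// ZIP local file header signature ("PK\x03\x04").
const ZIP_MAGIC = [0x50, 0x4b, 0x03, 0x04];

// Checks the leading bytes instead of trusting the file extension or MIME type.
function looksLikeZip(bytes: Uint8Array): boolean {
  if (bytes.length < ZIP_MAGIC.length) return false;
  return ZIP_MAGIC.every((expected, i) => bytes[i] === expected);
}
```

A matching signature is necessary but not sufficient: every XLSX file passes this check, but so does any other ZIP archive, so a real validator would also inspect the archive contents.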

React Hooks

  • useDocumentProcessor - Main processing orchestration
  • useOfflineEnforcement - Privacy mode state machine
  • useFileEntropy - Before/after verification

Usage

import {
  regexDetectionEngine,
  getProcessorForFile,
  useDocumentProcessor,
  type Detection,
} from '@pacamelo/core';

// Detect PII in text.
// `content` (the text to scan) and `config` are supplied by the caller.
const result = await regexDetectionEngine.detect(content, config);
console.log(result.detections); // Array of Detection objects

// Process a file.
// `file` and `redactions` are supplied by the caller.
const processor = getProcessorForFile(file);
const parsed = await processor.parse(file);
const blob = await processor.applyRedactions(parsed, redactions);

Security Features

  • ReDoS Protection: Regex runs in Web Workers with enforced timeouts
  • Memory Cleanup: Buffers zeroed after processing (best-effort in JS)
  • Metadata Stripping: Output files rebuilt without hidden content
  • Offline Mode: Encourages airplane mode during sensitive processing
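
The first two items can be sketched as follows. This is an illustration under assumptions, not the package's code: `WorkerLike`, `runWithTimeout`, and `wipe` are hypothetical names. The point it shows is that a timeout only protects against a runaway regex when the regex runs off the main thread, because terminating the worker is what actually stops it.

```typescript
// Stand-in for the subset of the browser Worker API used here,
// so the sketch also runs outside a browser.
interface WorkerLike {
  onmessage: ((event: { data: unknown }) => void) | null;
  postMessage(message: unknown): void;
  terminate(): void;
}

// Resolves with the worker's first reply, or terminates it after timeoutMs.
function runWithTimeout(worker: WorkerLike, message: unknown, timeoutMs: number): Promise<unknown> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      worker.terminate(); // stops a regex that is still backtracking
      reject(new Error(`worker exceeded ${timeoutMs} ms`));
    }, timeoutMs);
    worker.onmessage = (event) => {
      clearTimeout(timer);
      worker.terminate(); // one-shot worker: release it once it has answered
      resolve(event.data);
    };
    worker.postMessage(message);
  });
}

// Best-effort cleanup: overwrite the bytes we own once processing is done.
function wipe(bytes: Uint8Array): void {
  bytes.fill(0);
}
```

Zeroing is best-effort because it only clears the typed array you hold. JavaScript strings are immutable and cannot be overwritten, and the engine may have made copies that the code cannot reach.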

License

MIT - See LICENSE

Contributing

Issues and PRs welcome at github.com/Pacamelo/purge-core.
