OctoMask - Text Anonymizer


Protect sensitive data before sharing with AI tools

Features | Quick Start | How It Works | Supported Formats | Privacy | Português


Why OctoMask?

When using AI assistants like ChatGPT, Claude, or Gemini, you often need to share documents containing sensitive information. OctoMask (like an octopus with many arms masking your sensitive data) helps you:

  • Anonymize personal data before sending to external AIs
  • De-anonymize the AI's response to restore original values
  • Keep everything local - your data never leaves your computer

Demo

Basic Version (REGEX)

[Screenshot: OctoMask Basic Version]

AI-Powered Version (Ollama)

[Screenshot: OctoMask AI Version]

Features

Two Versions Available

Version    | File          | Best For
Basic      | index.html    | Quick anonymization with REGEX patterns
AI-Powered | index_ai.html | Smart name/entity detection with local LLM

Core Features

  • 100% offline - no internet connection required
  • Auto-detection of 20+ entity types (SSN, emails, phones, credit cards, etc.)
  • AI-powered name detection using local Ollama (no cloud API!)
  • Automatic language detection (English/Portuguese) for optimal AI prompts
  • Manual marking for names and addresses
  • Support for PDF and DOCX file loading
  • Mapping history saved locally in your browser
  • Export/Import mappings as JSON files
  • Works on any modern browser (Chrome, Firefox, Safari, Edge)
  • Designed to support GDPR/LGPD compliance - data never leaves your machine
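
The mapping export/import feature listed above can be pictured as a simple JSON round trip. The sketch below is written in Python purely for illustration (OctoMask itself runs in the browser). The file layout, the `version` field, and the function names are assumptions, not the format OctoMask actually writes.

```python
from __future__ import annotations

import json


def export_mapping(mapping: dict[str, str]) -> str:
    """Serialize a placeholder -> original-value mapping to JSON text."""
    # Hypothetical layout: a version tag plus the mapping itself.
    return json.dumps({"version": 1, "mapping": mapping}, ensure_ascii=False, indent=2)


def import_mapping(text: str) -> dict[str, str]:
    """Load a mapping produced by export_mapping, rejecting malformed input."""
    data = json.loads(text)
    mapping = data.get("mapping") if isinstance(data, dict) else None
    if not isinstance(mapping, dict):
        raise ValueError("not a mapping file")
    if not all(isinstance(k, str) and isinstance(v, str) for k, v in mapping.items()):
        raise ValueError("mapping keys and values must be strings")
    return mapping


original = {"PERSON_01": "John Smith", "SSN_01": "123-45-6789"}
restored = import_mapping(export_mapping(original))
```

Validating on import matters because an exported file contains the original sensitive values and may have been edited by hand before being loaded again.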

Quick Start

Basic Version (No Setup Required)

  1. Download the latest release
  2. Extract the ZIP file
  3. Open web/index.html in your browser
  4. That's it! No installation needed.

AI Version (Requires Ollama)

  1. Download and extract the release
  2. Install Ollama and a model:
    • Windows: Run web/install_ollama.bat
    • Mac: Run bash web/install_ollama_mac.sh in Terminal
    • Linux: Run bash web/install_ollama_linux.sh in Terminal
  3. Open web/index_ai.html in your browser
  4. The AI will automatically detect names and other entities!

Recommended model: qwen2.5:3b - best balance of speed and Portuguese/English quality (needs roughly 3-4 GB of RAM)
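
To give a feel for what the AI version does after setup, here is a minimal Python sketch of asking a local Ollama server for names. It uses Ollama's `/api/generate` endpoint on the default port 11434 with `stream` disabled and `format` set to `json`. The prompt wording, the `{"names": [...]}` reply shape, the stopword-based language guess, and all function names are assumptions made for illustration. OctoMask's real prompts and browser-side code may differ.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

# Hypothetical prompt templates, one per supported language.
PROMPTS = {
    "en": 'List every person or company name in the text. Reply as JSON: {"names": [...]}.\n\nText:\n',
    "pt": 'Liste todos os nomes de pessoas ou empresas no texto. Responda em JSON: {"names": [...]}.\n\nTexto:\n',
}

# Tiny stopword lists: a crude heuristic, not a real language detector.
PT_HINTS = {"que", "não", "para", "uma", "com", "em", "os", "da", "é", "de"}
EN_HINTS = {"the", "and", "of", "to", "is", "in", "for", "with", "that", "at"}


def guess_language(text):
    """Return 'pt' or 'en' depending on which stopword list matches more words."""
    words = text.lower().split()
    pt = sum(w in PT_HINTS for w in words)
    en = sum(w in EN_HINTS for w in words)
    return "pt" if pt > en else "en"


def build_request(text, model="qwen2.5:3b"):
    """Build the JSON body for a non-streaming /api/generate call."""
    payload = {
        "model": model,
        "prompt": PROMPTS[guess_language(text)] + text,
        "stream": False,
        "format": "json",
    }
    return json.dumps(payload).encode("utf-8")


def parse_names(raw_body):
    """Extract the name list from Ollama's reply; return [] if the model output is unusable."""
    outer = json.loads(raw_body)
    try:
        inner = json.loads(outer.get("response", ""))
    except json.JSONDecodeError:
        return []
    names = inner.get("names", []) if isinstance(inner, dict) else []
    return [n for n in names if isinstance(n, str) and n.strip()]


def detect_names(text):
    """Send the text to the local Ollama server (requires Ollama to be running)."""
    req = urllib.request.Request(
        OLLAMA_URL, data=build_request(text), headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return parse_names(resp.read())
```

Because the request goes to `localhost`, the text never leaves the machine. Treating unparseable model output as "no names found" keeps a misbehaving model from breaking the rest of the workflow.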

Clone Repository

git clone https://github.com/BrunnoML/octomask.git
cd octomask
# Open web/index.html (basic) or web/index_ai.html (AI version)

How It Works

┌─────────────────────────────────────────────────────────────────┐
│  1. PASTE TEXT                                                  │
│     "John Smith (SSN: 123-45-6789) lives at 123 Main St..."    │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│  2. DETECT & ANONYMIZE                                          │
│     "PERSON_01 (SSN: SSN_01) lives at ADDR_01..."              │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│  3. USE WITH EXTERNAL AI                                        │
│     Send anonymized text to ChatGPT, Claude, etc.               │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│  4. DE-ANONYMIZE RESPONSE                                       │
│     Restore original values in the AI's response                │
└─────────────────────────────────────────────────────────────────┘
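
The four steps above can be sketched in a few lines of Python. This is an illustrative approximation, not OctoMask's actual code: the single SSN pattern, the `manual_marks` parameter (standing in for manual marking of names and addresses), and the function names are all assumptions.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # simplified US SSN pattern


def anonymize(text, manual_marks=()):
    """Return (masked_text, mapping); manual_marks is a sequence of (label, value) pairs."""
    mapping = {}   # placeholder -> original value
    counters = {}  # label -> last index used

    def placeholder_for(label, value):
        # Reuse the same placeholder when a value appears more than once.
        for token, original in mapping.items():
            if original == value and token.startswith(label + "_"):
                return token
        counters[label] = counters.get(label, 0) + 1
        token = f"{label}_{counters[label]:02d}"
        mapping[token] = value
        return token

    # Manually marked values first, longest first so a short value
    # cannot split a longer one that contains it.
    for label, value in sorted(manual_marks, key=lambda m: -len(m[1])):
        if value in text:
            text = text.replace(value, placeholder_for(label, value))
    # Then pattern-based detection.
    text = SSN_RE.sub(lambda m: placeholder_for("SSN", m.group()), text)
    return text, mapping


def deanonymize(text, mapping):
    """Put the original values back into text that contains placeholders."""
    # Longest tokens first so PERSON_100 is not clobbered by PERSON_10.
    for token in sorted(mapping, key=len, reverse=True):
        text = text.replace(token, mapping[token])
    return text


source = "John Smith (SSN: 123-45-6789) lives at 123 Main St."
masked, mapping = anonymize(source, [("PERSON", "John Smith"), ("ADDR", "123 Main St")])
# masked == "PERSON_01 (SSN: SSN_01) lives at ADDR_01."

ai_reply = "Dear PERSON_01, we updated the record for SSN_01."
restored = deanonymize(ai_reply, mapping)
# restored == "Dear John Smith, we updated the record for 123-45-6789."
```

Only `masked` would be sent to the external AI. The `mapping` stays local and is what makes step 4 possible.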

Supported Entity Types

Automatic Detection

Category     | Entity Types
ID Documents | SSN, CPF, CNPJ, Passport
Contact      | Email, Phone (US/BR/International)
Financial    | Credit Card, IBAN, Bank Account
Location     | ZIP/Postal Code (US/UK/BR), Address
Vehicle      | License Plate (US/BR)
Date/Time    | Dates (US/ISO format), Time
Digital      | IP Address, MAC Address, URL
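
Pattern-based detection of a few of these entity types might look like the Python sketch below. The patterns are deliberately simplified assumptions, not the ones OctoMask ships. The Luhn checksum on card numbers and the 0-255 range check on IP octets are common ways to cut false positives, and it is not confirmed that OctoMask applies either.

```python
import re

# Simplified, hypothetical patterns for a handful of entity types.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CPF": re.compile(r"\b\d{3}\.\d{3}\.\d{3}-\d{2}\b"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "CARD": re.compile(r"\b(?:\d[ -]?){12,18}\d\b"),
}


def luhn_ok(number):
    """Luhn checksum: double every second digit from the right and sum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def find_entities(text):
    """Return (label, matched_text) pairs, grouped by pattern in PATTERNS order."""
    found = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            if label == "CARD" and not luhn_ok(m.group()):
                continue  # digit run that fails the checksum
            if label == "IPV4" and any(int(p) > 255 for p in m.group().split(".")):
                continue  # octet out of range
            found.append((label, m.group()))
    return found


sample = (
    "Mail ana@example.com, card 4111 1111 1111 1111, "
    "bad card 4111 1111 1111 1112, CPF 123.456.789-09, host 192.168.0.1"
)
entities = find_entities(sample)
```

Here the second card number is skipped because it fails the checksum, which shows how validation keeps arbitrary digit runs from being masked as cards.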

Manual Marking

  • Names (people, companies)
  • Addresses
  • Custom entities

Privacy & Security

Your data stays on YOUR computer.

OctoMask is designed with privacy as the core principle:

  • No server - Everything runs in your browser
  • No analytics - We don't track anything
  • No external requests - Network tab will show zero external connections
  • Open source - Audit the code yourself

To verify: press F12 → Network tab → use OctoMask → confirm there are NO external requests.
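
As a complement to the Network tab check, the source files can also be scanned for hard-coded URLs. The Python sketch below is a hypothetical audit helper and not part of OctoMask. It lists every `http(s)` URL whose host is not local, under the assumption that the app files live in a `web/` folder as in the Quick Start. A listed URL is only a candidate for review: plain hyperlinks and XML namespace strings show up too, and only the Network tab shows what is actually requested.

```python
import re
from pathlib import Path
from urllib.parse import urlparse

URL_RE = re.compile(r"""https?://[^\s"'<>)]+""")
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}  # a local Ollama server counts as local


def external_urls(source):
    """Return the sorted, de-duplicated non-local URLs found in a source string."""
    urls = set()
    for match in URL_RE.findall(source):
        try:
            host = urlparse(match).hostname
        except ValueError:
            continue  # malformed URL-like text
        if host and host not in LOCAL_HOSTS:
            urls.add(match)
    return sorted(urls)


def scan_folder(folder):
    """Map each .html/.js/.css file under folder to the external URLs it mentions."""
    report = {}
    for path in Path(folder).rglob("*"):
        if path.is_file() and path.suffix.lower() in {".html", ".js", ".css"}:
            hits = external_urls(path.read_text(encoding="utf-8", errors="replace"))
            if hits:
                report[str(path)] = hits
    return report
```

Running `scan_folder("web")` from the repository root would return a dictionary of files and the URLs to review. An empty dictionary means no non-local URLs were found in those files.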

Browser Compatibility

Browser     | Supported
Chrome 80+  | Yes
Firefox 75+ | Yes
Safari 13+  | Yes
Edge 80+    | Yes

Roadmap

  • Python API version with Docker support
  • AI-powered name detection (local LLM) - Done! See index_ai.html
  • Automatic language detection - Done! Detects PT-BR/EN automatically
  • More document formats (ODT, RTF)
  • Browser extension
  • Custom regex patterns

Contributing

Contributions are welcome! Please read our Contributing Guidelines first.

# Fork the repository
# Create your feature branch
git checkout -b feature/amazing-feature

# Commit your changes
git commit -m 'Add amazing feature'

# Push to the branch
git push origin feature/amazing-feature

# Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Ollama - Local LLM runtime for AI-powered detection
  • PDF.js - PDF parsing library by Mozilla
  • Mammoth.js - DOCX to text converter

Made with care for privacy
GitHub | Report Bug | Request Feature
