A deterministic privacy boundary between your data and AI.
Intercepts query results before the model sees them — rule-driven, reproducible, and audit-ready.
AI agents increasingly access internal databases and APIs through CLI tools, scripts, and MCP servers. Without safeguards, sensitive data such as emails, phone numbers, tax identifiers, and payment details can be unintentionally exposed to LLM context windows.
gate intercepts query results before they reach the model and automatically redacts detected PII fields without requiring changes to existing agent workflows or prompts.
Note: Currently, gate supports Bash-based tooling. MCP server interception support is planned.
The agent asked for all users in plain English; gate intercepted the query and returned all columns with full_name and email masked before they reached the model context.
Also works with OpenCode — see the full list of supported harnesses.
Before installing the hook, use gate scan to assess how much PII your database schema exposes. Pipe the output of a schema query — one that returns TABLE_NAME and COLUMN_NAME — and gate prints a risk report across every table.
# PostgreSQL (toolkit-managed)
tkpsql query --sql "SELECT table_name, column_name FROM information_schema.columns WHERE table_schema = 'public' ORDER BY table_name, ordinal_position" | gate scan
# PostgreSQL (direct)
psql -U <user> -h <host> -d <dbname> -c "SELECT table_name, column_name FROM information_schema.columns WHERE table_schema = 'public' ORDER BY table_name, ordinal_position" | gate scan
# Databricks (toolkit-managed)
tkdbr query --conn dev --sql "SELECT TABLE_NAME, COLUMN_NAME FROM system.INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = '<schema>' ORDER BY TABLE_NAME, COLUMN_NAME" --limit 1000 | gate scan
# MS SQL Server (toolkit-managed)
tkmsql query --sql "SELECT TABLE_NAME, COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS ORDER BY TABLE_NAME, ORDINAL_POSITION" | gate scan
Example output:
Gate PII Scan
───────────────────────────────────────────────────────────
Summary
Tables scanned 12
Columns scanned 87
PII columns 34 (39.1%)
Non-PII columns 53 (60.9%)
Risk level CRITICAL
Detected Categories
───────────────────────────────────────────────────────────
Names 8 23.5%
Contact 6 17.6%
Employment 5 14.7%
Government IDs 4 11.8%
Financial 3 8.8%
Top Findings
───────────────────────────────────────────────────────────
Names
users.full_name
patients.first_name
patients.last_name
... and 5 more
Contact
users.email
customers.phone_number
orders.contact_email
Government IDs
patients.ssn
employees.tax_id
employees.national_id
Hint
Use --verbose to show all detected columns
Risk levels: CRITICAL (>25% of columns are PII), HIGH (>10%), LOW (≤10%). The command exits with code 1 if any PII columns are found, making it scriptable in CI audits. Pass --verbose to show the full list of detected columns in each category instead of a truncated preview.
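As a rough illustration of the thresholds above, the following Python sketch maps a PII column count to a risk label and an exit code. The function names are hypothetical, and returning 0 when no PII is found is an assumption; the text only states that the command exits with 1 when PII columns are present.

```python
def risk_level(pii_columns: int, total_columns: int) -> str:
    """Map the share of PII columns to the labels CRITICAL / HIGH / LOW."""
    if total_columns <= 0:
        raise ValueError("total_columns must be positive")
    # Integer arithmetic avoids floating-point edge cases at exactly 25% and 10%.
    if pii_columns * 100 > 25 * total_columns:
        return "CRITICAL"
    if pii_columns * 100 > 10 * total_columns:
        return "HIGH"
    return "LOW"


def exit_code(pii_columns: int) -> int:
    """Exit 1 when any PII column is found (0 otherwise is an assumption)."""
    return 1 if pii_columns > 0 else 0


# The example report above: 34 PII columns out of 87 scanned.
print(risk_level(34, 87), exit_code(34))
```

With the numbers from the example report, the share is about 39.1%, which falls above the 25% cutoff.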
If you have not yet created a config, run gate config --init-only first to generate a starter config. No tools need to be configured to use gate scan — it only uses built-in column-name detection.
- Install gate:
      # Homebrew (recommended)
      brew tap GaaraZhu/gate && brew install gate
      # Or via cargo
      cargo install --git https://github.com/GaaraZhu/gate
- Create your config (opens ~/.config/gate/config.yaml in your editor):
      gate config
- Register the hook with your agent harness:
      # Claude Code
      gate init
      # OpenCode (global)
      gate init --harness opencode
      # OpenCode (project-scoped)
      gate init --harness opencode --scope project
  Restart your opencode session after running gate init to load the plugin.
- Start your AI session. gate intercepts query commands automatically, with no changes to your prompts or tools required.
Run gate validate to confirm your config is valid before the first session.
gate currently covers the Bash tooling path: every Bash command the AI tries to run passes through gate hook first. Commands that match a configured tool are silently rewritten to gate run -- <original command>, which applies two sequential detection gates and returns sanitized JSON. The AI sees the same JSON structure as before, with PII values replaced by typed placeholders like [PII:email].
The rewrite is enforced in both supported harnesses, so the AI cannot bypass it:
- Claude Code: registered as a PreToolUse hook in ~/.claude/settings.json; Claude Code replaces the command via updatedInput before running it.
- OpenCode: a TypeScript plugin's tool.execute.before handler mutates output.args.command before the subprocess spawns; same guarantee as Claude Code.
Humans and CI scripts running outside the agent harness are unaffected — no wrapper scripts are installed on PATH.
AI asks to run: tkpsql query --sql "SELECT * FROM users"
│
harness hook fires (PreToolUse / tool.execute.before)
│
gate hook rewrites to: gate run -- tkpsql query --sql "..."
│
┌──────────────┴──────────────┐
│ Gate 1: SQL inspection │ SELECT * → no column hints, defer to Gate 2
│ Gate 2: Value scanning │ regex + column-name heuristics + Luhn check
└──────────────┬──────────────┘
│
{"id": 1, "full_name": "[PII:name]", "email": "[PII:email]", "status": "active", ..., "_gate_summary": {...}}
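The rewrite step in the diagram can be sketched in Python as follows. This is only an illustration: the real matching logic of gate hook is not documented here, so matching on the first word of the command, the CONFIGURED_TOOLS set, and the rewrite_command name are all assumptions.

```python
import shlex

# Hypothetical stand-in for the tool names listed under `tools:` in the config.
CONFIGURED_TOOLS = {"tkpsql", "tkdbr", "tkmsql", "psql", "mysql", "curl"}


def rewrite_command(command: str, tools=CONFIGURED_TOOLS) -> str:
    """Prefix the command with `gate run --` when its first word is a configured tool."""
    try:
        words = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: this sketch leaves the command untouched.
        return command
    if words and words[0] in tools:
        return f"gate run -- {command}"
    # Anything not listed passes through unchanged.
    return command


print(rewrite_command('tkpsql query --sql "SELECT * FROM users"'))
print(rewrite_command("ls -la"))
```

Commands whose first word is not in the set come back unchanged, which mirrors the statement that only configured tools are intercepted.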
What the tool returns (never reaches the model):
{
"rows": [
{
"id": 1,
"full_name": "Alice Johnson",
"email": "alice.johnson@example.com",
"status": "active",
"created_at": "2023-01-15 10:30:00",
"last_login_at": "2024-05-06 14:22:00"
},
...
],
"count": 5
}
What the AI sees:
{
"rows": [
{
"id": 1,
"full_name": "[PII:name]",
"email": "[PII:email]",
"status": "active",
"created_at": "2023-01-15 10:30:00",
"last_login_at": "2024-05-06 14:22:00"
},
...
],
"count": 5,
"_gate_summary": {
"redacted": 10,
"types": ["name", "email"],
"warnings": ["SELECT * used — consider listing columns explicitly"]
}
}
gate ships with two layers of built-in detection that require no configuration.
Gate 1 — column-name inference from SQL. When a sql_arg is configured, gate parses the SELECT list and marks any column whose name matches a PII pattern as a forced-redact target — even if the raw value would not trigger a regex.
Gate 2 — value scanning and column-name heuristics. Every string field in the JSON output is evaluated against regex patterns and a column-name classifier. The classifier tokenises column names (handling snake_case, camelCase, PascalCase, and UPPER_CASE) so userEmail, user_email, and USER_EMAIL all resolve to the same detection rule.
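The tokenisation described above can be approximated with a short Python sketch. The regular expression and the tokenize and normalize helpers are illustrative assumptions, not gate's actual classifier; the point is only that different casing styles collapse to the same token sequence.

```python
import re
from typing import List


def tokenize(column_name: str) -> List[str]:
    """Split a column name into lowercase word tokens."""
    tokens: List[str] = []
    # First split on separators such as underscores, then on case boundaries.
    for part in re.split(r"[^A-Za-z0-9]+", column_name):
        # An uppercase run (USER, ID), a capitalised or lowercase word, or digits.
        tokens += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part)
    return [t.lower() for t in tokens]


def normalize(column_name: str) -> str:
    """Join the tokens in snake_case so all spellings compare equal."""
    return "_".join(tokenize(column_name))


for name in ("userEmail", "user_email", "USER_EMAIL", "UserEmail"):
    print(name, "->", normalize(name))
```

A rule keyed on the normalised form (for example, any name containing the token email) would then apply to every spelling.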
| Category | Detected columns (representative examples) |
|---|---|
| Names | first_name, last_name, full_name, given_name, family_name, surname, preferred_name, middle_name, maiden_name, salutation; <entity>_name where entity is one of: contact, customer, client, employee, patient, member, owner, recipient, sender, spouse, parent, guardian, manager, sibling, children |
| Demographics | gender, sex, nationality, citizenship |
| Government IDs | passport, license / driver_license_number, ssn / social_security_number, national_id, tax_number / tax_id / ird_number, visa_number, resident_id, immigration_id |
| Contact | email / email_address / mail, phone / phone_number / mobile, fax |
| Date of birth | dob, birth, birthday, date_of_birth, birth_date, dateOfBirth |
| Location of birth | birth_country, birth_place, birth_city, country_of_birth, place_of_birth, city_of_birth, state_of_birth |
| Address & location | address / addr, street, city, state, province, country, postcode, zip, suburb, latitude, longitude, gps, coordinates |
| Financial | bank_account, account_number, iban, swift, routing_number, bsb, credit_card / card_number, cvv / cvc, expiry |
| Employment | salary, wage, job_title, employee_id, staff_id, student_id, manager_id, and any <entity>_id / <entity>_number where entity is: employee, staff, student, member, client, customer, consumer, cust, crm, person, manager, user, device, session, cookie, advertising, external |
| Health & medical | medical, health, diagnosis, prescription, disability, vaccination, vaccine, npi |
| Online & technical | username / user_name, ip_address, mac_address, auth_token, user_id, device_id, session_id, cookie_id, advertising_id |
| Biometric | biometric, fingerprint, voiceprint, retina, face_scan |
| Family & relationships | next_of_kin, emergency_contact, spouse_name, parent_name, guardian_name, children_names |
| Pattern | Detection | Example values caught |
|---|---|---|
| Email address | Regex (confidence 0.95) | alice@example.com, user+tag@company.co.uk |
| Social Security Number | Regex (confidence 0.90) | 123-45-6789 |
| Phone number | Regex (confidence 0.70) | +1 555-123-4567, (555) 123-4567, 555.123.4567 |
| Credit / debit card | Regex + Luhn algorithm (confidence 1.0) | 4111 1111 1111 1111, 5500-0055-5555-5559 |
When a column name also matches the denylist, Gate 2 adds a 0.15 confidence boost to any value hit in that column, pushing borderline matches over the redaction threshold.
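The boost can be illustrated with a small Python sketch. It uses the default values shown in the configuration file (column_name_boost: 0.15, confidence_threshold: 0.8) and assumes that a score exactly at the threshold is redacted, since the config comment says only values below it are left unredacted. The function names are hypothetical.

```python
COLUMN_NAME_BOOST = 0.15     # default column_name_boost from the config
CONFIDENCE_THRESHOLD = 0.8   # default confidence_threshold from the config


def final_confidence(base: float, column_in_denylist: bool,
                     boost: float = COLUMN_NAME_BOOST) -> float:
    """Add the boost when the column name matches, capping the score at 1.0."""
    score = base + boost if column_in_denylist else base
    return min(score, 1.0)


def should_redact(base: float, column_in_denylist: bool,
                  threshold: float = CONFIDENCE_THRESHOLD) -> bool:
    """Redact when the final score reaches the threshold (assumed inclusive)."""
    return final_confidence(base, column_in_denylist) >= threshold


# A phone-number regex hit has base confidence 0.70 in the table above.
print(should_redact(0.70, column_in_denylist=False))
print(should_redact(0.70, column_in_denylist=True))
```

Under these defaults a phone-number match alone (0.70) stays below the threshold, while the same match in a denylisted column scores about 0.85 and is redacted.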
Add your own columns or patterns in config — see Configuration below.
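For reference, the Luhn checksum named in the pattern table works as in the Python sketch below. This is the standard algorithm written out for illustration, not gate's own code; ignoring spaces and hyphens before checking is an assumption about how candidate values are cleaned.

```python
def luhn_valid(number: str) -> bool:
    """Return True when the digits in `number` pass the Luhn checksum."""
    # Keep ASCII digits only, so separators like spaces and hyphens are ignored.
    digits = [int(ch) for ch in number if "0" <= ch <= "9"]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))
print(luhn_valid("4111 1111 1111 1112"))
```

Both example card values from the table pass this check, while changing the last digit makes it fail.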
Any command that returns JSON can be configured as a gate target — database clients, internal API calls via curl, or any other tool your AI agent uses to fetch data. The AI sees the same structured response it always did, with PII values replaced in-place.
The tk* commands are managed by toolkit, a credential-injecting CLI wrapper for database clients. gate works with any JSON-returning command — toolkit is not required.
| Command | Type | Notes |
|---|---|---|
| tkpsql | PostgreSQL (toolkit-managed) | sql_arg: "--sql" |
| tkmsql | MS SQL Server (toolkit-managed) | sql_arg: "--sql" |
| tkdbr | Databricks (toolkit-managed) | sql_arg: "--sql" |
| psql | PostgreSQL (direct) | sql_arg: "-c", extra_args: ["--csv"], pipe: "python3 ..."; gate injects --csv automatically and converts output to JSON |
| mysql | MySQL (direct) | sql_arg: "-e" |
| curl | HTTP data sources | pipe: "jq -c ."; wraps output through jq so Gate 2 receives JSON |
| Any JSON-returning command | - | Add it to tools: in config |
Config lives at ~/.config/gate/config.yaml (override with GATE_CONFIG).
# Set to false to disable all PII redaction (or use GATE_DISABLED=1 for a session).
enabled: true
# Tools whose Bash invocations are intercepted and piped through `gate run`.
# Only tools listed here are intercepted; everything else passes through unchanged.
tools:
  tkpsql:
    sql_arg: "--sql" # Gate 1 parses this SQL to extract column names for targeted redaction
  tkdbr:
    sql_arg: "--sql"
  tkmsql:
    sql_arg: "--sql"
  psql:
    sql_arg: "-c"
    extra_args: ["--csv"] # injected automatically; switches psql to CSV output for the pipe
    pipe: "python3 -c \"import sys,csv,json; r=csv.DictReader(sys.stdin); print(json.dumps(list(r)))\""
  mysql:
    sql_arg: "-e"
  curl:
    pipe: "jq -c ." # wraps curl output through jq so Gate 2 always receives JSON
pii:
  action: redact # redact | warn | reject
  wildcard_policy: warn # warn | reject; applies when the AI uses SELECT *
  # Add column names beyond the built-in denylist (see Built-in PII detection above).
  # column_names:
  #   - secret_token
  #   - api_key
  # Override or add PII regex patterns.
  # patterns:
  #   internal_id:
  #     regex: '\bEMP-\d{6}\b'
  #     confidence: 0.85
  # Added to a pattern's base confidence when the JSON key also matches the column denylist.
  # Final score is capped at 1.0.
  column_name_boost: 0.15
  # Values matched below this threshold are flagged in _gate_summary but not redacted.
  confidence_threshold: 0.8
  # Redaction placeholder template; {type} is replaced with the pattern name.
  redaction: "[PII:{type}]"
  include_summary: true
  # When true, redacted values include a deterministic 8-char hex suffix derived
  # from the original value (e.g. [PII:email:7f83b165]). The same raw value always
  # produces the same suffix, so the AI can correlate records across rows without
  # seeing the underlying data. Set hash_salt to a fixed secret for consistent
  # hashes across runs; leave empty for zero-config determinism.
  hash_values: false
  hash_salt: ""
| Command | Purpose |
|---|---|
| gate init [--harness claude-code\|opencode] [--scope global\|project] | Register the hook in the agent harness. claude-code (default) writes ~/.claude/settings.json; opencode writes a TypeScript plugin at the chosen scope. |
| gate uninstall | Remove the hook, config directory, and gate-generated opencode plugins (with confirmation) |
| gate enable | Enable PII redaction (sets enabled: true in config) |
| gate disable | Disable PII redaction (sets enabled: false in config) |
| gate config [--init-only] | Create and edit the config file. --init-only creates ~/.config/gate/config.yaml without opening the editor, which is useful in scripts. |
| gate list | Show configured tools and their SQL flags |
| gate validate | Check config for errors and warnings |
| gate version | Print version |
| gate scan [--verbose] | Pipe schema query output (SELECT TABLE_NAME, COLUMN_NAME ...) into this to get a PII risk report across all tables. --verbose shows all detected columns without truncation. Exits 1 if any PII columns are found, so it is scriptable in CI audits. |
| gate run [--verbose] [-- <cmd>] | Run a command through the redaction pipeline, or pipe JSON from stdin for direct Gate 2 inspection. Normally invoked by the hook; run manually to test. --verbose prints each field's Gate 2 decision to stderr. |
| gate hook | (internal) Hook entry point, invoked by the harness rather than directly |
To disable redaction for a single shell session without editing the config file, set the GATE_DISABLED environment variable:
export GATE_DISABLED=1 # disable for this session
unset GATE_DISABLED # re-enable
The env var takes precedence over the config file, so it works even when enabled: true is set.
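The precedence rule can be pictured with a minimal Python sketch. The redaction_enabled helper is hypothetical; treating only the value 1 as "disabled" and defaulting to enabled when the config key is missing are both assumptions, since the text documents only GATE_DISABLED=1 and unset.

```python
import os
from typing import Mapping, Optional


def redaction_enabled(config: Mapping[str, object],
                      env: Optional[Mapping[str, str]] = None) -> bool:
    """Resolve the effective on/off state: environment first, then config."""
    env = os.environ if env is None else env
    # Assumption: only the documented value "1" disables redaction.
    if env.get("GATE_DISABLED") == "1":
        return False
    # Assumption: a missing `enabled` key means redaction stays on.
    return bool(config.get("enabled", True))


print(redaction_enabled({"enabled": True}, {"GATE_DISABLED": "1"}))
print(redaction_enabled({"enabled": True}, {}))
```

The environment variable is checked before the config value, so a session-level override wins even when the file says enabled: true.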
gate intercepts the output of configured tools and redacts PII before it reaches the model context. It is not a sandbox — it only applies to commands explicitly listed under tools: in config. Commands outside that list pass through the harness unchanged.
What gate covers:
- PII in query results returned by configured tools.
What gate does not cover:
- Commands not listed in tools: (the AI can invoke them freely)
- Write operations (INSERT, UPDATE, DELETE): gate does not inspect or block them
- Credential exposure: gate holds no credentials; that is the responsibility of the underlying tool
For a stronger boundary, combine gate with harness-level tool restrictions (e.g. limiting which Bash commands the agent is permitted to run) and database-level read-only roles.
Redacted output preserves the original JSON structure. PII values are replaced with [PII:<type>] placeholders. A _gate_summary field is appended reporting what was redacted. All other fields (including count, rows, etc.) are passed through from the underlying tool unchanged.
{
"rows": [{"id": 1, "email": "[PII:email]", "ssn": "[PII:ssn]"}],
"count": 1,
"_gate_summary": {"redacted": 2, "types": ["email", "ssn"], "warnings": []}
}
With hash_values: true, each placeholder gains an 8-char hex suffix derived from the original value. The same raw value always produces the same suffix, so the AI can join or deduplicate across rows without ever seeing the underlying data.
{
"rows": [{"id": 1, "email": "[PII:email:7f83b165]", "ssn": "[PII:ssn:3c2a1b0e]"}],
"count": 1,
"_gate_summary": {"redacted": 2, "types": ["email", "ssn"], "warnings": []}
}
Error responses from the underlying tool pass through unchanged.
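The placeholder formats shown above can be reproduced with a short Python sketch. The choice of SHA-256 and the way the salt is combined with the value are assumptions made for illustration; the text only states that the suffix is 8 hex characters and deterministic for a given value. The placeholder function name is hypothetical.

```python
import hashlib


def placeholder(pii_type: str, value: str,
                hash_values: bool = False, hash_salt: str = "") -> str:
    """Build a [PII:<type>] placeholder, optionally with an 8-char hex suffix."""
    if not hash_values:
        return f"[PII:{pii_type}]"
    # Assumption: SHA-256 over salt + value; gate's actual hash is not documented here.
    digest = hashlib.sha256((hash_salt + value).encode("utf-8")).hexdigest()
    return f"[PII:{pii_type}:{digest[:8]}]"


plain = placeholder("email", "alice.johnson@example.com")
first = placeholder("email", "alice.johnson@example.com", hash_values=True)
again = placeholder("email", "alice.johnson@example.com", hash_values=True)
print(plain)
print(first == again)  # same input, same suffix
```

Because the suffix depends only on the salt and the raw value, two rows holding the same email produce identical placeholders, which is what allows joining or deduplicating without exposing the value.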
gate uninstall
brew uninstall gate
gate uninstall removes everything gate added to your system: the hook from ~/.claude/settings.json, the config directory at ~/.config/gate/, and any gate-generated opencode plugins. It shows you exactly what will be deleted and asks for confirmation before touching anything.
Commands are passing through unredacted.
Run gate validate to check for config errors. Confirm the hook is registered by checking that ~/.claude/settings.json contains a gate hook entry — if not, re-run gate init. Then restart your agent session so the harness picks up the updated settings.
gate: command not found inside the agent session.
The shell PATH inside the harness may differ from your login shell. Find the full path with which gate in a normal terminal, then set GATE_BIN to that path or add the directory to the harness's PATH in your shell profile.
OpenCode isn't intercepting commands after gate init.
The plugin is loaded at session start — restart your opencode session after running gate init --harness opencode.
Config file not found.
Run gate config to create ~/.config/gate/config.yaml. If you store the config elsewhere, set GATE_CONFIG=/path/to/config.yaml in your environment.
Certain fields are not being masked (false negatives).
Run gate run --verbose -- <your-command> to see exactly why each field was passed or redacted. For each string field, verbose mode prints which step triggered (forced column, column-name classifier, Luhn, regex) or passed (no match) if nothing fired. You can also pipe a sample payload directly: echo '<json>' | gate run --verbose. Common fixes: add the column name to column_names: in config, or lower confidence_threshold if a pattern is matching below the threshold.
gate run --verbose shows "input is not JSON — redaction skipped".
The tool's output is not valid JSON, so Gate 2 cannot inspect it and passes the raw bytes through unchanged. This is expected for tools that return plain text or binary output. If you expect JSON, check the tool's output format — some CLIs require a --json or --output json flag to produce structured output.
Non-PII values are being redacted (false positives).
Raise confidence_threshold (e.g. to 0.9) to reduce over-redaction, or narrow the regex for the offending pattern in the patterns block. Run gate validate after editing to catch syntax errors.
_gate_summary warns about SELECT *.
Gate 1 can't infer column types from a wildcard query, so every value is passed to Gate 2's regex scanner. Use an explicit column list (SELECT id, status, created_at FROM users) to skip the warning and avoid scanning non-PII columns.
MCP server interception — gate currently covers the Bash tooling path (CLI commands the AI runs via the shell). The other common access pattern is MCP: the AI calls a Model Context Protocol server directly, bypassing the shell entirely. MCP support will bring the same two-gate redaction pipeline to MCP tool responses, with no changes required to the MCP server itself.
GitHub Copilot CLI: deferred to a future release. Copilot CLI's preToolUse hook only supports deny-with-suggestion (no transparent rewrite), which makes the integration advisory: strictly safer than no hook, but the AI could in principle ignore the suggested rewrite. We're holding the integration until either Copilot CLI gains an updatedInput equivalent or user demand justifies shipping the advisory-only mode.
Bug reports and pull requests are welcome. For significant changes, open an issue first to discuss the proposal. See CLAUDE.md for the full dev setup and pre-commit checklist.
MIT — see LICENSE.
See DISCLAIMER.md.

