Local-first Python CLI for sanitizing logs, stack traces, config snippets, and terminal output before you paste them into GitHub issues, support tickets, Slack, or AI chats.
ShareClean detects common sensitive values, replaces only the risky portion, and reports safe metadata without storing or printing the original secret. It makes no network calls and sends no telemetry.
Try the interactive browser playground to see the redaction rules before installing. The playground is for illustration only; real workflows run locally through the CLI.
With pipx:

    pipx install shareclean

From a local checkout:

    python -m pip install -e .

Run without installing from the repository root:

    python -m shareclean --help

Usage:

    shareclean app.log
    shareclean app.log --output app.cleaned.log
    shareclean app.log --report
    shareclean app.log --report --report-format json
    shareclean app.log --check
    shareclean app.log --check --fail-on severity:high
    shareclean app.log --check --fail-on category:token,rule:SC004
    shareclean app.log --check --ignore-for-check category:pii_email
    shareclean app.log --private-ip
    shareclean app.log --phone
    shareclean app.log --custom-pattern "EMP-[0-9]{6}"

--check exits 1 only for findings selected by the check policy and never writes sanitized text to stdout.
Configured fail_on and ignore_for_check policies from config files, profiles, or environment variables apply only in --check mode. Normal sanitization still redacts and reports findings, but those policies do not change the exit decision unless --check is present.
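
The check policy described above can be pictured as a small filter over findings. The sketch below is illustrative only and is not ShareClean's implementation: the `Finding` record and the function names are hypothetical, selectors are assumed to match exactly (the real tool may treat `severity:high` as a threshold), and an empty `fail_on` is assumed to select every finding that is not ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical minimal record; not ShareClean's real finding class.
    rule_id: str
    category: str
    severity: str


def parse_selectors(text: str) -> list[tuple[str, str]]:
    """Split 'category:token,rule:SC004' into (kind, value) pairs."""
    pairs = []
    for item in text.split(","):
        kind, sep, value = item.strip().partition(":")
        if not sep or not value or kind not in {"severity", "category", "rule"}:
            raise ValueError(f"invalid selector: {item!r}")
        pairs.append((kind, value))
    return pairs


def matches(finding: Finding, selectors) -> bool:
    """True if any selector names this finding (assumed exact match)."""
    fields = {
        "severity": finding.severity,
        "category": finding.category,
        "rule": finding.rule_id,
    }
    return any(fields[kind] == value for kind, value in selectors)


def check_exit_code(findings, fail_on=(), ignore_for_check=()) -> int:
    """Return 1 if any finding is selected by the policy, else 0."""
    selected = [f for f in findings if not matches(f, ignore_for_check)]
    if fail_on:
        # Assumption: without fail_on, every remaining finding counts.
        selected = [f for f in selected if matches(f, fail_on)]
    return 1 if selected else 0
```

Under these assumptions, a medium-severity email finding does not fail a check run with `--fail-on severity:high`, which mirrors how the sanitized output can still contain redactions while the exit decision stays 0.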
ShareClean supports committed project policy in either pyproject.toml or .shareclean.toml.

    [tool.shareclean]
    redact_email = true
    redact_private_ip = false
    redact_phone = false
    redact_mac_address = false
    redaction_label = "[REDACTED]"
    profile = "default"
    custom_patterns = [
      { name = "Employee ID", pattern = "EMP-[0-9]{6}" },
      { name = "Tenant", pattern = "tenant=(?P<value>[a-z0-9-]+)" },
    ]

    [tool.shareclean.profiles.ci]
    redact_email = true
    redact_private_ip = true
    fail_on = ["severity:high"]

For .shareclean.toml, omit the tool.shareclean prefix:

    redact_email = true
    redact_private_ip = false
    redact_phone = false
    redact_mac_address = false
    custom_patterns = [
      { name = "Employee ID", pattern = "EMP-[0-9]{6}" },
    ]

    [profiles.ci]
    redact_private_ip = true
    fail_on = ["severity:high"]

Config location:

- --config PATH
- Nearest project directory containing .shareclean.toml or a pyproject.toml with [tool.shareclean]
- Defaults
Auto-discovery walks upward from the current directory until the Git root or filesystem root. It uses only the nearest config directory and never merges parent configs. If .shareclean.toml and ShareClean config in pyproject.toml exist in the same selected directory, ShareClean exits 2.
Config precedence:
- CLI flags
- Environment variables
- Selected profile values
- Base project config
- Defaults
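
One way to picture this precedence order is a layered lookup in which the first layer that defines a key wins. The sketch below uses `collections.ChainMap` for that; the function name and the sample values are hypothetical, and each layer is assumed to contain only the options that were explicitly set at that level.

```python
from collections import ChainMap


def resolve_settings(cli, env, profile, base, defaults) -> dict:
    """Merge config layers; earlier arguments take precedence over later ones."""
    return dict(ChainMap(cli, env, profile, base, defaults))


# Illustrative values only; real defaults come from ShareClean itself.
defaults = {"redact_private_ip": False, "redaction_label": "[REDACTED]"}
base = {"redact_private_ip": False}
ci_profile = {"redact_private_ip": True}

settings = resolve_settings({}, {}, ci_profile, base, defaults)
print(settings["redact_private_ip"])  # the profile overrides the base config
```

A CLI flag placed in the first layer would override the profile value in the same way.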
Environment variables:
- SHARECLEAN_REDACT_EMAIL
- SHARECLEAN_REDACT_PRIVATE_IP
- SHARECLEAN_REDACT_PHONE
- SHARECLEAN_REDACT_MAC_ADDRESS
- SHARECLEAN_REDACTION_LABEL
- SHARECLEAN_PROFILE
- SHARECLEAN_FAIL_ON
- SHARECLEAN_IGNORE_FOR_CHECK
Boolean environment values accept true, 1, yes, on, false, 0, no, and off.
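
A parser for these boolean values could look like the sketch below. The function name is hypothetical, and trimming whitespace and ignoring case are assumptions; the source only lists the accepted spellings.

```python
TRUE_VALUES = {"true", "1", "yes", "on"}
FALSE_VALUES = {"false", "0", "no", "off"}


def parse_bool_env(raw: str) -> bool:
    """Map an accepted boolean spelling to True or False."""
    value = raw.strip().lower()  # assumption: case-insensitive, whitespace ignored
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a recognized boolean value: {raw!r}")
```

Rejecting anything outside the two sets, rather than treating unknown text as false, keeps a typo such as `ture` from silently disabling a redaction rule.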
Inspect effective configuration without reading input:

    shareclean config show

| Rule ID | Detector | Category | Severity |
|---|---|---|---|
| SC001 | Key-value secret | credential | high |
| SC002 | Bearer token | token | high |
| SC003 | JWT-like token | token | high |
| SC004 | Connection-string password | connection_string | critical |
| SC005 | Email address | pii_email | medium |
| SC006 | Local user path | pii_path | medium |
| SC007 | Private IP address | internal_network | medium |
| SC008 | PEM private-key block | private_key | critical |
| SC009 | Known provider API token | token | high |
| SC010 | Webhook URL | token | critical |
| SC011 | URL query secret | credential | high |
| SC012 | Cookie secret | token | high |
| SC013 | CLI secret argument | credential | high |
| SC014 | XML secret element | credential | high |
| SC015 | SAML assertion | token | critical |
| SC016 | Docker or Kubernetes secret | credential | high |
| SC017 | Phone number | pii_phone | medium |
| SC018 | MAC address | pii_hardware | low |
| SC019 | Generic sensitive key-value | credential | high |
Private IP detection is off by default; enable it with --redact-private-ip, --private-ip, or config. When enabled, it covers private IPv4 and IPv6 addresses.
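
For intuition about what counts as a private address, Python's standard `ipaddress` module offers a comparable classification. This is only an approximation for illustration: the exact ranges ShareClean matches are not specified here, and `is_private` also returns true for loopback and link-local addresses.

```python
import ipaddress


def looks_private(candidate: str) -> bool:
    """True if candidate parses as an IPv4/IPv6 address in a private range."""
    try:
        address = ipaddress.ip_address(candidate)
    except ValueError:
        return False  # not an IP address at all
    return address.is_private


print(looks_private("10.0.0.1"))   # RFC 1918 IPv4 range
print(looks_private("fd00::1"))    # IPv6 unique local address
print(looks_private("8.8.8.8"))    # public address
```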
Provider-aware token detection covers high-confidence shapes such as OpenAI, Anthropic, GitHub, GitLab, Hugging Face, Stripe, Slack, Telegram, SendGrid, AWS access keys, Google API keys, npm, PyPI, Docker Hub, Netlify, DigitalOcean, and Terraform Cloud tokens. Structured key-value detection preserves JSON, YAML, TOML, INI, and environment-file formatting where possible. A generic sensitive-key fallback redacts assigned values when keys contain secret-bearing segments such as password, token, api_key, or client_secret.
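
The generic sensitive-key fallback can be imagined along the lines of the sketch below. It is a simplified stand-in, not the SC019 detector: the regex, the segment list, and the function name are assumptions, keys are matched by substring after normalizing hyphens to underscores, and values are assumed to end at the first whitespace.

```python
import re

# Segments named in the text above; the real detector may know more.
SENSITIVE_SEGMENTS = ("password", "token", "api_key", "client_secret")

# key, separator (= or :), then a value without whitespace (simplification).
KEY_VALUE = re.compile(r"(?P<key>[A-Za-z0-9_.-]+)(?P<sep>\s*[=:]\s*)(?P<value>\S+)")


def redact_sensitive_assignments(line: str, label: str = "[REDACTED]") -> str:
    """Replace only the value when the key contains a sensitive segment."""

    def replace(match: re.Match) -> str:
        key = match.group("key").lower().replace("-", "_")
        if any(segment in key for segment in SENSITIVE_SEGMENTS):
            return f"{match.group('key')}{match.group('sep')}{label}"
        return match.group(0)  # leave non-sensitive assignments untouched

    return KEY_VALUE.sub(replace, line)
```

Keeping the key and separator and replacing only the value is what lets the surrounding JSON, YAML, or env-file structure stay readable after redaction.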
Phone and MAC address detection are off by default to avoid noisy matches; enable them with --redact-phone, --phone, --redact-mac-address, or config.
Custom regex rules are reported as CUSTOM001, CUSTOM002, and so on. If a custom regex defines a named group called value, only that group is redacted; otherwise the whole match is replaced.
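
The named-group behavior can be illustrated with plain `re`. The helper below is a hypothetical sketch rather than ShareClean's redactor: if the pattern defines a group called `value`, only that span is replaced; otherwise the whole match is.

```python
import re


def redact_custom(text: str, pattern: str, label: str = "[REDACTED]") -> str:
    """Redact the 'value' group if the pattern defines one, else the whole match."""
    regex = re.compile(pattern)

    def replace(match: re.Match) -> str:
        if "value" not in regex.groupindex:
            return label
        start, end = match.span("value")
        if start == -1:
            return match.group(0)  # the group did not participate in the match
        offset = match.start()
        whole = match.group(0)
        return whole[: start - offset] + label + whole[end - offset :]

    return regex.sub(replace, text)


print(redact_custom("tenant=acme-prod ok", r"tenant=(?P<value>[a-z0-9-]+)"))
print(redact_custom("id EMP-123456", r"EMP-[0-9]{6}"))
```

The first call keeps the `tenant=` prefix and redacts only the tenant name; the second has no `value` group, so the entire employee ID is replaced.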
Programmatic use:

    from shareclean import add_custom_regex
    from shareclean.detectors import get_rules
    from shareclean.redactor import sanitize

    # Extend the built-in rules with a custom regex; because the pattern
    # defines a named group called "value", only that group is redacted.
    rules = add_custom_regex(get_rules(), r"employee=(?P<value>EMP-[0-9]{6})")
    result = sanitize("employee=EMP-123456", rules)

When detectors overlap on the same text range, ShareClean emits one finding using the highest-severity rule. If severities match, it uses the most specific detector.
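
The overlap rule can be expressed as a sort key. In the sketch below, the severity ordering (low < medium < high < critical) and the numeric `specificity` field are assumptions made for illustration; the source does not say how ShareClean measures which detector is "most specific".

```python
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def pick_winner(candidates: list[dict]) -> dict:
    """Choose one finding for a text range: highest severity, then specificity."""
    return max(
        candidates,
        key=lambda c: (SEVERITY_RANK[c["severity"]], c["specificity"]),
    )


overlapping = [
    # "specificity" is a hypothetical score; higher means more specific.
    {"rule_id": "SC019", "severity": "high", "specificity": 1},
    {"rule_id": "SC001", "severity": "high", "specificity": 2},
]
print(pick_winner(overlapping)["rule_id"])  # SC001 wins the tie on specificity
```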
JSON reports use schema version 1.0 and do not include filenames, paths, matched values, hashes, source snippets, or masked previews.

    {
      "schema_version": "1.0",
      "source": "file",
      "summary": {
        "findings": 1,
        "by_category": {
          "credential": 1
        },
        "by_severity": {
          "high": 1
        }
      },
      "findings": [
        {
          "rule_id": "SC001",
          "category": "credential",
          "severity": "high",
          "location": {
            "start": {
              "line": 1,
              "column": 10
            },
            "end": {
              "line": 1,
              "column": 27
            }
          },
          "replacement": "[REDACTED]"
        }
      ]
    }

Locations are 1-based. End positions are exclusive. Columns count Unicode code points after treating CRLF as one LF newline for location purposes.
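
To make the location convention concrete, here is a small sketch that converts a character offset into a 1-based line and column. The function name is hypothetical and this is not ShareClean's code; it relies on Python strings being indexed by Unicode code points and assumes the offset does not fall between the CR and LF of a CRLF pair.

```python
def offset_to_location(text: str, offset: int) -> tuple[int, int]:
    """Return the 1-based (line, column) for a code-point offset into text."""
    # Treat CRLF as a single LF so it counts as one newline.
    prefix = text[:offset].replace("\r\n", "\n")
    line = prefix.count("\n") + 1
    # Column counts code points since the last newline, starting at 1.
    column = len(prefix) - (prefix.rfind("\n") + 1) + 1
    return line, column


sample = "password=" + "x" * 17
start = offset_to_location(sample, 9)    # first character of the value
end = offset_to_location(sample, 26)     # one past the last character (exclusive)
print(start, end)
```

With a 17-character value after `password=`, the start is column 10 and the exclusive end is column 27, the same shape as the location in the sample report.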

    usage: shareclean [-h] [--version] [--check] [--output FILE] [--report]
                      [--report-format {text,json}] [--config FILE]
                      [--profile NAME] [--redact-email] [--no-redact-email]
                      [--redact-private-ip] [--private-ip]
                      [--no-redact-private-ip] [--no-private-ip]
                      [--redact-phone] [--phone] [--no-redact-phone]
                      [--no-phone]
                      [--redact-mac-address] [--no-redact-mac-address]
                      [--redaction-label TEXT] [--fail-on SELECTORS]
                      [--ignore-for-check SELECTORS]
                      [--custom-pattern REGEX]
                      [FILE]

--no-email remains as a deprecated alias for --no-redact-email.
Exit codes:

| Code | Meaning |
|---|---|
| 0 | Completed successfully |
| 1 | Selected findings detected in --check mode |
| 2 | User, I/O, config, or selector error |
| 3 | Unexpected internal error |
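
A script that gates sharing on these exit codes might look like the sketch below. It assumes `shareclean` is installed and on `PATH`, and it uses only the `--check` flag shown earlier; the helper names are hypothetical.

```python
import subprocess

EXIT_MEANINGS = {
    0: "Completed successfully",
    1: "Selected findings detected in --check mode",
    2: "User, I/O, config, or selector error",
    3: "Unexpected internal error",
}


def describe_exit(code: int) -> str:
    """Translate a ShareClean exit code into the meaning listed above."""
    return EXIT_MEANINGS.get(code, f"Unrecognized exit code {code}")


def is_clean(path: str) -> bool:
    """Run a check on path; True means no selected findings were detected."""
    completed = subprocess.run(
        ["shareclean", path, "--check"],
        capture_output=True,
        text=True,
    )
    if completed.returncode == 0:
        return True
    if completed.returncode == 1:
        return False
    # Codes 2 and 3 signal a problem with the run, not a finding.
    raise RuntimeError(describe_exit(completed.returncode))
```

Separating code 1 from codes 2 and 3 keeps a misconfigured selector or unreadable file from being mistaken for a clean result.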
ShareClean is intentionally local and transparent:
- No network calls
- No cloud processing
- No telemetry
- No account or API key required
- Original matched secret values are not stored in findings or reports
- Input files are never modified in place
ShareClean is pattern-based. It can miss unusual formats and can redact benign text that resembles a secret. It is not a replacement for repository secret scanners, source-history scanning, or DLP systems.
The test corpus under tests/fixtures/ uses only fake values and is split into generic, cloud, database, CI/CD, SaaS, log, YAML/JSON/env, provider-token, cookie/URL/CLI, and false-positive packs. Bug reports that change detection should add a regression fixture using clearly fake data.
Run the test suite:

    python -m unittest discover -s tests -v

Run packaging checks:

    python -m compileall -q src tests
    python -m build
    python -m twine check dist/*

ShareClean is released under the MIT License.
