Curated profanity lists with provenance, classification, and simple matching helpers, merged from 41 public wordlists across 15 languages.
```
pip install expletives
```
```python
>>> import expletives
>>> expletives.is_expletive("fuck")
True
>>> expletives.is_expletive("asian")  # curated false positive
False
>>> expletives.contains_expletive("hello world")
False
>>> expletives.find_expletives("this shit is fucked")
[Match(word='shit', start=5, end=9), Match(word='fucked', start=13, end=19)]
>>> expletives.censor("oh shit, fuck!")
'oh ****, ****!'
>>> expletives.censor("oh shit", mask="[BEEP]")
'oh [BEEP]'
```

Rich metadata for every entry:
```python
>>> expletives.catalog["fuck"]["sources"][:3]
['2600-googleblacklist', 'bannedwordlist', 'biglou-bad-words']
>>> expletives.catalog["fuck"]["classification"]
'term'
>>> expletives.explain_okayish("asian")
'nationality / ethnicity / geographic descriptor'
```

Filter by source or language:
```python
>>> expletives.load(sources=["seven-dirty-words"])
{'cocksucker', 'cunt', 'fuck', 'motherfucker', 'piss', 'shit', 'tits'}
>>> len(expletives.load(language="ja"))
180
```

| name | type | description |
|---|---|---|
| `badwords` | `set[str]` | merged words, wildcards excluded |
| `patterns` | `set[str]` | wildcard entries (`*fuck*`, …) |
| `okayish` | `list[str]` | curated false-positive allow-list |
| `catalog` | `dict[str, dict]` | full entry metadata (see SOURCES.md) |
| `sources` | `dict[str, dict]` | per-source metadata |
| `classifications` | `tuple[str, ...]` | valid classification ids |
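Because `patterns` holds shell-style wildcards rather than plain words, it can't be checked with a set lookup the way `badwords` can. A minimal sketch of how such entries could be matched (the helper names here are illustrative, not the library's internals):

```python
import fnmatch
import re

def compile_patterns(patterns):
    # fnmatch.translate converts shell-style wildcards ("*fuck*")
    # into regex source; compile once and reuse.
    return [re.compile(fnmatch.translate(p), re.IGNORECASE) for p in patterns]

def matches_any(word, badwords, compiled):
    # Exact entries are a set lookup; wildcard entries fall back to regex.
    w = word.lower()
    return w in badwords or any(rx.match(w) for rx in compiled)

compiled = compile_patterns({"*fuck*"})
matches_any("motherfucker", {"shit"}, compiled)  # True: wildcard hit
matches_any("hello", {"shit"}, compiled)         # False
```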
| function | returns |
|---|---|
| `is_expletive(word, allow_okayish=True)` | `bool` |
| `contains_expletive(text)` | `bool` |
| `find_expletives(text)` | `list[Match]` |
| `censor(text, mask="*")` | `str` |
| `load(sources=None, language=None, ...)` | `set[str]` |
| `describe(word)` / `classify(word)` | `str \| None` |
| `explain_okayish(word)` | `str \| None` |
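To make the semantics concrete, here is a rough sketch of how the `allow_okayish` check and the masking behavior shown earlier could fit together; `BADWORDS` and `OKAYISH` are tiny stand-in sets, not the bundled data, and this is not the package's actual implementation:

```python
import re

BADWORDS = {"shit", "fuck"}  # stand-in for the merged wordlist
OKAYISH = {"asian"}          # stand-in for the curated allow-list

def is_expletive(word, allow_okayish=True):
    w = word.lower()
    if allow_okayish and w in OKAYISH:
        return False  # curated false positives pass through
    return w in BADWORDS

def censor(text, mask="*"):
    # \w* lets stems match inflected forms like "fucked".
    rx = re.compile(
        r"\b(" + "|".join(map(re.escape, BADWORDS)) + r")\w*",
        re.IGNORECASE,
    )
    def repl(m):
        # "*" masks are repeated to the match length; other masks
        # (e.g. "[BEEP]") replace the whole match verbatim.
        return mask * len(m.group(0)) if mask == "*" else mask
    return rx.sub(repl, text)
```

For example, `censor("oh shit, fuck!")` yields `'oh ****, ****!'` and `is_expletive("asian")` returns `False`.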
See SOURCES.md for every bundled wordlist, its origin, and sources we evaluated but didn't include.
Apache 2.0. The bundled source files retain their upstream licenses as noted in each file's header.