Anonymize. Protect. Restore.
Flexible and reversible anonymization for modern Python workflows.
Install Camouflage with pip:
pip install camouflage

Camouflage lets you easily anonymize sensitive data, store reversible mappings, and restore the original dataset when needed, all while being fast, lightweight, and fully customizable.
- Anonymize large datasets quickly.
- Add your own anonymizers easily (your data, your rules).
- Reversible by design: restore original values without headaches.
- 100% test coverage for maximum trust.
- Tested on datasets with over 100,000 rows across 6 columns; handles big data smoothly.
Camouflage uses a one-to-one mapping to anonymize data. It generates a unique, consistent, and reversible mapping for each value. See Bijection on Wikipedia.
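To make the idea of a one-to-one, reversible mapping concrete, here is a minimal sketch in plain Python. It is not Camouflage's implementation: the `ReversibleMapping` class and the `random_ipv4` generator are hypothetical names introduced only for illustration, and the retry-on-collision strategy is an assumption about how uniqueness could be enforced.

```python
import random


class ReversibleMapping:
    """Toy one-to-one mapping (hypothetical; not Camouflage's internals)."""

    def __init__(self, generate):
        self._generate = generate  # callable producing a candidate replacement
        self._forward = {}         # original -> anonymized
        self._backward = {}        # anonymized -> original

    def anonymize(self, original):
        # Consistent: the same original always yields the same replacement.
        if original in self._forward:
            return self._forward[original]
        candidate = self._generate(original)
        # Unique: retry until the candidate is not already in use.
        # (Assumes the generator's output space is larger than the input set.)
        while candidate in self._backward:
            candidate = self._generate(original)
        self._forward[original] = candidate
        self._backward[candidate] = original
        return candidate

    def deanonymize(self, anonymized):
        # Reversible: look the original up in the inverse table.
        return self._backward[anonymized]


def random_ipv4(_original):
    """Hypothetical generator: a random dotted-quad string."""
    return ".".join(str(random.randint(0, 255)) for _ in range(4))


mapping = ReversibleMapping(random_ipv4)
masked = mapping.anonymize("192.168.1.1")
assert mapping.anonymize("192.168.1.1") == masked    # consistent
assert mapping.deanonymize(masked) == "192.168.1.1"  # reversible
```

Keeping both a forward and a backward dictionary is what makes the mapping a bijection over the values seen so far: each original has exactly one replacement, and each replacement points back to exactly one original.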
Camouflage guarantees that every anonymized value is unique, consistent, and traceable back to its original, but only when you need it. There are predefined anonymizers for common data types (facets), and you can easily add your own.
| facet | description | example |
|---|---|---|
| age | An int representing the age of a person. | 25 |
| amount | A float representing a monetary amount. | 100.50 |
| country | A str representing a country name. | Germany |
| datetime | A datetime.datetime object. | 2023-02-01 00:00:00 |
| ipv4 | A str representing an IPv4 address. | 213.209.12.210 |
| user_agent | A str representing a user agent. | Mozilla/5.0 (Windows) |
| ... | Coming soon... | ... |
from camouflage import anonymize
original_value = "192.168.1.1"
anonymized_value = anonymize("ipv4", original_value)

from camouflage import anonymize, deanonymize, Transform
original_value = "192.168.1.1"
transform = Transform()
# Anonymize
anonymized_value = anonymize("ipv4", original_value, transform)
# Do something with the anonymized value
# ...
# De-anonymize
deanonymized_value = deanonymize("ipv4", anonymized_value, transform)

import pandas as pd
from camouflage import PandasAdapter

df = pd.DataFrame({
    "ip": ["192.168.1.1", "10.0.0.1"],
    "joined_at": [pd.Timestamp("2023-01-01"), pd.Timestamp("2023-02-01")],
    "revenue": [1234.56, 7890.12],
})
# | ip          | joined_at           |   revenue |
# |:------------|:--------------------|----------:|
# | 192.168.1.1 | 2023-01-01 00:00:00 |   1234.56 |
# | 10.0.0.1    | 2023-02-01 00:00:00 |   7890.12 |

# Map each column name to the facet used to anonymize it
mapper = {
    "ip": "ipv4",
    "joined_at": "datetime",
    "revenue": "amount",
}
pd_adapter = PandasAdapter(mapper)
df_safe = pd_adapter.anonymize(df)
# | ip             | joined_at           |   revenue |
# |:---------------|:--------------------|----------:|
# | 137.224.91.30  | 2024-12-05 00:00:00 |   1279.97 |
# | 213.209.12.210 | 2023-06-27 00:00:00 |   5506.58 |
# Do something with the anonymized DataFrame
# ...
# When you want to restore:
original_df = pd_adapter.deanonymize(df_safe)
# | ip          | joined_at           |   revenue |
# |:------------|:--------------------|----------:|
# | 192.168.1.1 | 2023-01-01 00:00:00 |   1234.56 |
# | 10.0.0.1    | 2023-02-01 00:00:00 |   7890.12 |

Want to anonymize new types of data? Super easy:
import random

def anonymize_color(_):  # It is crucial for the anonymizer to accept a single argument.
    return random.choice(['red', 'green', 'blue'])

def anonymize_red_channel(original_hex):
    # Keep the green and blue channels; replace only the red channel.
    hex_color = original_hex.lstrip('#')
    green = hex_color[2:4]
    blue = hex_color[4:6]
    random_red = random.randint(0, 255)
    return "#{:02X}{}{}".format(random_red, green, blue)

from camouflage import register_anonymizer
register_anonymizer('color', anonymize_color)
register_anonymizer('red_channel', anonymize_red_channel)

from camouflage import anonymize
original_value = "cyan"
anonymized_value = anonymize("color", original_value)
original_hex = "#00FF00"
anonymized_hex = anonymize("red_channel", original_hex)

import pandas as pd
from camouflage import PandasAdapter

df = pd.DataFrame({
    "color": ["cyan", "magenta", "yellow"],
    "hex": ["#FF0000", "#00FF00", "#0000FF"],
})
# | color   | hex     |
# |:--------|:--------|
# | cyan    | #FF0000 |
# | magenta | #00FF00 |
# | yellow  | #0000FF |

mapper = {
    "color": "color",
    "hex": "red_channel",
}
pd_adapter = PandasAdapter(mapper)
df_safe = pd_adapter.anonymize(df)
# | color   | hex     |
# |:--------|:--------|
# | green   | #B90000 |
# | blue    | #96FF00 |
# | red     | #FD00FF |

That's it. You can now use "color" and "red_channel" both for one-time anonymization and in adapters!
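Before registering a custom anonymizer, it can help to sanity-check it on a few sample values. The sketch below repeats `anonymize_red_channel` from above so that it runs on its own, without Camouflage installed; the `HEX_COLOR` pattern and the `check_red_channel_anonymizer` helper are hypothetical names added for illustration and are not part of the library.

```python
import random
import re

# Hypothetical pattern for a "#RRGGBB" color string.
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def anonymize_red_channel(original_hex):
    """Replace the red channel with a random value; keep green and blue."""
    hex_color = original_hex.lstrip("#")
    green = hex_color[2:4]
    blue = hex_color[4:6]
    random_red = random.randint(0, 255)
    return "#{:02X}{}{}".format(random_red, green, blue)


def check_red_channel_anonymizer(samples):
    """Hypothetical helper: verify output format and untouched channels."""
    for sample in samples:
        result = anonymize_red_channel(sample)
        # The output must still be a valid "#RRGGBB" string.
        assert HEX_COLOR.match(result), result
        # Characters after "#RR" (green and blue) must be unchanged.
        assert result[3:] == sample[3:], (sample, result)


check_red_channel_anonymizer(["#FF0000", "#00FF00", "#0000FF"])
```

A check like this catches format mistakes early, for example an anonymizer that drops the leading `#` or pads the red channel to the wrong width.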
- 100% code coverage (Pytest + Coverage)
- PEP8 compliant, linted
- Fast anonymization for datasets of 100,000+ rows
- Extensible facet system
- Tested and battle-ready
Run tests on your setup with:
pip install pytest
pytest

MIT License: do whatever you want, but be cool.
Camouflage is built to empower privacy-first applications without slowing you down.
Source Code: https://github.com/data-minder/camouflage
PyPI: https://pypi.org/project/camouflage/