# Privacy Scrubbing with Loclean

This notebook demonstrates how to scrub personally identifiable information (PII) from text and DataFrames locally using Loclean.

> **📚 Full Documentation:** [Privacy Scrubbing Guide](https://nxank4.github.io/loclean/guides/privacy/)

In [None]:
import loclean
import polars as pl

## Basic Usage

### Scrub Text

In [None]:
# Text with PII
text = "Contact John Doe at john@example.com or call 555-1234"

# Scrub all PII (default: mask mode)
cleaned = loclean.scrub(text)
print(f"Original: {text}")
print(f"Cleaned:  {cleaned}")

### Scrub DataFrame

In [None]:
df = pl.DataFrame({
    "text": [
        "Contact John Doe at john@example.com",
        "Call Mary Smith at 555-9876",
        "Email: admin@company.com"
    ]
})

print("Original DataFrame:")
print(df)

# Scrub PII in DataFrame column
result = loclean.scrub(df, target_col="text")

print("\nCleaned DataFrame:")
print(result["text"])

## Scrubbing Modes

### Mask Mode (Default)

Mask mode replaces each detected PII span with a placeholder such as `[REDACTED]`:

In [None]:
text = "John Doe: john@example.com"
cleaned = loclean.scrub(text, mode="mask")
print(f"Original: {text}")
print(f"Masked:   {cleaned}")
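To make the idea of masking concrete, here is a minimal standard-library sketch of placeholder substitution. It is not how Loclean works internally: the regular expressions, the `mask_pii` helper, and the `[EMAIL]`/`[PHONE]` placeholder names are all assumptions made for illustration, and Loclean's actual placeholders may differ. Note that simple patterns like these cannot find names such as "John Doe".

```python
import re

# Illustrative patterns only; these are NOT Loclean's detection rules.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\b\d{3}-\d{4}\b")


def mask_pii(text: str) -> str:
    """Replace each matched span with a type-labelled placeholder (hypothetical helper)."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


masked = mask_pii("John Doe: john@example.com, 555-1234")
print(masked)
```

The name is left untouched here because it has no fixed textual shape to match against.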

## Selective Scrubbing

Pass `strategies` to scrub only specific PII types:

In [None]:
# Only scrub emails and phone numbers
text = "John Doe: john@example.com, 555-1234"
cleaned = loclean.scrub(
    text,
    strategies=["email", "phone"]
)
print(f"Original: {text}")
print(f"Cleaned:  {cleaned}")
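The selection logic can be pictured as a lookup from strategy name to detector, where only the requested detectors run. The sketch below assumes a hypothetical `PATTERNS` table and `scrub_selected` helper built on regular expressions; it mirrors the shape of the `strategies` argument but is not Loclean's implementation.

```python
import re

# Hypothetical strategy table; the names mirror the example above,
# but the patterns are illustrative assumptions.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\b\d{3}-\d{4}\b"),
}


def scrub_selected(text: str, strategies: list[str]) -> str:
    """Apply only the requested detectors, leaving other PII types as-is."""
    for name in strategies:
        pattern = PATTERNS[name]  # raises KeyError for an unknown strategy
        text = pattern.sub(f"[{name.upper()}]", text)
    return text


sample = "John Doe: john@example.com, 555-1234"
print(scrub_selected(sample, ["email"]))
print(scrub_selected(sample, ["email", "phone"]))
```

Running only the `"email"` detector leaves the phone number in place, which is the behavior selective scrubbing is meant to provide.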