<a target="_parent" href="https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/docs/notebooks/safe-synthetics/gdpr-transform-synthesize.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# 🔐 Using Safe Synthetics to support GDPR compliance

This notebook uses purpose-built Safe Synthetics configurations to support GDPR compliance. You can run it with the sample dataset or substitute your own.

After you specify a dataset, the notebook holds out 5% of the records for calculating quality and privacy metrics at the end. It then redacts true identifiers in your dataset, such as names and addresses, and synthesizes the data to obfuscate quasi-identifiers. Finally, it generates a report so you can measure the quality and privacy of the synthetic data.
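To make the 5% holdout concrete, here is a minimal sketch of the idea using pandas on a toy DataFrame. It is illustrative only: the column names and values are hypothetical, and it does not claim to reproduce how Safe Synthetics performs the split internally.

```python
import pandas as pd

# Toy data standing in for a real dataset (hypothetical columns and values).
toy_df = pd.DataFrame(
    {"customer_id": range(100), "spend": [i * 1.5 for i in range(100)]}
)

# Randomly reserve 5% of the rows as the holdout set.
holdout_df = toy_df.sample(frac=0.05, random_state=42)

# Everything not in the holdout remains available for training.
train_df = toy_df.drop(holdout_df.index)

print(len(train_df), len(holdout_df))  # 95 5
```

Because the held-out rows are never seen during training, they give a fair baseline for comparing synthetic records against real ones.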

## 💾 Install Gretel SDK

In [None]:
%%capture

%pip install -U gretel-client

## 🌐 Configure your Gretel Session

In [None]:
from gretel_client.navigator_client import Gretel

gretel = Gretel(api_key="prompt")

## 🔬 Preview input data

In [None]:
import pandas as pd
ds = "https://gretel-datasets.s3.us-west-2.amazonaws.com/ecommerce_customers.csv"
df = pd.read_csv(ds)

print(f"Number of rows: {len(df)}")
df.head()
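While previewing the data, it can help to get a rough sense of which columns might act as true identifiers. The sketch below is a simple heuristic on a toy DataFrame, not part of Safe Synthetics: it assumes that a column whose values are all distinct is a candidate direct identifier, while columns with repeated values (such as a ZIP code or age band) are more likely quasi-identifiers. The column names and values are hypothetical.

```python
import pandas as pd

# Toy data (hypothetical); replace with your own DataFrame to experiment.
toy_df = pd.DataFrame(
    {
        "email": ["a@example.com", "b@example.com", "c@example.com", "d@example.com"],
        "zip_code": ["94107", "94107", "10001", "10001"],
        "age_band": ["30-39", "30-39", "40-49", "30-39"],
    }
)

# Fraction of distinct values per column; 1.0 means every row is unique.
uniqueness = toy_df.nunique() / len(toy_df)

# Heuristic assumption: fully unique columns are candidate direct identifiers.
candidate_identifiers = uniqueness[uniqueness == 1.0].index.tolist()

print(uniqueness)
print(candidate_identifiers)
```

A heuristic like this only flags columns for review; free-text fields and combinations of quasi-identifiers can still single out individuals.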

## 🏃 Run Safe Synthetics

In [None]:
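synthetic_dataset = (
    gretel.safe_synthetic_dataset
    .from_data_source(ds)
    .transform()  # redact true identifiers such as names and addresses
    # Synthesize 1,000 records to obfuscate quasi-identifiers.
    .synthesize("tabular_ft", {"train": {"params": {"num_input_records_to_sample": 5000}}}, num_records=1000)
    .create()
)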

In [None]:
synthetic_dataset.wait_until_done()

## 🔬 Preview output data

In [None]:
synthetic_dataset.dataset.df.head()

## 📊 Evaluate quality & privacy results

In [None]:
synthetic_dataset.report.table

In [None]:
synthetic_dataset.report.display_in_notebook()