# Data Suppression in PostgreSQL
This notebook demonstrates how to implement **data suppression** techniques in **PostgreSQL** to protect sensitive financial information while maintaining usability.

Suppression removes or masks values that meet predefined conditions, which strengthens **data privacy** and supports **compliance with regulations** such as GDPR and HIPAA.


## 1️⃣ Setting Up the Database
First, we need to establish a connection to PostgreSQL and ensure that our database is created.

In [None]:
from sqlalchemy import create_engine, text
import psycopg2  # PostgreSQL driver used by SQLAlchemy

# PostgreSQL connection settings
PG_ADDR = 'localhost'  # Server address
PG_PORT = '5432'       # PostgreSQL port
PG_USER = 'postgres'   # Admin user
PG_PASW = 'secure_password'  # Placeholder; replace with your own password
PG_DBNA = 'suppression_db'  # Database name

# Connect to the default PostgreSQL database to check and create if needed.
# CREATE DATABASE cannot run inside a transaction block, so the engine uses AUTOCOMMIT.
default_engine = create_engine(
    f'postgresql://{PG_USER}:{PG_PASW}@{PG_ADDR}:{PG_PORT}/postgres',
    isolation_level='AUTOCOMMIT',
)

with default_engine.connect() as conn:
    result = conn.execute(text("SELECT 1 FROM pg_database WHERE datname = :name;"), {"name": PG_DBNA})
    exists = result.scalar()

    if not exists:
        # Identifiers cannot be bound as parameters; PG_DBNA is a trusted constant here
        conn.execute(text(f"CREATE DATABASE {PG_DBNA};"))
        print(f"Database '{PG_DBNA}' created successfully!")

# Now connect to the newly created database
engine = create_engine(f'postgresql://{PG_USER}:{PG_PASW}@{PG_ADDR}:{PG_PORT}/{PG_DBNA}')
print(f"Connected to database '{PG_DBNA}'.")
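
The f-string URLs above assume the password contains no characters that are special in a URL, such as `@` or `/`. As a minimal sketch, the credentials can be percent-encoded with the standard library before the URL is passed to `create_engine`. The `build_pg_url` helper and the `PG_PASSWORD` environment variable are hypothetical names introduced for illustration; they are not part of the original notebook.

```python
import os
from urllib.parse import quote_plus

def build_pg_url(user, password, host, port, database):
    """Build a PostgreSQL URL with percent-encoded credentials (hypothetical helper)."""
    return f"postgresql://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{database}"

# Assumption: the password is supplied via a PG_PASSWORD environment variable,
# falling back to the notebook's placeholder value when it is not set.
password = os.environ.get("PG_PASSWORD", "secure_password")
url = build_pg_url("postgres", password, "localhost", "5432", "suppression_db")
```

With this helper, a password such as `p@ss/word` is encoded as `p%40ss%2Fword`, so the `@` that separates credentials from the host stays unambiguous.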

## 2️⃣ Creating and Populating the Patients Table
Now, we will create a **patients** table containing personal, medical, and insurance cost data, and insert some sample records.

In [None]:
from sqlalchemy import text

# engine.begin() opens a transaction and commits it when the block exits
with engine.begin() as conn:
    conn.execute(text('DROP TABLE IF EXISTS patients;'))
    conn.execute(text('''
        CREATE TABLE patients (
            id SERIAL PRIMARY KEY,
            full_name VARCHAR(100),
            birth_date DATE,
            country VARCHAR(50),
            medical_condition VARCHAR(100),
            insurance_cost DECIMAL(10,2)
        );
    '''))
print("Table 'patients' created successfully!")

In [None]:
from sqlalchemy import text

# Sample records, in the column order listed in `columns`
columns = ('full_name', 'birth_date', 'country', 'medical_condition', 'insurance_cost')
patients = [
    ('Samantha Ford', '1970-01-01', 'Taiwan', 'Heart Condition', 337.38),
    ('Rebecca Parker', '1944-07-25', 'Rwanda', 'Lung Disease', 270.66),
    ('Donald Phillips', '1946-03-01', 'Jamaica', 'Diabetes', 408.03),
    ('Natalie Long', '1983-12-27', 'Sao Tome and Principe', 'Kidney Failure', 978.66),
    ('Stephanie Williams', '1945-07-25', 'Kuwait', 'Heart Condition', 694.81),
    ('Dr. George Young', '1957-09-18', 'Ireland', 'Kidney Failure', 822.35),
    ('John Baker', '1971-10-22', 'New Zealand', 'Diabetes', 782.03),
    ('Amanda Johnson', '1958-06-13', 'Sri Lanka', 'Kidney Failure', 385.76),
    ('Shaun Roberts', '2004-04-28', 'Saint Kitts and Nevis', 'Lung Disease', 466.35),
    ('Mrs. Audrey Wolfe', '1945-05-26', 'Tajikistan', 'Heart Condition', 799.51),
]
insert_stmt = text(
    "INSERT INTO patients (full_name, birth_date, country, medical_condition, insurance_cost) "
    "VALUES (:full_name, :birth_date, :country, :medical_condition, :insurance_cost);"
)

# Passing a list of dicts runs the parameterized statement once per row in one transaction
with engine.begin() as conn:
    conn.execute(insert_stmt, [dict(zip(columns, row)) for row in patients])
print("Sample patient data inserted successfully!")

## 3️⃣ Implementing Data Suppression
We will suppress **insurance cost values** above a threshold of 300 by overwriting them with the sentinel value `-1`, so that high costs are no longer visible in the table.

In [None]:
from sqlalchemy import text

def suppress_insurance_cost(cost):
    """Python equivalent of the SQL rule below (for reference; the UPDATE does the work)."""
    return -1 if cost > 300 else cost

# Apply suppression to the 'insurance_cost' column.
# This overwrites the stored values, so the original costs cannot be recovered afterwards.
with engine.begin() as conn:
    conn.execute(text("UPDATE patients SET insurance_cost = -1 WHERE insurance_cost > 300;"))
print("Suppression applied successfully!")
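
The threshold rule can be checked against the sample costs without touching the database. The sketch below applies the same comparison as the `UPDATE` statement to the ten `insurance_cost` values inserted earlier, using `Decimal` to mirror the `DECIMAL(10,2)` column. The constant names `THRESHOLD` and `SENTINEL` are introduced here for illustration.

```python
from decimal import Decimal

THRESHOLD = Decimal("300")   # same cutoff as the UPDATE statement
SENTINEL = Decimal("-1")     # value written in place of a suppressed cost

def suppress_insurance_cost(cost):
    """Return the sentinel when cost exceeds the threshold, else the cost unchanged."""
    return SENTINEL if cost > THRESHOLD else cost

# The ten insurance_cost values from the sample records
sample_costs = [Decimal(c) for c in (
    "337.38", "270.66", "408.03", "978.66", "694.81",
    "822.35", "782.03", "385.76", "466.35", "799.51",
)]

suppressed = [suppress_insurance_cost(c) for c in sample_costs]
kept = [c for c in suppressed if c != SENTINEL]

print(f"{len(sample_costs) - len(kept)} of {len(sample_costs)} values suppressed")
print("Unsuppressed values:", [str(c) for c in kept])
```

With this sample data, only `270.66` falls at or below the threshold, so nine of the ten rows end up holding `-1`. Because the comparison is strict, a cost of exactly 300 would be left unchanged.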

## 4️⃣ Verifying Suppressed Data
Let's check how the suppressed financial data looks.

In [None]:
import pandas as pd

with engine.connect() as conn:
    df = pd.read_sql("SELECT id, full_name, country, medical_condition, insurance_cost FROM patients;", conn)
print(df)  # The table has only 10 rows, so display all of them rather than the first 5

## 5️⃣ Key Takeaways and Insights
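- **Suppression enhances financial privacy** by replacing sensitive values with a sentinel.
- **Insurance cost values above 300** are masked with `-1`.
- **This method supports compliance** with privacy laws like GDPR and HIPAA, although suppressing one column does not guarantee compliance on its own.
- **Suppressed data can still be used** for aggregate financial analysis, provided rows holding the `-1` sentinel are excluded from the calculation.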
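
To illustrate the last point, the sketch below compares an average computed over the suppressed column with and without the `-1` sentinel. The list reproduces the state of `insurance_cost` after the `UPDATE` in this notebook (nine masked values and one retained value); the variable names are illustrative.

```python
from statistics import mean

SENTINEL = -1  # suppression marker used in this notebook

# insurance_cost column after suppression, in id order
costs_after_suppression = [-1, 270.66, -1, -1, -1, -1, -1, -1, -1, -1]

# Averaging the raw column treats every -1 as a real cost and distorts the result
naive_mean = mean(costs_after_suppression)

# Excluding the sentinel leaves only genuine cost values
valid_costs = [c for c in costs_after_suppression if c != SENTINEL]
filtered_mean = mean(valid_costs)

print(f"Mean including sentinels: {naive_mean:.2f}")
print(f"Mean excluding sentinels: {filtered_mean:.2f}")
```

The same filter in SQL is a `WHERE insurance_cost <> -1` clause. Note that the filtered result describes only the unsuppressed rows, which by construction are the low-cost ones, so it is not an estimate of the original average.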

🚀 Next Steps:
- Implement **conditional suppression based on additional criteria**.
- Explore **data masking techniques** for better privacy control.
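
As a starting point for the first item, here is a minimal sketch of conditional suppression in plain Python. It masks a cost only when it exceeds the threshold and the record's medical condition belongs to a set treated as sensitive. The choice of sensitive conditions, the `suppress_conditionally` function, and the constant names are assumptions made for illustration, not part of the original notebook.

```python
from decimal import Decimal

SENTINEL = Decimal("-1")
THRESHOLD = Decimal("300")
# Assumption: these conditions are treated as especially sensitive (illustrative choice)
SENSITIVE_CONDITIONS = {"Kidney Failure", "Heart Condition"}

def suppress_conditionally(record):
    """Return a copy of record with insurance_cost masked only when both criteria hold."""
    masked = dict(record)
    if (record["insurance_cost"] > THRESHOLD
            and record["medical_condition"] in SENSITIVE_CONDITIONS):
        masked["insurance_cost"] = SENTINEL
    return masked

# Three of the notebook's sample records
records = [
    {"full_name": "Samantha Ford", "medical_condition": "Heart Condition",
     "insurance_cost": Decimal("337.38")},
    {"full_name": "Donald Phillips", "medical_condition": "Diabetes",
     "insurance_cost": Decimal("408.03")},
    {"full_name": "Rebecca Parker", "medical_condition": "Lung Disease",
     "insurance_cost": Decimal("270.66")},
]

results = [suppress_conditionally(r) for r in records]
for row in results:
    print(row["full_name"], row["insurance_cost"])
```

Only the first record meets both criteria, so only its cost is replaced. In SQL, the equivalent change would be an extra `AND medical_condition IN (...)` condition on the `UPDATE` statement's `WHERE` clause.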
