# 📌 Fraud Detection with PostgreSQL in Python

## Scenario:

You work for a financial company and need to detect fraudulent transactions in a credit card dataset. Fraudulent transactions often have unusual spending patterns, such as large amounts, rapid transactions in a short period, or purchases in different locations within a short time



## Step 1: Running SQL & Connecting to PostgreSQL



In [11]:
## Install package to run SQL 
!pip install ipython-sql

Collecting ipython-sql
  Downloading ipython_sql-0.5.0-py3-none-any.whl.metadata (17 kB)
Collecting prettytable (from ipython-sql)
  Downloading prettytable-3.14.0-py3-none-any.whl.metadata (30 kB)
Collecting sqlparse (from ipython-sql)
  Downloading sqlparse-0.5.3-py3-none-any.whl.metadata (3.9 kB)
Downloading ipython_sql-0.5.0-py3-none-any.whl (20 kB)
Downloading prettytable-3.14.0-py3-none-any.whl (31 kB)
Downloading sqlparse-0.5.3-py3-none-any.whl (44 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m44.4/44.4 kB[0m [31m2.9 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: sqlparse, prettytable, ipython-sql
Successfully installed ipython-sql-0.5.0 prettytable-3.14.0 sqlparse-0.5.3


In [28]:
# Configure ipython-sql to return results as Pandas DataFrames
%config SqlMagic.autopandas = True

# Load the SQL extension
%load_ext sql


In [30]:
# Use psycopg2 to create the new database (outside of a transaction block)
import psycopg2

# Connect to the PostgreSQL server 
conn = psycopg2.connect(
    host="localhost",
    port="5432",
    user="postgres",
    password="965210"
)
conn.autocommit = True  # Disable transactions for CREATE DATABASE
cursor = conn.cursor()

# Create the new database
cursor.execute("CREATE DATABASE fraud_db;")
cursor.close()
conn.close()

# Connect to the new database using ipython-sql
%sql postgresql://postgres:965210@localhost:5432/fraud_db



DuplicateDatabase: database "fraud_db" already exists


## Step 2: Create a Fraud Detection Table

####  Create the schema and table

In [32]:
%%sql 

CREATE TABLE IF NOT EXISTS fraud_schema.transactions (
    transaction_id SERIAL PRIMARY KEY,
    customer_id INT,
    amount DECIMAL(10,2),
    transaction_date TIMESTAMP,
    location VARCHAR(255),
    is_fraud BOOLEAN DEFAULT FALSE
);

Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.12/site-packages/sql/magic.py", line 196, in execute
    conn = sql.connection.Connection.set(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.12/site-packages/sql/connection.py", line 82, in set
    raise ConnectionError(
sql.connection.ConnectionError: Environment variable $DATABASE_URL not set, and no connect string given.

Connection info needed in SQLAlchemy format, example:
               postgresql://username:password@hostname/dbname
               or an existing connection: dict_keys([])


#### Insert Sample Transactions

In [34]:
%%sql 
INSERT INTO fraud_schema.transactions (customer_id, amount, transaction_date, location, is_fraud)
VALUES 
    (101, 50.00, '2024-02-20 08:00:00', 'New York', FALSE),
    (102, 200.00, '2024-02-20 08:15:00', 'Los Angeles', FALSE),
    (103, 5000.00, '2024-02-20 08:30:00', 'Chicago', TRUE),  -- High amount fraud
    (101, 50.00, '2024-02-20 08:45:00', 'New York', FALSE),
    (101, 5000.00, '2024-02-20 08:50:00', 'New York', TRUE),  -- Unusual for this user
    (104, 1200.00, '2024-02-20 09:00:00', 'Miami', TRUE),  -- Suspicious amount
    (105, 75.00, '2024-02-20 09:10:00', 'Chicago', FALSE);


Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.12/site-packages/sql/magic.py", line 196, in execute
    conn = sql.connection.Connection.set(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.12/site-packages/sql/connection.py", line 82, in set
    raise ConnectionError(
sql.connection.ConnectionError: Environment variable $DATABASE_URL not set, and no connect string given.

Connection info needed in SQLAlchemy format, example:
               postgresql://username:password@hostname/dbname
               or an existing connection: dict_keys([])


## Step 3: Identify Suspicious Transactions



1️⃣ Rule-Based Fraud Detection
Look for high-value transactions (e.g., greater than $1000).

In [36]:
%%sql  
SELECT * 
FROM fraud_schema.transactions
WHERE amount > 1000;


Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.12/site-packages/sql/magic.py", line 196, in execute
    conn = sql.connection.Connection.set(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.12/site-packages/sql/connection.py", line 82, in set
    raise ConnectionError(
sql.connection.ConnectionError: Environment variable $DATABASE_URL not set, and no connect string given.

Connection info needed in SQLAlchemy format, example:
               postgresql://username:password@hostname/dbname
               or an existing connection: dict_keys([])


2️⃣ Multiple Transactions in a Short Time (Velocity Check)
Detect if a customer made more than 3 transactions in 30 minutes.

In [38]:
%%sql
SELECT 
    customer_id, 
    COUNT(*) AS num_transactions, 
    MIN(transaction_date) AS first_transaction, 
    MAX(transaction_date) AS last_transaction
FROM 
    fraud_schema.transactions
GROUP BY 
    customer_id
HAVING 
    COUNT(*) > 3 
    AND EXTRACT(EPOCH FROM (MAX(transaction_date) - MIN(transaction_date))) / 60 <= 30;

Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.12/site-packages/sql/magic.py", line 196, in execute
    conn = sql.connection.Connection.set(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.12/site-packages/sql/connection.py", line 82, in set
    raise ConnectionError(
sql.connection.ConnectionError: Environment variable $DATABASE_URL not set, and no connect string given.

Connection info needed in SQLAlchemy format, example:
               postgresql://username:password@hostname/dbname
               or an existing connection: dict_keys([])


3️⃣ Location Anomaly Detection
Find customers who made transactions from different locations within 1 hour.

In [40]:
%%sql
SELECT 
    t1.customer_id, 
    t1.transaction_date, 
    t1.location AS location_1, 
    t2.location AS location_2
FROM 
    fraud_schema.transactions t1
JOIN 
    fraud_schema.transactions t2
ON 
    t1.customer_id = t2.customer_id
    AND t1.transaction_id <> t2.transaction_id
    AND ABS(EXTRACT(EPOCH FROM (t1.transaction_date - t2.transaction_date)) / 60) <= 60
    AND t1.location <> t2.location;

Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.12/site-packages/sql/magic.py", line 196, in execute
    conn = sql.connection.Connection.set(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.12/site-packages/sql/connection.py", line 82, in set
    raise ConnectionError(
sql.connection.ConnectionError: Environment variable $DATABASE_URL not set, and no connect string given.

Connection info needed in SQLAlchemy format, example:
               postgresql://username:password@hostname/dbname
               or an existing connection: dict_keys([])


## Step 4: Mark Fraudulent Transactions
Updating the table to flag suspicious transactions.

In [42]:
%%sql
UPDATE fraud_schema.transactions
SET is_fraud = TRUE
WHERE amount > 1000
   OR customer_id IN (
       -- Subquery 1: Customers with more than 3 transactions within 30 minutes
       SELECT customer_id
       FROM (
           SELECT 
               customer_id, 
               transaction_date,
               LAG(transaction_date) OVER (PARTITION BY customer_id ORDER BY transaction_date) AS prev_transaction_date
           FROM fraud_schema.transactions
       ) AS subquery
       WHERE ABS(EXTRACT(EPOCH FROM (transaction_date - prev_transaction_date)) / 60) <= 30
       GROUP BY customer_id
       HAVING COUNT(*) > 3
   )
   OR customer_id IN (
       -- Subquery 2: Customers with transactions in different locations within 60 minutes
       SELECT t1.customer_id
       FROM fraud_schema.transactions t1
       JOIN fraud_schema.transactions t2
       ON t1.customer_id = t2.customer_id
       AND t1.transaction_id <> t2.transaction_id
       AND ABS(EXTRACT(EPOCH FROM (t1.transaction_date - t2.transaction_date)) / 60) <= 60
       AND t1.location <> t2.location
   );

Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.12/site-packages/sql/magic.py", line 196, in execute
    conn = sql.connection.Connection.set(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.12/site-packages/sql/connection.py", line 82, in set
    raise ConnectionError(
sql.connection.ConnectionError: Environment variable $DATABASE_URL not set, and no connect string given.

Connection info needed in SQLAlchemy format, example:
               postgresql://username:password@hostname/dbname
               or an existing connection: dict_keys([])


## Step 5: Automate Fraud Detection with a Stored Procedure
Creating a Stored Procedure to automate fraud detection

In [212]:
%%sql
CREATE OR REPLACE FUNCTION fraud_schema.detect_fraud()
RETURNS TEXT
LANGUAGE plpgsql
AS 
$$
BEGIN
    -- Update transactions to mark them as fraud based on the conditions
    UPDATE fraud_schema.transactions
    SET is_fraud = TRUE
    WHERE amount > 1000
       OR customer_id IN (
           -- Subquery 1: Customers with more than 3 transactions within 30 minutes
           SELECT customer_id
           FROM (
               SELECT 
                   customer_id, 
                   transaction_date,
                   LAG(transaction_date) OVER (PARTITION BY customer_id ORDER BY transaction_date) AS prev_transaction_date
               FROM fraud_schema.transactions
           ) AS subquery
           WHERE ABS(EXTRACT(EPOCH FROM (transaction_date - prev_transaction_date)) / 60) <= 30
           GROUP BY customer_id
           HAVING COUNT(*) > 3
       )
       OR customer_id IN (
           -- Subquery 2: Customers with transactions in different locations within 60 minutes
           SELECT t1.customer_id
           FROM fraud_schema.transactions t1
           JOIN fraud_schema.transactions t2
           ON t1.customer_id = t2.customer_id
           AND t1.transaction_id <> t2.transaction_id
           AND ABS(EXTRACT(EPOCH FROM (t1.transaction_date - t2.transaction_date)) / 60) <= 60
           AND t1.location <> t2.location
       );

    -- Return a success message
    RETURN 'Fraud Detection Completed';
END;
$$;


-- Call the function
SELECT fraud_schema.detect_fraud();

   postgresql://postgres:***@localhost:5432/fraud_db
 * postgresql://postgres:***@localhost:5432/postgres
Done.
1 rows affected.


KeyError: 'DEFAULT'

In [218]:
from IPython.display import display, Markdown

result = %sql SELECT fraud_schema.detect_fraud();

# Create a Markdown table
markdown_table = f"| detect_fraud |\n|--------------|\n| {result[0][0]} |"

# Display the Markdown table
display(Markdown(markdown_table))

   postgresql://postgres:***@localhost:5432/fraud_db
 * postgresql://postgres:***@localhost:5432/postgres
1 rows affected.


KeyError: 'DEFAULT'