# Snowflake ML Demo: FAERS Data Setup

This notebook creates the FDA Adverse Event Reporting System (FAERS) tables, file format, and stage, then loads FAERS quarterly data (2024 Q2) for our ML demo.

## What is FAERS?
FAERS is the FDA's database for collecting adverse event reports, medication errors, and product quality complaints for drugs and therapeutic biologic products.

## What We're Creating
- **FAERS_ADVERSE_EVENTS**: Patient and event information
- **FAERS_DRUGS**: Drug information for each case
- **FAERS_REACTIONS**: Adverse reaction terms
- **FAERS_OUTCOMES**: Event outcomes (death, hospitalization, etc.)
- **FAERS_OUTCOME_CODES**: Reference table for outcome descriptions

## Data Source
Real FAERS data: https://fis.fda.gov/extensions/FPD-QDE-FAERS/FPD-QDE-FAERS.html

## Prerequisites
- A Snowflake Notebooks environment to run this notebook in
- Environment setup completed (`setup_environment.sql` script)


In [None]:
# Initialize Snowflake Session
print("Initializing Snowflake session...")

# Import Snowpark session (available in Snowflake Notebooks)
from snowflake.snowpark.context import get_active_session

# Get the active Snowflake session
session = get_active_session()

print("SUCCESS: Snowflake session initialized!")

# Set context for FAERS data setup
print("Setting context for FAERS data...")
session.sql("USE DATABASE ADVERSE_EVENT_MONITORING").collect()
session.sql("USE SCHEMA FDA_FAERS").collect() 
session.sql("USE WAREHOUSE ADVERSE_EVENT_WH").collect()

# Verify context
current_context = session.sql("""
    SELECT 
        CURRENT_DATABASE() as database,
        CURRENT_SCHEMA() as schema,
        CURRENT_WAREHOUSE() as warehouse
""").collect()[0]

print(f"   Database: {current_context['DATABASE']}")
print(f"   Schema: {current_context['SCHEMA']}")
print(f"   Warehouse: {current_context['WAREHOUSE']}")
print("SUCCESS: Context set successfully!")


In [None]:
# Create FAERS_ADVERSE_EVENTS table
print("Creating FAERS_ADVERSE_EVENTS table...")

session.sql("""
    CREATE OR REPLACE TABLE FAERS_ADVERSE_EVENTS (
        PRIMARYID VARCHAR,
        CASEID VARCHAR,
        CASEVERSION VARCHAR,
        I_F_CODE VARCHAR,
        EVENT_DT VARCHAR,
        MFR_DT VARCHAR,
        INIT_FDA_DT VARCHAR,
        FDA_DT VARCHAR,
        REPT_COD VARCHAR,
        AUTH_NUM VARCHAR,
        MFR_NUM VARCHAR,
        MFR_SNDR VARCHAR,
        LIT_REF STRING,
        AGE VARCHAR,
        AGE_COD VARCHAR,
        AGE_GRP VARCHAR,
        SEX VARCHAR,
        E_SUB VARCHAR,
        WT VARCHAR,
        WT_COD VARCHAR,
        REPT_DT VARCHAR,
        TO_MFR VARCHAR,
        OCCP_COD VARCHAR,
        REPORTER_COUNTRY VARCHAR,
        OCCR_COUNTRY VARCHAR
    )
""").collect()

print("SUCCESS: FAERS_ADVERSE_EVENTS table created successfully!")


In [None]:
# Create FAERS_DRUGS table
print("Creating FAERS_DRUGS table...")

session.sql("""
    CREATE OR REPLACE TABLE FAERS_DRUGS (
        PRIMARYID VARCHAR,
        CASEID VARCHAR,
        DRUG_SEQ VARCHAR,
        ROLE_COD VARCHAR,
        DRUGNAME STRING,
        PROD_AI STRING,
        VAL_VBM VARCHAR,
        ROUTE VARCHAR,
        DOSE_VBM VARCHAR,
        CUM_DOSE_CHR VARCHAR,
        CUM_DOSE_UNIT VARCHAR,
        DECHAL VARCHAR,
        RECHAL VARCHAR,
        LOT_NUM VARCHAR,
        EXP_DT VARCHAR,
        NDA_NUM VARCHAR,
        DOSE_AMT VARCHAR,
        DOSE_UNIT VARCHAR,
        DOSE_FORM VARCHAR,
        DOSE_FREQ VARCHAR
    )
""").collect()

print("SUCCESS: FAERS_DRUGS table created successfully!")


In [None]:
# Create FAERS_REACTIONS table
print("Creating FAERS_REACTIONS table...")

session.sql("""
    CREATE TABLE IF NOT EXISTS FAERS_REACTIONS (
        PRIMARYID VARCHAR,
        CASEID VARCHAR,
        PT STRING,
        DRUG_REC_ACT STRING
    )
""").collect()

print("SUCCESS: FAERS_REACTIONS table created successfully!")

# Create FAERS_OUTCOMES table
print("Creating FAERS_OUTCOMES table...")

session.sql("""
    CREATE TABLE IF NOT EXISTS FAERS_OUTCOMES (
        PRIMARYID VARCHAR,
        CASEID VARCHAR,
        OUTC_COD VARCHAR
    )
""").collect()

print("SUCCESS: FAERS_OUTCOMES table created successfully!")


In [None]:
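# Create FAERS_OUTCOME_CODES table
# CREATE OR REPLACE keeps this cell idempotent: with IF NOT EXISTS, the
# INSERT below would append duplicate codes every time the cell is re-run.
print("Creating FAERS_OUTCOME_CODES table...")

session.sql("""
    CREATE OR REPLACE TABLE FAERS_OUTCOME_CODES (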
        outc_cod VARCHAR(10),
        outcome_description VARCHAR(100)
    )
""").collect()

print("SUCCESS: FAERS_OUTCOME_CODES table created successfully!")

# Insert outcome code data
print("Inserting outcome code data...")

session.sql("""
    INSERT INTO FAERS_OUTCOME_CODES VALUES
    ('DE', 'Death'),
    ('LT', 'Life-Threatening'),
    ('HO', 'Hospitalization - Initial or Prolonged'),
    ('DS', 'Disability'),
    ('CA', 'Congenital Anomaly'),
    ('RI', 'Required Intervention to Prevent Permanent Impairment/Damage'),
    ('OT', 'Other Serious (Important Medical Event)')
""").collect()

print("SUCCESS: Outcome codes inserted successfully!")


In [None]:
# Create file format for FAERS CSV files
print("Creating FAERS file format...")

session.sql("""
    CREATE FILE FORMAT IF NOT EXISTS FAERS_FILE_FORMAT
        TYPE = 'CSV'
        FIELD_DELIMITER = '$'
        SKIP_HEADER = 1
        FIELD_OPTIONALLY_ENCLOSED_BY = NONE
        ENCODING = 'UTF8'
        TRIM_SPACE = TRUE
        EMPTY_FIELD_AS_NULL = TRUE
""").collect()

print("SUCCESS: FAERS file format created successfully!")

# Create internal stage for FAERS data
print("Creating FAERS stage...")

session.sql("""
    CREATE STAGE IF NOT EXISTS FAERS_STAGE ENCRYPTION = (TYPE='SNOWFLAKE_SSE')
""").collect()

print("SUCCESS: FAERS stage created successfully!")


## Data Loading

### Load Real FAERS Data

**Step 1: Download FAERS Data**
- Download FAERS quarterly data (2024 Q2) from the [FDA website](https://fis.fda.gov/extensions/FPD-QDE-FAERS/FPD-QDE-FAERS.html)
- Alternatively, download pre-processed files from the [GDrive folder](https://drive.google.com/drive/folders/1rzkNIWt-Or0HoRIA2SoTw6lg82RenpC3?usp=sharing)

**Step 2: File Format Preparation**
- If you download directly from the FDA website, convert the files to UTF-8 before uploading
- Pre-processed files from the GDrive folder are already UTF-8 encoded
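
The conversion in Step 2 can be sketched in a few lines of Python, run locally before upload. The source encoding (`latin-1` here) is an assumption, not something this notebook specifies, so check the actual encoding of your files first. `convert_to_utf8` is a hypothetical helper name.

```python
from pathlib import Path


def convert_to_utf8(src: Path, dst: Path, source_encoding: str = "latin-1") -> int:
    """Re-encode a text file as UTF-8 and return the number of lines written."""
    lines = 0
    # newline="" leaves the original line endings untouched on read and write
    with open(src, "r", encoding=source_encoding, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)
            lines += 1
    return lines
```

For example, `convert_to_utf8(Path("DEMO24Q2.txt"), Path("DEMO24Q2_utf8.txt"))` would produce a file named like the ones in Step 3, assuming the downloaded file is called `DEMO24Q2.txt`.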

**Step 3: File to Table Mapping**
Use the following file-to-table mapping (files from the GDrive folder; the `.txt` extension matches the `COPY INTO` commands below):
- `REAC24Q2_utf8.txt` → `FAERS_REACTIONS`
- `DRUG24Q2_utf8.txt` → `FAERS_DRUGS`
- `OUTC24Q2_utf8.txt` → `FAERS_OUTCOMES`
- `DEMO24Q2_utf8.txt` → `FAERS_ADVERSE_EVENTS`
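
The mapping above can also be kept as data, so that the load statements are generated rather than typed by hand. This is only a sketch: `FILE_TO_TABLE` and `build_copy_statement` are hypothetical names introduced here, the `.txt` extension is taken from the `COPY INTO` cell in this notebook, and the statement uses the parenthesized `FORMAT_NAME` form of the `FILE_FORMAT` option.

```python
# File-to-table mapping from Step 3 (file names as they appear on the stage)
FILE_TO_TABLE = {
    "DEMO24Q2_utf8.txt": "FAERS_ADVERSE_EVENTS",
    "DRUG24Q2_utf8.txt": "FAERS_DRUGS",
    "REAC24Q2_utf8.txt": "FAERS_REACTIONS",
    "OUTC24Q2_utf8.txt": "FAERS_OUTCOMES",
}


def build_copy_statement(file_name: str, table: str,
                         stage: str = "FAERS_STAGE",
                         file_format: str = "FAERS_FILE_FORMAT") -> str:
    """Return a COPY INTO statement that loads one staged file into one table."""
    return (
        f"COPY INTO {table} "
        f"FROM @{stage}/{file_name} "
        f"FILE_FORMAT = (FORMAT_NAME = '{file_format}')"
    )


# Print the generated statements so they can be reviewed before running them
for file_name, table in FILE_TO_TABLE.items():
    print(build_copy_statement(file_name, table))
```

Each generated string could then be passed to `session.sql(...).collect()`, in the same way as the hand-written statements in the loading cell.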

**Step 4: Data Upload**
1. Upload the files to `@FAERS_STAGE` through the Snowsight UI
2. Use the `COPY INTO` commands below to load the data into the tables



## FAERS Stage
After uploading the files your stage should look like this:

![FAERS Stage Upload](../FAERS_STAGE_UPLOAD.png)
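
The upload can also be checked in code by comparing the stage listing against the expected file names. A minimal sketch, assuming the first column returned by `LIST @FAERS_STAGE` is the file path (for example `faers_stage/DEMO24Q2_utf8.txt`, possibly with a `.gz` suffix if the upload compressed the file). `EXPECTED_FILES` and `missing_stage_files` are hypothetical names.

```python
EXPECTED_FILES = [
    "DEMO24Q2_utf8.txt",
    "DRUG24Q2_utf8.txt",
    "REAC24Q2_utf8.txt",
    "OUTC24Q2_utf8.txt",
]


def missing_stage_files(staged_names, expected=EXPECTED_FILES):
    """Return the expected files that do not appear among the staged paths."""
    present = set()
    for name in staged_names:
        # Keep only the base name and drop an optional .gz suffix
        base = name.rsplit("/", 1)[-1]
        if base.endswith(".gz"):
            base = base[:-3]
        present.add(base)
    return [f for f in expected if f not in present]


# In the notebook (assumes the `session` object created earlier):
# rows = session.sql("LIST @FAERS_STAGE").collect()
# print(missing_stage_files([row[0] for row in rows]))
```

An empty list means all four files are staged and the `COPY INTO` commands can be run.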


In [None]:
# Load FAERS Data into Tables
print("Loading FAERS data into tables...")

# Load FAERS_ADVERSE_EVENTS (from DEMO24Q2_utf8.txt)
print("Loading FAERS_ADVERSE_EVENTS...")
try:
    session.sql("""
        COPY INTO FAERS_ADVERSE_EVENTS
        FROM @FAERS_STAGE/DEMO24Q2_utf8.txt
        FILE_FORMAT = FAERS_FILE_FORMAT
    """).collect()
    print("SUCCESS: FAERS_ADVERSE_EVENTS loaded successfully!")
except Exception as e:
    print(f"ERROR loading FAERS_ADVERSE_EVENTS: {e}")

# Load FAERS_DRUGS (from DRUG24Q2_utf8.txt)
print("Loading FAERS_DRUGS...")
try:
    session.sql("""
        COPY INTO FAERS_DRUGS
        FROM @FAERS_STAGE/DRUG24Q2_utf8.txt
        FILE_FORMAT = FAERS_FILE_FORMAT
        ON_ERROR = 'CONTINUE'
    """).collect()
    print("SUCCESS: FAERS_DRUGS loaded successfully!")
except Exception as e:
    print(f"ERROR loading FAERS_DRUGS: {e}")

# Load FAERS_REACTIONS (from REAC24Q2_utf8.txt)
print("Loading FAERS_REACTIONS...")
try:
    session.sql("""
        COPY INTO FAERS_REACTIONS
        FROM @FAERS_STAGE/REAC24Q2_utf8.txt
        FILE_FORMAT = FAERS_FILE_FORMAT
    """).collect()
    print("SUCCESS: FAERS_REACTIONS loaded successfully!")
except Exception as e:
    print(f"ERROR loading FAERS_REACTIONS: {e}")

# Load FAERS_OUTCOMES (from OUTC24Q2_utf8.txt)
print("Loading FAERS_OUTCOMES...")
try:
    session.sql("""
        COPY INTO FAERS_OUTCOMES
        FROM @FAERS_STAGE/OUTC24Q2_utf8.txt
        FILE_FORMAT = FAERS_FILE_FORMAT
    """).collect()
    print("SUCCESS: FAERS_OUTCOMES loaded successfully!")
except Exception as e:
    print(f"ERROR loading FAERS_OUTCOMES: {e}")

print("Data loading finished. Check the messages above for any errors before continuing.")

In [None]:
# Explore the data: High-risk drug analysis
print("Analyzing top 10 high-risk drugs with serious adverse events...")

high_risk_analysis = session.sql("""
    SELECT 
        d.DRUGNAME,
        COUNT(DISTINCT o.CASEID) as serious_ae_cases,
        LISTAGG(DISTINCT oc.outcome_description, ', ') as outcome_types
    FROM FAERS_DRUGS d
    JOIN FAERS_OUTCOMES o ON d.PRIMARYID = o.PRIMARYID  -- PRIMARYID is the link key between FAERS files
    JOIN FAERS_OUTCOME_CODES oc ON o.OUTC_COD = oc.outc_cod
    WHERE o.OUTC_COD IN ('DE', 'LT', 'HO')  -- Death, Life-threatening, Hospitalization
    GROUP BY d.DRUGNAME
    ORDER BY serious_ae_cases DESC
    LIMIT 10
""").collect()

# Display results
print("\nHigh-risk drugs with serious adverse events:")
for row in high_risk_analysis:
    print(f"   🔸 {row[0]}: {row[1]} cases - {row[2]}")

print(f"\nAnalysis complete! Showing the top {len(high_risk_analysis)} drugs by serious adverse event case count.")

## FAERS Data Setup Complete!

Your FAERS database infrastructure is now ready with:

- **FAERS_ADVERSE_EVENTS** table for patient demographics and event information
- **FAERS_DRUGS** table for drug information per case
- **FAERS_REACTIONS** table for adverse reaction terms
- **FAERS_OUTCOMES** table for event outcomes
- **FAERS_OUTCOME_CODES** reference table with outcome descriptions
- **FAERS_STAGE** and file format for data loading
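
One way to confirm that the tables listed above actually contain data is to count their rows. The sketch below assumes the `session` object created at the top of this notebook and relies only on `session.sql(...).collect()` returning rows that can be indexed by position. `FAERS_TABLES` and `table_row_counts` are hypothetical names.

```python
FAERS_TABLES = [
    "FAERS_ADVERSE_EVENTS",
    "FAERS_DRUGS",
    "FAERS_REACTIONS",
    "FAERS_OUTCOMES",
    "FAERS_OUTCOME_CODES",
]


def table_row_counts(session, tables=FAERS_TABLES):
    """Return {table_name: row_count}, running one COUNT(*) query per table."""
    counts = {}
    for table in tables:
        rows = session.sql(f"SELECT COUNT(*) FROM {table}").collect()
        counts[table] = rows[0][0]
    return counts


# In the notebook:
# for table, n in table_row_counts(session).items():
#     print(f"{table}: {n:,} rows")
```

A count of zero for any of the four data tables suggests that the corresponding `COPY INTO` step failed or that the file was not staged.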

## Key Features of FAERS Structure
- **Patient and report data** including demographics (age, sex, weight) and reporting details
- **Drug-specific information** with dose, route, and dosing frequency
- **Standardized outcome codes** for consistent severity classification
- **Flexible data loading** infrastructure for real FAERS data

## Next Steps
Create analytics tables with **03_Analytics_Tables_Setup**

---
*FAERS data provides the regulatory context needed for comprehensive adverse event prediction.*
