## ETL for Patient Contact Data (MySQL)

### 1. Import Dependencies and Connections

This block imports the necessary libraries and loads both database engines:

- mysql_engine → used to read from the operational MySQL source system
- pg_engine → used to load data into the PostgreSQL data warehouse

```python
import pandas as pd
from config import mysql_engine, pg_engine
```

**Explanation:**
The pipeline uses Pandas combined with SQLAlchemy engines to extract operational patient contact information from MySQL and load it into the dim_patient_contact table inside the PostgreSQL warehouse.
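The config module itself is not shown in this notebook. The following is a minimal sketch of how it might build the two engines, assuming credentials come from environment variables and that the pymysql and psycopg2 drivers are installed. The helper names database_url and make_engines, the variable prefixes MYSQL and PG, and the driver choices are all hypothetical.

```python
import os
from urllib.parse import quote_plus


def database_url(driver, prefix, env=os.environ):
    """Build a SQLAlchemy-style URL from PREFIX_USER, PREFIX_PASSWORD, etc. (hypothetical names)."""
    user = env[f"{prefix}_USER"]
    password = quote_plus(env[f"{prefix}_PASSWORD"])  # escape '@', ':' and similar characters
    host = env.get(f"{prefix}_HOST", "localhost")
    port = env[f"{prefix}_PORT"]
    name = env[f"{prefix}_DB"]
    return f"{driver}://{user}:{password}@{host}:{port}/{name}"


def make_engines(env=os.environ):
    """Create both engines; needs SQLAlchemy plus the two database drivers."""
    # Imported here so the URL helper above stays dependency-free.
    from sqlalchemy import create_engine

    mysql_engine = create_engine(database_url("mysql+pymysql", "MYSQL", env))
    pg_engine = create_engine(database_url("postgresql+psycopg2", "PG", env))
    return mysql_engine, pg_engine
```

In a real config.py the result of make_engines() would be assigned to module-level mysql_engine and pg_engine so that the import in the cell above works unchanged.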

### 2. Extract: Read Patient Contact Data from MySQL

```python
# Read the full source table, then tag every row with its origin for lineage.
patient_contact_df = pd.read_sql("SELECT * FROM patient_contact", con=mysql_engine)
patient_contact_df["source_system"] = "MySQL_patient_contact"
```

**Explanation:**
The ETL process begins by pulling the entire patient_contact table from the MySQL operational database.
This table typically includes:
- patient_nbr
- phone
- city
- country

A source_system column is added to clearly document lineage and maintain traceability inside the warehouse.
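Because the extract uses SELECT *, a schema change in the source would flow silently into the warehouse load. One way to guard against that is to check the expected columns before tagging the rows. The sketch below uses an in-memory SQLite table as a stand-in for the MySQL source; the sample rows and the EXPECTED_COLUMNS list are made up for illustration and assume the four columns listed above.

```python
import sqlite3

import pandas as pd

# Stand-in for the MySQL source: an in-memory SQLite table with made-up rows.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE patient_contact (patient_nbr INTEGER, phone TEXT, city TEXT, country TEXT)"
)
source.executemany(
    "INSERT INTO patient_contact VALUES (?, ?, ?, ?)",
    [(1001, "555-0100", "Lyon", "France"), (1002, "555-0101", "Porto", "Portugal")],
)

EXPECTED_COLUMNS = ["patient_nbr", "phone", "city", "country"]

df = pd.read_sql("SELECT * FROM patient_contact", con=source)

# Fail fast if the source schema no longer has the columns the dimension needs.
missing = set(EXPECTED_COLUMNS) - set(df.columns)
if missing:
    raise ValueError(f"patient_contact is missing columns: {sorted(missing)}")

# Keep only the expected columns, in a fixed order, then add the lineage tag.
df = df[EXPECTED_COLUMNS].copy()
df["source_system"] = "MySQL_patient_contact"
```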

### 3. Load: Truncate the Dimension Table and Insert MySQL Data into PostgreSQL

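```python
# Truncate and reload in a single transaction: if the insert fails,
# the TRUNCATE is rolled back and the previous rows remain in place.
with pg_engine.begin() as conn:
    # RESTART IDENTITY also resets sequences owned by the table's columns.
    conn.exec_driver_sql("TRUNCATE TABLE dim_patient_contact RESTART IDENTITY;")
    patient_contact_df.to_sql(
        "dim_patient_contact",
        con=conn,
        if_exists="append",  # keep the existing table definition
        index=False,  # do not write the DataFrame index as a column
    )
```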

**Explanation:**

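**1.** Before loading the contact data into PostgreSQL, the ETL workflow empties the target dimension table.
TRUNCATE ... RESTART IDENTITY ensures:
- rows from earlier runs are removed, so reruns cannot create duplicates
- sequences owned by the table's columns (such as a surrogate key, if the table defines one) restart from their initial value
- the load is repeatable: the same source data always produces the same table contents

Because the truncate and the insert run inside the same pg_engine.begin() transaction, a failed insert rolls back the truncate as well, so a failed run does not leave the table empty.

A full truncate-and-reload is a common pattern in warehouse ETL when the source table is small and cheap to reload.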
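The repeatability described above can be checked with a small experiment. This is a sketch only: it uses an in-memory SQLite database in place of PostgreSQL, and since SQLite has no TRUNCATE statement, a DELETE plays that role. The reload_dimension helper and the sample rows are hypothetical.

```python
import sqlite3

import pandas as pd


def reload_dimension(conn, df):
    """Empty the dimension table, then append the full extract again."""
    conn.execute("DELETE FROM dim_patient_contact")  # stand-in for TRUNCATE in SQLite
    df.to_sql("dim_patient_contact", con=conn, if_exists="append", index=False)


warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE dim_patient_contact ("
    "patient_nbr INTEGER, phone TEXT, city TEXT, country TEXT, source_system TEXT)"
)

# Made-up extract with the columns described in this notebook.
extract = pd.DataFrame(
    {
        "patient_nbr": [1001, 1002],
        "phone": ["555-0100", "555-0101"],
        "city": ["Lyon", "Porto"],
        "country": ["France", "Portugal"],
        "source_system": ["MySQL_patient_contact"] * 2,
    }
)

# Running the load twice leaves the same rows as running it once.
reload_dimension(warehouse, extract)
reload_dimension(warehouse, extract)
row_count = warehouse.execute("SELECT COUNT(*) FROM dim_patient_contact").fetchone()[0]
```

Without the clearing step, the second call would append a second copy of every row.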

**2.** The extracted MySQL data, tagged with source_system, is then appended to the dim_patient_contact dimension in the warehouse. No other transformation is applied in this notebook. Each record serves as an extension of the dim_patient table (joined through patient_nbr), enriching the warehouse with:

- phone contact information
- city and country (linking later to dim_country)
- lineage metadata
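To illustrate how the contact dimension extends dim_patient through patient_nbr, here is a minimal pandas sketch. The dim_patient columns are not shown in this notebook, so the gender column and all sample rows are hypothetical, and the one_to_one check assumes at most one contact row per patient.

```python
import pandas as pd

# Hypothetical dim_patient rows; only patient_nbr is known from this notebook.
dim_patient = pd.DataFrame(
    {"patient_nbr": [1001, 1002, 1003], "gender": ["F", "M", "F"]}
)

# Made-up contact rows with the columns loaded above.
dim_patient_contact = pd.DataFrame(
    {
        "patient_nbr": [1001, 1002],
        "phone": ["555-0100", "555-0101"],
        "city": ["Lyon", "Porto"],
        "country": ["France", "Portugal"],
        "source_system": ["MySQL_patient_contact"] * 2,
    }
)

# A left join keeps every patient, including those with no contact record.
# validate="one_to_one" raises if patient_nbr is duplicated on either side.
enriched = dim_patient.merge(
    dim_patient_contact, on="patient_nbr", how="left", validate="one_to_one"
)
```

Patients without a matching contact row keep their dim_patient attributes and get missing values in the contact columns.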