#### STEP 1: Convert Datatypes (Overwrite SAME Silver table)

### ✅ Fix timestamps and keep structure clean

In [0]:
%sql
CREATE OR REPLACE TABLE angad_kumar91.fhir_healthcare_analytics_silver.condition
USING DELTA
AS
SELECT
    condition_id,
    patient_id,
    encounter_id,
    diagnosis,
    clinical_status,

    -- Correct datetime types
    CAST(onset_time AS TIMESTAMP)      AS onset_time,
    CAST(recorded_date AS TIMESTAMP)   AS recorded_date,

    source_file,
    ingest_time

FROM angad_kumar91.fhir_healthcare_analytics_silver.condition;
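
The casts above assume the stored strings are parseable timestamps. As a rough local illustration in plain Python (not Spark), the sketch below parses ISO 8601 strings and returns `None` when parsing fails, which is similar to how `CAST(... AS TIMESTAMP)` yields NULL for bad input when ANSI mode is off. The helper name `to_timestamp_or_none` and the sample values are hypothetical.

```python
from datetime import datetime
from typing import Optional


def to_timestamp_or_none(raw: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string; return None for missing or unparseable input."""
    if raw is None or raw.strip() == "":
        return None
    try:
        return datetime.fromisoformat(raw.strip())
    except ValueError:
        return None


# Hypothetical sample values, not taken from the Silver table.
samples = ["2021-03-04T15:00:00+05:30", "2021-03-05", "not-a-date", "", None]
parsed = [to_timestamp_or_none(s) for s in samples]
for raw, ts in zip(samples, parsed):
    print(repr(raw), "->", ts)
```

Counting how many values come back as `None` gives a quick estimate of how many NULLs the cast would introduce.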


#### STEP 2: Check NULL / Empty Values (Data Quality Report)

### ✅ Count nulls & empty strings per column

In [0]:
%sql
SELECT
    COUNT(*) AS total_rows,

    SUM(CASE WHEN condition_id IS NULL OR condition_id = '' THEN 1 ELSE 0 END) AS condition_id_nulls,
    SUM(CASE WHEN patient_id IS NULL OR patient_id = '' THEN 1 ELSE 0 END) AS patient_id_nulls,
    SUM(CASE WHEN encounter_id IS NULL OR encounter_id = '' THEN 1 ELSE 0 END) AS encounter_id_nulls,
    SUM(CASE WHEN diagnosis IS NULL OR diagnosis = '' THEN 1 ELSE 0 END) AS diagnosis_nulls,
    SUM(CASE WHEN clinical_status IS NULL OR clinical_status = '' THEN 1 ELSE 0 END) AS clinical_status_nulls,
    SUM(CASE WHEN onset_time IS NULL THEN 1 ELSE 0 END) AS onset_time_nulls,
    SUM(CASE WHEN recorded_date IS NULL THEN 1 ELSE 0 END) AS recorded_date_nulls

FROM angad_kumar91.fhir_healthcare_analytics_silver.condition;
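
The counting pattern above can be tried locally before running it on the Silver table. This is a minimal sketch, assuming Python's built-in `sqlite3` as a stand-in engine and a hypothetical four-row sample; SQLite is not Spark SQL, but `SUM(CASE WHEN ...)` should behave the same way for this query.

```python
import sqlite3

# In-memory stand-in table with a hypothetical subset of the condition columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE condition (condition_id TEXT, diagnosis TEXT, clinical_status TEXT)"
)
conn.executemany(
    "INSERT INTO condition VALUES (?, ?, ?)",
    [
        ("c1", "Hypertension", "active"),
        ("c2", "", "resolved"),   # empty diagnosis
        ("c3", None, ""),         # NULL diagnosis, empty status
        ("c4", "Asthma", None),   # NULL status
    ],
)

total_rows, diagnosis_nulls, clinical_status_nulls = conn.execute(
    """
    SELECT
        COUNT(*),
        SUM(CASE WHEN diagnosis IS NULL OR diagnosis = '' THEN 1 ELSE 0 END),
        SUM(CASE WHEN clinical_status IS NULL OR clinical_status = '' THEN 1 ELSE 0 END)
    FROM condition
    """
).fetchone()

print(total_rows, diagnosis_nulls, clinical_status_nulls)
```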


**STEP 4: Fix NULL / Empty Values (UPDATE)**
##### ✅ Clean descriptive fields only
UPDATE angad_kumar91.fhir_healthcare_analytics_silver.condition
SET
    diagnosis = COALESCE(NULLIF(diagnosis, ''), 'Unknown Condition'),
    clinical_status = COALESCE(NULLIF(clinical_status, ''), 'unknown');


**For the Condition Silver table, we corrected all timestamp fields, quantified null and empty values, and applied controlled defaults only to descriptive attributes like diagnosis and clinical status. Critical identifiers and clinical timestamps were preserved to maintain medical accuracy.**
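
The default-filling expression `COALESCE(NULLIF(col, ''), default)` used for these descriptive fields can be checked in isolation. The sketch below evaluates it in SQLite (an assumption made for local testing; the notebook itself runs on Spark SQL), and the helper name `apply_default` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def apply_default(value, default):
    """Evaluate COALESCE(NULLIF(value, ''), default) in SQLite."""
    row = conn.execute(
        "SELECT COALESCE(NULLIF(?, ''), ?)", (value, default)
    ).fetchone()
    return row[0]


# NULL and '' are replaced; any other value, including a single space, is kept.
results = {
    repr(v): apply_default(v, "Unknown Condition")
    for v in [None, "", " ", "Asthma"]
}
for raw, cleaned in results.items():
    print(raw, "->", repr(cleaned))
```

In this sketch a whitespace-only value passes through unchanged, so if such values exist in the data they would need a `TRIM` before the `NULLIF`.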

 **onset time** - _the moment a disease or set of symptoms begins. In pharmacology the same term can also mean how long a treatment (such as a medication) takes to start working._

 _For example: a patient reports that their headache started yesterday at 3:00 PM; that is the_ `onset_time`. _The doctor documents the conversation in the electronic health record today at 9:00 AM; that is the_ `recorded_date`.
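
The headache example can be written out as a small calculation. This is only a sketch with hypothetical calendar dates standing in for "yesterday" and "today"; it shows that `recorded_date` normally falls after `onset_time` and how the gap between them could be measured.

```python
from datetime import datetime

# Hypothetical dates: symptom began "yesterday 3:00 PM", documented "today 9:00 AM".
onset_time = datetime(2024, 1, 10, 15, 0)
recorded_date = datetime(2024, 1, 11, 9, 0)

# Time between the symptom starting and the clinician recording it.
delay = recorded_date - onset_time
delay_hours = delay.total_seconds() / 3600

print(f"Recorded {delay_hours:.0f} hours after onset")
```

A negative delay would mean the record claims to predate the onset, which is worth flagging as a data quality issue.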

##### STEP 5: Post-Validation Check (Confirm Cleanliness)



In [0]:
%sql
SELECT
    SUM(CASE WHEN diagnosis IS NULL OR diagnosis = '' THEN 1 ELSE 0 END) AS diagnosis_remaining_nulls,
    SUM(CASE WHEN clinical_status IS NULL OR clinical_status = '' THEN 1 ELSE 0 END) AS clinical_status_remaining_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.condition;


In [0]:
%sql
select * from angad_kumar91.fhir_healthcare_analytics_silver.condition
limit 10


STEP 1: Convert Datatypes (Overwrite SAME Silver table)

✅ Fix report_time datatype

In [0]:
%sql
CREATE OR REPLACE TABLE angad_kumar91.fhir_healthcare_analytics_silver.diagnostic_report
USING DELTA
AS
SELECT
    diagnostic_report_id,
    patient_id,
    encounter_id,
    report_name,
    status,

    -- Correct datetime type
    CAST(report_time AS TIMESTAMP) AS report_time,

    source_file,
    ingest_time
FROM angad_kumar91.fhir_healthcare_analytics_silver.diagnostic_report;


STEP 2: Check NULL / Empty Values (Data Quality Check)

✅ Count nulls and empty strings

In [0]:
%sql
SELECT
    COUNT(*) AS total_rows,

    SUM(CASE WHEN diagnostic_report_id IS NULL OR diagnostic_report_id = '' THEN 1 ELSE 0 END) AS diagnostic_report_id_nulls,
    SUM(CASE WHEN patient_id IS NULL OR patient_id = '' THEN 1 ELSE 0 END) AS patient_id_nulls,
    SUM(CASE WHEN encounter_id IS NULL OR encounter_id = '' THEN 1 ELSE 0 END) AS encounter_id_nulls,
    SUM(CASE WHEN report_name IS NULL OR report_name = '' THEN 1 ELSE 0 END) AS report_name_nulls,
    SUM(CASE WHEN status IS NULL OR status = '' THEN 1 ELSE 0 END) AS status_nulls,
    SUM(CASE WHEN report_time IS NULL THEN 1 ELSE 0 END) AS report_time_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.diagnostic_report;


STEP 4: Fix NULL / Empty Values (UPDATE)

✅ Update only descriptive columns

In [0]:
%sql
UPDATE angad_kumar91.fhir_healthcare_analytics_silver.diagnostic_report
SET
    report_name = COALESCE(NULLIF(report_name, ''), 'Unknown Diagnostic Report'),
    status = COALESCE(NULLIF(status, ''), 'unknown');


STEP 5: Post-Validation (Confirm Cleanliness)

In [0]:
%sql
SELECT
    SUM(CASE WHEN report_name IS NULL OR report_name = '' THEN 1 ELSE 0 END) AS report_name_remaining_nulls,
    SUM(CASE WHEN status IS NULL OR status = '' THEN 1 ELSE 0 END) AS status_remaining_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.diagnostic_report;


**For the Diagnostic Report Silver table, we standardized timestamp fields, quantified null and empty values, and applied controlled defaults to descriptive attributes like report name and status while preserving clinical timestamps and identifiers for data integrity.**

In [0]:
%sql
select * from angad_kumar91.fhir_healthcare_analytics_silver.diagnostic_report
limit 10

## 3. Silver table Encounter

🟠 ONE-TIME JSON FLATTEN (RUN ONLY ONCE)

⚠️ Run this only once, and only if encounter_type exists

✔ This permanently converts JSON into a business column

✔ Raw JSON column is removed intentionally

✔ Table is now clean and analytics-ready

**Run-once code:**

CREATE OR REPLACE TABLE angad_kumar91.fhir_healthcare_analytics_silver.encounter
USING DELTA
AS
SELECT
    encounter_id,
    patient_id,

    CAST(admit_time AS TIMESTAMP)     AS admit_time,
    CAST(discharge_time AS TIMESTAMP) AS discharge_time,

    status,

    from_json(
        encounter_type,
        'array<struct<
            coding:array<struct<
                system:string,
                code:string,
                display:string
            >>,
            text:string
        >>'
    )[0].coding[0].display AS admission_type,

    source_file,
    ingest_time
FROM angad_kumar91.fhir_healthcare_analytics_silver.encounter;
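
To make the `[0].coding[0].display` path concrete, here is a minimal sketch of the same extraction in plain Python. The function name `first_display` and the sample JSON are hypothetical; the real column content may differ.

```python
import json
from typing import Optional


def first_display(encounter_type_json: Optional[str]) -> Optional[str]:
    """Return type[0].coding[0].display, or None if any level is missing."""
    if not encounter_type_json:
        return None
    try:
        parsed = json.loads(encounter_type_json)
        return parsed[0]["coding"][0].get("display")
    except (ValueError, KeyError, IndexError, TypeError, AttributeError):
        # Malformed JSON or an unexpected shape maps to None,
        # much like from_json returning NULL for unparseable input.
        return None


# Hypothetical sample shaped like the schema string used in from_json above.
sample = json.dumps(
    [
        {
            "coding": [
                {"system": "http://example.org/codes", "code": "AMB", "display": "Ambulatory"}
            ],
            "text": "Ambulatory",
        }
    ]
)

print(first_display(sample))
print(first_display("[]"))
```

Rows where the array or the `coding` list is empty end up with a NULL `admission_type`, which is what the later null check and UPDATE are meant to catch.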


🟢 SAFE RE-RUNNABLE CODE (ALWAYS USE THIS)

This is what goes in your daily pipeline 👇

This code never touches JSON again, so it can safely be re-run any number of times

In [0]:
%sql
CREATE OR REPLACE TABLE angad_kumar91.fhir_healthcare_analytics_silver.encounter
USING DELTA
AS
SELECT
    encounter_id,
    patient_id,
    CAST(admit_time AS TIMESTAMP)     AS admit_time,
    CAST(discharge_time AS TIMESTAMP) AS discharge_time,
    status,
    admission_type,
    source_file,
    ingest_time
FROM angad_kumar91.fhir_healthcare_analytics_silver.encounter;
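
The split between the run-once flatten and the re-runnable cast can be guarded by a column check instead of relying on memory. A minimal sketch, assuming the current column names are available as a list (in a Databricks Python cell they could come from `spark.table("...").columns`, assuming a `spark` session exists); the function name and the returned labels are hypothetical.

```python
def pick_encounter_statement(columns):
    """Decide which notebook statement applies, based on the current column names."""
    names = {c.lower() for c in columns}
    if "encounter_type" in names:
        # Raw JSON column still present: the one-time flatten has not run yet.
        return "one_time_flatten"
    if "admission_type" in names:
        # Already flattened: only the re-runnable cast is safe.
        return "rerunnable_cast"
    raise ValueError("neither encounter_type nor admission_type is present")


before = pick_encounter_statement(["encounter_id", "patient_id", "encounter_type"])
after = pick_encounter_statement(["encounter_id", "patient_id", "admission_type"])
print(before, after)
```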


In [0]:
%sql
SELECT
    COUNT(*) AS total_rows,

    SUM(CASE WHEN encounter_id IS NULL OR encounter_id = '' THEN 1 ELSE 0 END) AS encounter_id_nulls,
    SUM(CASE WHEN patient_id IS NULL OR patient_id = '' THEN 1 ELSE 0 END) AS patient_id_nulls,
    SUM(CASE WHEN admit_time IS NULL THEN 1 ELSE 0 END) AS admit_time_nulls,
    SUM(CASE WHEN discharge_time IS NULL THEN 1 ELSE 0 END) AS discharge_time_nulls,
    SUM(CASE WHEN status IS NULL OR status = '' THEN 1 ELSE 0 END) AS status_nulls,
    SUM(CASE WHEN admission_type IS NULL OR admission_type = '' THEN 1 ELSE 0 END) AS admission_type_nulls

FROM angad_kumar91.fhir_healthcare_analytics_silver.encounter;



**In the Silver layer, we implemented data quality checks to identify null and empty values. We quantified null counts per column and applied controlled default values only for descriptive fields like status and admission_type, while preserving critical identifiers and timestamps to avoid data corruption.**

#### Fix NULL / empty values (UPDATE Silver table)

UPDATE angad_kumar91.fhir_healthcare_analytics_silver.encounter
SET
    status = COALESCE(NULLIF(status, ''), 'unknown'),
    admission_type = COALESCE(NULLIF(admission_type, ''), 'Unspecified Encounter');


#### Re-check to confirm data is clean (POST-VALIDATION)


In [0]:
%sql
SELECT
    SUM(CASE WHEN status IS NULL OR status = '' THEN 1 ELSE 0 END) AS status_remaining_nulls,
    SUM(CASE WHEN admission_type IS NULL OR admission_type = '' THEN 1 ELSE 0 END) AS admission_type_remaining_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.encounter;


In [0]:
%sql
DESCRIBE HISTORY angad_kumar91.fhir_healthcare_analytics_silver.encounter;


### Restore the previous version if an error occurs

`RESTORE TABLE angad_kumar91.fhir_healthcare_analytics_silver.encounter
TO VERSION AS OF 2;`
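
Choosing the version number for the restore means reading the `version` and `operation` columns of `DESCRIBE HISTORY`. The sketch below shows one way to pick "the version just before the last UPDATE" from such rows; the history rows and the helper name `version_before_last` are hypothetical, and the operation strings are only assumed to look like this.

```python
def version_before_last(history, operation):
    """Return the version just before the most recent commit with the given operation."""
    matching = [row["version"] for row in history if row["operation"] == operation]
    if not matching:
        return None
    target = max(matching) - 1
    return target if target >= 0 else None


# Hypothetical history rows, newest first, as DESCRIBE HISTORY lists them.
history = [
    {"version": 3, "operation": "UPDATE"},
    {"version": 2, "operation": "CREATE OR REPLACE TABLE AS SELECT"},
    {"version": 1, "operation": "CREATE OR REPLACE TABLE AS SELECT"},
    {"version": 0, "operation": "CREATE TABLE AS SELECT"},
]

restore_to = version_before_last(history, "UPDATE")
print(f"RESTORE ... TO VERSION AS OF {restore_to}")
```

Always confirm the chosen version against the actual history output before running the restore.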

In [0]:
%sql
select * from angad_kumar91.fhir_healthcare_analytics_silver.encounter
limit 10

STEP 1: Convert Datatypes (Overwrite SAME Silver table)

✅ Fix all date/time columns

In [0]:
%sql
CREATE OR REPLACE TABLE angad_kumar91.fhir_healthcare_analytics_silver.explanation_of_benefit
USING DELTA
AS
SELECT
    eob_id,
    patient_id,
    claim_type,
    claim_use,

    -- Correct datetime types
    CAST(bill_start AS TIMESTAMP)    AS bill_start,
    CAST(bill_end AS TIMESTAMP)      AS bill_end,
    CAST(created_date AS TIMESTAMP)  AS created_date,

    insurer,
    total_amount,
    claim_status,
    source_file,
    ingest_time
FROM angad_kumar91.fhir_healthcare_analytics_silver.explanation_of_benefit;


STEP 2: Check NULL / Empty Values (Data Quality Check)

✅ Count nulls & empty strings per column

In [0]:
%sql
SELECT
    COUNT(*) AS total_rows,

    SUM(CASE WHEN eob_id IS NULL OR eob_id = '' THEN 1 ELSE 0 END) AS eob_id_nulls,
    SUM(CASE WHEN patient_id IS NULL OR patient_id = '' THEN 1 ELSE 0 END) AS patient_id_nulls,
    SUM(CASE WHEN claim_type IS NULL OR claim_type = '' THEN 1 ELSE 0 END) AS claim_type_nulls,
    SUM(CASE WHEN claim_use IS NULL OR claim_use = '' THEN 1 ELSE 0 END) AS claim_use_nulls,
    SUM(CASE WHEN bill_start IS NULL THEN 1 ELSE 0 END) AS bill_start_nulls,
    SUM(CASE WHEN bill_end IS NULL THEN 1 ELSE 0 END) AS bill_end_nulls,
    SUM(CASE WHEN created_date IS NULL THEN 1 ELSE 0 END) AS created_date_nulls,
    SUM(CASE WHEN insurer IS NULL OR insurer = '' THEN 1 ELSE 0 END) AS insurer_nulls,
    SUM(CASE WHEN total_amount IS NULL THEN 1 ELSE 0 END) AS total_amount_nulls,
    SUM(CASE WHEN claim_status IS NULL OR claim_status = '' THEN 1 ELSE 0 END) AS claim_status_nulls

FROM angad_kumar91.fhir_healthcare_analytics_silver.explanation_of_benefit;


STEP 4: Fix NULL / Empty Values (UPDATE)

✅ Update descriptive columns only

UPDATE angad_kumar91.fhir_healthcare_analytics_silver.explanation_of_benefit
SET
    claim_type   = COALESCE(NULLIF(claim_type, ''), 'unknown'),
    claim_use    = COALESCE(NULLIF(claim_use, ''), 'unknown'),
    insurer      = COALESCE(NULLIF(insurer, ''), 'UNKNOWN_INSURER'),
    claim_status = COALESCE(NULLIF(claim_status, ''), 'unknown');



STEP 5: Post-Validation (Confirm Clean Data)

_For the Explanation of Benefit Silver table, we standardized all billing and claim timestamps, validated null and empty values, and applied controlled defaults only to descriptive claim attributes while preserving financial and temporal accuracy._

💰 Explanation of Benefit (EOB)

👉 Insurance claim record

Definition:
Shows how much the hospital charged, how much the insurer paid, and how much the patient must pay.

Example:

Hospital bill = ₹10,000    Insurance paid = ₹7,000    Patient pays = ₹3,000
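
The arithmetic behind the example is simply the bill minus the insurer's payment. A tiny sketch using the figures above; the variable names are hypothetical and do not correspond to columns in the Silver table, which only carries `total_amount`.

```python
# Figures from the example above, in rupees.
hospital_bill = 10_000
insurance_paid = 7_000

# What remains for the patient, and the share covered by the insurer.
patient_pays = hospital_bill - insurance_paid
coverage_ratio = insurance_paid / hospital_bill

print(patient_pays, f"{coverage_ratio:.0%}")
```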

In [0]:
%sql
SELECT
    SUM(CASE WHEN claim_type IS NULL OR claim_type = '' THEN 1 ELSE 0 END) AS claim_type_remaining_nulls,
    SUM(CASE WHEN claim_use IS NULL OR claim_use = '' THEN 1 ELSE 0 END) AS claim_use_remaining_nulls,
    SUM(CASE WHEN insurer IS NULL OR insurer = '' THEN 1 ELSE 0 END) AS insurer_remaining_nulls,
    SUM(CASE WHEN claim_status IS NULL OR claim_status = '' THEN 1 ELSE 0 END) AS claim_status_remaining_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.explanation_of_benefit;


In [0]:
%sql
select * from angad_kumar91.fhir_healthcare_analytics_silver.explanation_of_benefit
limit 10

STEP 1: Convert Datatypes (Overwrite SAME Silver table)

✅ Fix vaccination_date datatype

In [0]:
%sql
CREATE OR REPLACE TABLE angad_kumar91.fhir_healthcare_analytics_silver.immunization
USING DELTA
AS
SELECT
    immunization_id,
    patient_id,
    encounter_id,
    vaccine_name,
    status,

    -- Correct datetime type
    CAST(vaccination_date AS TIMESTAMP) AS vaccination_date,

    source_file,
    ingest_time
FROM angad_kumar91.fhir_healthcare_analytics_silver.immunization;


STEP 2: Check NULL / Empty Values (Data Quality Check)

✅ Count nulls & empty strings per column

In [0]:
%sql
SELECT
    COUNT(*) AS total_rows,

    SUM(CASE WHEN immunization_id IS NULL OR immunization_id = '' THEN 1 ELSE 0 END) AS immunization_id_nulls,
    SUM(CASE WHEN patient_id IS NULL OR patient_id = '' THEN 1 ELSE 0 END) AS patient_id_nulls,
    SUM(CASE WHEN encounter_id IS NULL OR encounter_id = '' THEN 1 ELSE 0 END) AS encounter_id_nulls,
    SUM(CASE WHEN vaccine_name IS NULL OR vaccine_name = '' THEN 1 ELSE 0 END) AS vaccine_name_nulls,
    SUM(CASE WHEN status IS NULL OR status = '' THEN 1 ELSE 0 END) AS status_nulls,
    SUM(CASE WHEN vaccination_date IS NULL THEN 1 ELSE 0 END) AS vaccination_date_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.immunization;


STEP 4: Fix NULL / Empty Values (UPDATE)

✅ Update descriptive columns only

UPDATE angad_kumar91.fhir_healthcare_analytics_silver.immunization
SET
    vaccine_name = COALESCE(NULLIF(vaccine_name, ''), 'Unknown Vaccine'),
    status = COALESCE(NULLIF(status, ''), 'unknown');



STEP 5: Post-Validation (Confirm Clean Data)

_For the Immunization Silver table, we standardized vaccination timestamps, validated null and empty values, and applied controlled defaults to descriptive fields like vaccine name and status while preserving clinical accuracy._

In [0]:
%sql
SELECT
    SUM(CASE WHEN vaccine_name IS NULL OR vaccine_name = '' THEN 1 ELSE 0 END) AS vaccine_name_remaining_nulls,
    SUM(CASE WHEN status IS NULL OR status = '' THEN 1 ELSE 0 END) AS status_remaining_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.immunization;


In [0]:
%sql
select * from angad_kumar91.fhir_healthcare_analytics_silver.immunization
limit 10

STEP 1: Convert Datatypes (Overwrite SAME Silver table)

✅ Fix prescribed_date datatype

In [0]:
%sql
CREATE OR REPLACE TABLE angad_kumar91.fhir_healthcare_analytics_silver.medication_request
USING DELTA
AS
SELECT
    medication_request_id,
    patient_id,
    encounter_id,
    medication_name,
    status,

    -- Correct datetime type
    CAST(prescribed_date AS TIMESTAMP) AS prescribed_date,

    source_file,
    ingest_time
FROM angad_kumar91.fhir_healthcare_analytics_silver.medication_request;


STEP 2: Check NULL / Empty Values (Data Quality Check)

✅ Count nulls & empty strings per column

In [0]:
%sql
SELECT
    COUNT(*) AS total_rows,

    SUM(CASE WHEN medication_request_id IS NULL OR medication_request_id = '' THEN 1 ELSE 0 END) AS medication_request_id_nulls,
    SUM(CASE WHEN patient_id IS NULL OR patient_id = '' THEN 1 ELSE 0 END) AS patient_id_nulls,
    SUM(CASE WHEN encounter_id IS NULL OR encounter_id = '' THEN 1 ELSE 0 END) AS encounter_id_nulls,
    SUM(CASE WHEN medication_name IS NULL OR medication_name = '' THEN 1 ELSE 0 END) AS medication_name_nulls,
    SUM(CASE WHEN status IS NULL OR status = '' THEN 1 ELSE 0 END) AS status_nulls,
    SUM(CASE WHEN prescribed_date IS NULL THEN 1 ELSE 0 END) AS prescribed_date_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.medication_request;


STEP 4: Fix NULL / Empty Values (UPDATE)

✅ Update descriptive columns only

In [0]:
%sql
UPDATE angad_kumar91.fhir_healthcare_analytics_silver.medication_request
SET
    medication_name = COALESCE(NULLIF(medication_name, ''), 'Unknown Medication'),
    status = COALESCE(NULLIF(status, ''), 'unknown');


STEP 5: Post-Validation (Confirm Clean Data)

_For the Medication Request Silver table, we standardized prescription timestamps, quantified null and empty values, and applied controlled defaults to descriptive medication fields while preserving clinical accuracy._

In [0]:
%sql
SELECT
    SUM(CASE WHEN medication_name IS NULL OR medication_name = '' THEN 1 ELSE 0 END) AS medication_name_remaining_nulls,
    SUM(CASE WHEN status IS NULL OR status = '' THEN 1 ELSE 0 END) AS status_remaining_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.medication_request;


In [0]:
%sql
select * from angad_kumar91.fhir_healthcare_analytics_silver.medication_request
limit 10

## 3. Silver table Observation 

### **Restoring the deleted JSON column using the VERSION AS OF method**

In [0]:
%sql
DESCRIBE HISTORY angad_kumar91.fhir_healthcare_analytics_silver.observation;


#### Run when you need to restore a previous version.

`RESTORE TABLE angad_kumar91.fhir_healthcare_analytics_silver.observation
TO VERSION AS OF 1;`


In [0]:
%sql
SELECT COUNT(*) 
FROM angad_kumar91.fhir_healthcare_analytics_silver.observation;


### STEP 1: Parse JSON + Fix Datatypes (Overwrite SAME Silver table)

✅ This does JSON flattening + datatype correction in one step

SAFE CODE: VERSION YOU SHOULD USE

🔹 STEP 1: Create a TEMP VIEW (safe transformation)

### ONE-TIME JSON FLATTEN (RUN ONLY ONCE)

### ⚠️ Run this ONLY if observation_code_raw EXISTS

**Run only when you have to flatten the JSON column**

CREATE OR REPLACE TABLE angad_kumar91.fhir_healthcare_analytics_silver.observation
USING DELTA
AS
SELECT
    observation_id,
    patient_id,
    encounter_id,

    from_json(
        observation_code_raw,
        'struct<
            coding:array<struct<
                system:string,
                code:string,
                display:string
            >>,
            text:string
        >'
    ).coding[0].display AS observation_name,

    value,
    unit,
    CAST(observation_time AS TIMESTAMP) AS observation_time,
    source_file,
    ingest_time
FROM angad_kumar91.fhir_healthcare_analytics_silver.observation;


### SAFE RE-RUNNABLE CODE (USE THIS ALWAYS)

This is the code you should keep in your pipeline/notebook 👇
It does NOT reference the raw JSON column at all.

In [0]:
%sql
CREATE OR REPLACE TABLE angad_kumar91.fhir_healthcare_analytics_silver.observation
USING DELTA
AS
SELECT
    observation_id,
    patient_id,
    encounter_id,
    observation_name,
    value,
    unit,
    CAST(observation_time AS TIMESTAMP) AS observation_time,
    source_file,
    ingest_time
FROM angad_kumar91.fhir_healthcare_analytics_silver.observation;


STEP 2: Check NULL / Empty Values (Data Quality Check)

✅ Count nulls & empty strings

In [0]:
%sql
SELECT
    COUNT(*) AS total_rows,

    SUM(CASE WHEN observation_id IS NULL OR observation_id = '' THEN 1 ELSE 0 END) AS observation_id_nulls,
    SUM(CASE WHEN patient_id IS NULL OR patient_id = '' THEN 1 ELSE 0 END) AS patient_id_nulls,
    SUM(CASE WHEN encounter_id IS NULL OR encounter_id = '' THEN 1 ELSE 0 END) AS encounter_id_nulls,
    SUM(CASE WHEN observation_name IS NULL OR observation_name = '' THEN 1 ELSE 0 END) AS observation_name_nulls,
    SUM(CASE WHEN value IS NULL THEN 1 ELSE 0 END) AS value_nulls,
    SUM(CASE WHEN unit IS NULL OR unit = '' THEN 1 ELSE 0 END) AS unit_nulls,
    SUM(CASE WHEN observation_time IS NULL THEN 1 ELSE 0 END) AS observation_time_nulls

FROM angad_kumar91.fhir_healthcare_analytics_silver.observation;


STEP 4: Fix NULL / Empty Values (UPDATE)

✅ Update descriptive columns only

1. Get Mean(Value) and Mode(Unit)

2. Correct Null Handling Update (Final)


**ALT code for NULL handling**

UPDATE angad_kumar91.fhir_healthcare_analytics_silver.observation
SET
    observation_name = COALESCE(NULLIF(observation_name, ''), 'Unknown Observation'),
    unit = COALESCE(NULLIF(unit, ''), 'unknown');


In [0]:
%sql
WITH stats AS (
    SELECT
        AVG(value) AS mean_value,
        MODE() WITHIN GROUP (ORDER BY unit) AS mode_unit
    FROM angad_kumar91.fhir_healthcare_analytics_silver.observation
)
SELECT * FROM stats;


In [0]:
%sql
UPDATE angad_kumar91.fhir_healthcare_analytics_silver.observation
SET
    observation_name = COALESCE(NULLIF(observation_name, ''), 'Unknown Observation'),
    value = COALESCE(value, (SELECT AVG(value) FROM angad_kumar91.fhir_healthcare_analytics_silver.observation)),
    unit = COALESCE(NULLIF(unit, ''), 
                    (SELECT MODE() WITHIN GROUP (ORDER BY unit)
                     FROM angad_kumar91.fhir_healthcare_analytics_silver.observation))
WHERE
    observation_name IS NULL OR observation_name = ''
    OR value IS NULL
    OR unit IS NULL OR unit = '';
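
To see what the mean and mode fill produces, the same logic can be run on a few rows in plain Python. This is a sketch on hypothetical sample rows, using `statistics.fmean` and `statistics.mode` from the standard library rather than Spark.

```python
from statistics import fmean, mode

# Hypothetical sample rows, not taken from the Silver table.
rows = [
    {"observation_name": "Body Weight", "value": 70.0, "unit": "kg"},
    {"observation_name": "", "value": None, "unit": "kg"},
    {"observation_name": "Heart rate", "value": 80.0, "unit": ""},
    {"observation_name": "Body Height", "value": 165.0, "unit": "cm"},
]

# Mean over non-missing values, like SQL AVG, which skips NULLs.
mean_value = fmean(r["value"] for r in rows if r["value"] is not None)

# Assumption: blank units are excluded before taking the mode; a SQL MODE over
# the raw column would treat '' as an ordinary value.
mode_unit = mode(r["unit"] for r in rows if r["unit"])

filled = [
    {
        "observation_name": r["observation_name"] or "Unknown Observation",
        "value": r["value"] if r["value"] is not None else mean_value,
        "unit": r["unit"] or mode_unit,
    }
    for r in rows
]

print(mean_value, mode_unit)
for r in filled:
    print(r)
```

As in the UPDATE above, the mean is taken across all rows regardless of `observation_name`, so the filled value mixes different kinds of measurements; here a missing value becomes 105.0 with unit `kg`.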


In [0]:
%sql
-- Re-check Nulls After Fix

SELECT
    COUNT(*) AS total_rows,
    SUM(CASE WHEN observation_name IS NULL OR observation_name = '' THEN 1 ELSE 0 END) AS observation_name_nulls,
    SUM(CASE WHEN value IS NULL THEN 1 ELSE 0 END) AS value_nulls,
    SUM(CASE WHEN unit IS NULL OR unit = '' THEN 1 ELSE 0 END) AS unit_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.observation;


In [0]:
%sql
select * from angad_kumar91.fhir_healthcare_analytics_silver.observation
limit 10

## 8. Silver table Patient

**STEP 1: SAFE Datatype Conversion (Overwrite once)**

This step is safe and repeatable because we are not dropping required source columns.

In [0]:
%sql
CREATE OR REPLACE TABLE angad_kumar91.fhir_healthcare_analytics_silver.patient
USING DELTA
AS
SELECT
    patient_id,
    gender,

    -- Convert date & datetime correctly
    CAST(birth_date AS DATE)            AS birth_date,
    CAST(deceased_datetime AS TIMESTAMP) AS deceased_datetime,

    marital_status,
    city,
    state,
    country,
    postal_code,
    preferred_language,
    language_code,
    phone_number,

    source_file,
    ingest_time
FROM angad_kumar91.fhir_healthcare_analytics_silver.patient;


**STEP 2: Check NULL / Empty Values (Data Quality Check)**

✅ Count nulls and empty strings

In [0]:
%sql
SELECT
    COUNT(*) AS total_rows,

    SUM(CASE WHEN patient_id IS NULL OR patient_id = '' THEN 1 ELSE 0 END) AS patient_id_nulls,
    SUM(CASE WHEN gender IS NULL OR gender = '' THEN 1 ELSE 0 END) AS gender_nulls,
    SUM(CASE WHEN birth_date IS NULL THEN 1 ELSE 0 END) AS birth_date_nulls,
    SUM(CASE WHEN marital_status IS NULL OR marital_status = '' THEN 1 ELSE 0 END) AS marital_status_nulls,
    SUM(CASE WHEN city IS NULL OR city = '' THEN 1 ELSE 0 END) AS city_nulls,
    SUM(CASE WHEN postal_code IS NULL OR postal_code = '' THEN 1 ELSE 0 END) AS postal_code_nulls,
    SUM(CASE WHEN preferred_language IS NULL OR preferred_language = '' THEN 1 ELSE 0 END) AS preferred_language_nulls,
    SUM(CASE WHEN phone_number IS NULL OR phone_number = '' THEN 1 ELSE 0 END) AS phone_number_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.patient;


STEP 4: Fix NULL / Empty Values (UPDATE)

✅ Update only descriptive columns

In [0]:
%sql
UPDATE angad_kumar91.fhir_healthcare_analytics_silver.patient

SET
    gender = COALESCE(NULLIF(gender, ''), 'unknown'),
    marital_status = COALESCE(NULLIF(marital_status, ''), 'unknown'),
    postal_code = COALESCE(NULLIF(postal_code, ''), 'UNKNOWN'),
    preferred_language = COALESCE(NULLIF(preferred_language, ''), 'unknown');


**STEP 5: Post-Validation (Confirm Clean Data)**

_For the Patient Silver table, we standardized date and datetime fields, validated null and empty values, and applied controlled defaults to demographic attributes while preserving clinical and identity accuracy._

In [0]:
%sql
SELECT
    SUM(CASE WHEN gender IS NULL OR gender = '' THEN 1 ELSE 0 END) AS gender_remaining_nulls,
    SUM(CASE WHEN marital_status IS NULL OR marital_status = '' THEN 1 ELSE 0 END) AS marital_status_remaining_nulls,
    SUM(CASE WHEN postal_code IS NULL OR postal_code = '' THEN 1 ELSE 0 END) AS postal_code_remaining_nulls,
    SUM(CASE WHEN preferred_language IS NULL OR preferred_language = '' THEN 1 ELSE 0 END) AS preferred_language_remaining_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.patient;


In [0]:
%sql
select * from angad_kumar91.fhir_healthcare_analytics_silver.patient
limit 10

**STEP 1: SAFE Datatype Conversion (Re-runnable)**

This overwrite is safe because:

We are not removing any required source column

No JSON parsing

Can be executed multiple times

In [0]:
%sql
CREATE OR REPLACE TABLE angad_kumar91.fhir_healthcare_analytics_silver.procedure
USING DELTA
AS
SELECT
    procedure_id,
    patient_id,
    encounter_id,
    procedure_name,
    procedure_status,

    -- Convert performed_time correctly
    CAST(performed_time AS TIMESTAMP) AS performed_time,

    source_file,
    ingest_time
FROM angad_kumar91.fhir_healthcare_analytics_silver.procedure;


**STEP 2: Check NULL / Empty Values (Data Quality Check)**

In [0]:
%sql
SELECT
    COUNT(*) AS total_rows,

    SUM(CASE WHEN procedure_id IS NULL OR procedure_id = '' THEN 1 ELSE 0 END) AS procedure_id_nulls,
    SUM(CASE WHEN patient_id IS NULL OR patient_id = '' THEN 1 ELSE 0 END) AS patient_id_nulls,
    SUM(CASE WHEN encounter_id IS NULL OR encounter_id = '' THEN 1 ELSE 0 END) AS encounter_id_nulls,
    SUM(CASE WHEN procedure_name IS NULL OR procedure_name = '' THEN 1 ELSE 0 END) AS procedure_name_nulls,
    SUM(CASE WHEN procedure_status IS NULL OR procedure_status = '' THEN 1 ELSE 0 END) AS procedure_status_nulls,
    SUM(CASE WHEN performed_time IS NULL THEN 1 ELSE 0 END) AS performed_time_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.procedure;


STEP 4: Fix NULL / Empty Values (UPDATE)

In [0]:
%sql
UPDATE angad_kumar91.fhir_healthcare_analytics_silver.procedure
SET
    procedure_name = COALESCE(NULLIF(procedure_name, ''), 'Unknown Procedure'),
    procedure_status = COALESCE(NULLIF(procedure_status, ''), 'unknown');


STEP 5: Post-Validation (Confirm Clean Data)

_For the Procedure Silver table, we standardized the performed timestamp, validated null and empty values, and applied controlled defaults only to descriptive attributes while preserving clinical accuracy._

In [0]:
%sql
SELECT
    SUM(CASE WHEN procedure_name IS NULL OR procedure_name = '' THEN 1 ELSE 0 END) AS procedure_name_remaining_nulls,
    SUM(CASE WHEN procedure_status IS NULL OR procedure_status = '' THEN 1 ELSE 0 END) AS procedure_status_remaining_nulls
FROM angad_kumar91.fhir_healthcare_analytics_silver.procedure;


In [0]:
%sql
select * from angad_kumar91.fhir_healthcare_analytics_silver.procedure
limit 10

#### 9. Silver layer coverage: COMPLETE 🏁

I have now safely standardized every major Silver table:

Patient

Encounter

Condition

Observation

Procedure

Medication Request

Immunization

Diagnostic Report

Explanation of Benefit
