# Markdown Fix - Testing Notebook

This notebook contains the updated `combined_txns_query` with the new `TOTAL_MARKDOWN` column.

**Changes Made:**
- Added `(tf.REVENUE - tf.NET_SALES) AS TOTAL_MARKDOWN` to the SELECT clause
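
To make the new column concrete, here is a minimal worked sketch of the same arithmetic in plain Python. It mirrors the `SUM(...)` expressions in `filtered_tf` for one transaction; the line-item values and the `summarize` helper are made up for illustration and are not taken from production data.

```python
from decimal import Decimal

# Hypothetical line items for a single transaction (values are invented).
line_items = [
    {"GROSS_AMT": Decimal("10.00"), "NET_AMT": Decimal("7.50"),
     "MKDN_WOD_ALLOC_AMT": Decimal("0.50"), "MKDN_POD_ALLOC_AMT": Decimal("0.00")},
    {"GROSS_AMT": Decimal("4.00"), "NET_AMT": Decimal("3.00"),
     "MKDN_WOD_ALLOC_AMT": Decimal("0.00"), "MKDN_POD_ALLOC_AMT": Decimal("0.25")},
]


def summarize(items):
    """Aggregate line items the same way the filtered_tf CTE does."""
    revenue = sum((i["GROSS_AMT"] for i in items), Decimal("0"))
    net_sales = sum(
        (i["NET_AMT"] + i["MKDN_WOD_ALLOC_AMT"] + i["MKDN_POD_ALLOC_AMT"] for i in items),
        Decimal("0"),
    )
    # The new column is simply the gap between the two aggregates.
    return {"REVENUE": revenue, "NET_SALES": net_sales, "TOTAL_MARKDOWN": revenue - net_sales}


summary = summarize(line_items)
print(summary["TOTAL_MARKDOWN"])  # 2.75
```

`Decimal` is used here only to keep the toy arithmetic exact; the actual column types in `txn_facts` are not stated in this notebook.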

**Validation:**
- `TOTAL_MARKDOWN` should equal `REVENUE - NET_SALES`
- Because it is derived from `REVENUE` and `NET_SALES`, `TOTAL_MARKDOWN` is subject to the same register (`REGISTER_NBR`) filters as those two columns
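
The register list behind that filter is written out three times in this notebook (the `CASE` expression and both `WHERE` clauses), so the copies can drift apart. One possible way to keep them in sync is to define the list once in Python and interpolate it into the f-strings. This is only a sketch: `ECOMM_REGISTERS`, `STORE_REGISTERS`, and `sql_in_list` are hypothetical names, and the register numbers are copied from the queries in this notebook.

```python
# Register numbers copied from the queries in this notebook.
ECOMM_REGISTERS = [99, 104, 173, 174, 999]
STORE_REGISTERS = [
    *range(1, 21), *range(49, 55), *range(93, 99),
    *range(116, 126), *range(151, 155), *range(175, 183), *range(195, 199),
]


def sql_in_list(values):
    """Render integers as a comma-separated list for a SQL IN (...) clause."""
    # int() guards against anything non-numeric slipping into the SQL text.
    return ", ".join(str(int(v)) for v in values)


# Hypothetical fragment that could replace the hand-written WHERE filter.
register_filter = f"f.REGISTER_NBR IN ({sql_in_list(ECOMM_REGISTERS + STORE_REGISTERS)})"
print(register_filter)
```

The same two constants could feed the `CASE` expression for `TXN_LOCATION`, so the channel mapping and the filter would always agree.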


In [None]:
# Run this cell to import utilities (adjust path as needed for your environment)
%run /Workspace/Projects/Experimentation/aaml-experimentation-coe/exp_coe_utils


In [None]:
# Data Source Constants
TXN_FACTS = 'gcp-abs-udco-bqvw-prod-prj-01.udco_ds_retl.txn_facts'
TXN_HDR_COMBINED = 'gcp-abs-udco-bqvw-prod-prj-01.udco_ds_retl.TXN_HDR_COMBINED'
LU_DAY_MERGE = 'gcp-abs-udco-bqvw-prod-prj-01.udco_ds_edw.LU_DAY_MERGE'
SMV_RETAIL_CUSTOMER_LOYALTY_PROGRAM_HOUSEHOLD = 'gcp-abs-udco-bqvw-prod-prj-01.udco_ds_cust.SMV_RETAIL_CUSTOMER_LOYALTY_PROGRAM_HOUSEHOLD'


In [None]:
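# Test date range - adjust as needed.
# The queries use TXN_DTE >= start_date AND TXN_DTE < end_date, so end_date is exclusive
# (the values below cover 2024-01-01 through 2024-01-07).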
start_date = "DATE('2024-01-01')"
end_date = "DATE('2024-01-08')"


In [None]:
## Combined Transactions Query - WITH TOTAL_MARKDOWN (NEW)
combined_txns_query = f"""
WITH filtered_tf AS (
 SELECT
   TXN_ID,
   TXN_DTE,
   CARD_NBR,
   SUM(GROSS_AMT) AS REVENUE,
   SUM(NET_AMT+MKDN_WOD_ALLOC_AMT+MKDN_POD_ALLOC_AMT) AS NET_SALES,
   SUM(ITEM_QTY) AS ITEMS
 FROM {TXN_FACTS}
 WHERE
   TXN_DTE >= {start_date} 
   AND TXN_DTE < {end_date}
   AND MISC_ITEM_QTY = 0
   AND DEPOSIT_ITEM_QTY = 0
   AND REV_DTL_SUBTYPE_ID IN (0, 6, 7)
 GROUP BY TXN_ID, TXN_DTE, CARD_NBR
)
SELECT
 smv.HOUSEHOLD_ID,
 tf.TXN_DTE,
 tf.TXN_ID,
 CASE
   WHEN f.REGISTER_NBR IN (99, 104, 173, 174, 999) THEN 'ECOMM'
   WHEN f.REGISTER_NBR IN (
     1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
     11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
     49, 50, 51, 52, 53, 54, 93, 94, 95, 96, 97, 98,
     116, 117, 118, 119, 120, 121, 122, 123, 124, 125,
     151, 152, 153, 154, 175, 176, 177, 178, 179, 180,
     181, 182, 195, 196, 197, 198
   ) THEN 'STORE'
   ELSE NULL
 END AS TXN_LOCATION,
 tf.REVENUE,
 tf.NET_SALES,
 (tf.REVENUE - tf.NET_SALES) AS TOTAL_MARKDOWN,
 tf.ITEMS,
 f.TENDER_AMT_FOODSTAMPS + f.TENDER_AMT_EBT AS SNAP_TENDER
FROM filtered_tf AS tf
JOIN {TXN_HDR_COMBINED} AS f
 ON tf.TXN_ID = f.TXN_ID AND tf.TXN_DTE = f.TXN_DTE
JOIN {LU_DAY_MERGE} AS b
 ON CAST(f.TXN_DTE AS DATE) = b.D_DATE
JOIN (
 SELECT DISTINCT HOUSEHOLD_ID, LOYALTY_PROGRAM_CARD_NBR
 FROM {SMV_RETAIL_CUSTOMER_LOYALTY_PROGRAM_HOUSEHOLD}
) AS smv
 ON SAFE_CAST(tf.CARD_NBR AS BIGNUMERIC) = SAFE_CAST(smv.LOYALTY_PROGRAM_CARD_NBR AS BIGNUMERIC)
WHERE
 f.TXN_HDR_SRC_CD = 0
 AND f.REGISTER_NBR IN (
   99, 104, 173, 174, 999,
   1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
   11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
   49, 50, 51, 52, 53, 54, 93, 94, 95, 96, 97, 98,
   116, 117, 118, 119, 120, 121, 122, 123, 124, 125,
   151, 152, 153, 154, 175, 176, 177, 178, 179, 180,
   181, 182, 195, 196, 197, 198
 )
"""

print(combined_txns_query)


In [None]:
# Execute the query (uncomment when running in Databricks)
# combined_txns_sp = bc.read_gcp_table(combined_txns_query)
# display(combined_txns_sp.limit(100))


---
## Validation Query

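Run this query to confirm that the aggregate totals reconcile:
- `SUM(REVENUE) - SUM(NET_SALES)` should equal `SUM(REVENUE - NET_SALES)`, so `DIFFERENCE_SHOULD_BE_ZERO` should be 0
- This is an algebraic identity, so a non-zero difference would point to NULL amounts or rounding rather than to the new column's formula
- The query applies the same register filters as `combined_txns_query` but omits the household and day joins, so its totals cover every transaction that passes those filters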
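
As a rough illustration of when this check can come out non-zero, the sketch below imitates SQL's NULL handling in plain Python: `SUM` skips NULL inputs, while `a - b` is NULL if either side is NULL. The helper names and the sample rows are hypothetical.

```python
from decimal import Decimal


def sql_sum(values):
    """Mimic SQL SUM: ignore NULLs (None); return None if nothing is left."""
    present = [v for v in values if v is not None]
    return sum(present, Decimal("0")) if present else None


def sql_sub(a, b):
    """Mimic SQL subtraction: the result is NULL if either operand is NULL."""
    return None if a is None or b is None else a - b


def difference(rows):
    """Recompute DIFFERENCE_SHOULD_BE_ZERO for a list of per-transaction rows."""
    revenue = sql_sum(r["REVENUE"] for r in rows)
    net_sales = sql_sum(r["NET_SALES"] for r in rows)
    markdown = sql_sum(sql_sub(r["REVENUE"], r["NET_SALES"]) for r in rows)
    return revenue - net_sales - markdown


clean_rows = [
    {"REVENUE": Decimal("10.00"), "NET_SALES": Decimal("8.00")},
    {"REVENUE": Decimal("5.00"), "NET_SALES": Decimal("4.00")},
]
rows_with_null = [
    {"REVENUE": Decimal("10.00"), "NET_SALES": Decimal("8.00")},
    {"REVENUE": Decimal("5.00"), "NET_SALES": None},  # NULL NET_SALES
]

print(difference(clean_rows))      # 0.00
print(difference(rows_with_null))  # 5.00
```

In the second case the row with a NULL `NET_SALES` still counts toward `SUM(REVENUE)` but drops out of `SUM(REVENUE - NET_SALES)`, which is what makes the difference non-zero.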


In [None]:
## Validation: Check that REVENUE - NET_SALES = TOTAL_MARKDOWN
validation_query = f"""
WITH filtered_tf AS (
 SELECT
   TXN_ID,
   TXN_DTE,
   CARD_NBR,
   SUM(GROSS_AMT) AS REVENUE,
   SUM(NET_AMT+MKDN_WOD_ALLOC_AMT+MKDN_POD_ALLOC_AMT) AS NET_SALES,
   SUM(ITEM_QTY) AS ITEMS
 FROM {TXN_FACTS}
 WHERE
   TXN_DTE >= {start_date} 
   AND TXN_DTE < {end_date}
   AND MISC_ITEM_QTY = 0
   AND DEPOSIT_ITEM_QTY = 0
   AND REV_DTL_SUBTYPE_ID IN (0, 6, 7)
 GROUP BY TXN_ID, TXN_DTE, CARD_NBR
)
SELECT
 'VALIDATION' AS CHECK_TYPE,
 SUM(tf.REVENUE) AS TOTAL_REVENUE,
 SUM(tf.NET_SALES) AS TOTAL_NET_SALES,
 SUM(tf.REVENUE - tf.NET_SALES) AS TOTAL_MARKDOWN_CALC,
 SUM(tf.REVENUE) - SUM(tf.NET_SALES) - SUM(tf.REVENUE - tf.NET_SALES) AS DIFFERENCE_SHOULD_BE_ZERO
FROM filtered_tf AS tf
JOIN {TXN_HDR_COMBINED} AS f
 ON tf.TXN_ID = f.TXN_ID AND tf.TXN_DTE = f.TXN_DTE
WHERE
 f.TXN_HDR_SRC_CD = 0
 AND f.REGISTER_NBR IN (
   99, 104, 173, 174, 999,
   1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
   11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
   49, 50, 51, 52, 53, 54, 93, 94, 95, 96, 97, 98,
   116, 117, 118, 119, 120, 121, 122, 123, 124, 125,
   151, 152, 153, 154, 175, 176, 177, 178, 179, 180,
   181, 182, 195, 196, 197, 198
 )
"""

print(validation_query)


In [None]:
# Execute validation (uncomment when running in Databricks)
# validation_result = bc.read_gcp_table(validation_query)
# display(validation_result)


---
## Expected Results

After running the validation:
- `DIFFERENCE_SHOULD_BE_ZERO` should be `0.00`
- `TOTAL_MARKDOWN_CALC` should be approximately **$214,582,505**, the figure from our earlier validation with filtered registers, provided `start_date` and `end_date` match the range used in that run

If both checks pass, the fix can be applied to the main `metric_workflow.ipynb` notebook.
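
The comparison against the expected figure can be scripted rather than checked by eye. The sketch below is one way to do it, assuming the validation result has been pulled into a plain dict; `check_validation`, the sample rows, and both tolerances are hypothetical choices, not values agreed for this project.

```python
import math

# Expected figure quoted in this notebook.
EXPECTED_TOTAL_MARKDOWN = 214_582_505


def check_validation(row, rel_tol=0.001, abs_tol_diff=0.01):
    """Return True if both expected-result checks pass for one result row.

    rel_tol and abs_tol_diff are assumed tolerances; tune them as needed.
    """
    diff_ok = abs(float(row["DIFFERENCE_SHOULD_BE_ZERO"])) <= abs_tol_diff
    total_ok = math.isclose(
        float(row["TOTAL_MARKDOWN_CALC"]), EXPECTED_TOTAL_MARKDOWN, rel_tol=rel_tol
    )
    return diff_ok and total_ok


# Hypothetical rows standing in for the real query output.
# Assumption: if validation_result is a Spark DataFrame, a row dict could be
# obtained with validation_result.first().asDict().
passing_row = {"DIFFERENCE_SHOULD_BE_ZERO": 0.0, "TOTAL_MARKDOWN_CALC": 214_582_505.37}
failing_row = {"DIFFERENCE_SHOULD_BE_ZERO": 0.0, "TOTAL_MARKDOWN_CALC": 200_000_000.00}

print(check_validation(passing_row))  # True
print(check_validation(failing_row))  # False
```

A relative tolerance is used for the total because the notebook only states that it should match approximately.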
