### Use Case 21

Description: Demonstrate the creation of a custom data quality rule, measure that rule against a dataset, and show the results.


### **Overview**
This worksheet demonstrates **Snowflake's built-in data quality monitoring and governance features**, enabling **proactive data integrity checks**.

Snowflake supports **custom data metric functions** to validate and monitor data quality **natively**.

### **How It Differs from Competitors:**
✅ **No External Tools Needed** – Data quality checks are built directly into Snowflake.  
✅ **Runs Natively in Snowflake** – Reduces complexity by avoiding external processing.  
✅ **Fully Automated Scheduling** – Metrics run on a per-table schedule (`DATA_METRIC_SCHEDULE`), with no manual intervention.  

These features ensure **high-quality, well-governed data** without relying on additional monitoring frameworks.

In [None]:
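-- Custom DMF: count e_mail values that do not match a basic address pattern.
-- The literal dot is written as [.] so that no backslash escaping is needed
-- inside the single-quoted function body.
CREATE OR REPLACE DATA METRIC FUNCTION
  invalid_email_count (ARG_T table(ARG_C1 STRING))
  RETURNS NUMBER AS
  'SELECT COUNT_IF(FALSE = (
    ARG_C1 REGEXP ''^[A-Za-z0-9.~*_%+-]+@[A-Za-z0-9.-]+[.][A-Za-z]{2,4}$''))
    FROM ARG_T';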

-- Run the table's metrics every 5 minutes (the interval unit is the singular MINUTE).
ALTER TABLE frostbyte_tasty_bytes.raw_customer.customer_loyalty SET DATA_METRIC_SCHEDULE = '5 MINUTE';

ALTER TABLE frostbyte_tasty_bytes.raw_customer.customer_loyalty ADD DATA METRIC FUNCTION
  invalid_email_count ON (e_mail);
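
Before waiting for the first scheduled run, the rule can be checked by hand. The sketch below assumes the function was created in the current schema and that the caller can read the table. It calls the metric ad hoc, then lists a sample of the offending values with the same pattern (the literal dot is written as `[.]` to avoid backslash escaping in a plain string literal).

```sql
-- Ad hoc call: returns the same number the scheduled run would record.
SELECT invalid_email_count(
  SELECT e_mail FROM frostbyte_tasty_bytes.raw_customer.customer_loyalty
);

-- Inspect a sample of the values that the rule counts as invalid.
-- NULL e_mail values are excluded here, just as they are in the metric.
SELECT e_mail
FROM frostbyte_tasty_bytes.raw_customer.customer_loyalty
WHERE NOT (e_mail REGEXP '^[A-Za-z0-9.~*_%+-]+@[A-Za-z0-9.-]+[.][A-Za-z]{2,4}$')
LIMIT 20;
```

If the ad hoc count looks wrong, adjusting the pattern here first is cheaper than recreating the function and waiting for another scheduled measurement.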

### Retrieving Data Quality Metrics

Snowflake exposes both the **applied metrics** (via `DATA_METRIC_FUNCTION_REFERENCES`) and the **validation results** (via `SNOWFLAKE.LOCAL.DATA_QUALITY_MONITORING_RESULTS`) as queryable metadata.  
Results are recorded after each scheduled run, so monitoring is **built-in** and requires no **manual logging**.

With these capabilities, NGSC can efficiently **audit, validate, and enforce** data quality standards **without additional overhead**.


In [None]:
SELECT * FROM TABLE(INFORMATION_SCHEMA.DATA_METRIC_FUNCTION_REFERENCES(
  REF_ENTITY_NAME => 'frostbyte_tasty_bytes.raw_customer.customer_loyalty',
  REF_ENTITY_DOMAIN => 'TABLE'));

In [None]:
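-- Most recent measurements first; LIMIT without ORDER BY returns an arbitrary subset.
SELECT scheduled_time, measurement_time, table_name, metric_name, value
FROM SNOWFLAKE.LOCAL.DATA_QUALITY_MONITORING_RESULTS
WHERE METRIC_NAME = 'INVALID_EMAIL_COUNT'
AND METRIC_DATABASE = 'FROSTBYTE_TASTY_BYTES'
ORDER BY measurement_time DESC
LIMIT 100;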
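
For a dashboard-style view, one row per metric is often more useful than the full history. The query below is a minimal sketch that keeps only the latest measurement of every metric attached to the table. It assumes at least one scheduled run has completed and that the current role is allowed to read the results view.

```sql
-- Latest recorded value per metric for the customer_loyalty table.
SELECT metric_name, measurement_time, value
FROM SNOWFLAKE.LOCAL.DATA_QUALITY_MONITORING_RESULTS
WHERE table_database = 'FROSTBYTE_TASTY_BYTES'
  AND table_schema   = 'RAW_CUSTOMER'
  AND table_name     = 'CUSTOMER_LOYALTY'
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY metric_name
  ORDER BY measurement_time DESC
) = 1;
```

Filtering on the table columns rather than on `METRIC_NAME` means that any metric added to the table later shows up here without changing the query.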

### Use Case 22

Description: Demonstrate measuring data quality with the platform's built-in data quality rules and show the results.

Solution: Snowflake provides **out-of-the-box** data quality metrics, eliminating the need for additional scripts.  
Unlike other platforms, Snowflake allows applying these functions **directly at query time** for real-time monitoring.

### Available Metrics:
- **BLANK_COUNT, BLANK_PERCENT** – Track empty string values.
- **NULL_COUNT, NULL_PERCENT** – Monitor missing data.
- **FRESHNESS** – Measure data timeliness.
- **AVG, MAX, MIN, STDDEV** – Compute statistical insights.
- **DUPLICATE_COUNT, UNIQUE_COUNT** – Identify duplicates and unique values.
- **ROW_COUNT** – Count total records.

These built-in metrics ensure **automated and scalable** data quality enforcement within Snowflake.
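
Built-in metrics can also be attached to a table so that they run on the table's schedule, in the same way as a custom function. The sketch below assumes `DATA_METRIC_SCHEDULE` is already set on the table. The choice of columns (`favourite_brand`, `e_mail`) is illustrative only.

```sql
-- Track missing values in favourite_brand on every scheduled run.
ALTER TABLE frostbyte_tasty_bytes.raw_customer.customer_loyalty
  ADD DATA METRIC FUNCTION SNOWFLAKE.CORE.NULL_COUNT ON (favourite_brand);

-- Track repeated e-mail addresses on every scheduled run.
ALTER TABLE frostbyte_tasty_bytes.raw_customer.customer_loyalty
  ADD DATA METRIC FUNCTION SNOWFLAKE.CORE.DUPLICATE_COUNT ON (e_mail);
```

Once attached, these metrics write to `SNOWFLAKE.LOCAL.DATA_QUALITY_MONITORING_RESULTS` alongside any custom metrics on the same table.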


In [None]:
-- Example: Check the percentage of NULL values in the 'favourite_brand' column
SELECT snowflake.core.null_percent (SELECT favourite_brand FROM frostbyte_tasty_bytes.raw_customer.customer_loyalty);



In [None]:
-- Example: Return the rows flagged by a metric (here, rows where 'favourite_brand' is NULL)
SELECT *
  FROM TABLE(SYSTEM$DATA_METRIC_SCAN(
    REF_ENTITY_NAME  => 'frostbyte_tasty_bytes.raw_customer.customer_loyalty',
    METRIC_NAME  => 'snowflake.core.null_percent',
    ARGUMENT_NAME => 'favourite_brand'
   ));

### **Summary**

Snowflake provides **built-in** capabilities for **data quality monitoring** and **governance**, eliminating the need for external tools.

### **Key Features:**
- ✅ **Integrated Data Metric Functions** – Perform data quality checks directly in Snowflake without third-party tools.  
- ✅ **Automated Scheduling** – Data quality validations can be scheduled natively for continuous enforcement.  
- ✅ **Built-in System Metrics** – Monitor **completeness, uniqueness, and data freshness** in real-time.  
- ✅ **Fully Queryable Governance Metadata** – Enables proactive monitoring and auditing of data quality.  

With these features, Snowflake ensures **scalable, automated, and enterprise-ready data quality enforcement.**
