# Monitoring: Examine data drift
* This stage examines the new input data seen after model deployment for data drift. The data is validated against the original data quality definitions, which are saved as GX Expectation Suites.
* New input data is based on data from the UCI ML Repository [Statlog (Heart) dataset](https://archive.ics.uci.edu/dataset/145/statlog+heart).
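
To make the idea of drift concrete, the sketch below checks whether a column's mean in a new batch still falls inside a range derived from training data. It is plain Python rather than GX, and the sample values, the `mean_within_range` helper, and the one-standard-deviation tolerance are all hypothetical, chosen only for illustration. In this demo the real checks are the Expectations stored in the suites.

```python
from statistics import mean, stdev


def mean_within_range(values, low, high):
    """Return True when the mean of `values` lies in [low, high]."""
    return low <= mean(values) <= high


# Hypothetical training-time values for one numeric column (not demo data).
training_ages = [52, 61, 45, 58, 49, 63, 55, 47]

# Assumed tolerance: one sample standard deviation around the training mean.
center = mean(training_ages)
spread = stdev(training_ages)
low, high = center - spread, center + spread

# Two hypothetical "live" batches: one similar to training, one shifted upward.
similar_batch = [50, 57, 60, 48, 54]
drifted_batch = [71, 78, 69, 80, 75]

print("similar batch in range:", mean_within_range(similar_batch, low, high))
print("drifted batch in range:", mean_within_range(drifted_batch, low, high))
```

A range check on a summary statistic is only one possible drift signal. The appropriate statistic and tolerance depend on the column and on how the model uses it.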

In [None]:
import great_expectations as gx

import demo_code as demo

## Load new incoming patient data

Load the "new" patient data: live data that has arrived since the model was deployed.

In [None]:
df_new_patient_data = demo.data.get_new_patient_data()
df_new_patient_data.head(n=10)

Persist the new patient data in a Postgres table for data validation with GX.

In [None]:
rows_written = demo.data.write_df_to_postgres(
    table_name="heart_disease_new_data", df=df_new_patient_data
)
print(f"{rows_written} rows written to Postgres.")

## Retrieve Expectation Suite that defined quality for original data

Use a Cloud Data Context to retrieve the Expectation Suites from GX Cloud. These suites hold the original data quality definitions created during data preparation and model development.

In [None]:
# Retrieve existing Expectation Suites from GX Cloud.
cloud_context = gx.get_context(mode="cloud")

distribution_suite = cloud_context.suites.get(name="Heart disease data: distribution")

schema_and_validity_suite = cloud_context.suites.get(
    name="Heart disease data: schema and validity"
)
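
Before validating, it can help to confirm what a retrieved suite actually contains. The sketch below is a small hypothetical helper, `list_expectation_types`, that relies only on a suite's `name` and `expectations` attributes. So that it runs without GX Cloud credentials, it is exercised here against a stand-in object; the two stand-in classes borrow GX-style names but do not reflect the actual contents of the demo suites.

```python
from types import SimpleNamespace


def list_expectation_types(suite):
    """Return the class name of each Expectation held by a suite."""
    return [type(expectation).__name__ for expectation in suite.expectations]


# Stand-in classes and suite, used only so this sketch runs on its own.
class ExpectColumnValuesToNotBeNull:
    pass


class ExpectColumnMeanToBeBetween:
    pass


stand_in_suite = SimpleNamespace(
    name="stand-in suite",
    expectations=[ExpectColumnValuesToNotBeNull(), ExpectColumnMeanToBeBetween()],
)

expectation_types = list_expectation_types(stand_in_suite)
print(stand_in_suite.name, expectation_types)
```

With the objects from the cell above, the equivalent call would be `list_expectation_types(distribution_suite)`.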

## Review comparison of training and live data quality using Data Docs

Create an Ephemeral Data Context to run data validation locally and write the validation results to Data Docs. The code below uses the same entities and workflow as shown in `01_Prep_data.ipynb`.

In [None]:
# Create an Ephemeral Data Context to run validation locally.
local_context = gx.get_context(mode="ephemeral")

# Code to create a containerized Data Docs site for demo.
local_context.add_data_docs_site(
    site_config={
        "class_name": "SiteBuilder",
        "show_how_to_buttons": False,
        "store_backend": {
            "class_name": "TupleFilesystemStoreBackend",
            "base_directory": "/gx/gx_volume/data_docs",
        },
        "site_index_builder": {"class_name": "DefaultSiteIndexBuilder"},
    },
    site_name="GX in the ML pipeline demo",
)

data_source = local_context.data_sources.add_postgres(
    "postgres", connection_string=demo.data.POSTGRES_CONNECTION_STRING
)

data_asset = data_source.add_table_asset(
    name="New heart disease data", table_name="heart_disease_new_data"
)

batch_definition = data_asset.add_batch_definition_whole_table(
    "new data batch definition"
)

local_context.suites.add(schema_and_validity_suite)
local_context.suites.add(distribution_suite)

schema_and_validity_validation_definition = gx.ValidationDefinition(
    name="schema and validity validation definition",
    data=batch_definition,
    suite=schema_and_validity_suite,
)

distribution_validation_definition = gx.ValidationDefinition(
    name="distribution validation definition",
    data=batch_definition,
    suite=distribution_suite,
)

local_context.validation_definitions.add(schema_and_validity_validation_definition)
local_context.validation_definitions.add(distribution_validation_definition)

checkpoint = local_context.checkpoints.add(
    gx.Checkpoint(
        name="checkpoint",
        validation_definitions=[
            schema_and_validity_validation_definition,
            distribution_validation_definition,
        ],
        actions=[gx.checkpoint.actions.UpdateDataDocsAction(name="update_data_docs")],
    )
)

results = checkpoint.run()
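
Besides Data Docs, the Checkpoint result can be inspected in code. The sketch below is a hypothetical helper, `summarize_checkpoint_result`, that assumes the result exposes `success` and `run_results`, and that each validation result exposes `suite_name` and a `statistics` dictionary with `successful_expectations` and `evaluated_expectations` keys. To stay runnable without a database, it is exercised against stand-in objects with made-up counts, not real validation output.

```python
from types import SimpleNamespace


def summarize_checkpoint_result(checkpoint_result):
    """Return (overall_success, [(suite_name, passed, evaluated), ...])."""
    rows = []
    for validation_result in checkpoint_result.run_results.values():
        stats = validation_result.statistics
        rows.append(
            (
                validation_result.suite_name,
                stats["successful_expectations"],
                stats["evaluated_expectations"],
            )
        )
    return checkpoint_result.success, rows


# Stand-in result with made-up counts, used only so this sketch runs on its own.
stand_in_result = SimpleNamespace(
    success=False,
    run_results={
        "validation-1": SimpleNamespace(
            suite_name="stand-in schema suite",
            statistics={"successful_expectations": 4, "evaluated_expectations": 4},
        ),
        "validation-2": SimpleNamespace(
            suite_name="stand-in distribution suite",
            statistics={"successful_expectations": 1, "evaluated_expectations": 3},
        ),
    },
)

overall_success, summary_rows = summarize_checkpoint_result(stand_in_result)
print("overall success:", overall_success)
for suite_name, passed, evaluated in summary_rows:
    print(f"{suite_name}: {passed}/{evaluated} Expectations passed")
```

With the objects from the cell above, the equivalent call would be `summarize_checkpoint_result(results)`.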

### View Expectation Suite and Validation Results in [Data Docs](http://localhost:3000)

## Review comparison of training and live data quality using GX Cloud

Create a Cloud Data Context. Add a Data Asset for the new data, and validate it using the original Expectation Suites developed in the data preparation phase. The code below uses the same entities and workflow as shown in `01_Prep_data.ipynb`.

In [None]:
cloud_context = gx.get_context(mode="cloud")

data_source = cloud_context.data_sources.get("demo database")

data_asset = data_source.add_table_asset(
    name="New heart disease data", table_name="heart_disease_new_data"
)

batch_definition = data_asset.add_batch_definition_whole_table(
    "new data batch definition"
)

cloud_schema_and_validity_validation_definition = gx.ValidationDefinition(
    name="new data schema and validity validation definition",
    data=batch_definition,
    suite=cloud_context.suites.get(name="Heart disease data: schema and validity"),
)

cloud_distribution_validation_definition = gx.ValidationDefinition(
    name="new data distribution validation definition",
    data=batch_definition,
    suite=cloud_context.suites.get(name="Heart disease data: distribution"),
)

cloud_context.validation_definitions.add(
    cloud_schema_and_validity_validation_definition
)

cloud_context.validation_definitions.add(cloud_distribution_validation_definition)

cloud_checkpoint = cloud_context.checkpoints.add(
    gx.Checkpoint(
        name="New heart disease data checkpoint",
        validation_definitions=[
            cloud_schema_and_validity_validation_definition,
            cloud_distribution_validation_definition,
        ],
    )
)

results = cloud_checkpoint.run()
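
In a monitoring job, a failed validation usually needs to stop or flag downstream steps rather than pass silently. The sketch below shows one possible gate. `DataDriftError` and `require_passing` are hypothetical names, not part of GX, and the helper assumes only that the Checkpoint result has a boolean `success` attribute. Stand-in objects are used so the example runs on its own.

```python
from types import SimpleNamespace


class DataDriftError(Exception):
    """Raised when new data fails validation against the original suites."""


def require_passing(checkpoint_result, label):
    """Return the result unchanged if it passed; otherwise raise DataDriftError."""
    if not checkpoint_result.success:
        raise DataDriftError(f"{label}: validation failed, review the results.")
    return checkpoint_result


# Stand-in results, used only so this sketch runs without GX Cloud.
passing_result = SimpleNamespace(success=True)
failing_result = SimpleNamespace(success=False)

require_passing(passing_result, "stand-in passing run")

try:
    require_passing(failing_result, "stand-in failing run")
except DataDriftError as error:
    gate_message = str(error)
    print("Gate triggered:", gate_message)
```

With the objects from the cell above, the equivalent call would be `require_passing(results, "New heart disease data checkpoint")`. Whether to raise, log, or send an alert is a design choice that depends on how the pipeline is orchestrated.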

### View Expectation Suites and Validation Results in [GX Cloud](https://app.greatexpectations.io)