# Edit Your Expectation Suite
Use this notebook to recreate and modify your expectation suite:

**Expectation Suite Name**: `customer_data_suite`

## Import Necessary Libraries

In [1]:
import yaml  # needed for yaml.dump() in the datasource configuration cell

import great_expectations as ge
from great_expectations.core.batch import BatchRequest
from great_expectations.dataset import PandasDataset

## Load the Data Context

In [2]:
# Load the data context (this automatically picks up the project configuration from great_expectations.yml)
context = ge.data_context.DataContext()

In [None]:
# Define the connection to your PostgreSQL datasource
datasource_config = {
    "name": "my_postgres_datasource",
    "class_name": "Datasource",
    "module_name": "great_expectations.datasource",
    "execution_engine": {
        "class_name": "SqlAlchemyExecutionEngine",
        "connection_string": "postgresql+psycopg2://<username>:<password>@<host>:<port>/<database_name>"
    },
    "data_connectors": {
        "default_inferred_data_connector_name": {
            "class_name": "InferredAssetSqlDataConnector",
            # SQL data connectors infer assets from table names; "default_regex"
            # applies only to file-based connectors, so it is omitted here.
            "include_schema_name": True
        }
    }
}

context.test_yaml_config(yaml.dump(datasource_config))
context.add_datasource(**datasource_config)
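
The `connection_string` above contains placeholders for the credentials. As a minimal sketch, assuming the credentials live in environment variables (the names `PG_USER`, `PG_PASSWORD`, `PG_HOST`, `PG_PORT`, `PG_DATABASE`, the fallback values, and the helper `build_postgres_url` are all hypothetical and not part of this project), the URL could be assembled like this. Percent-encoding the username and password keeps characters such as `@` or `/` from breaking the URL:

```python
import os
from urllib.parse import quote_plus


def build_postgres_url(username, password, host, port, database):
    """Return a SQLAlchemy-style PostgreSQL URL with percent-encoded credentials."""
    return (
        f"postgresql+psycopg2://{quote_plus(username)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical environment variable names and fallbacks; adjust to your setup.
connection_string = build_postgres_url(
    username=os.environ.get("PG_USER", "user"),
    password=os.environ.get("PG_PASSWORD", "p@ss/word"),
    host=os.environ.get("PG_HOST", "localhost"),
    port=int(os.environ.get("PG_PORT", "5432")),
    database=os.environ.get("PG_DATABASE", "mydb"),
)
```

The resulting string could then be placed under `execution_engine.connection_string` in `datasource_config` instead of a hard-coded value.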

## Create a Batch Request
Use the pre-configured PostgreSQL connection to create a batch request:

In [3]:
# Create a batch request to load data from PostgreSQL using the pre-defined configuration
batch_request = BatchRequest(
    datasource_name="postgres_datasource",  # Must match a datasource name in great_expectations.yml (the optional cell above registers "my_postgres_datasource" instead)
    data_connector_name="default_inferred_data_connector_name",
    data_asset_name="public.customer_Muyiwa",  # schema and table name
)

## Validator Setup and Data Preview
Create a validator to load the data and preview the first rows:

In [4]:
# Get a validator to load the data and validate it
validator = context.get_validator(
    batch_request=batch_request,
    expectation_suite_name="customer_data_suite"  # Replace with your suite name
)

# Sneak peek at the data
print("First 5 rows of the data:")
validator.head()

First 5 rows of the data:


| Unnamed: 0 | id | company | last_name | first_name | phone | address | city_and_state | postal_code | country |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 103 | Atelier graphique | Schmitt | Carine | 40.32.2555 | 54 Rue Royale | Nantes | 44000 | France |
| 1 | 112 | Signal Gift Stores | King | Jean | 7025551838 | 8940 Strong St. | Las Vegas NV | 83030 | USA |
| 2 | 114 | Australian Collectors Co. | Ferguson | Peter | 03 5422 4555 | 636 St Kilda Road Level 3 | Melbourne Victoria | 3004 | Australia |
| 3 | 119 | La Rochelle Gifts | Labrune | Janine | 40.42.3677 | 43 Rue des Cinquante Otages | Nantes | 44000 | France |
| 4 | 121 | Baane Mini Imports | Bergulfsen | Jonas | 31 12 2555 | 2577 Erling Skakkes gate 78 | Stavern | 4110 | Norway |


## Define Your Expectations

In [5]:
# 1. Expect the 'id' column to exist
validator.expect_column_to_exist("id")

# 2. Expect 'id' values to not be null
validator.expect_column_values_to_not_be_null("id")

# 3. Expect 'id' values to be unique
validator.expect_column_values_to_be_unique("id")

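# 4. Expect 'phone' values to match one specific format: two digits, two digits, four digits,
#    separated by single spaces (e.g. "31 12 2555"); other phone formats will be reported as unexpected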
validator.expect_column_values_to_match_regex("phone", r"^\d{2} \d{2} \d{4}$")


{
  "success": false,
  "result": {
    "element_count": 126,
    "unexpected_count": 122,
    "unexpected_percent": 96.82539682539682,
    "partial_unexpected_list": [
      "40.32.2555",
      "7025551838",
      "03 5422 4555",
      "40.42.3677",
      "(26) 642-7555",
      "+49 69 66 90 2555",
      "6505555787",
      "2125557818",
      "(91) 555 94 44",
      "0921-12 3555",
      "78.32.5555",
      "+65 221 7555",
      "2125557413",
      "2155551555",
      "6505556809",
      "+65 224 1555",
      "+47 2267 3215",
      "2035557845",
      "(1) 356-5555",
      "20.16.1555"
    ],
    "missing_count": 0,
    "missing_percent": 0.0,
    "unexpected_percent_total": 96.82539682539682,
    "unexpected_percent_nonmissing": 96.82539682539682
  },
  "meta": {},
  "exception_info": {
    "raised_exception": false,
    "exception_traceback": null,
    "exception_message": null
  }
}
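
The result above reports 122 of 126 phone values as unexpected, which means only 4 rows matched. To see why, the pattern can be tried against a few values taken from the preview and from `partial_unexpected_list`. This is a rough sketch using Python's `re` module; the validation itself runs inside PostgreSQL, whose regex engine is assumed to behave the same way for a simple anchored pattern like this one:

```python
import re

# Same pattern as in the expectation: "NN NN NNNN" with single spaces.
PHONE_PATTERN = re.compile(r"^\d{2} \d{2} \d{4}$")

# Sample values copied from the data preview and partial_unexpected_list.
samples = [
    "40.32.2555",     # dots instead of spaces
    "7025551838",     # no separators
    "03 5422 4555",   # second group has four digits
    "31 12 2555",     # fits the pattern
    "(26) 642-7555",  # parentheses and hyphen
]

matching = [s for s in samples if PHONE_PATTERN.search(s)]
print(matching)

# Counts reported in the result: rows that matched = total - unexpected.
element_count, unexpected_count = 126, 122
matched_rows = element_count - unexpected_count
print(matched_rows)
```

Of these samples only `"31 12 2555"` fits, so the expectation as written describes a single regional format. If several formats are legitimate, a broader pattern or the `mostly` argument may be a better fit for this column.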

## Saving the Expectation Suite
After defining the expectations, you need to save the expectation suite to make it reusable in your project:

In [6]:
# Save the expectation suite
validator.save_expectation_suite(discard_failed_expectations=False)

# To see the validation results in your data docs, you can build the data docs
context.build_data_docs()

{'local_site': 'file://C:\\Users\\kanzo\\PycharmProjects\\InterviewQACode\\APIPython\\gx\\uncommitted/data_docs/local_site/index.html'}

## Optional: Validate the Data

In [12]:
validation_results = validator.validate()

# Print validation results
print("Validation Results:")
print(validation_results)

2024-08-24T13:32:05+0100 - INFO - 	4 expectation(s) included in expectation_suite.


Validation Results:
{
  "success": false,
  "results": [
    {
      "success": true,
      "expectation_config": {
        "expectation_type": "expect_column_to_exist",
        "kwargs": {
          "column": "id",
          "batch_id": "cf8584b845d10659b24ba99be273a0c5"
        },
        "meta": {}
      },
      "result": {},
      "meta": {},
      "exception_info": {
        "raised_exception": false,
        "exception_traceback": null,
        "exception_message": null
      }
    },
    {
      "success": true,
      "expectation_config": {
        "expectation_type": "expect_column_values_to_not_be_null",
        "kwargs": {
          "column": "id",
          "batch_id": "cf8584b845d10659b24ba99be273a0c5"
        },
        "meta": {}
      },
      "result": {
        "element_count": 126,
        "unexpected_count": 0,
        "unexpected_percent": 0.0,
        "partial_unexpected_list": []
      },
      "meta": {},
      "exception_info": {
        "raised_exception": fa