## Automated Data Quality Monitoring
**Objective**: Use Great Expectations to perform data profiling and write validation rules.

1. Data Profiling with Great Expectations

### Profile a JSON dataset with product sales data to check for null values in the 'ProductID' and 'Price' fields.
- Create an expectation suite and connect it to the data context.
- Use the `expect_column_values_to_not_be_null` expectation to profile these fields.
- Review the summary to identify any unexpected null values.

In [None]:
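# write your code from here

# Sample product sales data. 'ProductID' and 'Price' each contain one null.
# The records are written to disk so that pd.read_json below can load them.
sample_json = """
[
  {"ProductID": "PROD123", "ProductName": "Laptop", "Price": 1200.00},
  {"ProductID": "PROD456", "ProductName": "Mouse", "Price": 25.00},
  {"ProductID": null, "ProductName": "Keyboard", "Price": 75.00},
  {"ProductID": "PROD789", "ProductName": "Monitor", "Price": null},
  {"ProductID": "PROD012", "ProductName": "Webcam", "Price": 50.00}
]
"""
with open("product_sales.json", "w") as f:
    f.write(sample_json)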

import great_expectations as gx
import pandas as pd

# Assuming your JSON file is named 'product_sales.json' and is in the same directory
json_file_path = 'product_sales.json'

# Load the JSON dataset using pandas
df = pd.read_json(json_file_path)

# Get the DataContext
context = gx.get_context()

# Create a datasource for your pandas DataFrame
datasource_name = "my_json_datasource"
pandas_datasource = context.sources.add_pandas(name=datasource_name)

# Create a data asset
data_asset_name = "product_sales_asset"
data_asset = pandas_datasource.add_dataframe_asset(name=data_asset_name, dataframe=df)

# Build a batch request
batch_request = data_asset.build_batch_request()

# Note: this cell assumes the Great Expectations 0.16-0.18 style API
# (as implied by context.sources above); releases from 1.0 on changed these calls.

# Create the expectation suite, or reset it if it already exists
expectation_suite_name = "product_null_check_suite"
context.add_or_update_expectation_suite(expectation_suite_name=expectation_suite_name)

# A validator binds the batch of data to the suite
validator = context.get_validator(
    batch_request=batch_request,
    expectation_suite_name=expectation_suite_name,
)

# Expect no null values in 'ProductID' and 'Price'.
# Each call runs immediately against the batch and records the expectation in the suite.
product_id_result = validator.expect_column_values_to_not_be_null(column="ProductID")
price_result = validator.expect_column_values_to_not_be_null(column="Price")

# Save the suite; keep expectations that failed on this batch, since the
# sample data contains nulls on purpose
validator.save_expectation_suite(discard_failed_expectations=False)
print(f"Saved the Expectation Suite '{expectation_suite_name}' with null checks.")

# Validate the data against the full suite
validation_result = validator.validate()

print("\nValidation Results:")
print(f"Overall success: {validation_result.success}")
print(validation_result.statistics)

# Review each expectation to see which columns contain unexpected nulls
for result in validation_result.results:
    column = result.expectation_config.kwargs["column"]
    print(f"{column}: success={result.success}, details={result.result}")
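
As a cross-check that does not depend on Great Expectations, the null counts for the five sample records can be computed with the standard library alone. This is a minimal sketch, not the library's implementation, and `count_nulls` is a hypothetical helper introduced only for illustration. Each checked field should come out with exactly one null, which is what the two `expect_column_values_to_not_be_null` results would be expected to report as unexpected.

```python
import json

# The five sample records from this exercise.
SAMPLE_JSON = """
[
  {"ProductID": "PROD123", "ProductName": "Laptop", "Price": 1200.00},
  {"ProductID": "PROD456", "ProductName": "Mouse", "Price": 25.00},
  {"ProductID": null, "ProductName": "Keyboard", "Price": 75.00},
  {"ProductID": "PROD789", "ProductName": "Monitor", "Price": null},
  {"ProductID": "PROD012", "ProductName": "Webcam", "Price": 50.00}
]
"""


def count_nulls(records, fields):
    """Return {field: number of records where the field is missing or None}."""
    return {
        field: sum(1 for record in records if record.get(field) is None)
        for field in fields
    }


records = json.loads(SAMPLE_JSON)
null_counts = count_nulls(records, ["ProductID", "Price"])

for field, count in null_counts.items():
    print(f"{field}: {count} null(s) out of {len(records)} records")
```

If the validation output disagrees with these counts, the usual suspects are the file path or how the JSON was loaded, rather than the expectations themselves.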



2. Writing Validation Rules for Data Ingestion

### Define validation rules for an API data source to confirm that the 'Status' field contains only the predefined statuses ('Active', 'Inactive').

- Apply `expect_column_values_to_be_in_set` to check field values during data ingestion.
- Execute the validation and review any mismatches.
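
The rule described above can be sketched without any library to make the check concrete. The records below are hypothetical stand-ins for an API response (the exercise does not specify an endpoint or payload), and `check_values_in_set` is an illustrative helper, not a Great Expectations function. Null values are skipped in this sketch; a separate not-null rule would cover them.

```python
ALLOWED_STATUSES = {"Active", "Inactive"}

# Hypothetical records standing in for an API response.
api_records = [
    {"id": 1, "Status": "Active"},
    {"id": 2, "Status": "Inactive"},
    {"id": 3, "Status": "Pending"},  # not in the allowed set
    {"id": 4, "Status": "active"},   # wrong case, so also a mismatch
    {"id": 5, "Status": "Active"},
]


def check_values_in_set(records, field, allowed):
    """Check that every non-null value of `field` is in `allowed`."""
    values = [record.get(field) for record in records]
    non_null = [value for value in values if value is not None]
    unexpected = [value for value in non_null if value not in allowed]
    return {
        "success": not unexpected,
        "element_count": len(values),
        "unexpected_count": len(unexpected),
        "unexpected_values": unexpected,
    }


result = check_values_in_set(api_records, "Status", ALLOWED_STATUSES)
print(result)

# Keep only the rows that pass, for example before loading them downstream.
valid_rows = [r for r in api_records if r.get("Status") in ALLOWED_STATUSES]
print(f"{len(valid_rows)} of {len(api_records)} rows passed")
```

With a validator built as in the first exercise, the corresponding Great Expectations call would be `validator.expect_column_values_to_be_in_set(column="Status", value_set=["Active", "Inactive"])`, assuming the same 0.x-style validator API used there.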