### Using Great Expectations for Automated Data Checks
**Objective**: Use Great Expectations to perform data validation steps on a dataset.

**Task 1**: Validate Column Existence

**Steps**:
- Load your dataset using a Pandas DataFrame.
- Use Great Expectations to set up an expectation suite.
- Create an expectation to confirm that a specific column (e.g., `customer_id`) exists in your dataset.
- Run the expectation and observe the results.

In [None]:
# write your code from here
import pandas as pd
import great_expectations as gx

# Step 1: Load your dataset
data = pd.DataFrame({
    'customer_id': [1, 2, 3],
    'name': ['Alice', 'Bob', 'Charlie'],
    'email': ['a@example.com', 'b@example.com', 'c@example.com']
})

# Step 2: Create a Great Expectations context
context = gx.get_context()

# Step 3: Wrap the DataFrame in a GX Validator (pre-1.0 fluent API)
df_ge = context.sources.pandas_default.read_dataframe(data)

# Step 4: Create or update the expectation suite
suite_name = "column_existence_suite"
suite = context.add_or_update_expectation_suite(suite_name)

# Step 5: Add expectation to check if 'customer_id' exists
df_ge.expect_column_to_exist("customer_id")

# Step 6: Save the expectation suite
df_ge.save_expectation_suite(discard_failed_expectations=False)

# Step 7: Validate the data against the expectations recorded on df_ge.
# (Passing the still-empty `suite` from Step 4 would evaluate no expectations.)
results = df_ge.validate()

# Step 8: Print the validation results
print("Validation Success:", results["success"])
print("Detailed Results:")
for res in results["results"]:
    print(f" - {res['expectation_config']['expectation_type']} on column '{res['expectation_config']['kwargs'].get('column')}' =>", "Passed" if res['success'] else "Failed")
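
When reading the output, it helps to confirm that at least one expectation was actually evaluated, since an empty result list can still report overall success. The sketch below works on a plain dictionary in the same layout the loop above reads (`success`, `results`, `expectation_config`, `kwargs`), such as the one `results.to_json_dict()` should return in pre-1.0 releases. `summarize` is a hypothetical helper name, and `sample` is hand-written for illustration, not real GX output.

```python
def summarize(result: dict) -> list:
    """Return one human-readable line per expectation result."""
    entries = result.get("results", [])
    if not entries:
        # A validation run with no expectations can still report success.
        return ["No expectations were evaluated"]
    lines = []
    for res in entries:
        config = res.get("expectation_config", {})
        column = config.get("kwargs", {}).get("column")
        status = "Passed" if res.get("success") else "Failed"
        lines.append(f"{config.get('expectation_type')}({column!r}): {status}")
    return lines


# Hand-written sample in the assumed layout (not real GX output).
sample = {
    "success": True,
    "results": [
        {
            "success": True,
            "expectation_config": {
                "expectation_type": "expect_column_to_exist",
                "kwargs": {"column": "customer_id"},
            },
        }
    ],
}

for line in summarize(sample):
    print(line)
```

Calling `summarize` on a result whose `results` list is empty returns the single warning line, which is a quick signal that the suite passed to validation held no expectations.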


**Task 2**: Validate Column Data Types

**Steps**:
- Using the same dataset setup, create an expectation to check that a numeric column (e.g., `purchase_amount`) contains only float values.
- Identify a numeric column in your dataset.
- Use Great Expectations to create and validate an expectation that checks the column's data type is correct.
- Run your expectation and check if it passes for your data.

In [None]:
# write your code from here
import pandas as pd
import great_expectations as gx

# Step 1: Load the dataset
data = pd.DataFrame({
    'customer_id': [1, 2, 3],
    'purchase_amount': [99.99, 149.50, 20.0],  # All floats
    'name': ['Alice', 'Bob', 'Charlie']
})

# Step 2: Create a Great Expectations context
context = gx.get_context()

# Step 3: Wrap the DataFrame in a GX Validator (pre-1.0 fluent API)
df_ge = context.sources.pandas_default.read_dataframe(data)

# Step 4: Create or update the expectation suite
suite_name = "column_dtype_suite"
suite = context.add_or_update_expectation_suite(suite_name)

# Step 5: Expect 'purchase_amount' column to have float values
df_ge.expect_column_values_to_be_of_type("purchase_amount", "float")

# Step 6: Save the expectation suite
df_ge.save_expectation_suite(discard_failed_expectations=False)

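# Step 7: Validate the data against the expectations recorded on df_ge.
# (Passing the still-empty `suite` from Step 4 would evaluate no expectations.)
results = df_ge.validate()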

# Step 8: Print results
print("Validation Success:", results["success"])
print("Detailed Results:")
for res in results["results"]:
    column = res['expectation_config']['kwargs'].get('column')
    expected_type = res['expectation_config']['kwargs'].get('type_')
    print(f" - Column '{column}' expected type '{expected_type}' =>", "Passed" if res['success'] else "Failed")
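
To identify a numeric column and see which dtype pandas inferred before writing the type expectation, a quick look with plain pandas can help. This is a minimal sketch that does not use Great Expectations; the `quantity` column is a hypothetical addition to show an integer column next to a float one.

```python
import pandas as pd

df = pd.DataFrame({
    "purchase_amount": [99.99, 149.50, 20.0],
    "quantity": [1, 2, 3],  # hypothetical integer column
    "name": ["Alice", "Bob", "Charlie"],
})

# List the columns pandas treats as numeric (integers and floats alike).
numeric_columns = df.select_dtypes(include="number").columns.tolist()
print(numeric_columns)  # ['purchase_amount', 'quantity']

# Inspect the inferred dtype of each column.
print(df.dtypes)

# An integer column is numeric but not float; cast it explicitly
# if the data is meant to be checked as float.
df["quantity"] = df["quantity"].astype("float64")
print(pd.api.types.is_float_dtype(df["quantity"]))  # True
```

An integer column is numeric without being float, so it is worth checking the dtype before deciding which type to put in the expectation.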


**Task 3**: Validate Range of Values

**Steps**:
- Set an expectation using Great Expectations to ensure that the values in a column (e.g., `age`) are between 18 and 65.
- Identify a column in your dataset where values fall within a specific range.
- Implement a range-based expectation to check this column and validate your dataset.
- Observe and interpret the result of your expectation.

In [None]:
# write your code from here
import pandas as pd
import great_expectations as gx

# Step 1: Load the dataset
data = pd.DataFrame({
    'customer_id': [1, 2, 3, 4],
    'age': [22, 35, 70, 16],  # 70 and 16 are outside range
    'name': ['Alice', 'Bob', 'Charlie', 'Diana']
})

# Step 2: Create a Great Expectations context
context = gx.get_context()

# Step 3: Wrap the DataFrame in a GX Validator (pre-1.0 fluent API)
df_ge = context.sources.pandas_default.read_dataframe(data)

# Step 4: Create or update the expectation suite
suite_name = "value_range_suite"
suite = context.add_or_update_expectation_suite(suite_name)

# Step 5: Set expectation for age values to be between 18 and 65
df_ge.expect_column_values_to_be_between("age", min_value=18, max_value=65)

# Step 6: Save the expectation suite
df_ge.save_expectation_suite(discard_failed_expectations=False)

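# Step 7: Validate the data against the expectations recorded on df_ge.
# (Passing the still-empty `suite` from Step 4 would evaluate no expectations.)
results = df_ge.validate()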

# Step 8: Print results
print("Validation Success:", results["success"])
print("Detailed Results:")
for res in results["results"]:
    column = res['expectation_config']['kwargs'].get('column')
    min_val = res['expectation_config']['kwargs'].get('min_value')
    max_val = res['expectation_config']['kwargs'].get('max_value')
    print(f" - Column '{column}' expected between {min_val} and {max_val} =>", "Passed" if res['success'] else "Failed")
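
In the sample data, 70 and 16 fall outside the 18 to 65 range, as the comment in Step 1 notes. A plain-Python sketch of the same inclusive check can help when interpreting what the expectation reports. `check_range` is a hypothetical helper and does not call Great Expectations.

```python
def check_range(values, min_value, max_value):
    """Inclusive range check.

    Returns the out-of-range values and the fraction of values in range.
    """
    if not values:
        return [], 1.0
    unexpected = [v for v in values if not (min_value <= v <= max_value)]
    fraction_ok = (len(values) - len(unexpected)) / len(values)
    return unexpected, fraction_ok


ages = [22, 35, 70, 16]  # same sample ages as in the cell above
unexpected, fraction_ok = check_range(ages, 18, 65)
print(unexpected)   # [70, 16]
print(fraction_ok)  # 0.5
```

The in-range fraction is the quantity that the `mostly` argument of `expect_column_values_to_be_between` is compared against, so a run with `mostly=0.5` would be expected to tolerate this data, while the default requires every value to be in range.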
