### Using Great Expectations for Automated Data Checks
**Objective**: Use Great Expectations to perform data validation steps on a dataset.

**Task 1**: Validate Column Existence

**Steps**:
- Load your dataset using a Pandas DataFrame.
- Use Great Expectations to set up an expectation suite.
- Create an expectation to confirm that a specific column (e.g., `customer_id`) exists in your dataset.
- Run the expectation and observe the results.

In [9]:
import pandas as pd
import great_expectations as gx

# Sample data
df = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})

# Create a temporary GE context
context = gx.get_context(mode="ephemeral")

# Register the DataFrame as a data asset.
# Assumes GX 1.x, where data sources live under context.data_sources
# (the older context.datasources attribute raises AttributeError there).
data_source = context.data_sources.add_pandas(name="my_pandas")
data_asset = data_source.add_dataframe_asset(name="customer_data")
batch_definition = data_asset.add_batch_definition_whole_dataframe("customer_batch")
batch = batch_definition.get_batch(batch_parameters={"dataframe": df})

# Create an Expectation Suite and add the expectation
suite = context.suites.add(gx.ExpectationSuite(name="customer_suite"))
suite.add_expectation(gx.expectations.ExpectColumnToExist(column="customer_id"))

# Validate the batch against the suite
results = batch.validate(suite)

# Show result
print(results)
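
As a quick cross-check on what the column-existence expectation asserts, the same condition can be tested with plain pandas. This is a minimal sketch and not part of Great Expectations; the helper name `missing_columns` and the `email` column are hypothetical, introduced only for illustration.

```python
import pandas as pd


def missing_columns(df: pd.DataFrame, required: list) -> list:
    """Return the required column names that are absent from df (hypothetical helper)."""
    return [col for col in required if col not in df.columns]


df = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
})

# An empty list means every required column is present.
present_check = missing_columns(df, ["customer_id"])
# "email" is a made-up column name used to show a failing case.
absent_check = missing_columns(df, ["customer_id", "email"])

print(present_check)
print(absent_check)
```

If the Great Expectations result and this check disagree, the column name (spelling, case, stray whitespace) is the first thing worth inspecting.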

**Task 2**: Validate Column Data Types

**Steps**:
- Using the same dataset setup, identify a numeric column (e.g., `purchase_amount`) that should contain only float values.
- Use Great Expectations to create an expectation that checks the column's data type.
- Run the expectation and check whether it passes for your data.

In [None]:
# write your code from here
import pandas as pd
import great_expectations as gx

# Sample dataset
df = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "purchase_amount": [100.50, 200.75, 150.00]
})

# Create an ephemeral GE context (no files written to disk)
context = gx.get_context(mode="ephemeral")

# Register the DataFrame as a data asset.
# Assumes GX 1.x, where data sources live under context.data_sources
# (the older context.datasources attribute raises AttributeError there).
data_source = context.data_sources.add_pandas(name="my_pandas")
data_asset = data_source.add_dataframe_asset(name="customer_data")
batch_definition = data_asset.add_batch_definition_whole_dataframe("customer_batch")
batch = batch_definition.get_batch(batch_parameters={"dataframe": df})

# Create the expectation suite
suite = context.suites.add(gx.ExpectationSuite(name="column_type_suite"))

# Add expectation: purchase_amount column should be of float type
suite.add_expectation(
    gx.expectations.ExpectColumnValuesToBeOfType(column="purchase_amount", type_="float")
)

# Validate the batch against the suite
results = batch.validate(suite)

# Show validation results
print(results)
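
Before relying on a type expectation, it can help to confirm what dtype pandas actually assigned to the column. The sketch below uses only pandas (`pd.api.types.is_float_dtype`), not Great Expectations, and reuses the sample data from this task.

```python
import pandas as pd

df = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "purchase_amount": [100.50, 200.75, 150.00],
})

# Inspect the dtype pandas inferred for each column.
print(df.dtypes)

# True when the column is stored as a floating-point dtype.
amount_is_float = pd.api.types.is_float_dtype(df["purchase_amount"])
# An integer column does not count as float.
id_is_float = pd.api.types.is_float_dtype(df["customer_id"])

print(amount_is_float, id_is_float)
```

A column of whole numbers such as `[100, 200, 150]` would be inferred as an integer dtype, so a float type expectation on it would be expected to fail until the column is cast.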

**Task 3**: Validate Range of Values

**Steps**:
- Identify a column in your dataset whose values should fall within a specific range (e.g., `age`, between 18 and 65).
- Implement a range-based expectation for this column and validate your dataset.
- Observe and interpret the result of your expectation.

In [None]:
# write your code from here
import pandas as pd
import great_expectations as gx

# Sample dataset
df = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "name": ["Alice", "Bob", "Charlie", "Diana"],
    "age": [25, 40, 17, 70],  # Contains values outside expected range
    "purchase_amount": [100.5, 200.75, 150.0, 180.25]
})

# Create ephemeral context
context = gx.get_context(mode="ephemeral")

# Register the DataFrame as a data asset.
# Assumes GX 1.x, where data sources live under context.data_sources
# (the older context.datasources attribute raises AttributeError there).
data_source = context.data_sources.add_pandas(name="my_pandas")
data_asset = data_source.add_dataframe_asset(name="customer_data")
batch_definition = data_asset.add_batch_definition_whole_dataframe("customer_batch")
batch = batch_definition.get_batch(batch_parameters={"dataframe": df})

# Create the expectation suite
suite = context.suites.add(gx.ExpectationSuite(name="range_validation_suite"))

# Expect 'age' column values to be between 18 and 65 (inclusive)
suite.add_expectation(
    gx.expectations.ExpectColumnValuesToBeBetween(column="age", min_value=18, max_value=65)
)

# Validate the batch against the suite; the sample data includes
# ages 17 and 70, so this expectation should not succeed
results = batch.validate(suite)

# Show validation results
print(results)
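
To interpret a range validation result, it helps to know which rows violate the bounds. The following sketch reproduces the check with plain pandas, assuming inclusive bounds of 18 and 65 as in the task description; it is a cross-check, not the Great Expectations implementation.

```python
import pandas as pd

df = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "name": ["Alice", "Bob", "Charlie", "Diana"],
    "age": [25, 40, 17, 70],
})

# Series.between is inclusive on both ends by default.
in_range = df["age"].between(18, 65)

# Values that fall outside the expected range.
unexpected_values = df.loc[~in_range, "age"].tolist()
# Share of rows that violate the range, as a percentage.
unexpected_percent = 100 * len(unexpected_values) / len(df)

print(unexpected_values)   # [17, 70]
print(unexpected_percent)  # 50.0
```

These two numbers are the kind of detail a failed range expectation typically summarizes, so they give a baseline to compare the printed validation result against.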