### Using Great Expectations for Automated Data Checks
**Objective**: Use Great Expectations to perform data validation steps on a dataset.

**Task 1**: Validate Column Existence

**Steps**:
- Load your dataset using a Pandas DataFrame.
- Use Great Expectations to set up an expectation suite.
- Create an expectation to confirm that a specific column (e.g., customer_id) exists in your dataset.
- Run the expectation and observe the results.
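
Before running the expectation, it can help to know what answer to expect. The snippet below is a minimal plain-pandas sketch of the same existence check, not Great Expectations code; `column_exists` is a hypothetical helper and the `email` column is an assumed example of a missing column.

```python
import pandas as pd

# Same sample data as the Task 1 cell
df = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "purchase_amount": [100.5, 200.0, 150.75],
    "age": [25, 30, 35],
})


def column_exists(frame: pd.DataFrame, column: str) -> bool:
    """Hypothetical helper: True when `column` is one of the frame's columns."""
    return column in frame.columns


has_customer_id = column_exists(df, "customer_id")
has_email = column_exists(df, "email")  # 'email' is deliberately absent from df
print(has_customer_id, has_email)
```

If the Great Expectations result disagrees with this check, the column name (spelling, case, stray whitespace) is the first thing to inspect.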

In [2]:
# The package is named great_expectations; 'ge' is only a conventional alias
import great_expectations as ge
import pandas as pd

# Sample dataset
data = {'customer_id': [1, 2, 3], 'purchase_amount': [100.5, 200.0, 150.75], 'age': [25, 30, 35]}
df = pd.DataFrame(data)

# Convert the Pandas DataFrame to a Great Expectations DataFrame
# (assumes a pre-1.0 release that still ships the legacy from_pandas API)
ge_df = ge.from_pandas(df)

# Validate that the 'customer_id' column exists
expectation_result = ge_df.expect_column_to_exist('customer_id')
print(expectation_result)

In [None]:
# write your code from here

**Task 2**: Validate Column Data Types

**Steps**:
- Using the same dataset setup, create an expectation to check that a numeric column (e.g., purchase_amount) contains only float values.
- Identify a numeric column in your dataset.
- Use Great Expectations to create and validate an expectation that checks the column's data type is correct.
- Run your expectation and check if it passes for your data.
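
To identify a numeric column and see which type name to expect, the dtypes can be inspected with pandas alone. This is a small sketch that uses only pandas, independent of Great Expectations; the variable names are illustrative.

```python
import pandas as pd
from pandas.api.types import is_float_dtype

# Same sample data as the Task 1 cell
df = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "purchase_amount": [100.5, 200.0, 150.75],
    "age": [25, 30, 35],
})

# Map each column name to the string form of its dtype
dtype_report = {name: str(dtype) for name, dtype in df.dtypes.items()}
print(dtype_report)

# is_float_dtype is True only for floating-point columns
purchase_is_float = is_float_dtype(df["purchase_amount"])
age_is_float = is_float_dtype(df["age"])
print(purchase_is_float, age_is_float)
```

Here `purchase_amount` is stored as a float column while `age` holds integers, so a float-type expectation should pass for the former and would be expected to fail for the latter.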

In [None]:
# write your code from here

In [3]:
# Validate that the 'purchase_amount' column contains only float values
# (requires ge_df from the Task 1 cell, so run that cell first)
expectation_result = ge_df.expect_column_values_to_be_of_type('purchase_amount', 'float')
print(expectation_result)

**Task 3**: Validate Range of Values

**Steps**:
- Set an expectation using Great Expectations to ensure that the values in a column (e.g., age) are between 18 and 65.
- Identify a column in your dataset where values fall within a specific range.
- Implement a range-based expectation to check this column and validate your dataset.
- Observe and interpret the result of your expectation.
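
Interpreting a range result mostly comes down to two questions: did every value pass, and which values did not? The sketch below answers both with plain pandas. `range_check` and its dictionary keys are hypothetical names chosen for illustration, and the second series with out-of-range ages is made-up test data, not part of the exercise dataset.

```python
import pandas as pd


def range_check(series: pd.Series, min_value: float, max_value: float) -> dict:
    """Hypothetical helper: report values outside [min_value, max_value]."""
    # Series.between is inclusive on both ends by default
    in_range = series.between(min_value, max_value)
    unexpected = series[~in_range].tolist()
    return {
        "success": bool(in_range.all()),
        "unexpected_count": len(unexpected),
        "unexpected_values": unexpected,
    }


# Ages from the exercise dataset: all within 18..65
ok = range_check(pd.Series([25, 30, 35], name="age"), 18, 65)

# Made-up ages with one value below and one above the range
bad = range_check(pd.Series([17, 40, 70], name="age"), 18, 65)

print(ok)
print(bad)
```

A passing check reports no unexpected values; a failing one lists the offending values, which is usually the quickest way to decide whether the data or the chosen bounds are wrong.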

In [None]:
# write your code from here

In [4]:
# Validate that the 'age' column values are between 18 and 65
# (requires ge_df from the Task 1 cell, so run that cell first)
expectation_result = ge_df.expect_column_values_to_be_between('age', min_value=18, max_value=65)
print(expectation_result)