## Overview
- **Expectation Used:** [expect_column_values_to_be_equal_to_or_greater_than_profile_min](https://github.com/great-expectations/great_expectations/blob/develop/contrib/capitalone_dataprofiler_expectations/capitalone_dataprofiler_expectations/expectations/expect_column_values_to_be_equal_to_or_greater_than_profile_min.py)
- **Expectation Description:** This expectation takes the profile report of an initial dataset and compares it against an additional dataset of the same schema. It expects every value in the user-specified column to be greater than or equal to the `min` metric recorded for that column in the report of the initial dataset.
- **Example Details:** In this example, the data owner is using this expectation to compare two different time series in the dataset they are using. They want to ensure that all salaries at their company are greater than or equal to the `min` value generated in the report from the original year.

### Imports

In [None]:
import os

import pandas as pd
import numpy as np

# Great Expectations imports
import great_expectations as ge
from capitalone_dataprofiler_expectations.expectations.expect_column_values_to_be_equal_to_or_greater_than_profile_min import (
    ExpectColumnValuesToBeEqualToOrGreaterThanProfileMin,
)
from great_expectations.self_check.util import build_pandas_validator_with_data

# Data Profiler import
import dataprofiler as dp

### Setup
Below we import a dataset from the Data Profiler testing suite. This CSV file holds salary information for individuals working in the data science field around the world.

In [None]:
context = ge.get_context()

In [None]:
data_path = "../../dataprofiler/tests/data/csv/ds_salaries.csv"
data = dp.Data(data_path).data
data

In this example we are going to split up the dataset into three separate years so we can simulate a dataset which will have a yearly aggregation of data.

In [None]:
data.sort_values(by="work_year", axis=0, inplace=True)
years = data["work_year"].unique().tolist()
years

Now that we have the years, we will capture all records from each year in their own dataframes, so we can process them separately.

In [None]:
individual_dataframes = []
for year in years:
    # Filter without reassigning `data`, so later iterations still see every year
    current_year_df = data.loc[data["work_year"] == year]
    current_year_df = current_year_df.drop("work_year", axis=1)
    individual_dataframes.append(current_year_df)
individual_dataframes[0].head()
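
The same per-year split can also be expressed with `groupby`, which avoids filtering the full frame once per year. Below is a minimal sketch on a small made-up frame; the values are illustrative and are not taken from `ds_salaries.csv`, and the names `toy` and `frames_by_year` are hypothetical.

```python
import pandas as pd

# Toy stand-in for the salaries data; values are invented for illustration.
toy = pd.DataFrame(
    {
        "work_year": [2020, 2021, 2020, 2022, 2021],
        "salary_in_usd": [50000, 62000, 48000, 71000, 55000],
    }
)

# groupby sorts the group keys by default, so the frames come out in year order.
frames_by_year = {
    int(year): group.drop(columns="work_year").reset_index(drop=True)
    for year, group in toy.groupby("work_year")
}

toy_years = list(frames_by_year)
first_year_salaries = frames_by_year[2020]["salary_in_usd"].tolist()
```

Keying the result by year, rather than by list position, makes it harder to pick the wrong year's frame later on.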

Now we will create a report on the first entry in `individual_dataframes`, which corresponds to the year **2020**. We disable the data labeler and request the report in `compact` format.

In [None]:
profiler_options = dp.ProfilerOptions()
profiler_options.set({"data_labeler.is_enabled": False})
profile = dp.Profiler(individual_dataframes[0], len(individual_dataframes[0]), options=profiler_options)
report = profile.report(report_options={"output_format": "compact"})

Let's take a look at the `min` metric for the `salary_in_usd` column as found in the report:

In [None]:
report['data_stats'][6]['statistics']['min']
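
Indexing `data_stats` by position (`[6]`) depends on the column order of the dataframe. Looking the column up by name is less brittle. The sketch below assumes each entry in `data_stats` carries a `column_name` key alongside `statistics`; the `toy_report` dict and the helper `column_min` are hypothetical and only mimic the shape of the real report.

```python
def column_min(report, column):
    """Return the `min` statistic for `column` from a profiler-style report."""
    for entry in report["data_stats"]:
        if entry["column_name"] == column:
            return entry["statistics"]["min"]
    raise KeyError(f"column {column!r} not found in report")


# Hypothetical report with invented numbers, shaped like the profiler output.
toy_report = {
    "data_stats": [
        {"column_name": "salary", "statistics": {"min": 4000.0}},
        {"column_name": "salary_in_usd", "statistics": {"min": 5000.0}},
    ]
}

usd_min = column_min(toy_report, "salary_in_usd")
```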

### Running the Expectation
We build the validator by passing in the dataframe from `individual_dataframes` corresponding to **2022**. Then we use the expectation below to check whether every value in the `salary_in_usd` column is greater than or equal to the `min` metric for that column in `report`.

In [None]:
# Years are sorted (2020, 2021, 2022), so index 2 holds the 2022 records
validator = build_pandas_validator_with_data(individual_dataframes[2])
results = validator.expect_column_values_to_be_equal_to_or_greater_than_profile_min(
    column='salary_in_usd',
    profile=report
)
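
Conceptually, the check compares each value in the column with the `min` stored in the profile report. The sketch below illustrates that comparison in plain Python. It is not the library's implementation; `values_below_profile_min` and the sample numbers are made up for illustration.

```python
def values_below_profile_min(values, profile_min):
    """Return the values that fall strictly below the profile's min."""
    return [v for v in values if v < profile_min]


# Invented numbers: a baseline min and a later year's salaries.
baseline_min = 5000.0
new_year_salaries = [4800.0, 5000.0, 7200.0, 3900.0]

unexpected = values_below_profile_min(new_year_salaries, baseline_min)
# The check passes only when no value falls below the baseline min.
success = len(unexpected) == 0
```

A value equal to the baseline `min` passes, which matches the "equal to or greater than" wording of the expectation.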

### Results
After the data owner generates the expectation results, they find that there are 5 salaries recorded in **2022** that are less than the `min` `salary_in_usd` recorded in **2020**. Although the expectation failed and some salaries are lower two years after the original report, the results still show which salaries are lower, so the data owner can tell the company where it might need to compensate more.

In [None]:
results