## Overview
- **Expectation Used:** [expect_column_values_to_be_equal_to_or_less_than_profile_max](https://github.com/great-expectations/great_expectations/blob/develop/contrib/capitalone_dataprofiler_expectations/capitalone_dataprofiler_expectations/expectations/expect_column_values_to_be_equal_to_or_less_than_profile_max.py)
- **Expectation Description:** This expectation takes the report generated from an initial dataset and compares it against a second dataset with the same schema. Every value in the user-specified column of the second dataset is expected to be less than or equal to the `max` metric recorded for that column in the initial dataset's report.
- **Use Case:** If a user has data that tracks the daily spending on an account, they might want some data quality checks to track when daily spending reaches an all-time high. With this expectation, as new data is generated based on daily spending, it will raise an expectation violation if the new data indicates higher spending than the max daily spending recorded previously by this account. This is one very practical use for fraud monitoring and detection.
- **Example Details:** In this example, we use this expectation to check the max age found in the original dataset's report against all the age values in the new dataset. All the ages in the new dataset should be less than or equal to the max age of the original dataset; otherwise a violation is raised indicating exactly which values caused it.
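
As a rough illustration of the comparison described above, the sketch below reproduces the idea in plain pandas. It is not the expectation's actual implementation: the helper `values_above_profile_max` and both series are hypothetical, made up only to show the less-than-or-equal check.

```python
import pandas as pd


def values_above_profile_max(baseline: pd.Series, new: pd.Series) -> pd.Series:
    """Return the entries of `new` that exceed the maximum of `baseline`."""
    profile_max = baseline.max()  # NaN values are skipped by default
    # NaN > profile_max evaluates to False, so missing values are not flagged here
    return new[new > profile_max]


# Hypothetical data: ages from an initial dataset and from a newer one
baseline_ages = pd.Series([23, 41, 67, 89])
new_ages = pd.Series([30, 89, 95, None])

violations = values_above_profile_max(baseline_ages, new_ages)
print(violations.tolist())
```

A value equal to the baseline maximum (89 here) is not reported, which matches the "equal to or less than" wording of the expectation.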

### Imports

In [None]:
import os

import pandas as pd
import numpy as np

# Great expectations imports
import great_expectations as ge
from capitalone_dataprofiler_expectations.expectations. \
    expect_column_values_to_be_equal_to_or_less_than_profile_max \
    import ExpectColumnValuesToBeEqualToOrLessThanProfileMax
from great_expectations.self_check.util import build_pandas_validator_with_data

# Data Profiler import
import dataprofiler as dp

### Setup
Below we are going to import a dataset from the Data Profiler testing suite. This CSV holds gun crime statistics across 3 different years.

In [None]:
context = ge.get_context()

In [None]:
guns_data_path = "../../dataprofiler/tests/data/csv/guns.csv"
df = pd.read_csv(guns_data_path)
df.head()

In this example we are going to split up the dataset into three separate years so we can simulate a dataset which will have a yearly aggregation of data.

In [None]:
df.sort_values(by="year", axis=0, inplace=True)
years = df["year"].unique().tolist()
years.reverse()
years

Now that we have the years, we will capture all records from each year in their own dataframes, so we can process them separately.

In [None]:
individual_dataframes = []
for year in years:
    current_year_df = df.loc[df["year"]==year]
    current_year_df = current_year_df.drop("year", axis=1).drop("month", axis=1)
    individual_dataframes.append(current_year_df)
individual_dataframes[0].head()
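
The same split can also be written with `groupby`. The following is a minimal sketch on a hypothetical toy frame (the column values are invented), not a replacement for the cell above; it assumes only that the data has `year` and `month` columns to drop.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset
toy = pd.DataFrame(
    {
        "year": [2012, 2013, 2013, 2014],
        "month": [1, 2, 3, 4],
        "age": [34, 21, 60, 45],
    }
)

# One dataframe per year, with the grouping columns removed
frames_by_year = {
    year: group.drop(columns=["year", "month"])
    for year, group in toy.groupby("year")
}

# Order the frames from the most recent year to the oldest
latest_first = [frames_by_year[y] for y in sorted(frames_by_year, reverse=True)]
print(latest_first[0])
```

Keying the frames by year avoids having to remember which list position corresponds to which year.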

Now we will create a report on the first entry of `individual_dataframes`, which corresponds to the year 2014. Then we will output the `max` metric of the `age` column as found in the report.

In [None]:
profiler_options = dp.ProfilerOptions()
profiler_options.set({"data_labeler.is_enabled": False})
profile = dp.Profiler(individual_dataframes[0], len(individual_dataframes[0]), options=profiler_options)
report = profile.report(report_options={"output_format": "compact"})

In [None]:
report['data_stats'][3]['statistics']['max']
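
Indexing `data_stats` by position depends on the column order. A possible alternative is to look the column up by name, as sketched below. The helper `column_stat` is hypothetical, and the sketch assumes each entry of `data_stats` carries a `column_name` key next to `statistics`; the example report is hand-written with made-up numbers, not profiler output.

```python
def column_stat(report: dict, column_name: str, stat: str):
    """Look up one statistic for a named column in a structured report dict."""
    for column in report["data_stats"]:
        if column["column_name"] == column_name:
            return column["statistics"][stat]
    raise KeyError(f"column {column_name!r} not found in report")


# Hand-written stand-in for a profiler report; the values are invented
example_report = {
    "data_stats": [
        {"column_name": "sex", "statistics": {}},
        {"column_name": "age", "statistics": {"min": 1.0, "max": 90.0}},
    ]
}

print(column_stat(example_report, "age", "max"))
```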

### Running the Expectation
We build the validator by passing in the entry of `individual_dataframes` corresponding to 2013. Then we use the expectation below to find values in the `age` column that exceed the `max` metric for the `age` column recorded in `report`.

In [None]:
validator = build_pandas_validator_with_data(individual_dataframes[1])
results = validator.expect_column_values_to_be_equal_to_or_less_than_profile_max(
    column='age',
    profile=report
)

### Results
After running the expectation, we find one row whose value (107) exceeds the max age from the previous report, as well as 11 rows with missing values.

In [None]:
results
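
To pull the key numbers out of a result like this, a small helper can be handy. The sketch below works on a plain dict and assumes the usual column-map result keys (`success`, and under `result`: `unexpected_count`, `missing_count`, `partial_unexpected_list`), for example as obtained from `results.to_json_dict()`. The helper `summarize_result` is hypothetical, and the example dict is a hand-written stand-in that mirrors the description above, not actual output.

```python
def summarize_result(result: dict) -> str:
    """Build a one-line summary from a validation result given as a plain dict."""
    details = result.get("result", {})
    status = "passed" if result.get("success") else "failed"
    return (
        f"{status}: {details.get('unexpected_count', 0)} unexpected, "
        f"{details.get('missing_count', 0)} missing, "
        f"examples={details.get('partial_unexpected_list', [])}"
    )


# Hand-written stand-in mirroring the outcome described in the text
example_result = {
    "success": False,
    "result": {
        "unexpected_count": 1,
        "missing_count": 11,
        "partial_unexpected_list": [107],
    },
}

summary = summarize_result(example_result)
print(summary)
```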