# Scaffold a new Expectation Suite (Experimental)
This process helps you avoid writing boilerplate when authoring suites: you select the columns and other factors that you care about, and a profiler writes candidate expectations for you to adjust.

**Expectation Suite Name**: `ygladkikh.stg_payment.warning`

We'd love it if you'd **reach out to us on** the [**Great Expectations Slack Channel**](https://greatexpectations.io/slack)!

In [3]:
import great_expectations as ge
from great_expectations.checkpoint import LegacyCheckpoint
from great_expectations.profile.user_configurable_profiler import UserConfigurableProfiler
from great_expectations.data_context.types.resource_identifiers import ValidationResultIdentifier

context = ge.data_context.DataContext()

expectation_suite_name = "ygladkikh.stg_payment.warning"

# Wipe the suite clean to prevent unwanted expectations in the batch
suite = context.create_expectation_suite(expectation_suite_name, overwrite_existing=True)

batch_kwargs = {'table': 'stg_payment', 'schema': 'ygladkikh', 'data_asset_name': 'ygladkikh.stg_payment', 'datasource': 'green'}
batch = context.get_batch(batch_kwargs, suite)
batch.head()

| | user_id | pay_doc_type | pay_doc_num | account | phone | billing_period | pay_date | sum |
|---|---|---|---|---|---|---|---|---|
| 0 | 10220 | MIR | 19450 | FL-29144 | 79010980561 | 2013-02-01 | 2013-02-22 | 26343.0 |
| 1 | 10650 | VISA | 5337 | FL-46012 | 79012101819 | 2020-04-01 | 2013-11-03 | 32982.0 |
| 2 | 11110 | MIR | 14024 | FL-14714 | 79013438875 | 2015-06-01 | 2013-07-17 | 40925.0 |
| 3 | 10140 | MIR | 6696 | FL-295 | 79011736295 | 2014-12-01 | 2013-11-13 | 19677.0 |
| 4 | 10230 | MASTER | 32859 | FL-45208 | 79015457025 | 2020-04-01 | 2013-01-16 | 24876.0 |


## Select the columns on which you would like to scaffold expectations and those which you would like to ignore.

Great Expectations will choose which expectations might make sense for a column based on the **data type** and **cardinality** of the data in each selected column.
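
To make "data type" and "cardinality" concrete, here is a minimal sketch in plain Python that computes both inputs for two columns, using only the five rows shown by `batch.head()` above. The helper `describe_column` is hypothetical and is not part of Great Expectations; the profiler derives its own cardinality categories internally from the full table, not from a five-row sample.

```python
def describe_column(values):
    """Return (type names, unique count, proportion unique) for a column.

    Hypothetical helper for illustration only; not a Great Expectations API.
    """
    non_null = [v for v in values if v is not None]
    unique_count = len(set(non_null))
    proportion_unique = unique_count / len(non_null) if non_null else 0.0
    type_names = sorted({type(v).__name__ for v in non_null})
    return type_names, unique_count, proportion_unique


# Values copied from the five sample rows shown above.
pay_doc_type = ["MIR", "VISA", "MIR", "MIR", "MASTER"]
user_id = [10220, 10650, 11110, 10140, 10230]

print(describe_column(pay_doc_type))  # repeated values: low cardinality
print(describe_column(user_id))       # every value distinct: high cardinality
```

A column with few distinct values, such as `pay_doc_type`, is a natural candidate for a value-set expectation, while a column where every value is distinct is better served by uniqueness or type checks.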

Comment out the columns that are important and should be included; any column left
uncommented in the list is ignored. You can select multiple lines and
use a Jupyter keyboard shortcut to toggle each line: **Linux/Windows**:
`Ctrl-/`, **macOS**: `Cmd-/`

In [5]:
ignored_columns = [
#     'user_id',
#     'pay_doc_type',
#     'pay_doc_num',
#     'account',
#     'phone',
#     'billing_period',
#     'pay_date',
#     'sum'
]
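
If toggling comments feels error-prone, the same list can be derived from the columns you want to keep. This is a minimal sketch, assuming the eight column names shown in the cell above; `columns_to_profile` and `ignored_columns_example` are hypothetical names chosen for illustration, and the selection itself is arbitrary.

```python
# Column names copied from the ignored_columns cell above.
all_columns = [
    "user_id", "pay_doc_type", "pay_doc_num", "account",
    "phone", "billing_period", "pay_date", "sum",
]

# Hypothetical selection: the columns we want the profiler to look at.
columns_to_profile = {"pay_doc_type", "billing_period", "sum"}

# Catch typos early: every selected column must exist in the table.
unknown = columns_to_profile - set(all_columns)
if unknown:
    raise ValueError(f"Unknown columns: {sorted(unknown)}")

# Everything not selected is ignored; the table's column order is preserved.
ignored_columns_example = [c for c in all_columns if c not in columns_to_profile]
print(ignored_columns_example)
```

The result can be passed as `ignored_columns` in place of the hand-edited list.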

## Run the scaffolder

The suites generated here are **not meant to be production suites** - they are **scaffolds to build upon**.

**To get to a production grade suite, you will definitely want to [edit this
suite](https://docs.greatexpectations.io/en/latest/guides/how_to_guides/creating_and_editing_expectations/how_to_edit_an_expectation_suite_using_a_disposable_notebook.html?utm_source=notebook&utm_medium=scaffold_expectations)
after scaffolding gets you close to what you want.**

The profiler is highly configurable, depending on your goals.
You can ignore columns, exclude certain expectations, specify a threshold for creating value set expectations, or specify semantic types for a given column.
You can find more information about [how to configure this profiler, including a list of the expectations that it uses, here.](https://docs.greatexpectations.io/en/latest/guides/how_to_guides/creating_and_editing_expectations/how_to_create_an_expectation_suite_with_the_user_configurable_profiler.html)
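
As a rough sketch of what a non-default configuration might look like, the options can be collected in a plain dictionary before being handed to the profiler. The specific values here are assumptions for illustration: that `"FEW"` is an accepted threshold name alongside the `"MANY"` used in the next cell, that `"numeric"` and `"value_set"` are valid semantic type keys, and that `sum` and `pay_doc_type` suit those types. The helper `overlapping_columns` is hypothetical; it is a sanity check for this sketch, not a documented profiler requirement.

```python
# Hypothetical configuration; check the profiler guide linked above
# for the accepted values before relying on any of these.
profiler_kwargs = {
    "ignored_columns": ["phone", "account"],
    "excluded_expectations": [
        "expect_column_proportion_of_unique_values_to_be_between",
    ],
    "value_set_threshold": "FEW",  # assumed to be a valid threshold name
    "semantic_types_dict": {       # assumed keys: "numeric", "value_set"
        "numeric": ["sum"],
        "value_set": ["pay_doc_type"],
    },
}


def overlapping_columns(kwargs):
    """Return columns that are both ignored and assigned a semantic type."""
    ignored = set(kwargs.get("ignored_columns") or [])
    typed = {
        column
        for columns in (kwargs.get("semantic_types_dict") or {}).values()
        for column in columns
    }
    return sorted(ignored & typed)


print(overlapping_columns(profiler_kwargs))

# In the notebook, the dictionary would then be unpacked into the profiler:
# profiler = UserConfigurableProfiler(profile_dataset=batch, **profiler_kwargs)
```

Keeping the options in one dictionary makes it easier to compare configurations between scaffolding runs.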



In [6]:
profiler = UserConfigurableProfiler(profile_dataset=batch,
    ignored_columns=ignored_columns,
    excluded_expectations=None,
    not_null_only=False,
    primary_or_compound_key=False,
    semantic_types_dict=None,
    table_expectations_only=False,
    value_set_threshold="MANY",
    )

suite = profiler.build_suite()

Using lossy conversion for decimal 10619.148413510747 to float object to support serialization.
Using lossy conversion for decimal 24992.723950870010 to float object to support serialization.
Using lossy conversion for decimal 24858.516274309110 to float object to support serialization.


Creating an expectation suite with the following expectations:

Table-Level Expectations
expect_table_columns_to_match_ordered_list
expect_table_row_count_to_be_between

Expectations by Column
Column Name: account | Column Data Type: STRING | Cardinality: VERY_MANY
expect_column_proportion_of_unique_values_to_be_between
expect_column_values_to_be_in_type_list
expect_column_values_to_not_be_null


Column Name: billing_period | Column Data Type: STRING | Cardinality: MANY
expect_column_proportion_of_unique_values_to_be_between
expect_column_values_to_be_in_set
expect_column_values_to_be_in_type_list
expect_column_values_to_not_be_null


Column Name: pay_date | Column Data Type: STRING | Cardinality: VERY_MANY
expect_column_proportion_of_unique_values_to_be_between
expect_column_values_to_be_in_type_list
expect_column_values_to_not_be_null


Column Name: pay_doc_num | Column Data Type: INT | Cardinality: VERY_MANY
expect_column_max_to_be_between
expect_column_mean_to_be_between
expect_col

## Save & review the scaffolded Expectation Suite

Let's save the scaffolded expectation suite as a JSON file in the
`great_expectations/expectations` directory of your project and rebuild the
Data Docs site to make it easy to review the scaffolded suite.

In [7]:
context.save_expectation_suite(suite, expectation_suite_name)

results = LegacyCheckpoint(
    name="_temp_checkpoint",
    data_context=context,
    batches=[
        {
          "batch_kwargs": batch_kwargs,
          "expectation_suite_names": [expectation_suite_name]
        }
    ]
).run()
validation_result_identifier = results.list_validation_result_identifiers()[0]
context.build_data_docs()
context.open_data_docs(validation_result_identifier)

Using lossy conversion for decimal 10619.148413510747 to float object to support serialization.
Using lossy conversion for decimal 24992.723950870010 to float object to support serialization.
Using lossy conversion for decimal 24858.516274309110 to float object to support serialization.


## Next steps
After you review this scaffolded Expectation Suite in Data Docs, you
should edit the suite to make finer-grained adjustments to the expectations.
You can do this by running `great_expectations suite edit ygladkikh.stg_payment.warning`.
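
Before editing, it can help to see which expectations landed on which column. The sketch below groups expectation types by column from a suite dictionary. It assumes the saved JSON has a top-level `expectations` list whose entries carry `expectation_type` and `kwargs`; `sample_suite` is a hand-written stand-in with made-up kwargs, not the suite produced above, and `expectations_by_column` is a hypothetical helper.

```python
from collections import defaultdict


def expectations_by_column(suite_dict):
    """Group expectation types by column; table-level ones go under '<table>'."""
    grouped = defaultdict(list)
    for expectation in suite_dict.get("expectations", []):
        column = expectation.get("kwargs", {}).get("column", "<table>")
        grouped[column].append(expectation["expectation_type"])
    return dict(grouped)


# Hand-written sample in the assumed layout; kwargs values are placeholders.
sample_suite = {
    "expectation_suite_name": "ygladkikh.stg_payment.warning",
    "expectations": [
        {"expectation_type": "expect_table_row_count_to_be_between",
         "kwargs": {"min_value": 1}},
        {"expectation_type": "expect_column_values_to_not_be_null",
         "kwargs": {"column": "account"}},
        {"expectation_type": "expect_column_values_to_be_in_set",
         "kwargs": {"column": "billing_period", "value_set": []}},
    ],
}

for column, types in expectations_by_column(sample_suite).items():
    print(column, types)
```

With a real suite, you would load the saved JSON file with `json.load` and pass the resulting dictionary to the same function.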