# Initialize a new Expectation Suite by profiling a batch of your data.
This process helps you avoid writing lots of boilerplate when authoring suites by allowing you to select columns and other factors that you care about and letting a profiler write some candidate expectations for you to adjust.

**Expectation Suite Name**: `profiling_test_snowflake`


In [1]:
import datetime

import pandas as pd

import great_expectations as gx
import great_expectations.jupyter_ux
from great_expectations.core.batch import BatchRequest
from great_expectations.checkpoint import SimpleCheckpoint
from great_expectations.exceptions import DataContextError

context = gx.get_context()

batch_request = {
    "datasource_name": "my_datasource_new",
    "data_connector_name": "default_inferred_data_connector_name",
    "data_asset_name": "my_schema.employees",
    "limit": 1000,
}

expectation_suite_name = "expectation_test"

validator = context.get_validator(
    batch_request=BatchRequest(**batch_request),
    expectation_suite_name=expectation_suite_name,
)
validator.head(n_rows=5, fetch_all=False)

2023-02-28T21:49:46+0530 - INFO - Great Expectations logging enabled at 20 level by JupyterUX module.
2023-02-28T21:49:46+0530 - INFO - FileDataContext loading zep config
2023-02-28T21:49:46+0530 - INFO - GxConfig.parse_yaml() failed with errors - [{'loc': ('xdatasources',), 'msg': 'field required', 'type': 'value_error.missing'}]
2023-02-28T21:49:46+0530 - INFO - GxConfig.parse_yaml() returning empty `xdatasources`
2023-02-28T21:49:46+0530 - INFO - Loading 'datasources' ->
{}
2023-02-28T21:49:46+0530 - INFO - Loaded 'datasources' ->
{}


Calculating Metrics:   0%|          | 0/1 [00:00<?, ?it/s]

| | employee_id | first_name | last_name | email | phone_number | hire_date | job_id | salary | manager_id | department_id | elt_ts | elt_by | file_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 198.0 | Donald | OConnell | ********* | ********* | 21-JUN-07 | SH_CLERK | 2600.0 | 124.0 | 50.0 | 2023-02-06 04:21:14.674 | local | employees.csv |
| 1 | 199.0 | Douglas | Grant | ********* | ********* | 13-JAN-08 | SH_CLERK | 2600.0 | 124.0 | 50.0 | 2023-02-06 04:21:14.674 | local | employees.csv |
| 2 | 200.0 | Jennifer | Whalen | ********* | ********* | 17-SEP-03 | AD_ASST | 4400.0 | 101.0 | 10.0 | 2023-02-06 04:21:14.674 | local | employees.csv |
| 3 | 201.0 | Michael | Hartstein | ********* | ********* | 17-FEB-04 | MK_MAN | 13000.0 | 100.0 | 20.0 | 2023-02-06 04:21:14.674 | local | employees.csv |
| 4 | 202.0 | Pat | Fay | ********* | ********* | 17-AUG-05 | MK_REP | 6000.0 | 201.0 | 20.0 | 2023-02-06 04:21:14.674 | local | employees.csv |
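
The batch request above is a plain dictionary that is unpacked into `BatchRequest`, which needs a datasource name, a data connector name, and a data asset name. A minimal sketch, assuming you want to catch a missing key before the context is queried; the helper `check_batch_request` is hypothetical and not part of Great Expectations:

```python
# Keys that BatchRequest needs in order to locate a data asset.
REQUIRED_KEYS = ("datasource_name", "data_connector_name", "data_asset_name")


def check_batch_request(batch_request: dict) -> dict:
    """Return the dict unchanged, or raise if a required key is missing or empty.

    Hypothetical helper for illustration; not a Great Expectations API.
    """
    missing = [key for key in REQUIRED_KEYS if not batch_request.get(key)]
    if missing:
        raise ValueError(f"batch_request is missing: {', '.join(missing)}")
    limit = batch_request.get("limit")
    if limit is not None and (not isinstance(limit, int) or limit <= 0):
        raise ValueError("limit must be a positive integer when given")
    return batch_request


batch_request = check_batch_request(
    {
        "datasource_name": "my_datasource_new",
        "data_connector_name": "default_inferred_data_connector_name",
        "data_asset_name": "my_schema.employees",
        "limit": 1000,
    }
)
```

The checked dictionary can then be passed to `BatchRequest(**batch_request)` exactly as in the cell above.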


# Run the UserConfigurableProfiler

The suites generated here are **not meant to be production suites** -- they are **a starting point to build upon**.

**To get to a production-grade suite, you will want to [edit this suite](https://docs.greatexpectations.io/en/latest/guides/how_to_guides/creating_and_editing_expectations/how_to_edit_an_expectation_suite_using_a_disposable_notebook.html?utm_source=notebook&utm_medium=profile_based_expectations) once this initial step has given you a starting point.**

The profiler is highly configurable depending on your goals.
You can ignore columns or exclude certain expectations, specify a threshold for creating value set expectations, or even specify semantic types for a given column.
You can find more information about [how to configure this profiler, including a list of the expectations that it uses, here.](https://docs.greatexpectations.io/en/latest/guides/how_to_guides/creating_and_editing_expectations/how_to_create_an_expectation_suite_with_the_user_configurable_profiler.html)
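
The options listed above can be sketched as keyword arguments for `UserConfigurableProfiler`. This is a sketch under assumptions, not the notebook's own code: the import path and parameter names (`ignored_columns`, `excluded_expectations`, `value_set_threshold`, `semantic_types_dict`) are those of the 0.x releases that shipped this profiler and may differ in your version. The helper names, the choice of columns, and the excluded expectation are illustrative.

```python
# Column names taken from the batch preview earlier in this notebook.
ALL_COLUMNS = [
    "employee_id", "first_name", "last_name", "email", "phone_number",
    "hire_date", "job_id", "salary", "manager_id", "department_id",
    "elt_ts", "elt_by", "file_name",
]

# Illustrative semantic type assignments; adjust to your data.
SEMANTIC_TYPES = {"numeric": ["salary"], "value_set": ["job_id"]}


def build_profiler_kwargs(columns_of_interest):
    """Build keyword arguments for UserConfigurableProfiler (assumed signature)."""
    unknown = [c for c in columns_of_interest if c not in ALL_COLUMNS]
    if unknown:
        raise ValueError(f"unknown columns: {unknown}")
    # Keep semantic types only for columns that are actually profiled.
    semantic = {
        kind: [c for c in cols if c in columns_of_interest]
        for kind, cols in SEMANTIC_TYPES.items()
    }
    return {
        # Every column not listed as interesting is skipped.
        "ignored_columns": [c for c in ALL_COLUMNS if c not in columns_of_interest],
        # Expectation types to leave out of the generated suite.
        "excluded_expectations": ["expect_column_quantile_values_to_be_between"],
        # Cardinality threshold for creating value-set expectations.
        "value_set_threshold": "MANY",
        "semantic_types_dict": {k: v for k, v in semantic.items() if v},
    }


def profile(validator, columns_of_interest):
    """Run the profiler against an existing validator and return the suite."""
    # Imported lazily so build_profiler_kwargs works without the library installed.
    from great_expectations.profile.user_configurable_profiler import (
        UserConfigurableProfiler,
    )

    profiler = UserConfigurableProfiler(
        profile_dataset=validator, **build_profiler_kwargs(columns_of_interest)
    )
    return profiler.build_suite()


kwargs = build_profiler_kwargs(["job_id", "salary", "department_id"])
```

With a validator in hand, `profile(validator, ["job_id", "salary", "department_id"])` would return a candidate suite that you can then review and edit.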



In [2]:
result = context.run_profiler_with_dynamic_arguments(
    name="rule_based_profiler",
    batch_request=batch_request,
)
validator.expectation_suite = result.get_expectation_suite(
    expectation_suite_name=expectation_suite_name
)

ProfilerNotFoundError: Non-existent Profiler configuration named "rule_based_profiler".

Details: Unable to retrieve object from TupleFilesystemStoreBackend with the following Key: /Users/saisupriya/Desktop/OBE/gx_tutorials/great_expectations/profilers/rule_based_profiler.yml
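
The error above says the profiler store found no `profilers/rule_based_profiler.yml` file under the project's `great_expectations` directory, so `run_profiler_with_dynamic_arguments` has no saved configuration to load under that name. A minimal sketch for checking which profiler configurations exist on disk before calling it, assuming a filesystem-backed store that keeps one `.yml` file per profiler in `profilers/` (as the path in the error suggests); both helper names are hypothetical:

```python
from pathlib import Path


def list_profiler_configs(project_root):
    """Return names of profiler YAML files under <project_root>/profilers."""
    profilers_dir = Path(project_root) / "profilers"
    if not profilers_dir.is_dir():
        return []
    return sorted(path.stem for path in profilers_dir.glob("*.yml"))


def require_profiler(project_root, name):
    """Raise early, listing the available names, if `name` has no config file."""
    available = list_profiler_configs(project_root)
    if name not in available:
        raise FileNotFoundError(
            f"No profiler named {name!r}; available: {available or 'none'}"
        )
    return name
```

For example, `require_profiler("great_expectations", "rule_based_profiler")` would fail with a list of the names that do exist, which makes a misspelled or never-saved profiler easier to spot.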

# Save & review your new Expectation Suite

Let's save the draft expectation suite as a JSON file in the
`great_expectations/expectations` directory of your project and rebuild the
Data Docs site to make it easy to review your new suite.

In [None]:
print(validator.get_expectation_suite(discard_failed_expectations=False))
validator.save_expectation_suite(discard_failed_expectations=False)

checkpoint_config = {
    "class_name": "SimpleCheckpoint",
    "validations": [
        {
            "batch_request": batch_request,
            "expectation_suite_name": expectation_suite_name,
        }
    ],
}
checkpoint = SimpleCheckpoint(
    f"{validator.active_batch_definition.data_asset_name}_{expectation_suite_name}",
    context,
    **checkpoint_config,
)
checkpoint_result = checkpoint.run()

context.build_data_docs()

validation_result_identifier = checkpoint_result.list_validation_result_identifiers()[0]
context.open_data_docs(resource_identifier=validation_result_identifier)
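
Besides Data Docs, the saved JSON file can be inspected directly. A minimal sketch, assuming the suite is stored as `<suite name>.json` with a top-level `expectations` list whose entries carry an `expectation_type` field; the helper name `summarize_suite` is hypothetical:

```python
import json
from collections import Counter
from pathlib import Path


def summarize_suite(suite_path):
    """Count expectations per type in a saved Expectation Suite JSON file."""
    suite = json.loads(Path(suite_path).read_text())
    return Counter(exp["expectation_type"] for exp in suite.get("expectations", []))
```

Calling `summarize_suite("great_expectations/expectations/expectation_test.json")` would show at a glance which expectation types the profiler generated and how many of each, a quick way to decide where to start editing.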

## Next steps
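After you review this initial Expectation Suite in Data Docs, you
should edit it to make finer-grained adjustments to the expectations.
This can be done by running `great_expectations suite edit profiling_test_snowflake`.
The code above saves the suite under the name `expectation_test`, so pass that name instead if it is the suite you want to edit.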