# Author Expectations

Watch a [short tutorial video](https://docs.greatexpectations.io/en/latest/getting_started/create_expectations.html?utm_source=notebook&utm_medium=create_expectations#video) or read [the written tutorial](https://docs.greatexpectations.io/en/latest/getting_started/create_expectations.html?utm_source=notebook&utm_medium=create_expectations).

We'd love it if you **reach out for help on** the [**Great Expectations Slack Channel**](https://greatexpectations.io/slack).

In [None]:
import json
import os
import great_expectations as ge
import great_expectations.jupyter_ux
import pandas as pd

## 1. Get a DataContext.
This represents the project that you just created using `great_expectations init`. [Read more in the tutorial](https://docs.greatexpectations.io/en/latest/getting_started/create_expectations.html?utm_source=notebook&utm_medium=create_expectations#get-datacontext-object).

In [None]:
context = ge.data_context.DataContext()

## 2. List the tables in your database.

The `DataContext` will now introspect your database (a `Datasource`) and list the tables. [Read more in the tutorial](https://docs.greatexpectations.io/en/latest/getting_started/create_expectations.html?utm_source=notebook&utm_medium=create_expectations#data-assets).

In [None]:
great_expectations.jupyter_ux.list_available_data_asset_names(context)

## 3. Pick a table and set the expectation suite name

Internally, Great Expectations represents tables as `DataAsset`s and uses this notion to link them to `Expectation Suites`. To learn more about `DataAsset`s and how their names are built, see [the reference](https://docs.greatexpectations.io/en/latest/reference/data_context_reference.html#data-asset-names).

In [None]:
table_name = "YOUR_TABLE"  # TODO: replace with your value!
data_asset_name = context.normalize_data_asset_name(table_name)

We recommend naming your first expectation suite for a table `warning`. Later, as you identify some of the expectations that you add to this suite as critical, you can move these expectations into another suite and call it `failure`.

In [None]:
expectation_suite_name = "warning" # TODO: replace with your value!
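The `warning` / `failure` split described above can be pictured with plain dictionaries. The sketch below is not a Great Expectations API: `promote_expectations` is a hypothetical helper, and it assumes a suite can be represented as a dict with an `expectation_suite_name` and an `expectations` list whose entries carry `expectation_type` and `kwargs` keys.

```python
import copy


def promote_expectations(warning_suite, critical_types):
    """Split a suite dict into (remaining warning suite, new failure suite).

    Hypothetical helper working on plain dicts; the dict layout is an
    assumption made for illustration, not a documented file format.
    """
    critical, remaining = [], []
    for expectation in warning_suite["expectations"]:
        # Route each expectation by its type; copy so the input is untouched.
        target = critical if expectation["expectation_type"] in critical_types else remaining
        target.append(copy.deepcopy(expectation))
    new_warning = dict(warning_suite, expectations=remaining)
    failure = dict(warning_suite, expectation_suite_name="failure", expectations=critical)
    return new_warning, failure


warning_suite = {
    "expectation_suite_name": "warning",
    "expectations": [
        {"expectation_type": "expect_column_values_to_not_be_null",
         "kwargs": {"column": "id"}},
        {"expectation_type": "expect_column_values_to_be_between",
         "kwargs": {"column": "id", "min_value": 1, "max_value": 100}},
    ],
}

# Treat the not-null check as critical and move it to the failure suite.
new_warning, failure = promote_expectations(
    warning_suite, {"expect_column_values_to_not_be_null"}
)
```

In a real project you would more likely create the `failure` suite through the `DataContext` and re-add the critical expectations to it; the dictionaries here only illustrate the idea of moving expectations between two suites.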

## 4. Create a new empty expectation suite

In [None]:
context.create_expectation_suite(data_asset_name=data_asset_name, expectation_suite_name=expectation_suite_name)

## 5. Load a batch of data you want to use to create `Expectations`

To learn more about `get_batch` with other data types (such as CSV files, pandas, or Spark), see [this tutorial](https://docs.greatexpectations.io/en/latest/getting_started/create_expectations.html?utm_source=notebook&utm_medium=create_expectations#get-batch).

In [None]:
# If you would like to load an entire table or view:
batch_kwargs = {'table': table_name}

# If you would like to load an entire table or view and your database uses schemas,
# uncomment the line below (left active, it would silently override the choice above):
# batch_kwargs = {'table': table_name, 'schema': 'YOUR_SCHEMA'}

# If you would like to load data using a query to construct a temporary table:
# batch_kwargs = {'query': 'SELECT YOUR_ROWS FROM YOUR_TABLE'}
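Because only one of these `batch_kwargs` shapes should be active at a time, it can help to build the dictionary through a small function. The sketch below uses only the `table`, `schema`, and `query` keys shown in the cell above; `make_batch_kwargs` is a hypothetical helper, not part of Great Expectations.

```python
def make_batch_kwargs(table=None, schema=None, query=None):
    """Build one of the three batch_kwargs shapes used in this notebook.

    Hypothetical helper: exactly one of `table` or `query` must be given,
    and `schema` is only meaningful together with `table`.
    """
    if (table is None) == (query is None):
        raise ValueError("Pass exactly one of `table` or `query`.")
    if query is not None:
        if schema is not None:
            raise ValueError("`schema` only applies together with `table`.")
        return {"query": query}
    batch_kwargs = {"table": table}
    if schema is not None:
        batch_kwargs["schema"] = schema
    return batch_kwargs


# The three variants from the cell above, using placeholder names:
whole_table = make_batch_kwargs(table="YOUR_TABLE")
table_in_schema = make_batch_kwargs(table="YOUR_TABLE", schema="YOUR_SCHEMA")
from_query = make_batch_kwargs(query="SELECT YOUR_ROWS FROM YOUR_TABLE")
```

Raising an error on conflicting arguments makes an accidental override visible instead of letting the last assignment win.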

Load a batch of data and take a peek at the first few rows.

In [None]:
batch = context.get_batch(data_asset_name, expectation_suite_name, batch_kwargs)
batch.head()

#### Optionally, customize and review batch options

`BatchKwargs` are extremely flexible. To learn more, [read the tutorial](https://docs.greatexpectations.io/en/latest/getting_started/create_expectations.html?utm_source=notebook&utm_medium=create_expectations#reader-options).

Here are the batch kwargs used to load your batch:

In [None]:
batch.batch_kwargs

In [None]:
# The datasource can add and store additional identifying information to ensure you can track a batch through
# your pipeline
batch.batch_id

## 6. Author Expectations

With a batch, you can add expectations by calling specific expectation methods. They all begin with `.expect_`, which makes autocompleting easy.

See available expectations in the [expectation glossary](https://docs.greatexpectations.io/en/latest/glossary.html?utm_source=notebook&utm_medium=create_expectations).
You can also see available expectations by hovering over data elements in the HTML page generated by profiling your dataset.

Below is an example expectation that checks that none of the values in the batch's first column are null.

[Read more in the tutorial](https://docs.greatexpectations.io/en/latest/getting_started/create_expectations.html?utm_source=notebook&utm_medium=create_expectations#create-expectations)

In [None]:
column_name = batch.get_table_columns()[0]
batch.expect_column_values_to_not_be_null(column_name)

Add more expectations here. **Hint:** start with `batch.expect_` and hit tab for Jupyter's autocomplete to see all the expectations!

In [None]:
# TODO: replace 'id' and the bounds with a column and range that fit your table.
batch.expect_column_values_to_be_between('id', min_value=10000000000, max_value=9999999999999999999)
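To make the two expectations above more concrete, here is a minimal pure-Python sketch of the kind of check each one performs. It is not the library's implementation: `values_not_null` and `values_between` are hypothetical functions, the returned dictionary is a simplified stand-in for a real validation result, and skipping missing values in the range check is a choice made for this sketch.

```python
def values_not_null(values):
    """Succeed only if no value is missing (None)."""
    unexpected = [v for v in values if v is None]
    return {"success": not unexpected, "unexpected_count": len(unexpected)}


def values_between(values, min_value, max_value):
    """Succeed only if every non-missing value lies in [min_value, max_value]."""
    non_null = [v for v in values if v is not None]
    unexpected = [v for v in non_null if not (min_value <= v <= max_value)]
    return {"success": not unexpected, "unexpected_count": len(unexpected)}


# A toy "id" column with one missing value and one value below the range.
ids = [10000000001, 10000000002, None, 5]

not_null_result = values_not_null(ids)
between_result = values_between(ids, min_value=10000000000, max_value=9999999999999999999)
```

Both checks fail on this toy column: one value is missing, and `5` falls below the lower bound.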

## 7. Review and save your Expectations

Expectations that are `True` on this data batch are added automatically. To view all the expectations you added so far about this data asset, run the cell below.

In [None]:
batch.get_expectation_suite()

    
    
If you decide not to save some of the expectations that you created, use the [`remove_expectation` method](https://docs.greatexpectations.io/en/latest/module_docs/data_asset_module.html?highlight=remove_expectation&utm_source=notebook&utm_medium=create_expectations#great_expectations.data_asset.data_asset.DataAsset.remove_expectation). You can also choose to keep expectations that were `False` on this batch instead of filtering them out.


The following method will save the expectation suite as a JSON file in the `great_expectations/expectations` directory of your project:
    

In [None]:
batch.save_expectation_suite()
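Once a suite has been saved, the JSON files can be inspected with the standard library alone. The sketch below walks a directory and lists the expectation types found in each `.json` file. `summarize_saved_suites` is a hypothetical helper, and it assumes each file holds an `expectations` list whose entries have an `expectation_type` key; how the files are arranged in subdirectories is not assumed.

```python
import json
import os


def summarize_saved_suites(expectations_dir):
    """Map each JSON file under expectations_dir to its expectation types.

    Hypothetical helper; the JSON keys read here are assumptions about
    the saved suite layout. Returns an empty dict if nothing is found.
    """
    summary = {}
    for root, _dirs, files in os.walk(expectations_dir):
        for filename in sorted(files):
            if not filename.endswith(".json"):
                continue
            path = os.path.join(root, filename)
            with open(path) as f:
                suite = json.load(f)
            # Key by the path relative to the expectations directory.
            summary[os.path.relpath(path, expectations_dir)] = [
                e.get("expectation_type") for e in suite.get("expectations", [])
            ]
    return summary
```

Run from the project root, `summarize_saved_suites("great_expectations/expectations")` would give a quick overview of what was just saved.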

## 8. View the Expectations in Data Docs

Let's now build and look at your Data Docs. These will now include an **Expectation Suite Overview** built from the expectations you just created that helps you communicate about your data with both machines and humans.

In [None]:
context.build_data_docs()
context.open_data_docs()

## Congratulations! You created and saved expectations for at least one of your data assets.

## Next steps:

### 1. Play with Validation

Validation is the process of checking whether new batches of this data meet your expectations before they are processed by your pipeline. Go to [validation_playground.ipynb](validation_playground.ipynb) to see how!


### 2. Explore the documentation & community

You are now among the elite data professionals who know how to build robust descriptions of your data and protections for pipelines and machine learning models. Join the [**Great Expectations Slack Channel**](https://greatexpectations.io/slack) to see how others are wielding these superpowers.