# Try GX

<a href="https://colab.research.google.com/github/greatexpectationslabs/try-gx-notebook/blob/main/notebook/Try_GX.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>

## What is GX?


[Great Expectations (GX)](https://docs.greatexpectations.io/docs/home) is the leading platform for validating and documenting your data. GX is a framework that enables you to describe data using expressive tests and then validate that the data meets those criteria. [GX 1.0](https://docs.greatexpectations.io/docs/1.0-prerelease/core/introduction/) is the open source Python library that supports the Great Expectations platform.

Software developers have long known that automated testing is essential for managing complex codebases. GX brings the same discipline, confidence, and acceleration to data science and data engineering teams.

Start here to learn how to connect to sample data, build an Expectation, validate sample data, and review Validation Results. This is an ideal place to start if you're new to GX 1.0 and want to experiment with features and see what it offers.

## Install GX

First, install GX. 

In [None]:
!pip install --pre great_expectations
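
To confirm that the installation succeeded before continuing, you can query the installed version. This is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is hypothetical and is not part of GX.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "great_expectations"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


gx_version = installed_version()
if gx_version is None:
    print("great_expectations is not installed; rerun the pip command above.")
else:
    print(f"great_expectations {gx_version} is installed.")
```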

## Import GX

Import the `great_expectations` library and `expectations` module:
* The `great_expectations` module is the root of the GX library and contains shortcuts and convenience methods for starting a GX project in a Python session.
* The `expectations` module contains all the Expectation classes that are provided by the GX library.

In [None]:
import great_expectations as gx
import great_expectations.expectations as gxe

## Connect to sample data

Create a temporary Data Context and connect to sample data.

In Python, a Data Context is the entry point for interacting with many common GX objects.

Initialize a Data Context and then use it to read the contents of a .csv file into a Batch of sample data:

In [None]:
context = gx.get_context()

batch = context.data_sources.pandas_default.read_csv(
    "https://raw.githubusercontent.com/great-expectations/gx_tutorials/main/data/yellow_tripdata_sample_2019-01.csv"
)

In [None]:
batch.head()

You'll use this sample data to test your Expectations.

## Create an Expectation

Expectations are a fundamental component of GX. They allow you to explicitly define the state to which your data should conform.

The sample data you're using is taxi trip record data. With this data, you can make certain assumptions. For example, the passenger count shouldn't be zero because at least one passenger needs to be present. Additionally, a taxi can accommodate a maximum of six passengers.

Run the following code to define an Expectation that the contents of the column `passenger_count` consist of values ranging from `1` to `6`:

In [None]:
expectation = gxe.ExpectColumnValuesToBeBetween(
    column="passenger_count", min_value=1, max_value=6
)
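
Conceptually, this Expectation checks each value in the column against a range. The plain-Python sketch below illustrates that idea without GX. The function name `values_outside_range` and the sample values are made up for illustration, and treating both bounds as inclusive is an assumption about the Expectation's default behavior.

```python
def values_outside_range(values, min_value, max_value):
    """Return the values that fall outside the inclusive range [min_value, max_value]."""
    return [v for v in values if not (min_value <= v <= max_value)]


# Hypothetical passenger counts, not taken from the sample data.
sample_passenger_counts = [1, 2, 6, 3, 1, 5]

unexpected = values_outside_range(sample_passenger_counts, min_value=1, max_value=6)
success = len(unexpected) == 0

print(success)     # True: every value is between 1 and 6
print(unexpected)  # []
```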

## Validate data and view results

Run the following code to validate the sample data against your Expectation and view the results:

In [None]:
validation_result = batch.validate(expectation)

print(validation_result.describe())

The sample data conforms to the defined Expectation and the following Validation Results are returned:
```
{
    "expectation_type": "expect_column_values_to_be_between",
    "success": true,
    "kwargs": {
        "batch_id": "default_pandas_datasource-#ephemeral_pandas_asset",
        "column": "passenger_count",
        "min_value": 1.0,
        "max_value": 6.0
    },
    "result": {
        "element_count": 10000,
        "unexpected_count": 0,
        "unexpected_percent": 0.0,
        "partial_unexpected_list": [],
        "missing_count": 0,
        "missing_percent": 0.0,
        "unexpected_percent_total": 0.0,
        "unexpected_percent_nonmissing": 0.0,
        "partial_unexpected_counts": [],
        "partial_unexpected_index_list": []
    }
}
```
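
Because the output above is JSON text, you can also inspect it programmatically. The sketch below parses an abbreviated copy of that output with the standard `json` module. In a live session you would presumably parse the string returned by `validation_result.describe()` instead of the hard-coded `result_text`, which is used here only so the example runs on its own.

```python
import json

# Abbreviated copy of the Validation Results shown above.
result_text = """
{
    "expectation_type": "expect_column_values_to_be_between",
    "success": true,
    "result": {
        "element_count": 10000,
        "unexpected_count": 0,
        "unexpected_percent": 0.0
    }
}
"""

result = json.loads(result_text)

if result["success"]:
    print("Expectation met")
else:
    count = result["result"]["unexpected_count"]
    print(f"Expectation failed: {count} unexpected values")
```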

## Create an Expectation that fails when validated

Create an Expectation that will fail when validated against the provided data.

A failed Expectation tells you that either something is wrong with the data, such as missing or incorrect values, or your assumptions about the data are mistaken.

Run the following code to create an Expectation that fails because it assumes that a taxi can seat a maximum of three passengers:

In [None]:
failed_expectation = gxe.ExpectColumnValuesToBeBetween(
    column="passenger_count", min_value=1, max_value=3
)

failed_validation_result = batch.validate(failed_expectation)

print(failed_validation_result.describe())

When an Expectation fails, its Validation Results include metrics to help you assess the severity of the issue:
```
{
    "expectation_type": "expect_column_values_to_be_between",
    "success": false,
    "kwargs": {
        "batch_id": "default_pandas_datasource-#ephemeral_pandas_asset",
        "column": "passenger_count",
        "min_value": 1.0,
        "max_value": 3.0
    },
    "result": {
        "element_count": 10000,
        "unexpected_count": 853,
        "unexpected_percent": 8.53,
        "partial_unexpected_list": [
            4,
            4,
            4,
            4,
            4,
            4,
            4,
            4,
            4,
            4,
            4,
            4,
            4,
            4,
            4,
            4,
            4,
            4,
            4,
            4
        ],
        "missing_count": 0,
        "missing_percent": 0.0,
        "unexpected_percent_total": 8.53,
        "unexpected_percent_nonmissing": 8.53,
        "partial_unexpected_counts": [
            {
                "value": 4,
                "count": 20
            }
        ],
        "partial_unexpected_index_list": [
            9147,
            9148,
            9149,
            9150,
            9151,
            9152,
            9153,
            9154,
            9155,
            9156,
            9157,
            9158,
            9159,
            9160,
            9161,
            9162,
            9163,
            9164,
            9165,
            9166
        ]
    }
}
```
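
The percentage metrics follow from the counts in the same result. The sketch below assumes the percentages are simple ratios of the counts. That assumption matches the numbers above, but it is not confirmed here as GX's exact implementation.

```python
# Counts copied from the failed Validation Results above.
element_count = 10000
unexpected_count = 853
missing_count = 0

# Share of all rows that violated the Expectation.
unexpected_percent_total = unexpected_count / element_count * 100

# Share of non-missing rows that violated the Expectation.
nonmissing_count = element_count - missing_count
unexpected_percent_nonmissing = unexpected_count / nonmissing_count * 100

print(round(unexpected_percent_total, 2))       # 8.53
print(round(unexpected_percent_nonmissing, 2))  # 8.53
```

The two percentages coincide here because no values are missing. Note that `partial_unexpected_list` contains only 20 of the 853 unexpected values, so treat it as a sample rather than the full set of failures.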

## Next steps

* Go to the [Expectations Gallery](https://greatexpectations.io/expectations) and experiment with other Expectations.

* [Check out GX Cloud](https://greatexpectations.io/cloud), our SaaS platform. It's now in public preview! [Sign up here](https://greatexpectations.io/cloud) and you could be validating your data in minutes. We also offer regular GX Cloud workshops: [click here to get more information and register](https://pages.greatexpectations.io/gx-cloud-workshops).

* To learn more about GX 1.0, see [Community resources](https://docs.greatexpectations.io/docs/1.0-prerelease/core/introduction/community_resources).

* If you're ready to start using GX 1.0 with your own data, the [Set up a GX environment](https://docs.greatexpectations.io/docs/1.0-prerelease/core/installation_and_setup/install_gx) documentation provides a more comprehensive guide to setting up GX to work with specific data formats and environments.