# Great Expectations Task

## 1. Install Great Expectations Library


In [None]:
!pip install great_expectations

## 2. Import Necessary Libraries

In [None]:
import pandas as pd
import great_expectations as gx

## 3. Load Labels.csv

Download [Labels.csv](https://github.com/zubxxr/SOFE3980U-Lab5/blob/main/Labels.csv), upload it to this notebook's working directory, and then load it into a pandas DataFrame.

In [None]:
# Code to load the dataset
df = pd.read_csv('Labels.csv')

## 4. Preview the Dataset

In [None]:
df.head()
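
Before choosing expectations, it can help to look at the summary statistics they will be checked against. The sketch below uses a small made-up DataFrame in place of Labels.csv; the values are invented for illustration, and only the column name `Car1_Location_X` comes from this notebook. In the notebook itself, the same calls could be run on `df`.

```python
import pandas as pd

# Stand-in data: these values are invented for illustration and are
# NOT read from Labels.csv.
sample_df = pd.DataFrame({"Car1_Location_X": [-50, -55, -60, -55, -50]})

column = sample_df["Car1_Location_X"]

# Statistics that map onto common expectation types
distinct_values = sorted(column.unique().tolist())  # for distinct-value checks
column_max = column.max()                           # for max-range checks
column_mean = column.mean()                         # for mean-range checks

print("distinct:", distinct_values)
print("max:", column_max)
print("mean:", column_mean)
```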

## 5. Set Up Great Expectations Context and Data Source

In [None]:
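# Create the Great Expectations context (GX 1.x API) and register the data source
context = gx.get_context()
data_source = context.data_sources.add_pandas(name="pandas")
data_asset = data_source.add_dataframe_asset(name="labels")

## 6. Define and Create a Data Batch

In [None]:
# Create a batch that wraps the whole DataFrame
batch_definition = data_asset.add_batch_definition_whole_dataframe("labels_batch")
batch = batch_definition.get_batch(batch_parameters={"dataframe": df})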

## 7. Define Three Expectations for Column Values

Choose three expectation functions from the [expectations gallery](https://greatexpectations.io/expectations/) and apply them to the labels dataset in a way that is meaningful for the data.

Replace the `ExpectColumnValuesToBeBetween` function with the functions you select from the gallery.

To check the format and parameters each function requires, click "See more" on its entry.
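
To make the semantics concrete, here is a plain-Python sketch of what the three expectation types used in this notebook check. It is not how Great Expectations implements them: the helper names are hypothetical, the sample values are invented, and the sketch assumes inclusive bounds and ignores null handling.

```python
from statistics import mean

# Hypothetical helpers that mirror the idea behind each expectation.
def distinct_values_in_set(values, value_set):
    """True if every distinct value in the column is in value_set."""
    return set(values) <= set(value_set)

def max_between(values, min_value, max_value):
    """True if the column maximum lies in [min_value, max_value]."""
    return min_value <= max(values) <= max_value

def mean_between(values, min_value, max_value):
    """True if the column mean lies in [min_value, max_value]."""
    return min_value <= mean(values) <= max_value

# Invented sample values, NOT taken from Labels.csv
sample_values = [-50, -55, -60, -55, -50]

checks = {
    "distinct values in set": distinct_values_in_set(sample_values, [-50, -55, -60]),
    "max between": max_between(sample_values, -60, -50),
    "mean between": mean_between(sample_values, -55, -50),
}

for name, passed in checks.items():
    print(f"{name}: {passed}")
```

An expectation is only informative if it could fail on bad data, so pick parameters that reflect what the column should contain rather than values that trivially pass.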

### Expectation 1

In [None]:
## Expectation 1
expectation_1 = gx.expectations.ExpectColumnDistinctValuesToBeInSet(
    column="Car1_Location_X", value_set=[-50, -55, -60]
)

## Validate data against Expectation 1 (validate takes the expectation itself, not a list)
result_1 = batch.validate(expectation_1)
print(result_1)

### Expectation 2

In [None]:
## Expectation 2
expectation_2 = gx.expectations.ExpectColumnMaxToBeBetween(
    column="Car1_Location_X", min_value=-60, max_value=-50
)
## Validate data against Expectation 2 (validate takes the expectation itself, not a list)
result_2 = batch.validate(expectation_2)
print(result_2)

### Expectation 3

In [None]:
## Expectation 3
expectation_3 = gx.expectations.ExpectColumnMeanToBeBetween(
    column="Car1_Location_X", min_value=-55, max_value=-50
)
## Validate data against Expectation 3 (validate takes the expectation itself, not a list)
result_3 = batch.validate(expectation_3)
print(result_3)
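
Once all three validations have run, a short summary makes it easier to see which ones passed. The sketch below assumes each validation result exposes a boolean `success` attribute. The `summarize` helper is hypothetical, and stand-in objects replace the real results so the snippet runs on its own. In the notebook, the dictionary would hold `result_1`, `result_2`, and `result_3` instead.

```python
from types import SimpleNamespace

def summarize(results):
    """Return (name, passed) rows and an overall pass flag."""
    rows = [(name, bool(result.success)) for name, result in results.items()]
    return rows, all(passed for _, passed in rows)

# Stand-in results with invented outcomes; in the notebook use
# {"Expectation 1": result_1, "Expectation 2": result_2, "Expectation 3": result_3}
stand_in_results = {
    "Expectation 1": SimpleNamespace(success=True),
    "Expectation 2": SimpleNamespace(success=True),
    "Expectation 3": SimpleNamespace(success=False),
}

rows, all_passed = summarize(stand_in_results)
for name, passed in rows:
    print(f"{name}: {'PASS' if passed else 'FAIL'}")
print("all passed:", all_passed)
```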