# Great Expectations Task

## 1. Install Great Expectations Library


In [None]:
!pip show pandas
!pip show great_expectations

Name: pandas
Version: 2.1.4
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: https://pandas.pydata.org
Author: 
Author-email: The Pandas Development Team <pandas-dev@python.org>
License: BSD 3-Clause License


## 2. Import Necessary Libraries

In [None]:
import pandas as pd
import great_expectations as gx

## 3. Load Labels.csv

Download [Labels.csv](https://github.com/zubxxr/SOFE3980U-Lab5/blob/main/Labels.csv), upload it to this notebook's working directory, and then load it into a pandas DataFrame.

In [None]:
import pandas as pd

# Load the uploaded file into a pandas DataFrame
df = pd.read_csv("Labels.csv")

# Preview the first few rows of the dataset
df.head()


Unnamed: 0,Timestamp,Car1_Location_X,Car1_Location_Y,Car1_Location_Z,Car2_Location_X,Car2_Location_Y,Car2_Location_Z,Occluded_Image_view,Occluding_Car_view,Ground_Truth_View,pedestrianLocationX_TopLeft,pedestrianLocationY_TopLeft,pedestrianLocationX_BottomRight,pedestrianLocationY_BottomRight
0,1736796157,-51.402977,143,0.596902,-59.32027,140,0.596902,A_001.png,B_001.png,C_001.png,593,361,610,410
1,1736796167,-53.819637,143,0.596902,-59.196568,140,0.596902,A_002.png,B_002.png,C_002.png,579,368,594,415
2,1736796178,-50.239144,143,0.596902,-56.744479,140,0.596902,A_003.png,B_003.png,C_003.png,854,720,854,720
3,1736796188,-53.70722,143,0.596902,-57.30938,140,0.596902,A_004.png,B_004.png,C_004.png,549,368,567,425
4,1736796198,-52.053721,143,0.596902,-59.545897,140,0.596902,A_005.png,B_005.png,C_005.png,524,368,537,413


## 4. Preview the Dataset

In [None]:
df.head()

Unnamed: 0,Timestamp,Car1_Location_X,Car1_Location_Y,Car1_Location_Z,Car2_Location_X,Car2_Location_Y,Car2_Location_Z,Occluded_Image_view,Occluding_Car_view,Ground_Truth_View,pedestrianLocationX_TopLeft,pedestrianLocationY_TopLeft,pedestrianLocationX_BottomRight,pedestrianLocationY_BottomRight
0,1736796157,-51.402977,143,0.596902,-59.32027,140,0.596902,A_001.png,B_001.png,C_001.png,593,361,610,410
1,1736796167,-53.819637,143,0.596902,-59.196568,140,0.596902,A_002.png,B_002.png,C_002.png,579,368,594,415
2,1736796178,-50.239144,143,0.596902,-56.744479,140,0.596902,A_003.png,B_003.png,C_003.png,854,720,854,720
3,1736796188,-53.70722,143,0.596902,-57.30938,140,0.596902,A_004.png,B_004.png,C_004.png,549,368,567,425
4,1736796198,-52.053721,143,0.596902,-59.545897,140,0.596902,A_005.png,B_005.png,C_005.png,524,368,537,413


## 5. Set Up Great Expectations Context and Data Source

In [None]:
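# Assumes the fluent data source API of Great Expectations 0.17/0.18;
# method names differ in other releases.
context = gx.get_context()

# Register an in-memory pandas data source and a DataFrame asset (names are arbitrary)
datasource = context.sources.add_pandas(name="labels_datasource")
data_asset = datasource.add_dataframe_asset(name="labels")

## 6. Define and Create a Data Batch

In [None]:
# Build a batch request for the DataFrame loaded in step 3
batch_request = data_asset.build_batch_request(dataframe=df)

# Create (or reuse) a suite to hold the expectations, then get a validator for the batch
context.add_or_update_expectation_suite(expectation_suite_name="labels_suite")
validator = context.get_validator(
    batch_request=batch_request,
    expectation_suite_name="labels_suite",
)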

## 7. Define Three Expectations for Column Values

From the expectations [gallery](https://greatexpectations.io/expectations/), choose three expectations and apply each one to a column of the labels dataset where the check is meaningful.

Replace the `ExpectColumnValuesToBeBetween` example with the expectations you select from the gallery.

Click "See more" on an expectation to check the parameters it requires and their format.
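
The gallery lists expectations by CamelCase class name (for example `ExpectColumnValuesToBeBetween`), while the validator methods called in this notebook are snake_case (`expect_column_values_to_be_between`). A minimal sketch of that naming convention follows; `gallery_name_to_method` is a hypothetical helper written for illustration and is not part of Great Expectations.

```python
import re


def gallery_name_to_method(class_name: str) -> str:
    """Convert a CamelCase gallery name to the snake_case method name."""
    # Insert an underscore before every capital letter except the first, then lowercase.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


method_name = gallery_name_to_method("ExpectColumnValuesToBeBetween")
print(method_name)
```

The same conversion applied to `ExpectColumnValuesToNotBeNull` gives the method name used for Expectation 1.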

In [None]:
validation_results = validator.validate()
print(validation_results)

{
  "success": false,
  "results": [
    {
      "success": true,
      "expectation_config": {
        "expectation_type": "expect_column_values_to_be_between",
        "kwargs": {
          "column": "Car1_Location_X",
          "min_value": -1000,
          "max_value": 1000,
          "batch_id": "7f4b8dd2c2e7d06226fc95fcc6c2fa7d"
        },
        "meta": {}
      },
      "result": {
        "element_count": 121,
        "unexpected_count": 0,
        "unexpected_percent": 0.0,
        "partial_unexpected_list": [],
        "missing_count": 0,
        "missing_percent": 0.0,
        "unexpected_percent_total": 0.0,
        "unexpected_percent_nonmissing": 0.0
      },
      "meta": {},
      "exception_info": {
        "raised_exception": false,
        "exception_traceback": null,
        "exception_message": null
      }
    },
    {
      "success": true,
      "expectation_config": {
        "expectation_type": "expect_column_values_to_not_be_null",
        "kwargs": {
     

### Expectation 1

In [None]:
# Ensure "Ground_Truth_View" column has no missing values
expectation_1 = validator.expect_column_values_to_not_be_null(
    column="Ground_Truth_View"
)




### Validate Data Against Expectation 1
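
`validator.validate()` evaluates every expectation registered on the validator so far, so the top-level `"success"` can be `false` even when the individual entries that are visible report `true`. A minimal sketch of that aggregation in plain Python follows. It assumes the top-level flag is the logical AND of the per-expectation flags; the third entry is a hypothetical failing result added for illustration, because the printed output in this notebook is truncated.

```python
# Per-expectation outcomes, mimicking the "results" list in the validation output.
# The first two mirror entries visible in the output; the third is hypothetical.
results = [
    {"expectation_type": "expect_column_values_to_be_between", "success": True},
    {"expectation_type": "expect_column_values_to_not_be_null", "success": True},
    {"expectation_type": "hypothetical_failing_expectation", "success": False},
]

# Assumed aggregation: the suite passes only if every expectation passes.
overall_success = all(r["success"] for r in results)
failed = [r["expectation_type"] for r in results if not r["success"]]

print(overall_success)
print(failed)
```

Listing the failed entries this way is usually quicker than reading the full printed result.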

In [None]:
# Run validation; validate() evaluates every expectation registered so far, not only Expectation 1
result_1 = validator.validate()
print("Validation Result for Expectation 1:", result_1)

Validation Result for Expectation 1: {
  "success": false,
  "results": [
    {
      "success": true,
      "expectation_config": {
        "expectation_type": "expect_column_values_to_be_between",
        "kwargs": {
          "column": "Car1_Location_X",
          "min_value": -1000,
          "max_value": 1000,
          "batch_id": "7f4b8dd2c2e7d06226fc95fcc6c2fa7d"
        },
        "meta": {}
      },
      "result": {
        "element_count": 121,
        "unexpected_count": 0,
        "unexpected_percent": 0.0,
        "partial_unexpected_list": [],
        "missing_count": 0,
        "missing_percent": 0.0,
        "unexpected_percent_total": 0.0,
        "unexpected_percent_nonmissing": 0.0
      },
      "meta": {},
      "exception_info": {
        "raised_exception": false,
        "exception_traceback": null,
        "exception_message": null
      }
    },
    {
      "success": true,
      "expectation_config": {
        "expectation_type": "expect_column_values_to_no

### Expectation 2

In [None]:
# Validate that "Car1_Location_X" falls within a range
expectation_2 = validator.expect_column_values_to_be_between(
    column="Car1_Location_X",
    min_value=-1000,
    max_value=1000
)
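
To make the result fields concrete, here is a plain-Python sketch of the counting behind a between check, applied to the five `Car1_Location_X` values from the preview. It assumes inclusive bounds and no missing values; `check_between` is a hypothetical helper, not the library's implementation, and the tighter range is chosen only for illustration.

```python
def check_between(values, min_value, max_value):
    """Count values outside [min_value, max_value] (bounds assumed inclusive)."""
    unexpected = [v for v in values if not (min_value <= v <= max_value)]
    return {
        "success": len(unexpected) == 0,
        "element_count": len(values),
        "unexpected_count": len(unexpected),
        "unexpected_percent": 100.0 * len(unexpected) / len(values),
        "partial_unexpected_list": unexpected,
    }


# Car1_Location_X values from the five preview rows
car1_x = [-51.402977, -53.819637, -50.239144, -53.70722, -52.053721]

wide = check_between(car1_x, min_value=-1000, max_value=1000)
narrow = check_between(car1_x, min_value=-53, max_value=-50)  # hypothetical tighter range

print(wide["success"], wide["unexpected_count"])
print(narrow["success"], narrow["partial_unexpected_list"])
```

A range as wide as ±1000 passes on these rows without constraining them much; a range closer to the observed values flags more rows, so the bounds should reflect what is physically plausible for the scene.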




### Validate Data Against Expectation 2

In [None]:
# Run validation; validate() evaluates every expectation registered so far, not only Expectation 2
result_2 = validator.validate()
print("Validation Result for Expectation 2:", result_2)

Validation Result for Expectation 2: {
  "success": false,
  "results": [
    {
      "success": true,
      "expectation_config": {
        "expectation_type": "expect_column_values_to_be_between",
        "kwargs": {
          "column": "Car1_Location_X",
          "min_value": -1000,
          "max_value": 1000,
          "batch_id": "7f4b8dd2c2e7d06226fc95fcc6c2fa7d"
        },
        "meta": {}
      },
      "result": {
        "element_count": 121,
        "unexpected_count": 0,
        "unexpected_percent": 0.0,
        "partial_unexpected_list": [],
        "missing_count": 0,
        "missing_percent": 0.0,
        "unexpected_percent_total": 0.0,
        "unexpected_percent_nonmissing": 0.0
      },
      "meta": {},
      "exception_info": {
        "raised_exception": false,
        "exception_traceback": null,
        "exception_message": null
      }
    },
    {
      "success": true,
      "expectation_config": {
        "expectation_type": "expect_column_values_to_no

### Expectation 3

In [None]:
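# Check that the distinct values of "Occluded_Image_view" fall within an allowed set.
# Note: the preview shows image file names (e.g. A_001.png) in this column, so
# ["yes", "no"] does not match the previewed rows; adjust value_set to the real labels.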
expectation_3 = validator.expect_column_distinct_values_to_be_in_set(
    column="Occluded_Image_view",
    value_set=["yes", "no"]
)
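
The preview shows that `Occluded_Image_view` holds image file names such as `A_001.png`, so a set check against `["yes", "no"]` cannot pass on those rows. A plain-Python sketch of both a set check and a pattern check on the five preview values follows. The pattern `A_\d{3}\.png` is an assumption inferred from the preview only; a pattern-based expectation such as `expect_column_values_to_match_regex` from the gallery may be a closer fit for this column.

```python
import re

# Occluded_Image_view values from the five preview rows
occluded_views = ["A_001.png", "A_002.png", "A_003.png", "A_004.png", "A_005.png"]

# Set-membership check, as in Expectation 3: distinct values must all be in the allowed set.
allowed = {"yes", "no"}
outside_set = sorted(set(occluded_views) - allowed)
set_check_passes = not outside_set

# Pattern check: every value must fully match the assumed file-name pattern.
pattern = re.compile(r"A_\d{3}\.png")
non_matching = [v for v in occluded_views if not pattern.fullmatch(v)]
regex_check_passes = not non_matching

print(set_check_passes, outside_set)
print(regex_check_passes, non_matching)
```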




### Validate Data Against Expectation 3

In [None]:
# Run validation; validate() evaluates every expectation registered so far, not only Expectation 3
result_3 = validator.validate()
print("Validation Result for Expectation 3:", result_3)

Validation Result for Expectation 3: {
  "success": false,
  "results": [
    {
      "success": true,
      "expectation_config": {
        "expectation_type": "expect_column_values_to_be_between",
        "kwargs": {
          "column": "Car1_Location_X",
          "min_value": -1000,
          "max_value": 1000,
          "batch_id": "7f4b8dd2c2e7d06226fc95fcc6c2fa7d"
        },
        "meta": {}
      },
      "result": {
        "element_count": 121,
        "unexpected_count": 0,
        "unexpected_percent": 0.0,
        "partial_unexpected_list": [],
        "missing_count": 0,
        "missing_percent": 0.0,
        "unexpected_percent_total": 0.0,
        "unexpected_percent_nonmissing": 0.0
      },
      "meta": {},
      "exception_info": {
        "raised_exception": false,
        "exception_traceback": null,
        "exception_message": null
      }
    },
    {
      "success": true,
      "expectation_config": {
        "expectation_type": "expect_column_values_to_no