## 1. Initialize the project

Create the working context: a data download project, if one does not already exist. A project is a container for the code, data, and management of the data operations.

In [None]:
import digitalhub as dh
PROJECT_NAME = "<YOUR_PROJECT_NAME>"
proj = dh.get_or_create_project(PROJECT_NAME)

Note: Make sure to replace <YOUR_PROJECT_NAME> with the actual name of your project before running the code.

## 2. Data source setup

We'll create a data item that points to the world cities database, which contains the following columns:

- name: Name of the city
- country: Country where the city is located
- subcountry: Subdivision (state/province/region) of the country
- geonameid: Unique identifier for the city in the GeoNames database
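
As a rough illustration of this schema, the sketch below parses a couple of rows into typed records. It uses only the standard library; the `City` dataclass, the `parse_cities` helper, and the inline sample rows are assumptions made for the example and are not part of digitalhub or the dataset itself.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class City:
    name: str
    country: str
    subcountry: str
    geonameid: int


def parse_cities(csv_text: str) -> list[City]:
    """Parse CSV text with the four columns listed above into City records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        City(
            name=row["name"],
            country=row["country"],
            subcountry=row["subcountry"],
            geonameid=int(row["geonameid"]),  # the identifier is numeric
        )
        for row in reader
    ]


# Illustrative sample in the same column layout (values are made up for the sketch).
SAMPLE = """name,country,subcountry,geonameid
Exampleville,Exampleland,North Province,1001
Sampletown,Exampleland,South Province,1002
"""

cities = parse_cities(SAMPLE)
print(cities[0])
```

Converting `geonameid` to `int` at parse time is a design choice for the sketch: a malformed identifier then fails early instead of surfacing later in the pipeline.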

In [None]:
URL = "https://raw.githubusercontent.com/datasets/world-cities/refs/heads/main/data/world-cities.csv"
di = proj.new_dataitem(name="world-cities", kind="table", path=URL)

## 3. Execution

Fetch the "example-validate-table" function from the project.

In [None]:
function_validate_table = proj.get_function("example-validate-table") 

We can now run the function and see the results. To do this we use the run method of the function.

In [None]:
run = function_validate_table.run(action="job", inputs={"di": di.key}, wait=True)

Note: Wait for the job to finish. Alternatively, in the Core Management UI, navigate to the 'Runs' menu, select the corresponding run instance, and inspect the logs in the 'Logs' tab.

To perform the validation, the data item is passed to the function as an input parameter (di.key). The function checks the data for consistency, completeness, and correctness, and generates a validation report at the end.
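
To give a feel for what such checks can involve, here is a toy sketch in plain Python. It is not the implementation of `example-validate-table`; the `validate_rows` helper, its three rules, and the sample rows are assumptions chosen purely for illustration.

```python
import csv
import io

EXPECTED_COLUMNS = ["name", "country", "subcountry", "geonameid"]


def validate_rows(csv_text: str) -> list[str]:
    """Return a list of human-readable problems found in the CSV text."""
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])

    # Consistency: the header must match the expected columns.
    if header != EXPECTED_COLUMNS:
        problems.append(f"unexpected header: {header}")
        return problems

    for line_no, row in enumerate(reader, start=2):
        # Consistency: every row has as many cells as the header.
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} cells, got {len(row)}"
            )
            continue
        record = dict(zip(header, row))
        # Completeness: name and geonameid must not be empty.
        for column in ("name", "geonameid"):
            if not record[column].strip():
                problems.append(f"line {line_no}: missing {column}")
        # Correctness: geonameid must be an integer when present.
        identifier = record["geonameid"].strip()
        if identifier and not identifier.isdigit():
            problems.append(f"line {line_no}: geonameid is not an integer")
    return problems


# Made-up sample inputs: one clean table and one with three problems.
GOOD = "name,country,subcountry,geonameid\nExampleville,Exampleland,North,1001\n"
BAD = "name,country,subcountry,geonameid\n,Exampleland,North,abc\nSampletown,Exampleland\n"

print(validate_rows(GOOD))
print(validate_rows(BAD))
```

An empty list means no problems were found, which mirrors the empty `errors` and `warnings` lists a clean report would carry.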


In [None]:
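import json

# Fetch the validation report artifact produced by the run.
report = proj.get_artifact("world-cities_validation-report.json")

# Download the artifact; the returned value is used as the local file path.
report_path = report.download(overwrite=True)

with open(report_path, "r") as file:
    data = json.load(file)

print(json.dumps(data, indent=4))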

The generated report confirms that the table has been successfully validated, with no errors or warnings detected.

{
    "valid": true,
    "stats": {
        "tasks": 1,
        "errors": 0,
        "warnings": 0,
        "seconds": 3.307
    },
    "warnings": [],
    "errors": [],
    "tasks": [
        {
            "name": "world-cities",
            "type": "table",
            "valid": true,
            "place": "dataitem/world-cities.parquet",
            "labels": [
                "index",
                "name",
                "country",
                "subcountry",
                "geonameid"
            ],
            "stats": {
                "errors": 0,
                "warnings": 0,
                "seconds": 3.307,
                "fields": 5,
                "rows": 33073
            },
            "warnings": [],
            "errors": []
        }
    ]
}
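
When a report covers several tables, a condensed view can be easier to scan. The sketch below walks the structure shown above; `summarize_report` is a hypothetical helper, and the inline dictionary is a trimmed copy of the printed report so that the example runs on its own.

```python
def summarize_report(report: dict) -> list[str]:
    """Build one summary line per task from a validation report dictionary."""
    lines = []
    for task in report.get("tasks", []):
        stats = task.get("stats", {})
        status = "valid" if task.get("valid") else "invalid"
        lines.append(
            f"{task.get('name')}: {status}, "
            f"{stats.get('rows')} rows, {stats.get('fields')} fields, "
            f"{stats.get('errors')} errors, {stats.get('warnings')} warnings"
        )
    return lines


# Trimmed copy of the report printed above.
data = {
    "valid": True,
    "stats": {"tasks": 1, "errors": 0, "warnings": 0, "seconds": 3.307},
    "tasks": [
        {
            "name": "world-cities",
            "type": "table",
            "valid": True,
            "stats": {
                "errors": 0,
                "warnings": 0,
                "seconds": 3.307,
                "fields": 5,
                "rows": 33073,
            },
        }
    ],
}

for line in summarize_report(data):
    print(line)
```

In the notebook itself, the `data` dictionary loaded from the downloaded report could be passed to the helper directly instead of the inline copy.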