# Running Processing Jobs

A processing job applies an advanced processing algorithm to ordered geospatial data. It either transforms the data into an analytics-ready state or extracts information from which insights can be derived.

- <a href="#setup">Set up the notebook</a>
- <a href="#run-jobs">Run processing jobs</a>
- <a href="#job-list">View Jobs</a>
- <a href="#dual-item-jobs">Run dual item process</a>

## <a id="setup"></a> Set up the notebook

### 1. Install prerequisites

In [None]:
!pip install up42-py
import up42, os
from up42 import processing_templates as templates

### 2. Authenticate

Create a `credentials.json` file in a directory named `.up42` under your home directory by running the cell below. The path to the file will be `~/.up42/credentials.json`.

In [2]:
# Define the directory path
up42_directory = os.path.expanduser("~/.up42")

# Create the directory if it doesn't exist
if not os.path.exists(up42_directory):
    os.makedirs(up42_directory)

# Specify the file path
credentials_file_path = os.path.join(up42_directory, "credentials.json")

# Check if the file already exists before creating it
if not os.path.exists(credentials_file_path):
    # Create an empty credentials.json file
    with open(credentials_file_path, "w") as credentials_file:
        pass
    print(f"The file {credentials_file_path} has been created.")
else:
    print(f"The file {credentials_file_path} already exists.")

The file /home/codespace/.up42/credentials.json already exists.


1. Open the created file (its path is printed above) and paste the following content:
    ```json
    {
        "username": "<your-email-address>",
        "password": "<your-password>"
    }
    ```
2. Replace the placeholders with the email address and password that you use to log in to the console. They are the values of `username` and `password`.
3. Save the `credentials.json` file.
4. Check that the authentication was successful as follows:

In [3]:
up42.authenticate(cfg_file=credentials_file_path)

2024-06-26 13:18:57,014 - Authentication with UP42 successful!
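As an alternative to pasting the JSON by hand, the file can also be written from code. The following is a minimal sketch, not part of the SDK: `write_credentials` is a hypothetical helper, and the demo writes to a temporary directory so that an existing `~/.up42/credentials.json` is not overwritten. It only assumes the two keys shown in the JSON snippet above.

```python
import json
import os
import tempfile


def write_credentials(path, username, password):
    """Write a credentials file with the two keys shown above (hypothetical helper)."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as credentials_file:
        json.dump({"username": username, "password": password}, credentials_file, indent=4)
    # Restrict access to the current user, since the file holds a password.
    os.chmod(path, 0o600)
    return path


# Demonstrated against a temporary directory so that a real ~/.up42 is not touched.
demo_path = os.path.join(tempfile.mkdtemp(), ".up42", "credentials.json")
write_credentials(demo_path, "<your-email-address>", "<your-password>")

# Read the file back to confirm it is valid JSON with the expected keys.
with open(demo_path) as credentials_file:
    saved = json.load(credentials_file)
print(sorted(saved))
```

To use it for real, pass `credentials_file_path` from the cell above together with your own email address and password.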



## <a id="run-jobs"></a>  Run processing jobs
Before running a processing job, we need to make sure that the input parameters are valid and that the processing cost is acceptable. To do this, we create a process-specific job template with the required parameters. The examples below use the Pansharpening process template.

### 1. Prepare input parameters

The Pansharpening template requires a title and a `pystac.Item` object to sharpen, and accepts optional grey weights. We omit the grey weights for simplicity. A STAC item can be retrieved in several ways; here we assume that the ID of the delivered asset is known.

In [None]:
asset_id = ""  # FILL IN: ID of the delivered asset
stac_item = up42.initialize_asset(asset_id).stac_items[0]
title = "Example_Pansharpen_SDK"

### 2. Create job template

In [5]:
template = templates.Pansharpening(
    title=title,
    item=stac_item
)

### 3. Are parameters valid?
If the template is invalid, we can inspect its errors.

In [6]:
if not template.is_valid:
    print(template.errors)

### 4. Control the cost
If the template is valid, we can inspect its cost and check it against a limit.

In [7]:
acceptable_cost = 100
assert template.cost <= acceptable_cost
template.cost

Cost(strategy='area', credits=21, size=5.039398810389698, unit='SQ_KM')
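The validity and cost checks above can be combined into a single guard. The sketch below is illustrative only: `preflight` is a hypothetical helper, and the templates are `SimpleNamespace` stand-ins that mimic the `is_valid`, `errors`, and `cost` attributes used in this notebook, with a plain number in place of the `Cost` object. One reason to prefer an explicit check over `assert` is that Python skips `assert` statements when run with the `-O` flag.

```python
from types import SimpleNamespace


def preflight(template, acceptable_cost):
    """Return a list of problems; an empty list means the template can be executed."""
    if not template.is_valid:
        # Report validation errors first; cost is not checked for an invalid template.
        return [f"invalid template: {error}" for error in template.errors]
    if not template.cost <= acceptable_cost:
        return [f"cost {template.cost} exceeds the limit of {acceptable_cost}"]
    return []


# Stand-in objects for illustration only; they are not SDK templates.
cheap = SimpleNamespace(is_valid=True, errors=[], cost=21)
expensive = SimpleNamespace(is_valid=True, errors=[], cost=250)
broken = SimpleNamespace(is_valid=False, errors=["item not found"], cost=None)

for candidate in (cheap, expensive, broken):
    print(preflight(candidate, 100))
```

With a real template, the same function could be called as `preflight(template, acceptable_cost)` before executing it, on the assumption that the comparison `template.cost <= acceptable_cost` behaves as in the cell above.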

### 5. Execute template
If the template is valid and the cost is acceptable, we can execute it as a job.

In [8]:
job = template.execute()
job

Job(process_id='pansharpening', id='cff0d609-45db-474d-9bb8-4eb1e8273140', account_id='3c86279b-610a-4dba-ad12-abe3bdc51428', workspace_id='27315579-94ae-4081-8f55-7c0993eb900a', definition={'inputs': {'title': 'Example_Pansharpen_SDK', 'item': 'https://api.up42.com/v2/assets/stac/collections/16d739e2-accc-461a-9088-ac7ff06a242f/items/2c13c8df-455a-4d55-810d-c5dc5353408f'}}, status=<JobStatus.CREATED: 'created'>, created=datetime.datetime(2024, 6, 26, 13, 19, 18, 95000), updated=datetime.datetime(2024, 6, 26, 13, 19, 18, 95000), collection_url=None, errors=None, credits=None, started=None, finished=None)

### 6. Track job execution
Executing the template returns a job instance with status `CREATED`. When a job finishes, its status is `CAPTURED` if it succeeded or `RELEASED` if it failed. To wait until the job finishes, we can track its execution.

In [9]:
job.track()
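Conceptually, tracking a job means polling its status until it reaches a terminal state. The sketch below illustrates that idea with a simulated status sequence; it is not the SDK's implementation of `track()`. The helper `wait_until_finished`, its parameters, and the lowercase status strings `"captured"` and `"released"` are assumptions modeled on the `'created'` value shown in the job output above.

```python
import time

# Terminal states named in the text above; string values are assumed.
TERMINAL_STATUSES = {"captured", "released"}


def wait_until_finished(fetch_status, interval=0.0, max_attempts=10):
    """Poll fetch_status() until a terminal status is seen or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status, attempt
        time.sleep(interval)
    raise TimeoutError(f"job not finished after {max_attempts} attempts")


# Simulated status sequence; a real caller would query the job instead.
simulated = iter(["created", "created", "created", "captured"])
final_status, attempts = wait_until_finished(lambda: next(simulated))
print(final_status, attempts)
```

Bounding the number of attempts keeps a stuck job from blocking the notebook indefinitely.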

### 7. Access the results
If the job succeeded, we can use `collection_url` to refer to the resulting collection.

In [10]:
if job.status == up42.JobStatus.CAPTURED:
    print(job.collection_url)

https://api.up42.com/v2/assets/stac/collections/02c224d4-334c-4468-9f86-9b007bedb806


Or use `collection` to obtain the `pystac.Collection` object directly.

In [11]:
job.collection
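If only the collection ID is needed, for example for logging, it can be taken from the collection URL. This is a minimal sketch that assumes the URL ends with the collection ID, as in the output printed above; `collection_id_from_url` is a hypothetical helper, not an SDK function.

```python
from urllib.parse import urlparse


def collection_id_from_url(collection_url):
    """Return the last path segment of a collection URL (hypothetical helper)."""
    path = urlparse(collection_url).path
    return path.rstrip("/").rsplit("/", 1)[-1]


# URL copied from the example output above.
example_url = "https://api.up42.com/v2/assets/stac/collections/02c224d4-334c-4468-9f86-9b007bedb806"
collection_id = collection_id_from_url(example_url)
print(collection_id)
```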

## <a id="job-list"></a> View Jobs

Retrieve all jobs available to the user, optionally filtered and sorted, or retrieve a specific job by its ID.

### 1. Get All Jobs

Set up the parameters for filtering and sorting jobs:

In [12]:
workspace_id = ""  # FILL IN
process_filter = [
    templates.Pansharpening.process_id,
    templates.DetectionChangeSpacept.process_id,
]
status_filter = [
    up42.JobStatus.CAPTURED, 
    up42.JobStatus.RELEASED
]
sort_filter = up42.JobSorting.credits.asc

Call `up42.Job.all`, which returns a generator of jobs. Uncomment the arguments you want to apply:

In [None]:
jobs = up42.Job.all(
#    process_id=process_filter,
#    workspace_id=workspace_id,
#    status=status_filter,
#    sort_by=sort_filter,
#    min_duration=min_duration,  # define min_duration before enabling
#    max_duration=max_duration,  # define max_duration before enabling
)
for job in jobs:
    print(job)
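Because the jobs are produced by a generator, they can be consumed lazily. The sketch below shows two common patterns, taking only the first few jobs and counting jobs per status. It uses a stand-in generator, `fake_jobs`, that yields simple objects with `id` and `status` attributes. Nothing here calls the SDK, and the status strings are assumed values.

```python
from collections import Counter
from itertools import islice
from types import SimpleNamespace


def fake_jobs():
    """Stand-in generator yielding job-like objects; not an SDK call."""
    for index, status in enumerate(["captured", "released", "captured", "captured"]):
        yield SimpleNamespace(id=f"job-{index}", status=status)


# Take only the first three jobs; the generator is not advanced further.
first_three = list(islice(fake_jobs(), 3))
print([job.id for job in first_three])

# Count jobs per status in a single pass over a fresh generator.
status_counts = Counter(job.status for job in fake_jobs())
print(status_counts)
```

A generator can be iterated only once, so each pattern above starts from a fresh generator. The same applies to the result of a listing call: request it again or store the jobs in a list if they are needed more than once.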

### 2. Get a Specific Job

In [14]:
queried_job = up42.Job.get(job.id)
queried_job

Job(process_id='pansharpening', id='7b0176bc-d71a-4002-9581-bd7d9f6c7994', account_id='3c86279b-610a-4dba-ad12-abe3bdc51428', workspace_id='de807c20-faa1-4064-afe3-7aaf7c49830d', definition={'inputs': {'aoi': {'type': 'Polygon', 'coordinates': [[[13.457193299733992, 52.52448028291545], [13.451130725259974, 52.52648250492052], [13.456011726160677, 52.52958974056038], [13.457205199363967, 52.524561357046316], [13.457193299733992, 52.52448028291545]]]}, 'item': 'https://api.up42.com/v2/assets/stac/collections/f781af6d-ce86-47d9-8bd1-a8226e905707/items/6baee52b-ee35-44e7-911f-2ee639a68572', 'title': 'DS_PHR1A_202210261023104_FR1_PX_E013N52_0614_01724_pansharpening'}}, status=<JobStatus.FAILED: 'failed'>, created=datetime.datetime(2023, 12, 13, 14, 31, 15, 182000), updated=datetime.datetime(2023, 12, 13, 14, 32, 1, 174000), collection_url=None, errors=None, credits=None, started=datetime.datetime(2023, 12, 13, 14, 31, 26, 976000), finished=datetime.datetime(2023, 12, 13, 14, 32, 1, 174000))

## <a id="dual-item-jobs"></a> Run dual item process
Most processing algorithms operate on a single STAC item, but some, such as change detection algorithms, require two items.

In [None]:
stac_item1 = None # retrieve via asset.stac_items or storage.pystac_client
stac_item2 = None # retrieve via asset.stac_items or storage.pystac_client

template = templates.HypervergePleiadesChangeDetection(
    title="SDK Detection Change Example", 
    items=[stac_item1, stac_item2]
)

The rest of the flow is the same as in the single-item template case above.