# Compute Functions and Jobs
__________________
The Compute module provides scalable compute capabilities for parallelizing your computations. It lets you package and execute your Python code on nodes hosted on Descartes Labs' cloud infrastructure. These nodes can access imagery at extremely high throughput, so computations can run over nearly any spatio-temporal scale.

Within the compute module, there are two primary objects that we work with:

 * [**Function:**](https://docs.descarteslabs.com/descarteslabs/compute/readme.html#descarteslabs.compute.Function) a dynamically created, serverless function containing user-submitted, compiled code, to which you can submit many jobs.
 * [**Job:**](https://docs.descarteslabs.com/descarteslabs/compute/readme.html#descarteslabs.compute.Job) a submitted request for a single invocation of a created Function.

Let's start by importing the compute module and creating a test Function object. 

In [None]:
from descarteslabs.compute import Function, FunctionStatus, Job, JobStatus

We'll also need these imports for this example.

In [None]:
from datetime import datetime
import gzip
from descarteslabs.auth import Auth
from descarteslabs.catalog import Blob, properties as p, StorageType

Next, we'll create a very basic `hello` function that prints a greeting and returns a string constructed from the given argument.

In [None]:
def hello(arg):
    print(f"Hello, {arg}")
    return f"hello {arg}"
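
Since building a Function takes a few minutes, it can save time to run the function locally first. The following is a minimal sketch that calls `hello` and captures what it prints, on the assumption that the remote job log mainly reflects the function's printed output. The names `buffer`, `local_result`, and `local_log` are illustrative only.

```python
import contextlib
import io


def hello(arg):
    print(f"Hello, {arg}")
    return f"hello {arg}"


# Run the function locally and capture its printed output separately
# from its return value, similar to a job's "log" and "result".
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    local_result = hello("world")
local_log = buffer.getvalue()

print(repr(local_result))
print(repr(local_log))
```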

To create the Function object, call `Function()` and specify the desired parameters. The recommended minimum parameters are your function, a name, and the image used to build the Function environment. Common parameters for tuning the performance of your scalable compute object include:
 * cpus = Number of CPUs requested for a single job
 * memory = max memory available for each job
 * maximum_concurrency = max number of jobs to run in parallel
 * timeout = max length a job can run in seconds
 * retry_count = max number of times a job can be retried
 * requirements = list of Python dependencies required by this function

For other options, please see the [Compute documentation](https://docs.descarteslabs.com/guides/compute.html).
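
One way to keep the parameters listed above in a single place is a plain dictionary that can later be unpacked into the `Function(...)` call. The sketch below reuses the values from this example; the `requirements` entry and the `check_settings` helper are hypothetical, and the checks are local sanity rules chosen for illustration, not the service's own validation.

```python
# Illustrative settings only; most values mirror the ones used in this example.
function_settings = {
    "name": "my-compute-hello",
    "image": "python3.9:latest",
    "cpus": 0.25,
    "memory": 512,
    "maximum_concurrency": 1,
    "timeout": 600,
    "retry_count": 0,
    # Hypothetical dependency list, written as pip-style requirement strings.
    "requirements": ["numpy>=1.24"],
}


def check_settings(settings):
    """Return a list of problems found in the settings (empty if none)."""
    problems = []
    for key in ("cpus", "memory", "timeout"):
        if settings[key] <= 0:
            problems.append(f"{key} must be positive")
    if settings["maximum_concurrency"] < 1:
        problems.append("maximum_concurrency must be at least 1")
    if settings["retry_count"] < 0:
        problems.append("retry_count must not be negative")
    return problems


print(check_settings(function_settings))
```

With the imports from this notebook in place, the dictionary could then be passed as `Function(hello, **function_settings)`.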

In [None]:
print("creating function")
async_func = Function(
    hello,
    name="my-compute-hello",
    image="python3.9:latest",
    cpus=0.25,
    memory=512,
    maximum_concurrency=1,
    timeout=600,
    retry_count=0,
)
async_func.save()

# wait for the function to be ready (will take a few minutes)
print("waiting for function to be ready")
async_func.wait_for_completion()

Once the Function is built, we will test it by creating and submitting a Job. There are several ways you can submit jobs to a function:
 * `async_func(args)` - Pass arguments directly to the compute.Function
 * [`async_func.map()`](https://docs.descarteslabs.com/descarteslabs/compute/readme.html#descarteslabs.compute.Function.map) - Submit multiple jobs efficiently (discussed in more detail within the next example - "02_Create_Imagery.ipynb")
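
To illustrate the difference between the two submission styles, the sketch below prepares arguments for several jobs and runs them locally as a stand-in for the remote fan-out. It assumes that `map()` expects one sequence of positional arguments per job; check the linked reference before relying on that shape. The names `names`, `job_args`, and `local_results` are illustrative.

```python
def hello(arg):
    print(f"Hello, {arg}")
    return f"hello {arg}"


names = ["alpha", "beta", "gamma"]

# One inner list per job; each inner list holds that job's positional arguments.
job_args = [[name] for name in names]

# Local stand-in for the remote fan-out: call the function once per argument list.
local_results = [hello(*args) for args in job_args]
print(local_results)
```

Under the stated assumption, the same `job_args` would be passed to a saved Function as `async_func.map(job_args)`, whereas `async_func("alpha")` submits a single job.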

In [None]:
# invoke the function
print("submitting a job")
job = async_func("Hello from my function!")

# wait for the job to finish, then print its result and logs
print("waiting for the job to complete")
job.wait_for_completion()
print(job.result())
print(job.log())

### Waiting for Completion

There are multiple ways to wait for completion.
 * You can wait on an individual job as in the previous cell.
 * You can wait for all pending and running jobs for the function to complete using `async_func.wait_for_completion()`.
 * You can navigate to the Compute monitor app at [app.descarteslabs.com/monitor](https://app.descarteslabs.com/monitor).
 * You can embed the same directly in your notebook as in the following cell.

In [None]:
from IPython.display import IFrame

IFrame("https://app.descarteslabs.com/monitor", width=700, height=350)

### Integration with DL Storage - Logging and Results
In addition to accessing your `Function` build log and your `Job` results and logs as shown above, you can also
access these using the Catalog module, as all these artifacts are automatically stored in the Catalog as Storage
Blobs. Logs are retained only for a period of 30 days. Results are retained indefinitely, even after the `Function`
and its `Job`s have been deleted.

In order to access these artifacts, you will need to form the correct ids to retrieve them. All three require
the correct `StorageType` value (`"logs"` for logs and `"compute"` for results), your namespace (available as
`Function.namespace` while your function exists, or derived from your authentication details as shown below),
and the name of the `Blob`, which is derived from your `Function.id` and `Job.id`.

* Function Build Log id:

    `logs/<namespace>/<function-id>`

* Job Execution Log id:

    `logs/<namespace>/<function-id>/<job-id>`

* Job Result id:
        
    `compute/<namespace>/<function-id>/<job-id>`

Note that the build log content is compressed!
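
The three id patterns above are plain string templates, so they can be sketched as small helpers. The function names and the sample namespace, function id, and job id below are hypothetical placeholders, not real identifiers.

```python
def build_log_id(namespace, function_id):
    """Id of the Function build log: logs/<namespace>/<function-id>."""
    return f"logs/{namespace}/{function_id}"


def job_log_id(namespace, function_id, job_id):
    """Id of a Job execution log: logs/<namespace>/<function-id>/<job-id>."""
    return f"logs/{namespace}/{function_id}/{job_id}"


def job_result_id(namespace, function_id, job_id):
    """Id of a Job result: compute/<namespace>/<function-id>/<job-id>."""
    return f"compute/{namespace}/{function_id}/{job_id}"


# Hypothetical values, for illustration only.
namespace = "example-org:0123abcd"
function_id = "func-0001"
job_id = "job-0001"

print(build_log_id(namespace, function_id))
print(job_log_id(namespace, function_id, job_id))
print(job_result_id(namespace, function_id, job_id))
```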
    

In [None]:
print("Namespace")
auth = Auth.get_default_auth()
namespace = f"{auth.payload['org']}:{auth.namespace}"
assert namespace == async_func.namespace
print(namespace)

print(f"Build Log for {async_func.id}")
build_log = Blob.get(f"{StorageType.LOGS}/{namespace}/{async_func.id}")
print(gzip.decompress(build_log.data()).decode())

print(f"Results for {async_func.id}")
for b in (
    Blob.search()
    .filter(p.namespace == namespace)
    .filter(p.name.startswith(f"{async_func.id}/"))
    .filter(p.storage_type == StorageType.COMPUTE)
):
    print(f"ID: {b.id}")
    print(b.data())
    print("\n")

print(f"Job Execution Logs for {async_func.id}")
for b in (
    Blob.search()
    .filter(p.namespace == namespace)
    .filter(p.name.startswith(f"{async_func.id}/"))
    .filter(p.storage_type == StorageType.LOGS)
):
    print(f"ID: {b.id}")
    print(b.data())
    print("\n")
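
Because the build log is compressed while the job logs in the cell above are read directly, a single helper that handles both cases can be convenient. The sketch below assumes the blob data arrives as `bytes` and detects gzip content by its two-byte magic number; `decode_log` and the simulated log contents are hypothetical.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def decode_log(raw):
    """Return log text, decompressing first if the bytes look like gzip."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return raw.decode()


# Simulated payloads standing in for real blob data.
simulated_build_log = gzip.compress(b"building image\n")
simulated_job_log = b"Hello, world\n"

print(decode_log(simulated_build_log))
print(decode_log(simulated_job_log))
```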

### Searching Functions
You can also search, filter, and sort your previously created `Function` objects. Here we will find all of our `Function`s created today:

In [None]:
today = datetime.today().strftime("%Y-%m-%d")

for func in Function.search().filter(Function.creation_date > today):
    print(func.id)
    print(func.creation_date)
    print(func.status)
    print(len(list(func.jobs)))
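
Note that `datetime.today()` uses your local clock. If creation dates are recorded in UTC (an assumption; it is not stated here), the local date and the UTC date can differ near midnight. A minimal sketch of building the date string in UTC, with `day_string` as a hypothetical helper:

```python
from datetime import datetime, timedelta, timezone


def day_string(moment):
    """Format a timezone-aware datetime as a UTC calendar date."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%d")


# 23:30 on March 1 in a UTC-7 zone is already March 2 in UTC.
late_evening = datetime(2024, 3, 1, 23, 30, tzinfo=timezone(timedelta(hours=-7)))

print(late_evening.strftime("%Y-%m-%d"))
print(day_string(late_evening))

# The current UTC date, usable in place of `today` in the filter above.
today_utc = day_string(datetime.now(timezone.utc))
```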

In [None]:
# Get active functions by filtering on the "ready" status
active_funcs = [
    {"name": f.name, "id": f.id}
    for f in Function.search().filter(Function.status == FunctionStatus.READY)
]
active_funcs

In [None]:
# Filter Functions by name prefix
my_funcs = [f.name for f in Function.search().filter(Function.name.startswith("my"))]
my_funcs

### Deleting Functions

To release all resources associated with a `Function`, delete it when you are done. All of its jobs must have completed before you can delete the `Function`. Deleting it also deletes all associated `Job`s, build logs, and job logs. Results are not deleted along with them and remain available via the Catalog unless you remove them explicitly, as the `delete_results=True` argument in the following cell does.

In [None]:
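# delete the jobs first; delete_results=True also removes their stored results
async_func.delete_jobs(delete_results=True)
# then delete the function itself
async_func.delete()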