# Using cloudknot to perform matrix-vector multiplication of random matrices

This example uses cloudknot to run matrix-vector multiplications on AWS Batch, using random matrices and vectors whose entries are drawn with varying standard deviations.

In [10]:
import cloudknot as ck
import logging
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)


First, we write the Python function that we want to run on AWS Batch. Note that we import the necessary Python packages inside the function `random_mv_prod`.

In [6]:
def random_mv_prod(b):
    # Import inside the function, as noted above; only NumPy is needed here
    import numpy as np

    # Random vector and matrix with entries drawn from a normal
    # distribution with mean 0 and standard deviation b
    x = np.random.normal(0, b, 1024)
    A = np.random.normal(0, b, (1024, 1024))

    return np.dot(A, x)
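
Before sending the function to AWS Batch, it can help to call it once locally. The sketch below repeats the computation with only the NumPy import and checks the shape and dtype of the result. The value `b=2.0` is an arbitrary choice for illustration, not something taken from the example.

```python
import numpy as np


def random_mv_prod(b):
    # Same computation as in the cell above
    import numpy as np

    x = np.random.normal(0, b, 1024)
    A = np.random.normal(0, b, (1024, 1024))
    return np.dot(A, x)


# b=2.0 is an arbitrary illustrative standard deviation
result = random_mv_prod(2.0)
print(result.shape)  # (1024,)
print(result.dtype)  # float64
```

The values themselves differ on every call because no random seed is set.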

Create a knot using the `random_mv_prod` function and a job definition memory of 128 MiB.

In [9]:
knot = ck.Knot(name='random-mv-prod-h', base_image="python:3.11", func=random_mv_prod, memory=128, retries=3)

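
As a rough check on the 128 MiB setting, the arithmetic below estimates the size of the arrays that `random_mv_prod` creates, assuming NumPy's default `float64` dtype (8 bytes per element). It ignores the Python interpreter, the NumPy import, and any temporaries, so it is only a lower bound on what a job actually needs.

```python
n = 1024
bytes_per_float64 = 8  # NumPy's default dtype for np.random.normal

matrix_bytes = n * n * bytes_per_float64  # A has shape (1024, 1024)
vector_bytes = n * bytes_per_float64      # x and the result each have shape (1024,)

matrix_mib = matrix_bytes / 2**20
total_mib = (matrix_bytes + 2 * vector_bytes) / 2**20

print(matrix_mib)  # 8.0
print(total_mib)   # 8.015625
```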

Submit 17 batch jobs to the knot, one for each value in `np.linspace(0.1, 100, 17)`. As called here, the `map()` method returns a future for the results of the batch jobs. You can optionally supply a list of environment variables for the jobs.

In [None]:
# import numpy since it was only imported in the `random_mv_prod` function above
import numpy as np

In [None]:
# Submit the jobs
result_future = knot.map(np.linspace(0.1, 100, 17), env_vars=[{'name': 'MY_ENV_VAR', 'value': 'foo'}])
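
To see what is being mapped over, the short snippet below prints the inputs on their own. `np.linspace(0.1, 100, 17)` produces 17 evenly spaced values that include both endpoints, and each value is passed as the argument `b` of `random_mv_prod`. The name `b_values` is introduced here only for illustration.

```python
import numpy as np

# The 17 standard deviations passed to the jobs
b_values = np.linspace(0.1, 100, 17)

print(len(b_values))               # 17
print(b_values[0], b_values[-1])   # 0.1 100.0
```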

We can query the jobs associated with this knot by calling `knot.view_jobs()`, which prints detailed job info and provides a concise summary of job statuses.

In [None]:
# Rerun this cell as often as you like to update your job status info
knot.view_jobs()

We can also inspect individual jobs through `knot.jobs`, which returns a list of `BatchJob` instances for the submitted jobs, e.g.:

In [None]:
last_job = knot.jobs[-1]

In [None]:
print(last_job.done)
print(last_job.result(timeout=15))

In [None]:
last_job.status

`Knot.map()` returns a future here, so you can use the usual future methods to query the results, e.g. `done()` or `result()`.

In [None]:
print(result_future.done())

In [None]:
print(result_future.result())
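
For readers unfamiliar with futures, the sketch below shows the same `done()` / `result()` pattern using the standard library's `concurrent.futures`. It is a local stand-in only and makes no claim about how cloudknot implements its futures. The function `slow_square` is hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    # Hypothetical stand-in for a long-running remote job
    time.sleep(0.1)
    return x * x


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_square, 3)

    # done() returns immediately and reports whether the work has finished
    print(future.done())

    # result() blocks until the value is ready, or raises after the timeout
    value = future.result(timeout=5)
    finished = future.done()

print(value)     # 9
print(finished)  # True
```

Polling `done()` is useful when you want to keep working in the notebook, while `result()` is the simplest choice when you are ready to wait for the output.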

Once you're all done, clobber the knot, including the underlying PARS, the remote repo, and the Docker image.

In [None]:
knot.clobber(clobber_pars=True, clobber_repo=True, clobber_image=True)