# Tool runner

Use this notebook to operate the `toolbox_runner` backend from a Python environment. You need Docker installed, and all images found in `./images` must already be built.
Only images tagged with a name prefixed *tbr_* (for example `tbr_profile:latest`) are recognized as tools. You can then run a tool and obtain its result from within Python, without
installing the tool's dependencies or environment yourself.
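
Before listing tools, it can help to confirm that the expected images exist locally. The following is a minimal sketch, not part of `toolbox_runner`: the helper names `filter_tool_images` and `local_image_names` are hypothetical, and the prefix check simply mirrors the naming rule described above. How `toolbox_runner` itself discovers images may differ.

```python
import subprocess

PREFIX = "tbr_"  # naming rule described in the text; the check below is an assumption


def filter_tool_images(image_names, prefix=PREFIX):
    """Return the 'repository:tag' strings that start with the prefix."""
    return [name for name in image_names if name.startswith(prefix)]


def local_image_names():
    """Ask the docker CLI for all local images as 'repository:tag' strings."""
    result = subprocess.run(
        ["docker", "images", "--format", "{{.Repository}}:{{.Tag}}"],
        capture_output=True,
        text=True,
        check=True,  # raise if the docker CLI reports an error
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

With Docker running, `filter_tool_images(local_image_names())` would return the candidate tool images; an empty list suggests the images in `./images` still need to be built.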

## List all tools found and inspect them

Each tool exposes a config file that describes its parameters (and, eventually, its outputs).

In [1]:
from toolbox_runner.run import list_tools

In [2]:
tools = list_tools(as_dict=True)

tools

{'profile': profile: Dataset Profile  FROM tbr_profile:latest VERSION: 0.1,
 'variogram': variogram: Variogram fitting  FROM tbr_skgstat:latest VERSION: 1.0,
 'kriging': kriging: Kriging interpolation  FROM tbr_skgstat:latest VERSION: 1.0}

Now we can pick a tool and inspect its parameter names and types.

In [4]:
vario = tools.get('variogram')

print(vario.title)
print('-------------')
print(vario.description)
print('\nParameters\n-------------')
for key, conf in vario.parameters.items():
    print(f"{key}:\t\t{conf['type']}")

Variogram fitting
-------------
Estimate an empirical variogram and fit a model

Parameters
-------------
coordinates:		file
values:		file
n_lags:		integer
model:		enum
estimator:		enum
maxlag:		string
fit_method:		enum
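
The parameter listing can also be used to catch misspelled keyword arguments before a container is started. This is a minimal sketch with a hypothetical helper, `unknown_arguments`; the dictionary below is a hand-written stand-in that copies the listing above, whereas in the notebook you would pass `vario.parameters` instead. Whether `Tool.run` performs a similar check on its own is not known here.

```python
def unknown_arguments(parameters, **kwargs):
    """Return the keyword names that are not declared in the parameter config."""
    return sorted(set(kwargs) - set(parameters))


# Stand-in for vario.parameters, copied from the listing above.
# Only the 'type' key is shown; the real config may hold more fields.
example_parameters = {
    "coordinates": {"type": "file"},
    "values": {"type": "file"},
    "n_lags": {"type": "integer"},
    "model": {"type": "enum"},
    "estimator": {"type": "enum"},
    "maxlag": {"type": "string"},
    "fit_method": {"type": "enum"},
}

# 'nlags' is a deliberate typo for 'n_lags'
typos = unknown_arguments(example_parameters, model="exponential", nlags=15)
print(typos)
```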


Finally, you can load your data from anywhere. The coordinates must be an N-D array and the values a 1-D array, both of the same length. You can supply the path to a `.mat` file, or use the numpy ecosystem to pass two arrays. You can find an example in the source for the geostatistical tools image.

In [3]:
# use pandas to read the file
import pandas as pd
df = pd.read_csv('../images/skgstat/in/meuse.csv')

# extract the numpy arrays
coords = df[['x', 'y']].values
vals = df.lead.values
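
The shape requirement stated above can be checked before calling the tool. The sketch below uses a hypothetical helper, `check_inputs`, and assumes that "N-D" means a 2-D array with one row per observation and one column per dimension; the tool itself may accept other layouts.

```python
import numpy as np


def check_inputs(coordinates, values):
    """Validate array shapes and return both inputs as numpy arrays."""
    coords = np.asarray(coordinates)
    vals = np.asarray(values)

    # assumption: one row per observation, one column per coordinate dimension
    if coords.ndim != 2:
        raise ValueError(f"coordinates must be 2-D, got {coords.ndim}-D")
    if vals.ndim != 1:
        raise ValueError(f"values must be 1-D, got {vals.ndim}-D")
    if len(coords) != len(vals):
        raise ValueError(
            f"length mismatch: {len(coords)} coordinates, {len(vals)} values"
        )
    return coords, vals
```

In the notebook, `coords, vals = check_inputs(coords, vals)` would raise a `ValueError` early instead of failing inside the container.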

We can use the `Tool.run` method to run the tool inside its Docker container.

In [6]:
step_path = vario.run(result_path='./', coordinates=coords, values=vals, model='exponential', n_lags=15, maxlag='median')
print(f'Results cached at {step_path}')

Results cached at ./1666766315_variogram.tar.gz
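
The cached result is a gzip-compressed tar archive, so it can also be inspected with the Python standard library alone. The helpers `list_archive` and `read_member` below are hypothetical and not part of `toolbox_runner`; the member names inside a real archive are not assumed here, so list them first.

```python
import tarfile


def list_archive(path):
    """Return the names of all regular files stored in a .tar.gz archive."""
    with tarfile.open(path, mode="r:gz") as archive:
        return [member.name for member in archive.getmembers() if member.isfile()]


def read_member(path, member):
    """Return the raw bytes of one file stored in the archive."""
    with tarfile.open(path, mode="r:gz") as archive:
        fileobj = archive.extractfile(member)  # raises KeyError if missing
        if fileobj is None:
            # extractfile returns None for directories and other non-file members
            raise ValueError(f"{member!r} is not a regular file")
        return fileobj.read()
```

For the run above, `list_archive(step_path)` would show what the variogram tool wrote, without unpacking anything to disk.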


## Dataset profiling

In [4]:
profile = tools.get('profile')

In [5]:
step = profile.run(result_path='./', data=df)
step

./1666770522_profile.tar.gz

In [4]:
step.outputs

['./out/STDOUT.log', './out/report.html']

In [None]:
# this is weird, but it works
from IPython.display import display, HTML
html = step.get_file('./out/report.html').decode()

display(HTML(html))
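
Large reports can be slow to render inline. As an alternative, the bytes returned by `step.get_file` can be written to disk and opened in a browser. This is a minimal sketch; `save_report` is a hypothetical helper, not part of `toolbox_runner`.

```python
from pathlib import Path


def save_report(content, target):
    """Write raw report bytes to `target` and return the resulting path."""
    target = Path(target)
    # create missing parent folders so nested targets work
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(content)
    return target
```

In the notebook this could be used as `path = save_report(step.get_file('./out/report.html'), 'report.html')`, followed by `webbrowser.open(path.resolve().as_uri())` from the standard library `webbrowser` module.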