## General `task` usage

The following diagram gives a general view of how multiple tasks execute over a dataset:

![tasks](tasks.png)
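As a rough illustration of this pattern, the sketch below runs one task per block of a dataset and then gathers the partial results. It is not PyCOMPSs code: the standard-library `concurrent.futures` module stands in for the task runtime, and `partial_sum`, `run_tasks`, and `n_blocks` are hypothetical names chosen for this example. It also assumes that the diagram depicts one task per block of the dataset.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(block):
    """Hypothetical task: process one block of the dataset independently."""
    return sum(block)


def run_tasks(dataset, n_blocks):
    """Split `dataset` into at most `n_blocks` blocks and run one task per block."""
    size = -(-len(dataset) // n_blocks)  # ceiling division
    blocks = [dataset[i:i + size] for i in range(0, len(dataset), size)]
    with ThreadPoolExecutor() as pool:
        # submit() returns a future immediately, so tasks may run concurrently.
        futures = [pool.submit(partial_sum, block) for block in blocks]
        # result() blocks until the corresponding task has finished.
        return [f.result() for f in futures]


partials = run_tasks(list(range(10)), n_blocks=3)
total = sum(partials)  # combine the partial results
```

The final `sum` plays the role of a merge step: each task only sees its own block, so a separate step is needed to combine the partial results.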

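For the _k_-Nearest Neighbors algorithm, this pattern applies in three stages: a distributed `fit`, a distributed `kneighbors`, and a merge of the partial results.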

## Distributed `fit` (training)

![fit](knn-fit.png)

# Pseudocode for the fit stage (simplified)

from pycompss.api.task import task
from sklearn.neighbors import NearestNeighbors

@task(...)
def fit(data):
    nn = NearestNeighbors()
    nn.fit(data)
    ...

## Distributed `kneighbors` (inference)

![kneighbors](knn-kneighbors.png)

# Pseudocode for the kneighbors stage (simplified)

@task(...)
def kneighbors(nn, X):
    dist, ind = nn.kneighbors(X=X)
    ...
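One detail worth illustrating: a model fitted on a single block can only return row numbers relative to that block. The sketch below assumes that global row numbers are wanted and shifts each block's indices by the block's starting offset. This offset step is an assumption, not something shown in the pseudocode above. To keep the example NumPy-only, the hypothetical helper `brute_force_kneighbors` stands in for `NearestNeighbors.kneighbors`, and the data is made up.

```python
import numpy as np


def brute_force_kneighbors(block, X, k):
    """Stand-in for NearestNeighbors.kneighbors on one block (brute force)."""
    # Pairwise Euclidean distances, shape (n_queries, n_block_rows).
    d = np.linalg.norm(X[:, None, :] - block[None, :, :], axis=2)
    # Block-local positions of the k closest rows for each query.
    ind = np.argsort(d, axis=1)[:, :k]
    return np.take_along_axis(d, ind, axis=1), ind


data = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
blocks = [data[:3], data[3:]]   # two partitions of three rows each
X = np.array([[1.9], [10.2]])   # two query points
k = 2

partial_dist, partial_ind = [], []
offset = 0
for block in blocks:
    dist, ind = brute_force_kneighbors(block, X, k)
    partial_dist.append(dist)
    partial_ind.append(ind + offset)  # block-local -> global row numbers
    offset += len(block)
```

Each entry of `partial_dist` and `partial_ind` has shape `(n_queries, k)`, which is the layout a merge over the partial results would consume.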

## Merge stage over the partial `kneighbors`

![merge](knn-merge.png)

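# Pseudocode for the merge (simplified)

import numpy as np

@task(...)
def merge(dist, ind):
    # dist and ind are lists of partial results: one 2-D array per partition,
    # with one row per query point
    aggr_dist = np.hstack(dist)
    aggr_ind = np.hstack(ind)

    # For each query, the column positions of the k smallest aggregated distances
    # (k, the number of neighbors, is assumed to be defined in the enclosing scope)
    final_ii = np.argsort(aggr_dist, axis=1)[:, :k]

    # Final results: the k nearest distances and their matching indices
    return (
        np.take_along_axis(aggr_dist, final_ii, axis=1),
        np.take_along_axis(aggr_ind, final_ii, axis=1),
    )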
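To make the merge concrete, here is a small worked example with made-up numbers: two partitions, two query points, and `k = 2`. The function `merge_partials` is a hypothetical plain-NumPy version of the step above, without the `@task` decorator and with `k` passed explicitly.

```python
import numpy as np


def merge_partials(dist, ind, k):
    """Merge per-partition (distance, index) arrays into the top-k per query."""
    aggr_dist = np.hstack(dist)  # shape (n_queries, k * n_partitions)
    aggr_ind = np.hstack(ind)
    # Column positions of the k smallest distances in each row.
    order = np.argsort(aggr_dist, axis=1)[:, :k]
    return (
        np.take_along_axis(aggr_dist, order, axis=1),
        np.take_along_axis(aggr_ind, order, axis=1),
    )


# Partial results from two partitions (illustrative values only).
dist = [np.array([[0.1, 0.9], [8.0, 9.0]]),
        np.array([[0.5, 7.0], [0.2, 0.8]])]
ind = [np.array([[2, 1], [2, 1]]),
       np.array([[3, 4], [3, 4]])]

final_dist, final_ind = merge_partials(dist, ind, k=2)
# final_dist -> [[0.1, 0.5], [0.2, 0.8]]
# final_ind  -> [[2, 3], [3, 4]]
```

For the first query the two nearest neighbors come from different partitions, which is why the partial results cannot simply be concatenated and truncated without re-sorting.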