# A primer on building PipeGraph's custom blocks

## Motivation for wrappers
Consider the following common Scikit-Learn objects:

In [11]:
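# Import each estimator explicitly; a bare `import sklearn` does not
# guarantee that the submodules are loaded.
from sklearn.cluster import DBSCAN
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import MinMaxScaler

classifier = GaussianNB()
dbscanner = DBSCAN()
scaler = MinMaxScaler()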

And let's load some data to run the examples:

In [16]:
from sklearn.datasets import load_iris
iris = load_iris()
X, y = iris.data, iris.target

Now let's fit each of these objects and then obtain its output by calling the corresponding method (```predict```, ```fit_predict```, or ```transform```):

In [37]:
classifier.fit(X, y)
dbscanner.fit(X)  # DBSCAN is unsupervised, so y is not needed
scaler.fit(X);

In [38]:
classifier.predict(X)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [43]:
classifier.predict_proba(X)

array([[  1.00000000e+000,   1.38496103e-018,   7.25489025e-026],
       [  1.00000000e+000,   1.48206242e-017,   2.29743996e-025],
       [  1.00000000e+000,   1.07780639e-018,   2.35065917e-026],
       [  1.00000000e+000,   1.43871443e-017,   2.89954283e-025],
       [  1.00000000e+000,   4.65192224e-019,   2.95961100e-026],
       [  1.00000000e+000,   1.52598944e-014,   1.79883402e-021],
       [  1.00000000e+000,   1.13555084e-017,   2.79240943e-025],
       [  1.00000000e+000,   6.57615274e-018,   2.79021029e-025],
       [  1.00000000e+000,   9.12219356e-018,   1.16607332e-025],
       [  1.00000000e+000,   3.20344249e-018,   1.12989524e-025],
       [  1.00000000e+000,   4.48944985e-018,   5.19388089e-025],
       [  1.00000000e+000,   1.65734172e-017,   7.24605453e-025],
       [  1.00000000e+000,   1.19023891e-018,   3.06690017e-026],
       [  1.00000000e+000,   7.39520546e-020,   1.77972179e-027],
       [  1.00000000e+000,   2.58242749e-019,   8.73399972e-026],
       ...

In [42]:
classifier.predict_log_proba(X)

array([[  0.00000000e+00,  -4.11208597e+01,  -5.78855367e+01],
       [  0.00000000e+00,  -3.87505119e+01,  -5.67328319e+01],
       [  0.00000000e+00,  -4.13716038e+01,  -5.90125166e+01],
       [  0.00000000e+00,  -3.87801966e+01,  -5.65000742e+01],
       [  0.00000000e+00,  -4.22118362e+01,  -5.87821546e+01],
       [ -1.50990331e-14,  -3.18135483e+01,  -4.77671483e+01],
       [  0.00000000e+00,  -3.90168287e+01,  -5.65377225e+01],
       [  0.00000000e+00,  -3.95630818e+01,  -5.65385104e+01],
       [  0.00000000e+00,  -3.92358214e+01,  -5.74109854e+01],
       [  0.00000000e+00,  -4.02823057e+01,  -5.74425024e+01],
       [  0.00000000e+00,  -3.99448015e+01,  -5.59171461e+01],
       [  0.00000000e+00,  -3.86387316e+01,  -5.55841702e+01],
       [  0.00000000e+00,  -4.12723776e+01,  -5.87465451e+01],
       [  0.00000000e+00,  -4.40508700e+01,  -6.15933405e+01],
       [  0.00000000e+00,  -4.28003869e+01,  -5.76999890e+01],
       ...

In [39]:
dbscanner.fit_predict(X)

array([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0, -1,  0,  0,  0,  0,  0,  0,  0,  0,  1,
        1,  1,  1,  1,  1,  1, -1,  1,  1, -1,  1,  1,  1,  1,  1,  1,  1,
       -1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
        1,  1, -1,  1,  1,  1,  1,  1, -1,  1,  1,  1,  1, -1,  1,  1,  1,
        1,  1,  1, -1, -1,  1, -1, -1,  1,  1,  1,  1,  1,  1,  1, -1, -1,
        1,  1,  1, -1,  1,  1,  1,  1,  1,  1,  1,  1, -1,  1,  1, -1, -1,
        1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1], dtype=int64)

In [40]:
scaler.transform(X)

array([[ 0.22222222,  0.625     ,  0.06779661,  0.04166667],
       [ 0.16666667,  0.41666667,  0.06779661,  0.04166667],
       [ 0.11111111,  0.5       ,  0.05084746,  0.04166667],
       [ 0.08333333,  0.45833333,  0.08474576,  0.04166667],
       [ 0.19444444,  0.66666667,  0.06779661,  0.04166667],
       [ 0.30555556,  0.79166667,  0.11864407,  0.125     ],
       [ 0.08333333,  0.58333333,  0.06779661,  0.08333333],
       [ 0.19444444,  0.58333333,  0.08474576,  0.04166667],
       [ 0.02777778,  0.375     ,  0.06779661,  0.04166667],
       [ 0.16666667,  0.45833333,  0.08474576,  0.        ],
       [ 0.30555556,  0.70833333,  0.08474576,  0.04166667],
       [ 0.13888889,  0.58333333,  0.10169492,  0.04166667],
       [ 0.13888889,  0.41666667,  0.06779661,  0.        ],
       [ 0.        ,  0.41666667,  0.01694915,  0.        ],
       [ 0.41666667,  0.83333333,  0.03389831,  0.04166667],
       [ 0.38888889,  1.        ,  0.08474576,  0.125     ],
       ...

As can be seen, accessing each object's output requires calling a different method. To offer a homogeneous interface, PipeGraph provides a collection of adapters, all of which derive from the ```AdapterForSkLearnLikeAdaptee``` base class. This class adapts Scikit-Learn objects to a common interface based on the ```fit``` and ```predict``` methods, irrespective of whether the adapted object provides a ```transform```, ```fit_predict```, or ```predict``` interface.
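To make the idea concrete, here is a minimal sketch of the adapter pattern described above. It is not PipeGraph's actual implementation: every class name in it (```ToyScaler```, ```ToyClusterer```, ```ToyClassifier```, ```SketchAdapter``` and its subclasses) is hypothetical, and the toy adaptees are plain-Python stand-ins that only mimic the three Scikit-Learn protocols on lists of numbers.

```python
class ToyScaler:
    """Hypothetical stand-in for a transformer: exposes fit and transform."""

    def fit(self, X, y=None):
        self.min_, self.max_ = min(X), max(X)
        return self

    def transform(self, X):
        span = (self.max_ - self.min_) or 1.0
        return [(x - self.min_) / span for x in X]


class ToyClusterer:
    """Hypothetical stand-in for a clusterer: output comes from fit_predict."""

    def fit(self, X, y=None):
        return self

    def fit_predict(self, X, y=None):
        # Label each value by the side of the mean it falls on.
        mean = sum(X) / len(X)
        return [0 if x < mean else 1 for x in X]


class ToyClassifier:
    """Hypothetical stand-in for a classifier: exposes fit and predict."""

    def fit(self, X, y):
        zeros = [x for x, label in zip(X, y) if label == 0]
        ones = [x for x, label in zip(X, y) if label == 1]
        # Decision threshold: midpoint between the two class means.
        self.threshold_ = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        return self

    def predict(self, X):
        return [int(x > self.threshold_) for x in X]


class SketchAdapter:
    """Hypothetical base adapter: stores the adaptee and forwards fit to it."""

    def __init__(self, adaptee):
        self._adaptee = adaptee

    def fit(self, *pargs, **kwargs):
        self._adaptee.fit(*pargs, **kwargs)
        return self


class TransformSketchAdapter(SketchAdapter):
    def predict(self, *pargs, **kwargs):
        return self._adaptee.transform(*pargs, **kwargs)


class ClusterSketchAdapter(SketchAdapter):
    def predict(self, *pargs, **kwargs):
        return self._adaptee.fit_predict(*pargs, **kwargs)


class PredictSketchAdapter(SketchAdapter):
    def predict(self, *pargs, **kwargs):
        return self._adaptee.predict(*pargs, **kwargs)


data = [1.0, 2.0, 3.0, 10.0]
labels = [0, 0, 0, 1]
blocks = [
    TransformSketchAdapter(ToyScaler()),
    ClusterSketchAdapter(ToyClusterer()),
    PredictSketchAdapter(ToyClassifier()),
]
# The calling code no longer cares which protocol each adaptee follows.
outputs = [block.fit(data, labels).predict(data) for block in blocks]
```

The point of the design is in the last line: once wrapped, every block is driven through the same ```fit``` and ```predict``` pair, so code that chains blocks together does not need to branch on the adaptee's type.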

As the following code fragment shows, ```fit``` and ```predict``` accept an arbitrary number of positional and keyword parameters. These must be consistent with the adaptee's expectations, but the adapter's interface itself imposes no hard constraints.
```
from sklearn.base import BaseEstimator

class AdapterForSkLearnLikeAdaptee(BaseEstimator):
    def fit(self, *pargs, **kwargs):
        ...
    def predict(self, *pargs, **kwargs):
        ...
```
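The following sketch illustrates what an open ```*pargs, **kwargs``` signature buys in practice. It assumes the adapter simply forwards its arguments to the adaptee, which the fragment above suggests but does not show; ```ForwardingAdapter``` and ```ToyWeightedMean``` are hypothetical names used only for this illustration.

```python
class ToyWeightedMean:
    """Hypothetical adaptee whose fit accepts an extra keyword argument."""

    def fit(self, X, y=None, sample_weight=None):
        weights = sample_weight if sample_weight is not None else [1.0] * len(X)
        self.mean_ = sum(w * x for w, x in zip(weights, X)) / sum(weights)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class ForwardingAdapter:
    """Hypothetical adapter: passes every argument through untouched."""

    def __init__(self, adaptee):
        self._adaptee = adaptee

    def fit(self, *pargs, **kwargs):
        self._adaptee.fit(*pargs, **kwargs)
        return self

    def predict(self, *pargs, **kwargs):
        return self._adaptee.predict(*pargs, **kwargs)


adapter = ForwardingAdapter(ToyWeightedMean())

# A keyword the adapter knows nothing about still reaches the adaptee.
adapter.fit([1.0, 3.0], sample_weight=[3.0, 1.0])
weighted = adapter.predict([0.0])

# An argument the adaptee does not accept is rejected by the adaptee itself,
# not by the adapter.
try:
    adapter.fit([1.0, 3.0], unknown_option=True)
    rejected = False
except TypeError:
    rejected = True
```

Under this assumption the adapter stays generic, and the responsibility for passing sensible arguments rests with the caller: a mismatch surfaces as the adaptee's own ```TypeError```.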

Sklearn objects that follow the ```predict``` protocol can be wrapped in the ```AdapterForFitPredictAdaptee``` class:

In [45]:
from pipegraph.adapter import AdapterForFitPredictAdaptee

wrapped_classifier = AdapterForFitPredictAdaptee(classifier)
