# Variance Threshold

This component removes low-variance features, that is, features whose training-set variance does not meet a configurable threshold, using the `VarianceThreshold` implementation from [Scikit-learn](https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.VarianceThreshold.html).

Scikit-learn is an open source machine learning library that supports supervised and unsupervised learning. It also provides tools for model fitting, data preprocessing, model selection and evaluation, and many other utilities.

This notebook shows:
- how to use the PlatIAgro SDK to load a model and other artifacts.
- how to use a model to provide real-time transformations.
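
As a minimal sketch of what the underlying selector does, the example below fits Scikit-learn's `VarianceThreshold` on a toy array whose second column is constant. The data and the choice of `threshold=0.0` (the library default) are assumptions made for illustration; they are not taken from the experiment this notebook deploys.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Toy data (illustrative only): the second column is constant,
# so its variance is 0 and it carries no information.
X = np.array([
    [1.0, 5.0, 0.1],
    [2.0, 5.0, 0.9],
    [3.0, 5.0, 0.4],
])

# threshold=0.0 is the library default: only zero-variance columns are dropped.
selector = VarianceThreshold(threshold=0.0)
X_reduced = selector.fit_transform(X)

# Boolean mask telling which input columns were kept.
kept = selector.get_support()

print(kept)
print(X_reduced.shape)
```

The mask returned by `get_support()` is useful when the surviving column names need to be reported alongside the transformed data.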

In [None]:
%%writefile Transformer.py
import logging
from typing import List, Iterable, Dict, Union

import numpy as np
from platiagro import load_model

logger = logging.getLogger(__name__)


class Transformer(object):
    """
    Transformer template. Model artifacts (estimator, column names, numerical
    column mask) are loaded in __init__ from a location accessible at runtime.
    """

    def __init__(self, dataset=None, target=None, experiment_id=None):
        logger.info("Initializing")

        # Load Artifacts: Estimator, etc
        model = load_model(experiment_id=experiment_id)
        self.estimator = model["estimator"]
        self.columns = model["columns"]
        self.numerical_indexes = model["numerical_indexes"]

        logger.info("Init complete!")

    def class_names(self):
        return self.columns

    def transform_input(self, X: np.ndarray, feature_names: Iterable[str], meta: Dict = None) -> Union[np.ndarray, List, str, bytes]:
        """Returns a transformation on input data.

        Args:
            X (numpy.array): Array-like with data.
            feature_names (iterable): Array of feature names.
            meta (dict, optional): Dict of metadata.

        Returns:
            numpy.array: Array-like with transformations.
        """
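        # Perform Transformation
        # numerical_indexes is a boolean mask over the input columns.
        if np.any(self.numerical_indexes):
            # Apply the fitted estimator to the numerical columns only.
            X_thr = self.estimator.transform(X[:, self.numerical_indexes])
            # Output order: non-numerical columns first, then the kept numerical ones.
            X = np.concatenate((X[:, ~self.numerical_indexes], X_thr), axis=1)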

        return X
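
The column handling in `transform_input` can be tried in isolation. The sketch below reproduces the same mask, transform and concatenate steps with NumPy only. `KeepColumns` and `apply_threshold` are hypothetical names introduced here as stand-ins for the fitted estimator and the method body, and the sample rows are made up.

```python
import numpy as np


class KeepColumns:
    """Hypothetical stand-in for a fitted estimator: keeps preselected columns."""

    def __init__(self, support):
        self.support = np.asarray(support, dtype=bool)

    def transform(self, X):
        return X[:, self.support]


def apply_threshold(X, numerical_indexes, estimator):
    """Mirror of the transform step: filter numerical columns, keep the rest."""
    numerical_indexes = np.asarray(numerical_indexes, dtype=bool)
    if numerical_indexes.any():
        X_thr = estimator.transform(X[:, numerical_indexes])
        # Non-numerical columns come first, then the surviving numerical ones.
        X = np.concatenate((X[:, ~numerical_indexes], X_thr), axis=1)
    return X


# Made-up rows: column 1 is categorical, column 2 is constant.
X = np.array([[1.0, "a", 7.0, 0.5],
              [2.0, "b", 7.0, 0.6]], dtype=object)
mask = [True, False, True, True]

# Of the three numerical columns, the stand-in drops the middle (constant) one.
out = apply_threshold(X, mask, KeepColumns([True, False, True]))
print(out)
```

Note that the categorical column moves to the front of the result, so the output column order can differ from the input order.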

## Deployment Test

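This section simulates a model deployed by PlatIAgro: it starts the Seldon Core microservice locally and waits until the health endpoint responds.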

In [None]:
%%writefile env
export DATASET="${DATASET:-boston}"
export TARGET="${TARGET:-medv}"
export EXPERIMENT_ID="${EXPERIMENT_ID:-881c6caa-2fa8-4165-b408-9eabceb5f752}"
export MODEL_NAME="Transformer"
export API_TYPE="REST"
export SERVICE_TYPE="TRANSFORMER"
export PERSISTENCE=0
export LOG_LEVEL="DEBUG"
read -r -d "" PARAMETERS << EOM
[{"type": "STRING", "name": "dataset", "value": "$DATASET"},
 {"type": "STRING", "name": "target", "value": "$TARGET"},
 {"type": "STRING", "name": "experiment_id", "value": "$EXPERIMENT_ID"}]
EOM
export PARAMETERS

In [None]:
%%bash
. ./env
seldon-core-microservice "$MODEL_NAME" "$API_TYPE" \
    --service-type "$SERVICE_TYPE" \
    --persistence "$PERSISTENCE" \
    --parameters "$PARAMETERS" \
    --log-level "$LOG_LEVEL" > log.txt 2>&1 &

ATTEMPT=0
until curl --output /dev/null --silent --head --fail http://localhost:5000/health/ping; do
    # exit process if not healthy after 10 attempts (about 10 seconds)
    if [ "$ATTEMPT" -ge 10 ]; then
        cat log.txt
        exit 1
    fi
    ATTEMPT=$((ATTEMPT + 1))
    sleep 1
done
echo "Deployment successful. Waiting for requests."

## Make transformations

In [None]:
%%bash
curl -sSL localhost:5000/transform-input --data-binary @- << EOF
json={
    "data": {
        "names": ["crim", "zn", "indus", "chas", "nox", "rm", "age", "dis", "rad", "tax", "ptratio", "black", "lstat"],
        "ndarray": [
            [0.00632, 18.0, 2.31, 0, 0.5379999999999999, 6.575, 65.2, 4.09, 1, 296, 15.3, 396.9, 4.98]
        ]
    }
}
EOF
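
The same request can be sketched in Python with the standard library only. This assumes the microservice from the deployment test is listening on `localhost:5000` and accepts the payload as a form field named `json`, as the curl call above does. `build_body` and `transform` are hypothetical helper names, not part of any SDK.

```python
import json
from urllib import parse, request

# Same endpoint as the curl call (assumes the local test deployment is running).
URL = "http://localhost:5000/transform-input"


def build_body(names, rows):
    """Encode the payload as the form field `json=...`."""
    payload = {"data": {"names": list(names), "ndarray": [list(r) for r in rows]}}
    return parse.urlencode({"json": json.dumps(payload)}).encode("utf-8")


def transform(names, rows, url=URL, timeout=10):
    """POST the payload and return the decoded JSON response."""
    req = request.Request(url, data=build_body(names, rows), method="POST")
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Build (but do not send) a small request body to inspect it.
body = build_body(["crim", "zn"], [[0.00632, 18.0]])
print(body)
```

Calling `transform(names, rows)` sends the request; it is left uncalled here because it needs the running service.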

## View logs

In [None]:
!cat log.txt

## Clean up the test

In [None]:
!ps -ef | grep '[s]eldon-core-microservice' | awk '{print $2}' | xargs -r kill