# **Performance Evaluation**
Spark ML provides evaluators to measure the quality of a classification model.
The evaluator can be:
- a BinaryClassificationEvaluator for binary classification problems
- a MulticlassClassificationEvaluator for multiclass problems

Provided metrics are:
- **Accuracy**
- **Precision**
- **Recall**
- **F-measure**
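
As a rough illustration of what these metrics measure, the following plain-Python sketch computes them from two lists of labels. It is not Spark's implementation: the function name `weighted_metrics` and the toy data are hypothetical, and the sketch assumes that per-class precision, recall and F1 are averaged with weights equal to each class's share of the true labels.

```python
from collections import Counter


def weighted_metrics(labels, predictions):
    """Return accuracy and support-weighted precision, recall and F1.

    Hypothetical helper for illustration only; the keys mirror the
    metricName values used by MulticlassClassificationEvaluator.
    """
    if not labels or len(labels) != len(predictions):
        raise ValueError("labels and predictions must be non-empty and equally long")

    n = len(labels)
    support = Counter(labels)          # number of true instances per class
    predicted = Counter(predictions)   # number of predicted instances per class
    true_pos = Counter(l for l, p in zip(labels, predictions) if l == p)

    result = {
        "accuracy": sum(true_pos.values()) / n,
        "weightedPrecision": 0.0,
        "weightedRecall": 0.0,
        "f1": 0.0,
    }
    for cls, count in support.items():
        tp = true_pos[cls]
        precision = tp / predicted[cls] if predicted[cls] else 0.0
        recall = tp / count
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = count / n             # assumed weighting: share of true labels
        result["weightedPrecision"] += weight * precision
        result["weightedRecall"] += weight * recall
        result["f1"] += weight * f1
    return result


# Toy data: one instance of class 0 is misclassified as class 1
metrics = weighted_metrics([0, 0, 1, 1], [0, 1, 1, 1])
print(metrics)
```

Under this weighting, the weighted recall always equals the accuracy, because each class's recall is multiplied by exactly the fraction of instances that belong to that class.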

The instantiated evaluator has the method evaluate(), which is applied to a DataFrame
- It compares the predictions with the true label values
- Output:
    - The double value of the computed performance metric
    
Parameters of MulticlassClassificationEvaluator:
- **metricName**
    - 'accuracy', 'f1', 'weightedPrecision', 'weightedRecall'
- **labelCol**
    - Input column with the true label/class value
- **predictionCol**
    - Input column with the predicted class/label value

In [2]:
from pyspark.ml.linalg import Vectors
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import MulticlassClassificationEvaluator
from pyspark.ml import Pipeline
from pyspark.ml import PipelineModel

# input and output folders
labeledData = "./databases/trainingData.csv"
outputPath = "./predictionsEval/"

In [3]:
# Create a DataFrame from the labeled data file (trainingData.csv)
# Labeled data in raw format
labeledDataDF = spark.read.load(labeledData,\
                                    format="csv", header=True,\
                                    inferSchema=True)

# Split labeled data in training and test set
# training data : 75%
# test data: 25%
trainDF, testDF = labeledDataDF.randomSplit([0.75, 0.25], seed=10)

In [4]:
# *************************
# Training step
# *************************

# Define an assembler to create a column (features) of type Vector
# containing the double values associated with columns attr1, attr2, attr3
assembler = VectorAssembler(inputCols=["attr1", "attr2", "attr3"],\
                            outputCol="features")

In [5]:
# Create a LogisticRegression object.
lr = LogisticRegression()
lr.setMaxIter(10)
lr.setRegParam(0.01)

LogisticRegression_0532bab5fcbf

In [6]:
# Define a pipeline 
pipeline = Pipeline().setStages([assembler, lr])
classificationModel = pipeline.fit(trainDF)

# Now, the classification model can be used to predict the class label
# of new unlabeled data

In [7]:
# Make predictions on the test data using the transform() method of the
# trained classification model. The logistic regression stage uses only the
# content of 'features' to perform the predictions. The model is a fitted
# pipeline, hence the assembler is executed as well.
predictionsDF = classificationModel.transform(testDF)

In [11]:
# The predicted value is column prediction
# The actual label is column label
# Define a set of evaluators

myEvaluatorAcc = MulticlassClassificationEvaluator(labelCol="label",\
                                                    predictionCol="prediction",\
                                                    metricName='accuracy')

myEvaluatorF1 = MulticlassClassificationEvaluator(labelCol="label",\
                                                    predictionCol="prediction",\
                                                    metricName='f1')

myEvaluatorWeightedPrecision = MulticlassClassificationEvaluator(labelCol="label",\
                                                    predictionCol="prediction",\
                                                    metricName='weightedPrecision')

myEvaluatorWeightedRecall = MulticlassClassificationEvaluator(labelCol="label",\
                                                    predictionCol="prediction",\
                                                    metricName='weightedRecall')

In [12]:
# Apply the evaluators on the predictions associated with the test data
# Print the results on the standard output

print("Accuracy on test data ", myEvaluatorAcc.evaluate(predictionsDF))
print("F1 on test data ", myEvaluatorF1.evaluate(predictionsDF))
print("Weighted recall on test data ",\
                    myEvaluatorWeightedRecall.evaluate(predictionsDF))
print("Weighted precision on test data ",\
                    myEvaluatorWeightedPrecision.evaluate(predictionsDF))

Accuracy on test data  1.0
F1 on test data  1.0
Weighted recall on test data  1.0
Weighted precision on test data  1.0


### Do you think caching is useful here?
Do you think it would be useful to cache the training set, or something else? In this case the DataFrame containing the predictions is used many times, once per evaluator. For this reason it is a good idea to cache its content.

It is also a good idea to cache the training DataFrame. If we look at the algorithm being applied, for instance logistic regression, the system iterates many times over the training DataFrame.
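
A minimal sketch of the caching idea for the predictions is shown below. The helper name `evaluate_all` is hypothetical. It relies only on the `cache()` and `unpersist()` methods of a Spark DataFrame and on the `evaluate()` method of the evaluators, and it assumes the DataFrame is not needed in cached form afterwards.

```python
def evaluate_all(predictions_df, evaluators):
    """Apply several evaluators to the same cached DataFrame.

    predictions_df: a Spark DataFrame (any object with cache()/unpersist()).
    evaluators: dict mapping a display name to an object with evaluate(df).
    Hypothetical helper, shown only to illustrate the caching pattern.
    """
    # Mark the DataFrame as cached; Spark materializes it on the first action
    predictions_df.cache()
    try:
        # Each evaluate() call triggers a job that reuses the cached content
        return {name: ev.evaluate(predictions_df)
                for name, ev in evaluators.items()}
    finally:
        # Release the cached blocks once all metrics have been computed
        predictions_df.unpersist()


# Possible usage with the objects defined in this notebook:
# results = evaluate_all(predictionsDF, {
#     "Accuracy": myEvaluatorAcc,
#     "F1": myEvaluatorF1,
#     "Weighted recall": myEvaluatorWeightedRecall,
#     "Weighted precision": myEvaluatorWeightedPrecision,
# })
#
# The training data can be cached in the same way before fitting:
# trainDF.cache()
# classificationModel = pipeline.fit(trainDF)
```

Whether caching pays off depends on the size of the data and on the available memory, so it is worth comparing execution times with and without it.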