# How to use the framework


If FrImCla is not installed on your system, the first step is to install it with pip.

In [1]:
!pip install frimcla



To begin, we import all the classes and functions that we need to use the framework.

In [2]:
# __future__ imports must come first when this code is run as a plain script
from __future__ import print_function
import warnings
import time
import json
import argparse
from frimcla.utils.conf import Conf
from imutils import paths
from frimcla.index_features import generateFeatures
from frimcla.StatisticalComparison import statisticalComparison, majorityVoting
from frimcla.train import train
from frimcla.prediction import prediction, predictionEnsemble

# Silence FutureWarning messages raised while importing and running the libraries
warnings.simplefilter(action="ignore", category=FutureWarning)

Using TensorFlow backend.


RuntimeError: module compiled against API version 0xc but this version of numpy is 0xb

We will use a small dataset.

In [None]:
!wget "https://drive.google.com/uc?id=1nMhMINs75FEIW2bAgmSbW1OqUV1Cqs3b&export=download&authuser=0" -O DogCat.zip
!unzip DogCat.zip

### Configuring the dataset path

First, we need the path to our dataset. The dataset folder must contain one subfolder for each class that we want to predict.

In [3]:
datasetPath = "DogCat"
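Before extracting features, it can help to confirm that the dataset really follows the one-folder-per-class layout. The following is a minimal sketch using only the standard library, not part of FrImCla; the helper name `count_images_per_class`, the list of image extensions and the `cat`/`dog` class names in the demo are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# File extensions treated as images in this sketch (an assumption)
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def count_images_per_class(dataset_path):
    """Return {class_folder_name: number_of_image_files} for a dataset folder."""
    counts = {}
    for class_dir in sorted(Path(dataset_path).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts


# Demo on a throwaway folder with two hypothetical classes
with tempfile.TemporaryDirectory() as tmp:
    for class_name, n_images in [("cat", 2), ("dog", 3)]:
        class_dir = Path(tmp) / class_name
        class_dir.mkdir()
        for i in range(n_images):
            (class_dir / "img{}.jpg".format(i)).touch()
    print(count_images_per_class(tmp))
```

Calling `count_images_per_class(datasetPath)` on the real dataset should list one entry per class; an empty result or a class with zero images suggests the folder layout is not what the framework expects.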

### Feature Extractor

In this step we choose the feature extractor models to use with our dataset. These models extract the most relevant features of the images. The features are saved and later passed to the classifier models chosen in the next step, which assign each image to one of the classes of the dataset. Each feature extractor captures these features in a different way, and no single model works best for every dataset, so we have to compare several of them.

In [4]:
featureExtractors = [["resnet", "False"],["vgg16", "False"]]

With the feature extractors chosen, we can run the function that extracts the features of the dataset with each model. We only have to indicate the output path, the dataset path, and the models that we want to use in the study. The verbose parameter indicates whether information about the execution should be printed to the console.

In [5]:
generateFeatures("./", 32, datasetPath, featureExtractors, False)

This function creates a set of files containing the features of the images. Each file corresponds to one of the models indicated above.

### Classification models

Once the features of the images are stored, we have to choose the classification models to use on the dataset. Every classifier is combined with every feature extractor model to measure the performance of each combination.

In [6]:
modelClassifiers = [ "MLP","SVM","KNN", "LogisticRegression", "GradientBoost", "RandomForest"]

With the classifiers chosen, the next step is a statistical analysis that studies and compares every combination. Once all the combinations have been compared, the analysis returns the best combination of feature extractor and classifier, together with all the combinations whose results do not differ significantly from the best one.
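To make the size of the comparison concrete, the sketch below enumerates every pairing of feature extractor and classifier from the two lists defined in this notebook. It only lists the combinations; it does not reproduce the statistical tests that FrImCla runs on them. The snake_case variable names are local to this example.

```python
from itertools import product

# Same values as featureExtractors and modelClassifiers in the cells above
feature_extractors = [["resnet", "False"], ["vgg16", "False"]]
model_classifiers = ["MLP", "SVM", "KNN", "LogisticRegression",
                     "GradientBoost", "RandomForest"]

# One (extractor name, classifier name) pair per combination to compare
combinations = [
    (extractor[0], classifier)
    for extractor, classifier in product(feature_extractors, model_classifiers)
]

print(len(combinations))  # 12
for extractor_name, classifier_name in combinations:
    print(extractor_name, "+", classifier_name)
```

With two extractors and six classifiers there are twelve combinations, so adding one more extractor adds six more models to evaluate.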

### Performance measures and ensemble

We have to select a performance measure to evaluate the models. Five measures are available: accuracy, recall, precision, auroc and f1. The user must select exactly one of them; accuracy is the default.
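As a reminder of what four of these measures compute, here is a small sketch for a two-class problem, written in plain Python 3. It uses the standard textbook definitions and is not taken from FrImCla's code; the function name `binary_measures` and the example labels are invented for illustration, and auroc is left out because it needs prediction scores rather than labels.

```python
def binary_measures(y_true, y_pred, positive):
    """Compute accuracy, precision, recall and f1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented labels: four images, one dog misclassified as a cat
y_true = ["dog", "dog", "cat", "cat"]
y_pred = ["dog", "cat", "cat", "cat"]
print(binary_measures(y_true, y_pred, positive="dog"))
```

In this example accuracy is 0.75 while recall for `dog` is only 0.5, which shows why the choice of measure can change which model looks best.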

To improve performance, the framework uses an ensemble technique called majority voting, in which all the generated models take part in predicting the class of an image. The framework keeps only the models that reach a certain percentage on the measure chosen by the user. These models are then trained to classify the images.

In [7]:
measure = "accuracy"

In [8]:
majorityVoting("./", datasetPath, featureExtractors, modelClassifiers, measure, False)
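The idea behind majority voting can be sketched in a few lines: each model predicts a label for an image, and the label predicted most often wins. This is a generic illustration of the technique, not FrImCla's implementation; the function name `majority_vote` and the example predictions are invented.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; on a tie, the label seen first wins
    (Counter keeps first-seen order for equal counts on Python 3.7+)."""
    return Counter(predictions).most_common(1)[0][0]


# Invented predictions: one inner list per model, one label per image
model_predictions = [
    ["dog", "cat", "cat"],  # model 1
    ["dog", "dog", "cat"],  # model 2
    ["cat", "dog", "cat"],  # model 3
]

# zip(*...) regroups the labels so that each tuple holds the votes for one image
ensemble = [majority_vote(votes) for votes in zip(*model_predictions)]
print(ensemble)  # ['dog', 'dog', 'cat']
```

Using an odd number of models avoids ties in a two-class problem; with an even number, the tie-breaking rule decides the outcome.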

### Training the models

Finally, we have to train the models. In this step the selected models are trained, but only those that score more than 56% on the measure chosen previously. In this way the framework prevents models with bad results from worsening the prediction. Here we do not split the dataset into training and test data: all the images are used for training, to improve the results of the model.
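The filtering rule described above amounts to a simple threshold on the chosen measure. The sketch below shows that rule on its own; the function name `select_models`, the model names and the scores are all invented for illustration, and the way FrImCla stores its results internally may differ.

```python
def select_models(scores, threshold=0.56):
    """Keep only the models whose score is strictly above the threshold."""
    return {name: score for name, score in scores.items() if score > threshold}


# Invented scores on the chosen measure, one per combination
scores = {
    "resnet_SVM": 0.91,
    "vgg16_KNN": 0.56,  # exactly at the threshold, so it is dropped
    "vgg16_MLP": 0.40,
}

print(select_models(scores))  # {'resnet_SVM': 0.91}
```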

During this call FrImCla asks whether the user wants a web application (answer Y to generate the web app).

In [9]:
train("./", datasetPath, 1)

y


Once the models are trained, we can predict the class of new images.

### Prediction

With the trained models we can now predict the classes of our images. For this task we have developed another function that uses the models and returns the predicted class of each image that we choose.

In [10]:
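# Everything from the last "/" in datasetPath onwards (rfind returns -1 if there is no "/")
datasetName = datasetPath[datasetPath.rfind("/"):]
imagePaths = list(paths.list_images(datasetPath))

# Read the ConfModel.json file located in ./<datasetName>/
with open("./" + datasetName + "/ConfModel.json") as data:
    datos = json.load(data)

extractors = datos["featureExtractors"]
classifiers = ["GradientBoost", "RandomForest", "SVM", "KNN", "LogisticRegression", "MLP"]

# Images to classify: replace this path with the folder that contains your own test images
imagePaths = list(paths.list_images("/home/magarcd/Escritorio/dataset/MiasRecortadasTEST"))
predictionEnsemble(featureExtractors, modelClassifiers, imagePaths, "./", datasetName)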
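The value of `datasetName` depends on how `datasetPath` is written, because it is built by slicing from the last `/`. The sketch below applies the same slice to a few example paths so the result can be checked before it is used to build file names; the helper name `name_from_slice` and the example paths are invented for illustration.

```python
def name_from_slice(dataset_path):
    """Apply the same slice as the prediction cell to a path string."""
    return dataset_path[dataset_path.rfind("/"):]


for path in ["DogCat", "./DogCat", "data/DogCat"]:
    print(repr(path), "->", repr(name_from_slice(path)))
# 'DogCat' -> 't'
# './DogCat' -> '/DogCat'
# 'data/DogCat' -> '/DogCat'
```

When the path contains no `/`, `rfind` returns -1 and the slice keeps only the last character. Whether FrImCla expects the leading `/` is not stated here, so it is worth comparing the resulting value with the folder names created by the earlier steps.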