# Parametrized Learning
While k-NN and other methods that do no learning may be enough for some simple applications, they become impractical as the dataset grows: every time we classify an image, the whole dataset must be available and searched. That's not the case with parametrized learning.
Parametrized learning relies on learning a set of parameters that represent the data, and only these parameters are used for future classifications.
So, simply put, after the model is trained, all we need to classify a new datapoint are the parameter values, not the whole dataset.
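
To get a rough sense of the difference, the sketch below counts how many numbers each approach has to keep around at prediction time. It assumes a hypothetical dataset of 1000 flattened 32x32 RGB images and a three-class linear classifier; both sizes are illustrative and not taken from a real dataset.

```python
# Illustrative sizes only (assumptions, not a real dataset)
num_images = 1000
num_features = 32 * 32 * 3   # one flattened 32x32 RGB image
num_classes = 3

# k-NN has to keep every training image to make a prediction
knn_values = num_images * num_features

# A linear classifier keeps only its weights and its biases
param_values = num_classes * num_features + num_classes

print('k-NN stores {} values'.format(knn_values))
print('A linear classifier stores {} values'.format(param_values))
```

Under these assumptions the parametrized model stores a few thousand values instead of a few million, and that number does not grow when more training images are added.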

### Four Components of Parametrized Learning
Parametrized learning has four components, defined below:

 - #### Data
Data consists of our dataset together with the respective labels.
> In a dataset of 1000 RGB images of 32x32 pixels each, the data would be an array with dimensions [1000x32x32x3], or [1000x3072] once each image is flattened

 - #### Weights and Biases
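The weights and biases are the parameters of the classifier: the weights determine how much each pixel contributes to the score of each class, and the bias shifts each class score independently of the input.
> If there are three possible classes and each image is flattened to 3072 values, the weight matrix has size [3x3072] and the bias vector has size [3x1], one bias per class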

 - #### Scoring Function 
The scoring function is what makes your predictions. Simply put, it reads your input data, transforms it using the weights and biases, and outputs a score for each class; the class with the highest score is the predicted label.
> An example of a scoring function is the one used in linear classification: WeightsMatrix*Inputs+BiasMatrix

 - #### Loss Function
The loss function measures how well the predictions agree with the true labels: the lower the loss, the better. It is used to tune the weights and biases and so produce a better model.
> An example of a loss function is the hinge loss, sum(max(0, sj-syi+1)), where the sum runs over every class j other than the correct class yi. That means: for each wrong class, take its score sj minus the score of the correct class syi, add a margin of 1, and keep the result only if it is positive.
> If the loss for an image is 0, the image was classified correctly: the correct class's score beats every other score by at least 1, so every term inside the max is zero or negative and the max takes the 0
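
The four components above can be sketched together in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a trained model: the data is random, the names `score` and `hinge_loss` are hypothetical helpers introduced here, and the dataset size (5 images) and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# Data: 5 made-up "images", each flattened to 3072 values, plus their labels
num_images, num_features, num_classes = 5, 32 * 32 * 3, 3
X = rng.random((num_images, num_features))           # shape (5, 3072)
y = rng.integers(0, num_classes, size=num_images)    # shape (5,)

# Weights and biases: one row of weights and one bias per class
W = rng.standard_normal((num_classes, num_features)) * 0.01  # shape (3, 3072)
b = np.zeros(num_classes)                                    # shape (3,)


def score(x, W, b):
    """Scoring function of a linear classifier: one score per class."""
    return W.dot(x) + b


def hinge_loss(scores, correct_class, margin=1.0):
    """Multi-class hinge loss for a single datapoint."""
    margins = np.maximum(0.0, scores - scores[correct_class] + margin)
    margins[correct_class] = 0.0  # the correct class is not part of the sum
    return margins.sum()


scores = score(X[0], W, b)
print(scores.shape)               # one score per class
print(hinge_loss(scores, y[0]))   # loss of the untrained model on image 0

# A case that can be checked by hand
toy_scores = np.array([3.0, 1.0, 2.5])
print(hinge_loss(toy_scores, 0))  # 0.5: class 2 is within the margin of class 0
print(hinge_loss(toy_scores, 1))  # 5.5: both wrong classes outscore class 1
```

Note that in the first toy case the prediction is correct (class 0 has the highest score) but the loss is still 0.5, because class 2 is less than the margin of 1 below it.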

### Starting the action
Now we'll walk through these components with an example.

The dataset in this case will be just one 32x32 RGB image. So the data will be a [32x32x3] array, which will be flattened into a 3072-element vector.

The weight matrix and the bias vector will be randomly chosen, with no training (that's not something we'll do in the following projects).

In [2]:
import cv2
import numpy as np


# Initialize class labels and set the seed of our pseudo-random number generator
# '1' is chosen as the seed because it gives the 'correct classification'
labels = ['dog', 'cat', 'panda']
np.random.seed(1)

# Randomly initialize the weight matrix and bias vector from a standard normal distribution
w = np.random.randn(3, 3072)
b = np.random.randn(3)

# Load image, resize it (ignoring the aspect ratio) and flatten it
original = cv2.imread('beagle.png')
image = cv2.resize(original, (32, 32)).flatten()


The scores will be of the linear classification type, so we'll use the form WeightsMatrix*Inputs+BiasMatrix

In [3]:
# Compute the output scores
scores = w.dot(image) + b

# Loop over the scores and labels to display them
for label, score in zip(labels, scores):
    print('[INFO]: {}: {:.2f}'.format(label, score))

# Draw the label with the highest score on the image as our prediction
cv2.putText(original, 'Label: {}'.format(labels[np.argmax(scores)]), (10, 30),
            cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)

# Display our input image
cv2.imshow("Image", original)
cv2.waitKey(0)
cv2.destroyAllWindows()


[INFO]: dog: 7963.93
[INFO]: cat: -2930.99
[INFO]: panda: 3362.47
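
As a quick check of the loss function on this example, the sketch below applies the hinge loss to the three scores printed above. It assumes the true label is 'dog' (the file is named `beagle.png`); the `hinge_loss` helper is a hypothetical name introduced here, not part of the original notebook.

```python
import numpy as np

labels = ['dog', 'cat', 'panda']
# Scores copied from the printed output above
scores = np.array([7963.93, -2930.99, 3362.47])


def hinge_loss(scores, correct_class, margin=1.0):
    """Multi-class hinge loss for a single datapoint."""
    margins = np.maximum(0.0, scores - scores[correct_class] + margin)
    margins[correct_class] = 0.0  # the correct class is not part of the sum
    return margins.sum()


predicted = labels[int(np.argmax(scores))]
loss_if_dog = hinge_loss(scores, labels.index('dog'))  # assumed true label
loss_if_cat = hinge_loss(scores, labels.index('cat'))  # what-if comparison

print('Predicted label: {}'.format(predicted))
print('Loss if the true label is dog: {:.2f}'.format(loss_if_dog))
print('Loss if the true label were cat: {:.2f}'.format(loss_if_cat))
```

With 'dog' as the true label the loss is 0, since the dog score exceeds the other two by far more than the margin of 1. If the same scores had been produced for a cat image, the loss would be large, which is the signal training would use to adjust the weights.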
