## Model evaluation
In this task we take a look at how to evaluate a classifier. To do this, we give you some functions that let you train a classifier with PyTorch. PyTorch is lower-level than scikit-learn and requires you to do more of the busy work yourself.
On the other hand, it gives you the freedom to create your own training schemes and network configurations. Together with TensorFlow, it is the de facto industry standard for training neural networks.
For this task it is not necessary to understand the PyTorch code, but if you are interested in learning PyTorch, try to follow along by reading the comments. Don't worry if you don't understand everything. Just be aware that for our purposes a PyTorch tensor (torch.Tensor) behaves mostly like a NumPy array, which you should be familiar with by now.
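
To illustrate the similarity between tensors and NumPy arrays, here is a minimal sketch. The values are made up for illustration and are not part of the task.

```python
import numpy as np
import torch

# The same made-up values as a NumPy array and as a PyTorch tensor
array = np.array([[1.0, 2.0], [3.0, 4.0]])
tensor = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# Shape, indexing and element-wise arithmetic look the same
print(array.shape, tuple(tensor.shape))              # (2, 2) (2, 2)
print(array[0, 1], tensor[0, 1].item())              # 2.0 2.0
print((array * 2).sum(), (tensor * 2).sum().item())  # 20.0 20.0

# Converting between the two
from_numpy = torch.from_numpy(array)  # shares memory with `array`
back_to_numpy = tensor.numpy()
```

The main practical difference is that a tensor can record gradients, which is what makes training possible.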


A great opportunity to learn more is the PyTorch homepage, which provides many tutorials on different machine learning tasks:
https://pytorch.org/tutorials/

If you want to find information on a given function, take a look at the documentation:
https://pytorch.org/docs/stable/index.html

In [None]:
#Install the needed packages
!python -m pip install torch
!python -m pip install scikit-learn

In [None]:
import torch
from torch.utils.data import Dataset, DataLoader
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler
torch.manual_seed(0)

### Load the data
We load the Wine dataset from scikit-learn and normalize it with a z-score transformation. Afterwards we shuffle the data, because it is ordered by class and this order would interfere with the k-fold cross-validation you are going to implement.

In [None]:
wine = load_wine()
data = wine["data"]

target = torch.from_numpy(wine["target"])

#scale the data to mean = 0 and var = 1
scaler = StandardScaler()
scaler.fit(data)
data = torch.from_numpy(scaler.transform(data)).float()

#Because the data is ordered we need to shuffle it
shuffle_seed = torch.randperm(data.shape[0])
data = data[shuffle_seed]
target = target[shuffle_seed]

attribute_count = data.shape[1]
label_count = len(wine["target_names"])
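
The z-score transformation subtracts the per-column mean and divides by the per-column standard deviation. A minimal sketch on a made-up matrix, assuming the population standard deviation (which is what StandardScaler uses):

```python
import torch

# Made-up feature matrix: 4 samples, 2 features
x = torch.tensor([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])

# z-score per column: subtract the mean, divide by the standard deviation
mean = x.mean(dim=0)
std = x.std(dim=0, unbiased=False)  # population standard deviation
z = (x - mean) / std

print(z.mean(dim=0))                 # close to 0 for every column
print(z.std(dim=0, unbiased=False))  # close to 1 for every column
```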

### Dataset
As you should already know, you can print information about the dataset via the "DESCR" key.

In [None]:
print(wine["DESCR"])

### Define model
Here we define our model. Some of the values are fixed by our dataset, namely the number of input neurons and the number of output neurons. The hidden layers can be varied and are given as a list of integers, where every element defines the number of neurons in one hidden layer. For example, hidden_layers = [10,10] defines a neural network with two hidden layers of 10 neurons each.

In [None]:
def create_model(hidden_layers = [],input_size = attribute_count, output_size = label_count, 
                 activation = torch.nn.ReLU(),output_activation = torch.nn.Identity()):
    #the list of sizes is useful to manage the input and output sizes of the layers in our network
    sizes = [input_size] + hidden_layers + [output_size]
    #the list of layers will be combined by using nn.Sequential to easily create a feed-forward network
    #from a list of layers and activation functions
    layers = []
    
    for i in range(len(sizes)-1):
        #choose the inner activation function for all layers except the last one
        act = activation if i < len(sizes) -2 else output_activation
        #concatenate a Linear layer and the activation function with our layer list
        layers+= [torch.nn.Linear(sizes[i],sizes[i+1]),act]
    #create the neural network from our layer list
    return torch.nn.Sequential(*layers)
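
To see how the list of sizes turns into layers, here is a small standalone sketch. It assumes the Wine dimensions (13 input attributes, 3 classes) and two hidden layers of 10 neurons. It mirrors the idea of create_model but is not meant as a replacement for it.

```python
import torch

# Assumed sizes for illustration: 13 inputs, two hidden layers of 10, 3 outputs
sizes = [13] + [10, 10] + [3]

# Consecutive pairs give the (in_features, out_features) of each Linear layer
pairs = list(zip(sizes[:-1], sizes[1:]))
print(pairs)  # [(13, 10), (10, 10), (10, 3)]

layers = []
for i, (n_in, n_out) in enumerate(pairs):
    layers.append(torch.nn.Linear(n_in, n_out))
    # ReLU after hidden layers, Identity after the output layer
    layers.append(torch.nn.ReLU() if i < len(pairs) - 1 else torch.nn.Identity())

model = torch.nn.Sequential(*layers)
print(model)

# A batch of 5 made-up samples yields one score per class for each sample
scores = model(torch.randn(5, 13))
print(scores.shape)  # torch.Size([5, 3])
```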

### Training Loop
The trainModel function contains the training loop for a given model. Mandatory inputs are the model, data, target and epochs.

In [None]:
def trainModel(model, data, target, epochs, lr = 0.01, batchsize = 20, shuffle = False):
    #How to calculate the Loss (here we use crossentropy) 
    criterion = torch.nn.CrossEntropyLoss()
    
    #The optimization method for the weights; Adam and stochastic gradient descent (SGD) are both feasible
    optimizer = torch.optim.Adam(model.parameters(),lr=lr)
    #Loop n times over the Dataset
    for epoch in range(epochs):
        #It may be helpful to shuffle your data every epoch, we don't do it here for reproducibility reasons
        if shuffle:
            seed = torch.randperm(data.shape[0])
            data = data[seed]
            target = target[seed]
        for index in range(0,len(data),batchsize):
            #create the batch (slicing past the end is safe, so the last batch is simply shorter)
            batch_last = index + batchsize
            data_batch = data[index: batch_last]
            target_batch = target[index: batch_last]
            
            #forward pass
            #calculate the outputs
            scores = model(data_batch)
            #calculate the loss
            loss = criterion(scores, target_batch)
            #backpropagation
            #The gradient has to be set to zero before calculating the new gradients
            optimizer.zero_grad()
            #propagate the loss backwards through the network
            loss.backward()
            #update the weights
            optimizer.step()
    #return the trained model       
    return model
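
When cutting a dataset into mini-batches, slicing past the end of a tensor is safe and simply returns a shorter batch, whereas an end index of -1 excludes the last sample. A small sketch with made-up sizes (7 samples, batch size 3):

```python
import torch

samples = torch.arange(7)  # made-up dataset of 7 samples: 0..6
batchsize = 3

# Slice the dataset into consecutive batches; the last one may be shorter
batches = [samples[i:i + batchsize].tolist() for i in range(0, len(samples), batchsize)]
print(batches)  # [[0, 1, 2], [3, 4, 5], [6]]

# An end index of -1 means "up to, but not including, the last element"
print(samples[3:-1].tolist())  # [3, 4, 5]
print(samples[6:-1].tolist())  # []
```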
    

### Make predictions
The predict function takes the model and some data and predicts the class associated with each datapoint.

In [None]:
def predict(data,model):
    #if a single datapoint is given, we unsqueeze it so the model always receives a batch
    if(len(data.shape)) == 1:
        data = data.unsqueeze(0)
    #find the output of our model that has the largest value and use it as our prediction
    #(torch.Tensor.max(dim) returns the largest values as the first return value and their indices as the second)
    _, prediction = model(data).max(1)
    return prediction
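
To make the use of max(1) concrete, here is a short sketch with made-up scores for three samples and three classes:

```python
import torch

# Made-up scores for 3 samples and 3 classes (one row per sample)
scores = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0],
                       [-0.3, 1.5, 0.0]])

# max over dimension 1 returns (largest value, its column index) per row
values, indices = scores.max(1)
print(values.tolist())   # [2.0, 3.0, 1.5]
print(indices.tolist())  # [0, 2, 1]

# argmax returns the indices directly
print(torch.equal(indices, scores.argmax(dim=1)))  # True
```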

### Accuracy
The calculate_accuracy function takes some data, the associated targets and a model, and calculates the accuracy of the model.

In [None]:
def calculate_accuracy(data, target, model):
    num_samples = data.shape[0]
    #switch to evaluation mode
    model.eval()
    with torch.no_grad():
        #generate the predictions for the data from our model
        prediction = predict(data,model)
        #sum up correct predictions (True = 1)
        num_correct = (prediction == target).sum()
        #calculate accuracy (proportion of correct predictions)
        return num_correct/num_samples
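
As a worked example of the accuracy calculation, consider five made-up predictions of which four match the target:

```python
import torch

prediction = torch.tensor([0, 2, 1, 1, 0])  # made-up predictions
target = torch.tensor([0, 2, 2, 1, 0])      # made-up true labels

correct = prediction == target  # element-wise comparison
print(correct.tolist())         # [True, True, False, True, True]

# Proportion of correct predictions
accuracy = correct.sum().item() / len(target)
print(accuracy)                 # 0.8
```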


### Putting it all together
Now it is time to put it all together. We create a model with one hidden layer of 10 neurons and train it on the whole dataset. After that we evaluate the accuracy of our model on the training data.

In [None]:
model = create_model([10])
model = trainModel(model, data, target, 50, lr = 0.01)
accuracy = calculate_accuracy(data,target, model)
print(f"Accuracy on training set: {accuracy*100:.2f} %")

### Crossvalidation
100% accuracy looks really good, but maybe it is too good to be true. Until now we have evaluated on the same data we trained on. This is bad practice, especially for small datasets like ours, because our network may be overfitting and the training accuracy would not reveal it.

Now it's your turn: write a function that performs k-fold cross-validation on the dataset to test the quality of your model. To do so, split the data into k folds. For each fold, train a new model on the other k-1 folds and evaluate its accuracy on the held-out fold.

Return the individual accuracies as well as the average accuracy.

In [None]:
def kfold_crossvalidation(k, data, target, hidden = [10], epochs  = 50, lr = 0.01):
    #your code here
    return (accuracies, avg_accuracy)
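
As a starting point, the index bookkeeping for k-fold cross-validation could look like the sketch below. It only computes which sample indices belong to the training and test part of each fold and does no training. The helper name fold_indices is hypothetical, it assumes k is at least 2, and other ways of handling a remainder are equally valid.

```python
import torch

def fold_indices(num_samples, k):
    """Yield (train_idx, test_idx) index tensors for each of the k folds."""
    indices = torch.arange(num_samples)
    # tensor_split cuts into k nearly equal chunks, even if k does not divide num_samples
    chunks = torch.tensor_split(indices, k)
    for i in range(k):
        test_idx = chunks[i]
        # all remaining chunks together form the training part
        train_idx = torch.cat([chunks[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Made-up example: 10 samples, 3 folds
for train_idx, test_idx in fold_indices(10, 3):
    print("test:", test_idx.tolist(), "train:", train_idx.tolist())
```

Index tensors like these can be used directly to select rows, for example data[train_idx] and target[train_idx].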

### Test kfold crossvalidation
The following code can be used to test your implementation. If your average accuracy is around 97%, you have probably done it correctly.

In [None]:
torch.manual_seed(0)
kfold_crossvalidation(10, data, target, [10], 10, 0.01)

### Calculate the confusion Matrix
Since our model is not as perfect as it seems, let's find out what kind of misclassifications it produces. Write a function that calculates the confusion matrix for our data. To do so, create an m x m matrix with m = number of classes. Predict the classes and compare the predictions with the targets. Count how often samples of each true class were assigned to each class by our classifier.

In [None]:
def confusion_matrix(data,target,model):
    #Your code here
    return confusion_matrix
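
To illustrate the layout of a confusion matrix, here is a toy example on made-up labels using plain Python lists. It assumes the convention that rows are true classes and columns are predicted classes. The task does not fix this convention, and your implementation should work on tensors and a model instead.

```python
# Made-up labels for 3 classes
target = [0, 0, 1, 1, 2, 2, 2]
prediction = [0, 1, 1, 1, 2, 0, 2]

num_classes = 3
# matrix[t][p] counts samples of true class t that were predicted as class p
matrix = [[0] * num_classes for _ in range(num_classes)]
for t, p in zip(target, prediction):
    matrix[t][p] += 1

for row in matrix:
    print(row)
# [1, 1, 0]
# [0, 2, 0]
# [1, 0, 2]
```

Correct predictions end up on the diagonal, and every non-zero off-diagonal entry is a misclassification.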

### Test the confusion matrix
The following code can be used to test your confusion matrix. If you have implemented it correctly, the matrix should show one or more misclassifications.

In [None]:
torch.manual_seed(0)

training_data = data[0:120]
training_target = target[0:120]

test_data = data[120:]
test_target = target[120:]

model = create_model([10])
model = trainModel(model, training_data, training_target, 10, lr = 0.01)

print(confusion_matrix(test_data,test_target,model))

### What kind of error(s) did our model produce?

In [None]:
#Your answer here