### Part d): Classification  analysis using neural networks

With a well-written code it should now be easy to change the
activation function for the output layer.

Here we will change the cost function for our neural network code
developed in parts b) and c) in order to perform a classification analysis. 

We will here study the Wisconsin Breast Cancer  data set. This is a typical binary classification problem with just one single output, either True or Fale, $0$ or $1$ etc.
You find more information about this at the [Scikit-Learn
site](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html) or at the [University of California
at Irvine](https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(original)). 

To measure the performance of our classification problem we use the
so-called *accuracy* score.  The accuracy is as you would expect just
the number of correctly guessed targets $t_i$ divided by the total
number of targets, that is

$$
\text{Accuracy} = \frac{\sum_{i=1}^n I(t_i = y_i)}{n} ,
$$

where $I$ is the indicator function, $1$ if $t_i = y_i$ and $0$
otherwise if we have a binary classification problem. Here $t_i$
represents the target and $y_i$ the outputs of your FFNN code and $n$ is simply the number of targets $t_i$.

Discuss your results and give a critical analysis of the various parameters, including hyper-parameters like the learning rates and the regularization parameter $\lambda$ (as you did in Ridge Regression), various activation functions, number of hidden layers and nodes and activation functions.  

As stated in the introduction, it can also be useful to study other
datasets. 

Again, we strongly recommend that you compare your own neural Network
code for classification and pertinent results against a similar code using **Scikit-Learn**  or **tensorflow/keras** or **pytorch**.

PLAN:
1. last ned breast cancer data set
2. bruk ffnn til klassifikasjon (med ... som siste lag)
3. Bruk back propogation for å forbedre svaret
4. bruke accuracy til å teste resultatene mine 

In [69]:
## just downloading the dataset
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import autograd.numpy as np
from autograd import grad, elementwise_grad
import FFNN as fn

wisconsin = load_breast_cancer()
X = wisconsin.data
target = wisconsin.target
target = target.reshape(target.shape[0], 1)

X_train, X_val, t_train, t_val = train_test_split(X, target)

scaler = MinMaxScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_val = scaler.transform(X_val)

#print(X[0].size)

#""""
network_input_size = X[0].size
print(X.size)
print(network_input_size)
layer_output_sizes = [2, 3, 2]
activation_funcs = [fn.sigmoid, fn.sigmoid, fn.ReLU]
activation_ders = [fn.sigmoid_der, fn.sigmoid_der, fn.ReLU_der]

layers = [(np.random.randn(network_input_size,2), np.random.randn(network_input_size))]
batched_layers = [(layers[0][0].T, layers[0][1])]
##fn.create_layers(network_input_size, layer_output_sizes)

predict = fn.feed_forward_batch(X, layers, activation_funcs)

layer_grads = fn.backpropagation_batch(X, layers, activation_funcs, target, activation_ders)
print(layer_grads)

cost_grad = grad(fn.cost, 0)
print(cost_grad(layers, X, activation_funcs, target))

cost_grad = grad(fn.cost_batch, 0)
print(cost_grad(batched_layers, X, activation_funcs, target))
#"""

17070
30


ValueError: operands could not be broadcast together with shapes (569,2) (30,) 

In [264]:
## https://gpt.uio.no/chat/810796
import autograd.numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
import FFNN as fn
import random

np.random.seed(42) #random seed to ensure reproducibility
# Load and preprocess the Wisconsin Breast Cancer dataset
data = datasets.load_breast_cancer()
X = data.data
y = data.target
y = y.reshape(-1, 1)

scaler = StandardScaler()
X = scaler.fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42) 

input_size = X_train.shape[1]
output_layer = [1]
hidden_layers = [10]  
layers = fn.create_layers_batch(input_size, hidden_layers+output_layer)
activation_funcs = [fn.ReLU, fn.ReLU]
activation_ders = [fn.ReLU_der, fn.ReLU_der]
# Hyperparameters
learning_rate = 0.01
epochs = 1000



# Training loop
for epoch in range(epochs):
    grads =fn.backpropagation_batch(X_train, layers, activation_funcs, y_train, activation_ders)
    for i, (W, b) in enumerate(layers):
        dW, db = grads[i]
        W -= learning_rate * dW
        b -= learning_rate * db
        layers[i] = (W, b)
    
    if epoch % 100 == 0:
        cost_value = fn.cost_batch(layers, X_train, activation_funcs, y_train)
        print(f"Epoch {epoch}, cost: {cost_value}")

# Evaluate the model
prediction = fn.feed_forward_batch(X_test, layers, activation_funcs)
prediction = np.round(prediction) 
accuracy = accuracy_score(y_test, prediction) ## prediction is 1 or 0
print(f"Accuracy on test set: {accuracy}")

##confusion matrix:
falseP = 0
trueP = 0
falseN = 0
trueN = 0

for i in range(prediction.size):
    pred = prediction[i][0]
    sol = y_test[i][0]
    if pred==1 and sol==0 :
        falseP += 1
    elif pred==1 and sol==1:
        trueP +=1
    elif pred==0 and sol==1:
        falseN +=1
    else:
        trueN +=1

print("False positives: ", falseP )
print("True positives: ", trueP)
print("False Negatives: ", falseN )
print("True negatives: ", trueN)




Epoch 0, cost: 0.28078489039995186
Epoch 100, cost: 0.2590700171880951
Epoch 200, cost: 0.2323591085539164
Epoch 300, cost: 0.20303611055866658
Epoch 400, cost: 0.17551267807657506
Epoch 500, cost: 0.153316116978721
Epoch 600, cost: 0.13708704198667834
Epoch 700, cost: 0.12555328099051946
Epoch 800, cost: 0.11713940343342391
Epoch 900, cost: 0.11066877003437493
Accuracy on test set: 0.8881118881118881
False positives:  13
True positives:  86
False Negatives:  3
True negatives:  41


train_test_split har et parameter test_size, som er initialisert til å være 0.25. 

sigmpid, relu, sigmoid: lr=0.01, e=1000, test_size=0.99
Accuracy on test set: 0.6294326241134752
-||- test_size = 0.1
Accuracy on test set: 0.8947368421052632

relu, sigmoid, -||- test =0.1
Accuracy on test set: 0.7017543859649122

sigmoid, sigmoid -||-
Accuracy on test set: 0.8771929824561403

