# Fuzzy Machine Learning Model Fusion

<span style='font-size:1.5em; color:blue'>
 In this notebook, we will be working with the results of inference processing on remote sensing data from three deep convolutional neural networks (DCNN).
</span>


## Deep Convolutional Neural Networks

### ResNet50 

![images/resnet50_kaggle.png MISSING](images/resnet50_kaggle.png)

**_Image from [Kaggle.com](https://kaggle.com)_**



### InceptionV3

![images/inceptionv3_kaggle.png MISSING](images/inceptionv3_kaggle.png)

**_Image from [Kaggle.com](https://kaggle.com)_**


### DenseNet

![images/densenet_towarddatascience.png MISSING](images/densenet_towarddatascience.png)

**_Image from [TowardsDataScience.com](https://towardsdatascience.com/)_**

### Further Reading / References
 * [ResNet - Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)
 * [InceptionV3 - Rethinking the Inception Architecture for Computer Vision](https://arxiv.org/abs/1512.00567)
 * [DenseNet - Densely Connected Convolutional Networks](https://arxiv.org/abs/1608.06993)

---

## Data : Remote Sensing and Image Scene Classification

<span style='font-size:1.5em; color:blue'>
 The [RESISC-45 data set is described here.](https://arxiv.org/abs/1703.00121)
 The data is composed of 45 classes of remote image scenes.
 Below is a sample of the RESISC-45 data from the _arxiv_ paper.
</span>

![images/RESISC45_p1.PNG MISSING](images/RESISC45_p1.PNG)
![images/RESISC45_p2.PNG MISSING](images/RESISC45_p2.PNG)


---

## We did some heavy lifting for you!

### aka - GPU Training
![convfilters_giphy.gif MISSING](images/convfilters_giphy.gif)
**_Image from [Giphy](https://media.giphy.com/media/metK0W9OSCoyQ/giphy.gif)_**

<span style='font-size:1.5em; color:blue'>
 We have trained the three DCNN with the following hyperparameters:
</span>
 * <span style='font-size:1.25em'>Epochs : 15</span>
 * <span style='font-size:1.25em'>Batch Size : 64</span>
 * <span style='font-size:1.25em'>Optimizer: Adam </span>
 * <span style='font-size:1.25em'>Initial Learning Rate: 1e-3</span>

<span style='font-size:1.5em; color:blue'>
 Hardware
</span>
 * <span style='font-size:1.25em'>Nvidia V100</span>
 * <span style='font-size:1.25em'>approximately 10 hours for each architecture's 5-fold experiments</span>


<span style='font-size:1.5em; color:blue'>
Performance Characteristics - 5-Fold Cross-Validation
</span>
 * <span style='font-size:1.25em'>ResNet50 : 92.1%</span>
 * <span style='font-size:1.25em'>InceptionV3 : 60.6%</span>
 * <span style='font-size:1.25em'>DenseNet : 89.1%</span>

---

## Fusion with the Choquet Fuzzy Integral

<span style='font-size:1.5em; color:blue'>
We will be using the techniques of enhanced decision information fusion from [this publication](https://doi.org/10.1109/LGRS.2018.2839092).
</span>

![images/DCNN_Fusion_Framework_Vert.png MISSING](images/DCNN_Fusion_Framework_Vert.png)

#### Choquet Fuzzy Integral


In [1]:
import os
import sys
module_path = os.path.abspath(os.path.join('../src'))
if module_path not in sys.path:
    sys.path.append(module_path)
    
import ChI
import numpy as np
import csv
from os import listdir
from os.path import isfile, join
import pandas as pd
import cvxopt

In [2]:
def soft_max(samples):
    # Normalizes each sample w.r.t. each sample (soft max)
    for i in range(0, samples.shape[0]):  # for each sample
        for j in range(0, samples.shape[1]):  # for each network
            samples[i,j,:] = np.exp(samples[i, j, :]) / sum(np.exp(samples[i, j, :]))

    return [samples]


In [3]:
# Specify path for network outputs & csv containing cv accuracies
network_path = '../datafiles/pred'
cross_val_path = '../datafiles/cross_val'

# set up variables
image_names = []


In [4]:
csv_files = [f for f in listdir(network_path) if isfile(join(network_path, f))] # this is for the data files
cross_val = [f for f in listdir(cross_val_path) if isfile(join(cross_val_path, f))] # this is for the cv accuracies
num_nets = csv_files.__len__() # how many nets? 


In [5]:
# Create dictionary to store samples
data = dict.fromkeys(csv_files)

# Read in all of the csv data into a dictionary
densities = []
for file in cross_val:
    csv_data = []
    data_info = np.genfromtxt((cross_val_path + '/' + file), usecols=(1), skip_header=True,dtype="f", delimiter=',')
    densities.append(np.mean(data_info))

densities = np.asarray(densities)
print(densities)

[0.9216391  0.6062524  0.89075804]


In [6]:
first_net = 1 # this is a flag.
print(csv_files)


['pred.densenet.5fold::A.csv', 'pred.inception.5fold::A.csv', 'pred.res50.5fold::A.csv']


In [7]:
for file in csv_files:
    csv_data = []
    data_info = np.genfromtxt((network_path + '/' + file), usecols=(1, 2, 3), dtype="|U", delimiter=',') 
    confidence_vectors = np.genfromtxt((network_path + '/' + file), delimiter=';')

    for line in range(0, data_info.__len__()):
        if first_net:
            image_names.append(data_info[line,0])
        csv_data.append(np.hstack((data_info[line,:-1], data_info[line,2].partition(';')[0], confidence_vectors[line, 1:])))

    first_net = 0
    data[file] = csv_data

In [8]:

# How many classes are there?
num_classes = confidence_vectors.shape[1]# Assuming the first 4 columns are 'image	y_true	confidence	y_pred'

    # Now I need to build the samples and their corresponding labels
    # There will be the same number of ChI's as there are classes(L0
    # One ChI per class, so each one sample will turn into L samples
num_samples = data_info.__len__()
samples = np.zeros([num_samples, csv_files.__len__(), num_classes])
label = np.zeros([num_samples, num_classes])

samples_list = []

for i in range(0,3):
    samples_list.append(list(pd.read_csv(f'samples{0}.csv', header=None, index_col=0).itertuples()))
samples = np.asarray(samples_list).transpose(1,0,2)
label = np.asarray(list(pd.read_csv(f'label.csv', header=None, index_col=0).itertuples()))


In [9]:
##############################################
# Start Training and Testing
##############################################
print('--Starting Training and Testing')

train_samples = samples.copy()
train_labels  = label

test_samples = samples.copy()
test_labels = label
##############################################
# Normalize Training Data & Testing Data
##############################################
[train_samples] = soft_max(train_samples)
[test_samples] = soft_max(test_samples)

print('--Training--')
##############################################
# Train the ChI(s)
##############################################

CHIs = []
for j in range(0, num_classes):
    CHIs.append(ChI.ChoquetIntegral())

for j, chi in enumerate(CHIs):
    print('Class ChI {}'.format(j))
    tr = chi.train_chi_sugeno(densities)
    print(chi.fm)


--Starting Training and Testing


  """
  """


--Training--
Class ChI 0
[0.92163908 0.60625237 0.97117915 0.89075804 0.9944276  0.95895167
 1.        ]
Class ChI 1
[0.92163908 0.60625237 0.97117915 0.89075804 0.9944276  0.95895167
 1.        ]
Class ChI 2
[0.92163908 0.60625237 0.97117915 0.89075804 0.9944276  0.95895167
 1.        ]
Class ChI 3
[0.92163908 0.60625237 0.97117915 0.89075804 0.9944276  0.95895167
 1.        ]
Class ChI 4
[0.92163908 0.60625237 0.97117915 0.89075804 0.9944276  0.95895167
 1.        ]
Class ChI 5
[0.92163908 0.60625237 0.97117915 0.89075804 0.9944276  0.95895167
 1.        ]
Class ChI 6
[0.92163908 0.60625237 0.97117915 0.89075804 0.9944276  0.95895167
 1.        ]
Class ChI 7
[0.92163908 0.60625237 0.97117915 0.89075804 0.9944276  0.95895167
 1.        ]
Class ChI 8
[0.92163908 0.60625237 0.97117915 0.89075804 0.9944276  0.95895167
 1.        ]
Class ChI 9
[0.92163908 0.60625237 0.97117915 0.89075804 0.9944276  0.95895167
 1.        ]
Class ChI 10
[0.92163908 0.60625237 0.97117915 0.89075804 0.9944276

In [10]:
print('--Testing--')
##############################################
# Test the ChI(s)
##############################################
exper_out, known_out = [], []
dec = []
for j in range(0, test_samples.shape[0]): # for each data point
    out = []
    for k, chi in enumerate(CHIs): # for each ChI
        test_sample = np.transpose(test_samples[j, :, k])
        test_label = np.argmax(test_labels[j, :])
        out.append(chi.chi_sugeno(test_sample))
    out = np.asarray(out)
    exper_out.append(np.argmax(out))
    known_out.append(test_label)
for i in range(0, exper_out.__len__()):
    if exper_out[i] == known_out[i]:
        dec.append(1)
    else:
        dec.append(0)
with open('Results{}.csv'.format(i), 'w', newline='') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')
    for k in range(0, exper_out.__len__()):
        spamwriter.writerow([exper_out[k], known_out[k]])

acc = np.sum(dec) / dec.__len__()
print(acc)

--Testing--
0.9993761696818465


In [14]:
# DATA DRIVEN
print('--Training--')
##############################################
# Train the ChI(s)
##############################################

CHIs = []
for j in range(0, num_classes):
    CHIs.append(ChI.ChoquetIntegral())

print(train_samples)
      
for j, chi in enumerate(CHIs):
    print('Class ChI {}'.format(j))
    train_data = np.transpose(train_samples[:,:,j])
    label_data = train_labels[:, j]
    tr = chi.train_chi_quad(train_data, label_data)
    print(chi.fm)


--Training--
[[[8.10686725e-03 9.91584782e-01 1.68251057e-04 ... 5.73747486e-08
   1.11606210e-06 6.75650366e-12]
  [8.10686725e-03 9.91584782e-01 1.68251057e-04 ... 5.73747486e-08
   1.11606210e-06 6.75650366e-12]
  [8.10686725e-03 9.91584782e-01 1.68251057e-04 ... 5.73747486e-08
   1.11606210e-06 6.75650366e-12]]

 [[2.11738797e-02 1.08142790e-08 3.49496266e-08 ... 3.84348733e-10
   2.63677546e-08 6.17516461e-10]
  [2.11738797e-02 1.08142790e-08 3.49496266e-08 ... 3.84348733e-10
   2.63677546e-08 6.17516461e-10]
  [2.11738797e-02 1.08142790e-08 3.49496266e-08 ... 3.84348733e-10
   2.63677546e-08 6.17516461e-10]]

 [[5.55360225e-02 1.04346388e-08 3.37227038e-08 ... 3.70855994e-10
   2.54421024e-08 5.95838261e-10]
  [5.55360225e-02 1.04346388e-08 3.37227038e-08 ... 3.70855994e-10
   2.54421024e-08 5.95838261e-10]
  [5.55360225e-02 1.04346388e-08 3.37227038e-08 ... 3.70855994e-10
   2.54421024e-08 5.95838261e-10]]

 ...

 [[           nan 0.00000000e+00 0.00000000e+00 ... 0.00000000e+00

ValueError: domain error

In [None]:
print('--Testing--')
##############################################
# Test the ChI(s)
##############################################
exper_out, known_out = [], []
dec = []
for j in range(0, test_samples.shape[0]): # for each data point
    out = []
    for k, chi in enumerate(CHIs): # for each ChI
        test_sample = np.transpose(test_samples[j, :, k])
        test_label = np.argmax(test_labels[j, :])
        out.append(chi.chi_quad(test_sample))
    out = np.asarray(out)
    exper_out.append(np.argmax(out))
    known_out.append(test_label)
for i in range(0, exper_out.__len__()):
    if exper_out[i] == known_out[i]:
        dec.append(1)
    else:
        dec.append(0)
with open('Results{}.csv'.format(i), 'w', newline='') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')
    for k in range(0, exper_out.__len__()):
        spamwriter.writerow([exper_out[k], known_out[k]])

acc = np.sum(dec) / dec.__len__()