<a href="https://colab.research.google.com/github/iamshnoo/mlpack-testing/blob/master/Multi_Label_Soft_Margin_Loss.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Multi Label Soft Margin Loss function

## Link to PyTorch Docs for this function


https://pytorch.org/docs/stable/nn.html#multilabelsoftmarginloss
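
For reference, the quantity computed throughout this notebook can be written as follows. This restates the formula from the PyTorch documentation linked above. The symbols $N$ (number of examples), $C$ (number of classes) and $w_c$ (per-class weight) are notation chosen here, and the placement of $w_c$ inside the sum is inferred from the results of the test cases below, not quoted from the docs.

$$
\ell_n = -\frac{1}{C}\sum_{c=1}^{C} w_c\left[\,y_{nc}\log\sigma(x_{nc}) + (1-y_{nc})\log\sigma(-x_{nc})\right],
\qquad \sigma(z)=\frac{1}{1+e^{-z}}
$$

$$
L_{\text{sum}} = \sum_{n=1}^{N}\ell_n,
\qquad
L_{\text{mean}} = \frac{1}{N}\sum_{n=1}^{N}\ell_n
$$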

## Imports and Installations

### Install mlpack library from linux package manager

In [0]:
%%capture
!sudo apt-get install -y libmlpack-dev

### PyTorch library imports

In [0]:
import torch
import torch.nn as nn

## Test Case 1 - 3 classes, 3 examples, not weighted (used in loss_functions_test.cpp)

### Assumptions and Statements

**Overview :**

Input is a collection of 3 multi-label examples. <br>
Each example belongs to exactly 1 of 3 possible classes. <br>
Hence, the class label for each example is a 1x3 one-hot encoded vector. <br>
The default weight of 1 is assigned to each class using a 1x3 weight vector. <br>

---

**Summary of shapes of tensors involved :** <br>

The shape of the input matrix is 3x3. <br>
The shape of each example is 1x3. <br>
The shape of each class label is 1x3. <br>
The shape of the target matrix is 3x3. <br>
The shape of the weight matrix is 1x3. <br>
The shape of the gradient of the input matrix will also be 3x3. <br>
The output of the forward function is reduced to a 1x1 tensor using either sum or mean reduction. <br>

---
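
The forward pass described above can be sketched without any library, which makes the two reduction steps explicit: first an average over the classes of each example, then a sum or a mean over the examples. This is a minimal sketch, not the PyTorch or mlpack implementation; the helper names `log_sigmoid` and `soft_margin_loss` are hypothetical and chosen only for this illustration.

```python
import math


def log_sigmoid(v):
    """Numerically stable log(1 / (1 + exp(-v)))."""
    if v >= 0:
        return -math.log1p(math.exp(-v))
    return v - math.log1p(math.exp(v))


def soft_margin_loss(x, y, weight, reduction="mean"):
    """Reference forward pass: x and y are lists of rows, weight has one entry per class."""
    per_example = []
    for row_x, row_y in zip(x, y):
        total = sum(-w * (t * log_sigmoid(v) + (1 - t) * log_sigmoid(-v))
                    for v, t, w in zip(row_x, row_y, weight))
        per_example.append(total / len(row_x))  # average over the classes
    if reduction == "sum":
        return sum(per_example)                 # 1x1 result, summed over examples
    return sum(per_example) / len(per_example)  # 1x1 result, averaged over examples


# Data of Test Case 1 (3 examples, 3 classes, unit weights).
x = [[0.1778, 0.1203, -0.2264], [0.0957, 0.2403, -0.3400], [0.1397, 0.1925, -0.3336]]
y = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
w = [1, 1, 1]

loss_sum = soft_margin_loss(x, y, w, "sum")
loss_mean = soft_margin_loss(x, y, w, "mean")
print(loss_sum, loss_mean)
```

The two printed values should agree with the PyTorch results for this test case (2.14829 and 0.716095) to about four decimal places; small differences in the last digits are expected because this sketch uses double precision.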

### Sum

In [3]:
torch.set_printoptions(precision=5)
x = torch.tensor([[ 0.1778,  0.1203, -0.2264],[ 0.0957,  0.2403, -0.3400],[ 0.1397,  0.1925, -0.3336]], requires_grad=True) # 3 Rows, 3 columns
y = torch.tensor([[0., 1., 0.],[1., 0., 0.],[0., 0., 1.]]) # 3 Rows, 3 columns (targets need no gradient)
weights = torch.tensor([1, 1, 1])
criterion_sum = nn.MultiLabelSoftMarginLoss(reduction='sum', weight=weights)
loss_sum = criterion_sum(x,y)
print("FORWARD:\n", loss_sum, "\nBACKWARD")
loss_sum.backward()
print(x.grad)
print("---------------------------------------------------------------")
torch.set_printoptions(precision=6)
print(torch.sum(x.grad))
print("---------------------------------------------------------------")

FORWARD:
 tensor(2.14829, grad_fn=<SumBackward0>) 
BACKWARD
tensor([[ 0.18144, -0.15665,  0.14788],
        [-0.15870,  0.18660,  0.13860],
        [ 0.17829,  0.18266, -0.19421]])
---------------------------------------------------------------
tensor(0.505909)
---------------------------------------------------------------


### Mean

In [4]:
torch.set_printoptions(precision=6)
x = torch.tensor([[ 0.1778,  0.1203, -0.2264],[ 0.0957,  0.2403, -0.3400],[ 0.1397,  0.1925, -0.3336]], requires_grad=True) # 3 Rows, 3 columns
y = torch.tensor([[0., 1., 0.],[1., 0., 0.],[0., 0., 1.]]) # 3 Rows, 3 columns (targets need no gradient)
weights = torch.tensor([1, 1, 1])
criterion_mean = nn.MultiLabelSoftMarginLoss(reduction='mean', weight=weights)
loss_mean = criterion_mean(x,y)
print("FORWARD:\n", loss_mean, "\nBACKWARD")
loss_mean.backward()
print(x.grad)
print("---------------------------------------------------------------")
print(torch.sum(x.grad))
print("---------------------------------------------------------------")

FORWARD:
 tensor(0.716095, grad_fn=<MeanBackward0>) 
BACKWARD
tensor([[ 0.060481, -0.052218,  0.049293],
        [-0.052899,  0.062199,  0.046201],
        [ 0.059430,  0.060886, -0.064737]])
---------------------------------------------------------------
tensor(0.168636)
---------------------------------------------------------------
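
The gradients printed above follow a closed form: each entry is `weight * (sigmoid(x) - y)`, divided by the number of classes for the sum reduction, or by the number of examples times the number of classes for the mean reduction. The sketch below applies that formula to the Test Case 1 data. The function name `soft_margin_grad` is hypothetical, and the closed form is derived here from the loss definition, not taken from the PyTorch source.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def soft_margin_grad(x, y, weight, reduction="mean"):
    """Analytic gradient of the loss with respect to x (same shape as x)."""
    n_rows, n_cols = len(x), len(x[0])
    # 'sum' keeps only the 1/C class average; 'mean' also divides by the N examples.
    scale = n_cols if reduction == "sum" else n_rows * n_cols
    return [[w * (sigmoid(v) - t) / scale for v, t, w in zip(row_x, row_y, weight)]
            for row_x, row_y in zip(x, y)]


# Data of Test Case 1 (3 examples, 3 classes, unit weights).
x = [[0.1778, 0.1203, -0.2264], [0.0957, 0.2403, -0.3400], [0.1397, 0.1925, -0.3336]]
y = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
w = [1, 1, 1]

grad_sum = soft_margin_grad(x, y, w, "sum")
grad_mean = soft_margin_grad(x, y, w, "mean")
total_sum = sum(sum(row) for row in grad_sum)    # checksum, as printed by torch.sum(x.grad)
total_mean = sum(sum(row) for row in grad_mean)
print(grad_sum[0], total_sum, total_mean)
```

The mean-reduction gradient is the sum-reduction gradient divided by the number of examples, which is why the two checksums differ by a factor of 3 in this test case.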


### C++ File **`test.cpp`** to reproduce the same results

In [0]:
%%capture
%%writefile test.cpp
#include <iostream>
#include <armadillo>

using namespace std;
using namespace arma;

int main()
{
  // Input, target and per-class weight matrices
  arma::mat x,y;
  arma::mat weight;

  x << 0.1778 << 0.1203 << -0.2264 << endr
    << 0.0957 << 0.2403 << -0.3400 << endr
    << 0.1397 << 0.1925 << -0.3336 << endr;

  y << 0 << 1 << 0 << endr
    << 1 << 0 << 0 << endr
    << 0 << 0 << 1 << endr;

  weight.ones(1,3);

  // Forward: per-element loss is -(y * log(sigmoid(x)) + (1 - y) * log(sigmoid(-x))).
  arma::mat logSigmoid = arma::log(1 / (1 + arma::exp(-x)));
  arma::mat logSigmoidNeg = arma::log(1 / (1 + arma::exp(x)));
  // Sum over examples (rows), apply the class weights, then average over classes.
  arma::mat loss = arma::mean(arma::sum(-(y % logSigmoid + (1 - y) % logSigmoidNeg)) % weight, 1);
  double loss_sum = arma::as_scalar(loss);
  double loss_mean = arma::as_scalar(loss / x.n_rows);

  // Backward (mean reduction): weight * (sigmoid(x) - y) / (rows * cols).
  // The sum-reduction gradient is this matrix multiplied by the number of rows.
  arma::mat sigmoid = 1 / (1 + arma::exp(-x));
  arma::mat output = -(y % (1 - sigmoid) - (1 - y) % sigmoid) % arma::repmat(weight, y.n_rows, 1) / x.n_elem;

  // Display
  cout << "------------------------------------------------------------------" << endl;
  cout << "USER-PROVIDED MATRICES : " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "Input shape : "<< x.n_rows << " " << x.n_cols << endl;
  cout << "Input : " << endl << x << endl;
  cout << "Target shape : "<< y.n_rows << " " << y.n_cols << endl;
  cout << "Target : " << endl << y << endl;
  cout << "Weight : " << weight << endl;
  cout << "Weight shape : "<< weight.n_rows << " " << weight.n_cols << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "SUM " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "FORWARD : " << endl;
  cout << "Loss (sum):\n" << loss_sum << '\n';
  cout << "BACKWARD : " << endl;
  cout << "Output shape : "<< output.n_rows << " " << output.n_cols << endl;
  cout << "Output (sum) : " << endl << output * x.n_rows << endl;                                             
  cout << "Sum of all values in this matrix : " << arma::as_scalar(arma::accu(output * x.n_rows)) << endl;    
  cout << "------------------------------------------------------------------" << endl;
  cout << "MEAN " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "FORWARD : " << endl;
  cout << "Loss (mean):\n" << loss_mean << '\n';
  cout << "BACKWARD : " << endl;
  cout << "Output shape : "<< output.n_rows << " " << output.n_cols << endl;
  cout << "Output (mean) : " << endl << output << endl;
  cout << "Sum of all values in this matrix : " << arma::as_scalar(arma::accu(output)) << endl;
  cout << "------------------------------------------------------------------" << endl;
  return 0;
}

### Run using **`g++ test.cpp -o test -larmadillo && ./test`**

In [6]:
%%script bash
g++ test.cpp -o test -larmadillo && ./test

------------------------------------------------------------------
USER-PROVIDED MATRICES : 
------------------------------------------------------------------
Input shape : 3 3
Input : 
   0.1778   0.1203  -0.2264
   0.0957   0.2403  -0.3400
   0.1397   0.1925  -0.3336

Target shape : 3 3
Target : 
        0   1.0000        0
   1.0000        0        0
        0        0   1.0000

Weight :    1.0000   1.0000   1.0000

Weight shape : 1 3
------------------------------------------------------------------
SUM 
------------------------------------------------------------------
FORWARD : 
Loss (sum):
2.14829
BACKWARD : 
Output shape : 3 3
Output (sum) : 
   0.1814  -0.1567   0.1479
  -0.1587   0.1866   0.1386
   0.1783   0.1827  -0.1942

Sum of all values in this matrix : 0.505909
------------------------------------------------------------------
MEAN 
------------------------------------------------------------------
FORWARD : 
Loss (mean):
0.716095
BACKWARD : 
Output shape : 3 3
Output 

## Test Case 2 - 3 classes, 4 examples, weighted (used in loss_functions_test.cpp)

### Assumptions and Statements

**Overview :**

Input is a collection of 4 multi-label examples. <br>
Each example belongs to exactly 1 of 3 possible classes. <br>
Hence, the class label for each example is a 1x3 one-hot encoded vector. <br>
A different weight is assigned to each class using a 1x3 weight vector. <br>

---

**Summary of shapes of tensors involved :** <br>

The shape of the input matrix is 4x3. <br>
The shape of each example is 1x3. <br>
The shape of each class label is 1x3. <br>
The shape of the target matrix is 4x3. <br>
The shape of the weight matrix is 1x3. <br>
The shape of the gradient of the input matrix will also be 4x3. <br>
The output of the forward function is reduced to a 1x1 tensor using either sum or mean reduction. <br>

---
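
To see how the 1x3 weight vector enters the result, the loss can be accumulated per class (column) first and weighted afterwards. This is the same order of operations as the C++ code in this notebook: sum over examples, multiply by the weights, average over classes. The sketch below is illustrative only; the helper name `softplus` and the variable names are assumptions made for this example.

```python
import math


def softplus(v):
    """log(1 + exp(v)), computed stably."""
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))


# Data of Test Case 2 (4 examples, 3 classes, weights 1, 2, 3).
x = [[0.1778, 0.1203, -0.2264], [0.0957, 0.2403, -0.3400],
     [0.1397, 0.1925, -0.3336], [0.2256, 0.3144, -0.8695]]
y = [[0, 1, 0], [1, 0, 0], [0, 0, 1], [1, 0, 0]]
weight = [1, 2, 3]
n_classes = len(weight)

# Unweighted loss accumulated per class:
# -log(sigmoid(x)) = softplus(-x) where y == 1, -log(sigmoid(-x)) = softplus(x) where y == 0.
class_loss = [0.0] * n_classes
for row_x, row_y in zip(x, y):
    for c, (v, t) in enumerate(zip(row_x, row_y)):
        class_loss[c] += softplus(-v) if t == 1 else softplus(v)

unweighted_sum = sum(class_loss) / n_classes
weighted_sum = sum(w * l for w, l in zip(weight, class_loss)) / n_classes
weighted_mean = weighted_sum / len(x)
print(class_loss, unweighted_sum, weighted_sum, weighted_mean)
```

With weights 1, 2 and 3, the second and third class columns count two and three times as much as the first, so `weighted_sum` is larger than `unweighted_sum` for the same inputs.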

### Sum

In [7]:
torch.set_printoptions(precision=5)
x = torch.tensor([[ 0.1778,  0.1203, -0.2264],[ 0.0957,  0.2403, -0.3400],[ 0.1397,  0.1925, -0.3336], [ 0.2256, 0.3144, -0.8695]], requires_grad=True) # 4 Rows, 3 columns
y = torch.tensor([[0., 1., 0.],[1., 0., 0.],[0., 0., 1.],[1., 0., 0.]]) # 4 Rows, 3 columns (targets need no gradient)
weights = torch.tensor([1, 2, 3])
criterion_sum = nn.MultiLabelSoftMarginLoss(reduction='sum', weight=weights)
loss_sum = criterion_sum(x,y)
print("FORWARD:\n", loss_sum, "\nBACKWARD")
loss_sum.backward()
print(x.grad)
print("---------------------------------------------------------------")
print(torch.sum(x.grad))
print("---------------------------------------------------------------")

FORWARD:
 tensor(5.35057, grad_fn=<SumBackward0>) 
BACKWARD
tensor([[ 0.18144, -0.31331,  0.44364],
        [-0.15870,  0.37319,  0.41581],
        [ 0.17829,  0.36532, -0.58264],
        [-0.14795,  0.38531,  0.29536]])
---------------------------------------------------------------
tensor(1.43577)
---------------------------------------------------------------


### Mean

In [8]:
torch.set_printoptions(precision=5)
x = torch.tensor([[ 0.1778,  0.1203, -0.2264],[ 0.0957,  0.2403, -0.3400],[ 0.1397,  0.1925, -0.3336], [ 0.2256, 0.3144, -0.8695]], requires_grad=True) # 4 Rows, 3 columns
y = torch.tensor([[0., 1., 0.],[1., 0., 0.],[0., 0., 1.],[1., 0., 0.]]) # 4 Rows, 3 columns (targets need no gradient)
weights = torch.tensor([1, 2, 3])
criterion_mean = nn.MultiLabelSoftMarginLoss(reduction='mean', weight=weights)
loss_mean = criterion_mean(x,y)
print("FORWARD:\n", loss_mean, "\nBACKWARD")
loss_mean.backward()
print(x.grad)
print("---------------------------------------------------------------")
torch.set_printoptions(precision=6)
print(torch.sum(x.grad))
print("---------------------------------------------------------------")

FORWARD:
 tensor(1.33764, grad_fn=<MeanBackward0>) 
BACKWARD
tensor([[ 0.04536, -0.07833,  0.11091],
        [-0.03967,  0.09330,  0.10395],
        [ 0.04457,  0.09133, -0.14566],
        [-0.03699,  0.09633,  0.07384]])
---------------------------------------------------------------
tensor(0.358943)
---------------------------------------------------------------


### C++ File **`test.cpp`** to reproduce the same results

In [0]:
%%capture
%%writefile test.cpp
#include <iostream>
#include <armadillo>

using namespace std;
using namespace arma;

int main()
{
  // Input, target and per-class weight matrices
  arma::mat x,y;
  arma::mat weight;

  x << 0.1778 << 0.1203 << -0.2264 << endr
    << 0.0957 << 0.2403 << -0.3400 << endr
    << 0.1397 << 0.1925 << -0.3336 << endr
    << 0.2256 << 0.3144 << -0.8695 << endr;

  y << 0 << 1 << 0 << endr
    << 1 << 0 << 0 << endr
    << 0 << 0 << 1 << endr
    << 1 << 0 << 0 << endr;

  weight.ones(1,3);
  weight(0, 1) = 2;
  weight(0, 2) = 3;

  // Forward: per-element loss is -(y * log(sigmoid(x)) + (1 - y) * log(sigmoid(-x))).
  arma::mat logSigmoid = arma::log(1 / (1 + arma::exp(-x)));
  arma::mat logSigmoidNeg = arma::log(1 / (1 + arma::exp(x)));
  // Sum over examples (rows), apply the class weights, then average over classes.
  arma::mat loss = arma::mean(arma::sum(-(y % logSigmoid + (1 - y) % logSigmoidNeg)) % weight, 1);
  double loss_sum = arma::as_scalar(loss);
  double loss_mean = arma::as_scalar(loss / x.n_rows);

  // Backward (mean reduction): weight * (sigmoid(x) - y) / (rows * cols).
  // The sum-reduction gradient is this matrix multiplied by the number of rows.
  arma::mat sigmoid = 1 / (1 + arma::exp(-x));
  arma::mat output = -(y % (1 - sigmoid) - (1 - y) % sigmoid) % arma::repmat(weight, y.n_rows, 1) / x.n_elem;

  // Display
  cout << "------------------------------------------------------------------" << endl;
  cout << "USER-PROVIDED MATRICES : " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "Input shape : "<< x.n_rows << " " << x.n_cols << endl;
  cout << "Input : " << endl << x << endl;
  cout << "Target shape : "<< y.n_rows << " " << y.n_cols << endl;
  cout << "Target : " << endl << y << endl;
  cout << "Weight : " << weight << endl;
  cout << "Weight shape : "<< weight.n_rows << " " << weight.n_cols << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "SUM " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "FORWARD : " << endl;
  cout << "Loss (sum):\n" << loss_sum << '\n';
  cout << "BACKWARD : " << endl;
  cout << "Output shape : "<< output.n_rows << " " << output.n_cols << endl;
  cout << "Output (sum) : " << endl << output * x.n_rows << endl;                                             
  cout << "Sum of all values in this matrix : " << arma::as_scalar(arma::accu(output * x.n_rows)) << endl;    
  cout << "------------------------------------------------------------------" << endl;
  cout << "MEAN " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "FORWARD : " << endl;
  cout << "Loss (mean):\n" << loss_mean << '\n';
  cout << "BACKWARD : " << endl;
  cout << "Output shape : "<< output.n_rows << " " << output.n_cols << endl;
  cout << "Output (mean) : " << endl << output << endl;
  cout << "Sum of all values in this matrix : " << arma::as_scalar(arma::accu(output)) << endl;
  cout << "------------------------------------------------------------------" << endl;
  return 0;
}

### Run using **`g++ test.cpp -o test -larmadillo && ./test`**

In [10]:
%%script bash
g++ test.cpp -o test -larmadillo && ./test

------------------------------------------------------------------
USER-PROVIDED MATRICES : 
------------------------------------------------------------------
Input shape : 4 3
Input : 
   0.1778   0.1203  -0.2264
   0.0957   0.2403  -0.3400
   0.1397   0.1925  -0.3336
   0.2256   0.3144  -0.8695

Target shape : 4 3
Target : 
        0   1.0000        0
   1.0000        0        0
        0        0   1.0000
   1.0000        0        0

Weight :    1.0000   2.0000   3.0000

Weight shape : 1 3
------------------------------------------------------------------
SUM 
------------------------------------------------------------------
FORWARD : 
Loss (sum):
5.35057
BACKWARD : 
Output shape : 4 3
Output (sum) : 
   0.1814  -0.3133   0.4436
  -0.1587   0.3732   0.4158
   0.1783   0.3653  -0.5826
  -0.1479   0.3853   0.2954

Sum of all values in this matrix : 1.43577
------------------------------------------------------------------
MEAN 
-------------------------------------------------------

## Test Case 3 - 4 classes, 3 examples, weighted

### Assumptions and Statements

**Overview :**

Input is a collection of 3 multi-label examples. <br>
Each example belongs to one or more of 4 possible classes (the first example has two labels). <br>
Hence, the class label for each example is a 1x4 multi-hot encoded vector. <br>
A different weight is assigned to each class using a 1x4 weight vector. <br>

---

**Summary of shapes of tensors involved :** <br>

The shape of the input matrix is 3x4. <br>
The shape of each example is 1x4. <br>
The shape of each class label is 1x4. <br>
The shape of the target matrix is 3x4. <br>
The shape of the weight matrix is 1x4. <br>
The shape of the gradient of the input matrix will also be 3x4. <br>
The output of the forward function is reduced to a 1x1 tensor using either sum or mean reduction. <br>

---
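
One way to gain confidence that a 3x4 gradient matrix is correct, independently of PyTorch, is a central finite-difference check: perturb one input entry at a time and compare the change in the loss with the analytic gradient. The sketch below does this for the sum reduction on the data of this test case. All function names are hypothetical, and the plain `exp` calls are adequate only for small inputs such as these.

```python
import math


def loss_sum(x, y, weight):
    """Sum-reduced loss: average over classes, then sum over examples."""
    total = 0.0
    for row_x, row_y in zip(x, y):
        row = sum(w * math.log1p(math.exp(-v if t == 1 else v))
                  for v, t, w in zip(row_x, row_y, weight))
        total += row / len(row_x)
    return total


def analytic_grad(x, y, weight):
    n_cols = len(x[0])
    return [[w * (1.0 / (1.0 + math.exp(-v)) - t) / n_cols
             for v, t, w in zip(row_x, row_y, weight)]
            for row_x, row_y in zip(x, y)]


def numerical_grad(x, y, weight, eps=1e-6):
    grad = []
    for i, row in enumerate(x):
        grad_row = []
        for j in range(len(row)):
            plus = [list(r) for r in x]
            minus = [list(r) for r in x]
            plus[i][j] += eps
            minus[i][j] -= eps
            grad_row.append((loss_sum(plus, y, weight) - loss_sum(minus, y, weight)) / (2 * eps))
        grad.append(grad_row)
    return grad


# Data of Test Case 3 (3 examples, 4 classes, weights 1..4).
x = [[0.1778, 0.1203, -0.2264, 0.1406], [0.0957, 0.2403, -0.3400, 0.0276],
     [0.1397, 0.1925, -0.3336, 0.3144]]
y = [[0, 1, 0, 1], [1, 0, 0, 0], [0, 0, 1, 0]]
weight = [1, 2, 3, 4]

analytic = analytic_grad(x, y, weight)
numeric = numerical_grad(x, y, weight)
max_diff = max(abs(a - n) for ra, rn in zip(analytic, numeric) for a, n in zip(ra, rn))
print(loss_sum(x, y, weight), max_diff)
```

A small `max_diff` indicates that the closed-form gradient and the forward pass are consistent with each other; it does not by itself prove agreement with any particular library.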

### Sum

In [11]:
torch.set_printoptions(precision=5)
x = torch.tensor([[ 0.1778,  0.1203, -0.2264, 0.1406],[ 0.0957,  0.2403, -0.3400, 0.0276],[ 0.1397,  0.1925, -0.3336, 0.3144]], requires_grad=True) # 3 Rows, 4 columns
y = torch.tensor([[0., 1., 0., 1.],[1., 0., 0., 0.],[0., 0., 1., 0.]]) # 3 Rows, 4 columns
weights = torch.tensor([1, 2, 3, 4])
criterion_sum = nn.MultiLabelSoftMarginLoss(reduction='sum', weight=weights)
loss_sum = criterion_sum(x,y)
print("FORWARD:\n", loss_sum, "\nBACKWARD")
loss_sum.backward()
print(x.grad)
print("---------------------------------------------------------------")
print(torch.sum(x.grad))
print("---------------------------------------------------------------")

FORWARD:
 tensor(5.36739, grad_fn=<SumBackward0>) 
BACKWARD
tensor([[ 0.13608, -0.23498,  0.33273, -0.46491],
        [-0.11902,  0.27989,  0.31186,  0.50690],
        [ 0.13372,  0.27399, -0.43698,  0.57796]])
---------------------------------------------------------------
tensor(1.29724)
---------------------------------------------------------------


### Mean

In [12]:
torch.set_printoptions(precision=5)
x = torch.tensor([[ 0.1778,  0.1203, -0.2264, 0.1406],[ 0.0957,  0.2403, -0.3400, 0.0276],[ 0.1397,  0.1925, -0.3336, 0.3144]], requires_grad=True) # 3 Rows, 4 columns
y = torch.tensor([[0., 1., 0., 1.],[1., 0., 0., 0.],[0., 0., 1., 0.]]) # 3 Rows, 4 columns
weights = torch.tensor([1, 2, 3, 4])
criterion_mean = nn.MultiLabelSoftMarginLoss(reduction='mean', weight=weights)
loss_mean = criterion_mean(x,y)
print("FORWARD:\n", loss_mean, "\nBACKWARD")
loss_mean.backward()
print(x.grad)
print("---------------------------------------------------------------")
torch.set_printoptions(precision=6)
print(torch.sum(x.grad))
print("---------------------------------------------------------------")

FORWARD:
 tensor(1.78913, grad_fn=<MeanBackward0>) 
BACKWARD
tensor([[ 0.04536, -0.07833,  0.11091, -0.15497],
        [-0.03967,  0.09330,  0.10395,  0.16897],
        [ 0.04457,  0.09133, -0.14566,  0.19265]])
---------------------------------------------------------------
tensor(0.432414)
---------------------------------------------------------------


### C++ File **`test.cpp`** to reproduce the same results

In [0]:
%%capture
%%writefile test.cpp
#include <iostream>
#include <armadillo>

using namespace std;
using namespace arma;

int main()
{
  // Input, target and per-class weight matrices
  arma::mat x,y;
  arma::mat weight;

  x << 0.1778 << 0.1203 << -0.2264 << 0.1406 << endr
    << 0.0957 << 0.2403 << -0.3400 << 0.0276 << endr
    << 0.1397 << 0.1925 << -0.3336 << 0.3144 << endr;

  y << 0 << 1 << 0 << 1 << endr
    << 1 << 0 << 0 << 0 << endr
    << 0 << 0 << 1 << 0 << endr;

  weight.ones(1,4);
  weight(0, 1) = 2;
  weight(0, 2) = 3;
  weight(0, 3) = 4;

  // Forward: per-element loss is -(y * log(sigmoid(x)) + (1 - y) * log(sigmoid(-x))).
  arma::mat logSigmoid = arma::log(1 / (1 + arma::exp(-x)));
  arma::mat logSigmoidNeg = arma::log(1 / (1 + arma::exp(x)));
  // Sum over examples (rows), apply the class weights, then average over classes.
  arma::mat loss = arma::mean(arma::sum(-(y % logSigmoid + (1 - y) % logSigmoidNeg)) % weight, 1);
  double loss_sum = arma::as_scalar(loss);
  double loss_mean = arma::as_scalar(loss / x.n_rows);

  // Backward (mean reduction): weight * (sigmoid(x) - y) / (rows * cols).
  // The sum-reduction gradient is this matrix multiplied by the number of rows.
  arma::mat sigmoid = 1 / (1 + arma::exp(-x));
  arma::mat output = -(y % (1 - sigmoid) - (1 - y) % sigmoid) % arma::repmat(weight, y.n_rows, 1) / x.n_elem;

  // Display
  cout << "------------------------------------------------------------------" << endl;
  cout << "USER-PROVIDED MATRICES : " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "Input shape : "<< x.n_rows << " " << x.n_cols << endl;
  cout << "Input : " << endl << x << endl;
  cout << "Target shape : "<< y.n_rows << " " << y.n_cols << endl;
  cout << "Target : " << endl << y << endl;
  cout << "Weight : " << weight << endl;
  cout << "Weight shape : "<< weight.n_rows << " " << weight.n_cols << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "SUM " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "FORWARD : " << endl;
  cout << "Loss (sum):\n" << loss_sum << '\n';
  cout << "BACKWARD : " << endl;
  cout << "Output shape : "<< output.n_rows << " " << output.n_cols << endl;
  cout << "Output (sum) : " << endl << output * x.n_rows << endl;                                             
  cout << "Sum of all values in this matrix : " << arma::as_scalar(arma::accu(output * x.n_rows)) << endl;    
  cout << "------------------------------------------------------------------" << endl;
  cout << "MEAN " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "FORWARD : " << endl;
  cout << "Loss (mean):\n" << loss_mean << '\n';
  cout << "BACKWARD : " << endl;
  cout << "Output shape : "<< output.n_rows << " " << output.n_cols << endl;
  cout << "Output (mean) : " << endl << output << endl;
  cout << "Sum of all values in this matrix : " << arma::as_scalar(arma::accu(output)) << endl;
  cout << "------------------------------------------------------------------" << endl;
  return 0;
}

### Run using **`g++ test.cpp -o test -larmadillo && ./test`**

In [14]:
%%script bash
g++ test.cpp -o test -larmadillo && ./test

------------------------------------------------------------------
USER-PROVIDED MATRICES : 
------------------------------------------------------------------
Input shape : 3 4
Input : 
   0.1778   0.1203  -0.2264   0.1406
   0.0957   0.2403  -0.3400   0.0276
   0.1397   0.1925  -0.3336   0.3144

Target shape : 3 4
Target : 
        0   1.0000        0   1.0000
   1.0000        0        0        0
        0        0   1.0000        0

Weight :    1.0000   2.0000   3.0000   4.0000

Weight shape : 1 4
------------------------------------------------------------------
SUM 
------------------------------------------------------------------
FORWARD : 
Loss (sum):
5.36739
BACKWARD : 
Output shape : 3 4
Output (sum) : 
   0.1361  -0.2350   0.3327  -0.4649
  -0.1190   0.2799   0.3119   0.5069
   0.1337   0.2740  -0.4370   0.5780

Sum of all values in this matrix : 1.29724
------------------------------------------------------------------
MEAN 
-------------------------------------------------

## Test Case 4 - 3 classes, 5 examples, weighted

### Assumptions and Statements

**Overview :**

Input is a collection of 5 multi-label examples. <br>
Each example belongs to exactly 1 of 3 possible classes. <br>
Hence, the class label for each example is a 1x3 one-hot encoded vector. <br>
A different weight is assigned to each class using a 1x3 weight vector. <br>

---

**Summary of shapes of tensors involved :** <br>

The shape of the input matrix is 5x3. <br>
The shape of each example is 1x3. <br>
The shape of each class label is 1x3. <br>
The shape of the target matrix is 5x3. <br>
The shape of the weight matrix is 1x3. <br>
The shape of the gradient of the input matrix will also be 5x3. <br>
The output of the forward function is reduced to a 1x1 tensor using either sum or mean reduction. <br>

---
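
The shape constraints listed above can be checked mechanically before any loss is computed. The sketch below validates a set of plain nested lists against those constraints and reports the shapes, including the shape the gradient will have. The function name `check_shapes` is hypothetical and is not part of PyTorch or mlpack.

```python
def check_shapes(x, y, weight):
    """Validate input/target/weight shapes and return them as (rows, cols) tuples."""
    n_rows, n_cols = len(x), len(x[0])
    if any(len(row) != n_cols for row in x):
        raise ValueError("input rows must all have the same length")
    if len(y) != n_rows or any(len(row) != n_cols for row in y):
        raise ValueError("target must have the same shape as the input")
    if len(weight) != n_cols:
        raise ValueError("weight must have one entry per class")
    return {
        "input": (n_rows, n_cols),
        "target": (n_rows, n_cols),
        "weight": (1, n_cols),
        "gradient": (n_rows, n_cols),  # the gradient matches the input shape
    }


# Data of Test Case 4 (5 examples, 3 classes).
x = [[0.1778, 0.1203, -0.2264], [0.0957, 0.2403, -0.3400], [0.1397, 0.1925, -0.3336],
     [0.2256, 0.3144, -0.8695], [0.1406, 0.2569, 0.0789]]
y = [[0, 1, 0], [1, 0, 0], [0, 0, 1], [1, 0, 0], [0, 0, 1]]
weight = [1, 2, 3]

shapes = check_shapes(x, y, weight)
print(shapes)
```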

### Sum

In [15]:
torch.set_printoptions(precision=5)
x = torch.tensor([[ 0.1778,  0.1203, -0.2264],[ 0.0957,  0.2403, -0.3400],[ 0.1397,  0.1925, -0.3336], [ 0.2256, 0.3144, -0.8695], [0.1406, 0.2569, 0.0789]], requires_grad=True) # 5 Rows, 3 columns
y = torch.tensor([[0., 1., 0.],[1., 0., 0.],[0., 0., 1.],[1., 0., 0.], [0., 0., 1.]]) # 5 Rows, 3 columns
weights = torch.tensor([1, 2, 3])
criterion_sum = nn.MultiLabelSoftMarginLoss(reduction='sum', weight=weights)
loss_sum = criterion_sum(x,y)
print("FORWARD:\n", loss_sum, "\nBACKWARD")
loss_sum.backward()
print(x.grad)
print("---------------------------------------------------------------")
print(torch.sum(x.grad))
print("---------------------------------------------------------------")

FORWARD:
 tensor(6.81357, grad_fn=<SumBackward0>) 
BACKWARD
tensor([[ 0.18144, -0.31331,  0.44364],
        [-0.15870,  0.37319,  0.41581],
        [ 0.17829,  0.36532, -0.58264],
        [-0.14795,  0.38531,  0.29536],
        [ 0.17836,  0.37592, -0.48029]])
---------------------------------------------------------------
tensor(1.50977)
---------------------------------------------------------------


### Mean

In [16]:
torch.set_printoptions(precision=5)
x = torch.tensor([[ 0.1778,  0.1203, -0.2264],[ 0.0957,  0.2403, -0.3400],[ 0.1397,  0.1925, -0.3336], [ 0.2256, 0.3144, -0.8695], [0.1406, 0.2569, 0.0789]], requires_grad=True) # 5 Rows, 3 columns
y = torch.tensor([[0., 1., 0.],[1., 0., 0.],[0., 0., 1.],[1., 0., 0.], [0., 0., 1.]]) # 5 Rows, 3 columns
weights = torch.tensor([1, 2, 3])
criterion_mean = nn.MultiLabelSoftMarginLoss(reduction='mean', weight=weights)
loss_mean = criterion_mean(x,y)
print("FORWARD:\n", loss_mean, "\nBACKWARD")
loss_mean.backward()
print(x.grad)
print("---------------------------------------------------------------")
torch.set_printoptions(precision=6)
print(torch.sum(x.grad))
print("---------------------------------------------------------------")

FORWARD:
 tensor(1.36271, grad_fn=<MeanBackward0>) 
BACKWARD
tensor([[ 0.03629, -0.06266,  0.08873],
        [-0.03174,  0.07464,  0.08316],
        [ 0.03566,  0.07306, -0.11653],
        [-0.02959,  0.07706,  0.05907],
        [ 0.03567,  0.07518, -0.09606]])
---------------------------------------------------------------
tensor(0.301953)
---------------------------------------------------------------


### C++ File **`test.cpp`** to reproduce the same results

In [0]:
%%capture
%%writefile test.cpp
#include <iostream>
#include <armadillo>

using namespace std;
using namespace arma;

int main()
{
  // Input, target and per-class weight matrices
  arma::mat x,y;
  arma::mat weight;

  x << 0.1778 << 0.1203 << -0.2264 << endr
    << 0.0957 << 0.2403 << -0.3400 << endr
    << 0.1397 << 0.1925 << -0.3336 << endr
    << 0.2256 << 0.3144 << -0.8695 << endr
    << 0.1406 << 0.2569 <<  0.0789 << endr;

  y << 0 << 1 << 0 << endr
    << 1 << 0 << 0 << endr
    << 0 << 0 << 1 << endr
    << 1 << 0 << 0 << endr
    << 0 << 0 << 1 << endr;

  weight.ones(1,3);
  weight(0, 1) = 2;
  weight(0, 2) = 3;

  // Forward: per-element loss is -(y * log(sigmoid(x)) + (1 - y) * log(sigmoid(-x))).
  arma::mat logSigmoid = arma::log(1 / (1 + arma::exp(-x)));
  arma::mat logSigmoidNeg = arma::log(1 / (1 + arma::exp(x)));
  // Sum over examples (rows), apply the class weights, then average over classes.
  arma::mat loss = arma::mean(arma::sum(-(y % logSigmoid + (1 - y) % logSigmoidNeg)) % weight, 1);
  double loss_sum = arma::as_scalar(loss);
  double loss_mean = arma::as_scalar(loss / x.n_rows);

  // Backward (mean reduction): weight * (sigmoid(x) - y) / (rows * cols).
  // The sum-reduction gradient is this matrix multiplied by the number of rows.
  arma::mat sigmoid = 1 / (1 + arma::exp(-x));
  arma::mat output = -(y % (1 - sigmoid) - (1 - y) % sigmoid) % arma::repmat(weight, y.n_rows, 1) / x.n_elem;

  // Display
  cout << "------------------------------------------------------------------" << endl;
  cout << "USER-PROVIDED MATRICES : " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "Input shape : "<< x.n_rows << " " << x.n_cols << endl;
  cout << "Input : " << endl << x << endl;
  cout << "Target shape : "<< y.n_rows << " " << y.n_cols << endl;
  cout << "Target : " << endl << y << endl;
  cout << "Weight : " << weight << endl;
  cout << "Weight shape : "<< weight.n_rows << " " << weight.n_cols << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "SUM " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "FORWARD : " << endl;
  cout << "Loss (sum):\n" << loss_sum << '\n';
  cout << "BACKWARD : " << endl;
  cout << "Output shape : "<< output.n_rows << " " << output.n_cols << endl;
  cout << "Output (sum) : " << endl << output * x.n_rows << endl;                                             
  cout << "Sum of all values in this matrix : " << arma::as_scalar(arma::accu(output * x.n_rows)) << endl;    
  cout << "------------------------------------------------------------------" << endl;
  cout << "MEAN " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "FORWARD : " << endl;
  cout << "Loss (mean):\n" << loss_mean << '\n';
  cout << "BACKWARD : " << endl;
  cout << "Output shape : "<< output.n_rows << " " << output.n_cols << endl;
  cout << "Output (mean) : " << endl << output << endl;
  cout << "Sum of all values in this matrix : " << arma::as_scalar(arma::accu(output)) << endl;
  cout << "------------------------------------------------------------------" << endl;
  return 0;
}

### Run using **`g++ test.cpp -o test -larmadillo && ./test`**

In [18]:
%%script bash
g++ test.cpp -o test -larmadillo && ./test

------------------------------------------------------------------
USER-PROVIDED MATRICES : 
------------------------------------------------------------------
Input shape : 5 3
Input : 
   0.1778   0.1203  -0.2264
   0.0957   0.2403  -0.3400
   0.1397   0.1925  -0.3336
   0.2256   0.3144  -0.8695
   0.1406   0.2569   0.0789

Target shape : 5 3
Target : 
        0   1.0000        0
   1.0000        0        0
        0        0   1.0000
   1.0000        0        0
        0        0   1.0000

Weight :    1.0000   2.0000   3.0000

Weight shape : 1 3
------------------------------------------------------------------
SUM 
------------------------------------------------------------------
FORWARD : 
Loss (sum):
6.81357
BACKWARD : 
Output shape : 5 3
Output (sum) : 
   0.1814  -0.3133   0.4436
  -0.1587   0.3732   0.4158
   0.1783   0.3653  -0.5826
  -0.1479   0.3853   0.2954
   0.1784   0.3759  -0.4803

Sum of all values in this matrix : 1.50977
--------------------------------------------

## Test Case 5 - 5 classes, 3 examples, weighted

### Assumptions and Statements

**Overview :**

Input is a collection of 3 multi-label examples. <br>
Each example belongs to one or more of 5 possible classes (the first example has two labels, the third has three). <br>
Hence, the class label for each example is a 1x5 multi-hot encoded vector. <br>
A different weight is assigned to each class using a 1x5 weight vector. <br>

---

**Summary of shapes of tensors involved :** <br>

The shape of the input matrix is 3x5. <br>
The shape of each example is 1x5. <br>
The shape of each class label is 1x5. <br>
The shape of the target matrix is 3x5. <br>
The shape of the weight matrix is 1x5. <br>
The shape of the gradient of the input matrix will also be 3x5. <br>
The output of the forward function is reduced to a 1x1 tensor using either sum or mean reduction. <br>

---
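
The target matrix used in this test case has several 1s in some rows, because an example may carry more than one label. A minimal sketch of how such a 3x5 target could be built from per-example lists of class indices is shown below. The function name `multi_hot` and the zero-based `label_sets` are assumptions made for this illustration; the notebook itself writes the target matrix out by hand.

```python
def multi_hot(label_sets, n_classes):
    """Turn a list of class-index lists into rows of 0.0/1.0 of length n_classes."""
    rows = []
    for labels in label_sets:
        row = [0.0] * n_classes
        for c in labels:
            row[c] = 1.0  # mark every class the example belongs to
        rows.append(row)
    return rows


# Zero-based class indices for the 3 examples of Test Case 5.
label_sets = [[1, 3], [0], [2, 3, 4]]
y = multi_hot(label_sets, 5)
for row in y:
    print(row)
```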

### Sum

In [19]:
torch.set_printoptions(precision=4)
x = torch.tensor([[ 0.1778,  0.1203, -0.2264, 0.2256, 0.3144],[ 0.0957,  0.2403, -0.3400, -0.8695, 0.2457],[ 0.1397,  0.1925, -0.3336, 0.2569, 0.0789]], requires_grad=True) # 3 Rows, 5 columns
y = torch.tensor([[0., 1., 0., 1., 0.],[1., 0., 0., 0., 0.],[0., 0., 1., 1., 1.]]) # 3 Rows, 5 columns
weights = torch.tensor([1, 2, 3, 4, 5])
criterion_sum = nn.MultiLabelSoftMarginLoss(reduction='sum', weight=weights)
loss_sum = criterion_sum(x,y)
print("FORWARD:\n", loss_sum, "\nBACKWARD")
loss_sum.backward()
print(x.grad)
print("---------------------------------------------------------------")
torch.set_printoptions(precision=6)
print(torch.sum(x.grad))
print("---------------------------------------------------------------")

FORWARD:
 tensor(6.0863, grad_fn=<SumBackward0>) 
BACKWARD
tensor([[ 0.1089, -0.1880,  0.2662, -0.3551,  0.5780],
        [-0.0952,  0.2239,  0.2495,  0.2363,  0.5611],
        [ 0.1070,  0.2192, -0.3496, -0.3489, -0.4803]])
---------------------------------------------------------------
tensor(0.732939)
---------------------------------------------------------------


### Mean

In [20]:
torch.set_printoptions(precision=5)
x = torch.tensor([[ 0.1778,  0.1203, -0.2264, 0.2256, 0.3144],[ 0.0957,  0.2403, -0.3400, -0.8695, 0.2457],[ 0.1397,  0.1925, -0.3336, 0.2569, 0.0789]], requires_grad=True) # 3 Rows, 5 columns
y = torch.tensor([[0., 1., 0., 1., 0.],[1., 0., 0., 0., 0.],[0., 0., 1., 1., 1.]]) # 3 Rows, 5 columns
weights = torch.tensor([1, 2, 3, 4, 5])
criterion_mean = nn.MultiLabelSoftMarginLoss(reduction='mean', weight=weights)
loss_mean = criterion_mean(x,y)
print("FORWARD:\n", loss_mean, "\nBACKWARD")
loss_mean.backward()
print(x.grad)
print("---------------------------------------------------------------")
torch.set_printoptions(precision=6)
print(torch.sum(x.grad))
print("---------------------------------------------------------------")

FORWARD:
 tensor(2.02877, grad_fn=<MeanBackward0>) 
BACKWARD
tensor([[ 0.03629, -0.06266,  0.08873, -0.11836,  0.19265],
        [-0.03174,  0.07464,  0.08316,  0.07876,  0.18704],
        [ 0.03566,  0.07306, -0.11653, -0.11630, -0.16010]])
---------------------------------------------------------------
tensor(0.244313)
---------------------------------------------------------------
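
The gradients printed by the Sum and Mean cells differ only by a factor equal to the number of examples. A minimal sketch of why, assuming the analytic gradient `weight * (sigmoid(x) - y) / n_classes` for the sum reduction, with a further division by `n_examples` for the mean reduction (the function name `soft_margin_grad` is hypothetical):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_margin_grad(x, y, weights, reduction="sum"):
    n_examples, n_classes = len(x), len(x[0])
    # 'sum' divides by the class count only; 'mean' also divides by the example count
    scale = n_classes if reduction == "sum" else n_classes * n_examples
    return [[w * (sigmoid(xi) - yi) / scale
             for xi, yi, w in zip(row_x, row_y, weights)]
            for row_x, row_y in zip(x, y)]

x = [[0.1778, 0.1203, -0.2264, 0.2256, 0.3144],
     [0.0957, 0.2403, -0.3400, -0.8695, 0.2457],
     [0.1397, 0.1925, -0.3336, 0.2569, 0.0789]]
y = [[0.0, 1.0, 0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 1.0, 1.0]]
weights = [1.0, 2.0, 3.0, 4.0, 5.0]

grad_sum = soft_margin_grad(x, y, weights, "sum")
grad_mean = soft_margin_grad(x, y, weights, "mean")
total_sum = sum(sum(row) for row in grad_sum)
total_mean = sum(sum(row) for row in grad_mean)
print("sum of gradient entries (sum reduction) :", total_sum)
print("sum of gradient entries (mean reduction):", total_mean)
```

The totals correspond to the `torch.sum(x.grad)` lines in the two cells, and the individual entries of `grad_sum` and `grad_mean` can be compared against the `BACKWARD` tensors.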


### C++ File **`test.cpp`** to reproduce the same results

In [0]:
%%capture
%%writefile test.cpp
#include <iostream>
#include <armadillo>

using namespace std;
using namespace arma;

int main()
{
  // Input (x), target (y) and per-class weight matrices
  arma::mat x, y;
  arma::mat weight;

  x << 0.1778 << 0.1203 << -0.2264 <<  0.2256 << 0.3144 << endr
    << 0.0957 << 0.2403 << -0.3400 << -0.8695 << 0.2457 << endr
    << 0.1397 << 0.1925 << -0.3336 << 0.2569  << 0.0789 << endr;

  y << 0 << 1 << 0 << 1 << 0 << endr
    << 1 << 0 << 0 << 0 << 0 << endr
    << 0 << 0 << 1 << 1 << 1 << endr;

  weight.ones(1,5);
  weight(0, 1) = 2;
  weight(0, 2) = 3;
  weight(0, 3) = 4;
  weight(0, 4) = 5;

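  // Forward
  // log(sigmoid(x)) and log(sigmoid(-x)) = log(1 - sigmoid(x))
  arma::mat logSigmoid = arma::log(1 / (1 + arma::exp(-x)));
  arma::mat logSigmoidNeg = arma::log(1 / (1 + arma::exp(x)));
  // Sum the element-wise loss over examples (per column), apply the class
  // weights, then average over classes: this is the 'sum' reduction.
  arma::mat loss = arma::mean(arma::sum(-(y % logSigmoid + (1 - y) % logSigmoidNeg)) % weight, 1);
  double loss_sum = arma::as_scalar(loss);
  // The 'mean' reduction additionally divides by the number of examples.
  double loss_mean = arma::as_scalar(loss / x.n_rows);

  // Backward - mean
  // d(loss_mean)/dx = weight % (sigmoid(x) - y) / (n_rows * n_cols)
  arma::mat sigmoid = 1 / (1 + arma::exp(-x));
  arma::mat output = -(y % (1 - sigmoid) - (1 - y) % sigmoid) % arma::repmat(weight, y.n_rows, 1) / x.n_elem;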

  // Display
  cout << "------------------------------------------------------------------" << endl;
  cout << "USER-PROVIDED MATRICES : " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "Input shape : "<< x.n_rows << " " << x.n_cols << endl;
  cout << "Input : " << endl << x << endl;
  cout << "Target shape : "<< y.n_rows << " " << y.n_cols << endl;
  cout << "Target : " << endl << y << endl;
  cout << "Weight : " << weight << endl;
  cout << "Weight shape : "<< weight.n_rows << " " << weight.n_cols << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "SUM " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "FORWARD : " << endl;
  cout << "Loss (sum):\n" << loss_sum << '\n';
  cout << "BACKWARD : " << endl;
  cout << "Output shape : "<< output.n_rows << " " << output.n_cols << endl;
  cout << "Output (sum) : " << endl << output * x.n_rows << endl;                                             
  cout << "Sum of all values in this matrix : " << arma::as_scalar(arma::accu(output * x.n_rows)) << endl;    
  cout << "------------------------------------------------------------------" << endl;
  cout << "MEAN " << endl;
  cout << "------------------------------------------------------------------" << endl;
  cout << "FORWARD : " << endl;
  cout << "Loss (mean):\n" << loss_mean << '\n';
  cout << "BACKWARD : " << endl;
  cout << "Output shape : "<< output.n_rows << " " << output.n_cols << endl;
  cout << "Output (mean) : " << endl << output << endl;
  cout << "Sum of all values in this matrix : " << arma::as_scalar(arma::accu(output)) << endl;
  cout << "------------------------------------------------------------------" << endl;
  return 0;
}
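
The forward pass in `test.cpp` reduces in a different order from the one described in the PyTorch docs: it sums each column over the examples first, applies the class weights, and only then averages over the classes. Because every step is a linear sum, both orders give the same number. The sketch below illustrates this with made-up per-element losses (the values in `elem_loss` are illustrative, not taken from the test case):

```python
# Illustrative per-element losses: 2 examples x 3 classes (made-up values)
elem_loss = [[0.5, 1.0, 1.5],
             [2.0, 0.25, 0.75]]
weights = [1.0, 2.0, 3.0]
n_classes = len(weights)

# PyTorch order: weight, average over classes per example, then sum over examples
per_example_first = sum(
    sum(w * l for w, l in zip(weights, row)) / n_classes
    for row in elem_loss
)

# test.cpp order: sum each column over examples, weight, then average over classes
column_sums = [sum(col) for col in zip(*elem_loss)]
per_class_first = sum(w * s for w, s in zip(weights, column_sums)) / n_classes

print(per_example_first, per_class_first)
```

Both printed values should agree up to floating-point rounding, which is why the single `arma::mean(arma::sum(...) % weight, 1)` expression can stand in for the sum reduction.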

### Run using **`g++ test.cpp -o test -larmadillo && ./test`**

In [22]:
%%script bash
g++ test.cpp -o test -larmadillo && ./test

------------------------------------------------------------------
USER-PROVIDED MATRICES : 
------------------------------------------------------------------
Input shape : 3 5
Input : 
   0.1778   0.1203  -0.2264   0.2256   0.3144
   0.0957   0.2403  -0.3400  -0.8695   0.2457
   0.1397   0.1925  -0.3336   0.2569   0.0789

Target shape : 3 5
Target : 
        0   1.0000        0   1.0000        0
   1.0000        0        0        0        0
        0        0   1.0000   1.0000   1.0000

Weight :    1.0000   2.0000   3.0000   4.0000   5.0000

Weight shape : 1 5
------------------------------------------------------------------
SUM 
------------------------------------------------------------------
FORWARD : 
Loss (sum):
6.0863
BACKWARD : 
Output shape : 3 5
Output (sum) : 
   0.1089  -0.1880   0.2662  -0.3551   0.5780
  -0.0952   0.2239   0.2495   0.2363   0.5611
   0.1070   0.2192  -0.3496  -0.3489  -0.4803

Sum of all values in this matrix : 0.732939
--------------------------------