
[MXNET-1030] Cosine Embedding Loss #12750

Merged
merged 38 commits on Oct 29, 2018
Changes from 32 commits

Commits (38)
b3b5de0
Cosine Embedding Loss function added
gaurav-gireesh Oct 5, 2018
eb9b9b4
Added unit tests for Cosine Embedding Loss Function
gaurav-gireesh Oct 5, 2018
7fdd85d
Added LaTeX code for formula for cosine embedding loss
gaurav-gireesh Oct 7, 2018
013a604
Fixing document rendering
gaurav-gireesh Oct 8, 2018
aac12ad
Fixing documentation issue
gaurav-gireesh Oct 8, 2018
1c97924
PR Comments addressed for using F (NDArray or Symbol) to calculate no…
gaurav-gireesh Oct 8, 2018
9766983
Markdown file updated. Added entry for CosineEmbeddingLoss
gaurav-gireesh Oct 8, 2018
f05eb7b
Added a line after .. math:: to fix documentation
gaurav-gireesh Oct 9, 2018
c02e111
Documentation check - pylint fix
gaurav-gireesh Oct 9, 2018
c10f1ef
Formula update
gaurav-gireesh Oct 9, 2018
95dd2a7
Making the formula simpler for correct rendering incrementally - Upda…
gaurav-gireesh Oct 9, 2018
01607b4
Making the formula simpler for correct rendering incrementally - Upda…
gaurav-gireesh Oct 9, 2018
c8bca0b
Making the formula simpler for correct rendering incrementally - Upda…
gaurav-gireesh Oct 9, 2018
5194cd8
Making the formula simpler for correct rendering incrementally - Upda…
gaurav-gireesh Oct 9, 2018
c01f8cb
Making the formula simpler for correct rendering incrementally - Upda…
gaurav-gireesh Oct 9, 2018
4b3fe81
Trigger CI
gaurav-gireesh Oct 9, 2018
78bd725
making the utility function cosine similarity internal
gaurav-gireesh Oct 9, 2018
3b3e117
Added a test case for label = -1, for dissimilar vectors
gaurav-gireesh Oct 10, 2018
2df6953
Refactored names of parameters to the loss functions and updated the …
gaurav-gireesh Oct 10, 2018
5c642cb
PR comments addressed changes in documentation
gaurav-gireesh Oct 10, 2018
4be5104
Added random input vectors and labelled tests
gaurav-gireesh Oct 11, 2018
410a708
Renaming variables
gaurav-gireesh Oct 11, 2018
1f48429
Pylint issues fixed
gaurav-gireesh Oct 11, 2018
b618b61
Merged from upstream master branch + Resolved conflicts
gaurav-gireesh Oct 15, 2018
ed762e5
Resolving conflicts
gaurav-gireesh Oct 15, 2018
89aafbc
Pylint issues fixed
gaurav-gireesh Oct 15, 2018
4a3167b
Style issues fixed trailing whitespaces removed
gaurav-gireesh Oct 15, 2018
d80baac
Review comment addressed, sample_weight added in the parameter
gaurav-gireesh Oct 25, 2018
b36e097
Merge remote-tracking branch 'upstream/master' into cosineloss
gaurav-gireesh Oct 26, 2018
c195ed0
Trigger CI
gaurav-gireesh Oct 26, 2018
308666b
Reordered Parameter description
gaurav-gireesh Oct 26, 2018
16c3ecd
comments addressed - spelling errors
gaurav-gireesh Oct 26, 2018
2dfeaa2
nit comments addressed
gaurav-gireesh Oct 26, 2018
0bd4b24
Trigger CI
gaurav-gireesh Oct 26, 2018
ca030a7
Merge remote-tracking branch 'upstream/master' into cosineloss
gaurav-gireesh Oct 26, 2018
ede1588
Trigger CI
gaurav-gireesh Oct 26, 2018
67572c5
Trigger CI
gaurav-gireesh Oct 26, 2018
55d4b1e
Trigger CI
gaurav-gireesh Oct 27, 2018
1 change: 1 addition & 0 deletions docs/api/python/gluon/loss.md
@@ -25,6 +25,7 @@ This package includes several commonly used loss functions in neural networks.
LogisticLoss
TripletLoss
CTCLoss
+CosineEmbeddingLoss
PoissonNLLLoss
```

70 changes: 69 additions & 1 deletion python/mxnet/gluon/loss.py
@@ -23,7 +23,7 @@
'SigmoidBinaryCrossEntropyLoss', 'SigmoidBCELoss',
'SoftmaxCrossEntropyLoss', 'SoftmaxCELoss',
'KLDivLoss', 'CTCLoss', 'HuberLoss', 'HingeLoss',
-'SquaredHingeLoss', 'LogisticLoss', 'TripletLoss', 'PoissonNLLLoss']
+'SquaredHingeLoss', 'LogisticLoss', 'TripletLoss', 'PoissonNLLLoss', 'CosineEmbeddingLoss']

import numpy as np
from .. import ndarray
@@ -767,3 +767,71 @@ def hybrid_forward(self, F, pred, target, sample_weight=None, epsilon=1e-08):
        loss += stirling_factor
        loss = _apply_weighting(F, loss, self._weight, sample_weight)
        return F.mean(loss)


class CosineEmbeddingLoss(Loss):
    r"""For a target label 1 or -1, vectors input1 and input2, the function computes
    the cosine distance between the vectors. This can be interpreted as a measure of
    how similar or dissimilar the two input vectors are.

Contributor: nitpick: typo input2

    .. math::

        L = \sum_i \begin{cases}
            1 - cos\_sim({input1}_i, {input2}_i) & \text{ if } {label}_i = 1 \\
            \max(0, cos\_sim({input1}_i, {input2}_i) - margin) & \text{ if } {label}_i = -1
            \end{cases} \\
        cos\_sim({input1}_i, {input2}_i) = \frac{{input1}_i \cdot {input2}_i}{\|{input1}_i\| \cdot \|{input2}_i\|}

    `input1` and `input2` can have arbitrary shapes as long as they have the same
    number of elements.

    Parameters
    ----------
    weight : float or None
        Global scalar weight for loss.
    batch_axis : int, default 0
        The axis that represents mini-batch.
    margin : float
        Margin of separation between correct and incorrect pair.


    Inputs:
        - **input1**: a tensor with arbitrary shape
Member: Update document on param names
        - **input2**: another tensor with the same shape as input1, to which
          input1 is compared for similarity and loss calculation
        - **label**: a 1-D tensor indicating whether the target label for each
          input1/input2 pair is 1 or -1
        - **sample_weight**: element-wise weighting tensor. Must be broadcastable
Member: This needs to be added to hybrid_forward. Could you also put this after label?

Contributor Author: Addressed this. Thanks

          to the same shape as input1. For example, if input1 has shape (64, 10)
          and you want to weigh each sample in the batch separately,
          sample_weight should have shape (64, 1).

    Outputs:
        - **loss**: The loss tensor with shape (batch_size,).
    """
    def __init__(self, weight=None, batch_axis=0, margin=0, **kwargs):
        super(CosineEmbeddingLoss, self).__init__(weight, batch_axis, **kwargs)
        self._margin = margin

    def hybrid_forward(self, F, input1, input2, label, sample_weight=None):
        input1 = _reshape_like(F, input1, input2)
        label = label.reshape((-1, 1))
        cos_sim = self._cosine_similarity(F, input1, input2)
        # masks selecting the similar (label == 1) and dissimilar (label == -1) pairs
        y_1 = label == 1
        y_minus_1 = label == -1
        # loss term for similar pairs: 1 - cos_sim
        cos_sim_a = (1 - cos_sim) * y_1

        # loss term for dissimilar pairs: max(0, cos_sim - margin)
        if F is ndarray:
            z_array = F.array([0])
        else:
            z_array = F.zeros((1, 1))
        cos_sim_b = F.broadcast_maximum(z_array, y_minus_1 * (cos_sim - self._margin), axis=1)
        loss = cos_sim_a + cos_sim_b
        loss = _apply_weighting(F, loss, self._weight, sample_weight)
        return loss

    def _cosine_similarity(self, F, x, y, axis=-1):
        # Calculates the cosine similarity between 2 vectors
        x_norm = F.norm(x, axis=axis).reshape(-1, 1)
        y_norm = F.norm(y, axis=axis).reshape(-1, 1)
        x_dot_y = F.sum(x * y, axis=axis).reshape(-1, 1)
        # clip the norm product at 1e-12 to avoid division by zero
        if F is ndarray:
            eps_arr = F.array([1e-12])
        else:
            eps_arr = F.full((1, 1), 1e-12)
        return (x_dot_y / F.broadcast_maximum(x_norm * y_norm, eps_arr))
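
As an aside for readers skimming the diff, here is a minimal usage sketch of the new class (not part of this PR; it assumes an MXNet build that already includes this change, and the input values are made up for illustration):

import mxnet as mx
from mxnet import gluon

# three pairs of 2-D vectors and a +/-1 label per pair
input1 = mx.nd.array([[1, 0], [0, 1], [1, 1]])
input2 = mx.nd.array([[1, 0], [1, 0], [-1, -1]])
label = mx.nd.array([1, 1, -1])

loss_fn = gluon.loss.CosineEmbeddingLoss()
print(loss_fn(input1, input2, label).asnumpy())
# pair 0: identical vectors, cos_sim = 1, loss = 1 - 1 = 0
# pair 1: orthogonal vectors, cos_sim = 0, loss = 1 - 0 = 1
# pair 2: opposite vectors with label -1, cos_sim = -1, loss = max(0, -1 - 0) = 0

# per-sample weighting as described in the docstring: sample_weight broadcastable to the loss
print(loss_fn(input1, input2, label, mx.nd.array([[1], [0.5], [1]])).asnumpy())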
18 changes: 18 additions & 0 deletions tests/python/unittest/test_loss.py
@@ -349,6 +349,23 @@ def test_triplet_loss():
    assert mod.score(data_iter, eval_metric=mx.metric.Loss())[0][1] < 0.05

@with_seed()
Contributor: Please add additional tests for label = -1 and hybridization.

Contributor Author:
  1. Added tests for labels 1 and -1 in a randomly generated set of labels.
  2. This function is meant to be a utility similar to cosine_distance.

def test_cosine_loss():
    # generate sample inputs and random +/-1 labels
    input1 = mx.nd.random.randn(3, 2)
    input2 = mx.nd.random.randn(3, 2)
    label = mx.nd.sign(mx.nd.random.randn(input1.shape[0]))
    # calculate loss from the cosine embedding loss function in Gluon
    Loss = gluon.loss.CosineEmbeddingLoss()
    loss = Loss(input1, input2, label)

    # calculate the loss the NumPy way
    numerator = mx.nd.sum(input1 * input2, keepdims=True, axis=1)
    denominator = mx.nd.sqrt(mx.nd.sum(input1**2, axis=1, keepdims=True)) \
                  * mx.nd.sqrt(mx.nd.sum(input2**2, axis=1, keepdims=True))
    numpy_loss = mx.nd.where(label == 1, 1 - numerator / denominator,
                             mx.nd.broadcast_maximum(mx.nd.array([0]), numerator / denominator, axis=1))
    assert_almost_equal(loss.asnumpy(), numpy_loss.asnumpy(), rtol=1e-3, atol=1e-5)
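
A reviewer above asks for hybridization coverage; it is not in this snapshot of the diff, but such a test could look like the following sketch (hypothetical test name; it assumes the NDArray and Symbol branches of hybrid_forward should agree):

def test_cosine_loss_hybridized():
    input1 = mx.nd.random.randn(3, 2)
    input2 = mx.nd.random.randn(3, 2)
    label = mx.nd.sign(mx.nd.random.randn(input1.shape[0]))
    loss_fn = gluon.loss.CosineEmbeddingLoss()
    imperative_loss = loss_fn(input1, input2, label)
    # hybridize() routes hybrid_forward through the Symbol (F is symbol) branch
    loss_fn.hybridize()
    hybrid_loss = loss_fn(input1, input2, label)
    assert_almost_equal(imperative_loss.asnumpy(), hybrid_loss.asnumpy(), rtol=1e-3, atol=1e-5)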

def test_poisson_nllloss():
    pred = mx.nd.random.normal(shape=(3, 4))
    min_pred = mx.nd.min(pred)
@@ -404,6 +421,7 @@ def test_poisson_nllloss_mod():
            optimizer='adam')
    assert mod.score(data_iter, eval_metric=mx.metric.Loss())[0][1] < 0.05


Contributor: nitpick: additional empty line not needed

Contributor Author: Yes, it is not required, but this follows the other test modules,
which leave a two-line gap between the last test function and the main block.

if __name__ == '__main__':
    import nose
    nose.runmodule()