
How to use NTXentLoss as in CPC? #179

Closed

vgaraujov opened this issue Aug 15, 2020 · 31 comments
Labels
Frequently Asked Questions · question (A general question about the library)

Comments

@vgaraujov

Hello! Thanks for this incredible contribution.

I want to know how to use NTXentLoss as in the CPC model, i.e. where I have one positive sample and N-1 negative samples.

Thank you for your help in this matter.

@KevinMusgrave (Owner) commented Aug 15, 2020

If you have just a single positive pair in your batch:

import torch
from pytorch_metric_learning.losses import NTXentLoss
loss_func = NTXentLoss()

# in your training loop
batch_size = data.size(0)
embeddings = your_model(data)
labels = torch.arange(batch_size)
# The assumption here is that data[0] and data[1] are the positive pair
# And there are no other positive pairs in the batch
labels[1] = labels[0]
loss = loss_func(embeddings, labels)
loss.backward()

If your batch size is N, and you have N/2 positive pairs:

import torch
from pytorch_metric_learning.losses import NTXentLoss
loss_func = NTXentLoss()

# in your training loop
batch_size = data.size(0)
embeddings = your_model(data)
# The assumption here is that data[0] and data[1] are a positive pair
# data[2] and data[3] are the next positive pair, and so on
labels = torch.arange(batch_size)
labels[1::2] = labels[0::2]
loss = loss_func(embeddings, labels)
loss.backward()

Basically you need to create labels such that positive pairs share the same label.

@KevinMusgrave added the Frequently Asked Questions and question labels Aug 15, 2020
@vgaraujov (Author)

Thank you for your answer @KevinMusgrave
Just to be sure I understand how it works:

Regarding this assumption: data[0] and data[1] are a positive pair, so the rest (data[2:]) will be used as negative samples?

If you have just a single positive pair in your batch:

from pytorch_metric_learning.losses import NTXentLoss
loss_func = NTXentLoss()

# in your training loop
batch_size = data.size(0)
embeddings = your_model(data)
labels = torch.arange(batch_size)
# The assumption here is that data[0] and data[1] are the positive pair
# And there are no other positive pairs in the batch
labels[1] = labels[0]
loss = loss_func(embeddings, labels)
loss.backward()

Does it mean that all examples that are not positive samples in the batch are automatically used as negative samples?

@KevinMusgrave (Owner) commented Aug 18, 2020

Yes, data[2:] will be used as negative samples, because their labels are different from data[0] and data[1]. And data[0] and data[1] are the only positive pair because there are no other labels that occur more than once.

Regarding this assumption: data[0] and data[1] are a positive pair, so the rest (data[2:]) will be used as negative samples?

To be clear, data[0] is not a pair by itself. It forms a positive pair with data[1]. Similarly, data[0] forms a negative pair with data[2], data[3]...data[N].

@vgaraujov (Author)

Thank you so much @KevinMusgrave !

@YounkHo commented Apr 19, 2021

I'm still confused: if I have a batch of randomly sampled images and their corresponding labels, how can I use NTXentLoss?

from pytorch_metric_learning.losses import NTXentLoss
loss_func = NTXentLoss()

# in your training loop
batch_size = data.size(0)
embeddings = your_model(data)
labels = torch.arange(batch_size)
# The assumption here is that data[0] and data[1] are the positive pair
# And there are no other positive pairs in the batch
labels[1] = labels[0]
loss = loss_func(embeddings, labels)
loss.backward()

From the code provided, why can we regard data[0] and data[1] as a positive pair while the other pairs are negative?

@KevinMusgrave (Owner)

That assumption was in response to the original question. If you have labels, then you can ignore the above discussion and just do:

loss = loss_func(embeddings, labels)
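
For context, here is a minimal end-to-end sketch of that labeled case (your_model, data, and labels are placeholders for your own model and dataloader batch):

import torch
from pytorch_metric_learning.losses import NTXentLoss

loss_func = NTXentLoss()

# in your training loop; (data, labels) come from your dataloader
embeddings = your_model(data)          # shape: (batch_size, embedding_dim)
loss = loss_func(embeddings, labels)   # labels: one class id per embedding
loss.backward()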

@YounkHo commented Apr 20, 2021

Then how does NTXentLoss distinguish positive samples from negative ones? Each image just has its own label.

@KevinMusgrave (Owner)

Images with the same label form positive pairs, and images with different labels form negative pairs.

For example, if the labels in a batch are [0, 0, 1, 1, 1] then:

  • the positive pairs will be formed by indices [0, 1], [1, 0], [2, 3], [2, 4], [3, 2], [3, 4], [4, 2], [4, 3]
  • the negative pairs will be formed by indices [0, 2], [0, 3], [0, 4], [1, 2], [1, 3], [1, 4], [2, 0], [2, 1], [3, 0], [3, 1], [4, 0], [4, 1]
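
If it helps to see this concretely, here is a small sketch (not the library's internal code) that enumerates the pairs implied by a label tensor, where the same label means positive pair and different labels mean negative pair:

import torch

labels = torch.tensor([0, 0, 1, 1, 1])
same = labels.unsqueeze(1) == labels.unsqueeze(0)     # (5, 5) matrix of label matches
not_self = ~torch.eye(len(labels), dtype=torch.bool)  # exclude pairing an index with itself
pos_pairs = (same & not_self).nonzero().tolist()      # [[0, 1], [1, 0], [2, 3], ...]
neg_pairs = (~same).nonzero().tolist()                # [[0, 2], [0, 3], ...]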

@YounkHo commented Apr 20, 2021

What if there are no same-label pairs, because the batch is sampled randomly?

@KevinMusgrave (Owner)

Then NTXentLoss will return 0, because it requires positive pairs to compute an actual loss.

@KevinMusgrave (Owner)

You can try using MPerClassSampler to ensure there are positive pairs in every batch.

@YounkHo commented Apr 22, 2021

Another problem is that when I use MPerClassSampler in my own project, I found that the training data are not shuffled (all labels in one batch are the same). However, shuffle=True is not allowed when using a sampler.

from pytorch_metric_learning import samplers

sampler = samplers.MPerClassSampler(data["label_names"], m=4, length_before_new_iter=len(data["image_names"]))
data_loader_params = dict(sampler = sampler, batch_size = self.batch_size, num_workers = 12, pin_memory = True)
data_loader = torch.utils.data.DataLoader(dataset, **data_loader_params)

Is there anything wrong with my usage?

@KevinMusgrave (Owner)

What is your batch size? Can you print the batch labels and paste them here?

@YounkHo commented Apr 22, 2021

The batch size in the data loader is set to 128, and the labels passed to MPerClassSampler are as follows:

["n01532829","n01558993","n01704323","n01749939","n01770081","n01843383","n01910747","n02074367","n02089867","n02091831",.....,"n07613480"]

However, the labels in a training batch (code: inputs, targets = inputs.cuda(), targets.cuda()) with MPerClassSampler are as follows:

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0], device='cuda:0')

@KevinMusgrave (Owner) commented Apr 22, 2021

The labels should be integers; sorry, I forgot to mention that.

Edit: Actually I haven't tested with strings. It's possible strings work.

Edit2: Nevermind, string labels should work.

Also, in addition to passing the batch size into the dataloader, you can pass batch_size into MPerClassSampler. Then it will check to make sure your m, labels, and batch_size are all compatible.
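
For reference, that suggestion looks roughly like the following sketch, assuming labels_list is a list with one label per dataset element (the variable name is a placeholder):

from pytorch_metric_learning import samplers

# passing batch_size lets the sampler verify that m, the labels, and
# batch_size are compatible (batch_size must be divisible by m)
sampler = samplers.MPerClassSampler(
    labels_list,
    m=4,
    batch_size=128,
    length_before_new_iter=len(labels_list),
)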

@KevinMusgrave (Owner)

Make sure that labels is the same length as your dataset.
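
A quick sanity check along those lines, using the names from the snippet above:

assert len(data["label_names"]) == len(dataset), "need one label per dataset element"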

@ashutoshml commented Jan 5, 2022

Images with the same label form positive pairs, and images with different labels form negative pairs.

For example, if the labels in a batch are [0, 0, 1, 1, 1] then:

  • the positive pairs will be formed by indices [0, 1], [1, 0], [2, 3], [2, 4], [3, 2], [3, 4], [4, 2], [4, 3]
  • the negative pairs will be formed by indices [0, 2], [0, 3], [0, 4], [1, 2], [1, 3], [1, 4], [2, 0], [2, 1], [3, 0], [3, 1], [4, 0], [4, 1]

@KevinMusgrave
Is there a way to specify negative and positive pairs instead of providing labels?
i.e.,

loss_func(embeddings, positive_pairs, negative_pairs)

@KevinMusgrave (Owner)

Yes, but unfortunately "dummy" labels are still required:

import torch

# positive pairs are formed by (a1, p)
# negative pairs are formed by (a2, n)
a1 = torch.randint(0, 10, size=(100,))
p = torch.randint(0, 10, size=(100,))
a2 = torch.randint(0, 10, size=(100,))
n = torch.randint(0, 10, size=(100,))

pairs = a1, p, a2, n
# won't actually be used
labels = torch.zeros(len(embeddings))
loss_func(embeddings, labels, pairs)

@ashutoshml

Yes, but unfortunately "dummy" labels are still required:

# positive pairs are formed by (a1, p)
# negatives pairs are formed by (a2, n)
a1= torch.randint(0, 10, size=(100,))
p = torch.randint(0, 10, size=(100,))
a2 = torch.randint(0, 10, size=(100,))
n = torch.randint(0, 10, size=(100,))

pairs = a1, p, a2, n
# won't actually be used
labels = torch.zeros(len(embeddings))
loss_func(embeddings, labels, pairs)

Thanks.

"the positive pairs will be formed by indices [0, 1], [1, 0], [2, 3], [2, 4], [3, 2], [3, 4], [4, 2], [4, 3]
the negative pairs will be formed by indices [0, 2], [0, 3], [0, 4], [1, 2], [1, 3], [1, 4], [2, 0], [2, 1], [3, 0], [3, 1], [4, 0], [4, 1]"

So the number of pairs (a1, p) can be different from (a2, n) as in this example, right? As in

a1 = torch.randint(0, 10, size=(100,))
p = torch.randint(0, 10, size=(100,))
a2 = torch.randint(0, 10, size=(500,))
n = torch.randint(0, 10, size=(500,))

@KevinMusgrave (Owner)

Yes

@KevinMusgrave (Owner)

Starting in v1.5.0, losses like ContrastiveLoss and TripletMarginLoss no longer require dummy labels if indices_tuple is passed in:

loss = loss_fn(embeddings, indices_tuple=triplets)

(Posting here for future readers.)
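
Applied to the pair example above, that would look roughly like the sketch below (assuming v1.5.0+ and that NTXentLoss is covered by the same change; on older versions, keep passing the dummy labels):

from pytorch_metric_learning.losses import NTXentLoss

loss_func = NTXentLoss()
pairs = (a1, p, a2, n)   # pair indices as constructed earlier in this thread
loss = loss_func(embeddings, indices_tuple=pairs)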

@lucasestini

Hello, does the loss consider every positive pair in the batch? For example, if I have 3 samples belonging to the same class, do they all contribute to the loss and get pulled together? Are they considered as 6 positive pairs or treated at once?

@KevinMusgrave (Owner)

@lucasestini They are considered as 6 positive pairs.

@yankungou commented Sep 29, 2022

Hi @KevinMusgrave, thank you for your wonderful code. I found that it still requires dummy labels as input. I don't know why. My pytorch_metric_learning version is 1.6.2. PyTorch version is 1.12.0. I use DDP. Thanks!

@KevinMusgrave (Owner)

@YK711 Are you using DistributedLossWrapper? I see I forgot to update that class to make labels optional. I've added it to my todo list now: #531
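
For readers unfamiliar with it, DistributedLossWrapper usage looks roughly like this (a generic DDP sketch, not specific to this issue):

from pytorch_metric_learning import losses
from pytorch_metric_learning.utils import distributed as pml_dist

# the wrapper gathers embeddings (and labels, if given) across DDP processes
loss_func = pml_dist.DistributedLossWrapper(losses.NTXentLoss())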

@yankungou

Yes, I use DistributedLossWrapper. Thank you @KevinMusgrave!

@KevinMusgrave (Owner)

SelfSupervisedLoss has been added to v2.0.0. If you have two "views" of data, you don't need to make labels anymore:

from pytorch_metric_learning.losses import NTXentLoss, SelfSupervisedLoss
loss_func = SelfSupervisedLoss(NTXentLoss())
embeddings = model(data)
augmented = model(augmented_data)
loss = loss_func(embeddings, augmented)

@happen2me

I find the formula in the documentation for NTXentLoss misleading:
[NTXentLoss formula image from the documentation]

If labels for indices [a, b, c, d] are [0, 0, 2, 3], the formula implies that the positive pair is [a, b], while the negative pairs are [a, c], [a, d]. However, this is different from its actual implementation, where the positive pairs are [a, b], [b, a], and the negative samples are [a, c], [a, d], [b, c], [b, d], [c, a], [c, b], [c, d], [d, a], [d, b], [d, c].

Maybe we should remove this formula and replace it with clearer sampling information? I can make a pull request for it if I am not wrong :)

@KevinMusgrave (Owner) commented Apr 9, 2023

If labels for indices [a, b, c, d] are [0, 0, 2, 3], the formula implies that the positive pair is [a, b], while the negative pairs are [a, c], [a, d].

@happen2me Actually that is how it works.
The negative pairs for [a, b] will all be of the form [a, _].
The negative pairs for [b, a] will all be of the form [b, _].
And the loss for [a, b] is computed separately from the loss for [b, a].

I apologize if my comments further up this thread were confusing.

See these related comments:

#606 (comment)
#600 (comment)
#6 (comment)

Edit: Or are you referring to the fact that both [a, b] and [b, a] are used as positive pairs, rather than just [a, b] ?

@happen2me

@KevinMusgrave Thank you very much for your reply! Yes, when I saw the equation, I thought the positive pairs were [anchor, positive] and the negative pairs were [anchor, negative 1], [anchor, negative 2], and so on. But in fact, samples with the same label as the anchor are regarded as positives of each other (including the anchor), while the negative pairs also include pairs like [positive, negative k] and [negative k1, negative k2], which is not that intuitive purely from the equation.

I wanted to apply the loss in a situation like [question, positive passage, negative passage 1, negative passage 2, ...]. In this case, pushing negatives apart from each other isn't really necessary. I managed to do it with manually created indices_tuple.
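
For future readers, a manually created indices_tuple for a batch laid out as [question, positive passage, negative passage 1, ...] could look like the sketch below. The layout and sizes are hypothetical, and on versions before 1.5.0 dummy labels would still need to be passed:

import torch
from pytorch_metric_learning.losses import NTXentLoss

embeddings = torch.randn(6, 128)   # index 0: question, 1: positive passage, 2-5: negative passages
a1 = torch.tensor([0])             # anchor of the positive pair
p = torch.tensor([1])              # its positive
a2 = torch.tensor([0, 0, 0, 0])    # the same anchor, repeated for each negative pair
n = torch.tensor([2, 3, 4, 5])     # the negatives
loss_func = NTXentLoss()
loss = loss_func(embeddings, indices_tuple=(a1, p, a2, n))

Because every negative pair uses the question as its anchor, the negatives are not pushed apart from each other.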

@KevinMusgrave (Owner) commented Apr 11, 2023

I wanted to apply the loss in a situation like [question, positive passage, negative passage 1, negative passage 2, ...]. In this case, pushing negatives apart from each other isn't really necessary. I managed to do it with manually created indices_tuple.

@happen2me Just to make sure there's no confusion, I'll go through a simple example.

Say we have a batch size of 4 with labels: [dog, dog, cat, mouse].

In this case there will be two losses:

  • one loss for the positive pair [0,1] with negative pairs [0,2], [0,3].
  • one loss for the positive pair [1,0] with negative pairs [1,2], [1,3].

So the negative pair [2,3] (cat, mouse) isn't used at all.
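
In symbols (a sketch consistent with the description above, where s_ij is the similarity between embeddings i and j and τ is the temperature), the two per-positive-pair losses would be:

$$L_{[0,1]} = -\log\frac{e^{s_{01}/\tau}}{e^{s_{01}/\tau} + e^{s_{02}/\tau} + e^{s_{03}/\tau}}, \qquad L_{[1,0]} = -\log\frac{e^{s_{10}/\tau}}{e^{s_{10}/\tau} + e^{s_{12}/\tau} + e^{s_{13}/\tau}}$$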

I've created an issue for improving the documentation: #608
