
Histogram loss #651

Merged 12 commits into KevinMusgrave:dev on Jul 25, 2023
Conversation

@domenicoMuscill0 (Contributor) commented Jul 13, 2023

I implemented the Histogram Loss, using as a reference the rather complex code of this PyTorch implementation of the loss function. It is the officially recognized PyTorch implementation from the authors of the paper themselves, but during testing I found and corrected a bug in that code and documented it. There is another implementation of the loss function, but it is not officially recognized. Should I add it to the tests anyway?
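For readers unfamiliar with the loss, here is a minimal sketch of the computation described in the paper (Ustinova & Lempitsky, 2016). The function name and details are illustrative, not the code in this PR; it assumes L2-normalized embeddings and the delta = 2/n_bins convention discussed below.

import torch

def histogram_loss_sketch(embeddings, labels, n_bins=100):
    # Illustrative sketch, not the PR's actual code.
    # Assumes embeddings are L2-normalized, so similarities lie in [-1, 1].
    sims = embeddings @ embeddings.T
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    upper = torch.triu(torch.ones_like(same), diagonal=1)  # count each pair once
    pos_sims = sims[same & upper]
    neg_sims = sims[~same & upper]

    delta = 2 / n_bins
    nodes = torch.linspace(-1, 1, n_bins + 1)  # histogram nodes with step delta

    def soft_histogram(s):
        # Linear ("triangular") soft assignment of each similarity to its two
        # nearest nodes keeps the histogram differentiable.
        weights = torch.clamp(1 - (s.unsqueeze(0) - nodes.unsqueeze(1)).abs() / delta, min=0)
        hist = weights.sum(dim=1)
        return hist / hist.sum()

    h_pos = soft_histogram(pos_sims)
    h_neg = soft_histogram(neg_sims)
    # Loss = estimated probability that a random negative pair is more
    # similar than a random positive pair.
    return (h_neg * torch.cumsum(h_pos, dim=0)).sum()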

@KevinMusgrave (Owner) commented

Is there a good default value we can use for num_bins?

@domenicoMuscill0 (Contributor, Author) commented Jul 21, 2023

From the Histogram loss paper:

The only associated parameter of such loss is the number of histogram bins, to which the results have very low sensitivity

and

the results for CUB-200-2011 remain stable for the sizes equal to 0.005, 0.01, 0.02 and 0.04 (these values correspond to 400, 200, 100 and 50 bins in the histograms)

so I think 100 is a good default since fine-tuning this parameter does not make a significant difference.
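To make the quoted correspondence explicit: the similarities live in [-1, 1], an interval of width 2, so a bin size delta maps to 2/delta bins:

for delta in (0.005, 0.01, 0.02, 0.04):
    print(delta, int(2 / delta))  # -> 400, 200, 100, 50 bins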

@KevinMusgrave (Owner) commented

Ok I set n_bins=100 as the default.

Can you add a description for n_bins and delta here: https://github.com/KevinMusgrave/pytorch-metric-learning/pull/651/files#diff-d85c985479d586f06110b2e294e6f4ea69e44e309f922e142012746f231bbf51R435-R436

Also does this loss work with anything other than CosineSimilarity?

Need to adjust the code according to the documentation
@domenicoMuscill0 (Contributor, Author) commented

  1. About the description of n_bins and delta: I wrote the code to cover both the case where the user wants a predefined mesh size for the partition of the interval and the case where an exact number of bins should be used. The logic was that at least one of n_bins and delta must be specified and, if both are given, they must describe the same partition:

assert (
    delta == 2 / n_bins
), f"delta and n_bins must satisfy the equation delta = 2/n_bins.\nPassed values are delta={delta} and n_bins={n_bins}"

If n_bins now defaults to 100, the first check that at least one parameter is set becomes useless. I propose to leave both n_bins and delta undefined by default, and to fall back to n_bins=100 when the user specifies neither of them (see the sketch after this list). I am going to commit this change as soon as possible.

  2. About CosineSimilarity. Again from the same paper:

We assume that the last layer of the network performs length-normalization, so that the embedded vectors {yi = f(xi; θ)} are L2-normalized.

I think it can work with any distance, given that the embeddings are normalized. The strict requirement is that the distance values are bounded in [-1, 1], so any distance fulfilling this requirement is acceptable (see the usage sketch after this list).
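Here is a minimal sketch of the default handling proposed in point 1, with a hypothetical helper name; the actual constructor logic in the PR may differ.

def resolve_partition(n_bins=None, delta=None):
    # Hypothetical helper illustrating the proposed defaults.
    if n_bins is None and delta is None:
        n_bins = 100              # fallback when neither parameter is specified
    if delta is None:
        delta = 2 / n_bins        # derive the mesh size from the bin count
    if n_bins is None:
        n_bins = int(2 / delta)   # derive the bin count from the mesh size
    assert delta == 2 / n_bins, (
        f"delta and n_bins must satisfy delta = 2/n_bins.\n"
        f"Passed values are delta={delta} and n_bins={n_bins}"
    )
    return n_bins, delta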
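And a usage sketch for point 2, assuming the merged class is exposed as losses.HistogramLoss with the n_bins parameter discussed above:

import torch
from pytorch_metric_learning import losses
from pytorch_metric_learning.distances import CosineSimilarity

# CosineSimilarity produces values bounded in [-1, 1], matching the
# support of the histogram; it also L2-normalizes embeddings internally.
loss_func = losses.HistogramLoss(n_bins=100, distance=CosineSimilarity())

embeddings = torch.randn(32, 128)
labels = torch.randint(0, 4, (32,))
loss = loss_func(embeddings, labels)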

@KevinMusgrave merged commit ea0946a into KevinMusgrave:dev on Jul 25, 2023 (19 checks passed)
@KevinMusgrave (Owner) commented

Thanks @domenicoMuscill0 !
