
Observing very strong Euclidean baseline results for reconstruction ... #35

Open
0xSameer opened this issue Jun 24, 2019 · 7 comments

@0xSameer

Hi,

I was able to replicate the results for the Poincaré and Lorentz manifolds as reported in your publications. However, when recreating the Euclidean baselines, I am seeing much stronger reconstruction scores than those reported. For example, with the following changes to ./train-nouns.sh:

-manifold euclidean
-dims 200
-lr 1.0

After just 200 epochs, we get:

json_stats: {"epoch": 199, ..., "mean_rank": 1.69, "map_rank": 0.90}

And after 1400 epochs, we get:

"mean_rank": 1.19, "map_rank": 0.95

No other changes were made to the code. Are we doing something wrong?
Note that we had to add an entry to the train-nouns.sh script for the Euclidean manifold, and we used the same learning rate as specified for the Poincaré manifold (1.0) rather than the default of 1000.0 set in the code.

Thanks!
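For context, the mean_rank and map_rank numbers above measure reconstruction quality: for every ground-truth edge, the true neighbour's rank among all candidate nodes (sorted by embedding distance) is recorded, and the mean rank and mean average precision are taken over all edges. Below is a minimal sketch of that computation, assuming NumPy and a generic distance function. It is illustrative only, not the repository's actual evaluation code (which, among other details, may exclude other true neighbours from each candidate list):

```python
import numpy as np

def eval_reconstruction(embeddings, neighbours, dist_fn):
    """Illustrative mean-rank / MAP computation for reconstruction.

    embeddings: (n, d) array of node embeddings
    neighbours: dict mapping each node to the set of its true neighbours
    dist_fn:    distance function from one point to all points
    """
    ranks, ap_scores = [], []
    for u, nbrs in neighbours.items():
        dists = dist_fn(embeddings[u], embeddings)  # distance to all nodes
        dists[u] = np.inf                           # exclude the node itself
        order = np.argsort(dists)
        # 1-based rank of every true neighbour in the sorted candidate list.
        nbr_ranks = np.array([int(np.where(order == v)[0][0]) + 1 for v in nbrs])
        ranks.extend(nbr_ranks.tolist())
        # Average precision over this node's neighbour set.
        sorted_r = np.sort(nbr_ranks)
        ap_scores.append(np.mean(np.arange(1, len(sorted_r) + 1) / sorted_r))
    return float(np.mean(ranks)), float(np.mean(ap_scores))

# Toy usage with Euclidean distance:
emb = np.random.randn(5, 2)
nbrs = {0: {1, 2}, 1: {0}, 2: {0}}
euclid = lambda x, Y: np.linalg.norm(Y - x, axis=-1)
print(eval_reconstruction(emb, nbrs, euclid))
```

A mean rank near 1 and a MAP near 1 mean the true neighbours are almost always the closest points in the embedding, which is why the Euclidean numbers reported above look surprisingly strong.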

@0xSameer (Author)

Correction to the hyperparameter description above. We are using:

-dim 200

not "-dims".

@mnick (Contributor)

mnick commented Aug 29, 2019

I wanted to quickly follow up on our separate email conversation: we identified the issue and are finishing additional experiments. I will update this issue in the next few days with the results. Thanks again for filing this!

@HHalva

HHalva commented Jul 24, 2020

I am still seeing this. Is there any update? It would be very important for everyone to know what's going on. Are the results presented in the NIPS'17 paper wrong or misleading?

@martinwhl

Same situation here. Is there any update on this?

@mnick (Contributor)

mnick commented Sep 1, 2020

Thank you for raising this again, and sorry for the delay. In addition to our follow-ups over email, we should have updated this issue on GitHub as well.

Basically: the reason for the stronger Euclidean baseline with the open-sourced code is that the paper used a different setting, in which the Euclidean embeddings were regularized (similar to previous work). When open-sourcing the code we disabled this regularization by default, and it turned out to work better (as pointed out by Sameer). Since it led to a stronger Euclidean baseline in higher dimensions, we decided to keep it that way in the code. Hyperbolic embeddings provide a substantial performance improvement in lower dimensions, which is really the main focus of this work.
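The thread does not spell out the exact form of the regularization, but for Euclidean embeddings it typically means adding an L2 penalty on the embedding norms to the ranking loss. A minimal PyTorch sketch under that assumption follows; regularized_step_loss and reg_lambda are illustrative names, not the repository's actual code:

```python
import torch

def regularized_step_loss(base_loss, embedding, batch_idx, reg_lambda=1.0):
    """Illustrative: base ranking loss plus an L2 penalty on the norms
    of the Euclidean vectors touched in this batch.

    base_loss:  ranking loss for the batch (scalar tensor)
    embedding:  torch.nn.Embedding holding the Euclidean vectors
    batch_idx:  LongTensor of node indices used in this step
    reg_lambda: hypothetical penalty strength; 0 recovers the
                unregularized default described above
    """
    vecs = embedding(batch_idx)               # (batch, dim)
    penalty = vecs.pow(2).sum(dim=-1).mean()  # mean squared norm
    return base_loss + reg_lambda * penalty
```

With reg_lambda = 0 this reduces to the open-sourced default, where nothing constrains the norms: in high dimensions the unconstrained Euclidean embeddings then have enough capacity to fit the reconstruction task very well.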

@martinwhl

Sorry, I'm not sure I fully understand how the Euclidean embeddings were regularized... could you please explain a little more? Many thanks.

@davecerr

@martinwhl I think the idea is that since there is exponentially more "space" near the boundary of the Poincaré ball, the easiest way for the algorithm to minimise the loss is to push all nodes outwards. This is a form of overfitting, since we ideally want nodes that are higher in the original hierarchy to be kept closer to the centre of the ball. I believe this is achieved by regularising the norm of v in equation 6 of the paper: for every parent (v) / child (u) relationship we consider, we are always encouraging parents (nodes higher in the hierarchy) to stay closer to the origin.
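To make that concrete, here is one way such an asymmetric penalty could look, where only the parent side of each (child u, parent v) pair is pulled toward the origin. This is my reading of the comment above, not code from the repository; loss_with_parent_penalty and lam are hypothetical names:

```python
import torch

def loss_with_parent_penalty(dist_loss, parent_vecs, lam=0.1):
    """Illustrative: ranking loss plus a norm penalty on parents only.

    dist_loss:   base ranking loss computed from distances (scalar tensor)
    parent_vecs: embeddings of the parent nodes v in the batch
    lam:         hypothetical penalty strength
    """
    # Only parents are penalized, so nodes higher in the hierarchy are
    # encouraged to stay near the centre while children may sit further out.
    parent_penalty = parent_vecs.pow(2).sum(dim=-1).mean()
    return dist_loss + lam * parent_penalty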
