
SIGMOD experiments reproducibility #64

Closed
alex-bogatu opened this issue Jul 19, 2020 · 2 comments

Comments

@alex-bogatu

Hi,
I am trying to reproduce the experiments from the SIGMOD 2018 paper: http://pages.cs.wisc.edu/~anhai/papers1/deepmatcher-sigmod18.pdf. I am having a hard time finding the right setup, and for most of the datasets I get results far poorer than those reported in the paper. Could you please give me a hint about the correct setup? For example, what are the parameters for the hybrid model? Using the defaults leads to poor results, and following the existing guides in the repository did not help much.

As an example, for the (complete) iTunes-Amazon scenario the best I could obtain was F1: 35.09 | Prec: 33.33 | Rec: 37.04, whereas the paper reports considerably better results.

Thank you!

@sidharthms
Collaborator

sidharthms commented Jul 19, 2020

Hi, here's a Colab notebook showing how to reproduce the numbers in the paper for the iTunes-Amazon Structured dataset: https://colab.research.google.com/drive/1CQFejG3-KeuFmMChsEoOeqypTS7njyJb#scrollTo=W4ixyezcQJPG
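
For anyone landing here later, here is a rough sketch of what that setup looks like with the deepmatcher API. The file paths and hyperparameter values below are illustrative placeholders, not the exact values used in the notebook, so check the notebook for a faithful reproduction:

```python
import deepmatcher as dm

# Load the pre-split iTunes-Amazon CSVs (paths are illustrative; the
# layout follows the deepmatcher tutorials).
train, validation, test = dm.data.process(
    path='data/itunes-amazon',
    train='train.csv',
    validation='validation.csv',
    test='test.csv')

# The "hybrid" attribute summarizer corresponds to the Hybrid model
# in the SIGMOD 2018 paper.
model = dm.MatchingModel(attr_summarizer='hybrid')

# Hyperparameters here are placeholders; use the values from the
# Colab notebook for the reported numbers.
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=16,
    best_save_path='hybrid_model.pth',
    pos_neg_ratio=3)

# Evaluate on the held-out test split (prints precision/recall/F1).
model.run_eval(test)
```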

Note that neural network models can be unstable, especially on small datasets, so you may need to run the experiment multiple times. For our SIGMOD paper, if I remember correctly, we ran each experiment 3-5 times and reported the median.
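
A minimal sketch of that repeat-and-report-the-median protocol (this assumes `run_eval` returns the F1 score on the given split; the number of runs and the hyperparameters are illustrative):

```python
import statistics
import deepmatcher as dm

# Same illustrative dataset layout as in the snippet above.
train, validation, test = dm.data.process(
    path='data/itunes-amazon',
    train='train.csv', validation='validation.csv', test='test.csv')

def train_and_eval():
    # A fresh model each run, so the weights are re-initialized.
    model = dm.MatchingModel(attr_summarizer='hybrid')
    model.run_train(train, validation, epochs=15, batch_size=16,
                    best_save_path='hybrid_run.pth')
    # Assumes run_eval returns the F1 score on the given split.
    return model.run_eval(test)

# Run the experiment several times and report the median,
# as was done (3-5 runs) for the paper.
scores = [train_and_eval() for _ in range(5)]
print('Median F1:', statistics.median(scores))
```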

@sidharthms
Collaborator

sidharthms commented Jul 19, 2020

Please re-open if you have any other questions.
