
Implementation of hinge loss does not match the paper #7

Open
Yingdong-Hu opened this issue Feb 7, 2021 · 3 comments

Comments

@Yingdong-Hu

Hello Kipf, I found a discrepancy between the implementation and the loss described in the paper.

According to Eq. (5) in the paper, for negative samples you compute the Euclidean distance between the negative state sample at timestep t and the state at timestep t+1.

However, in the code below, state and neg_state are both at timestep t.

self.neg_loss = torch.max(
    zeros, self.hinge - self.energy(
        state, action, neg_state, no_trans=True)).mean()
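A minimal sketch of the negative-sample hinge term as it appears in the snippet above, using a squared-Euclidean energy. All names here (`energy`, `neg_hinge_loss`, the shapes) are illustrative assumptions for this sketch, not taken from the repo; the point is only that the term `max(0, hinge - H(state, neg_state))` compares two states drawn from the same marginal, regardless of which timestep the second argument comes from.

```python
import numpy as np

def energy(z_a, z_b):
    # Squared Euclidean distance, averaged over the feature dimension
    # (a stand-in for the Hinge energy H in the paper).
    return ((z_a - z_b) ** 2).mean(axis=-1)

def neg_hinge_loss(state, neg_state, hinge=1.0):
    # max(0, hinge - H(state, neg_state)), averaged over the batch.
    return np.maximum(0.0, hinge - energy(state, neg_state)).mean()

rng = np.random.default_rng(0)
state = rng.normal(size=(4, 8))      # z_t for a batch of 4 (hypothetical shape)
neg_state = rng.normal(size=(4, 8))  # negative samples (e.g. a batch permutation)
loss = neg_hinge_loss(state, neg_state)
```

Under this reading, swapping `state` (timestep t) for the encoded `next_state` (timestep t+1) only changes which sample from the state distribution the negative is contrasted against, which is presumably why the two formulations behave similarly.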

I noticed that the same question was also asked here.

Is this a bug? Does the discrepancy affect the final performance?

@tkipf
Owner

tkipf commented Feb 7, 2021 via email

@Yingdong-Hu
Author

Yingdong-Hu commented Feb 24, 2021

Hi, thanks for your answer.
I understand that both formulations are equivalent.
But I have another question:
For the Atari games, even when I set the same random seed, repeated executions of the program behave differently.
As I understand it, running the program multiple times with the same seed should produce identical results.
Are there other nondeterministic operations in the code that cause this randomness?
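For reference, here is a sketch of seeding every RNG source a typical PyTorch training script touches. This is a generic checklist, not code from this repo: the stdlib/NumPy parts run as-is, while the torch calls are left as comments since they depend on the installed framework. Note that even with all of these set, some CUDA/cuDNN kernels and certain ops (e.g. scatter/index_add) remain nondeterministic, which can explain run-to-run differences.

```python
import os
import random
import numpy as np

def seed_everything(seed: int) -> None:
    random.seed(seed)                       # Python stdlib RNG
    np.random.seed(seed)                    # NumPy global RNG
    os.environ["PYTHONHASHSEED"] = str(seed)
    # torch.manual_seed(seed)               # CPU + current CUDA device
    # torch.cuda.manual_seed_all(seed)      # all CUDA devices
    # torch.backends.cudnn.deterministic = True
    # torch.backends.cudnn.benchmark = False

seed_everything(42)
a = np.random.rand(3)
seed_everything(42)
b = np.random.rand(3)
# a and b are identical: re-seeding reproduces the sequence
```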

@tkipf
Owner

tkipf commented Feb 24, 2021 via email
