Results decrease after shuffling the dataset #11

Open
ChenAris opened this issue Sep 3, 2021 · 1 comment

ChenAris commented Sep 3, 2021

Hi,

I have a question about the results. I ran the code (UGformerV1_PyTorch/train_UGformerV1_UnSup.py) on a shuffled dataset, and the results dropped sharply compared with the unshuffled dataset. (Please correct me if I ran it incorrectly and the results should in fact stay the same after shuffling.) I wonder what the reason is.

I found that, when the dataset is not shuffled, the graph order is strongly correlated with the graph labels in the original dataset (e.g., the first half of the dataset has label 0), and so are the global node ids. However, I can't tell where the model (the Transformer or the SampledSoftmax) uses the global node id information.
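For anyone who wants to reproduce the check, here is a minimal sketch of what I mean by shuffling and by "order is correlated with labels". It is not code from the repository: the helper names `shuffle_together` and `count_label_runs` and the toy `graphs`/`labels` lists are hypothetical placeholders for whatever the data loader actually returns. It applies a single permutation to graphs and labels so each pair stays aligned, then counts runs of identical consecutive labels as a rough indicator of how sorted the dataset is.

```python
import random


def shuffle_together(graphs, labels, seed=0):
    """Shuffle graphs and labels with one shared permutation so pairs stay aligned."""
    if len(graphs) != len(labels):
        raise ValueError("graphs and labels must have the same length")
    perm = list(range(len(graphs)))
    random.Random(seed).shuffle(perm)  # seeded for reproducibility
    return [graphs[i] for i in perm], [labels[i] for i in perm], perm


def count_label_runs(labels):
    """Count maximal runs of identical consecutive labels.

    A dataset sorted by label has one run per class; a well-mixed
    dataset has many more.
    """
    if not labels:
        return 0
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)


# Toy stand-in for a dataset whose first half has label 0 (an assumption
# for illustration only, not the real data).
graphs = ["graph_%d" % i for i in range(100)]
labels = [0] * 50 + [1] * 50

shuffled_graphs, shuffled_labels, perm = shuffle_together(graphs, labels, seed=0)

print("label runs before shuffling:", count_label_runs(labels))
print("label runs after shuffling: ", count_label_runs(shuffled_labels))
```

If the global node ids are assigned in dataset order (which I am assuming, not asserting), the same run count applied to the label of the graph each node id belongs to would show whether the ids also encode the label.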

Thanks

@podismine

I have run into the same problem. I think it is caused by the sampled softmax, which is a biased estimate. The embedded features follow a normal distribution, so nearby features differ only slightly, which could produce good results when the labels are in order.
