Results decrease after shuffling the dataset #11

ChenAris · 2021-09-03T02:13:35Z

Hi,

I have a question about the results. I ran the code (UGformerV1_PyTorch/train_UGformerV1_UnSup.py) on a shuffled dataset, and the results dropped sharply compared with the unshuffled dataset. (Please correct me if I ran it incorrectly and the results should in fact stay the same after shuffling.) I wonder what the reason is...

I found that, when the dataset is not shuffled, the graph order is strongly correlated with the graph labels in the original dataset (e.g., the first half of the dataset has label 0), and so are the global node ids. But I don't know where the model (Transformer or SampledSoftmax) uses the global node id information...

Thanks
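For anyone who wants to reproduce the observation above, here is a minimal sketch of how one might check whether graph order tracks the labels, and how to shuffle graphs and labels with a single permutation so the pairs stay aligned. It does not use the repository's data loader; `label_run_fraction`, `shuffle_together`, and the toy `graphs`/`labels` lists are hypothetical names made up for illustration.

```python
import random


def label_run_fraction(labels):
    """Fraction of adjacent pairs sharing a label.

    Values near 1.0 mean the labels sit in long contiguous blocks,
    i.e. the dataset order is strongly tied to the label.
    """
    if len(labels) < 2:
        return 1.0
    same = sum(1 for a, b in zip(labels, labels[1:]) if a == b)
    return same / (len(labels) - 1)


def shuffle_together(graphs, labels, seed=0):
    """Apply one random permutation to both lists so pairs stay aligned."""
    perm = list(range(len(graphs)))
    random.Random(seed).shuffle(perm)
    return [graphs[i] for i in perm], [labels[i] for i in perm], perm


# Toy stand-in for a dataset whose first half has label 0 (assumption).
graphs = [f"g{i}" for i in range(10)]
labels = [0] * 5 + [1] * 5

ordered_fraction = label_run_fraction(labels)
shuffled_graphs, shuffled_labels, perm = shuffle_together(graphs, labels)
shuffled_fraction = label_run_fraction(shuffled_labels)

print("ordered:", ordered_fraction)
print("shuffled:", shuffled_fraction)
```

If the shuffled run loses accuracy only when graphs and labels are permuted together like this, the drop is more likely tied to ordering (and hence node ids) than to a misaligned label array.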

podismine · 2021-11-04T07:48:56Z

I have found the same problem. I think it is caused by the sampled softmax, which is a biased estimate. The embedded features follow a normal distribution, so nearby features differ only slightly, and that alone could give good results when the labels are in order.
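A rough way to see why ordering could matter: if global node ids are handed out as contiguous blocks in dataset order (an assumption here, not something confirmed from the repository code), then in an unshuffled dataset the id range itself separates the classes. The sketch below only illustrates that effect on toy data; `assign_global_node_ids` and `mean_node_id_per_label` are hypothetical helpers, and nothing in it touches the actual model or the sampled softmax.

```python
import random


def assign_global_node_ids(graph_sizes):
    """Give each graph a contiguous block of node ids, in dataset order."""
    ranges, next_id = [], 0
    for n in graph_sizes:
        ranges.append(range(next_id, next_id + n))
        next_id += n
    return ranges


def mean_node_id_per_label(id_ranges, labels):
    """Average global node id over all nodes belonging to each label."""
    totals, counts = {}, {}
    for ids, y in zip(id_ranges, labels):
        totals[y] = totals.get(y, 0) + sum(ids)
        counts[y] = counts.get(y, 0) + len(ids)
    return {y: totals[y] / counts[y] for y in totals}


# Toy data (assumption): six graphs of four nodes, first half labelled 0.
sizes = [4] * 6
labels = [0, 0, 0, 1, 1, 1]

# Unshuffled: label 0 owns ids 0..11 and label 1 owns ids 12..23.
ordered = mean_node_id_per_label(assign_global_node_ids(sizes), labels)

# Shuffled: permute the graphs first, then assign ids in the new order.
order = list(range(len(sizes)))
random.Random(0).shuffle(order)
shuffled = mean_node_id_per_label(
    assign_global_node_ids([sizes[i] for i in order]),
    [labels[i] for i in order],
)

print("ordered:", ordered)
print("shuffled:", shuffled)
```

In the unshuffled case the per-label mean ids are far apart, so anything that treats nearby ids as similar would separate the classes without learning graph structure. After shuffling, that gap typically shrinks.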
