
Can Scatter Be Reproducible? #226

Closed
yuxiang-guo opened this issue Jul 9, 2021 · 7 comments

@yuxiang-guo

I used the scatter function to implement a GCN like this:

re = scatter(embed, index, dim=0, out=None, dim_size=embed.size(0), reduce='mean')

I found that even though the inputs (embed and index) are the same when I run the code twice, the outputs are still different. So I want to know whether the scatter method involves any randomness that makes the result non-reproducible. How can I make the result of scatter deterministic? Thank you very much.


rusty1s commented Jul 10, 2021

Scatter is a non-deterministic operation by design since it makes use of atomic operations in which the order of aggregation is non-deterministic, leading to minor numerical differences. As an alternative, you can make use of the segment_csr operation of torch_scatter.

@yuxiang-guo

Thanks very much. But how can I get the same result by using the segment_csr operation instead of the `scatter` operation?


rusty1s commented Jul 10, 2021

segment_csr expects indices to be sorted, and the ptr tensor denotes the compressed index representation, similar to the row pointer of a sparse matrix in CSR format:

index = torch.tensor([0, 0, 0, 1, 2, 2])
ptr = torch.tensor([0, 3, 4, 6])

scatter(x, index, dim=0) == segment_csr(x, ptr)
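For illustration, here is a minimal pure-Python sketch of the correspondence above (plain lists instead of tensors, and a plain sum in place of the library reduction): given a sorted index, ptr holds the position where each group starts, plus the total length, so each output slot reduces one contiguous slice in a fixed order.

```python
# Sorted index and its compressed (CSR-style) representation:
# index = [0, 0, 0, 1, 2, 2]  ->  ptr = [0, 3, 4, 6]
index = [0, 0, 0, 1, 2, 2]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Build ptr from the sorted index: count the size of each group,
# then take a running sum of the counts.
num_groups = index[-1] + 1
counts = [0] * num_groups
for i in index:
    counts[i] += 1
ptr = [0]
for c in counts:
    ptr.append(ptr[-1] + c)

# Deterministic segment sum: each output slot reduces a contiguous
# slice x[ptr[k]:ptr[k+1]], so the aggregation order never varies.
out = [sum(x[ptr[k]:ptr[k + 1]]) for k in range(len(ptr) - 1)]
print(ptr)  # [0, 3, 4, 6]
print(out)  # [6.0, 4.0, 11.0]
```

This is why segment_csr can be deterministic while scatter is not: the slice boundaries fix the reduction order, whereas atomic adds commit in whatever order the hardware schedules them.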

@yuxiang-guo

Thank you very much!

@yuxiang-guo

By the way, what if I cannot guarantee that the indices are in order, and sorting is too time-consuming? What can I do, or is there any other method that achieves the same result as `scatter`?


rusty1s commented Jul 12, 2021

If you cannot guarantee that indices are in order, there exists no parallelized operation that can group elements while still being deterministic. The resulting run-to-run differences are just the natural imprecision of floating-point arithmetic under varying summation order.
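One practical compromise, not spelled out in the thread but consistent with the answer above: if the grouping stays fixed across training iterations, sort the index once up front, cache the permutation and the ptr tensor, and reuse them every step, paying the sort cost only once. A pure-Python sketch of that preprocessing (a stable argsort plus boundary detection; the library version would use tensor ops instead):

```python
index = [2, 0, 1, 0, 2, 0]            # unsorted group ids
x = [10.0, 1.0, 7.0, 2.0, 20.0, 3.0]  # one value per id

# One-time preprocessing: a stable argsort of the index, and a ptr
# built from the positions where the sorted group id changes.
perm = sorted(range(len(index)), key=lambda i: index[i])
sorted_index = [index[i] for i in perm]
ptr = [0]
for pos in range(1, len(sorted_index)):
    if sorted_index[pos] != sorted_index[pos - 1]:
        ptr.append(pos)
ptr.append(len(sorted_index))

# Per-iteration work: permute the values, then reduce each
# contiguous slice deterministically.
xs = [x[i] for i in perm]
out = [sum(xs[ptr[k]:ptr[k + 1]]) for k in range(len(ptr) - 1)]
print(out)  # group 0: 1+2+3, group 1: 7, group 2: 10+20
```

If the index changes every iteration, this amortization no longer helps, and the trade-off between determinism and the cost of sorting is unavoidable, as noted above.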


github-actions bot commented Jan 9, 2022

This issue had no activity for 6 months. It will be closed in 2 weeks unless there is some new activity. Is this issue already resolved?
