
Filter out empty tensors inside trim_to_layer #7942

Merged — 14 commits merged on Sep 11, 2023

Conversation

kaixuanliu
Contributor

Filter out empty tensors for `x`, `edge_index`, and `edge_attr` after calling the `trim_to_layer` function. This avoids unnecessary computation when some node/edge types end up with empty output. For example: when I train the `igbh-tiny` dataset with a 3-hop sampler and use the `trim_to_layer` function, I get many empty `edge_index` tensors for the edge type `('author', 'affiliated_to', 'institute')`, but the feature tensor for the `'author'` node type is still sent into the computation in the `HeteroConv` implementation.
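The filtering idea can be sketched as follows. This is a minimal, hypothetical helper (`filter_empty` is an illustrative name; the actual PR modifies the internals of `torch_geometric/utils/trim_to_layer.py`):

```python
import torch

def filter_empty(x_dict, edge_index_dict, edge_attr_dict=None):
    """Sketch: drop empty entries from trimmed hetero-graph dicts."""
    # Drop edge types whose trimmed edge_index carries no edges.
    edge_index_dict = {k: v for k, v in edge_index_dict.items()
                       if v.numel() > 0}
    # Keep edge attributes only for the surviving edge types.
    if edge_attr_dict is not None:
        edge_attr_dict = {k: v for k, v in edge_attr_dict.items()
                          if k in edge_index_dict}
    # Drop node types whose trimmed feature tensor has zero rows.
    x_dict = {k: v for k, v in x_dict.items() if v.size(0) > 0}
    return x_dict, edge_index_dict, edge_attr_dict
```

With these dicts pruned, `HeteroConv` never iterates over an edge type that has no edges in the current layer.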

… `trim_to_layer` function

Signed-off-by: Liu,Kaixuan <kaixuan.liu@intel.com>
@kaixuanliu kaixuanliu requested a review from wsad1 as a code owner August 28, 2023 08:47
Signed-off-by: Liu,Kaixuan <kaixuan.liu@intel.com>
@rusty1s
Member

rusty1s commented Aug 28, 2023

Can we add an additional test for this? Do you have some benchmark on the speed-up?

@rusty1s rusty1s changed the title filter out empty tensor for x, edge_index and edge_attr after calling… Filter out empty tensors inside trim_to_layer Aug 28, 2023
@kaixuanliu
Contributor Author

Can we add an additional test for this? Do you have some benchmark on the speed-up?

Sure, I will add a related test case and benchmark data afterwards.

@kaixuanliu
Contributor Author

Can we add an additional test for this? Do you have some benchmark on the speed-up?

I tried to use examples/hetero/hierarchical_sage.py for benchmarking, but since the dataset (ogbn-mags) is not big enough, there is no obvious speed-up. However, when I switched to a larger dataset (igbh-medium), loaded via GLTorch, the time consumption drops from 7:23:39 to 6:43:24.

@andreazanetti
Contributor

andreazanetti commented Aug 31, 2023

Hi @kaixuanliu @rusty1s, I have some questions about this.
I have been working on `--trim` support for the homogeneous case, but I am not yet 100% clear on all of its implications for the heterogeneous case, and I am still developing my understanding :)
So I would really appreciate clarification on the following:

a) Are the two cases tested (with and without filtering-out) numerically equivalent? I suppose they should be.

b) Is there any chance of corner cases in which, after some 'trimming', only one kind of tensor is empty (say `edge_index`) but not another (say `x`)? Would that situation be problematic once the removal happens for only one of the two?

c) Is it reasonable to think that the performance gain may be connected to the size AND the topology of the graph?

d) Is any drop in performance possible (in some scenarios) due to the additional calls (and for loops) added here?

Thanks very much!

@kaixuanliu
Contributor Author

kaixuanliu commented Aug 31, 2023

Hi @andreazanetti, I will try to answer your questions. @rusty1s, maybe you can help correct my answers if I have any misunderstandings.

a) Are the two cases tested (with and without filtering-out) numerically equivalent? I suppose they should be.
Yes. I tried my best to keep them numerically equivalent when doing the testing, though there is still some difference because of the randomness during sampling. However, I do not think the randomness can account for the ~8% perf difference on the igbh-medium dataset.

b) Is there any chance of corner cases in which, after some 'trimming', only one kind of tensor is empty (say `edge_index`) but not another (say `x`)? Would that situation be problematic once the removal happens for only one of the two?
There may be cases where some edge type's edge index is empty, but one of the two related node-side features (src or dst) is not. But I think it should not be a problem to remove them.

c) Is it reasonable to think that the performance gain may be connected to the size AND the topology of the graph?
Yes. It is related to the size and topology of the graph, as well as to the `num_neighbors` and the number of hops of the sampler.

d) Is any drop in performance possible (in some scenarios) due to the additional calls (and for loops) added here?
The additional calls are negligible: they are invoked once per layer, and the for loop only iterates over the node/edge types. I think we can ignore the overhead.
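To illustrate why removal is safe in the corner case from (b), here is a toy sum-aggregation over zero edges (this is an illustrative scatter-add, not PyG's actual message-passing code): with an empty `edge_index`, the aggregated contribution is all zeros, so skipping the edge type leaves the result unchanged.

```python
import torch

# Corner case: empty edge_index for an edge type, but non-empty src features.
edge_index = torch.empty(2, 0, dtype=torch.long)  # no edges of this type
x_src = torch.randn(5, 16)                        # 5 source nodes remain
num_dst = 3

# Toy sum aggregation: gather messages along edges, scatter-add to targets.
out = torch.zeros(num_dst, 16)
out.index_add_(0, edge_index[1], x_src[edge_index[0]])

# Zero edges contribute nothing: the output stays all-zero, which is
# exactly what skipping this edge type entirely would produce.
assert torch.count_nonzero(out) == 0
```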

@andreazanetti
Contributor

Apologies for the delay in my reply. I was off sick.

a) Are the two cases tested (with and without filtering-out) numerically equivalent? I suppose they should be.
Yes. I tried my best to keep them numerically equivalent when doing the testing, though there is still some difference because of the randomness during sampling. However, I do not think the randomness can account for the ~8% perf difference on the igbh-medium dataset.

a) AZ: Could you clarify how you estimate the 8%? Thanks.

b) Is there any chance of corner cases in which, after some 'trimming', only one kind of tensor is empty (say `edge_index`) but not another (say `x`)? Would that situation be problematic once the removal happens for only one of the two?
There may be cases where some edge type's edge index is empty, but one of the two related node-side features (src or dst) is not. But I think it should not be a problem to remove them.

b) AZ: Could you clarify how this removal would happen? I am not sure it is crystal clear to me, apologies. Also, I was wondering whether those cases and their removal should be handled in this PR.

c) Is it reasonable to think that the performance gain may be connected to the size AND the topology of the graph?
Yes. It is related to the size and topology of the graph, as well as to the `num_neighbors` and the number of hops of the sampler.

c) AZ: Makes sense to me, thanks.

d) Is any drop in performance possible (in some scenarios) due to the additional calls (and for loops) added here?
The additional calls are negligible: they are invoked once per layer, and the for loop only iterates over the node/edge types. I think we can ignore the overhead.

d) AZ: Intuitively it makes sense to me. Some data from tests would be great to show this, however. Also, the removal of the corner cases mentioned above might add some additional checks, I guess.

@kaixuanliu
Contributor Author

kaixuanliu commented Sep 5, 2023

Apologies for the delay in my reply. I was off sick.

a) Are the two cases tested (with and without filtering-out) numerically equivalent? I suppose they should be. Yes. I tried my best to keep them numerically equivalent when doing the testing, though there is still some difference because of the randomness during sampling. However, I do not think the randomness can account for the ~8% perf difference on the igbh-medium dataset. a) AZ: Could you clarify how you estimate the 8%? Thanks.
Yes. I did a performance benchmark using this igbh example from GLTorch, which utilizes PyG's `trim_to_layer` function. The command line is: `numactl -C 0-55 -m 0,1,2,3 python train_rgnn.py --dataset_size='medium' --path /lfs/lfs14/kaixuan/igbh/ --with_trim=True --epochs 2`. The original time consumption is 7:23:39; after applying this patch, it drops to 6:45:32, hence the ~8% perf improvement.

b) Is there any chance of corner cases in which, after some 'trimming', only one kind of tensor is empty (say `edge_index`) but not another (say `x`)? Would that situation be problematic once the removal happens for only one of the two? There may be cases where some edge type's edge index is empty, but one of the two related node-side features (src or dst) is not. But I think it should not be a problem to remove them. b) AZ: Could you clarify how this removal would happen? I am not sure it is crystal clear to me, apologies. Also, I was wondering whether those cases and their removal should be handled in this PR.
For example, before applying this patch, one output of the `trim_to_layer` function is: edge type `('institute', 'rev_affiliated_to', 'author')`, edge shape `torch.Size([2, 0])`, src shape `torch.Size([0, 128])`, dst shape `torch.Size([19723, 128])`. This output is still sent to the `HeteroConv` class for convolution computation, which is unnecessary. After applying this patch, `edge_index_dict` no longer contains the `('institute', 'rev_affiliated_to', 'author')` item, which avoids the unnecessary computation for this edge type in `HeteroConv`. The features of the `'author'` node type still remain, as other edges may exist with edge type `('paper', 'written_by', 'author')`.
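The before/after behavior described in this example can be sketched with a toy `edge_index_dict` (the shapes mirror the example above; the tensor contents are placeholders):

```python
import torch

edge_index_dict = {
    # Empty after trimming: shape [2, 0], no edges left for this type.
    ('institute', 'rev_affiliated_to', 'author'):
        torch.empty(2, 0, dtype=torch.long),
    # Still populated: 'author' nodes keep receiving messages here,
    # so the 'author' feature tensor must be retained.
    ('paper', 'written_by', 'author'):
        torch.randint(0, 19723, (2, 5000)),
}

# The patch drops only the empty edge type from edge_index_dict:
edge_index_dict = {k: v for k, v in edge_index_dict.items()
                   if v.numel() > 0}

assert ('institute', 'rev_affiliated_to', 'author') not in edge_index_dict
assert ('paper', 'written_by', 'author') in edge_index_dict
```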

c) Is it reasonable to think that the performance gain may be connected to the size AND the topology of the graph? Yes. It is related to the size and topology of the graph, as well as the num_neighbors and hops of sampler. c) AZ: Makes sense to me, thanks

d) Is any drop in performance possible (in some scenarios) due to the additional calls (and for loops) added here? The additional calls are negligible: they are invoked once per layer, and the for loop only iterates over the node/edge types. I think we can ignore the overhead. d) AZ: Intuitively it makes sense to me. Some data from tests would be great to show this, however. Also, the removal of the corner cases mentioned above might add some additional checks, I guess.
I also tested some other datasets, including igbh-tiny, igbh-small, and ogbn-mags; there is no performance drop on any of them. For example, for a 3-layer model, the additional calls are invoked only 3 times per batch, which I think can be ignored from a performance perspective. As for the corner cases: if the `edge_index` is empty, it means there are no links for the specified edge type, so we may not have to process the node features related to it.

Signed-off-by: Liu,Kaixuan <kaixuan.liu@intel.com>
@kaixuanliu
Contributor Author

@rusty1s @andreazanetti @rfdavid, hi, I added the related unit test, but it seems there is something wrong with the test/loader/test_imbalanced_sampler.py file, so not all test cases pass.

@rusty1s
Member

rusty1s commented Sep 6, 2023

Should be fixed now.

@codecov

codecov bot commented Sep 6, 2023

Codecov Report

Merging #7942 (c1a8f52) into master (dea1577) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head c1a8f52 differs from pull request most recent head 12bcd8a. Consider uploading reports for the commit 12bcd8a to get more accurate results

@@           Coverage Diff           @@
##           master    #7942   +/-   ##
=======================================
  Coverage   89.58%   89.58%           
=======================================
  Files         461      461           
  Lines       26970    26980   +10     
=======================================
+ Hits        24160    24170   +10     
  Misses       2810     2810           
Files Changed Coverage Δ
torch_geometric/utils/trim_to_layer.py 90.76% <100.00%> (+1.67%) ⬆️


@kaixuanliu
Contributor Author

@rusty1s @wsad1, hi, do you have any other comments or suggestions for this PR? It seems it needs another approval before merging.

Review thread on torch_geometric/utils/trim_to_layer.py (outdated, resolved)
kaixuanliu and others added 3 commits September 11, 2023 11:32
@rusty1s rusty1s enabled auto-merge (squash) September 11, 2023 06:31
@rusty1s rusty1s merged commit 0438d3a into pyg-team:master Sep 11, 2023
14 checks passed
JakubPietrakIntel pushed a commit that referenced this pull request Sep 27, 2023
Filter out empty tensors for `x`, `edge_index`, and `edge_attr` after
calling the `trim_to_layer` function. This avoids unnecessary computation
when some node/edge types end up with empty output. For example: when I
train the `igbh-tiny` dataset with a 3-hop sampler and use the
`trim_to_layer` function, I get many empty `edge_index` tensors for the
edge type `('author', 'affiliated_to', 'institute')`, but the feature
tensor for the `'author'` node type is still sent into the computation in
the `HeteroConv` implementation.

---------

Signed-off-by: Liu,Kaixuan <kaixuan.liu@intel.com>
Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>
Co-authored-by: Jintang Li <cnljt@outlook.com>