RandomNodeLoader Unequal number of nodes in each batch #9403

Open
GARV-k opened this issue Jun 6, 2024 · 1 comment

GARV-k commented Jun 6, 2024

🐛 Describe the bug

For the following code:

from torch_geometric.data import Data
from torch_geometric.loader import RandomNodeLoader
from torch_geometric.utils import to_dense_adj

# `data_obj` (the full graph) and `splits` are defined earlier in the script.
for split in range(splits):
    print(f"for loop in split_{split + 1}:")
    data_pass = Data(x=data_obj.x, edge_index=data_obj.edge_index,
                     num_classes=max(data_obj.y).item() + 1,
                     num_features=data_obj.x.shape[1], y=data_obj.y,
                     train_mask=data_obj.train_mask[:, split],
                     test_mask=data_obj.test_mask[:, split])

    # Other samplers tried (disabled):
    # loader = GraphSAINTRandomWalkSampler(data_pass, batch_size=batch_size, walk_length=walk_length,
    #                                      num_steps=num_steps, sample_coverage=sample_coverage)
    # loader = ShaDowKHopSampler(data_obj, depth=2, num_neighbors=5,
    #                            node_idx=data_obj.train_mask)
    # loader = FixedSizeNodeLoader(data_pass, batch_size=760, shuffle=True)
    loader = RandomNodeLoader(data_pass, num_parts=10)

    # Nodes covered by the train and test masks vs. all nodes.
    print(data_pass.train_mask.sum() + data_pass.test_mask.sum())
    print(data_pass.num_nodes)
    for data in loader:
        num_nodes = data.num_nodes  # node count of the first batch
        break
    for idx, data in enumerate(loader):
        # `edge_weight` goes to `edge_attr`; the second positional parameter is `batch`.
        adj_t = to_dense_adj(data.edge_index, edge_attr=data.edge_weight)
        print(f"for {idx + 1} batch the shape of adj matrix is" + str(adj_t.shape))

The output is:
Device: cuda:0
Optimization started....
for loop in split_1:
tensor(5168)
7600
for 1 batch the shape of adj matrix istorch.Size([1, 760, 760])
for 2 batch the shape of adj matrix istorch.Size([1, 760, 760])
for 3 batch the shape of adj matrix istorch.Size([1, 751, 751])
for 4 batch the shape of adj matrix istorch.Size([1, 760, 760])
for 5 batch the shape of adj matrix istorch.Size([1, 759, 759])
for 6 batch the shape of adj matrix istorch.Size([1, 760, 760])
for 7 batch the shape of adj matrix istorch.Size([1, 758, 758])
for 8 batch the shape of adj matrix istorch.Size([1, 759, 759])
for 9 batch the shape of adj matrix istorch.Size([1, 758, 758])
for 10 batch the shape of adj matrix istorch.Size([1, 760, 760])

My question:
If the total number of nodes is 7600 and num_parts = 10, then each batch should contain 760 nodes, so every adjacency matrix should have shape [1, 760, 760], right?
Why is that not the case for batches 3, 5, 7, 8, and 9?
Let me know if any other info is required.
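
The expected batch size can be checked with a little arithmetic. This is a minimal sketch, assuming the loader splits the node set into consecutive chunks of ceil(num_nodes / num_parts) nodes; the helper `chunk_sizes` is hypothetical and not part of PyG.

```python
import math


def chunk_sizes(num_nodes, num_parts):
    """Return the sizes of chunks holding ceil(num_nodes / num_parts) nodes each.

    Hypothetical helper: it only models the arithmetic, not the sampling.
    """
    batch_size = math.ceil(num_nodes / num_parts)
    sizes = []
    remaining = num_nodes
    while remaining > 0:
        # The last chunk may be smaller when num_nodes is not divisible.
        sizes.append(min(batch_size, remaining))
        remaining -= sizes[-1]
    return sizes


# Values taken from the report: 7600 nodes, 10 parts.
sizes = chunk_sizes(7600, 10)
print(sizes)
```

Under that assumption, 7600 nodes in 10 parts divide evenly, so a batch with fewer than 760 nodes would not be explained by the split itself.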

P.S.: The output also contained the following warnings, although I don't think they affect anything:
home/iplab/.local/lib/python3.10/site-packages/torch_geometric/typing.py:63: UserWarning: An issue occurred while importing 'torch-scatter'. Disabling its usage. Stacktrace: /home/iplab/.local/lib/python3.10/site-packages/torch_scatter/_version_cuda.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev
warnings.warn(f"An issue occurred while importing 'torch-scatter'. "
/home/iplab/.local/lib/python3.10/site-packages/torch_geometric/typing.py:101: UserWarning: An issue occurred while importing 'torch-sparse'. Disabling its usage. Stacktrace: /home/iplab/.local/lib/python3.10/site-packages/torch_sparse/_version_cuda.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev
warnings.warn(f"An issue occurred while importing 'torch-sparse'. "

Versions

PyTorch = 2.3
Ubuntu = 22.04

GARV-k added the bug label Jun 6, 2024

rusty1s (Member) commented Jun 14, 2024

What does

for data in loader:
    print(data.num_nodes)

return? I would expect that one adjacency matrix reports a smaller number of nodes due to isolated nodes.
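
To illustrate how isolated nodes could shrink the reported shape, here is a simplified pure-Python stand-in for a dense adjacency builder. It assumes the matrix size is inferred as the largest node index in the edge list plus one when no explicit node count is given, which is my understanding of how `to_dense_adj` behaves without `batch` or `max_num_nodes`. The function `dense_adj_inferred` and the toy graph are hypothetical.

```python
def dense_adj_inferred(edges, num_nodes=None):
    """Build a dense 0/1 adjacency matrix from (src, dst) pairs.

    If num_nodes is None, the size is inferred as (max index + 1), which
    drops isolated nodes whose index exceeds every edge endpoint.
    """
    if num_nodes is None:
        num_nodes = max((max(s, d) for s, d in edges), default=-1) + 1
    adj = [[0] * num_nodes for _ in range(num_nodes)]
    for s, d in edges:
        adj[s][d] = 1
    return adj


# Toy batch with 5 nodes; nodes 3 and 4 have no edges (isolated).
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
batch_num_nodes = 5

inferred = dense_adj_inferred(edges)                            # size from edges
pinned = dense_adj_inferred(edges, num_nodes=batch_num_nodes)   # size fixed
print(len(inferred), len(pinned))  # 3 5
```

If that is what happens here, `data.num_nodes` should print 760 for every batch, and passing the node count explicitly, e.g. `to_dense_adj(data.edge_index, max_num_nodes=data.num_nodes)`, should yield a [1, 760, 760] matrix for each batch.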
