RandomNodeLoader Unequal number of nodes in each batch #9403

GARV-k · 2024-06-06T10:23:57Z

🐛 Describe the bug

For the following code:
from torch_geometric.data import Data
from torch_geometric.loader import RandomNodeLoader
from torch_geometric.utils import to_dense_adj

# data_obj and splits come from earlier in the script (not shown in this snippet).
for split in range(splits):
    print(f"for loop in split_{split + 1}:")
    # Per-split Data object using this split's column of the train/test masks.
    data_pass = Data(
        x=data_obj.x,
        edge_index=data_obj.edge_index,
        num_classes=max(data_obj.y).item() + 1,
        num_features=data_obj.x.shape[1],
        y=data_obj.y,
        train_mask=data_obj.train_mask[:, split],
        test_mask=data_obj.test_mask[:, split],
    )

    # loader = GraphSAINTRandomWalkSampler(data_pass, batch_size=batch_size, walk_length=walk_length,
    #                                      num_steps=num_steps, sample_coverage=sample_coverage)
    loader = RandomNodeLoader(data_pass, 10)  # 10 partitions
    # loader = ShaDowKHopSampler(data_obj, depth=2, num_neighbors=5,
    #                            node_idx=data_obj.train_mask)
    # loader = FixedSizeNodeLoader(data_pass, batch_size=760, shuffle=True)

    print(data_pass.train_mask.sum() + data_pass.test_mask.sum())
    print(data_pass.num_nodes)
    for data in loader:
        num_nodes = data.num_nodes
        break
    for idx, data in enumerate(loader):
        # The second positional argument of to_dense_adj is `batch`, so the
        # edge weights (None here) are passed by keyword instead.
        adj_t = to_dense_adj(data.edge_index, edge_attr=data.edge_weight)
        print(f"for {idx + 1} batch the shape of adj matrix is" + str(adj_t.shape))

The output is:
Device: cuda:0
Optimization started....
for loop in split_1:
tensor(5168)
7600
for 1 batch the shape of adj matrix istorch.Size([1, 760, 760])
for 2 batch the shape of adj matrix istorch.Size([1, 760, 760])
for 3 batch the shape of adj matrix istorch.Size([1, 751, 751])
for 4 batch the shape of adj matrix istorch.Size([1, 760, 760])
for 5 batch the shape of adj matrix istorch.Size([1, 759, 759])
for 6 batch the shape of adj matrix istorch.Size([1, 760, 760])
for 7 batch the shape of adj matrix istorch.Size([1, 758, 758])
for 8 batch the shape of adj matrix istorch.Size([1, 759, 759])
for 9 batch the shape of adj matrix istorch.Size([1, 758, 758])
for 10 batch the shape of adj matrix istorch.Size([1, 760, 760])

My question:
If the total number of nodes is 7600 and the number of batches is 10, then each adjacency matrix should cover 760 nodes, right?
Why isn't that the case?
Let me know if any other info is required.

P.S.: The output also contained this warning, although I don't think it affects anything:
home/iplab/.local/lib/python3.10/site-packages/torch_geometric/typing.py:63: UserWarning: An issue occurred while importing 'torch-scatter'. Disabling its usage. Stacktrace: /home/iplab/.local/lib/python3.10/site-packages/torch_scatter/_version_cuda.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev
warnings.warn(f"An issue occurred while importing 'torch-scatter'. "
/home/iplab/.local/lib/python3.10/site-packages/torch_geometric/typing.py:101: UserWarning: An issue occurred while importing 'torch-sparse'. Disabling its usage. Stacktrace: /home/iplab/.local/lib/python3.10/site-packages/torch_sparse/_version_cuda.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev
warnings.warn(f"An issue occurred while importing 'torch-sparse'. "

Versions

PyTorch = 2.3
Ubuntu = 22.04


rusty1s · 2024-06-14T06:22:12Z

What does

for data in loader:
    print(data.num_nodes)

return? I would expect that one adjacency matrix reports a smaller number of nodes due to isolated nodes.
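
To illustrate the isolated-node explanation, here is a minimal plain-Python sketch (not PyG's actual implementation) of what can happen when the side length of a dense adjacency matrix is inferred from the edge list alone. The helper names `dense_adj_size` and `dense_adj` are hypothetical and exist only for this example. When no node count is passed, the size is taken as the largest node index that appears in an edge plus one, so any isolated nodes with the highest indices drop out of the matrix.

```python
def dense_adj_size(edge_index, num_nodes=None):
    """Side length of a dense adjacency matrix built from an edge list.

    edge_index is a pair (sources, targets) of node-index lists. If num_nodes
    is not given, the size is inferred as the largest index seen in any edge
    plus one, so trailing isolated nodes are not counted.
    """
    if num_nodes is not None:
        return num_nodes
    sources, targets = edge_index
    if not sources:
        return 0
    return max(max(sources), max(targets)) + 1


def dense_adj(edge_index, num_nodes=None):
    """Build a dense 0/1 adjacency matrix as a list of lists."""
    n = dense_adj_size(edge_index, num_nodes)
    adj = [[0] * n for _ in range(n)]
    for s, t in zip(*edge_index):
        adj[s][t] = 1
    return adj


# Toy subgraph with 5 nodes (0..4); nodes 3 and 4 have no edges.
edge_index = ([0, 1, 1, 2], [1, 0, 2, 1])
num_nodes = 5

inferred = dense_adj(edge_index)                       # size guessed from edges
explicit = dense_adj(edge_index, num_nodes=num_nodes)  # size given explicitly

print(len(inferred), len(explicit))  # 3 5
```

If the same effect is behind the 751/758/759 shapes above, then `data.num_nodes` should still print 760 for every batch, and passing the node count explicitly, e.g. `to_dense_adj(data.edge_index, max_num_nodes=data.num_nodes)`, should give a 760 x 760 matrix each time. That is an assumption to verify against the loop above, not a confirmed diagnosis.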

GARV-k added the bug label Jun 6, 2024
