QM9 and SphereNet example error #77

Closed
yuanqidu opened this issue Jan 5, 2022 · 9 comments
Labels
3dgraph Deep Learning on 3D Graphs

yuanqidu commented Jan 5, 2022

Great work!

When I copied the code from the README file and ran it with the QM9 dataset provided by DIG, I got the following error while creating a SphereNet model.

Traceback (most recent call last):
  File "main_qm9.py", line 19, in <module>
    model = SphereNet(energy_and_force=False, cutoff=5.0, num_layers=4,
  File "/opt/conda/lib/python3.8/site-packages/dig/threedgraph/method/spherenet/spherenet.py", line 265, in __init__
    self.emb = emb(num_spherical, num_radial, self.cutoff, envelope_exponent)
  File "/opt/conda/lib/python3.8/site-packages/dig/threedgraph/method/spherenet/spherenet.py", line 23, in __init__
    self.dist_emb = dist_emb(num_radial, cutoff, envelope_exponent)
  File "/opt/conda/lib/python3.8/site-packages/dig/threedgraph/method/spherenet/features.py", line 178, in __init__
    self.reset_parameters()
  File "/opt/conda/lib/python3.8/site-packages/dig/threedgraph/method/spherenet/features.py", line 181, in reset_parameters
    torch.arange(1, self.freq.numel() + 1, out=self.freq).mul_(PI)
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
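
For context, this RuntimeError is the message PyTorch raises when a leaf tensor that requires grad is modified in place while autograd is recording. The following is a minimal sketch that is independent of DIG; the tensor name and size are illustrative only.

```python
import torch

# A Parameter is a leaf tensor with requires_grad=True.
freq = torch.nn.Parameter(torch.zeros(4))

raised = False
message = ""
try:
    # In-place update while autograd is recording: rejected for such a leaf.
    freq.mul_(3.0)
except RuntimeError as err:
    raised = True
    message = str(err)

print(raised, message)
```

Whether the `out=self.freq` call in the traceback reaches this check appears to depend on the installed PyTorch version, which would explain why not everyone in this thread could reproduce it.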

@limei0307
Collaborator

Hi @yuanqidu,

Thanks for your interest in our work.

I couldn't reproduce this issue. Could you try replacing lines 176 to 181 of features.py

        self.freq = torch.nn.Parameter(torch.Tensor(num_radial))

        self.reset_parameters()

    def reset_parameters(self):
        torch.arange(1, self.freq.numel() + 1, out=self.freq).mul_(PI)

with

        self.freq = torch.nn.Parameter(
            data=torch.tensor(
                np.pi * np.arange(1, num_radial + 1, dtype=np.float32)
            ),
            requires_grad=True,
        )

Does it solve your problem?

Thanks

Collaborator

zoexu119 commented Jan 6, 2022

Hi @yuanqidu,

I also ran into this issue and believe it's due to the PyTorch version. You can also try replacing line 181 with

self.freq.data = torch.arange(1, self.freq.numel() + 1).float().mul_(PI)
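
As a rough check of what the `.data` assignment does, the sketch below runs the same statement on a standalone Parameter. It assumes `PI` in features.py equals `math.pi`, and `num_radial = 6` is an arbitrary illustrative size.

```python
import math
import torch

num_radial = 6  # illustrative size, not taken from DIG's defaults
freq = torch.nn.Parameter(torch.empty(num_radial))

# Assigning to .data swaps the stored values; autograd does not see an
# in-place operation on the leaf, so no RuntimeError is raised here.
freq.data = torch.arange(1, freq.numel() + 1).float().mul_(math.pi)

# The object is still a trainable leaf Parameter afterwards.
print(type(freq).__name__, freq.is_leaf, freq.requires_grad)
```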

Takaogahara commented Feb 12, 2022

Hello,

I had this issue, and replacing lines 176 and 181 with the suggestions above worked for me.
My torch and DIG versions:

torch==1.10.2+cu113
torch-geometric==2.0.3
dive-into-graphs==0.1.2
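
Since the behaviour in this thread seems to depend on versions, a small helper for collecting them may be useful when reporting. This is a sketch using only the standard library; `installed_versions` is a hypothetical helper name, and the package names are the distribution names listed above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["torch", "torch-geometric", "dive-into-graphs"])
for name, version in report.items():
    print(f"{name}=={version}")
```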

limei0307 added the 3dgraph (Deep Learning on 3D Graphs) label Feb 15, 2022

@vinayak2019

> Hi @yuanqidu,
>
> I also met this issue and believe it's due to the PyTorch version. You can also try to replace line 181 to be
>
> self.freq.data = torch.arange(1, self.freq.numel() + 1).float().mul_(PI)

This did not work for me, and neither did the first solution. spherenet.py still depends on reset_parameters at line 29, which causes an error:

class emb(torch.nn.Module):
    def __init__(self, num_spherical, num_radial, cutoff, envelope_exponent):
        super(emb, self).__init__()
        self.dist_emb = dist_emb(num_radial, cutoff, envelope_exponent)
        self.angle_emb = angle_emb(num_spherical, num_radial, cutoff, envelope_exponent)
        self.torsion_emb = torsion_emb(num_spherical, num_radial, cutoff, envelope_exponent)
        self.reset_parameters()

    def reset_parameters(self):
        self.dist_emb.reset_parameters()

@limei0307
Collaborator

Hi @vinayak2019,

Could you please provide more detail about the error? Thanks.
Also, please install sympy via 'pip install sympy'.

Thanks.

vinayak2019 commented Apr 13, 2022

When I replace lines 176 to 181 with the suggestion above, I get the following error.

  File "sphere.py", line 20, in <module>
    model = SphereNet(energy_and_force=False, cutoff=5.0, num_layers=4,
  File "/home/vbh226/.local/lib/python3.8/site-packages/dig/threedgraph/method/spherenet/spherenet.py", line 266, in __init__
    self.emb = emb(num_spherical, num_radial, self.cutoff, envelope_exponent)
  File "/home/vbh226/.local/lib/python3.8/site-packages/dig/threedgraph/method/spherenet/spherenet.py", line 26, in __init__
    self.reset_parameters()
  File "/home/vbh226/.local/lib/python3.8/site-packages/dig/threedgraph/method/spherenet/spherenet.py", line 29, in reset_parameters
    self.dist_emb.reset_parameters()
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'dist_emb' object has no attribute 'reset_parameters'

In this case I had deleted the following lines in features.py:

        self.reset_parameters()

    def reset_parameters(self):
        torch.arange(1, self.freq.numel() + 1, out=self.freq).mul_(PI)

The error is expected as reset_parameters is no longer defined.
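
To illustrate why the parent module fails, and one way a caller could tolerate a submodule without that method, here is a minimal sketch. `Child` and `Parent` are hypothetical stand-ins, not DIG's `dist_emb` and `emb`.

```python
import torch


class Child(torch.nn.Module):
    """Hypothetical submodule whose reset_parameters method was deleted."""

    def __init__(self):
        super().__init__()
        self.freq = torch.nn.Parameter(torch.ones(3))


class Parent(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.child = Child()
        self.reset_parameters()

    def reset_parameters(self):
        # Calling self.child.reset_parameters() directly would raise
        # AttributeError here; only call the hook if the child defines it.
        reset = getattr(self.child, "reset_parameters", None)
        if callable(reset):
            reset()


parent = Parent()
has_hook = hasattr(parent.child, "reset_parameters")
print(has_hook)
```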

When I add those lines back to the file while keeping the modified self.freq, I get the following error.

Traceback (most recent call last):
  File "sphere.py", line 20, in <module>
    model = SphereNet(energy_and_force=False, cutoff=5.0, num_layers=4,
  File "/home/vbh226/.local/lib/python3.8/site-packages/dig/threedgraph/method/spherenet/spherenet.py", line 266, in __init__
    self.emb = emb(num_spherical, num_radial, self.cutoff, envelope_exponent)
  File "/home/vbh226/.local/lib/python3.8/site-packages/dig/threedgraph/method/spherenet/spherenet.py", line 23, in __init__
    self.dist_emb = dist_emb(num_radial, cutoff, envelope_exponent)
  File "/home/vbh226/.local/lib/python3.8/site-packages/dig/threedgraph/method/spherenet/features.py", line 181, in __init__
    self.reset_parameters()
  File "/home/vbh226/.local/lib/python3.8/site-packages/dig/threedgraph/method/spherenet/features.py", line 184, in reset_parameters
    torch.arange(1, self.freq.numel() + 1, out=self.freq).mul_(PI)
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.

If I now set self.freq.data = torch.arange(1, self.freq.numel() + 1).float().mul_(PI), I still get the leaf Variable error.
The only way I could get it working is by doing this:

class dist_emb(torch.nn.Module):
    def __init__(self, num_radial, cutoff=5.0, envelope_exponent=5):
        super(dist_emb, self).__init__()
        self.cutoff = cutoff
        self.envelope = Envelope(envelope_exponent)
        self.freq = torch.nn.Parameter(
            data=torch.tensor(
                np.pi * np.arange(1, num_radial + 1, dtype=np.float32)
            ),
            requires_grad=True,
        )
        self.reset_parameters()

    def reset_parameters(self):
       # self.freq.data = torch.arange(1, self.freq.numel() + 1, out=self.freq).mul_(PI)
       pass

I don't know how correct that is.
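
One possible alternative to an empty reset_parameters is to keep the method and perform the write with autograd recording switched off. The sketch below uses `RadialFreq`, a hypothetical reduced stand-in that holds only the freq parameter, and assumes `PI` equals `math.pi`; it is not the fix that was applied in the repository.

```python
import math
import torch


class RadialFreq(torch.nn.Module):
    """Hypothetical, reduced stand-in for dist_emb: only the freq parameter."""

    def __init__(self, num_radial):
        super().__init__()
        self.freq = torch.nn.Parameter(torch.empty(num_radial))
        self.reset_parameters()

    def reset_parameters(self):
        # Target values pi * [1, ..., num_radial], built outside the parameter.
        values = torch.arange(1, self.freq.numel() + 1, dtype=self.freq.dtype)
        # In-place copy is allowed on a leaf when autograd is not recording.
        with torch.no_grad():
            self.freq.copy_(values * math.pi)


layer = RadialFreq(4)
with torch.no_grad():
    layer.freq.zero_()       # simulate training having changed the values
layer.reset_parameters()     # restores pi * [1, 2, 3, 4]
```

With this shape, a parent module that calls `reset_parameters()` on the submodule keeps working, and the parameter can be re-initialised later.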

I am using
torch==1.11.0
dive-into-graphs (cloned from GitHub; the version installed with pip install dive-into-graphs has a bug: the mask is missing at line 53)

Collaborator

limei0307 commented Apr 14, 2022

Hi @vinayak2019,

I think for the first solution, you can just remove the "self.dist_emb.reset_parameters()" call in spherenet.py (lines 21 to 24), since dist_emb no longer needs its parameters reset.

For the second solution, could you provide the detailed error output?

Yes, you can just clone the code from GitHub, since we updated it after the latest pip release (0.1.2).

Thanks.

vinayak2019 commented Apr 14, 2022

Thanks, @limei0307

The second problem is with the version installed from PyPI. I hit it with a different dataset, not QM9. The error is as follows.

Traceback (most recent call last):
  File "sphere.py", line 32, in <module>
    run3d.run(device, train_dataset, valid_dataset, test_dataset, model, loss_func, evaluation,
  File "/home/vbh226/.local/lib/python3.8/site-packages/dig/threedgraph/method/run.py", line 71, in run
    train_mae = self.train(model, optimizer, train_loader, energy_and_force, p, loss_func, device)
  File "/home/vbh226/.local/lib/python3.8/site-packages/dig/threedgraph/method/run.py", line 124, in train
    out = model(batch_data)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/vbh226/.local/lib/python3.8/site-packages/dig/threedgraph/method/spherenet/spherenet.py", line 293, in forward
    dist, angle, torsion, i, j, idx_kj, idx_ji = xyz_to_dat(pos, edge_index, num_nodes, use_torsion=True)
  File "/home/vbh226/.local/lib/python3.8/site-packages/dig/threedgraph/utils/geometric_computing.py", line 54, in xyz_to_dat
    idx_i_t = idx_i.repeat_interleave(num_triplets_t)
RuntimeError: repeats must have the same size as input along dim

I traced the error to lines 52-53:

    repeat = num_triplets - 1
    num_triplets_t = num_triplets.repeat_interleave(repeat)

The package was installed with pip install dive-into-graphs.
When I checked the code on GitHub, I found it was different:

    repeat = num_triplets
    num_triplets_t = num_triplets.repeat_interleave(repeat)[mask]

So I cloned the repository and ran pip install . from it. Now this works, together with the fix in features.py we discussed above.
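
For reference, the RuntimeError in the traceback above is what torch.repeat_interleave raises when the repeats tensor and the input differ in length. A minimal sketch with made-up tensors (not DIG's actual index arrays):

```python
import torch

counts = torch.tensor([2, 0, 3])

# One repeat count per input element: allowed.
ok = counts.repeat_interleave(counts)

# A repeats tensor whose length differs from the input's: rejected.
idx = torch.tensor([10, 11, 12, 13])
mismatch_raised = False
try:
    idx.repeat_interleave(counts)
except RuntimeError:
    mismatch_raised = True

print(ok.tolist(), mismatch_raised)
```

So any step that changes the length of one tensor, such as indexing with a mask, has to be applied consistently to the tensor it is later paired with.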

@limei0307
Collaborator

Hi @vinayak2019,
Ok. I will close this issue.
