
Causal model running on GPU #7

Closed
Warvito opened this issue Oct 26, 2020 · 7 comments

@Warvito

Warvito commented Oct 26, 2020

Hi, I am trying to run the LM model with causal = True on the GPU, but I am running into an error.

I am trying to run the following example:

import torch
from torch import nn
from performer_pytorch import PerformerLM

model = PerformerLM(
    num_tokens = 20000,
    max_seq_len = 2048,             # max sequence length
    dim = 512,                      # dimension
    depth = 6,                      # layers
    heads = 8,                      # heads
    causal = True,                  # auto-regressive or not
    nb_features = 256,              # number of random features, if not set, will default to (d * log(d)), where d is the dimension of each head
    generalized_attention = False,  # defaults to softmax approximation, but can be set to True for generalized attention
    kernel_fn = nn.ReLU(),          # the kernel function to be used, if generalized attention is turned on, defaults to Relu
    reversible = True,              # reversible layers, from Reformer paper
    ff_chunks = 10,                 # chunk feedforward layer, from Reformer paper
).cuda()

x = torch.randint(0, 20000, (1, 2048)).cuda()
model(x) # (1, 2048, 20000)

And I am getting this error:

Traceback (most recent call last):
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3343, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-a530c03a976e>", line 20, in <module>
    model(x) # (1, 2048, 20000)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 253, in forward
    x = self.performer(x, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 238, in forward
    return self.net(x, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/reversible.py", line 160, in forward
    out =  _ReversibleFunction.apply(x, blocks, args)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/reversible.py", line 113, in forward
    x = block(x, **kwarg)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/reversible.py", line 65, in forward
    y1 = x1 + self.f(x2, record_rng=self.training, **f_args)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/reversible.py", line 40, in forward
    return self.net(*args, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 170, in forward
    return self.fn(self.norm(x), **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 216, in forward
    out = self.fast_attention(q, k, v)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 159, in forward
    out = attn_fn(q, k, v)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 110, in causal_linear_attention
    return CausalDotProduct.apply(q, k, v)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/fast_transformers/causal_product/__init__.py", line 48, in forward
    product
TypeError: 'NoneType' object is not callable

My system has:
TITAN RTX
CUDA Version: 10.2
Driver Version: 440.100

@lucidrains
Owner

lucidrains commented Oct 26, 2020

@Warvito ahh, so not often spoken about is the fact that the auto-regressive flavor of linear attention actually incurs a pretty big memory cost (× sequence length) and requires special CUDA code to be performant (it is probably why Google chose to do this in Jax)

EPFL wrote up a nice implementation, but I think it is somehow failing to be imported on your machine: https://github.com/idiap/fast-transformers/blob/master/fast_transformers/causal_product/__init__.py#L12
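For reference, the causal variant boils down to a running prefix sum over the outer products k_j v_jᵀ, which that CUDA kernel fuses into one pass. Here is a minimal single-head sketch in plain PyTorch (illustrative names; this is a slow reference loop, not the library's implementation, and it assumes positive feature maps so the denominator never vanishes):

```python
import torch

def causal_linear_attention_naive(q, k, v):
    """Sequential reference for causal linear attention (single head).
    q, k: (n, d) positive feature maps; v: (n, e).
    out_i = (q_i @ S_i) / (q_i @ z_i), where S_i = sum_{j<=i} k_j v_j^T
    and z_i = sum_{j<=i} k_j. The O(n) sequential loop over positions is
    what requires a custom CUDA kernel to be fast."""
    n, d = q.shape
    S = q.new_zeros(d, v.shape[1])  # running sum of outer products
    z = q.new_zeros(d)              # running sum of keys (normalizer)
    out = []
    for i in range(n):
        S = S + torch.outer(k[i], v[i])
        z = z + k[i]
        out.append((q[i] @ S) / (q[i] @ z))
    return torch.stack(out)
```

At position 0 the output reduces to v[0] exactly (numerator and denominator share the factor q_0 · k_0), which makes the recurrence easy to sanity-check.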

@lucidrains
Owner

@Warvito could you try opening a Python interactive session and running

> import fast_transformers.causal_product.causal_product_cuda

and see what happens?
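A slightly more defensive version of that check (a sketch; the module path comes from fast-transformers, the function name is illustrative):

```python
import importlib

def cuda_extension_importable(module_name):
    """Return True if the compiled extension module can be imported,
    False if it is missing, i.e. it was never built at install time."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False

# False here would be consistent with the "'NoneType' object is not
# callable" failure above: the guarded import in fast-transformers
# appears to leave the kernel as None when the extension is missing.
print(cuda_extension_importable(
    "fast_transformers.causal_product.causal_product_cuda"))
```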

@Warvito
Author

Warvito commented Oct 26, 2020

@lucidrains Thank you for the quick reply.

I tried the command you suggested and got the following error:

Traceback (most recent call last):
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3343, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-6-87c39b6d500c>", line 1, in <module>
    import fast_transformers.causal_product.causal_product_cuda
  File "/home/walter/pycharm-2020.1.1/plugins/python/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
ModuleNotFoundError: No module named 'fast_transformers.causal_product.causal_product_cuda'

I have version 0.3.0 installed here, and it works as expected when using causal=False.
I tried to uninstall pytorch-fast-transformers and install it again, but it did not work.

I also had the chance to try it on a system with a V100 and CUDA 11, and it worked as expected.
I also tried it on Google Colab with a Tesla T4 and CUDA 10.1, and it worked as expected. Maybe it is something related to the RTX architecture? In any case, it might be an issue in pytorch-fast-transformers.

Thank you again for the quick reply, and thank you very much for all your repositories. ^^

@lucidrains
Owner

@Warvito I'm in the dark as much as you are :( I have been putting off custom CUDA code for as long as I could, but the results of this paper were irresistible

@arti32lehtonen

I had the same issue. I am not sure exactly what worked for me, but after the following steps, training with causal=True is working.

My steps:

  1. Add CUDA to the PATH variable:
     export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
  2. Set LD_LIBRARY_PATH:
     export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
  3. Create a new environment and install fast-transformers as described in idiap/fast-transformers#23 (comment)
  4. Install performer-pytorch after that
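Put together, the steps above look roughly like this shell session (paths assume CUDA 10.1 under /usr/local, as in the export lines above; adjust to your install):

```shell
# 1. Put nvcc on the PATH
export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}

# 2. Make the CUDA runtime libraries visible to the linker
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

# Sanity check: this should now print the CUDA release
nvcc --version

# 3. In a fresh environment, reinstall fast-transformers so the
#    causal_product_cuda extension gets compiled against this toolchain
pip install --no-cache-dir pytorch-fast-transformers

# 4. Then install performer-pytorch
pip install performer-pytorch
```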

@yygle

yygle commented Nov 6, 2020

@arti32lehtonen is right, make sure the C++ toolchain (gcc) and the CUDA toolchain (nvcc) are available in your environment. If not, use the export commands to make them visible (try "nvcc --version" after that), then reinstall the package.

@Warvito
Author

Warvito commented Nov 12, 2020

Thx @arti32lehtonen and @yygle !
I tried your suggestions and it worked!

@Warvito Warvito closed this as completed Nov 12, 2020