Error: CUDA error: no kernel image is available for execution #87

Open
foxtrot99a opened this issue Dec 14, 2023 · 7 comments

Comments

@foxtrot99a

foxtrot99a commented Dec 14, 2023

First attempt running v0.5.0 dev 005. No picture preview and renders without picture too. (error below)

flameTimewarpML: Error: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Setup: Flame 2024 on rocky8.5, A6000 GPU (driver 525.89.02, cuda 12.0)

Have updated PyTorch with no joy.

Shell log attached
Uploading flame2024_flame07_shell.log…

@rhodiel

rhodiel commented Dec 15, 2023

Same as above

Setup: Flame 2024 on rocky8.7, A5000 GPU (driver 525.116.04, cuda 12.0)

flame_log.txt

@talosh
Owner

talosh commented Dec 15, 2023

Hi guys, this most likely happens because I'm trying to keep Flame 2023 supported, and the bundled PyTorch 1.12.1 is too old to work well with your much more recent driver / GPU setup. I might have to update the PyTorch bundled for Python 3.10 (Flame 2024) to a more recent version.

In the meantime, could you please try installing it manually and see if it works?
To avoid interfering with anything else, the best approach is to create a venv, install PyTorch there, and copy it over to flameTimewarpML.
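For anyone who wants to confirm the diagnosis before reinstalling: "no kernel image" usually means the installed PyTorch binary was not built with code for the GPU's compute capability (the RTX A6000 and A5000 are both 8.6). The sketch below compares the two using `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`. The helper names `parse_arch` and `build_covers_device` are made up for illustration, and the matching rule is a simplified reading of CUDA binary/PTX compatibility, not an exact reimplementation of what the driver does.

```python
import re


def parse_arch(tag):
    """Split a tag such as 'sm_86' or 'compute_70' into (kind, major, minor)."""
    match = re.fullmatch(r"(sm|compute)_(\d{2,})[a-z]?", tag)
    if match is None:
        return None
    digits = match.group(2)
    # The last digit is the minor version, everything before it is the major.
    return match.group(1), int(digits[:-1]), int(digits[-1])


def build_covers_device(capability, arch_list):
    """Return True if any compiled arch looks usable on the given device.

    Simplified assumption: an sm_XY binary runs on a device with the same
    major version and an equal or higher minor version; compute_XY (PTX)
    can be JIT-compiled for any equal or newer device.
    """
    major, minor = capability
    for tag in arch_list:
        parsed = parse_arch(tag)
        if parsed is None:
            continue
        kind, a_major, a_minor = parsed
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not importable from this interpreter")
    else:
        if torch.cuda.is_available():
            cap = torch.cuda.get_device_capability(0)
            archs = torch.cuda.get_arch_list()
            print("device capability:", cap)
            print("compiled archs:   ", archs)
            print("covered:          ", build_covers_device(cap, archs))
        else:
            print("torch imported, but CUDA is not available")
```

Run it with the Python interpreter that actually loads the bundled torch; if "covered" comes back False, a newer PyTorch build is the likely fix.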

  1. Navigate somewhere you have full write access (your home folder, for example).
  2. /opt/Autodesk/python/2023.3/bin/python -m venv torchenv

Please replace the Flame version with yours. This creates a folder named torchenv containing a Python virtual environment based on your Flame Python install.

  3. ./torchenv/bin/python -m pip install numpy

This installs the numpy package into torchenv. You need an open internet connection for this.

  4. ./torchenv/bin/python -m pip install torch==1.13.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117

This installs torch 1.13.1 for CUDA 11.7. You can also try the most recent PyTorch with just ./torchenv/bin/python -m pip install torch, but I think 1.13.1 is the safer choice.

  5. Remove everything from <your_flame_python_dir>/flameTimewarpML/packages/.lib/python3.10/site-packages/
  6. Copy the contents of torchenv/lib/python3.9/site-packages/ to <your_flame_python_dir>/flameTimewarpML/flameTimewarpML/packages/.lib/python3.10/site-packages/

This should upgrade the PyTorch bundled with flameTimewarpML to 1.13.1.
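The final "remove everything, then copy" part of the instructions can also be done from Python instead of by hand. This is only a sketch: `replace_site_packages` is a hypothetical helper, not part of flameTimewarpML, and it deletes the destination contents without a backup, so check both paths first.

```python
import shutil
from pathlib import Path


def replace_site_packages(src, dst):
    """Empty dst, then copy everything from src into it.

    src: site-packages folder of the freshly created venv
    dst: site-packages folder bundled with flameTimewarpML
    Returns the sorted top-level names now present in dst.
    """
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise FileNotFoundError(f"source folder not found: {src}")
    dst.mkdir(parents=True, exist_ok=True)

    # Remove the old bundled packages (no backup is kept).
    for entry in dst.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()

    # Copy the venv packages in; dirs_exist_ok needs Python 3.8 or newer.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(p.name for p in dst.iterdir())
```

Call it as `replace_site_packages(venv_site_packages, bundled_site_packages)`, passing the two site-packages folders from the instructions above with your own Flame Python directory and Python version filled in.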

Let me know if this works

@rhodiel

rhodiel commented Dec 16, 2023 via email

@foxtrot99a
Author

Thanks Talosh. Works perfectly with your instructions. The only thing of note is that python3.9 is still the default on a vanilla flame2024 rocky8.5 build.

@rhodiel

rhodiel commented Dec 19, 2023

Has anyone experienced or seen this?

IMG_5824

Rocky8.7
2024.1
RTX A5000

Package           Version
----------------- ------------
numpy             1.24.1
pip               23.0.1
setuptools        65.5.0
torch             1.13.1+cu117
typing_extensions 4.9.0

[notice] A new release of pip is available: 23.0.1 -> 23.3.2
[notice] To update, run: /opt/Autodesk/python/2024.1/bin/torchenv/bin/python -m pip install --upgrade pip

>>> import torch
>>> import numpy
>>> torch.__version__
'1.13.1+cu117'
>>> numpy.__version__
'1.24.1'

NVIDIA-SMI 525.116.04 Driver Version: 525.116.04 CUDA Version: 12.0
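Matching version strings only show that the packages import; they do not show that a kernel can actually run on the card. A minimal smoke test, assuming it is run with the same interpreter and site-packages that flameTimewarpML uses, might look like the following. `cuda_smoke_test` is a made-up helper name, and `CUDA_LAUNCH_BLOCKING=1` is set as the original error message suggests (it only takes effect if CUDA has not been initialised in the process yet).

```python
import os


def cuda_smoke_test():
    """Try one tiny computation on the GPU and report (ok, message)."""
    # Make kernel errors surface at the failing call instead of later.
    os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")
    try:
        import torch
    except ImportError as exc:
        return False, f"torch not importable: {exc}"

    if not torch.cuda.is_available():
        return False, "torch imported, but no usable CUDA device was found"

    try:
        x = torch.ones(4, device="cuda")
        total = float((x * 2).sum().item())  # forces a real kernel launch
    except RuntimeError as exc:
        # "no kernel image is available" would be reported here.
        return False, f"kernel launch failed: {exc}"

    name = torch.cuda.get_device_name(0)
    return total == 8.0, f"torch {torch.__version__} on {name}"


if __name__ == "__main__":
    ok, message = cuda_smoke_test()
    print("OK" if ok else "FAILED", "-", message)
```

If this passes but the render still looks wrong, the problem is probably not the "no kernel image" error from the start of this issue.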

@talosh
Owner

talosh commented Jan 2, 2024

Thank you for pointing that out. I think it was changed to 3.10 at some point in the 2024 versions, but I need to double-check.

@amilkis

amilkis commented Jan 4, 2024

Screenshot from 2024-01-04 13-06-46
I'm running 2024.2.1 on an A6000 ADA. Just used your fix above to get it working and I'm getting artifacts like this on every other frame.

Just wondering...is it possible to get 0.4.3 to run on this graphics card using the fix above?
