
slam_test error! #82

Open
jcyhcs opened this issue Dec 30, 2021 · 3 comments

Comments

jcyhcs commented Dec 30, 2021

Hi professor,
When I run slam_test, I always get this error:
[screenshot of the error output]
I also followed your tip to run the auto-tuning script and moved auto_tuning_result.txt to the resources directory.
My platform is a Jetson TX2 with JetPack 4.5.1 and 8 GB of RAM.
What is the problem? Please help me!

@puzzlepaint (Collaborator) commented:

One possible reason for such a failure is that the concrete CUDA kernel that returns the "too many resources requested for launch" error was not run during the auto-tuning. This seems likely here since the failing test is a PCG test, and PCG is not used by default. If that is the case, it would be necessary to run auto-tuning while ensuring that the specific kernel is used during the run (e.g., using the --use_pcg parameter) and merge the resulting auto-tuning files.
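For illustration, here is a minimal sketch (not code from badslam) of what the per-kernel limit found during auto-tuning roughly corresponds to: the largest block size a specific kernel can be launched with given its register and shared-memory requirements, which CUDA can also report through the occupancy API. A kernel that is never exercised during tuning never gets such a measured value, so a later launch with an overly large default block size can fail with "too many resources requested for launch". The kernel name `PCGStepKernel` below is a made-up stand-in for whichever PCG kernel fails in the test.

```cuda
// Illustrative sketch only; not badslam's actual auto-tuner code.
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for a resource-heavy PCG solver kernel.
__global__ void PCGStepKernel(float* data, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] *= 0.5f;
}

int main() {
  int min_grid_size = 0;
  int max_block_size = 0;
  // Ask the runtime for the largest block size that still fits the
  // kernel's register / shared-memory requirements on this device.
  cudaOccupancyMaxPotentialBlockSize(&min_grid_size, &max_block_size,
                                     PCGStepKernel, /*dynamicSMemSize=*/0);
  printf("Largest safe block size for PCGStepKernel: %d\n", max_block_size);
  return 0;
}
```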


jcyhcs commented Jan 4, 2022

@puzzlepaint
Hi professor,
I followed your advice and ran the auto-tuning script with --use_pcg added. It ran for several iterations, but on the final iteration it failed with this error:
[screenshot of the error output]
What is the problem? Please help!

@puzzlepaint (Collaborator) commented:

This looks like a bug in bad slam to me, sorry for that. The CUDA auto tuner class supports using only a single block size per kernel per tuning iteration, but somehow it gets two different block sizes (512 and 1024) for one kernel in your case. Unfortunately, the given stack trace does not contain much information about where that happens.

I suspect the issue could occur because the block size gets reduced at one call site (after encountering the "too many resources requested for launch" error) but, for some reason, not at another call site of the same kernel, for example if some other part of the kernel configuration differs there and allows a larger block size. But that is just speculation.
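To make that suspected mechanism concrete, here is a minimal sketch of such a fallback (illustrative only, not the actual auto-tuner code): a launch helper that halves the block size after a "too many resources requested for launch" failure. If one call site of a kernel goes through the fallback (1024 → 512) while another call site of the same kernel still fits at 1024, the tuner would see two different block sizes for that kernel within one tuning iteration.

```cuda
// Illustrative sketch only; not badslam's actual launch/tuning code.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void DummyKernel(float* data, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] += 1.0f;
}

// Launches the kernel, halving the block size on resource failures.
// Returns the block size that worked, or -1 on an unrelated failure.
int LaunchWithFallback(float* data, int n, int block_size) {
  while (block_size >= 32) {
    int grid_size = (n + block_size - 1) / block_size;
    DummyKernel<<<grid_size, block_size>>>(data, n);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) return block_size;        // launch accepted
    if (err != cudaErrorLaunchOutOfResources) break;  // unrelated failure
    block_size /= 2;                                  // retry with half
  }
  return -1;
}

int main() {
  const int n = 1 << 20;
  float* data = nullptr;
  cudaMalloc(&data, n * sizeof(float));
  int used = LaunchWithFallback(data, n, 1024);
  printf("Kernel launched with block size %d\n", used);
  cudaFree(data);
  return 0;
}
```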

Not sure I can offer any help here. Perhaps the issue could be worked around by manually entering a working configuration in the auto-tuning file for the problematic kernels, if needed.
