
--dump-assembly and --dump-optimized don't do anything #887

Closed
jeremyherbert opened this issue Dec 10, 2014 · 7 comments

jeremyherbert commented Dec 10, 2014

Hi,

I'm running the latest numbapro (numba: 0.15.1, numbapro: 0.16.0) from anaconda with python3.4. If I use either the --dump-assembly or the --dump-optimized flags, I don't actually get any assembly output, and according to my quick strace it is not writing files anywhere. I'm running https://github.com/ContinuumIO/numbapro-examples/blob/master/cuda_memory/pinned.py with

numba --dump-optimized pinned.py

What am I doing wrong?

sklam (Contributor) commented Dec 11, 2014

The flags were not picked up by the CUDA backend in 0.15.1. This is partially fixed in master (see #889: dump-assembly doesn't work).

Also note that when a dump works, it prints to the screen; no files are written.

sklam (Contributor) commented Dec 18, 2014

Status on master:
--dump-optimized works (but note that Numba does not optimize the CUDA backend code itself; that optimization happens inside libnvvm)
--dump-assembly prints the final LLVM IR rather than the PTX. I will fix this in a PR I am working on.

stuartarchibald (Contributor) commented Apr 4, 2018

As of 0.37, using the noted sample, --dump-optimized and --dump-assembly both work. Closing. If this is still a problem, please reopen.

AndiH commented Jul 13, 2018

Using 0.39.0, there are still issues with dumping LLVM information: not all environment variables seem to work, namely NUMBA_DUMP_ANNOTATION, NUMBA_DUMP_ASSEMBLY, NUMBA_DUMP_LLVM, and NUMBA_DUMP_OPTIMIZED.

Using the same pinned.py example file from above:

$ for e in BYTECODE CFG IR ANNOTATION ASSEMBLY LLVM FUNC_OPT OPTIMIZED; do echo $e; eval "NUMBA_DUMP_$e"=1 numba pinned.py | grep -v regular | grep -v pinned | wc -l; done
BYTECODE
24
CFG
22
IR
182
ANNOTATION
0
ASSEMBLY
0
LLVM
0
FUNC_OPT
376
OPTIMIZED
0

Browsing the source code, I see that these exact environment variables are used in numba_entry.py:L255-257, where they are set to their command-line alternatives. To me, this sounds like a bug. IMHO the environment variables should only be set to their command-line counterparts if the counterparts themselves are defined, and even then this behaviour should be documented somewhere.
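The precedence fix suggested above could look roughly like this (a minimal sketch; `apply_dump_flags` and its argument shape are hypothetical stand-ins, not Numba's actual internals):

```python
import os

# Hypothetical sketch of the suggested fix: only export a NUMBA_DUMP_*
# variable when the matching command-line flag was actually given,
# instead of unconditionally overwriting whatever the user exported.
def apply_dump_flags(cli_flags):
    for name, enabled in cli_flags.items():
        env_name = "NUMBA_DUMP_" + name.upper()
        if enabled:
            # flag was passed on the command line: export it
            os.environ[env_name] = "1"
        # otherwise leave any user-set NUMBA_DUMP_* value untouched

os.environ["NUMBA_DUMP_LLVM"] = "1"       # user exported this themselves
apply_dump_flags({"llvm": False, "assembly": True})
print(os.environ["NUMBA_DUMP_LLVM"])      # user's own setting survives: 1
print(os.environ["NUMBA_DUMP_ASSEMBLY"])  # CLI flag was given, so: 1
```

With this precedence, a user-exported NUMBA_DUMP_LLVM=1 is no longer clobbered just because --dump-llvm was absent from the command line.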

stuartarchibald (Contributor) commented Jul 13, 2018

@AndiH thanks. Please open a new ticket to track the specific DUMP_ options that are not working. I believe they are fine for CPU targets, so it's probably something in the CUDA backend that either needs to raise an error declaring these options unsupported, or needs to be made to actually work.
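The "raise as unsupported" option mentioned above could be sketched roughly as follows (a hypothetical illustration; `check_cuda_dump_flags` and the set of unsupported names are assumptions drawn from this thread, not Numba's actual code):

```python
# Hypothetical sketch: the CUDA backend could reject dump options it
# cannot honor instead of silently ignoring them.
UNSUPPORTED_CUDA_DUMPS = {
    "DUMP_ANNOTATION", "DUMP_ASSEMBLY", "DUMP_LLVM", "DUMP_OPTIMIZED",
}

def check_cuda_dump_flags(enabled_flags):
    unsupported = sorted(set(enabled_flags) & UNSUPPORTED_CUDA_DUMPS)
    if unsupported:
        raise NotImplementedError(
            "dump options not supported on the CUDA target: "
            + ", ".join(unsupported))

check_cuda_dump_flags(["DUMP_BYTECODE"])  # supported: passes silently
try:
    check_cuda_dump_flags(["DUMP_LLVM", "DUMP_ASSEMBLY"])
except NotImplementedError as e:
    print(e)  # names the unsupported options
```

Failing loudly like this would at least tell users why a requested dump produced no output, until the flags are actually wired up for CUDA.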

AndiH commented Jul 13, 2018

Done! See #3105!

stuartarchibald (Contributor) commented Jul 13, 2018

Great, thanks @AndiH
