Unparsed options to tuner are ignored, and better handling of platform/device options #528

Open
baryluk opened this issue Feb 10, 2024 · 1 comment

Comments

@baryluk
Contributor

baryluk commented Feb 10, 2024

For example:

--platform=1

is ignored: it is neither used nor reported. Either it should be supported or, if that is not possible, the program should exit rather than continue when it sees unhandled options.

CLBlast-1.6.2-linux-x86_64$ LD_LIBRARY_PATH=./lib ./bin/clblast_tuner_transpose_fast --platform=1
* Options given/available:
    -platform 0 [=default]
    -device 0 [=default]
    -precision 32 (single) [=default]
    -m 1024 [=default]
    -n 1024 [=default]
    -alpha 2.00 [=default]
    -fraction 1.00 [=default]
    -runs 10 [=default]
    -max_l2_norm 0.00 [=default]
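
A minimal sketch of the stricter behaviour requested here, not the tuner's actual parser: `ParseStrict` is a hypothetical helper that accepts both the `-name value` form shown in the output above and the `--name=value` form from the failing command, and throws on anything it does not recognise instead of silently falling back to defaults. The set of known option names is assumed to be supplied by the caller.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of CLBlast): parses command-line tokens and
// rejects every token that is not a known option, rather than ignoring it.
std::map<std::string, std::string> ParseStrict(
    const std::vector<std::string>& args,
    const std::set<std::string>& known) {
  std::map<std::string, std::string> values;
  for (std::size_t i = 0; i < args.size(); ++i) {
    const std::string& token = args[i];
    std::string name;
    std::string value;
    if (token.rfind("--", 0) == 0) {
      // GNU-style form: --name=value
      const std::size_t eq = token.find('=');
      if (eq == std::string::npos) {
        throw std::invalid_argument("expected --name=value: " + token);
      }
      name = token.substr(2, eq - 2);
      value = token.substr(eq + 1);
    } else if (token.size() > 1 && token[0] == '-') {
      // Single-dash form: -name value (the value is the next token)
      if (i + 1 >= args.size()) {
        throw std::invalid_argument("missing value for option: " + token);
      }
      name = token.substr(1);
      value = args[++i];
    } else {
      throw std::invalid_argument("unexpected argument: " + token);
    }
    if (known.count(name) == 0) {
      throw std::invalid_argument("unknown option: " + token);
    }
    values[name] = value;
  }
  return values;
}
```

A caller would catch `std::invalid_argument` in `main`, print the message, and exit with a non-zero status, so that a mistyped option stops the run before any tuning starts.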

Additionally, it would be nice if the platform and device actually used were printed, similar to clinfo -l:

$ clinfo -l
Platform #0: Portable Computing Language
 `-- Device #0: cpu-haswell-AMD Ryzen Threadripper 2950X 16-Core Processor
Platform #1: AMD Accelerated Parallel Processing
 `-- Device #0: gfx1030

The device type property (CPU, GPU) would also be good to have.
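
As a rough illustration of the confirmation line, here is a sketch that formats the selected platform and device in the `clinfo -l` style, with the device type appended. `PlatformInfo`, `DeviceInfo` and `DescribeSelection` are hypothetical names; in a real program the strings would come from OpenCL queries such as `clGetPlatformInfo` with `CL_PLATFORM_NAME` and `clGetDeviceInfo` with `CL_DEVICE_NAME` and `CL_DEVICE_TYPE`. How the tuner wraps those calls internally is not shown here, so the sketch works on plain data.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical plain-data view of the results of the OpenCL queries.
struct DeviceInfo {
  std::string name;
  std::string type;  // e.g. "CPU" or "GPU"
};

struct PlatformInfo {
  std::string name;
  std::vector<DeviceInfo> devices;
};

// Describes the selected platform/device pair in the style of `clinfo -l`.
// Throws std::out_of_range (from vector::at) if either index is invalid.
std::string DescribeSelection(const std::vector<PlatformInfo>& platforms,
                              std::size_t platform_id,
                              std::size_t device_id) {
  const PlatformInfo& platform = platforms.at(platform_id);
  const DeviceInfo& device = platform.devices.at(device_id);
  std::ostringstream out;
  out << "Platform #" << platform_id << ": " << platform.name << "\n"
      << " `-- Device #" << device_id << ": " << device.name
      << " (" << device.type << ")\n";
  return out.str();
}
```

Because the indices go through `at()`, a platform or device number that does not exist raises an error instead of quietly selecting something else.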

It is easy to mess things up (especially if, for any reason, the platform and device order is random), so a confirmation of the selection would help. An option to filter by device type (disabled by default) would also be good.
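
One possible shape for the device-type filter, sketched under the assumption that the available devices have already been enumerated into a list: `Candidate`, `TypeFilter` and `FilterByType` are hypothetical names, and the default `TypeFilter::kAny` keeps the filter disabled unless the user asks for it.

```cpp
#include <cstddef>
#include <vector>

enum class DeviceType { kCPU, kGPU };

// kAny means "no filtering", which is the default.
enum class TypeFilter { kAny, kCPU, kGPU };

// Hypothetical record for one enumerated device.
struct Candidate {
  std::size_t platform_id;
  std::size_t device_id;
  DeviceType type;
};

// Returns true if a device of the given type passes the filter.
bool Matches(DeviceType type, TypeFilter filter) {
  switch (filter) {
    case TypeFilter::kCPU: return type == DeviceType::kCPU;
    case TypeFilter::kGPU: return type == DeviceType::kGPU;
    case TypeFilter::kAny: return true;
  }
  return true;
}

// Keeps only the candidates that pass the filter, preserving their order.
std::vector<Candidate> FilterByType(const std::vector<Candidate>& all,
                                    TypeFilter filter = TypeFilter::kAny) {
  std::vector<Candidate> kept;
  for (const Candidate& candidate : all) {
    if (Matches(candidate.type, filter)) {
      kept.push_back(candidate);
    }
  }
  return kept;
}
```

Filtering on the type rather than on the index means the intended device is still found when the platform order changes between runs.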

@CNugteren
Owner

Makes sense. I'm happy to accept a PR that improves the command-line argument parsing of the tuner, but I don't have time to work on this myself right now.
