-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Making ProjCL build/run on Linux, use GPU #4
Conversation
- Add CBLAS and LAPACK to Linux cmake configuration - Add standard math library to Linux configuraiton - Add API calls and more error checking to pl_context_init - test/projcl_test defaults to GPU, argument -CPU uses CPU - Fix minor syntax error in pl_project_oblique_stereographic.opencl kernel
- Add CBLAS and LAPACK to Linux cmake configuration - Add standard math library to Linux configuraiton - test/projcl_test defaults to GPU, argument -CPU uses CPU - Fix minor syntax error in pl_project_oblique_stereographic.opencl kernel
Because I am a git noob and did not add all changes previously, that's why...
Thanks Bob. The changes look OK from here, so I'll go ahead and merge. This is not a request per se but in the future I would like to see some test coverage (Travis) to ensure things don't break as you propose more invasive changes. |
Btw I'm getting some warnings:
|
The projcl_test program generates nearly the same outputs on macOS and Linux. All projections except transverse mercator report 'ok', so that's something: Testing consistency of Transverse Mercator The above results are slightly different when using OpenCL CPU and GPU on my MacBook Pro, so some investigation is indicated. This project will save me so much grief that writing tests is a small price to pay. I will probably be adding an orthographic projection down the road. |
Thanks for the report. I'm mainly interested in getting |
I've opened separate issues for the various things mentioned in this discussion: https://github.com/evanmiller/ProjCL/issues Feel free to add more. |
Boy, go away for a weekend and the Open Source Fairy flies in and fixes all your problems. I was going to delete the context_properties array which isn't actually used, but you already got it, along with my other oversight. In my Copious FreeTime this week, I will look into an interface for separate lat/lon arrays. It should bottom out in replacing clCreateBuffer(copy) with two clEnqueueWriteBufferRect()s. In separate functions, without disturbing your current API. |
Sounds good @BobBane. Been a while since I gave the project any love, so I guess I was overdue :-) I've made some breaking changes to the project/unproject API to make things simpler, and also added timing results to the test suite, so you can see what the speed-up looks like for each projection. The tests should all now pass on all devices, but the tests are limited to a relatively easy box of points. The American Polyconic and Transverse Mercator inverse algorithms start having trouble when they stray too far from the latitude of origin, so just be warned. The level of accuracy required by the tests isn't terribly stringent (10m in projected coordinates, 2.5 arc-seconds in geodetic coordinates). By the way if you're working with TIFF files, you might check out the (recently documented) image sampling API for doing warping on the GPU. Fun stuff. As for memory layout: Were you thinking of just doing another version of |
I did notice the timing results. The speedup looks really promising! A modified pl_load_projection_data looks like the path of least resistance to me. I'm OK with requiring a copy through the new interface to avoid munging the kernels. I am going to have to change the kernels a bit, though. I really need double precision - the code I want to replace uses it. I was thinking of changing the kernel types like this: float -> KFLOAT and then adding definitions for those symbols at kernel compile time. The interesting part will be having the new kernels live alongside the current ones. Need to contemplate the kernel naming conventions, to extend them carefully. |
Let's move the double-precision discussion to a new issue (will CC you shortly). |
This worked out of the box on macOS, and still does; now the test program uses your OpenCL GPU by default, and can use your OpenCL CPU with the -CPU command-line argument.
My long-range goal is to get GPU-enabled proj4 into our processing pipeline here at the Direct Readout Lab. Our geotiff software spends about half its time inside proj4 right now, so this should be a serious improvement.
To make it work for our data, I need to:
As those changes will probably be pretty invasive, I figured the polite thing to do was push the platform changes separately.