Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Making ProjCL build/run on Linux, use GPU #4

Merged
merged 3 commits into from
Sep 14, 2017

Conversation

BobBane
Copy link
Contributor

@BobBane BobBane commented Sep 14, 2017

This worked out of the box on macOS, and still does; now the test program uses your OpenCL GPU by default, and can use your OpenCL CPU with the -CPU command-line argument.

My long-range goal is to get GPU-enabled proj4 into our processing pipeline here at the Direct Readout Lab. Our geotiff software spends about half its time inside proj4 right now, so this should be a serious improvement.

To make it work for our data, I need to:

  • Make it work for double-precision floats
  • Feed it data as separate lat and lon arrays instead of interspersed

As those changes will probably be pretty invasive, I figured the polite thing to do was push the platform changes separately.

- Add CBLAS and LAPACK to Linux cmake configuration
- Add standard math library to Linux configuraiton
- Add API calls and more error checking to pl_context_init
- test/projcl_test defaults to GPU, argument -CPU uses CPU
- Fix minor syntax error in pl_project_oblique_stereographic.opencl kernel
- Add CBLAS and LAPACK to Linux cmake configuration
- Add standard math library to Linux configuraiton
- test/projcl_test defaults to GPU, argument -CPU uses CPU
- Fix minor syntax error in pl_project_oblique_stereographic.opencl kernel
Because I am a git noob and did not add all changes previously, that's why...
@evanmiller
Copy link
Owner

Thanks Bob. The changes look OK from here, so I'll go ahead and merge. This is not a request per se but in the future I would like to see some test coverage (Travis) to ensure things don't break as you propose more invasive changes.

@evanmiller evanmiller merged commit 3a25978 into evanmiller:master Sep 14, 2017
@evanmiller
Copy link
Owner

Btw I'm getting some warnings:

/Users/emiller/Code/ProjCL/src/projcl.c:58:69: warning: incompatible pointer to integer
      conversion initializing 'cl_context_properties' (aka 'long') with an expression of
      type 'cl_platform_id' (aka 'struct _cl_platform_id *') [-Wint-conversion]
  ...context_properties[] = {CL_CONTEXT_PLATFORM, platform_id[PLATFORM_INDEX], NULL};
                                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/emiller/Code/ProjCL/src/projcl.c:58:98: warning: incompatible pointer to integer
      conversion initializing 'cl_context_properties' (aka 'long') with an expression of
      type 'void *' [-Wint-conversion]
  ...context_properties[] = {CL_CONTEXT_PLATFORM, platform_id[PLATFORM_INDEX], NULL};

/Users/emiller/Code/ProjCL/src/projcl.c:61:60: warning: incompatible pointer to integer
      conversion passing 'void *' to parameter of type 'cl_uint' (aka 'unsigned int')
      [-Wint-conversion]
        error = clGetDeviceIDs(platform_id[PLATFORM_INDEX], type, NULL, NULL, &num...

@BobBane
Copy link
Contributor Author

BobBane commented Sep 14, 2017

The projcl_test program generates nearly the same outputs on macOS and Linux. All projections except transverse mercator report 'ok', so that's something:

Testing consistency of Transverse Mercator
-- Spherical, centered... 2 failures
**** Max longitudinal error: 0.000004 at (45.000000, 0.000000)
**** Max latitudinal error: 0.019782 at (45.000000, 0.000000)
-- Spherical, off-center... 50 failures
**** Max longitudinal error: 0.000015 at (44.489330, 6.760151)
**** Max latitudinal error: 16.308531 at (44.255035, 8.154260)
-- Ellipsoidal, centered... 674 failures
**** Max longitudinal error: 0.087635 at (43.515068, 11.464683)
**** Max latitudinal error: 0.025311 at (43.133480, 12.825867)
-- Ellipsoidal, off-center... 650 failures
**** Max longitudinal error: 1.193104 at (-41.943199, -16.302391)
**** Max latitudinal error: 0.494332 at (-42.884071, -13.636587)

The above results are slightly different when using OpenCL CPU and GPU on my MacBook Pro, so some investigation is indicated.

This project will save me so much grief that writing tests is a small price to pay. I will probably be adding an orthographic projection down the road.

@evanmiller
Copy link
Owner

Thanks for the report. I'm mainly interested in getting projcl_test to pass again, and hooking it up to Travis so I can see if pull requests cause breakage on Mac or Linux or both. I'm most interested in ensuring the projections are accurate and covered by tests; testing various run-time or platform features isn't very important at this stage IMHO.

@evanmiller
Copy link
Owner

I've opened separate issues for the various things mentioned in this discussion:

https://github.com/evanmiller/ProjCL/issues

Feel free to add more.

@BobBane
Copy link
Contributor Author

BobBane commented Sep 20, 2017

Boy, go away for a weekend and the Open Source Fairy flies in and fixes all your problems.

I was going to delete the context_properties array which isn't actually used, but you already got it, along with my other oversight.

In my Copious FreeTime this week, I will look into an interface for separate lat/lon arrays. It should bottom out in replacing clCreateBuffer(copy) with two clEnqueueWriteBufferRect()s.

In separate functions, without disturbing your current API.

@evanmiller
Copy link
Owner

Sounds good @BobBane. Been a while since I gave the project any love, so I guess I was overdue :-)

I've made some breaking changes to the project/unproject API to make things simpler, and also added timing results to the test suite, so you can see what the speed-up looks like for each projection.

The tests should all now pass on all devices, but the tests are limited to a relatively easy box of points. The American Polyconic and Transverse Mercator inverse algorithms start having trouble when they stray too far from the latitude of origin, so just be warned. The level of accuracy required by the tests isn't terribly stringent (10m in projected coordinates, 2.5 arc-seconds in geodetic coordinates).

By the way if you're working with TIFF files, you might check out the (recently documented) image sampling API for doing warping on the GPU. Fun stuff.

As for memory layout: Were you thinking of just doing another version of pl_load_projection_data? A no-copy implementation will be difficult without modifying all the kernels too.

@BobBane
Copy link
Contributor Author

BobBane commented Sep 20, 2017

I did notice the timing results. The speedup looks really promising!

A modified pl_load_projection_data looks like the path of least resistance to me. I'm OK with requiring a copy through the new interface to avoid munging the kernels.

I am going to have to change the kernels a bit, though. I really need double precision - the code I want to replace uses it. I was thinking of changing the kernel types like this:

float -> KFLOAT
float8 -> KFLOAT8
float16 -> KFLOAT18

and then adding definitions for those symbols at kernel compile time.

The interesting part will be having the new kernels live alongside the current ones. Need to contemplate the kernel naming conventions, to extend them carefully.

@evanmiller
Copy link
Owner

Let's move the double-precision discussion to a new issue (will CC you shortly).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants