
Allow CeedScalar to be single precision #788

Merged Sep 2, 2021 (65 commits)

Conversation

@nbeams (Contributor) commented Jul 7, 2021

This PR would expand the use of CeedScalar to be double or single precision. See also discussion in #784.

Summary of main changes thus far:

  • Introduction of two different headers, ceed-f32.h and ceed-f64.h. To change between the types, you change which file is included in ceed.h. These files contain different values for CEED_EPSILON and CEED_ALIGN.
  • Addition of CeedScalarType enum and CEED_SCALAR_TYPE definition. In libCEED C/C++ code, CEED_SCALAR_TYPE can be used to check for type-dependent code (e.g. whether to use the AVX basis functions, several places in the MAGMA backend, CUDA/HIP vector norms, etc.).
  • Updating Python bindings: Most notably, places that used to have dtype="float64" now have dtype=scalar_types[lib.CEED_SCALAR_TYPE], where scalar_types is a Python dictionary with type strings.
  • Moved CeedQFunctionContextGetContextSize to be a public/user function -- this was done because of the Python bindings. It was used in get_data but was not available through libceed.lib (I'm not sure why this wasn't causing errors before; it did when I ran the tests directly in ipython), and not having the correct size (in bytes) was causing failures in one of the Python QFunction tests when using single precision. (This was fixed by 9c4a53c.)

Outstanding issues/todos:

  • Test Python bindings with CUDA -- I made some adjustments based on the changes required to fix the QFunctionContext-related failures mentioned above, but haven't tested them yet.
  • Possibly add a macro to avoid compiling MAGMA DGEMM non-tensor basis code when CeedScalar is float (currently gives compiler warnings, but code should never be executed), or add full SGEMM support.
  • Fix/decide what to do about the following tests: t312, t314, t315, t316, t321, t322, t323 in the basis tests, and t501, t502, t503, t505, t506, t507, t511, t521, t524, t530, t550, t551, t552, t553 in the operator tests. These tests have a tolerance such that adjusting them directly for float as was done for other tests in 2c97b84 would result in a very large value for the tolerance (probably more than we want).
  • Fix/decide what to do about the tests that compare directly to output files: t300, t301, t302, t304, t305, t320; and t402. (Also affects the Python versions of these tests)
  • Update other examples as needed to work with float (the examples in /ceed have been updated already)
  • Fix Fortran tests for float
  • Update Rust bindings
  • Update Julia bindings

Closes #784.

@pazner (Member) commented Jul 8, 2021

I started working on modifying the Julia interface to allow single precision. The work is in this branch.

For the most part I think it should be relatively straightforward.

It would be useful if there were a runtime function that returned, e.g., 32 or 64 depending on the value of CEED_SCALAR_TYPE. That would allow the Julia interface to detect whether it was misconfigured. Since the user can supply their own version of the library, we cannot know a priori whether CeedScalar is 32 or 64 bits.

@nbeams (Contributor, author) commented Jul 8, 2021

Thanks @pazner!

I'm not sure I understand exactly how the Julia interface works. You mean the user could supply a compiled version of libCEED from a directory other than the one they're using the Julia package from?

There was some discussion on #784 about leaving things open to potentially more "exotic" types in the future, which may not be entirely defined by size. Would this be possible with the Julia interface? I think the idea was that it would be better to know the actual type rather than just its size.

@pazner (Member) commented Jul 8, 2021

> I'm not sure I understand exactly how the Julia interface works. You mean the user could supply a compiled version of libCEED from a directory other than the one they're using the Julia package from?

Yeah, the idea would be that the Julia package comes pre-installed with a "basic version" of libCEED that is precompiled. This basic version does not include CUDA or HIP support, is compiled with the "generic" compiler flags, would use double precision for CeedScalar, etc.

If the user wants to use more advanced features (GPU support, architecture-specific compiler flags, single precision, etc.), they would have to compile their own version of libCEED from source, and tell LibCEED.jl to use that library instead of its preinstalled basic version.

> There was some discussion on #784 about leaving things open to potentially more "exotic" types in the future, which may not be entirely defined by size. Would this be possible with the Julia interface? I think the idea was that it would be better to know the actual type rather than just its size.

That's a good point. Depending on the type, that should be fine. Any interface would be fine as far as I'm concerned; it could even just return a CeedScalarType enum value.

I just think it would be good to expose this information as a runtime function. If the user provides their own version of libCEED that uses e.g. float instead of double, then the Julia interface needs to be aware of this.

@jeremylt (Member) commented

I added a draft of the code to add to the GitLab CI YAML above. I think running the core tests in f32 with FC= as a separate CI job on Noether should be fine.

@nbeams (Contributor, author) commented Aug 31, 2021

But what about the separate compilation and change to ceed.h? Can we ensure this job will not run in parallel with any of the other CI jobs on Noether?

@jedbrown (Member) commented

Each CI job has its own working directory, and I think we only run one CI job at a time (we could increase the number of slots, but that could be an issue for GPUs).

@nbeams (Contributor, author) commented Aug 31, 2021

Is the working directory just for building/running the tests, or does it temporarily copy the source as well (since we'd need to edit the header in the source)? Though I guess it wouldn't matter as long as the jobs run in serial.

@jedbrown (Member) commented

We'd just use sed to edit the header. gitlab-runner is responsible for cleaning the repo before the next job runs.

@nbeams (Contributor, author) commented Sep 1, 2021

I tried adding a float job based on @jeremylt's suggestions above, but I see it's stuck: "This job is stuck because you don't have any active runners online or available with any of these tags assigned to them: float." When I try to click the link it provides about CI settings, I get a 404 error -- maybe because I don't have admin rights?

@nbeams nbeams marked this pull request as ready for review September 1, 2021 21:52
@jedbrown (Member) commented Sep 1, 2021

Let's just drop the float label for now. I can add the runner slot, but we'll want to avoid overloading the GPU. I probably need to configure a one-node SLURM setup or find a different solution to moderate over-subscription. (CPUs have the lovely property that oversubscription just makes them slow, but GPUs crash.)

@nbeams nbeams force-pushed the natalie/ceedscalar-generalize branch from bfbfd12 to 08af2bf Compare September 1, 2021 22:22
@nbeams (Contributor, author) commented Sep 1, 2021

Since it's "stuck," does it have to be manually stopped or something? I'm not seeing GitLab showing up after my force push.

Never mind, I guess I'm just impatient. :)

@nbeams (Contributor, author) commented Sep 1, 2021

Looks like the float job passed! (Assuming it truly ran in single precision, that is. The sed commands show up on the output, so it should have worked.)

I think I'm going to officially remove the WIP tag from this PR -- in my opinion, we should be ready for review with the intent of squash-merging.

@nbeams nbeams added 1-In Review and removed 0-WIP labels Sep 1, 2021
@pazner (Member) commented Sep 1, 2021

Are there plans to also run the tests for the various language bindings in single precision mode? Or just the C test suite?

@jeremylt (Member) commented Sep 1, 2021

It would be nice to have single precision versions of Python, Rust, and Julia CI. These should be easy to add as separate jobs in those GitHub actions. I could do that Tuesday, possibly in a separate PR if we're eager to merge this one.

@jedbrown (Member) commented Sep 2, 2021

As this is experimental and evolving, I'm fine merging it without every-language CI.

@jedbrown (Member) commented Sep 2, 2021

@nbeams Would you like to selectively squash commits (e.g., you have some WIP commit messages) or squash-merge the whole thing?

@nbeams (Contributor, author) commented Sep 2, 2021

With so many commits, I was worried that getting a nice rebase might be quite the headache, but I'll give it a go and see if I can clean it up a bit.

@jedbrown (Member) commented Sep 2, 2021

Cool. One (somewhat extreme) approach is to squash the entire history into one, then amend (git citool and some others like Emacs Magit can help) to split out some parts that deserve to be separate. I'm also okay with the whole thing being one commit if you think that makes sense.

@nbeams (Contributor, author) commented Sep 2, 2021

Well, I spent almost an hour trying to come up with a nice history through rebase, and I'm not sure the result is much better. (I also appear to have once again messed up the Julia conflict resolution during the rebase -- the final result has some differences from the branch as it is here -- so I definitely won't be pushing that version.)

Personally, I'm okay with squash-merging the whole thing and replacing the default squash-merge commit message with a descriptive message explaining the major changes (addition of the new headers, addition of new enum, changes to the tests to use epsilon, and various changes to the other language bindings to get CeedScalar information). There are a lot of files changed, but most everything fits within those categories, I think? However, I can try again if there is a strong preference for keeping separate commits. I don't think I understand your suggestion @jedbrown, how to pick out individual commits after a squash -- if you want me to try this, do you have a link to an explanation of this somewhere?

There were some minor bugfixes included in this branch (notably changing the location of CeedQFunctionContextGetContextSize). @jeremylt mentioned that I might want to do a PR with just those changes that points to release rather than main -- are we starting this new workflow yet? The problem is that changes that are strictly bugfixes are not nicely separated from other changes in the commits, since I wasn't yet thinking in terms of this framework when "dividing up" commits. So we'd still ultimately have some differences of git history between release and this branch/main even without a squash-merge over here (can't just cherry-pick commits to go to release).

@jedbrown (Member) commented Sep 2, 2021

I'll just go ahead and merge. We can screenshare demo on a smaller PR sometime. Basically, you run git citool, which brings up a GUI and click "Amend", which allows you to deselect files/hunks/lines from the last commit, commit, then add new commits.

@jedbrown jedbrown merged commit 80a9ef0 into main Sep 2, 2021
@jedbrown jedbrown deleted the natalie/ceedscalar-generalize branch September 2, 2021 21:28
@jedbrown (Member) commented Sep 2, 2021

Congrats @nbeams on this big lift. 🚀 And thanks to @pazner @YohannDudouit @jeremylt for the discussion and contributions.

@nbeams (Contributor, author) commented Sep 2, 2021

Yes, a big thank you to everyone who contributed to this PR! A team effort, for sure. 🎉

@@ -28,15 +28,15 @@ static const char *atomicAdd = QUOTE(
 //------------------------------------------------------------------------------
 // Atomic add, for older CUDA
 //------------------------------------------------------------------------------
-__device__ double atomicAdd(double *address, double val) {
+__device__ CeedScalar atomicAdd(CeedScalar *address, CeedScalar val) {
Review comment (Contributor):

I think this one should stay double, or simply be removed for float; atomicAdd for float is always defined.

Reply (Member):

It's in CUDA-8 and later? We don't test ancient versions and I think it doesn't hurt us to just not support versions less than 8.
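
Following the reviewer's point, the shim could be guarded so it is only compiled when it is actually needed. This is a sketch, not the code merged in the PR: the guard macro CEED_SCALAR_IS_FP64 is an assumption, while the CAS loop itself is the standard double-precision atomicAdd workaround from the CUDA programming guide (needed only before sm_60, since atomicAdd for float has always been provided):

```cuda
// Sketch (guard name is an assumption): compile the double-precision
// atomicAdd shim only when CeedScalar is double and the target architecture
// predates native double atomicAdd (sm_60); CUDA always provides atomicAdd
// for float, so no shim is needed in single precision.
#if defined(CEED_SCALAR_IS_FP64) && defined(__CUDA_ARCH__) && __CUDA_ARCH__ < 600
__device__ double atomicAdd(double *address, double val) {
  unsigned long long int *address_as_ull = (unsigned long long int *)address;
  unsigned long long int old = *address_as_ull, assumed;
  do {
    assumed = old;
    // Retry the compare-and-swap until no other thread modified *address
    old = atomicCAS(address_as_ull, assumed,
                    __double_as_longlong(val + __longlong_as_double(assumed)));
  } while (assumed != old);
  return __longlong_as_double(old);
}
#endif
```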

@nbeams nbeams mentioned this pull request Mar 20, 2024

Successfully merging this pull request may close these issues.

Compiling with CeedScalar defined as float
5 participants