Bitfields #1616

stevelinton · 2017-08-22T22:57:57Z

This is work in progress on an implementation of fast access to bitfields within small integers.
It seems to work, but is not thoroughly tested. As currently committed it needs a CPU with BEXTR (Haswell or later). I'll set up autoconf for that, of course.

Performance seems pretty decent. My laptop does a loop with 200 million gets in about 1.8s and 200 million sets in about 2.5s.

Relates to issue #1611

codecov · 2017-08-23T09:49:39Z

Codecov Report

Merging #1616 into master will increase coverage by <.01%.
The diff coverage is 89.43%.

@@            Coverage Diff             @@
##           master    #1616      +/-   ##
==========================================
+ Coverage   64.44%   64.44%   +<.01%     
==========================================
  Files        1002     1002              
  Lines      326730   326852     +122     
  Branches    13202    13216      +14     
==========================================
+ Hits       210550   210656     +106     
- Misses     113321   113331      +10     
- Partials     2859     2865       +6

Impacted Files	Coverage Δ
src/intfuncs.c	`90.58% <89.43%> (-0.6%)`	⬇️
src/hpc/threadapi.c	`34.08% <0%> (-0.47%)`	⬇️
src/funcs.c	`70.7% <0%> (-0.29%)`	⬇️
src/hpc/traverse.c	`78.21% <0%> (-0.09%)`	⬇️
src/gap.c	`57.07% <0%> (+0.08%)`	⬆️
src/stats.c	`73.2% <0%> (+0.13%)`	⬆️
src/hpc/thread.c	`46.64% <0%> (+0.19%)`	⬆️
src/objset.c	`82.59% <0%> (+0.23%)`	⬆️

stevelinton · 2017-08-23T13:13:47Z

Now essentially complete and ready for review.

Stefan-Kohl · 2017-08-23T22:21:12Z

How is it possible that merging this reduces global code coverage by about 14 percent? -- I would guess that adding such little feature shouldn't change code coverage in any significant way(?)

olexandr-konovalov · 2017-08-23T22:28:00Z

@Stefan-Kohl https://codecov.io/gh/gap-system/gap/pull/1616 reports coverage at 64.01% - I think the comment above stays from some earlier test which failed (so less code covered) and then codecov failed to update that comment by some reason.

fingolfin · 2017-08-23T22:29:19Z

The coverage data above is indeed meaningless, as some of the tests failed. Ultimately, this is still a bug in codecov (it should report something like "unable to generate coverage percentage" or so). But one we can easily recognize and live with.

olexandr-konovalov · 2017-08-23T22:30:39Z

Yes, maskset.tst fails with

testing: /home/travis/build/gap-system/gap/tst/testinstall/maskset.tst
# line 7 of 41 (17%)gap: ./src/calls.h:233: SET_BODY_FUNC: Assertion `TNUM_OBJ(body) == T_BODY' failed.

fingolfin · 2017-08-23T22:30:08Z

lib/masksets.gd

+##  This function sets up the machinery for a set of bitfields of the given
+##  widths. The total of the widths must not exceed 60 bits on 64-bit architecture
+##  or 28 bits on a 32-bit architecture and the number of widths must not exceed
+##  10 on a 64-bit aerchitecture or 6 on a 32-bit architecture. Also for performance 


Typo "aerchitecture"

fingolfin · 2017-08-23T22:30:23Z

lib/masksets.gd

+##  or 28 bits on a 32-bit architecture and the number of widths must not exceed
+##  10 on a 64-bit aerchitecture or 6 on a 32-bit architecture. Also for performance 
+##  reasons some checks that one might wish to do are ommitted. In particular, 
+## the builder and setter functions do not check if the value[s] passed to them are 


Uneven offset of the comment text

fingolfin · 2017-08-23T22:30:57Z

lib/masksets.gd

+## the builder and setter functions do not check if the value[s] passed to them are 
+## negative or too large.
+##
+##  You can tell which architecture you are running on by acccessing 


"which architecture" -> "which type architecture" (or "kind" of whatever)

fingolfin · 2017-08-23T22:34:26Z

lib/masksets.gd

+##
+##  <Description>
+##
+##  This function sets up the machinery for a set of bitfields of the given


So why not call this a "bitfield" then? The name is well-known, at least for people who know C/C++. Whereas "MaskSet" is not (and I don't think I would have guessed what it means here).

fingolfin · 2017-08-23T22:37:06Z

lib/masksets.gd

+##                 one of the fields from an immediate integer</Item>
+##  <Mark><C>setters</C></Mark> <Item> a list of <M>n</M> functions of two arguments each of which returns the packed value
+##                 in which one of the fields has been replaced by the given value
+##                 note that this does NOT modify the immediate integer.</Item>


It is not immediately clear what "the" immediate integer here is. Perhaps say: "... a list of n functions, each taking two arguments, and ; the return value of the i-th setter is a new packed value, with the value of the i-th field set to ."

All in all, I think it would be a good idea to pick a specific set of terms for the various "parts" involved here, and stick to them. E.g. referring to the same thing sometimes as "immediate integer" and sometimes as "packed value" is confusing. I'd explain the connection once, then stick to one term.

fingolfin · 2017-08-23T22:40:39Z

src/intfuncs.c

+static Obj doMaskGetter(Obj self, Obj data) {
+  UInt mask = NLOC_FUNC(self);
+   UInt start = get_start(mask);
+       UInt len = get_len(mask); 


Funky formatting here.

Perhaps you might be willing to consider using clang-format?

fingolfin · 2017-08-23T22:42:14Z

lib/masksets.gd

+##  10 on a 64-bit aerchitecture or 6 on a 32-bit architecture. Also for performance 
+##  reasons some checks that one might wish to do are ommitted. In particular, 
+## the builder and setter functions do not check if the value[s] passed to them are 
+## negative or too large.


So... what happens then? Is the value cut off? Are bits outside of the fields modified?

At the very least say: "If the caller does this, the result is completely undefined"?

Also: I think this implicitly says: "the field values are non-negative values", but this is never said explicitly (or am I just overlooking it?)

fingolfin · 2017-08-23T22:43:25Z

src/intfuncs.c

+
+/*
+* For the builders we also repurpose BODY_FUNC to hold a GAP integer telling
+* us how many fields there are


"also"? As in: something else was "also" repurposed? But what?

It would be really nice if there was a documentation comment before this code which gave at least a brief overview of what is stored where and why...

fingolfin · 2017-08-23T22:44:52Z

src/intfuncs.c

+*/
+
+#ifdef SYS_IS_64_BIT
+#define LENSTART_WIDTH 8


These days, we try to use enums for constants like this. They give better compiler diagnostics, and one is likely to shoot oneself into the foot, e.g. with #define VALUE 1+2 ... x = VALUE * 2;.

fingolfin · 2017-08-23T22:55:56Z

src/intfuncs.c

+#else
+#define LENSTART_WIDTH 8
+#define WIDTH_WIDTH 5
+#define MAX_FIELDS 6


Why are there at most 6 fields? I have not yet tried to understand the code after this, but I just looked at it, and it seems quite complicated... But why is that necessary? In my imagination, it would be sufficient to do this: For each field, the getter/setter need exactly two things: the offset of the field, and a mask (or alternatively, the number of bits in the fields, but I think the mask is more efficient).

The mask takes up one Obj, the offset another (but the offset fits in a single byte, so it might be easy to stash it somewhere else). If we are really concerned about space usage, we can store an offset and field with in 6 bit each.

Assuming we stored the mask and offset, the generic setter andgetter function would look like this (in untested pseudo code, but I hope it's enough to get the basic idea across); note that this does not use INTOBJ_INT or INT_INTOBJ at, because if we setup the mask and offset suitably, they are not necessary.

Obj SetterFunc(Obj self, Obj packedValue, Obj fieldValue) { GAP_ASSERT(IS_INTOBJ(packedValue)); GAP_ASSERT(IS_INTOBJ(fieldValue)); UInt mask = extractMaskFromFunc(self); UInt offset = extractOffsetFromFunc(self); UInt v = ((UInt)fieldValue ^ 1) << offset; UInt newVal = ((((UInt)packedValue) & ~mask) | v; return (Obj)newVal; } Obj GetterFunc(Obj self, Obj packedValue) { GAP_ASSERT(IS_INTOBJ(packedValue)); UInt mask = extractMaskFromFunc(self); UInt offset = extractOffsetFromFunc(self); UInt newVal = ((((UInt)packedValue) & mask) >> offset) | 1; return (Obj)newVal; }

We should have plenty of fields to store things in: ENVI_FUNC, NLOC_FUNC, BODY_FUNC, FEXS_FUNC

stevelinton · 2017-08-24T11:31:33Z

Renamed to bitfields and improved comments and documentation. Still a few points to deal with.

stevelinton · 2017-08-24T12:18:30Z

I think I've addressed everything from @fingolfin's verty helpful review except one point. This concerns how the getter and setter functions store and use the information that tells them about the specific bitfield they are getting or setting. I use a single word which stores the offset (low 8 bits) and length (rest) of the field. This is compatible with the BEXTR instruction, which we, or a very smart compiler could use for the getter.

@fingolfin suggests using two words (which we have available) one to store the offset and the other a mask to select out the desired bits. This makes the C getter and setter code a bit simpler, and probably also the generated assembler, but doesn't fit with BEXTR and does involve accessing one extra word of memory, although probably not a new cache line.

I'm not sure which approach is better.

markuspf · 2017-08-24T12:38:03Z

Do you already have a variant of your AVL tree code that uses these bitfields (as a showcase if nothing else)?

It seems to be the time for me to do some nitpicking: I think every other place in the GAP kernel uses a capital D in the DoXOperation function names, and generally CamelCase seems to be used more often than snake_case (referring to get_start, get_len and friends).

stevelinton · 2017-08-24T13:17:16Z

I have a variant of the union-find structure. Runs about as fast and uses half the memory Look at https://github.com/stevelinton/datastructures/tree/union-find (in gap/union-find2.gi and in src/uf.c in the functions with 2 in their names.

stevelinton · 2017-08-29T11:47:43Z

@markuspf I've now got my AVL tree code with threading etc. working as far as adding and iterating over the tree. I haven't attacked delete yet, or done the kernel accelerators. The main conclusion for the bitfields is that we really want support for Boolean fields -- that is one bit fields whose getters and setters deal with True and False instead of 1 and 0. Shouldn't make any performance difference worth the mention, and it will make a lot of code more readable.

fingolfin · 2017-08-31T08:49:38Z

The test failure is due to the the manual example not being "clean"

fingolfin · 2017-08-31T08:52:34Z

I have some doubts about focusing on BEXTR. That's on instruction on one architecture. And it may not be what on wants to do at some in the future. So tying the design of the code to that seems a bit short sighted; it's probably OK if there is a huge gain, but otherwise, I am highly sceptical.

I also don't understand why there is this limitation on 6 resp. 10 fields -- at least with the approach I suggested, there is absolutely no reason for that, so if you want to have 60 1-bit fields on 64bit archs, you can have them.

fingolfin · 2017-08-31T08:58:14Z

Aha, I guess the limitation comes from the desire to have a "builder" function which creates a single "bitfield" given a sequence of field values. I somehow missed this when reviewing the previous version of this PR (or perhaps it was not yet documented?)

This all still feels like it is some highly specialized code, only of interest for relatively low-level work. Is it right really that useful beyond the datastructures package? Perhaps it should be in there?

Perhaps it is useful in general, I just have hard time seeing it; to me, it would be much more attractive if GAP had bit manipulation functions which allow users to XOR, OR, AND, NOT immediate ints (and perhaps do bit test/set/clear). Of course that may be somewhat slower than this highly dedicated code for your applications; but OTOH, it is also vastly more general, and easy to exaplain to most programmers.

fingolfin · 2017-08-31T09:12:11Z

For the union find code, builders seem to be only used in constructors (so not time critical), so at least there, a more naive implementation which takes a list of widths (which can be of arbitrary width), and builds up everything from it should be fine. This could be implemented as follows: have a kernel function which takes a list of width, and list of field values (they should be of the same length), and uses them to build the desired immediate values.

Then, in MakeBitfield, create the actual builder on the GAP level, like this pseudo-code:

MakeBitfield := function(arg)
   local bf;
   bf := _kernel_MakeBitfield(arg);
   bf.builder := function(values...) return _kernel_build(bf.widths, values); end
   return bf;
end;

stevelinton · 2017-08-31T17:38:49Z

OK. I'm persuaded to change the data in the function header to mask and offset, and yes, I find myself using builders much less often than I expected (in my avl tree code under development they are used once when a new node is added.

stevelinton · 2017-08-31T17:46:46Z

On the question of generality. Adding all the necessary operations to the language would probably be better, but a lot more work. Also you'd end up wanting to write getter and setter functions for any non-trivial cases anyway and then you're paying GAP function calling overhead for something like

get_subtree_size := word -> word & 0xFFFFFFFFFFFFFF0 >> 4;

whereas with the code I have you essentially get that function directly and it runs is 10-20 ns.

I don't really know how wide an application it will have, but there are plenty of bits of code in the library that use whole words for things that only have a few possible values in large mathematical data structures.

stevelinton · 2017-09-12T14:51:48Z

This version uses the mask and offset setup and gets rid of the limit on the number of fields.

fingolfin

This looks good to me now -- but it better should be, as it feels I kind of bullied Steve into them :/. Anyway...

My remaining remarks all are unimportant quibbles about spaces and indentions, and could be ignored.

fingolfin · 2017-09-13T11:12:18Z

src/intfuncs.c

+static inline void SET_OFFFSET_BITFIELD_FUNC(Obj func, UInt offset)
+{
+  return SET_FEXS_FUNC(func, INTOBJ_INT(offset));
+


Remove empty line?

fingolfin · 2017-09-13T11:12:58Z

src/intfuncs.c

+{
+  UInt mask = MASK_BITFIELD_FUNC(self);
+  UInt offset = OFFSET_BITFIELD_FUNC(self);
+    if (!ARE_INTOBJS(data, val))


Why the differing indentation here and elsewhere?

(BTW, if you have it installed, you can use git clang-format to automatically format your commits)

fingolfin · 2017-09-13T11:13:49Z

src/intfuncs.c

+static Obj FuncBUILD_BITFIELDS(Obj self, Obj args)
+{
+  GAP_ASSERT(IS_PLIST(args));
+  GAP_ASSERT(LEN_PLIST(args) >= 1  && ELM_PLIST(args,1));


Remove extra space before && ?

fingolfin · 2017-09-13T11:14:23Z

src/intfuncs.c

+  UInt i;
+  for (i = nfields; i > 0; i--) {
+    GAP_ASSERT(ISB_LIST(widths, i));
+    Obj y = ELM_LIST(widths,i);


insert space after comma?

fingolfin · 2017-09-13T11:16:15Z

src/intfuncs.c

+    Obj y = ELM_LIST(widths,i);
+    GAP_ASSERT(IS_INTOBJ(y));
+    x <<= INT_INTOBJ(y);
+    GAP_ASSERT(ELM_PLIST(args, i+1));


That GAP_ASSERT is somewhat redundant, as the IS_INTOBJ test later on will simply fail in this case. But of course it doesn't hurt to have here.

fingolfin · 2017-09-13T11:17:24Z

src/intfuncs.c

+    Obj  bgetters = NEW_PLIST(T_PLIST + IMMUTABLE, nfields);
+    for (UInt i = 1; i <= nfields; i++) {
+        UInt mask = (1L << starts[i]) - (1L << starts[i - 1]);
+        Obj  s = NewFunctionCT(T_FUNCTION, SIZE_FUNC, "<field setter>", 2,


You could also just NewFunctionC here and elsewhere (makes it slightly easier to read the code, IMO, but that's perhaps taste)

fingolfin · 2017-09-13T11:18:48Z

tst/testinstall/bitfields.tst

@@ -0,0 +1,47 @@
+gap> START_TEST("bitfields.tst");
+
+# Test correct behaviour for a variety of  numbers of fields


remove extra space before "numbers"?

fingolfin · 2017-09-13T11:19:12Z

tst/testinstall/bitfields.tst

+> Print("Bad return from getter",i," ",j," ",x,"\n");
+> fi; od;
+> if bf.booleanGetters[1](x) <> (vals[2] = 1) then
+> Print("Bad return from Boolean Getter");


Here it is "Boolean Getter", below it is "boolean setter" -- why the differing case?

fingolfin · 2017-09-13T11:21:17Z

Still would be kind of nice if @markuspf or @ChrisJefferson could have a quick look at this, too.

Bitfields provide a fast (by GAP standards) and tidy way of viewing an immediate integer as a set of bitfields. Handy mainly for compact data structures.

markuspf

The only minor thing is of course now that one would want a file in dev/Updates to reflect this change, but maybe I should just hit "merge" for now to not drag this on.

stevelinton added the do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) label Aug 22, 2017

stevelinton force-pushed the bitmasks branch 2 times, most recently from 0789834 to a280e5c Compare August 23, 2017 13:12

stevelinton changed the title ~~WIP Bitmasks~~ MaskSets Aug 23, 2017

stevelinton removed the do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) label Aug 23, 2017

stevelinton requested review from fingolfin, ChrisJefferson and markuspf August 23, 2017 13:14

fingolfin reviewed Aug 23, 2017

View reviewed changes

stevelinton force-pushed the bitmasks branch from a280e5c to b6b3886 Compare August 24, 2017 11:29

stevelinton changed the title ~~MaskSets~~ Bitfields Aug 24, 2017

stevelinton force-pushed the bitmasks branch 4 times, most recently from ffecdb9 to 24b80a5 Compare August 29, 2017 10:05

stevelinton force-pushed the bitmasks branch from 24b80a5 to 7960187 Compare August 29, 2017 12:52

stevelinton force-pushed the bitmasks branch 2 times, most recently from 6cdbc75 to e0f26d9 Compare September 11, 2017 13:00

stevelinton force-pushed the bitmasks branch from e0f26d9 to 3eed2bf Compare September 12, 2017 12:33

fingolfin approved these changes Sep 13, 2017

View reviewed changes

Bitfields feature. Includes tests and documentation.

951edc6

Bitfields provide a fast (by GAP standards) and tidy way of viewing an immediate integer as a set of bitfields. Handy mainly for compact data structures.

stevelinton force-pushed the bitmasks branch from 3eed2bf to 951edc6 Compare September 13, 2017 11:59

markuspf approved these changes Sep 13, 2017

View reviewed changes

markuspf merged commit eebc9d6 into gap-system:master Sep 13, 2017

olexandr-konovalov added the release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes label Jan 22, 2018

olexandr-konovalov added release notes: added PRs introducing changes that have since been mentioned in the release notes and removed release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes labels Jan 29, 2018

fingolfin mentioned this pull request Jun 5, 2019

Bit masking for small ints #1611

Closed

		@@ -0,0 +1,47 @@
		gap> START_TEST("bitfields.tst");

		# Test correct behaviour for a variety of numbers of fields

Bitfields #1616

Bitfields #1616

Conversation

stevelinton commented Aug 22, 2017

codecov bot commented Aug 23, 2017 • edited Loading

Codecov Report

stevelinton commented Aug 23, 2017

Stefan-Kohl commented Aug 23, 2017

olexandr-konovalov commented Aug 23, 2017

fingolfin commented Aug 23, 2017

olexandr-konovalov commented Aug 23, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stevelinton commented Aug 24, 2017

stevelinton commented Aug 24, 2017

markuspf commented Aug 24, 2017

stevelinton commented Aug 24, 2017

stevelinton commented Aug 29, 2017

fingolfin commented Aug 31, 2017

fingolfin commented Aug 31, 2017

fingolfin commented Aug 31, 2017

fingolfin commented Aug 31, 2017

stevelinton commented Aug 31, 2017

stevelinton commented Aug 31, 2017

stevelinton commented Sep 12, 2017

fingolfin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

fingolfin commented Sep 13, 2017

markuspf left a comment

Choose a reason for hiding this comment

codecov bot commented Aug 23, 2017 •

edited

Loading