Fast method for ElmsBlist when positions are a range with increment 1 #1773
Codecov Report
@@ Coverage Diff @@
## master #1773 +/- ##
==========================================
- Coverage 62.92% 62.87% -0.05%
==========================================
Files 968 968
Lines 294059 293097 -962
Branches 12987 12929 -58
==========================================
- Hits 185031 184290 -741
+ Misses 106227 106016 -211
+ Partials 2801 2791 -10
While I imagine we'd have noticed any breakage, a test that covers a large range of offsets and lengths would be nice, just to make sure there isn't some nasty corner case somewhere.
@ChrisJefferson I have a standalone C program which is pretty exhaustive. I'll translate it into GAP.
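A sweep of this kind could look roughly like the sketch below. It is not the standalone program mentioned above, only an illustration under stated assumptions: `Word` and `WORD_BITS` stand in for GAP's `UInt` and `BIPEB`, and `copy_bits_naive`, `copy_bits_drops_last` and `sweep_copy_bits` are hypothetical names. The idea is to compare the routine under test against an obviously correct bit-by-bit copy for every source offset, destination offset and a range of lengths, on a patterned background so that writes outside the destination range are also caught.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t Word;   /* stand-in for GAP's UInt (assumption) */
#define WORD_BITS 64u    /* plays the role of BIPEB (assumption) */
#define NWORDS 4

typedef void (*CopyFn)(const Word * from, unsigned frombit,
                       Word * to, unsigned tobit, unsigned nbits);

/* Obviously correct reference: move one bit at a time. */
static void copy_bits_naive(const Word * from, unsigned frombit,
                            Word * to, unsigned tobit, unsigned nbits)
{
    for (unsigned i = 0; i < nbits; i++) {
        unsigned s = frombit + i, d = tobit + i;
        Word bit = (from[s / WORD_BITS] >> (s % WORD_BITS)) & 1u;
        to[d / WORD_BITS] &= ~((Word)1 << (d % WORD_BITS));
        to[d / WORD_BITS] |= bit << (d % WORD_BITS);
    }
}

/* Deliberately wrong: drops the last bit, to show that the sweep notices. */
static void copy_bits_drops_last(const Word * from, unsigned frombit,
                                 Word * to, unsigned tobit, unsigned nbits)
{
    if (nbits > 0)
        copy_bits_naive(from, frombit, to, tobit, nbits - 1);
}

/* Returns the number of (source offset, destination offset, length)
   triples for which fn disagrees with the reference copy. */
static unsigned long sweep_copy_bits(CopyFn fn)
{
    Word src[NWORDS], want[NWORDS], got[NWORDS];
    unsigned long failures = 0;
    Word seed = UINT64_C(0x9E3779B97F4A7C15);

    /* Fill the source with a fixed pseudo-random pattern. */
    for (int i = 0; i < NWORDS; i++) {
        seed = seed * UINT64_C(6364136223846793005) + UINT64_C(1442695040888963407);
        src[i] = seed;
    }
    for (unsigned soff = 0; soff < WORD_BITS; soff++) {
        for (unsigned doff = 0; doff < WORD_BITS; doff++) {
            for (unsigned n = 0; n <= 2 * WORD_BITS + 1; n++) {
                /* Same background in both buffers, so stray writes show up. */
                memset(want, 0xA5, sizeof want);
                memset(got, 0xA5, sizeof got);
                copy_bits_naive(src, soff, want, doff, n);
                fn(src, soff, got, doff, n);
                if (memcmp(want, got, sizeof want) != 0)
                    failures++;
            }
        }
    }
    return failures;
}
```

Passing the real routine as `fn` and expecting a return value of 0 gives the test; the deliberately broken variant is only there to confirm that the sweep can fail.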
I added the "not-for-release-notes" label. But perhaps this PR gives a speed boost to something? In that case the label should of course be removed again, but I couldn't tell from the description of the PR.
src/blister.h
Outdated
@@ -140,6 +140,9 @@ static inline UInt * BLOCKS_BLIST(Obj list)
    return BLOCKS_BLIST_UNSAFE(list);
}

static inline const UInt * CONST_BLOCKS_BLIST(Obj list) {
    return (const UInt *)BLOCKS_BLIST(list);
I'm afraid this implementation defeats the ultimate purpose: to allow inserting read/write guards in an "optimal" fashion. We really need this (and please use 4 spaces for indent):
static inline const UInt * CONST_BLOCKS_BLIST(Obj list)
{
    GAP_ASSERT(IS_BLIST_REP_WITH_COPYING(list));
    return ((const UInt *)(CONST_ADDR_OBJ(list) + 1));
}
src/blister.h
Outdated
*/

/* constructs a mask that selects bits <from> to <to> inclusive of a UInt */
static inline UInt mask(UInt from, UInt to)
Hm, `mask` is an awfully common name; I'm not sure it's a good idea to use that for a global function.
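For illustration, a helper with a more specific name might look like the sketch below. The name `BitRangeMaskSketch` is hypothetical, and `Word` and `WORD_BITS` are stand-ins for GAP's `UInt` and `BIPEB`; this is not the code from the PR. It builds a mask selecting bits `from` to `to` inclusive and avoids shifting by the full word width, which is undefined behaviour in C.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Word;   /* stand-in for GAP's UInt (assumption) */
#define WORD_BITS 64u    /* plays the role of BIPEB (assumption) */

/* Hypothetical, more specific name than plain `mask`.
   Selects bits <from> to <to> inclusive, counting from the least
   significant bit. Requires from <= to < WORD_BITS. */
static inline Word BitRangeMaskSketch(unsigned from, unsigned to)
{
    assert(from <= to && to < WORD_BITS);
    /* All bits 0..to; special-case the top bit to avoid a shift by 64. */
    Word upto = (to == WORD_BITS - 1) ? ~(Word)0
                                      : (((Word)1 << (to + 1)) - 1);
    /* All bits strictly below <from>. */
    Word below = ((Word)1 << from) - 1;
    return upto & ~below;
}
```

For example, `BitRangeMaskSketch(4, 7)` selects the second-lowest nibble, and `BitRangeMaskSketch(0, WORD_BITS - 1)` selects the whole word.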
src/blister.h
Outdated
}


static inline __attribute__((always_inline)) void CopyBits(const UInt * fromblock,
Should use `ALWAYS_INLINE` from PR #1779.
PR #1779 has been merged, so you can now do this.
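The point of such a macro is to keep the compiler-specific attribute in one place. The sketch below shows the general pattern only; `SKETCH_ALWAYS_INLINE` and `low_bit` are hypothetical names, and the actual definition of `ALWAYS_INLINE` in PR #1779 may differ.

```c
#include <assert.h>

/* Hypothetical portability macro in the spirit of ALWAYS_INLINE:
   use the GCC/Clang attribute where available, plain inline otherwise. */
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALWAYS_INLINE __attribute__((always_inline)) inline
#else
#define SKETCH_ALWAYS_INLINE inline
#endif

/* A small function that should be inlined at every call site. */
static SKETCH_ALWAYS_INLINE unsigned low_bit(unsigned long x)
{
    return (unsigned)(x & 1u);
}
```

With this in place, a declaration only needs the macro instead of spelling out `__attribute__((always_inline))` by hand.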
src/vecgf2.c
Outdated
}
return;
}
CopyBits(sptr, soff, dptr, doff, nelts);
Have you conducted tests as to whether (and if so: how exactly) this change to the GF2 code affects performance?
@fingolfin It speeds up ElmsBlist, which was missing this special case and showed up as 10% of the CPU time in teststandard. More or less equivalent, but much, much messier, code was already in the vecgf2 applications, so I'd expect little or no change there, but I'll check.
OK, so it is a performance enhancement. Adjusted labels accordingly.
Some more nitpicks. But the things I really would like to see addressed are: use `ALWAYS_INLINE`; correct `CONST_BLOCKS_BLIST`; avoid using `mask` as the name of a globally visible function.
src/blister.h
Outdated
}
/* Now move whole words */
if ((wholeblocks = nbits / BIPEB))
memcpy((void *)toblock, (void *)fromblock,
I don't think you need the typecasts here (and the second one removes the `const` qualifier of `fromblock`).
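A minimal sketch of the cast-free call, with `Word` standing in for `UInt` and `copy_whole_words` being a hypothetical helper rather than code from the PR: `memcpy` is declared as taking `void *` and `const void *`, and object pointers convert to those types implicitly in C, so the `const` on the source is preserved.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t Word;   /* stand-in for GAP's UInt (assumption) */

/* Hypothetical helper: copy <wholeblocks> complete words.
   No casts are needed, and fromblock stays const-qualified. */
static void copy_whole_words(Word * toblock, const Word * fromblock,
                             size_t wholeblocks)
{
    if (wholeblocks > 0)
        memcpy(toblock, fromblock, wholeblocks * sizeof(Word));
}
```

Note that `memcpy` requires the source and destination not to overlap.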
src/blister.h
Outdated
*toblock++ = x;
nbits -= BIPEB;
}
/* Finally we may need to fill up a partial block at destination */
Why is that comment attached to the preceding loop, instead of the following `if` (to which it refers, doesn't it)?
I.e., maybe put the empty line before the comment, not after it?
src/vecgf2.c
Outdated
SyExit(2);
}
soff = (smin-1) %BIPEB;
doff = (dmin-1) %BIPEB;
Either add spaces after `%` or remove the ones before... Or just use `clang-format`?
(If you add spaces, then perhaps also add them around the `/` in the next two lines?)
fc7bc80 to a4ae227
Added test and addressed all comments, I think.
All my original remarks are addressed, and in principle, we could merge this now.
However, I have two more nitpicks that I'd appreciate being addressed. That said, if people are in a rush to merge this, go ahead... :-)
src/blister.h
Outdated
fromblock += frombit / BIPEB;
frombit %= BIPEB;
toblock += tobit / BIPEB;
tobit %= BIPEB;
This function is called from two places, and both already normalize. Is the optimizer clever enough to get rid of the second, redundant normalization?
Otherwise, I'd either remove the normalization in both calling places, or (the approach I personally favor slightly) replace the four lines above by
GAP_ASSERT(frombit < BIPEB);
GAP_ASSERT(tobit < BIPEB);
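The split suggested here, with the caller normalizing once and the callee only asserting, could be sketched as follows. `assert` stands in for `GAP_ASSERT`, `Word` and `WORD_BITS` for `UInt` and `BIPEB`, and `get_bit` and `bit_at_position` are hypothetical helpers, not GAP functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Word;   /* stand-in for GAP's UInt (assumption) */
#define WORD_BITS 64u    /* plays the role of BIPEB (assumption) */

/* Callee: expects an already normalized bit offset and only checks it. */
static inline Word get_bit(const Word * block, unsigned bit)
{
    assert(bit < WORD_BITS);   /* plays the role of GAP_ASSERT(frombit < BIPEB) */
    return (*block >> bit) & 1u;
}

/* Caller: turns a 1-based position into (block pointer, bit offset)
   exactly once, so the callee never has to divide again. */
static inline Word bit_at_position(const Word * blocks, size_t pos)
{
    size_t idx = pos - 1;
    return get_bit(blocks + idx / WORD_BITS, (unsigned)(idx % WORD_BITS));
}
```

In a release build the assertion compiles away, so the callee pays nothing for the check, while debug builds still catch a caller that forgot to normalize.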
Fine. I put it in because it made it easier to write my C-level tests, but you're right, it's not needed in GAP.
src/blister.h
Outdated
** `CopyBits' copies <numbits> bits (numbering bits within a UInt
** from the least significant to the most significant) starting with
** bit number <from-starting-bit> of UInt *<fromblock> to a destination
** starting at bit <to-starting-bit> of *<toblock>.
Are the two blocks assumed to be non-overlapping? I think it would be a good idea to document this explicitly.
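The question matters because the whole-word step uses `memcpy`, whose behaviour is undefined when source and destination overlap. If the non-overlap requirement is documented, a debug-only check could back it up. The sketch below is one way to write such a check; `word_ranges_overlap` is a hypothetical helper and `Word` a stand-in for `UInt`, and comparing addresses via `uintptr_t` assumes a flat address space, which holds on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Word;   /* stand-in for GAP's UInt (assumption) */

/* Hypothetical helper: returns 1 if the word ranges [a, a+na) and
   [b, b+nb) share at least one byte, 0 otherwise. Addresses are
   compared as integers, assuming a flat address space. */
static int word_ranges_overlap(const Word * a, size_t na,
                               const Word * b, size_t nb)
{
    if (na == 0 || nb == 0)
        return 0;   /* an empty range overlaps nothing */
    uintptr_t a0 = (uintptr_t)a, a1 = a0 + na * sizeof(Word);
    uintptr_t b0 = (uintptr_t)b, b1 = b0 + nb * sizeof(Word);
    return a0 < b1 && b0 < a1;
}
```

A copy routine could then assert `!word_ranges_overlap(...)` on entry in debug builds, which turns a silent precondition into a checked one.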
…ns is a range of increment 1.
… code that was there before.
a4ae227 to 4f581a8
Also used for copying pieces of GF2 vectors around.