Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Fix sometimes incorrect prefetches. No functional change.
bench: 6154836
- Loading branch information
cddde31
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be better tested with Elo gaining bounds
cddde31
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@snicolet You cannot really expect this to gain elo?! This seems more like a rather small bugfix which seems to be introduced by you, btw. See official-stockfish@1188141 (Not meant as a reproach!)
cddde31
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that the patch can only be seen as search for speed-up, since the prefetch
is a only a processor hint to improve the memory latency on modern CPU. There can't be a
bug since the semantics of prefetch is a null instruction.
The cases where
st->rule50 >= 14
are so rare (1%) that it may well be possible that prefechtingearlier for the most common case is in fact faster.
cddde31
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thought was that if we could handle all cases correctly (even rare ones) without adding code or hurting performance that would be a good thing. Let's ask what @vondele thinks. I'll happily do whatever he and @snicolet agree on.
cddde31
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mstembera Agreed!
I don't think there is anything to gain here from doing the prefetch before incrementing
rule50
.cddde31
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this fixes a bug, now that key depends on rule50, we should not access it before key is completely computed. This was easy to overlook when this dependency was introduced, but we better fix this IMO.
cddde31
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks. PR opened official-stockfish#3759.
Also I just realized that if prefetching the common case before the increment was really faster
the non regression test should be enough to catch that.