Fix TT comment and static_assert()
The old comment is based on a misunderstanding of what unaligned memory access
is. This article explains it very clearly:
https://www.kernel.org/doc/Documentation/unaligned-memory-access.txt

No matter how we define TTEntry or TTCluster, there will never be any unaligned
memory access. This is because the compiler knows the alignment rules, and makes
the necessary adjustments to ensure that unaligned memory access does not occur.
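
As a minimal sketch of the point above, the following uses a hypothetical struct (`Example` is an illustration, not Stockfish's TTEntry) to show that the compiler places each member at an offset compatible with its alignment requirement, inserting padding where needed. The exact sizes are platform dependent, so only the alignment relations are checked:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct for illustration only (not Stockfish's TTEntry):
// a 1-byte member followed by a 4-byte member.
struct Example {
    std::uint8_t  tag;    // placed at offset 0
    std::uint32_t value;  // the compiler pads before this member if required
};

// The member's offset is a multiple of its alignment requirement.
static_assert(offsetof(Example, value) % alignof(std::uint32_t) == 0,
              "value is naturally aligned inside Example");

// The struct's size is a multiple of its alignment, so every element of an
// array of Example keeps its members aligned as well.
static_assert(sizeof(Example) % alignof(Example) == 0,
              "array elements of Example stay aligned");

// Same check as a function, so it can also be evaluated at run time.
inline bool value_is_aligned() {
    return offsetof(Example, value) % alignof(std::uint32_t) == 0;
}
```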

The issue being addressed here has nothing to do with unaligned memory access.
It is about cache performance. In order to achieve the best cache performance:
- we prefetch the cache line as soon as possible.
- we ensure that TT clusters do not spread across two cache lines. If they did,
  we would need to prefetch 2 cache lines, which could hurt cache performance.
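
The second point can be checked with simple arithmetic. The sketch below assumes a 64-byte cache line and clusters packed back to back from a cache-line-aligned base; `kCacheLineSize`, `straddles` and `any_cluster_straddles` are hypothetical names used for illustration, not Stockfish code:

```cpp
#include <cstddef>

// Assumption for this sketch: 64-byte cache lines.
constexpr std::size_t kCacheLineSize = 64;

// True if the byte range [offset, offset + size) touches more than one
// cache line. Offsets are relative to a cache-line-aligned base address.
inline bool straddles(std::size_t offset, std::size_t size) {
    return offset / kCacheLineSize != (offset + size - 1) / kCacheLineSize;
}

// True if any of the first `count` clusters of `clusterSize` bytes, packed
// back to back from the aligned base, crosses a cache line boundary.
inline bool any_cluster_straddles(std::size_t clusterSize, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i)
        if (straddles(i * clusterSize, clusterSize))
            return true;
    return false;
}
```

With these assumptions, 32-byte clusters never cross a boundary, while 48-byte clusters do (the second cluster covers bytes 48 to 95), even though 48 is smaller than the cache line size.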

Therefore, the true conditions to achieve this are:
1/ The start address of the TT is cache line aligned. void
TranspositionTable::resize() enforces this.
2/ The TT cluster size should *divide* the cache line size. Currently, we pack 2
clusters per cache line. It used to be 1 before "TT sardines". The ratio does
not matter: all we want is to fit an integer number of clusters per cache
line.
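
The two conditions can be sketched together as follows. This is an illustration under stated assumptions, not the body of TranspositionTable::resize(): `DemoCluster`, `DemoTable`, `align_up` and `make_table` are hypothetical names, the cache line is assumed to be 64 bytes, and the over-allocate-then-round-up approach is one common way to obtain an aligned start address:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumption for this sketch: 64-byte cache lines (a power of two).
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical 32-byte cluster standing in for the real one.
struct DemoCluster { unsigned char bytes[32]; };

// Condition 2: the cluster size divides the cache line size.
static_assert(kCacheLineSize % sizeof(DemoCluster) == 0,
              "Cluster size incorrect");

// Round an address up to the next multiple of the cache line size.
inline std::uintptr_t align_up(std::uintptr_t p) {
    return (p + kCacheLineSize - 1) & ~std::uintptr_t(kCacheLineSize - 1);
}

struct DemoTable {
    void*        mem;    // raw allocation, to be passed to std::free()
    DemoCluster* table;  // cache-line-aligned start of the clusters
};

// Condition 1: over-allocate by up to one cache line, then round the start
// address up so that the first cluster begins on a cache line boundary.
inline DemoTable make_table(std::size_t clusterCount) {
    void* mem = std::calloc(clusterCount * sizeof(DemoCluster)
                            + kCacheLineSize - 1, 1);
    DemoCluster* table = mem
        ? reinterpret_cast<DemoCluster*>(
              align_up(reinterpret_cast<std::uintptr_t>(mem)))
        : nullptr;
    return {mem, table};
}
```

The caller frees `mem`, not `table`, since `table` may point into the middle of the allocation.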

No functional change.

Resolves #506
lucasart authored and zamar committed Nov 21, 2015
1 parent 9319555 commit 328098d
Showing 1 changed file with 5 additions and 4 deletions.
9 changes: 5 additions & 4 deletions src/tt.h
@@ -76,8 +76,9 @@ struct TTEntry {
 /// A TranspositionTable consists of a power of 2 number of clusters and each
 /// cluster consists of ClusterSize number of TTEntry. Each non-empty entry
 /// contains information of exactly one position. The size of a cluster should
-/// not be bigger than a cache line size. In case it is less, it should be padded
-/// to guarantee always aligned accesses.
+/// divide the size of a cache line size, to ensure that clusters never cross
+/// cache lines. This ensures best cache performance, as the cacheline is
+/// prefetched, as soon as possible.

 class TranspositionTable {

@@ -86,10 +87,10 @@ class TranspositionTable {

   struct Cluster {
     TTEntry entry[ClusterSize];
-    char padding[2]; // Align to the cache line size
+    char padding[2]; // Align to a divisor of the cache line size
   };

-  static_assert(sizeof(Cluster) == CacheLineSize / 2, "Cluster size incorrect");
+  static_assert(CacheLineSize % sizeof(Cluster) == 0, "Cluster size incorrect");

 public:
  ~TranspositionTable() { free(mem); }