# KANERVA2009

**Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors**

---

**Abstract.** The 1990s saw the emergence of cognitive models that depend on very high dimensionality and randomness. They include Holographic Reduced Representations, Spatter Code, Semantic Vectors, Latent Semantic Analysis, Context-Dependent Thinning, and Vector-Symbolic Architecture. They represent things in high-dimensional vectors that are manipulated by operations that produce new high-dimensional vectors in the style of traditional computing, in what is called here hyper-dimensional computing on account of the very high dimensionality. The paper presents the main ideas behind these models, written as a tutorial essay in hopes of making the ideas accessible and even provocative. A sketch of how we have arrived at these models, with references and pointers to further reading, is given at the end. The thesis of the paper is that hyper-dimensional representation has much to offer to students of cognitive science, theoretical neuroscience, computer science and engineering, and mathematics.

---


### Setup


In [1]:
import numpy as np
import math
import random
import functools as ft

### Logging


In [2]:
import logging, sys

logging.basicConfig(
    stream=sys.stderr, level=logging.DEBUG, format="[%(levelname)s] %(message)s"
)

## Hyperdimensional Computer


### Hyperdimensional Representation


In [3]:
DIMENSIONS = 10000

In [4]:
def generateRandomVector(d):
    return np.random.randint(2, size=d)

In [5]:
def similarity(A, B):
    if A.size != B.size:
        raise Exception("A and B have different dimensions.")

    count = 0
    for i in range(0, (A.size - 1)):
        if A[i] == B[i]:
            count += 1

    return count / A.size

In [6]:
def calculateSTDs(A, B):
    return abs((similarity(A, B) * A.size - 5000) / 50)

#### Tests


In [7]:
def test_random_vector():
    logging.debug("test_random_vector")

    A = generateRandomVector(DIMENSIONS)

    similarityAA = similarity(A, A)
    calculateSTDsAA = calculateSTDs(A, A)

    logging.debug("  similarity (A, A): %s", similarityAA)
    assert similarityAA > 0.98
    logging.debug("  calulateSTDs (A, A): %s", calculateSTDsAA)
    assert calculateSTDsAA > 98

In [8]:
def test_random_vectors():
    logging.debug("test_random_vectors")

    A = generateRandomVector(DIMENSIONS)
    B = generateRandomVector(DIMENSIONS)

    similarityAB = similarity(A, B)
    calculateSTDsAB = calculateSTDs(A, B)

    logging.debug("  similarity (A, B): %s", similarityAB)
    assert similarityAB > 0.45
    assert similarityAB < 0.55
    logging.debug("  calculateSTDs (A, B): %s", calculateSTDsAB)
    assert calculateSTDsAB < 3

In [9]:
test_random_vector()
test_random_vectors()

[DEBUG] test_random_vector
[DEBUG]   similarity (A, A): 0.9999
[DEBUG]   calulateSTDs (A, A): 99.98
[DEBUG] test_random_vectors
[DEBUG]   similarity (A, B): 0.5074
[DEBUG]   calculateSTDs (A, B): 1.48


### Hyperdimensional arithmetic


In [10]:
def comparison(A, B):
    dividend = sum(A * B)
    divisor = math.sqrt(sum(A**2) * sum(B**2))

    return dividend / divisor

In [11]:
def applyThreshold(V, n=2):
    if n % 2 == 1:
        threshold = lambda x: 1 if x > n / 2 else 0
    else:
        threshold = (
            lambda x: 1
            if x > n / 2
            else (1 if x == n / 2 and bool(random.getrandbits(1)) else 0)
        )

    vectorized_threshold = np.vectorize(threshold)
    return vectorized_threshold(V)

In [12]:
def arithmeticSum(A, B):
    sum = A + B

    return applyThreshold(sum)

In [13]:
def arithmeticSumVectors(vectors):
    sum = ft.reduce(lambda x, y: x + y, vectors)

    return applyThreshold(sum, len(vectors))

In [14]:
# TODO: fix arithmetic subtraction
def arithmeticSubtraction(A, B):
    compliment = 1 - B

    return arithmeticSum(A, compliment)

In [15]:
def mult(A, B):
    return np.bitwise_xor(A, B)

In [16]:
def generatePermutationMatrix(dimensions):
    P = np.eye(dimensions)
    np.random.shuffle(P)

    return P

In [17]:
def inversePermutationMatrix(P):
    return np.linalg.inv(P)

#### Tests


In [18]:
def test_comparison():
    logging.debug("test_comparison")
    A = generateRandomVector(DIMENSIONS)
    B = generateRandomVector(DIMENSIONS)

    comparisonAA = comparison(A, A)
    comparisonAB = comparison(A, B)

    logging.debug("  comparison (A, A): %s", comparisonAA)
    assert comparisonAA > 0.99

    logging.debug("  comparison (A, B): %s", comparisonAB)
    assert comparisonAB > 0.45
    assert comparisonAB < 0.55

In [19]:
def test_arithemtic_sum():
    logging.debug("test_arithemtic_sum")
    A = generateRandomVector(DIMENSIONS)
    B = generateRandomVector(DIMENSIONS)

    S = arithmeticSum(A, B)

    comparisonSA = comparison(S, A)
    comparisonSB = comparison(S, B)

    assert comparisonSA > 0.70
    logging.debug("  comparison (S, A): %s", comparisonSA)
    assert comparisonSB > 0.70
    logging.debug("  comparison (S, B): %s", comparisonSB)

In [20]:
def test_arithemtic_sum_vectors():
    logging.debug("test_arithemtic_sum_vectors")

    A = generateRandomVector(DIMENSIONS)
    B = generateRandomVector(DIMENSIONS)
    C = generateRandomVector(DIMENSIONS)

    S = arithmeticSumVectors([A, B, C])

    comparisonSA = comparison(S, A)
    comparisonSB = comparison(S, B)
    comparisonSC = comparison(S, C)

    assert comparisonSA > 0.60
    logging.debug("  comparison (S, A): %s", comparisonSA)
    assert comparisonSB > 0.60
    logging.debug("  comparison (S, B): %s", comparisonSB)
    assert comparisonSC > 0.60
    logging.debug("  comparison (S, C): %s", comparisonSC)

In [21]:
def test_arithemtic_sum_vectors_large():
    logging.debug("test_arithemtic_sum_vectors")

    n = 100
    vectors = []

    for i in range(1, n):
        vectors.append(generateRandomVector(DIMENSIONS))

    S = arithmeticSumVectors(vectors)

    comparisonS0 = comparison(S, vectors[0])
    calculateSTDsS0 = calculateSTDs(S, vectors[0])
    comparisonS1 = comparison(S, vectors[1])
    calculateSTDsS1 = calculateSTDs(S, vectors[1])
    comparisonS2 = comparison(S, vectors[2])
    calculateSTDsS2 = calculateSTDs(S, vectors[2])

    assert comparisonS0 > 0.524
    logging.debug("  comparison (S, vectors[0]): %s", comparisonS0)
    assert comparisonS1 > 0.524
    logging.debug("  comparison (S, vectors[1]): %s", comparisonS1)
    assert comparisonS2 > 0.524
    logging.debug("  comparison (S, vectors[2]): %s", comparisonS2)

In [22]:
def test_arithemtic_subtraction():
    logging.debug("test_arithemtic_subtraction")

    A = generateRandomVector(DIMENSIONS)
    B = generateRandomVector(DIMENSIONS)
    # mean = np.mean([A, B], axis=0)

    M = arithmeticSum(A, B)

    S = arithmeticSubtraction(M, B)
    U = mult(M, B)

    comparisonSA = comparison(S, A)
    comparisonSB = comparison(S, B)

    logging.debug("  comparison (S, A): %s", comparisonSA)
    # assert comparisonSA > 0.70
    logging.debug("  comparison (S, B): %s", comparisonSB)
    # assert comparisonSB > 0.70

    comparisonUA = comparison(U, A)
    comparisonUB = comparison(U, B)

    logging.debug("  comparison (U, A): %s", comparisonUA)
    # assert comparisonSA > 0.70
    logging.debug("  comparison (U, B): %s", comparisonUB)
    # assert comparisonSB > 0.70

In [23]:
def test_own_inverse_xor():
    logging.debug("test_own_inverse_xor")

    A = generateRandomVector(DIMENSIONS)
    B = generateRandomVector(DIMENSIONS)

    comparisonAABB = comparison(A, mult(mult(A, B), B))
    assert comparisonAABB > 0.99
    logging.debug("  comparison(A, mult(mult(A, B), B)): %s", comparisonAABB)

In [24]:
def test_permutation_keeps_distance():
    logging.debug("test_permutation_keeps_distance")

    A = generateRandomVector(DIMENSIONS)
    B = generateRandomVector(DIMENSIONS)
    P = generatePermutationMatrix(DIMENSIONS)

    iA = P.dot(A)
    iB = P.dot(B)

    comparisonAB = comparison(A, B)
    comparisoniAiB = comparison(iA, iB)

    logging.debug(
        "  comparison(A, B) - comparison(iA, iB): %s", comparisonAB - comparisoniAiB
    )
    assert comparisonAB == comparisoniAiB

In [25]:
def test_inverse_permutation_cancels():
    logging.debug("test_inverse_permutation_cancels")

    V = generateRandomVector(DIMENSIONS)
    P = generatePermutationMatrix(DIMENSIONS)
    iP = inversePermutationMatrix(P)

    comparisonV = comparison(iP.dot(P.dot(V)), V)
    similarityV = similarity(iP.dot(P.dot(V)), V)

    logging.debug("  comparison(iP.dot(P.dot(V)), V): %s", comparisonV)
    assert comparisonV > 0.99

    logging.debug("  similarity(iP.dot(P.dot(V)), V): %s", similarityV)
    assert similarityV > 0.99

In [26]:
test_comparison()
test_arithemtic_sum()
test_arithemtic_sum_vectors()
test_arithemtic_sum_vectors_large()
test_arithemtic_subtraction()
test_own_inverse_xor()
test_permutation_keeps_distance()
test_inverse_permutation_cancels()

[DEBUG] test_comparison
[DEBUG]   comparison (A, A): 1.0
[DEBUG]   comparison (A, B): 0.4943843654623578
[DEBUG] test_arithemtic_sum
[DEBUG]   comparison (S, A): 0.7479392227715415
[DEBUG]   comparison (S, B): 0.754945021776569
[DEBUG] test_arithemtic_sum_vectors
[DEBUG]   comparison (S, A): 0.7474426871742589
[DEBUG]   comparison (S, B): 0.7552402429334132
[DEBUG]   comparison (S, C): 0.7469909759919405
[DEBUG] test_arithemtic_sum_vectors
[DEBUG]   comparison (S, vectors[0]): 0.5356389319071407
[DEBUG]   comparison (S, vectors[1]): 0.540711289654036
[DEBUG]   comparison (S, vectors[2]): 0.5335216931962784
[DEBUG] test_arithemtic_subtraction
[DEBUG]   comparison (S, A): 0.6177760595670687
[DEBUG]   comparison (S, B): 0.37410286607191295
[DEBUG]   comparison (U, A): 0.35221612559412274
[DEBUG]   comparison (U, B): 0.353232576291572
[DEBUG] test_own_inverse_xor
[DEBUG]   comparison(A, mult(mult(A, B), B)): 1.0
[DEBUG] test_permutation_keeps_distance
[DEBUG]   comparison(A, B) - compariso

### Hyperdimensional memory


**Autoassociative.** Autoassociative storage is achieved by storing each pattern X using X itself as the address. It is useful as it allows the original stored X to be recovered from an approximate or noisy version of it, X'.


In [27]:
class ItemMemory:
    def __init__(self, vectors=[]):
        self.vectors = vectors

    def addVector(self, label, V):
        self.vectors.append((label, V))

    def cleanup(self, V):
        return max(self.vectors, key=lambda x: comparison(V, x[1]))

#### Tests


In [36]:
def test_cleanup():
    logging.debug("test_cleanup")

    itemMemory = ItemMemory()

    X = generateRandomVector(DIMENSIONS)
    A = generateRandomVector(DIMENSIONS)

    Y = generateRandomVector(DIMENSIONS)
    B = generateRandomVector(DIMENSIONS)

    Z = generateRandomVector(DIMENSIONS)
    C = generateRandomVector(DIMENSIONS)

    itemMemory.addVector("A", A)
    itemMemory.addVector("B", B)
    itemMemory.addVector("C", C)
    itemMemory.addVector("X", X)
    itemMemory.addVector("Y", Y)
    itemMemory.addVector("Z", Z)

    H = arithmeticSumVectors([mult(A, X), mult(B, Y), mult(C, Z)])

    V = itemMemory.cleanup(mult(H, X))

    logging.debug("  V[0]: %s", V[0])
    assert V[0] == "A"
    logging.debug("  similarity(mult(H, X), A): %s", similarity(mult(H, X), A))

In [29]:
def test_one_step_sequence():
    logging.debug("test_one_step_sequence")
    itemMemory = ItemMemory()
    D = generateRandomVector(DIMENSIONS)
    E = generateRandomVector(DIMENSIONS)

    itemMemory.addVector("D", D)
    itemMemory.addVector("E", E)

    P = generatePermutationMatrix(DIMENSIONS)
    Dp = P.dot(D)

    S = arithmeticSum(Dp, E)

    itemMemory.addVector("S", S)

    Sr = itemMemory.cleanup(Dp)

    assert Sr[0] == "S"

    Es = arithmeticSubtraction(Sr[1], Dp)

    Er = itemMemory.cleanup(Es)

    assert Er[0] == "E"

In [37]:
test_cleanup()
test_one_step_sequence()

[DEBUG] test_cleanup


AttributeError: 'ItemMemory' object has no attribute 'cleanUp'

#### Representing Sequences by Permuting Sums

As with sets, several elements of a sequence can be represented in a single hypervector. This is called flattening or leveling the sequence. In order to preserve the order, the elements can not be flatened with the sum alone but rather the elements must be labeled according to their position in the sequence.


## Three examples with cognitive connotations


### Context vectors as examples of sets: Random Indexing
