## Chunking

Chunking splits an SDR into a list of subsets (chunks).

In [1]:
from src.core.SDR import *

In [10]:
sdr = SDR(binaryArray=[1,0,1,0,1,0,1,0])
print('binary array ',sdr.toBinaryArray())
print('index array ',sdr.toIndexArray())

binary array  [1, 0, 1, 0, 1, 0, 1, 0]
index array  [0, 2, 4, 6]
8


The binary array contains a clear repeating pattern: [1,0]. This pattern can be abstracted out by chunking the SDR into smaller pieces.

In [3]:
print('binary ',sdr.chunk(size=2,offset=2,asBinary=True))
print('indices ',sdr.chunk(size=2,offset=2))

binary  [[1, 0], [1, 0], [1, 0], [1, 0]]
indices  [[0], [0], [0], [0]]


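The chunk size is 2, the same size as the pattern, and the offset of 2 moves the window forward by one full chunk, so the chunks do not overlap. The index form lists only the active bits of each chunk, numbered relative to the start of that chunk. In this case only the first bit of every chunk is active, giving [0].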
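
To make the windowing explicit, here is a minimal sketch in plain Python that reproduces the output above. It is not the actual `SDR.chunk` implementation: the helper name `chunk_bits` and its `as_binary` parameter are hypothetical, and the rule that chunks without active bits are left out of the index form is an assumption inferred from the outputs in this notebook.

```python
def chunk_bits(bits, size, offset, as_binary=False):
    """Split a list of 0/1 values into chunks (hypothetical helper, not SDR.chunk)."""
    chunks = []
    # Slide a window of `size` bits forward by `offset` bits at a time.
    for start in range(0, len(bits), offset):
        window = bits[start:start + size]  # may be shorter at the end of the list
        if as_binary:
            chunks.append(window)
        else:
            # Positions of active bits, numbered relative to the chunk start.
            active = [i for i, bit in enumerate(window) if bit]
            if active:  # assumption: chunks with no active bits are omitted
                chunks.append(active)
    return chunks


bits = [1, 0, 1, 0, 1, 0, 1, 0]
print('binary ', chunk_bits(bits, size=2, offset=2, as_binary=True))
print('indices ', chunk_bits(bits, size=2, offset=2))
```

With `offset` equal to `size` the windows tile the array without overlap; a smaller `offset` makes consecutive windows share bits.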

In [4]:
sdr.fromBinaryArray([1,0,0,0,0,1,0,0])
print('binary ',sdr.chunk(size=2,offset=2,asBinary=True))
print('indices ',sdr.chunk(size=2,offset=2))

binary  [[1, 0], [0, 0], [0, 1], [0, 0]]
indices  [[0], [1]]


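The same pattern is present, but it is less regular: the second occurrence starts at index 5, an odd position, so it straddles two chunks and the non-overlapping chunking misses it. Note that the index form omits chunks with no active bits, which is why only two entries appear. The second occurrence can still be discovered by allowing the chunks to overlap, that is, by using an offset smaller than the chunk size.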

In [5]:
print('binary ',sdr.chunk(size=2,offset=1,asBinary=True))
print('indices ',sdr.chunk(size=2,offset=1))

binary  [[1, 0], [0, 0], [0, 0], [0, 0], [0, 1], [1, 0], [0, 0], [0]]
indices  [[0], [1], [0]]
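
The effect of the offset on pattern discovery can be checked directly. The sketch below uses plain Python lists rather than the `SDR` class, and `find_pattern` is a hypothetical helper introduced only for illustration. It reports the start positions at which a window exactly equals the pattern.

```python
def find_pattern(bits, pattern, offset):
    """Return start positions where a window equals `pattern` (hypothetical helper)."""
    size = len(pattern)
    return [
        start
        for start in range(0, len(bits), offset)
        if bits[start:start + size] == pattern
    ]


bits = [1, 0, 0, 0, 0, 1, 0, 0]
print(find_pattern(bits, [1, 0], offset=2))  # [0]
print(find_pattern(bits, [1, 0], offset=1))  # [0, 5]
```

With an offset of 2 only even start positions are examined, so the occurrence at index 5 is never aligned with a window. An offset of 1 examines every start position and finds both.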


In [8]:
sdr = SDR(binaryArray=[1,1,0,0,1,1,1,1,0,0,1,1,1,1,1,1])
print('chunk size 2:')
print('binary ',sdr.chunk(size=2,offset=2,asBinary=True))
print('indices ',sdr.chunk(size=2,offset=2))
print('chunk size 4:')
print('binary ',sdr.chunk(size=4,offset=4,asBinary=True))
print('indices ',sdr.chunk(size=4,offset=4))

chunk size 2:
binary  [[1, 1], [0, 0], [1, 1], [1, 1], [0, 0], [1, 1], [1, 1], [1, 1]]
indices  [[0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]]
chunk size 4:
binary  [[1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 1, 1], [1, 1, 1, 1]]
indices  [[0, 1], [0, 1, 2, 3], [2, 3], [0, 1, 2, 3]]


Features of an SDR can be discovered at different chunk sizes. At chunk size 2, the repeating feature is [1,1] in binary form, or [0,1] as indices. At chunk size 4, the repeating feature is [1,1,1,1], or [0,1,2,3] as indices. Each chunk-size-4 pattern is a combination of two chunk-size-2 patterns. Features nested within features lend themselves to a hierarchy.
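
The nesting can be verified with a short sketch. It again works on a plain Python list instead of the `SDR` class, and `split_chunks` is a hypothetical helper. It checks that joining neighbouring size-2 chunks rebuilds the size-4 chunks, then counts the most frequent feature at each level.

```python
from collections import Counter


def split_chunks(bits, size):
    """Non-overlapping chunks as tuples, so they can be counted (hypothetical helper)."""
    return [tuple(bits[i:i + size]) for i in range(0, len(bits), size)]


bits = [1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1]
pairs = split_chunks(bits, 2)
quads = split_chunks(bits, 4)

# Joining each two neighbouring size-2 chunks rebuilds the size-4 chunks.
rebuilt = [pairs[i] + pairs[i + 1] for i in range(0, len(pairs), 2)]
print(rebuilt == quads)               # True

# Most frequent feature at each chunk size.
print(Counter(pairs).most_common(1))  # [((1, 1), 6)]
print(Counter(quads).most_common(1))  # [((1, 1, 1, 1), 2)]
```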