### Thoughts on how to represent higher-precision numbers in logic gate networks
During training, the model uses real-valued (relaxed) logic that can handle floats. The bit width of the floating-point type used for the inputs can be configured to optimize training performance ([link](https://github.com/Felix-Petersen/difflogic/blob/469702c01ff0bfac9cdc6a395134252e11a56bd8/experiments/main.py#L286C9-L286C88)):
```
x = x.to(BITS_TO_TORCH_FLOATING_POINT_TYPE[args.training_bit_count]).to('cuda')
```
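To make the effect of this cast concrete, here is a minimal sketch. The dictionary `BITS_TO_DTYPE` is an assumed stand-in for `BITS_TO_TORCH_FLOATING_POINT_TYPE`; its name and contents are illustrative and not copied from the repository. It only shows that the setting selects a float dtype and leaves the values real-valued rather than binarizing them.

```python
import torch

# Assumed stand-in for BITS_TO_TORCH_FLOATING_POINT_TYPE (illustrative only).
BITS_TO_DTYPE = {16: torch.float16, 32: torch.float32, 64: torch.float64}

x = torch.tensor([0.0, 0.01, 0.5, 1.0])
for bits, dtype in BITS_TO_DTYPE.items():
    x_cast = x.to(dtype)
    # element_size() is in bytes, so multiply by 8 to get bits per value.
    print(bits, x_cast.dtype, x_cast.element_size() * 8)
```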
Compiled logic gate networks apply binary logic gates to binary inputs. Hence, all inputs are first cast to type `bool` ([link](https://github.com/Felix-Petersen/difflogic/blob/469702c01ff0bfac9cdc6a395134252e11a56bd8/experiments/main.py#L370)):
```
data = torch.nn.Flatten()(data).bool().numpy()
```
PyTorch's `bool()` maps any non-zero number to `True`. Hence, for MNIST, even very dim pixels with a greyscale value of 0.01 are mapped to `True`. This is acceptable because an MNIST digit can still be identified under such aggressive binarization: most pixels outside the drawn digit are exactly zero.

For CIFAR-10, on the other hand, much information would be lost with this approach. The original authors therefore apply the following transformation to the input data [link](https://github.com/Felix-Petersen/difflogic/blob/469702c01ff0bfac9cdc6a395134252e11a56bd8/experiments/main.py#L55C4-L63C11):
```
transform = lambda x: torch.cat([(x > (i + 1) / 4).float() for i in range(3)], dim=0)
```
This compares each float value against three thresholds (0.25, 0.5, and 0.75), producing three 0/1 values per input, which are then concatenated along the first (channel) axis. Note that these three bits distinguish only four intensity levels, not eight, because the bits are cumulative. Hence, the 3x32x32 float images are transformed into 9x32x32 tensors of zeros and ones, retaining at least coarse intensity information.
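The transform generalizes to any number of evenly spaced thresholds, often called a thermometer encoding because the bits switch on cumulatively as the value rises. The following is a minimal sketch; the helper name `thermometer_encode` and the parameter `num_thresholds` are hypothetical and not part of difflogic. With `num_thresholds=3` it uses the same thresholds as the lambda above.

```python
import torch


def thermometer_encode(x: torch.Tensor, num_thresholds: int = 3) -> torch.Tensor:
    """Compare x (shape (C, H, W), values in [0, 1]) against evenly spaced thresholds.

    Returns a float tensor of zeros and ones with shape (C * num_thresholds, H, W).
    """
    thresholds = [(i + 1) / (num_thresholds + 1) for i in range(num_thresholds)]
    return torch.cat([(x > t).float() for t in thresholds], dim=0)


pixel = torch.tensor([[[0.1, 0.3, 0.6, 0.9]]])  # one channel, one row, four pixels
encoded = thermometer_encode(pixel)
print(encoded.shape)        # torch.Size([3, 1, 4])
print(encoded[:, 0, :].T)   # one row of three threshold bits per pixel
```

The four pixels map to the bit patterns `000`, `100`, `110`, and `111`, so three thresholds resolve four intensity levels.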

This is very similar to what we independently came up with for CICADA, where we represented the 18x14 maps of 10-bit integers as an 18x14x10 tensor of booleans. Two differences arise. Firstly, we use logarithmic (power-of-two) intervals instead of linear ones, i.e. the binary digits of each integer, which makes the binary representation lossless. Secondly, we use the TensorFlow-typical (h, w, c) layout instead of the PyTorch-typical (c, h, w) layout.
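As a quick check of the losslessness claim, the sketch below decomposes random 10-bit integers into bits and reconstructs them exactly. The helpers `to_bits` and `from_bits` are hypothetical names written for this illustration, not the CICADA code. It also shows how a `permute` converts between the two layouts.

```python
import torch

NUM_BITS = 10  # bit width of the integer inputs in this sketch


def to_bits(x: torch.Tensor, num_bits: int = NUM_BITS) -> torch.Tensor:
    """Split non-negative integers into booleans, least significant bit first."""
    weights = 2 ** torch.arange(num_bits)       # 1, 2, 4, ...
    return (x.unsqueeze(-1) & weights) != 0     # new trailing axis of size num_bits


def from_bits(bits: torch.Tensor) -> torch.Tensor:
    """Inverse of to_bits: weight each bit by its power of two and sum."""
    weights = 2 ** torch.arange(bits.shape[-1])
    return (bits.long() * weights).sum(dim=-1)


ints = torch.randint(0, 2**NUM_BITS, (18, 14))
bits_hwc = to_bits(ints)                # shape (18, 14, 10), (h, w, c) layout
bits_chw = bits_hwc.permute(2, 0, 1)    # shape (10, 18, 14), (c, h, w) layout

# The round trip recovers every integer exactly.
assert torch.equal(from_bits(bits_hwc), ints)
```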

In [34]:
import torch

print("=== Simple Boolean Conversion (MNIST-style) ===")
mnist_like = torch.tensor([[0.0, 0.01], [0.5, 1.0]])
print(f"Original:\n{mnist_like}")
print(f"Bool:\n{mnist_like.bool()}")

print("\n=== Multi-threshold (CIFAR-10-style) ===")
# 3x2x2 RGB image
cifar_like = torch.tensor([[[0.1, 0.4], [0.7, 0.2]],  # R channel
                           [[0.6, 0.9], [0.3, 0.8]],  # G channel  
                           [[0.2, 0.5], [0.1, 0.7]]])  # B channel
print(f"Original RGB shape: {cifar_like.shape}")
# Apply 3 thresholds and concatenate along channel dimension
thresholded = torch.cat([(cifar_like > (i + 1) / 4).float() for i in range(3)], dim=0)
print(f"Multi-threshold shape: {thresholded.shape}")
print(f"Thresholded (9 channels):\n{thresholded}")

print("\n=== CICADA-style Bit Decomposition ===")
integers = torch.tensor([[5, 12], [3, 15]])  # 4-bit integers
print(f"Original integers: {integers}")
bit_decomposed = torch.stack([((integers // (2**i)) % 2).bool() for i in range(4)], dim=-1)
print(f"Bit decomposed shape: {bit_decomposed.shape}")
print(f"Bit decomposed:\n{bit_decomposed}")

=== Simple Boolean Conversion (MNIST-style) ===
Original:
tensor([[0.0000, 0.0100],
        [0.5000, 1.0000]])
Bool:
tensor([[False,  True],
        [ True,  True]])

=== Multi-threshold (CIFAR-10-style) ===
Original RGB shape: torch.Size([3, 2, 2])
Multi-threshold shape: torch.Size([9, 2, 2])
Thresholded (9 channels):
tensor([[[0., 1.],
         [1., 0.]],

        [[1., 1.],
         [1., 1.]],

        [[0., 1.],
         [0., 1.]],

        [[0., 0.],
         [1., 0.]],

        [[1., 1.],
         [0., 1.]],

        [[0., 0.],
         [0., 1.]],

        [[0., 0.],
         [0., 0.]],

        [[0., 1.],
         [0., 1.]],

        [[0., 0.],
         [0., 0.]]])

=== CICADA-style Bit Decomposition ===
Original integers: tensor([[ 5, 12],
        [ 3, 15]])
Bit decomposed shape: torch.Size([2, 2, 4])
Bit decomposed:
tensor([[[ True, False,  True, False],
         [False, False,  True,  True]],

        [[ True,  True, False, False],
         [ True,  True,  True,  True]]])


### What do `PackBitsTensor` and `num_bits` do then?
The class `CompiledLogicNet` takes an argument `num_bits` at initialization. I think this has sparked some confusion because it looks at first glance like the number of bits used to represent float inputs. That is not the case. Instead, `num_bits` and `PackBitsTensor` only control the batching of inputs. They are only used for performance optimization and have no effect on the precision of the inputs or predictions of the model!
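The general idea behind this kind of bit packing can be illustrated with plain NumPy. This is a sketch of the concept only, not the actual `PackBitsTensor` implementation, whose memory layout may differ. The boolean values of several samples are packed into one machine word, so a single bitwise instruction evaluates a gate for all of them at once, while each sample still contributes exactly one bit per input.

```python
import numpy as np

NUM_BITS = 8  # number of samples packed into one word in this sketch

rng = np.random.default_rng(0)
batch = rng.integers(0, 2, size=(NUM_BITS, 2)).astype(bool)  # 8 samples, 2 features

# Pack along the batch axis: each feature column becomes a single uint8 word.
packed = np.packbits(batch, axis=0)  # shape (1, 2)
a_word, b_word = packed[0, 0], packed[0, 1]

# One bitwise AND evaluates the gate for all 8 samples at once.
and_word = a_word & b_word

# Unpacking gives the same result as evaluating the gate sample by sample.
unpacked = np.unpackbits(np.array([and_word], dtype=np.uint8)).astype(bool)
expected = batch[:, 0] & batch[:, 1]
assert np.array_equal(unpacked, expected)
```

A wider word would process more samples per instruction, but the result for each individual sample stays the same.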

Some more code below to demonstrate this:

In [None]:
import numpy as np
import torch
import torchinfo

from neurodifflogic.difflogic.compiled_model import CompiledLogicNet
from neurodifflogic.models.difflog_layers.linear import GroupSum, LogicLayer


Let's create a super simple model (single AND gate) and verify it acts as expected:

In [None]:
layer = LogicLayer(in_dim=2, out_dim=1, connections="unique", implementation="python", device="cpu")
layer.weight.data = torch.zeros(1, 16)
layer.weight.data[0, 1] = 100  # Set weight for the AND operation (A*B)
model = torch.nn.Sequential(layer, GroupSum(1))

# binary AND truth table that should work in all cases
test_cases = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
for (x, y), expected in test_cases:
    assert np.isclose(model(torch.tensor([x, y])).item(), expected)

# this should only work for the non-compiled model, as it relies on real-valued logic
test_cases = [((0.5, 0.5), 0.25)]
for (x, y), expected in test_cases:
    assert np.isclose(model(torch.tensor([x, y])).item(), expected)

Okay, let's compile the model

In [49]:
model.train(False)

compiled_model = CompiledLogicNet(
    model=model, num_bits=8, cpu_compiler="gcc", verbose=True
)
compiled_model.compile(save_lib_path="minimal_example.so", verbose=False)

Found GroupSum layer with 1 classes
Parsed 0 conv, 0 pooling, 1 linear layers
Layer execution order: [('linear', 0)]
Compiling finished in 0.088 seconds.


We used `num_bits=8`, so the inputs must come in a multiple of 8 bits. Since our model takes two inputs, we use a tensor of shape (4, 2). Using only 2 examples or increasing to `num_bits=16` would fail.

In [50]:
X = torch.tensor([[0.0, 0.1],
                  [0.2, 0.3],
                  [0.4, 0.5],
                  [0.6, 0.7]])

binary_inputs = X.bool().numpy()

preds = model(X)
preds_compiled = compiled_model(binary_inputs)

print(f"X = \n{X}")
print(f"inputs = \n{binary_inputs}")
print(f"preds = \n{preds}")
print(f"preds_compiled = \n{preds_compiled}")

X = 
tensor([[0.0000, 0.1000],
        [0.2000, 0.3000],
        [0.4000, 0.5000],
        [0.6000, 0.7000]])
inputs = 
[[False  True]
 [ True  True]
 [ True  True]
 [ True  True]]
preds = 
tensor([[0.0000],
        [0.0600],
        [0.2000],
        [0.4200]])
preds_compiled = 
tensor([[0],
        [1],
        [1],
        [1]], dtype=torch.int32)


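This was the MNIST-like approach, in which a lot of information is lost: every non-zero input becomes `True`, so the last three samples are indistinguishable to the compiled model. Next, the inputs are encoded with two thresholds each, CIFAR-10-style: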

In [None]:
layer = LogicLayer(in_dim=4, out_dim=2, connections="unique", implementation="python", device="cpu")
model = torch.nn.Sequential(layer, GroupSum(1))

model.train(False)

compiled_model = CompiledLogicNet(
    model=model, num_bits=8, cpu_compiler="gcc", verbose=True
)
compiled_model.compile(save_lib_path="minimal_example.so", verbose=False)

# values chosen to match the recorded output below
X = torch.tensor([[0.0, 0.5],
                  [0.7, 0.9],
                  [0.7, 0.9],
                  [0.7, 0.9]])

# apply 2 thresholds (1/3 and 2/3) to each input and concatenate along the feature axis
thresholded = torch.cat([(X > (i + 1) / 3).float() for i in range(2)], dim=1)
print(thresholded)
binary_inputs = thresholded.bool().numpy()

preds = model(thresholded)
preds_compiled = compiled_model(binary_inputs)

print(f"X = \n{X}")
print(f"inputs = \n{binary_inputs}")
print(f"preds = \n{preds}")
print(f"preds_compiled = \n{preds_compiled}")


Found GroupSum layer with 1 classes
Parsed 0 conv, 0 pooling, 1 linear layers
Layer execution order: [('linear', 0)]
Compiling finished in 0.078 seconds.
tensor([[0., 1., 0., 0.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
X = 
tensor([[0.0000, 0.5000],
        [0.7000, 0.9000],
        [0.7000, 0.9000],
        [0.7000, 0.9000]])
inputs = 
[[False  True False False]
 [ True  True  True  True]
 [ True  True  True  True]
 [ True  True  True  True]]
preds = 
tensor([[2.],
        [1.],
        [1.],
        [1.]])
preds_compiled = 
tensor([[2],
        [1],
        [1],
        [1]], dtype=torch.int32)
