If you have a Cayley database, you can build a machine learning model for the following task:

Given a partially filled Cayley table of a semigroup, restore the full one.

Note that a partially filled table can sometimes be completed to a full associative table in several ways. We consider all such completions equally valid.
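To see that several completions can exist, one can enumerate them by brute force for a small puzzle. The following is a minimal sketch, not part of the package: the helper names `is_associative` and `associative_completions` are hypothetical, and hidden cells are assumed to be marked with `-1`.

```python
from itertools import product


def is_associative(table):
    """Check (x*y)*z == x*(y*z) for every triple of elements."""
    n = len(table)
    return all(
        table[table[x][y]][z] == table[x][table[y][z]]
        for x, y, z in product(range(n), repeat=3)
    )


def associative_completions(puzzle):
    """Enumerate all associative fillings of the cells marked with -1."""
    n = len(puzzle)
    hidden = [(i, j) for i in range(n) for j in range(n) if puzzle[i][j] == -1]
    completions = []
    # try every possible value in every hidden cell
    for values in product(range(n), repeat=len(hidden)):
        table = [list(row) for row in puzzle]
        for (i, j), value in zip(hidden, values):
            table[i][j] = value
        if is_associative(table):
            completions.append(table)
    return completions


puzzle = [
    [0, 0, 0, 0],
    [0, -1, -1, 0],
    [0, -1, -1, 2],
    [0, 0, 3, -1],
]
completions = associative_completions(puzzle)
print(len(completions))  # number of valid answers to this puzzle
```

Brute force is only feasible for tiny tables with few hidden cells (here $4^5$ candidates), which is one motivation for a learned model.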

In the `neural-semigroups` package, we use `torch` to build deep learning models.

First of all, we need some training and validation data.
In this example, we take semigroups of 4 elements and use 63 Cayley tables (each representing a different class of equivalent semigroups) as our training data and another 62 tables as validation data.
This is roughly a 50/50 split of all available tables of 4 elements (there are 126 of them up to equivalence).

Here we construct `torch` data loaders that feed the training pipeline with 512 tables at a time.
This number (the batch size) can be changed to fine-tune the model's quality.

In [1]:
from neural_semigroups.training_helpers import get_loaders

cardinality = 4
data_loaders = get_loaders(
    cardinality=cardinality,
    batch_size=512,
    train_size=63,
    validation_size=62
)

augmenting by equivalent tables: 100%|██████████| 63/63 [00:00<00:00, 715.61it/s]
generating train cubes: 100%|██████████| 1754/1754 [00:00<00:00, 60926.46it/s]
generating validation cubes: 100%|██████████| 62/62 [00:00<00:00, 50790.40it/s]
generating test cubes: 100%|██████████| 1/1 [00:00<00:00, 3463.50it/s]


Note that for the training set we:
* take 63 representatives of different equivalence classes
* augment the data by adding all equivalent tables
* as a result, train on 1754 tables from 63 equivalence classes

For validation we simply use 62 tables from different classes.
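The augmentation step can be illustrated with a short sketch. It assumes that two tables are equivalent when they are isomorphic or anti-isomorphic (renaming elements, optionally followed by transposition); the function name `equivalent_tables` is hypothetical and this is not the package's own implementation.

```python
from itertools import permutations


def equivalent_tables(table):
    """Collect all isomorphic and anti-isomorphic copies of a Cayley table."""
    n = len(table)
    result = set()
    for perm in permutations(range(n)):
        # rename every element x to perm[x]
        renamed = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                renamed[perm[i]][perm[j]] = perm[table[i][j]]
        # transposing swaps the order of multiplication
        transposed = [[renamed[j][i] for j in range(n)] for i in range(n)]
        result.add(tuple(map(tuple, renamed)))
        result.add(tuple(map(tuple, transposed)))
    return result


table = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 3, 3],
]
augmented = equivalent_tables(table)
print(len(augmented))  # at most 2 * 4! = 48 distinct tables
```

Tables with many symmetries produce fewer distinct copies, which is why the augmented training set is smaller than $63 \cdot 48$.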

We model each input Cayley table as a three-index tensor $a_{ijk}$ such that

$a_{ijk}=P\left\{e_ie_j=e_k\right\}$

where $e_i$ are elements of a semigroup.

In our training data, all $a_{ijk}$ are either zeros or ones, so the probability distributions involved are degenerate.
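As an illustration, this encoding is a one-hot expansion of the table along a third axis. The sketch below uses `numpy` and a hypothetical helper `table_to_cube`; the package may build its cubes differently.

```python
import numpy as np


def table_to_cube(table):
    """Return a cube with cube[i, j, k] = 1 if e_i * e_j = e_k, else 0."""
    table = np.asarray(table)
    n = table.shape[0]
    # indexing the identity matrix with the table one-hot encodes each cell
    return np.eye(n, dtype=np.float32)[table]


table = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 3, 3],
]
cube = table_to_cube(table)
print(cube.shape)  # (4, 4, 4)
```

Taking `cube.argmax(axis=-1)` recovers the original table.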

When we need to hide the cell with indices $i,j$ of an original Cayley table, we set

$a_{ijk}=\dfrac1n\quad\text{for all }k$

where $n$ is the semigroup's cardinality. Thus we set the probability distribution of the multiplication result $e_ie_j$ to the discrete uniform one.
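A rough sketch of this corruption step is given below. The helper names `hide_cells` and `corrupt` are hypothetical, and hiding each cell independently with a fixed probability is an assumption made for illustration; the package may select hidden cells in another way.

```python
import numpy as np


def hide_cells(cube, cells):
    """Replace the distribution of each listed cell (i, j) with a uniform one."""
    n = cube.shape[0]
    corrupted = cube.copy()
    for i, j in cells:
        corrupted[i, j, :] = 1.0 / n
    return corrupted


def corrupt(cube, rate, rng):
    """Hide each cell independently with probability `rate` (an assumption)."""
    n = cube.shape[0]
    mask = rng.random((n, n)) < rate
    cells = list(zip(*np.nonzero(mask)))
    return hide_cells(cube, cells), mask


table = np.array([
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 3, 3],
])
cube = np.eye(4, dtype=np.float32)[table]  # one-hot cube of the table
corrupted, mask = corrupt(cube, rate=0.5, rng=np.random.default_rng(0))
```

Every slice `corrupted[i, j]` remains a valid probability distribution, so hidden and known cells can be fed to the network in the same format.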

We choose a simple denoising autoencoder (DAE) as the architecture of our neural network. It takes an input tensor of zeros and ones, hides 50% of the input cells in the manner described above, and applies a linear transformation into a higher dimension ($n^8$, which is contrary to the common idea of autoencoders, which usually compress their input) followed by a simple `ReLU` non-linearity. Then another linear transformation with `ReLU` is applied to return to the original $n^3$ dimension. We also apply batch normalization here. See the package code for details.

In [2]:
from neural_semigroups import MagmaDAE
from neural_semigroups.constants import CURRENT_DEVICE

dae = MagmaDAE(
    cardinality=cardinality,
    hidden_dims=[
        cardinality ** 8
    ],
    corruption_rate=0.5
).to(CURRENT_DEVICE)

In [3]:
dae

MagmaDAE(
  (decoder_layers): Sequential(
    (linear10): Linear(in_features=65536, out_features=64, bias=True)
    (relu10): ReLU()
    (bn10): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (encoder_layers): Sequential(
    (linear00): Linear(in_features=64, out_features=65536, bias=True)
    (relu00): ReLU()
    (bn00): BatchNorm1d(65536, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
)

During the training process we try to minimize a special [associator loss](https://neural-semigroups.readthedocs.io/en/latest/package-documentation.html#associator-loss) on the output of the DAE.

In [4]:
import torch
from torch import Tensor
from neural_semigroups import AssociatorLoss

def loss(prediction: Tensor, target: Tensor) -> Tensor:
    # the loss depends on the prediction only; `target` is accepted but unused
    return AssociatorLoss()(prediction)
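To give some intuition for why such a loss needs no target, here is a rough `numpy` sketch of one way to measure how far a probabilistic cube is from being associative. It is not the package's `AssociatorLoss` (see the linked documentation for its definition); the function name `associator_gap`, the independence assumption, and the mean squared difference are all choices made for illustration.

```python
import numpy as np


def associator_gap(cube):
    """Mean squared gap between the two ways of bracketing a triple product."""
    # P{(e_i e_j) e_k = e_l} = sum_m a_ijm * a_mkl, treating products as independent
    left = np.einsum("ijm,mkl->ijkl", cube, cube)
    # P{e_i (e_j e_k) = e_l} = sum_m a_jkm * a_iml
    right = np.einsum("jkm,iml->ijkl", cube, cube)
    return float(np.mean((left - right) ** 2))


associative = np.array([
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 3, 3],
])
# a two-element magma where (0*0)*1 = 0 but 0*(0*1) = 1
non_associative = np.array([
    [1, 0],
    [0, 0],
])
gap_good = associator_gap(np.eye(4)[associative])
gap_bad = associator_gap(np.eye(2)[non_associative])
```

For a one-hot cube of an associative table the gap is exactly zero, and it is positive for the non-associative example.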

We use `pytorch-ignite` to write less boilerplate code for a training pipeline.

In [5]:
from ignite.engine import create_supervised_evaluator
from ignite.metrics.loss import Loss

evaluator = create_supervised_evaluator(
    dae,
    metrics={"loss": Loss(loss)}
)

Now it's time to run a pipeline! Here you can tune the learning schedule for better results.

You can construct your own pipeline if you don't want to import one provided by the package.

In the next three cells, we run `tensorboard` to show the training and validation curves during the training process.

In [6]:
%load_ext tensorboard

In [7]:
!rm -rf runs

In [8]:
%tensorboard --logdir runs --host 0.0.0.0

Reusing TensorBoard on port 6006 (pid 3152), started 2:07:13 ago. (Use '!kill 3152' to kill it.)

In [10]:
%%time
from neural_semigroups.training_helpers import learning_pipeline

params = {
    "learning_rate": 0.001,
    "epochs": 1000,
    "cardinality": cardinality
}
learning_pipeline(params, dae, evaluator, loss, data_loaders)

CPU times: user 1h 24min 30s, sys: 3min 49s, total: 1h 28min 19s
Wall time: 27min 58s


And here is the report of the results, which look quite impressive. To produce it, we took all 126 Cayley tables of 4 elements (one per equivalence class, as always) and constructed 'puzzles' from them.

The difficulty level of a puzzle is the number of hidden cells. A puzzle is considered solved if the model returns a full associative table.

We see that the model generalizes well (it was trained on only half of the equivalence classes).
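A puzzle of a given level could be produced along the following lines. This is a sketch under assumptions: the helper `make_puzzle` is hypothetical, hidden cells are chosen uniformly at random, and they are marked with `-1`, the convention that `fill_in_with_model` accepts in this example.

```python
import random


def make_puzzle(table, level, seed=None):
    """Hide `level` randomly chosen cells of a table by writing -1 into them."""
    n = len(table)
    rng = random.Random(seed)
    cells = [(i, j) for i in range(n) for j in range(n)]
    puzzle = [list(row) for row in table]
    for i, j in rng.sample(cells, level):
        puzzle[i][j] = -1
    return puzzle


table = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 3, 3],
]
puzzle = make_puzzle(table, level=5, seed=0)
```

The input table is left unchanged, so the same table can be reused for puzzles of every level.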

In [53]:
from neural_semigroups.utils import print_report
from neural_semigroups import CayleyDatabase

cayley_db = CayleyDatabase(cardinality)
cayley_db.load_model(f"semigroups.{cardinality}.model")
print_report(cayley_db.testing_report)

generating and solving puzzles: 100%|██████████| 126/126 [00:06<00:00, 18.41it/s]


| level | puzzles | solved | solved (%) | hidden cells | guessed | guessed (%) |
|---|---|---|---|---|---|---|
| 1 | 126 | 125 | 99 | 126 | 125 | 99 |
| 2 | 126 | 120 | 95 | 252 | 246 | 97 |
| 3 | 126 | 115 | 91 | 378 | 364 | 96 |
| 4 | 126 | 109 | 86 | 504 | 482 | 95 |
| 5 | 126 | 113 | 89 | 630 | 608 | 96 |
| 6 | 126 | 101 | 80 | 756 | 708 | 93 |
| 7 | 126 | 104 | 82 | 882 | 841 | 95 |
| 8 | 126 | 104 | 82 | 1008 | 948 | 94 |


Now let's see how it works on several example puzzles. Let's take one of the real tables from the database.

In [24]:
cayley_db.database[100]

array([[0, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 2, 2],
       [0, 0, 3, 3]])

Then we can replace some cells with `-1` to create a puzzle and give it to the model.

In [48]:
guess, proba = cayley_db.fill_in_with_model([
    [0, 0, 0, 0],
    [0, -1, -1, 0],
    [0, -1, -1, 2],
    [0, 0, 3, -1]
])

The model found a table that differs from the original one.

In [50]:
guess

array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 2, 2],
       [0, 0, 3, 3]])

But it is still a valid completion, since it is associative.

In [49]:
from neural_semigroups import Magma

Magma(guess).is_associative

True

The model also returns the probabilities behind its guess. They can be examined in cases where the model errs.
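One way to examine them is to look at the most probable value and its probability for each cell. The sketch below assumes that the guess is the per-cell argmax of the probabilities, which is consistent with the values in this example but is not confirmed by the package documentation; the two rows are copied from the `proba` array for the hidden cells $(1,1)$ and $(2,2)$.

```python
import numpy as np

# probability rows for two hidden cells of the puzzle
proba_cells = np.array([
    [8.3226544e-01, 1.4840166e-01, 1.0235688e-02, 9.0971785e-03],
    [1.7959351e-02, 1.3601107e-02, 9.5569611e-01, 1.2743457e-02],
])
guess_cells = proba_cells.argmax(axis=-1)  # most probable value per cell
confidence = proba_cells.max(axis=-1)      # probability of that value
least_confident = int(confidence.argmin())

print(guess_cells)      # [0 2]
print(least_confident)  # 0
```

Cells with low confidence are natural candidates to inspect first when a returned table turns out not to be associative.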

In [51]:
proba

array([[[9.9999702e-01, 1.0000000e-06, 1.0000000e-06, 1.0000000e-06],
        [9.9999702e-01, 1.0000000e-06, 1.0000000e-06, 1.0000000e-06],
        [9.9999702e-01, 1.0000000e-06, 1.0000000e-06, 1.0000000e-06],
        [9.9999702e-01, 1.0000000e-06, 1.0000000e-06, 1.0000000e-06]],

       [[9.9999702e-01, 1.0000000e-06, 1.0000000e-06, 1.0000000e-06],
        [8.3226544e-01, 1.4840166e-01, 1.0235688e-02, 9.0971785e-03],
        [9.0341359e-01, 3.7765019e-02, 2.2309702e-02, 3.6511686e-02],
        [9.9999702e-01, 1.0000000e-06, 1.0000000e-06, 1.0000000e-06]],

       [[9.9999702e-01, 1.0000000e-06, 1.0000000e-06, 1.0000000e-06],
        [9.2575294e-01, 3.8013570e-02, 2.7366873e-02, 8.8666305e-03],
        [1.7959351e-02, 1.3601107e-02, 9.5569611e-01, 1.2743457e-02],
        [1.0000000e-06, 1.0000000e-06, 9.9999702e-01, 1.0000000e-06]],

       [[9.9999702e-01, 1.0000000e-06, 1.0000000e-06, 1.0000000e-06],
        [9.9999702e-01, 1.0000000e-06, 1.0000000e-06, 1.0000000e-06],
        [1.000