<a href="https://colab.research.google.com/github/JurijZ/ai-zip/blob/main/recursive.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim

In [2]:
# 1. Data Preparation (100 words)
text = """
Artificial intelligence is a branch of computer science that aims to create intelligent machines.
It has become an essential part of the technology industry. Research associated with artificial
intelligence is highly technical and specialized. The core problems of artificial intelligence
include programming computers for certain traits such as knowledge reasoning problem solving
perception learning planning and the ability to move and manipulate objects. Knowledge engineering
is a core part of AI research. Machines can often act and react like humans only if they have
abundant information relating to the world. Artificial intelligence must have access to objects
categories properties and relations.
"""
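# Keep the first 100 whitespace-separated tokens. The text has 101, so the
# final word is dropped. Punctuation stays attached, so "machines." and
# "Machines" are distinct vocabulary entries.
words = text.split()[:100]
vocab = sorted(set(words))  # sorted for a deterministic index assignment
word_to_ix = {word: i for i, word in enumerate(vocab)}
ix_to_word = {i: word for i, word in enumerate(vocab)}
# Vocabulary index of the word at each position; these are the training targets.
target_indices = torch.tensor([word_to_ix[w] for w in words], dtype=torch.long)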
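
The two dictionaries built in this cell form a lossless round trip: mapping words to indices and back returns the original sequence. A minimal sketch on a toy sentence follows; the `toy_*` names and the sample string are assumptions made for illustration and are not part of the notebook's data.

```python
# Toy input, assumed for illustration only.
toy_words = "the core part of the core".split()

# Same construction as in the cell above, under separate names so the
# notebook's own variables are left untouched.
toy_vocab = sorted(set(toy_words))
toy_word_to_ix = {word: i for i, word in enumerate(toy_vocab)}
toy_ix_to_word = {i: word for i, word in enumerate(toy_vocab)}

encoded = [toy_word_to_ix[w] for w in toy_words]
decoded = [toy_ix_to_word[i] for i in encoded]

print(toy_vocab)             # ['core', 'of', 'part', 'the']
print(encoded)               # [3, 0, 2, 1, 3, 0]
print(decoded == toy_words)  # True
```

Repeated words share one index, so the vocabulary is smaller than the sequence, and the model only has to predict a class per position.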

In [3]:



# 2. Recursive Transformer Model
class RecursiveTransformerCompressor(nn.Module):
    def __init__(self, vocab_size, seq_len, d_model, nhead, num_iterations):
        super().__init__()
        self.num_iterations = num_iterations

        # This embedding acts as the "address" or "query" for the memory
        self.pos_embedding = nn.Embedding(seq_len, d_model)

        # The recursive block (shared weights)
        self.transformer_layer = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=nhead,
            dim_feedforward=d_model * 2,
            batch_first=True
        )

        self.output_layer = nn.Linear(d_model, vocab_size)

    def forward(self):
        # We don't pass data in; we generate from the internal position weights
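        # Use the embedding table's own size rather than a hard-coded 100,
        # so the model works for any seq_len passed to __init__.
        positions = torch.arange(
            self.pos_embedding.num_embeddings,
            device=self.pos_embedding.weight.device,
        )
        x = self.pos_embedding(positions).unsqueeze(0)  # (1, seq_len, d_model)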

        # Recursion: The same weights are used num_iterations times
        for _ in range(self.num_iterations):
            x = self.transformer_layer(x)

        return self.output_layer(x).squeeze(0)

In [4]:
# 3. Hyperparameters for Compression
# A small d_model keeps the parameter count low; the trained weights are the stored representation.
D_MODEL = 64
N_HEAD = 4
ITERATIONS = 5 # How many times to loop through the same weights
VOCAB_SIZE = len(vocab)

model = RecursiveTransformerCompressor(VOCAB_SIZE, len(words), D_MODEL, N_HEAD, ITERATIONS)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

In [5]:
# 4. Training Loop (Overfitting/Memorizing)
print(f"Storing information in {sum(p.numel() for p in model.parameters())} recursive parameters...")

for epoch in range(1201):
    optimizer.zero_grad()
    output = model()
    loss = criterion(output, target_indices)
    loss.backward()
    optimizer.step()

    if epoch % 300 == 0:
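        predicted = output.argmax(dim=1)
        # Fraction of positions predicted correctly (a raw count only equals a
        # percentage when the sequence happens to be 100 words long).
        accuracy = (predicted == target_indices).float().mean().item()
        print(f"Epoch {epoch:4d} | Loss: {loss.item():.4f} | Accuracy: {accuracy:.0%}")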



Storing information in 44877 recursive parameters...
Epoch    0 | Loss: 4.5100 | Accuracy: 2%
Epoch  300 | Loss: 0.0167 | Accuracy: 100%
Epoch  600 | Loss: 0.0051 | Accuracy: 100%
Epoch  900 | Loss: 0.0024 | Accuracy: 100%
Epoch 1200 | Loss: 0.0015 | Accuracy: 100%
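
The 44877 total printed above can be reproduced by hand. The sketch below assumes the default parameter layout of `nn.TransformerEncoderLayer` (a fused Q/K/V projection, an output projection, two feed-forward linears, and two LayerNorms, all with biases) and a vocabulary of 77 unique tokens. The notebook never prints the vocabulary size; 77 is inferred from the reported total. `count_parameters` is a hypothetical helper, not part of the notebook.

```python
def count_parameters(vocab_size, seq_len, d_model, dim_feedforward):
    """Hand count of trainable parameters for the model defined above.

    Assumes the default nn.TransformerEncoderLayer layout with biases.
    The number of heads and the number of iterations do not appear:
    heads split d_model without adding weights, and the loop reuses
    the same layer.
    """
    pos_embedding = seq_len * d_model
    attn_in_proj = 3 * d_model * d_model + 3 * d_model  # fused Q, K, V
    attn_out_proj = d_model * d_model + d_model
    ff_linear1 = d_model * dim_feedforward + dim_feedforward
    ff_linear2 = dim_feedforward * d_model + d_model
    layer_norms = 2 * (2 * d_model)  # weight and bias for each of two norms
    output_layer = d_model * vocab_size + vocab_size
    return (pos_embedding + attn_in_proj + attn_out_proj
            + ff_linear1 + ff_linear2 + layer_norms + output_layer)


# vocab_size=77 is an assumption inferred from the printed total.
total = count_parameters(vocab_size=77, seq_len=100, d_model=64, dim_feedforward=128)
print(total)  # 44877
```

Because the loop reuses one layer, raising `ITERATIONS` adds compute but leaves this count unchanged.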


In [6]:
# 5. Restoration
model.eval()
with torch.no_grad():
    final_output = model()
    restored_indices = torch.argmax(final_output, dim=1)
    restored_text = [ix_to_word[idx.item()] for idx in restored_indices]

print("\n--- Restored Text ---")
print(" ".join(restored_text))


--- Restored Text ---
Artificial intelligence is a branch of computer science that aims to create intelligent machines. It has become an essential part of the technology industry. Research associated with artificial intelligence is highly technical and specialized. The core problems of artificial intelligence include programming computers for certain traits such as knowledge reasoning problem solving perception learning planning and the ability to move and manipulate objects. Knowledge engineering is a core part of AI research. Machines can often act and react like humans only if they have abundant information relating to the world. Artificial intelligence must have access to objects categories properties and
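
The printed text stops at "properties and" because only the first 100 of the source's 101 tokens were stored. Reading the output by eye is error-prone, so a position-by-position comparison is a more reliable check. The sketch below uses a hypothetical helper, `count_mismatches`, on toy lists; in the notebook the arguments would be `words` and `restored_text`.

```python
def count_mismatches(original, restored):
    """Return (position, expected, got) for every position that differs.

    Assumes both lists have the same length; zip() would silently
    ignore any extra trailing items in the longer one.
    """
    return [
        (i, expected, got)
        for i, (expected, got) in enumerate(zip(original, restored))
        if expected != got
    ]


# Toy data, assumed for illustration only.
original = "Knowledge engineering is a core part of AI research.".split()
restored = "Knowledge engineering is the core part of AI research.".split()

print(count_mismatches(original, original))  # []
print(count_mismatches(original, restored))  # [(3, 'a', 'the')]
```

An empty list means every position was restored exactly.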
