MrPan2048/GeometricTransformer

⚖️ Seeking an Endorsement to Publish a Formal Paper on This Architecture

I am seeking an endorsement to submit a formal paper on this architecture to arXiv (cs.LG, Machine Learning). I believe these results demonstrate that "Living Cell" resonance models can outperform Attention in parameter-constrained and latency-sensitive environments. If you are a researcher willing to endorse or to review the mathematical formalization, please contact me via GitHub.

⚔️ Takeover Battle: Baseline Summary

The Experiment: A competition between Hierarchical Depth and Parallel Resonance.

1. STD (Standard Transformer)

  • Strategy: Deep Logic.
  • Structure: A 9-layer vertical stack of Transformer blocks.
  • Mechanism: Relies on sequential Self-Attention to "reason" through text patterns.
  • Trade-off: High computational cost on CPU due to layer depth ($O(L \cdot N^2)$).

2. GEO (Geometric Manifold)

  • Strategy: Broad Resonance.
  • Structure: A single-layer manifold with 6 parallel competitive cells.
  • Mechanism:
    • Bio-Pulse: Uses Sine-wave activations (torch.sin) to mimic neural firing.
    • Inhibition: Parallel cells compete via an inhibition matrix to reduce redundancy.
    • Prototypes: High-dimensional similarity routing instead of fixed sequential paths.
  • Trade-off: Extremely fast; replaces serial depth with spatial width and rhythm detection (see the sketch below).
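
As a rough illustration, a single-layer manifold of competing sine-activated cells might look like the sketch below. All names (ResonanceManifold, prototypes, inhibition), shapes, and hyperparameters are assumptions made for illustration; the repository's actual implementation may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResonanceManifold(nn.Module):
    """Sketch of a single-layer GEO manifold: parallel competitive
    cells with sine-pulse activations, an inhibition matrix, and
    prototype-based similarity routing. Hypothetical, not repo code."""

    def __init__(self, d_model: int = 128, n_cells: int = 6):
        super().__init__()
        # One prototype per cell: tokens route to cells by similarity.
        self.prototypes = nn.Parameter(torch.randn(n_cells, d_model))
        # Each cell is a small linear "body".
        self.cells = nn.ModuleList(
            nn.Linear(d_model, d_model) for _ in range(n_cells)
        )
        # Learned cross-cell inhibition to reduce redundant routing.
        self.inhibition = nn.Parameter(torch.zeros(n_cells, n_cells))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        sim = F.normalize(x, dim=-1) @ F.normalize(self.prototypes, dim=-1).T
        # Competitive inhibition: rival cells suppress each other's scores.
        sim = sim - sim @ F.softplus(self.inhibition)
        route = sim.softmax(dim=-1)              # (batch, seq, n_cells)
        # Bio-pulse: sine-wave activation instead of GELU.
        outs = torch.stack([torch.sin(c(x)) for c in self.cells], dim=-1)
        # Weighted sum of cell outputs per token.
        return (outs * route.unsqueeze(-2)).sum(dim=-1)
```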

📊 Structural Comparison

| Feature    | STD (Standard)  | GEO (Geometric)           |
|------------|-----------------|---------------------------|
| Philosophy | Deep Sequential | Broad Parallel            |
| Logic      | Self-Attention  | Competitive Resonance     |
| Activation | GELU (Standard) | Sine-Pulse (Bio-inspired) |
| CPU Speed  | Slower (Serial) | Faster (Parallel)         |

Goal: Determine whether a broad, competitive manifold can achieve a lower loss than a traditional deep stack by treating language as a resonant rhythm rather than a logical sequence.

Results: see the plots in the repository.

⚔️ ByteFight

The Experiment: A head-to-head architectural battle between two "brains" processing the same raw byte stream.

1. STD (Standard Transformer)

  • Philosophy: Deep Logic.
  • Mechanism: Global Attention, in which every byte attends to every other byte.
  • Performance: Highly capable but computationally heavy. It suffers from Quadratic Complexity ($O(N^2)$), leading to higher CPU latency (150 ms–240 ms).

2. SGR (Sovereign)

  • Philosophy: Broad Manifold.
  • Mechanism: Local Convolution + Parallel Expert Cells. It mimics biological local connectivity.
  • Performance: Highly efficient. It operates with Linear Complexity ($O(N)$), running consistently about 3x faster (~55 ms) than the standard model; a sketch of such a block follows below.
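
A minimal sketch of the local-convolution-plus-experts idea, assuming a depthwise Conv1d for the local window and a soft gate over parallel expert cells. SGRBlock and every parameter here are illustrative guesses, not the repository's code. Because each position touches only a fixed-size window and a constant number of experts, cost grows linearly with sequence length.

```python
import torch
import torch.nn as nn

class SGRBlock(nn.Module):
    """Sketch of an SGR-style block: depthwise local convolution plus
    gated parallel expert cells. Hypothetical, not repo code."""

    def __init__(self, d_model: int = 128, n_experts: int = 4, kernel: int = 5):
        super().__init__()
        # Depthwise conv: each position sees only a local window -> O(N).
        self.local = nn.Conv1d(
            d_model, d_model, kernel, padding=kernel // 2, groups=d_model
        )
        self.experts = nn.ModuleList(
            nn.Linear(d_model, d_model) for _ in range(n_experts)
        )
        self.gate = nn.Linear(d_model, n_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); Conv1d expects (batch, channels, seq).
        x = x + self.local(x.transpose(1, 2)).transpose(1, 2)
        # Soft routing of every position over the parallel expert cells.
        w = self.gate(x).softmax(dim=-1)          # (batch, seq, n_experts)
        outs = torch.stack([e(x) for e in self.experts], dim=-1)
        return x + (outs * w.unsqueeze(-2)).sum(dim=-1)
```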

🏆 The Result

The SGR (Sovereign) model is currently winning on speed and hardware efficiency. It proves that for byte-level logic, a wide, parallel spatial map can outperform a deep, sequential stack while using significantly fewer computational resources.

Results: see the plots in the repository.

🧬 Sovereign Geometric Routing (SGR): The "Living Cell" Alternative

1. The Core Philosophy

Traditional AI architectures (Transformers) act as a "Black Box": a brain with no inherent structure, relying on a massive soup of statistical signals to compute global attention ($O(N^2)$). This is non-biological and computationally wasteful.

SGR (Sovereign Geometric Routing) proposes the Living Cell theory:

  • The Soma: Every token embedding is a physical neuron body fixed in a high-dimensional territory.
  • The Pulse: Each cell maintains an internal temporal memory (the "path").
  • Synaptic Recruitment: Instead of firing every connection, the cell performs Calculated Recruitment, activating only the specific synapses it needs to reach the next logical state (illustrated in the sketch below).
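
One way to read "Calculated Recruitment" in code: score a bank of candidate synapses against the current state, let only the top-k fire, and feed the recruited signal into the cell's temporal memory. RecruitingCell and every detail below are hypothetical, sketched purely from the description above.

```python
import torch
import torch.nn as nn

class RecruitingCell(nn.Module):
    """Sketch of Calculated Recruitment: only the k most relevant
    synapses fire per step, updating the cell's temporal memory.
    Hypothetical, not repo code."""

    def __init__(self, d_model: int = 128, n_synapses: int = 64, k: int = 8):
        super().__init__()
        self.k = k
        # A bank of candidate synapse directions.
        self.synapses = nn.Parameter(torch.randn(n_synapses, d_model))
        # Internal temporal memory (the "pulse" / "path").
        self.memory = nn.GRUCell(d_model, d_model)

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # x, h: (batch, d_model). Score all synapses, recruit the top-k.
        scores = x @ self.synapses.T              # (batch, n_synapses)
        vals, idx = scores.topk(self.k, dim=-1)
        weights = vals.softmax(dim=-1)            # (batch, k)
        recruited = self.synapses[idx]            # (batch, k, d_model)
        # Only the recruited synapses drive the state update.
        drive = (weights.unsqueeze(-1) * recruited).sum(dim=1)
        return self.memory(drive, h)              # next hidden state
```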

Results: see the plots in the repository.
