A drop-in PyTorch module from Ensemble AI that replaces the standard linear layer in neural networks.
Ensemble Platform
Want smaller, faster models without accuracy loss? Tried pruning or quantization and hit a wall?
👉 Try the full Ensemble platform → (10M free credits on signup)
NdLinear is a PyTorch module that replaces the standard linear layer in neural networks.
- ✅ Plug-and-play replacement for `nn.Linear`
- 📦 Lightweight, parameter-efficient
- 🧠 Preserves multivariate structure natively
Example: a 130M-parameter DiT model using NdLinear outperformed a 457M-parameter baseline on the FID benchmark for ImageNet100.
Upload any model – get back a smaller, faster version.
No accuracy loss.
- 🔁 Automatically swaps layers and tunes hyperparams
- 📉 Shrinks model size (parameter count) by up to 8x
- 🛠 Tailor uploaded models to your hardware & finetuning constraints
- 🧰 Export to ONNX, TensorRT, SNPE, and more
- 💡 Designed to work alongside other compression techniques (pruning, quantization, distillation)
- 🎁 Includes 10M free credits on signup
👉 Try the Ensemble Platform for free →
📺 Or see a demo →
NdLinear preserves the multi-dimensional structure of data, enhancing representational power with fewer parameters.
Rather than flattening tensors, it transforms them across a structured set of vector spaces, capturing dependencies that standard fully connected layers discard. A minimal sketch of this idea follows the list below.
- Structure Preservation: Retains the original data format and shape.
- Parameter Efficiency: Reduces parameter count while improving performance.
- Minimal Overhead: Maintains the same complexity as conventional linear layers.
- Flexible Integration: Seamlessly replaces existing linear layers.
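As a concrete illustration (a minimal sketch under simplifying assumptions, not the library's implementation; the class name `FactorizedLinear` is hypothetical): each tensor axis gets its own small weight matrix and is transformed in place, so the tensor is never flattened.

```python
import torch
import torch.nn as nn

class FactorizedLinear(nn.Module):
    """Illustrative only: one weight matrix per input axis,
    applied mode-wise so the tensor is never flattened."""
    def __init__(self, input_dims, hidden_size):
        super().__init__()
        self.weights = nn.ParameterList([
            nn.Parameter(torch.randn(d_in, d_out) / d_in ** 0.5)
            for d_in, d_out in zip(input_dims, hidden_size)
        ])

    def forward(self, x):  # x: (batch, *input_dims)
        for i, w in enumerate(self.weights):
            x = torch.movedim(x, i + 1, -1)  # bring axis i+1 to the end
            x = x @ w                        # transform just that axis
            x = torch.movedim(x, -1, i + 1)  # move it back into place
        return x

# Mapping (28, 28, 3) -> (64, 64, 6) costs 28*64 + 28*64 + 3*6 = 3,602
# weights here, versus 28*28*3 * 64*64*6 ≈ 57.8M for a flattened nn.Linear.
layer = FactorizedLinear(input_dims=(28, 28, 3), hidden_size=(64, 64, 6))
out = layer(torch.randn(32, 28, 28, 3))  # out.shape == (32, 64, 64, 6)
```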
To integrate NdLinear into your projects, clone the repository and install the package:

```bash
git clone https://github.com/ensemble-core/NdLinear.git
cd NdLinear
pip install .
```

Alternatively, install the published package via pip:

```bash
pip install ndlinear
```

Or via conda:

```bash
conda install conda-forge::ndlinear
```
NdLinear can be used in various neural network architectures such as CNNs, RNNs, and Transformers.
```python
import torch
from ndlinear import NdLinear

input_tensor = torch.randn(32, 28, 28, 3)  # batch of images: (batch, height, width, channels)
ndlinear_layer = NdLinear(input_dims=(28, 28, 3), hidden_size=(64, 64, 6))
output = ndlinear_layer(input_tensor)  # shape: (32, 64, 64, 6)
```
In transformer architectures, you might need to manipulate multi-dimensional tensors for efficient linear operations. Here's how you can use NdLinear with a 3D input tensor:
```python
import torch
from ndlinear import NdLinear

input_tensor = torch.randn(32, 28, 28)  # shape: (batch_size, num_tokens, token_dim)

# Reshape the input tensor for token-wise linear operations
input_tensor = input_tensor.reshape(-1, 28, 1)  # new shape: (batch_size * num_tokens, token_dim, 1)

# Define an NdLinear layer with suitable input and hidden dimensions
ndlinear_layer = NdLinear(input_dims=(28, 1), hidden_size=(32, 1))

# Perform the linear transformation
output = ndlinear_layer(input_tensor)  # shape: (batch_size * num_tokens, 32, 1)

# Reshape back to the original batch layout after processing
output = output.reshape(32, 28, -1)  # final output shape: (32, 28, 32)
```
This example shows how NdLinear can be integrated into transformer models by reshaping tensors, preserving the structure needed for further processing while keeping projections efficient.

The next example demonstrates how to use NdLinear layers in a forward pass, making integration into existing MLP structures simple and efficient.
```python
import torch
import torch.nn.functional as F
from ndlinear import NdLinear

input_tensor = torch.randn(32, 128, 8)  # shape: (batch, 128, 8), matching input_dims below

# First NdLinear layer: maps (128, 8) -> (64, 8)
layer1 = NdLinear(input_dims=(128, 8), hidden_size=(64, 8))

# Second NdLinear layer: maps (64, 8) -> (10, 2)
layer2 = NdLinear(input_dims=(64, 8), hidden_size=(10, 2))

x = F.relu(layer1(input_tensor))
output = layer2(x)  # shape: (32, 10, 2)
```
When `input_dims` and `hidden_size` are one-dimensional, NdLinear functions as a conventional `nn.Linear` layer; this is the edge case where n = 1.
```python
from ndlinear import NdLinear

# With one-dimensional input and hidden sizes, NdLinear reduces to a standard linear map
layer1 = NdLinear(input_dims=(32,), hidden_size=(64,))
```
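As a quick sanity check (illustrative; the two layers are initialized independently, so they agree in output shape, not in values):

```python
import torch
import torch.nn as nn
from ndlinear import NdLinear

x = torch.randn(16, 32)
nd_out = NdLinear(input_dims=(32,), hidden_size=(64,))(x)
ref_out = nn.Linear(32, 64)(x)
assert nd_out.shape == ref_out.shape == (16, 64)  # same mapping shape as nn.Linear
```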
NdLinearGated extends NdLinear with learned gating mechanisms that control information flow, allowing models to selectively transform input dimensions for greater representational power and efficiency; a conceptual sketch follows the list below.
- Selective Information Flow: Dynamic gating mechanisms that control which transformations are applied
- Multiple Gating Modes: Support for soft (continuous) and hard (binary) gating approaches
- Dimension Selection: Apply gating to all dimensions, only the first dimension, or automatically to the most important dimensions
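As a rough illustration of the gating idea (a hypothetical sketch, not NdLinearGated's actual internals): a small network scores the input and blends the transformed path with a skip path; hard gating would round the score to 0 or 1 instead.

```python
import torch
import torch.nn as nn

class SoftGate(nn.Module):
    """Hypothetical sketch: blend a transformed path with a skip path
    using a learned score in (0, 1). Hard gating would binarize the score."""
    def __init__(self, dim, gating_hidden_dim=16):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(dim, gating_hidden_dim),
            nn.ReLU(),
            nn.Linear(gating_hidden_dim, 1),
            nn.Sigmoid(),
        )

    def forward(self, x, transformed):
        g = self.score(x)                     # (batch, 1), values in (0, 1)
        return g * transformed + (1 - g) * x  # soft mix of the two paths
```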
NdLinearGated can be integrated into neural networks with fine-grained control over the gating behavior:
```python
import torch
from ndlinear import NdLinearGated

# Create input tensor
input_tensor = torch.randn(32, 28, 28, 3)  # batch of images

# Initialize NdLinearGated with soft gating on all dimensions
gated_layer = NdLinearGated(
    input_dims=(28, 28, 3),
    hidden_size=(64, 64, 6),
    gating_mode="soft",
    gated_modes="all",
)

# Forward pass with gating
output = gated_layer(input_tensor)
```
NdLinearGated offers various configurations to suit different modeling needs:
```python
from ndlinear import NdLinearGated

# Apply hard gating only to the first dimension
first_dim_gated = NdLinearGated(
    input_dims=(128, 8),
    hidden_size=(64, 8),
    gating_mode="hard",
    gated_modes="first",
)

# Apply soft gating to the top-k dimensions with the highest standard deviation
topk_gated = NdLinearGated(
    input_dims=(28, 28, 3),
    hidden_size=(64, 64, 6),
    gating_mode="soft",
    gated_modes="topk",
)
```
Based on extensive experimentation, we recommend the following configuration for optimal performance:
```python
from ndlinear import NdLinearGated

optimal_gated = NdLinearGated(
    input_dims=(28, 28, 3),
    hidden_size=(64, 64, 6),
    gating_mode="soft",
    gated_modes="topk",
    gating_hidden_dim=16,  # adjust based on your model size
)
```
This configuration (soft gating + top-k modes) consistently delivers:
- Higher Accuracy: Improves performance by ~0.5-0.7% over baseline NdLinear
- Compute Efficiency: Reduces computational load by 50-75% by activating only the most useful projections
- Training Stability: Shows stable training and smooth gate entropy decay
- Enhanced Interpretability: Provides clearer insights into which dimensions are most important
The soft gating mechanism offers better stability than hard gating, while top-k mode selection focuses computational resources on the most informative dimensions. The `gating_hidden_dim` parameter can be adjusted based on your specific model architecture and dataset requirements.
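To illustrate what "top-k dimensions with the highest standard deviation" could mean in practice (a hypothetical computation; the library may rank dimensions differently):

```python
import torch

x = torch.randn(32, 28, 28, 3)  # (batch, d1, d2, d3)
k = 2

# Score each non-batch axis: flatten everything else, take the
# per-position std along that axis, and average into one scalar score.
axis_scores = torch.stack([
    x.movedim(axis, -1).reshape(-1, x.shape[axis]).std(dim=0).mean()
    for axis in range(1, x.dim())
])
topk_axes = torch.topk(axis_scores, k).indices  # axes to gate (offset by the batch dim)
```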
NdLinear is versatile and can be used in:
- Image Classification: run `cnn_img_classification.py`.
- Time Series Forecasting: use `ts_forecast.py`.
- Text Classification: launch `txt_classify_bert.py`.
- Vision Transformers: execute `vit_distill.py`.
Join the community and enhance your projects with NdLinear on Hugging Face, Kaggle, and GitHub.
Join our Discord! https://discord.gg/6DWHusWN
For questions or collaborations, please contact Alex Reneau.
This project is distributed under the Apache 2.0 license. View the LICENSE file for more details.