# 🧬 Iseer Architecture Training

**Mamba SSM + Mixture of Experts, From Scratch**

Built by Iseer & Co.

---

⚠️ **Before running:** Go to `Runtime > Change runtime type > GPU (T4)`

## 1️⃣ Setup

In [1]:
# Check GPU
!nvidia-smi

import torch
print(f"\nPyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

/bin/bash: line 1: nvidia-smi: command not found

PyTorch: 2.9.0+cu126
CUDA available: False

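The output above reports no CUDA device, which means the runtime was not switched to a GPU. A minimal sketch of a fallback follows. It assumes a hypothetical helper, `pick_device`, that is not part of the Iseer repository, and it assumes float16 is only wanted when a GPU is present:

```python
import importlib.util


def pick_device() -> tuple[str, str]:
    """Return (device, dtype name): float16 on CUDA, float32 otherwise."""
    # Guarded import so the helper also runs where PyTorch is not installed.
    if importlib.util.find_spec("torch") is None:
        return "cpu", "float32"
    import torch

    if torch.cuda.is_available():
        return "cuda", "float16"
    return "cpu", "float32"


device, dtype_name = pick_device()
print(f"Using device={device}, dtype={dtype_name}")
```

The returned dtype name could then be passed on to the training configuration instead of hard-coding `"float16"`.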

In [2]:
# Install dependencies
!pip install -q einops wandb datasets

In [3]:
# Clone the Iseer repository
!git clone https://github.com/InanXR/IseerArchitecture.git
%cd IseerArchitecture

Cloning into 'IseerArchitecture'...
fatal: could not read Username for 'https://github.com': No such device or address
[Errno 2] No such file or directory: 'IseerArchitecture'
/content

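The clone above failed, so `%cd` left the notebook in `/content` and the later `from iseer...` imports cannot resolve. A small guard can catch this early. It assumes the repository has an `iseer/` package directory at its root, as the imports in this notebook suggest; `repo_ready` is a hypothetical helper:

```python
from pathlib import Path


def repo_ready(repo_dir: str, package: str = "iseer") -> bool:
    """True if repo_dir contains the expected package directory."""
    return (Path(repo_dir) / package).is_dir()


# After a successful clone and %cd, the current directory is the repository root.
if not repo_ready("."):
    print("Package directory 'iseer' not found; check that the clone succeeded.")
```

Running this right after the clone cell gives a clearer message than the `ModuleNotFoundError` raised later.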

## 2️⃣ Load Model & Tokenizer

In [4]:
import sys
sys.path.insert(0, '.')

from iseer.model.config import ISEER_SM, ISEER_MD
from iseer.model.iseer import Iseer
from iseer.tokenizer.bpe import BPETokenizer

# Load tokenizer
tokenizer = BPETokenizer.load('iseer/tokenizer/vocab.json')
print(f"Vocabulary size: {len(tokenizer):,}")

# Create model (copy the preset so the shared ISEER_SM object is not mutated)
import copy
config = copy.deepcopy(ISEER_SM)
config.vocab_size = len(tokenizer)
model = Iseer(config)

total, active = model.count_parameters()
print(f"Total params: {total:,}")
print(f"Active params: {active:,}")

ModuleNotFoundError: No module named 'iseer'

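`count_parameters()` returns a total and an active count. In a Mixture-of-Experts model these differ because each token is routed through only some of the experts. The arithmetic can be sketched as follows, assuming top-k routing over equally sized experts. The function name and every number below are illustrative and are not taken from the Iseer code:

```python
def moe_param_counts(
    shared: int,
    per_expert: int,
    num_experts: int,
    top_k: int,
    num_layers: int = 1,
) -> tuple[int, int]:
    """Return (total, active) parameter counts for a simple MoE stack.

    shared:     parameters every token uses (embeddings, SSM blocks, routers)
    per_expert: parameters in one expert of one layer
    """
    if not 1 <= top_k <= num_experts:
        raise ValueError("top_k must be between 1 and num_experts")
    total = shared + num_layers * num_experts * per_expert
    active = shared + num_layers * top_k * per_expert
    return total, active


# Illustrative numbers only: 10M shared, 4 layers, 8 experts of 1M each, 2 active.
total, active = moe_param_counts(
    10_000_000, 1_000_000, num_experts=8, top_k=2, num_layers=4
)
print(f"Total params: {total:,}")    # 42,000,000
print(f"Active params: {active:,}")  # 18,000,000
```

Memory use follows the total count, while per-token compute follows the active count.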
## 3️⃣ Load Training Data

In [None]:
# Option A: Load from HuggingFace
from datasets import load_dataset

# Bengali data
print("Loading Bengali data...")
bn_data = load_dataset("cc100", lang="bn", split="train", streaming=True)
# take() ends the stream after 25,000 items; an index filter would scan the whole corpus
bn_texts = [item['text'][:2000] for item in bn_data.take(25000)]
print(f"  Loaded {len(bn_texts):,} Bengali texts")

# English data
print("Loading English data...")
en_data = load_dataset("cc100", lang="en", split="train", streaming=True)
# take() ends the stream after 25,000 items; an index filter would scan the whole corpus
en_texts = [item['text'][:2000] for item in en_data.take(25000)]
print(f"  Loaded {len(en_texts):,} English texts")

texts = bn_texts + en_texts
print(f"\nTotal: {len(texts):,} texts")

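With `streaming=True` the dataset is an iterator over a very large corpus, so how the first 25,000 items are taken matters. A filter such as `if i < n` inside a comprehension discards later items but still pulls every one of them from the source, whereas `itertools.islice` stops after `n`. A minimal sketch with a stand-in generator (`fake_stream` is hypothetical, not a Hugging Face object):

```python
from itertools import islice


def fake_stream(limit, counter):
    """Stand-in for a streamed dataset; counts how many items are pulled."""
    for i in range(limit):
        counter["pulled"] += 1
        yield {"text": f"document {i} " * 10}


def take_texts(stream, n, max_chars=2000):
    """Collect the first n texts, each truncated to max_chars characters."""
    return [item["text"][:max_chars] for item in islice(stream, n)]


# Filtering by index still consumes the whole stream.
filtered_counter = {"pulled": 0}
filtered = [
    x["text"]
    for i, x in enumerate(fake_stream(1000, filtered_counter))
    if i < 5
]

# islice stops pulling once n items have been collected.
sliced_counter = {"pulled": 0}
sliced = take_texts(fake_stream(1000, sliced_counter), 5)

print(filtered_counter["pulled"], sliced_counter["pulled"])  # 1000 5
```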
In [None]:
# Create DataLoader
from iseer.data.dataset import create_dataloader

train_loader = create_dataloader(
    texts=texts,
    tokenizer=tokenizer,
    batch_size=8,
    seq_len=512,
)

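`create_dataloader` receives raw texts together with `seq_len=512`, so the token streams have to become fixed-length training sequences somewhere. One common approach concatenates the token IDs and cuts them into blocks of `seq_len + 1`, so that inputs and next-token targets are the same block shifted by one position. Whether Iseer's loader works this way is not shown here; the sketch below assumes it, and uses a toy character-level `encode` in place of the real tokenizer:

```python
def pack_sequences(texts, encode, seq_len, eos_id):
    """Concatenate token IDs (EOS after each text) and split into (input, target) pairs."""
    buffer = []
    examples = []
    for text in texts:
        buffer.extend(encode(text))
        buffer.append(eos_id)
        # Each example needs seq_len + 1 tokens: the inputs plus shifted targets.
        while len(buffer) >= seq_len + 1:
            chunk = buffer[: seq_len + 1]
            examples.append((chunk[:-1], chunk[1:]))
            # Advance by seq_len so the last target becomes the next first input.
            buffer = buffer[seq_len:]
    return examples  # any leftover tail shorter than seq_len + 1 is dropped


def toy_encode(text):
    """Toy stand-in for a tokenizer: one ID per character."""
    return [ord(c) for c in text]


examples = pack_sequences(["abcd", "efg"], toy_encode, seq_len=3, eos_id=0)
for inputs, targets in examples:
    print(inputs, "->", targets)
```

Packing avoids padding, at the cost of letting one training sequence span a document boundary (marked here by the EOS ID).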
## 4️⃣ Train!

In [None]:
from iseer.training.trainer import Trainer, TrainingConfig

# Training config
train_config = TrainingConfig(
    learning_rate=3e-4,
    max_steps=5000,
    warmup_steps=100,
    batch_size=8,
    gradient_accumulation_steps=4,
    mixed_precision=True,
    dtype="float16",
    log_steps=50,
    save_steps=1000,
    output_dir="checkpoints",
    use_wandb=False,  # Set True to log to wandb
)

# Create trainer
trainer = Trainer(
    model=model,
    train_dataloader=train_loader,
    config=train_config,
)

# Train!
trainer.train()

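The configuration above implies a token budget. With `batch_size=8`, `gradient_accumulation_steps=4` and `seq_len=512`, each optimizer step sees 8 × 4 × 512 = 16,384 tokens, assuming `max_steps` counts optimizer steps rather than micro-batches (the source does not confirm this). The sketch also shows a linear warmup over `warmup_steps`. What the Iseer `Trainer` does after warmup is not shown, so this version simply holds the peak rate:

```python
def tokens_per_step(batch_size: int, grad_accum: int, seq_len: int) -> int:
    """Tokens consumed by one optimizer step."""
    return batch_size * grad_accum * seq_len


def warmup_lr(step: int, peak_lr: float, warmup_steps: int) -> float:
    """Linear warmup from 0 to peak_lr, then constant (later decay not modeled)."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return peak_lr
    return peak_lr * step / warmup_steps


per_step = tokens_per_step(batch_size=8, grad_accum=4, seq_len=512)
print(f"Tokens per optimizer step: {per_step:,}")          # 16,384
print(f"Tokens over 5000 steps:    {per_step * 5000:,}")   # 81,920,000
print(f"LR at step 50:  {warmup_lr(50, 3e-4, 100):.2e}")   # 1.50e-04
print(f"LR at step 100: {warmup_lr(100, 3e-4, 100):.2e}")  # 3.00e-04
```

Comparing the token budget with the size of the tokenized corpus shows roughly how many passes over the data the run makes.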
## 5️⃣ Test Generation

In [None]:
# Test the trained model
model.eval()

prompts = [
    "বাংলাদেশ একটি",
    "The capital of France is",
    "In the beginning",
]

for prompt in prompts:
    print(f"Prompt: {prompt}")
    # Inference only: no_grad avoids building the autograd graph
    with torch.no_grad():
        output = model.generate(
            tokenizer.encode(prompt, add_special_tokens=False),
            max_new_tokens=50,
            temperature=0.8,
        )
    print(f"Output: {tokenizer.decode(output)}")
    print()

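`temperature=0.8` rescales the logits before sampling: values below 1 sharpen the distribution toward the most likely token, and values above 1 flatten it. The sketch below shows this in plain Python. It assumes `generate` samples from a temperature-scaled softmax, which is the usual meaning of the parameter; Iseer's own implementation is not shown here, and the logits are toy values:

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]  # toy values, not model output
p_default = softmax_with_temperature(logits, 1.0)
p_sharp = softmax_with_temperature(logits, 0.8)
print([round(p, 3) for p in p_default])
print([round(p, 3) for p in p_sharp])
print(sample_token(logits, 0.8, random.Random(0)))
```

At 0.8 the top token receives more probability mass than at 1.0, so the output is somewhat more conservative while still varying between runs.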
## 6️⃣ Save & Download

In [None]:
# Save final model
torch.save(model.state_dict(), 'iseer_sm_trained.pt')
print("Model saved to iseer_sm_trained.pt")

# Download
from google.colab import files
files.download('iseer_sm_trained.pt')

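`google.colab` is only importable inside Colab, so the download step fails in any other environment. A guarded version is sketched below, assuming the checkpoint should simply stay on disk when the notebook runs elsewhere; `download_if_colab` is a hypothetical helper:

```python
import os


def download_if_colab(path: str) -> bool:
    """Trigger a browser download in Colab; return False elsewhere or if the file is missing."""
    if not os.path.isfile(path):
        print(f"{path} not found; nothing to download.")
        return False
    try:
        from google.colab import files  # only importable inside Colab
    except ImportError:
        print(f"Not running in Colab; {path} is left on disk.")
        return False
    files.download(path)
    return True


download_if_colab("iseer_sm_trained.pt")
```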
---

**🧬 Built with Iseer Architecture**

Mamba SSM + MoE | From Scratch | By Iseer & Co.