The first Arabic-originated library providing a molecular inference engine for large language models (LLMs)
pip install amf-core
AMF (Atomic Model Fragmentation) is a Python library that enables running large language models (LLMs) on severely resource-constrained hardware — including machines with less than 1 GB of VRAM and as little as 300 MB of RAM.
Instead of loading an entire model into memory, AMF:
- Fragments the model into independent semantic cells
- Analyzes each user prompt to determine which cells are needed
- Loads only the required cells on demand via memory-mapping
- Infers using only the active subset of weights
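The on-demand loading step can be illustrated with plain NumPy memory-mapping. This is a minimal sketch, not AMF's implementation: the one-file-per-cell layout, the `write_cells` and `load_cells` helpers, and the cell names are hypothetical, chosen only to show how mapping a subset of files keeps the remaining weights out of RAM.

```python
import tempfile
from pathlib import Path

import numpy as np


def write_cells(directory: Path, cells: dict[str, np.ndarray]) -> None:
    """Store each cell as its own .npy file (hypothetical layout, not AMF's format)."""
    for name, weights in cells.items():
        np.save(directory / f"{name}.npy", weights)


def load_cells(directory: Path, needed: list[str]) -> dict[str, np.ndarray]:
    """Memory-map only the requested cells; the other files are never opened."""
    return {
        name: np.load(directory / f"{name}.npy", mmap_mode="r")
        for name in needed
    }


with tempfile.TemporaryDirectory() as tmp:
    cell_dir = Path(tmp)
    rng = np.random.default_rng(0)
    write_cells(cell_dir, {
        "A-L-003-Q": rng.standard_normal((64, 64)).astype(np.float32),
        "F-S-010-Up": rng.standard_normal((64, 256)).astype(np.float32),
        "F-R-020-Down": rng.standard_normal((256, 64)).astype(np.float32),
    })

    # Assume the prompt analysis decided that only one cell is needed.
    active = load_cells(cell_dir, ["A-L-003-Q"])
    x = np.ones(64, dtype=np.float32)
    y = x @ active["A-L-003-Q"]  # pages are read from disk only when touched

    loaded_names = sorted(active)
    output_shape = y.shape
    del active, y  # release the mappings before the directory is removed
```

With `mmap_mode="r"`, the operating system pages weight data in lazily, so resident memory tracks the cells actually used rather than the size of the whole model.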
Validated on Kaggle: Qwen2.5-7B (9 GB model) running with < 500 MB RAM ✅
pip install amf-core

Optional extras:

pip install amf-core[safetensors] # Safetensors / HuggingFace support
pip install amf-core[dev] # Development tools

Requirements: Python ≥ 3.10, numpy, gguf, rich, scikit-learn

import amf
eng = amf.engine("path/to/model.gguf")
eng.load()
print(eng.predict("Hello")) # → "resilient"
print(eng.generate("Hello", n=5)) # → "Hello resilient strong bold clear"
eng.close()

import amf
with amf.engine("model.gguf") as eng:
    print(eng.predict("The future of AI is"))

import amf
model = amf.load_universal("path/to/model.gguf")
cells = amf.fragment(model, strategy="functional", output_dir="./cells")
print(f"Generated {cells.total_cells} cells ({cells.total_bytes / 1e6:.0f} MB)")
for cell in cells.cells:
    print(f" {cell.cell_id:30s} DNA: {cell.dna_tag} ({cell.size_mb:.1f} MB)")

amf fragment --model qwen2.5:0.5b # Fragment a model
amf chat # Interactive chat
amf info # System info

AMF operates in 6 distinct phases:
User Prompt
│
▼
┌──────────────────────────────────────────────────────────┐
│ 1. Universal Parsing │ ModelLoader reads GGUF / │
│ │ Safetensors → UniversalModel │
├──────────────────────────┼───────────────────────────────┤
│ 2. Weight Analysis │ WeightAnalyzer classifies │
│ │ tensors by zone & function │
├──────────────────────────┼───────────────────────────────┤
│ 3. DNA Tagging │ SortingAlgorithm assigns │
│ │ tags: A-L-003-Q │
│ │ (Attention-Linguistic-L3-Q) │
├──────────────────────────┼───────────────────────────────┤
│ 4. Intent Analysis │ IntentAnalyzer determines │
│ │ which layer zones are needed │
├──────────────────────────┼───────────────────────────────┤
│ 5. Molecular Engine │ Loads only required cells │
│ │ via selective mmap │
├──────────────────────────┼───────────────────────────────┤
│ 6. AMFEngine Inference │ Forward pass on active │
│ │ weight subset (NumPy) │
└──────────────────────────┴───────────────────────────────┘
| Zone | Layers | Specialisation |
|---|---|---|
| Linguistic | 0 – 7 | Grammar, syntax, tokens |
| Semantic | 8 – 15 | Meaning, context |
| Reasoning | 16 – 23 | Logic, math, code |
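
The zone table translates directly into a lookup. The sketch below is illustrative only: `ZONES` and `zone_for_layer` are hypothetical names, not part of the AMF API, and the ranges are copied from the table as given, which covers a 24-layer model.

```python
# Layer ranges copied from the zone table; the helper itself is hypothetical.
ZONES = {
    "Linguistic": range(0, 8),    # layers 0-7
    "Semantic": range(8, 16),     # layers 8-15
    "Reasoning": range(16, 24),   # layers 16-23
}


def zone_for_layer(layer: int) -> str:
    """Return the name of the zone whose layer range contains `layer`."""
    for name, layers in ZONES.items():
        if layer in layers:
            return name
    raise ValueError(f"layer {layer} is outside the 0-23 range covered by the table")


print(zone_for_layer(3))   # Linguistic
print(zone_for_layer(20))  # Reasoning
```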
A - L - 003 - Q
│   │    │    └─ Component (Q/K/V/O/Gate/Up/Down/Norm)
│   │    └────── Layer index (zero-padded)
│   └─────────── Zone (L=Linguistic / S=Semantic / R=Reasoning)
└─────────────── Type (A=Attention / F=FFN / C=Core)
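
To make the tag layout concrete, here is a small parser and formatter for strings such as `A-L-003-Q`. It is a sketch based only on the diagram: `DnaTag` and `parse_dna_tag` are hypothetical names rather than AMF classes, and the three-digit zero padding is inferred from the example tag.

```python
from dataclasses import dataclass

# Code tables taken from the tag diagram.
TYPES = {"A": "Attention", "F": "FFN", "C": "Core"}
ZONE_CODES = {"L": "Linguistic", "S": "Semantic", "R": "Reasoning"}
COMPONENTS = {"Q", "K", "V", "O", "Gate", "Up", "Down", "Norm"}


@dataclass(frozen=True)
class DnaTag:
    type_code: str
    zone_code: str
    layer: int
    component: str

    def __str__(self) -> str:
        # Zero-pad the layer index to three digits, as in "A-L-003-Q".
        return f"{self.type_code}-{self.zone_code}-{self.layer:03d}-{self.component}"


def parse_dna_tag(tag: str) -> DnaTag:
    """Split a tag into its four fields and validate each against the code tables."""
    parts = tag.split("-")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {tag!r}")
    type_code, zone_code, layer, component = parts
    if type_code not in TYPES:
        raise ValueError(f"unknown type code: {type_code!r}")
    if zone_code not in ZONE_CODES:
        raise ValueError(f"unknown zone code: {zone_code!r}")
    if not layer.isdigit():
        raise ValueError(f"layer index must be numeric: {layer!r}")
    if component not in COMPONENTS:
        raise ValueError(f"unknown component: {component!r}")
    return DnaTag(type_code, zone_code, int(layer), component)


tag = parse_dna_tag("A-L-003-Q")
print(TYPES[tag.type_code], ZONE_CODES[tag.zone_code], tag.layer, tag.component)
print(str(tag))  # round-trips to the original string
```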
from engine.amf_engine import AMFEngine
eng = AMFEngine("model.gguf", inference_layer=20)
eng.load()
eng.predict("Hello") # → str
eng.generate("Hello", n=5) # → str
eng.set_layer(16) # switch inference layer
eng.list_layers() # → [0, 1, ..., 23]
eng.info() # → dict
eng.close()

| Format | Status |
|---|---|
| F32 | ✅ Full |
| F16 | ✅ Full |
| Q8_0 | ✅ Full |
| Q4_K_M | ✅ Approximate |
| Q4_0 | ✅ Approximate |
| Model | Full RAM | AMF RAM | Reduction |
|---|---|---|---|
| Qwen2.5-0.5B | ~400 MB | ~80 MB | 5× |
| Qwen2.5-7B (Q4) | ~4.5 GB | ~500 MB | 9× |
| Qwen2.5-32B (Q4) | ~20 GB | ~500 MB | 40× |
CPU-only. No GPU required.
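
The Reduction column is the full-model footprint divided by the AMF footprint. The short check below recomputes it from the table's approximate figures, taking 1 GB as 1000 MB; `FOOTPRINTS_MB` and `reduction_factor` are illustrative names only.

```python
# Approximate figures copied from the memory table, in MB (1 GB taken as 1000 MB).
FOOTPRINTS_MB = {
    "Qwen2.5-0.5B": (400, 80),
    "Qwen2.5-7B (Q4)": (4500, 500),
    "Qwen2.5-32B (Q4)": (20000, 500),
}


def reduction_factor(full_mb: float, amf_mb: float) -> float:
    """Ratio of full-model RAM to AMF RAM."""
    return full_mb / amf_mb


for model, (full_mb, amf_mb) in FOOTPRINTS_MB.items():
    print(f"{model}: {reduction_factor(full_mb, amf_mb):.0f}x")
```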
| Model Family | Format | Status |
|---|---|---|
| Qwen 2.5 (all sizes) | GGUF | ✅ Tested |
| Qwen 3 / 3.5 | GGUF | ✅ Compatible |
| LLaMA 3 | GGUF | ✅ Compatible |
| Mistral | GGUF | ✅ Compatible |
| HuggingFace models | Safetensors | 🔄 Beta |
git clone https://github.com/jadcrypto/amf-core.git
cd amf-core
pip install -e .[dev]
pytest tests/

MIT. See LICENSE.

@software{amf_core,
title = {AMF: Atomic Model Fragmentation},
author = {Jad},
year = {2026},
url = {https://github.com/jadcrypto/amf-core}
}