Breaking the 1.2B Parameter Barrier on 4GB RAM
slmaker is an extreme engineering project aimed at training and running Large Language Models (LLMs) in ultra-constrained hardware environments (4GB RAM, CPU-only).
- Monster (v0.3.0): 4.5M-parameter ultra-efficient engine; guarantees agile responses even on low-end hardware.
- Odyssey v1.0.0 Engine: byte-level latent tokenizer for KR/EN/Code and KV caching (10x speedup).
- Dual-Interface Full Parity: 100% parity for training and real-time inference in both the CLI and the GUI.
- Odyssey Propulsion Dashboard: professional dashboard with real-time telemetry and a generation interface.
- Global CI/CD: automated multi-OS (Ubuntu, Windows, macOS) releases and Docker deployment via GitHub Actions.
- Auto-Healing Intelligence: automatically detects missing weights during inference, prints a system message, triggers auto-retraining, and resumes generation.
- Secure Archiving: automated management of the session brain and conversation history per global rules.
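The byte-level latent tokenizer mentioned above treats raw UTF-8 bytes as token IDs, which is why one small vocabulary can cover Korean, English, and code alike. A minimal sketch of the byte-level half of that idea follows; the class name and methods are illustrative, not slmaker's actual `tokenizer.py` API, and the latent-mapping stage is omitted.

```python
# Minimal byte-level tokenizer sketch: every UTF-8 byte is a token ID
# (0-255), so Korean, English, and code share one 256-entry vocabulary.
# Names here are illustrative, not slmaker's actual API.
class ByteTokenizer:
    vocab_size = 256

    def encode(self, text: str) -> list[int]:
        # UTF-8 bytes become token IDs directly; no vocabulary training needed.
        return list(text.encode("utf-8"))

    def decode(self, ids: list[int]) -> str:
        # Invalid byte sequences are replaced rather than raising.
        return bytes(ids).decode("utf-8", errors="replace")

tok = ByteTokenizer()
ids = tok.encode("hi")
print(ids)              # [104, 105]
print(tok.decode(ids))  # hi
```

Because the vocabulary is fixed at 256 entries, the tokenizer itself needs no trained state, which suits a memory-constrained target.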
```shell
python3 -m venv new_venv
source new_venv/bin/activate
pip install -r requirements.txt

# Run the GUI (slmaker Dashboard)
./run.sh

# Run the CLI (slmaker Engine)
./run_cli.sh
```

- Target Hardware: Intel/AMD CPU, 4GB RAM
- Training Loss: 4.11 → 0.12 (Optimized v0.2.0)
- Extreme Speed: 500% computation speedup via JIT compilation and SDPA.
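SDPA (scaled dot-product attention) gains its speed by fusing the attention math into a single optimized kernel. The fused kernel itself is internal to frameworks like PyTorch; the NumPy sketch below only shows the math being fused, as a reference for what the optimization computes.

```python
import numpy as np

def sdpa(q, k, v):
    """Scaled dot-product attention: softmax(q @ k^T / sqrt(d)) @ v.
    Frameworks fuse these steps into one kernel; this NumPy version
    just spells out the underlying math."""
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

q = k = v = np.ones((2, 4))  # 2 positions, head dimension 4
out = sdpa(q, k, v)
print(out.shape)  # (2, 4)
```

The fused version avoids materializing the intermediate score and weight matrices in main memory, which matters most on the CPU-only, 4GB target this project aims at.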
- Copyright: Rheehose (Rhee Creative) 2008-2026
- License: Apache License 2.0
"We do not tolerate shoddy quality. We prove, at every moment, a state beyond perfection." - Antigravity Gemini
```mermaid
graph TD
    Root["slmaker (v1.0.0)"] --> Core["Core Logic"]
    Root --> UI["Interface"]
    Root --> Data["Data & Weights"]
    Core --> M["model.py (Transformer)"]
    Core --> T["train.py (Engine)"]
    Core --> TK["tokenizer.py (Byte-level)"]
    UI --> GUI["gui.py (Dashboard)"]
    UI --> CLI["cli.py (TUI Engine)"]
    UI --> SH["run.sh / run_cli.sh"]
    Data --> W["data/weights/ (Odyssey 1.2B)"]
    Data --> TXT["data/*.txt (Corpus)"]
    W --> BIN["*.bin (Disk-mapped weights)"]
    subgraph ERD ["Data Relationship (ERD)"]
        Corpus["Corpus (.txt)"] -- "Tokens" --> Model["NanoSLM Model"]
        Model -- "Read/Write" --> Weights["Weights (.bin / .pth)"]
        Weights -- "Mmap" --> SSD["Hardware SSD"]
    end
```
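The "Mmap" edge in the ERD above is the key trick: weights live in `.bin` files on SSD and only the pages a computation touches are faulted into RAM. Below is a minimal sketch of that idea using `np.memmap` (the mechanism the project names); the `MmapLinear` class here is illustrative, with assumed shapes and file layout, not the actual `model.py` implementation.

```python
import numpy as np
import os
import tempfile

class MmapLinear:
    """Sketch of a linear layer whose weight matrix stays on disk.
    np.memmap gives an array view backed by the file; the OS pages
    weight rows into RAM only as the matmul reads them."""

    def __init__(self, path, in_features, out_features):
        self.w = np.memmap(path, dtype=np.float32, mode="r",
                           shape=(in_features, out_features))

    def __call__(self, x):
        return x @ self.w  # triggers on-demand page-ins from SSD

# Write dummy weights to a .bin file, then map them back lazily.
path = os.path.join(tempfile.mkdtemp(), "layer0.bin")
np.ones((8, 4), dtype=np.float32).tofile(path)

layer = MmapLinear(path, 8, 4)
y = layer(np.ones((1, 8), dtype=np.float32))
print(y)  # [[8. 8. 8. 8.]]
```

The trade-off is that cold reads go at SSD speed rather than RAM speed, which is why the design pairs this with KV caching to avoid re-reading weights more than necessary.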
```mermaid
sequenceDiagram
    participant User
    participant UI as GUI/CLI Interface
    participant Model as model.py (Weight Check)
    participant Train as train.py (Auto-Retrain)
    participant Gen as engine_inference
    User->>UI: Trigger Inference (Generation)
    UI->>Model: check_weights_complete()
    alt Weights Exist
        Model-->>UI: OK
        UI->>Gen: Start Generation
    else Weights Missing
        Model-->>UI: Missing Shards
        UI->>User: Show "SYSTEM: Auto-Retraining..."
        UI->>Train: Start engine_train()
        Train-->>UI: Training Complete & Weights Saved
        UI->>Gen: Resume original inference request
    end
    Gen-->>User: Return Generated Text
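In code, the auto-healing sequence above reduces to a guard before generation. The sketch below uses the function names from the diagram (`check_weights_complete`, `engine_train`, `engine_inference`), but their signatures are assumptions and the bodies are stand-ins, not the project's real implementations.

```python
# Sketch of the auto-healing control flow from the sequence diagram.
# Function names come from the diagram; bodies are illustrative fakes.
WEIGHTS = {"complete": False}  # stand-in for on-disk weight shards

def check_weights_complete() -> bool:
    return WEIGHTS["complete"]

def engine_train() -> None:
    print("SYSTEM: Auto-Retraining...")
    WEIGHTS["complete"] = True  # stand-in for training + saving shards

def engine_inference(prompt: str) -> str:
    return f"generated<{prompt}>"  # stand-in for real generation

def generate(prompt: str) -> str:
    if not check_weights_complete():  # "Weights Missing" branch
        engine_train()                # retrain, then fall through
    return engine_inference(prompt)   # resume the original request

print(generate("Hello"))
```

The original request is held and replayed after retraining, so from the user's perspective the call simply takes longer instead of failing.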
- model.py: Core code for the Odyssey (1.2B) and Monster architectures; includes SSD mapping (MmapLinear).
- train.py: Training engine with integrated KV-caching-based inference logic.
- tokenizer.py: v1.0 byte-level latent tokenizer (optimized for KR/EN/Code).
- gui.py / cli.py: Real-time telemetry dashboard and inference interface.
- data/weights/: Odyssey (1.2B) model weights, mapped directly from SSD via np.memmap to minimize RAM usage.
Run via run_cli.sh; it supports various command-line arguments.
| Command / Argument | Description | Example |
|---|---|---|
| `mode` | Select `train` or `inference` | `./run_cli.sh inference` |
| `prompt` | Text used in inference mode | `./run_cli.sh inference "Hello"` |
| `--model` | Select `Monster` or `Odyssey` | `--model Monster` |
| `--tokens` | Maximum number of tokens to generate | `--tokens 200` |
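The argument table above can be mirrored with a small `argparse` parser. This is a sketch of a parser consistent with the documented flags; `cli.py`'s actual parser may differ in defaults and validation.

```python
import argparse

# Parser sketch matching the documented CLI surface: a positional mode,
# an optional prompt, and --model / --tokens flags. Defaults are assumptions.
parser = argparse.ArgumentParser(prog="cli.py")
parser.add_argument("mode", choices=["train", "inference"])
parser.add_argument("prompt", nargs="?", default="")
parser.add_argument("--model", choices=["Monster", "Odyssey"], default="Monster")
parser.add_argument("--tokens", type=int, default=100)

args = parser.parse_args(["inference", "Hello", "--model", "Odyssey", "--tokens", "200"])
print(args.mode, args.model, args.tokens)  # inference Odyssey 200
```

Restricting `--model` with `choices` makes an invalid engine name fail fast at parse time instead of midway through weight loading.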
Example:

```shell
# Try inference with the Monster model on 'Once upon a time'
python3 cli.py inference "Once upon a time" --model Monster --tokens 50
```

Run via run.sh, which provides professional telemetry.
- METRICS tab: monitor real-time training loss and performance metrics through interactive graphs.
- INFERENCE tab: enter text and view the model's real-time generation results.
- Active Model Selector: switch engines instantly via the dropdown.
- Auto-Healing: if inference is triggered with weights missing, a warning appears in the bottom log and recovery starts automatically.
slmaker offers two engines depending on your needs. Switch easily via the GUI dropdown or CLI arguments.
- Monster (4.5M Lite):
  - Purpose: ultra-fast training and inference, demos.
  - Strength: 4.5M parameters; runs with negligible latency on any CPU.
  - GUI: select Monster in the 'Active Model' dropdown.
  - CLI: `python3 cli.py --model Monster`
- Odyssey (1.2B Pro):
  - Purpose: serious language-intelligence experiments; KR/EN/Code support.
  - Strength: 1.2B parameters; runs on 4GB RAM via SSD mapping.
  - GUI: select Odyssey in the 'Active Model' dropdown.
  - CLI: `python3 cli.py --model Odyssey`
Important: switching models changes the weight structure, so existing trained weights (.pth, etc.) may not be compatible.
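A quick back-of-envelope calculation (not from the project docs) shows why the Odyssey engine cannot simply load its weights into RAM, motivating the SSD-mapping design:

```python
# Why 1.2B parameters don't fit in 4GB RAM as plain fp32 tensors.
params = 1.2e9
fp32_bytes = params * 4  # 4 bytes per float32 weight
print(f"{fp32_bytes / 2**30:.1f} GiB")  # ~4.5 GiB, already over the 4GB budget
```

And that is before activations, the KV cache, and the OS itself claim their share, so disk-mapped weights are a necessity rather than an optimization on this hardware.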