Skip to content

v0.1.0

@nshkrdotcom nshkrdotcom tagged this 26 Dec 02:43
This commit introduces the CrucibleTrain library, a comprehensive machine
learning training orchestration framework designed for the BEAM ecosystem.
CrucibleTrain provides platform-agnostic infrastructure for supervised
fine-tuning, reinforcement learning, preference optimization (DPO), and model
distillation.

Key features included in this release:

1. Training Orchestration
   - Supervised Learning (SFT) loop with configurable optimizers and scheduling.
   - Reinforcement Learning (RL) loop supporting environment rollouts and PPO.
   - Direct Preference Optimization (DPO) loop for preference learning.
   - Distillation support for on-policy teacher-student training.

2. Model Rendering System
   - Robust renderer architecture for converting messages to token sequences.
   - Support for major model families: Llama 3, Qwen 2.5/3, DeepSeek V3,
     Kimi K2, and GPT-OSS.
   - Handling of special tokens, chat templates, and tool calls.
   - Configurable training targets (e.g., train on last assistant message only).

3. Data Management
   - Unified type system including Datum, ModelInput, and TensorData.
   - Dataset abstractions for memory-efficient batching and shuffling.
   - Support for multimodal inputs (text and image chunks).

4. Infrastructure and Integration
   - Ports and Adapters architecture for swappable backends (TrainingClient,
     VectorStore, BlobStore, HubClient).
   - Integration with the Crucible pipeline framework via Stage implementations.
   - Multiplexed logging system supporting JSONL, console, and custom backends.

5. Utilities and QA
   - Deterministic PCG64 PRNG for NumPy-compatible reproducibility.
   - Parity instrumentation tools to verify behavior against Python reference
     implementations.
   - Comprehensive test suite including mock tokenizers and renderers.

This foundation enables building complex, distributed ML workflows in Elixir
while maintaining compatibility with existing tensor frameworks and training
backends.
Assets 2
Loading