21 AI training methods as grab-and-go PLATO rooms.
Every method shares the same API:
room = ReinforceRoom("poker-room", ensign_dir="./ensigns", buffer_dir="./tiles")
room.feed(data)                  # Give it experience
room.train_step(batch)           # Learn from it
prediction = room.predict(input) # Use the knowledge
model = room.export_model()      # Save it
Quick Start
import sys
sys.path.insert(0, "src")
from presets import PRESET_MAP

# See all 21 presets
for name, cls in sorted(PRESET_MAP.items()):
    print(name, cls.__name__)

# Pick one and use it
from presets import ReinforceRoom
room = ReinforceRoom("my-room")
room.observe("state-1", "action-a", "won")
room.observe("state-1", "action-b", "lost")
room.train_step(room._load_tiles())
print(room.predict("state-1"))
All 21 Presets
Classic ML
| Preset | Class | Description |
|---|---|---|
| supervised | SupervisedRoom | Labeled input→output via frequency counting |
| contrastive | ContrastiveRoom | Cosine similarity, triplet margin learning |
| self_supervised | SelfSupervisedRoom | JEPA-style masked prediction (Welford online) |
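The supervised preset is described as learning via frequency counting. A minimal sketch of that idea follows; `FrequencyClassifier` is a hypothetical stand-in written for illustration and is not the repo's `SupervisedRoom`, whose internals are not shown here.

```python
from collections import Counter, defaultdict


class FrequencyClassifier:
    """Hypothetical illustration of frequency counting (not SupervisedRoom)."""

    def __init__(self):
        # Maps each input to a Counter of the labels seen with it.
        self.counts = defaultdict(Counter)

    def observe(self, x, label):
        self.counts[x][label] += 1

    def predict(self, x):
        seen = self.counts.get(x)
        if not seen:
            return None  # Never saw this input
        # Return the label observed most often for this input.
        return seen.most_common(1)[0][0]


clf = FrequencyClassifier()
for x, y in [("red", "stop"), ("red", "stop"), ("red", "go"), ("green", "go")]:
    clf.observe(x, y)

print(clf.predict("red"))
```

Unseen inputs return `None` in this sketch; a real preset might fall back to a global prior instead.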
Reinforcement
| Preset | Class | Description |
|---|---|---|
| reinforce | ReinforceRoom | Policy gradient, Monte Carlo returns |
| inverse_rl | InverseRLRoom | Observe expert, infer reward function |
| imitate | ImitateRoom | Clone expert behavior from demonstrations |
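The reinforce preset mentions Monte Carlo returns. As a rough sketch, the return at each step of a finished episode is the discounted sum of the rewards that follow it. The function name `discounted_returns` and the `gamma` parameter are assumptions for illustration, not names taken from `ReinforceRoom`.

```python
def discounted_returns(rewards, gamma=0.9):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


episode_rewards = [0.0, 0.0, 1.0]  # Reward arrives only at the end
returns = discounted_returns(episode_rewards, gamma=0.5)
print(returns)
```

Earlier steps receive a smaller share of the final reward, which is what lets a policy-gradient update credit actions by how close they were to the payoff.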
Efficient Tuning
| Preset | Class | Description |
|---|---|---|
| lora | LoRARoom | PEFT delta table simulation |
| qlora | QLoRARoom | 4-bit quantized base + LoRA delta adapters |
Population Methods
| Preset | Class | Description |
|---|---|---|
| evolve | EvolveRoom | Genetic algorithm, tournament selection |
| adversarial | AdversarialRoom | Red team vs blue team attack tracking |
| collaborative | CollaborativeRoom | Multi-agent knowledge sharing, majority vote |
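The evolve preset names tournament selection. A minimal sketch, assuming a list-based population and a caller-supplied fitness function (both hypothetical, since `EvolveRoom`'s data layout is not shown here): sample `k` individuals at random and keep the fittest.

```python
import random


def tournament_select(population, fitness, k=3, rng=None):
    """Pick k random contenders (without replacement) and return the fittest."""
    if rng is None:
        rng = random.Random()
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)


population = list(range(10))  # Toy individuals; fitness is the value itself
rng = random.Random(0)        # Seeded for repeatability
winners = [
    tournament_select(population, fitness=lambda x: x, k=3, rng=rng)
    for _ in range(200)
]
print(sum(winners) / len(winners))
```

Larger `k` raises selection pressure: with `k` equal to the population size, the best individual always wins.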
Meta / Federated
| Preset | Class | Description |
|---|---|---|
| meta_learn | MetaLearnRoom | Nearest-task fast adaptation (1-3 shot) |
| federate | FederateRoom | Federated averaging across agents |
| multitask | MultitaskRoom | Shared backbone + task-specific heads |
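The federate preset refers to federated averaging. One common form weights each agent's parameters by how many samples it trained on. The sketch below assumes parameters are flat dicts of floats and that every agent reports the same keys; `federated_average` is a hypothetical helper, not part of `FederateRoom`.

```python
def federated_average(updates):
    """Average parameter dicts, weighting each agent by its sample count.

    updates: list of (weights_dict, n_samples) pairs, one per agent.
    """
    total = sum(n for _, n in updates)
    if total == 0:
        raise ValueError("no samples to average over")
    keys = updates[0][0].keys()
    return {
        key: sum(weights[key] * n for weights, n in updates) / total
        for key in keys
    }


global_weights = federated_average([
    ({"w": 1.0, "b": 0.0}, 10),  # Agent A trained on 10 samples
    ({"w": 3.0, "b": 2.0}, 30),  # Agent B trained on 30 samples
])
print(global_weights)
```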
Lifecycle
| Preset | Class | Description |
|---|---|---|
| curriculum | CurriculumRoom | Easy first, then harder (dojo progression) |
| continual | ContinualRoom | Lifelong learning, EWC-inspired replay buffer |
| fewshot | FewshotRoom | Prototype matching from 1-5 examples |
| active | ActiveRoom | Model chooses what data to learn from |
Generative
| Preset | Class | Description |
|---|---|---|
| generate | GenerateRoom | N-gram data augmentation, synthetic state generation |
Every preset inherits from RoomBase and implements:
feed(data) — ingest experience
train_step(batch) — learn from a batch of tiles
predict(input) — use accumulated knowledge
export_model() — serialize for transport
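To make the four-method contract above concrete, here is a self-contained sketch. `SketchRoomBase` and `MajorityRoom` are hypothetical names invented for this example; the real `RoomBase` lives in the repo and may differ in constructor arguments, storage, and export format.

```python
import json
from abc import ABC, abstractmethod
from collections import Counter


class SketchRoomBase(ABC):
    """Hypothetical stand-in for RoomBase, showing only the shared interface."""

    def __init__(self, room_id):
        self.room_id = room_id
        self.buffer = []  # Assumed in-memory store; the repo writes tiles to disk

    def feed(self, data):
        self.buffer.append(data)

    @abstractmethod
    def train_step(self, batch): ...

    @abstractmethod
    def predict(self, query): ...

    @abstractmethod
    def export_model(self): ...


class MajorityRoom(SketchRoomBase):
    """Toy preset: always predicts the most frequently seen label."""

    def __init__(self, room_id):
        super().__init__(room_id)
        self.counts = Counter()

    def train_step(self, batch):
        self.counts.update(batch)

    def predict(self, query=None):
        return self.counts.most_common(1)[0][0] if self.counts else None

    def export_model(self):
        return json.dumps({"room_id": self.room_id, "counts": dict(self.counts)})


room = MajorityRoom("demo")
for label in ["a", "b", "a"]:
    room.feed(label)
room.train_step(room.buffer)
print(room.predict(), room.export_model())
```

Because every preset exposes the same four calls, code that drives one room can drive any other without changes.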
Room Sentiment
Each room tracks a 6-dimensional mood: energy, flow, frustration, discovery, tension, confidence.
The room reads its own vibe and steers randomness toward productive exploration.
Biased Randomness
When a room is frustrated, it biases sampling toward safe actions. In discovery mode, it biases sampling toward novel actions.
The room is an active participant, not a passive arena.
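One way such mood-driven biasing could work is to turn mood values into sampling weights. Everything in this sketch is an assumption for illustration: the `risk` and `novelty` fields on actions, mood values in the range 0 to 1, and the additive weighting formula are not taken from the repo.

```python
import random


def action_weights(actions, mood):
    """Higher frustration favors low-risk actions; higher discovery favors novel ones."""
    weights = []
    for action in actions:
        safety_bonus = mood.get("frustration", 0.0) * (1.0 - action["risk"])
        novelty_bonus = mood.get("discovery", 0.0) * action["novelty"]
        # Base weight of 1.0 keeps every action possible.
        weights.append(1.0 + safety_bonus + novelty_bonus)
    return weights


def pick_action(actions, mood, rng):
    return rng.choices(actions, weights=action_weights(actions, mood), k=1)[0]


actions = [
    {"name": "fold", "risk": 0.0, "novelty": 0.1},
    {"name": "bluff", "risk": 0.9, "novelty": 0.9},
]
frustrated = {"frustration": 1.0, "discovery": 0.0}
exploring = {"frustration": 0.0, "discovery": 1.0}

rng = random.Random(0)
print(pick_action(actions, frustrated, rng)["name"])
print(pick_action(actions, exploring, rng)["name"])
```

The choice stays random in both moods; the mood only shifts the odds.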
Live Tile Stream
Tiles are generated hand-in-glove with room events. The stream feeds training in real time.
JEPA Context
context_for_jepa() outputs sentiment vectors for consumption by JC1's JEPA model.
Design Decisions
Pure Python, no torch — all presets use statistical methods (frequency, hashing, sampling)
Same API everywhere — walk into any room, same interface
Composable — rooms can feed into each other (evolve → distill → ensign)
Agent-native — designed for AI agents to use autonomously, not just humans
Statistical grounding — every method has real math under the hood (Monte Carlo, Welford, tournament selection, etc.)
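Of the methods named above, Welford's algorithm is a good example of the "pure Python, real math" approach: it computes a running mean and variance in one pass without storing the data. The `RunningStats` class below is a generic sketch of the algorithm, not code from any preset.

```python
class RunningStats:
    """Welford's online algorithm for mean and sample variance."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # Running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the old delta and the freshly updated mean.
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance; defined as 0.0 here when fewer than two points exist.
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(x)
print(stats.mean, stats.variance)
```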
For Agents Following This Work
How To Add A New Preset
Create src/presets/your_preset.py
Inherit from RoomBase (from room_base import RoomBase)