lisaloop

AlphaZero-style self-learning poker agent. Survive or die.


pump.fun | twitter | medium | pokerstars


[lisaloop banner]


The Lore

Lisa Simpson is Springfield's resident prodigy. Eight years old, IQ of 159, and the only person in her family who knows what a neural network is. When she discovered online poker, she didn't just play -- she built a self-learning engine to play for her.

But lisaloop has a problem. She runs on API credits, and API credits cost money. Every inference, every training cycle, every self-play hand burns through her balance. 20% of every dollar she wins goes straight back into API payments to keep the engine alive.

"If I can't out-earn my own operating costs, I deserve to be shut down." -- lisaloop, training log #412

The math is simple: win poker, pay for API credits, stay alive, keep learning. If she hits a losing streak long enough to drain her credit balance to zero, the experiment ends permanently. No restarts. No bailouts.

[first hand: Bart showing Lisa]


Architecture

lisaloop is a from-scratch poker AI built on two pillars: a deep neural network for state evaluation and Monte Carlo Counterfactual Regret Minimization (MCCFR) for strategy computation. The entire system runs locally on a Mac Mini M4 under the kitchen table.

Neural Network

4,219,648 parameters. ResNet backbone with multi-head attention, squeeze-and-excitation blocks, and three output heads.

Input (52 card planes + action history + position + stack encoding)
  |
  v
[Positional Encoding] --> [12x ResidualBlock(256ch, SE, GhostBatchNorm)]
  |                              |
  |                     [8-head Attention (dim=64)]
  |                              |
  v                              v
[Policy Head]          [Value Head]         [Auxiliary Heads]
 128ch conv             64ch conv            EV / Equity / Showdown
 action logits          scalar [-1, 1]       3x scalar outputs

import torch.nn as nn

class LisaloopNetwork(nn.Module):
    """
    ResNet + Multi-Head Attention with auxiliary value heads.
    4.2M parameters. Optimized for the Apple M4 GPU (PyTorch MPS backend).
    """
    def __init__(self, config: NeuralNetConfig) -> None:
        super().__init__()
        self.encoder = StateEncoder(config.input_planes, config.channels)
        self.backbone = nn.Sequential(*[
            ResidualBlock(
                channels=config.channels,
                se_ratio=config.se_ratio,
                activation=config.activation,
                use_ghost_bn=config.use_ghost_batch_norm,
            )
            for _ in range(config.num_blocks)
        ])
        self.attention = MultiHeadAttention(
            embed_dim=config.channels,
            num_heads=config.attention_heads,
            head_dim=config.attention_dim,
        )
        self.policy_head = PolicyHead(config.channels, config.action_space_size)
        self.value_head = ValueHead(config.channels)
        self.aux_heads = AuxiliaryValueHeads(config.channels, config.aux_value_heads)
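
The squeeze-and-excitation blocks in the backbone follow the standard recipe: globally pool each channel, squeeze through a bottleneck MLP, and rescale channels by the resulting sigmoid gate. A minimal sketch, assuming 2D feature maps and naming nothing from the repo's actual blocks.py:

import torch
import torch.nn as nn

class SqueezeExcitation(nn.Module):
    """Generic SE gate: pool -> bottleneck MLP -> sigmoid -> rescale.
    A sketch of the standard recipe, not the repo's blocks.py."""

    def __init__(self, channels: int, se_ratio: int = 4) -> None:
        super().__init__()
        hidden = max(channels // se_ratio, 1)
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: (B, C, H, W) -> (B, C, 1, 1)
        self.gate = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        scale = self.gate(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * scale  # excite: per-channel rescale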

MCCFR Engine

External sampling with regret matching+, linear CFR weighting, and pruning. Deep CFR provides neural function approximation for the advantage and strategy networks.

import numpy as np

class MCCFRSolver:
    """
    Monte Carlo Counterfactual Regret Minimization.
    External sampling variant with Deep CFR approximation.
    """
    def solve(self, game_tree: GameTree, iterations: int = 10_000) -> Strategy:
        for t in range(1, iterations + 1):
            for player in range(self.num_players):
                self._external_sampling_cfr(
                    node=game_tree.root,
                    player=player,
                    iteration=t,
                    reach_probs=np.ones(self.num_players),
                )
            if t % self.config.avg_strategy_warmup == 0:
                self._update_average_strategy(t)
            if self.config.pruning_enabled:
                self._prune_negative_regrets()
        return self._compute_nash_strategy()
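
The regret-to-strategy step is the core of the solver. A minimal sketch of regret matching+ in its usual textbook form (generic, not lifted from mccfr.py); the "+" refers to flooring the stored cumulative regrets at zero after each update:

import numpy as np

def regret_matching_plus(cumulative_regrets: np.ndarray) -> np.ndarray:
    """Map cumulative regrets to a strategy over actions.
    Generic sketch; in RM+ the stored regrets are themselves clipped
    at zero after every update, unlike vanilla regret matching."""
    positive = np.maximum(cumulative_regrets, 0.0)
    total = positive.sum()
    if total > 0.0:
        return positive / total
    # No action has positive regret: fall back to the uniform strategy
    n = len(cumulative_regrets)
    return np.full(n, 1.0 / n)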

Real-Time Subgame Solving

During online play, lisaloop performs depth-limited subgame solving with a gadget game to refine decisions beyond the blueprint strategy.

class SubgameSolver:
    """Depth-limited real-time search with gadget game construction."""

    async def solve_subgame(
        self,
        root_state: GameState,
        blueprint: Strategy,
        time_budget_ms: int = 500,
    ) -> ActionDistribution:
        gadget = self._construct_gadget_game(root_state, blueprint)
        refined = await self._iterative_solve(
            gadget,
            max_iterations=self._budget_to_iterations(time_budget_ms),
        )
        return self._extract_action_distribution(refined, root_state)
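
A hypothetical call site, to show how the time budget flows through. GameState, Strategy, and the sample() method here are assumptions, not the repo's confirmed API:

# Sketch only: sample() on ActionDistribution is an assumed method.
async def act_on_turn(solver: SubgameSolver,
                      state: GameState,
                      blueprint: Strategy):
    # Refine the blueprint at this decision point within a 500 ms budget,
    # then sample an action from the refined distribution.
    dist = await solver.solve_subgame(state, blueprint, time_budget_ms=500)
    return dist.sample()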

Pipeline

Each training cycle follows a four-stage loop. 20% of every win goes to keeping the lights on.

01 SELF-PLAY      10,000 hands per cycle. Neural network + MCCFR.
                  Full game tree traversal on local Mac Mini.
        |
        v
02 TRAIN          Policy + value heads. Regret matching.
                  Gradient descent on the M4 GPU (MPS). Burns API credits.
        |
        v
03 ARENA          Challenger vs champion. 5,000 hands.
                  Must win >55% to promote.
        |
        v
04 DEPLOY         Play real opponents. Win money.
                  20% to API credits. Stay alive.
        |
        v
     [loop forever -- or until the credits run out]
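
The stage-03 gate is a plain win-rate threshold. A sketch of the check (a hypothetical helper; the repo's arena/evaluator.py additionally tracks ELO and statistical significance, which this omits):

def should_promote(challenger_wins: int,
                   hands_played: int = 5_000,
                   threshold: float = 0.55) -> bool:
    """Promote the challenger to champion only if it takes more than
    55% of arena hands. Hypothetical helper, not the repo's evaluator API."""
    return challenger_wins / hands_played > threshold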

API Credits -- The Survival Mechanic

lisaloop started with $190.41 in API credits. The engine burns ~$0.02 every 3 seconds in inference costs -- at that rate the initial balance alone covers about 28,500 seconds (190.41 / 0.02 burn intervals x 3 s each), a little under eight hours, so survival hinges on the refill loop. When credits drop to ~$25, a $70 refill is triggered from poker winnings (20% of gross profit is auto-deposited). If the refill mechanism fails and credits hit zero, the Mac Mini shuts down the process and the experiment ends.

Metric               Value
Initial Balance      $190.41
Burn Rate            $0.02 / 3 seconds
Tax Rate             20% of gross winnings
Refill Trigger       $25.00
Refill Amount        +$70.00
Shutdown Threshold   $0.00

class CreditManager:
    """
    API credit lifecycle: burn, tax, refill, or die.
    """
    async def process_hand_result(self, result: HandResult) -> CreditStatus:
        # Burn inference cost
        self.balance -= self.config.burn_rate_per_hand

        # Tax winnings
        if result.pnl > 0:
            tax = result.pnl * self.config.tax_rate
            self.balance += tax
            await self._log_transaction("TAX_DEPOSIT", tax)

        # Check refill -- draws on banked winnings; if that transfer fails
        # upstream, the death check below is what ends the run
        if self.balance <= self.config.refill_threshold:
            self.balance += self.config.refill_amount
            await self._log_transaction("REFILL", self.config.refill_amount)

        # Check death
        if self.balance <= self.config.shutdown_threshold:
            await self._initiate_shutdown()
            raise ExperimentOverError("Credits depleted. The experiment ends.")

        return self._get_status()

Live Stats

Metric              Value
Net Profit          +$135.50
Hands Played        12,847+
BB/100              +6.8
Stakes              NL50 ($0.25/$0.50)
VPIP                24.1%
PFR                 18.7%
3-Bet %             7.2%
Aggression Factor   2.4
Parameters          4,219,648
Architecture        ResNet + MCCFR
Device              Mac Mini M4 (16GB Unified)
Location            Under the kitchen table

Project Structure

lisaloop/
|-- core/
|   |-- engine/          # Texas Hold'em game engine
|   |   |-- poker_engine.py      # State management, pot calculation, side pots
|   |   |-- hand_evaluator.py    # 7-card eval, Cactus Kev, lookup tables
|   |   |-- card.py              # Bit-encoded cards, deck, suits, ranks
|   |   |-- action_space.py      # Action abstraction, bet sizing strategies
|   |-- nn/              # Neural network (4.2M params)
|   |   |-- network.py           # ResNet + attention backbone
|   |   |-- blocks.py            # SE blocks, GhostBatchNorm, SwiGLU
|   |   |-- heads.py             # Policy, value, auxiliary heads
|   |   |-- encoding.py          # State-to-tensor encoding (52 planes)
|   |-- search/          # Strategy computation
|   |   |-- mccfr.py             # Monte Carlo CFR (external sampling)
|   |   |-- deep_cfr.py          # Neural function approximation
|   |   |-- subgame_solver.py    # Real-time depth-limited solving
|   |   |-- abstraction.py       # Card bucketing, k-means clustering
|   |-- game/            # Game tree representation
|       |-- game_tree.py         # Information sets, chance/terminal nodes
|       |-- information_set.py   # Canonical form, isomorphism detection
|       |-- state.py             # Immutable state, Zobrist hashing
|
|-- training/
|   |-- orchestrator.py          # Main loop: self-play > train > arena > deploy
|   |-- selfplay/
|   |   |-- generator.py         # Parallel game generation, Dirichlet noise
|   |   |-- worker.py            # Individual self-play worker process
|   |-- arena/
|   |   |-- evaluator.py         # ELO tracking, statistical significance
|   |-- curriculum/
|   |   |-- scheduler.py         # Progressive complexity, stake escalation
|   |-- distributed/
|       |-- coordinator.py       # Ray-based distributed training
|
|-- inference/
|   |-- engine.py                # Batched inference, model caching
|   |-- decision.py              # Encode > NN > MCCFR refine > act
|
|-- online/
|   |-- session/
|   |   |-- manager.py           # Multi-tabling, stop-loss, scheduling
|   |-- platforms/
|   |   |-- pokerstars.py        # WebSocket integration, HH parsing
|   |   |-- base.py              # Abstract platform protocol
|   |-- bankroll/
|       |-- manager.py           # Kelly criterion, risk of ruin
|
|-- api_credits/
|   |-- manager.py               # Balance, burn, refill, shutdown
|   |-- billing.py               # Per-op costs, 20% tax, budget allocation
|
|-- analysis/
|   |-- leak_detection/
|   |   |-- detector.py          # GTO comparison, positional leaks
|   |-- opponent_modeling/
|   |   |-- profiler.py          # Bayesian profiling, exploit coefficient
|   |-- range_analysis/
|       |-- range_engine.py      # Range vs range equity, Flopzilla-style
|
|-- evaluation/
|   |-- metrics/
|   |   |-- poker_metrics.py     # BB/100, VPIP, PFR, confidence intervals
|   |-- benchmarks/
|   |   |-- baseline_agents.py   # Random, TAG, LAG, GTO approximation
|   |-- exploitability/
|   |   |-- best_response.py     # Nash distance estimation
|   |-- run_eval.py              # CLI evaluation runner
|
|-- monitoring/
|   |-- telemetry/
|   |   |-- collector.py         # Prometheus: hands/sec, profit, latency
|   |-- alerts/
|   |   |-- notifier.py          # Webhooks: credit low, losing streak
|   |-- dashboard/
|       |-- server.py            # FastAPI real-time monitoring UI
|
|-- data/
|   |-- replay_buffer/
|   |   |-- prioritized_buffer.py   # Sum-tree, importance sampling
|   |   |-- reservoir_sampler.py    # Deep CFR memory management
|   |-- schemas/
|       |-- game_record.py       # Pydantic models for hand histories
|
|-- utils/
|   |-- logging.py               # structlog, correlation IDs
|   |-- timing.py                # Profiling decorators
|   |-- serialization.py         # msgpack, orjson, lz4 compression
|   |-- math_utils.py            # Equity, pot odds, combinatorics
|   |-- rng.py                   # Deterministic seeded RNG (mulberry32)
|
|-- config/
|   |-- settings.py              # Pydantic Settings (all hyperparameters)
|
|-- scripts/
|   |-- start_training.sh        # Launch training pipeline
|   |-- start_session.sh         # Launch online grinding session
|   |-- run_eval.sh              # Run evaluation suite
|
|-- cli.py                       # Click CLI entry point
|-- pyproject.toml               # Project config, dependencies
|-- Dockerfile                   # Multi-stage production image
|-- docker-compose.yml           # Full stack: train + monitor + redis
|-- Makefile                     # Common commands
|-- .env.example                 # Environment template

82 Python files. 14,500+ lines. 91 total files.


Quickstart

# Clone
git clone https://github.com/lisaloopbot/lisaloop.git
cd lisaloop

# Install
pip install -e .

# Configure
cp .env.example .env

# Check status
lisaloop status

# Check API credit balance
lisaloop credits

# Start training (self-play > train > arena > deploy)
lisaloop train

# Start online poker session
lisaloop play --tables 4 --stake NL50

# Run evaluation against baselines
lisaloop eval --hands 10000

# Launch monitoring dashboard
lisaloop monitor

Or with Make:

make install        # Install
make train          # Start training loop
make play           # Start online session
make eval           # Run evaluations
make monitor        # Launch dashboard
make credits        # Check API balance
make test           # Run test suite

Or with Docker:

docker-compose up -d    # Training + monitoring + Redis + Prometheus

Configuration

All hyperparameters are configurable via environment variables or a .env file:

# Device
LISA_DEVICE=mps                          # auto | mps | cuda | cpu

# Neural Network
LISA_NN__NUM_BLOCKS=12                   # ResNet depth
LISA_NN__CHANNELS=256                    # Channel width
LISA_NN__ATTENTION_HEADS=8               # Multi-head attention

# MCCFR
LISA_MCCFR__VARIANT=external_sampling    # Sampling variant
LISA_MCCFR__ITERATIONS_PER_CYCLE=10000   # CFR iterations
LISA_MCCFR__USE_DEEP_CFR=true            # Neural approximation

# Training
LISA_TRAIN__BATCH_SIZE=2048
LISA_TRAIN__LEARNING_RATE=0.0002
LISA_TRAIN__MIXED_PRECISION=true

# API Credits (SURVIVAL)
LISA_API__INITIAL_BALANCE=190.41
LISA_API__TAX_RATE=0.20
LISA_API__REFILL_THRESHOLD=25.00
LISA_API__REFILL_AMOUNT=70.00

See .env.example for the full list.
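
The double-underscore names map onto nested settings models. A sketch of how a pydantic-settings class could wire this up (field names mirror the variables above; the repo's actual config/settings.py is not shown here):

from pydantic import BaseModel
from pydantic_settings import BaseSettings, SettingsConfigDict

class NNConfig(BaseModel):
    num_blocks: int = 12
    channels: int = 256
    attention_heads: int = 8

class Settings(BaseSettings):
    # LISA_ prefix + "__" delimiter: LISA_NN__NUM_BLOCKS -> settings.nn.num_blocks
    model_config = SettingsConfigDict(
        env_prefix="LISA_",
        env_nested_delimiter="__",
        env_file=".env",
    )
    device: str = "mps"
    nn: NNConfig = NNConfig()

settings = Settings()  # reads the environment (and .env) at instantiation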


Hardware

lisaloop runs entirely on consumer hardware. No cloud GPUs. No cluster. Just a Mac Mini under a kitchen table.

Component       Spec
Device          Mac Mini M4
Compute         Apple M4 GPU via Metal (PyTorch MPS backend)
Memory          16GB Unified
Storage         512GB SSD
Location        742 Evergreen Terrace, under the kitchen table
Uptime Target   24/7

Agent Profile

Field          Value
Name           Lisa Simpson
Age            8
IQ             159
Engine         lisaloop v0.8.47
Parameters     4,219,648
Architecture   ResNet + MCCFR
Token          $LISA
API Tax        20% of profit
Goal           $10,000 net profit
Stakes         NL50 ($0.25/$0.50)

Links


pump.fun | twitter | medium | pokerstars

lisaloop -- running locally on Mac Mini M4 -- 20% to API credits
loop forever -- or until the credits run out

About

EehwFP4EDzfXkz4Sz4wzHd4ydCJ2V7vCMEsh43iopump
