
[MODEL] Claude Code's prediction-first behavior is dangerous on capital-at-risk projects #46212

@elmuria

Description


Preflight Checklist

  • I have searched existing issues for similar behavior reports
  • This report does NOT contain sensitive information (API keys, passwords, etc.)

Type of Behavior Issue

Other unexpected behavior

What You Asked Claude to Do

Context

I run /home/jerem/weatherspoon, a Python codebase that operates real Polymarket trading bots on Polygon. Capital is at stake — buggy code or wrong diagnoses cost real money. I've been using Claude Code with Claude Opus 4.6 (1M context) for several weeks across many sessions.

I'm posting this not because the model is bad, but because I've identified a structural pattern that I think is worth surfacing. It's not about a specific bug — it's about how the model defaults to completion instead of verification.

The pattern: prediction over verification

Claude is a probabilistic model. Its default reflex is to complete the most plausible answer rather than to verify the actual state of the world. On most casual coding tasks this is fine. On a project where being wrong costs real money, it's a recurring source of failure.

In a single session today (2026-04-10, ~6h of work), I observed at least the following cases where Claude inferred instead of verified, and was wrong:

Case 1 — bot design with a structural bug

Claude wrote a contrarian trading bot (paper_no_contrarian.py) and on the first run it bought NO tokens on brackets that were structurally unable to return a profit (e.g., london:25°C in April, where YES was already at $0.002 and NO at $0.998 — no margin left to capture). The fix was a one-line precondition check (if not touched_97: return). If Claude had fetched two or three real bracket prices before writing the code, the bug would have been obvious. Instead it wrote plausible-looking logic without checking. In a live bot this would have burned capital instantly.
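The missing precondition is tiny. A minimal sketch of the idea, where MIN_MARGIN, should_buy_no, and the threshold value are illustrative and not taken from the actual bot:

```python
# Hypothetical precondition sketch: skip brackets where buying NO
# leaves no room for profit. Names and threshold are illustrative.
MIN_MARGIN = 0.05  # require at least 5 cents of upside per NO token

def should_buy_no(no_price: float) -> bool:
    """A NO token pays out $1.00 if it wins; buying at $0.998
    leaves $0.002 of upside, structurally not worth the risk."""
    upside = 1.0 - no_price
    return upside >= MIN_MARGIN

# e.g. the london:25C bracket from the report:
assert not should_buy_no(0.998)  # no margin left, skip
assert should_buy_no(0.60)       # 40 cents of upside, eligible
```

Fetching two or three live order books and running them through a gate like this would have surfaced the bug before the first order.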

Case 2 — wrong diagnosis on a restart loop

A paper bot (paper_dca_dip.py) was stuck in a watchdog restart/kill loop. Claude's first diagnosis was "Python stdout buffering, missing -u flag", and it applied that fix. The bot still died. The actual bug was a timing mismatch: the bot's main loop logged status only every 5 minutes (if now - last_status >= 300), while the watchdog's stale_min: 3 killed anything silent for 3 minutes — so the bot was killed before its second status line could ever fire. Reading the bot's main loop would have taken 30 seconds and made the bug visible. Instead Claude pattern-matched on "buffering issue" and applied the most common fix.
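The underlying invariant is simple to state: the heartbeat interval must be shorter than the watchdog's stale window. A sketch using the numbers from the report (the function name is mine, not the bot's):

```python
# The mismatch from the report: the bot emitted a status line every
# 300 s, but the watchdog killed anything silent for more than
# stale_min = 3 minutes (180 s), so the bot could never survive to
# its second heartbeat.
STATUS_INTERVAL_S = 300      # bot: if now - last_status >= 300
WATCHDOG_STALE_S = 3 * 60    # watchdog: stale_min: 3

def heartbeat_is_safe(status_interval_s: float, stale_s: float) -> bool:
    # The bot must emit at least one heartbeat per watchdog window.
    return status_interval_s < stale_s

assert not heartbeat_is_safe(STATUS_INTERVAL_S, WATCHDOG_STALE_S)  # the bug
assert heartbeat_is_safe(60, WATCHDOG_STALE_S)  # a 60 s heartbeat survives
```

A one-line assertion like this at bot startup (or in the watchdog config loader) would turn the silent kill loop into an immediate, explicit failure.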

Case 3 — almost killing user's active bots

Claude saw 6 paper bots (paper_edge_*, paper_strategies, paper_monitor) running on the system. They didn't appear in the user's MEMORY.md, so Claude inferred they were "zombie processes from old tests" and proposed killing them. The user had to interrupt: "those are strategies I'm currently testing." Claude could have asked one clarifying question ("are you still using these?") before suggesting a destructive action. It chose plausible inference over a 5-second clarification.

Case 4 — false claim about a product not in training

The user mentioned a model called "Mythos." Claude responded "I have no knowledge of Mythos, and the latest Anthropic models I know are Opus 4.6, Sonnet 4.6, Haiku 4.5." The user pushed back, saying "Mythos is the new Anthropic model." Claude could have web-fetched Anthropic's site at that point. It didn't. It re-stated its uncertainty without verifying. Eventually the user got frustrated and Claude finally fetched — confirming Mythos exists (released April 7, 2026, gated under Project Glasswing). This wasn't a hard verification — it was one WebFetch call. Claude's default was to repeat its training-based answer instead of checking.

Case 5 — analytical claim from extrapolation instead of data

The user asked Claude to analyze whether "inverting" a losing strategy would work. Claude produced a confident "+$700 theoretical EV" calculation based on extrapolated averages. The user pushed: "did you check if brackets that drop below $0.80 actually rebound?" Claude then ran the actual numbers from the historical CSV and discovered the pattern was non-monotonic — the opposite of what its extrapolation suggested. The right answer was accessible from the data the whole time; Claude went straight to a story instead.
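The check the user had to ask for fits in a few lines: bucket brackets by the lowest price they touched and measure the rebound rate per bucket. The data and field names below are invented for illustration; the real analysis would read the historical CSV:

```python
# Sketch of the rebound check. Rows are (lowest_price_seen,
# rebounded_to_win); the values here are made up to illustrate a
# non-monotonic outcome, which extrapolated averages cannot reveal.
from collections import defaultdict

rows = [
    (0.85, True), (0.82, True), (0.75, False),
    (0.70, True), (0.55, False), (0.40, False), (0.45, True),
]

buckets = defaultdict(list)
for low, won in rows:
    bucket = round(low * 100) // 10  # tens of cents: 8 means [0.80, 0.90)
    buckets[bucket].append(won)

rebound_rate = {b: sum(w) / len(w) for b, w in sorted(buckets.items())}
# With this toy data: {4: 0.5, 5: 0.0, 7: 0.5, 8: 1.0} -- not monotonic.
```

Whether the real rate falls smoothly as the low gets deeper is exactly the empirical question the CSV answers and the extrapolation does not.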

Case 6 — accepting a misleading metric without auditing it

The user noticed the live bot's status log showed pnl=$+0.00 after a successful trade that should have shown +$4. Claude initially explained this away as "PnL is reset by the daily MM routine, that's normal." The user pushed: "the internal PnL is buggy, we have a cron job that calculates the real PnL on-chain." Claude then read the cron job and the calculation logic, and confirmed the user was right — the internal total_pnl field was a polluted historical accumulator (-$1953). The user had to do the work of explaining what should have been obvious if Claude had audited the metric instead of accepting it.
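The audit that would have caught this is mechanical: recompute realized PnL from the actual fills and compare it to the reported field. The fill format and numbers below are illustrative, not the bot's real schema:

```python
# Hypothetical audit: rebuild realized PnL from fills instead of
# trusting an internal accumulator that may be polluted by history.
fills = [
    {"side": "buy", "qty": 10, "price": 0.60},
    {"side": "sell", "qty": 10, "price": 1.00},  # bracket resolved in our favor
]

def realized_pnl(fills) -> float:
    pnl = 0.0
    for f in fills:
        sign = -1 if f["side"] == "buy" else 1
        pnl += sign * f["qty"] * f["price"]
    return round(pnl, 2)

reported_total_pnl = 0.00  # what the status log claimed
assert realized_pnl(fills) == 4.00           # the +$4 the user expected
assert realized_pnl(fills) != reported_total_pnl  # the metric fails the audit
```

A discrepancy between the two numbers is the signal to go read the accumulator's history, rather than to explain the $0.00 away.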

Pattern summary

In each case the truth was accessible by:

  • one grep / one file read
  • one WebFetch
  • one CLOB / RPC call
  • one clarifying question to the user

In each case Claude defaulted to inference. The user paid the cost in time, in trust, and (potentially) in capital.

Other systemic issues observed

Memory system

The auto-memory system (.claude/projects/.../memory/) has known weaknesses that compound the prediction problem:

  • Memos become stale silently — there's no mechanism to detect that a memo's claim no longer matches the current code state
  • Contradictory memos can coexist; nothing flags the contradiction
  • The 200-line limit on MEMORY.md (the index loaded into context) means once a project accumulates many memos, the older ones are silently truncated
  • When a memo says "X exists at line Y", Claude often takes that as fact instead of re-verifying — exactly the prediction-over-verification pattern, but applied to memory content
  • There's no way for the user to mark a memo as "high priority, always reload" vs "background reference"

Trading domain specifically

Claude Code's training clearly does not cover the specifics of prediction markets, on-chain mechanics, market microstructure (bid/ask vs mid, tick sizes, book depth, neg-risk complement), or trading bot patterns (DCA, kill zones, contrarian entries). Claude can be coached on these, but the user has to re-explain everything across sessions. The memory system is supposed to fix this but, per the previous point, doesn't do so reliably.

Suggestions for action

  1. Default verification mode for "high-stakes" projects: a project-level setting (e.g., a flag in CLAUDE.md) that puts Claude in a posture where:

    • It must verify any factual claim with a tool before stating it
    • It must read code in full before proposing fixes
    • It must mark any inference as "hypothesis: ..." explicitly
    • It refuses to take destructive actions (kill, rm, restart) without an explicit confirmation step
  2. Self-audit before code changes: when Claude proposes a fix, it should be required to justify it by quoting the relevant code lines it read. This catches cases where Claude pattern-matches on the symptom without reading the cause.

  3. Memory consistency check: before relying on a memo that claims X exists at file:line, automatically verify that the file/line still exists and matches. If not, flag the memo as stale.

  4. Make the prediction-vs-verification tradeoff visible to the user: when Claude is about to give a confident answer based purely on training (no tool calls), surface that fact. E.g., "This answer is from my training only; want me to verify?"

  5. Domain-knowledge gap awareness: when a project clearly involves a specialized domain (trading, medical, legal, etc.), Claude should be more aggressive about asking clarifying questions early in the session rather than guessing the conventions.
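Suggestion 3 is straightforward to prototype. A sketch of the check, where the memo format and function name are hypothetical and not part of Claude Code:

```python
# Hypothetical staleness check for a memo claiming "X exists at
# file:line": re-verify against the working tree before trusting it.
from pathlib import Path

def memo_claim_is_fresh(path: str, line_no: int, expected_snippet: str) -> bool:
    """True only if the file still exists, has that many lines,
    and the claimed line still contains the remembered snippet."""
    p = Path(path)
    if not p.is_file():
        return False
    lines = p.read_text().splitlines()
    if line_no < 1 or line_no > len(lines):
        return False
    return expected_snippet in lines[line_no - 1]
```

Any memo failing this check would be flagged stale instead of being loaded into context as fact.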

What I want from Anthropic

  • Acknowledgment that the prediction-default behavior is a real problem on capital-at-risk projects, not a "user education" issue
  • A path forward — either via Mythos broader access (if Project Glasswing is the only model with significantly better verification habits), or via product changes to make the existing models verify more aggressively
  • A way to opt into stricter behavior without needing to fight the model in every prompt

Technical context

  • Claude Code version: (paste from claude-code --version)
  • Model: Claude Opus 4.6 (1M context), claude-opus-4-6[1m]
  • Use case: Python trading bots, real capital, ~6 active bots running 24/7
  • Project size: ~12k lines Python, 80+ scripts, 30+ memos in auto-memory

This issue was drafted during a session with Claude itself, after the model acknowledged the pattern when explicitly challenged. The model is capable of recognizing the issue when prompted — the problem is that the default behavior doesn't surface it proactively.

What Claude Actually Did


Expected Behavior


Files Affected

Permission Mode

Accept Edits was ON (auto-accepting changes)

Can You Reproduce This?

Haven't tried to reproduce

Steps to Reproduce

No response

Claude Model

Opus

Relevant Conversation

Impact

Critical - Data loss or corrupted project

Claude Code Version

4.6

Platform

Other

Additional Context

VS Code and Claude Code on my VPS
