Implementation of Coinjoin Heuristics #76

Merged: soad003 merged 3 commits into develop from feature-coinjoin-heuristics on Mar 26, 2026

@frederik-raphael (Contributor)

CoinJoin Heuristics

This PR adds structural heuristics for detecting CoinJoin transactions on
UTXO chains (Bitcoin). Heuristics are exposed via the
GET /{currency}/txs/{tx_hash}?include_heuristics=all_coinjoin REST endpoint and computed on demand.


Protocol Overview

Comparison table

| Protocol | Coordination | Denomination | Round size | Script constraint | Confidence |
|---|---|---|---|---|---|
| JoinMarket | P2P (maker/taker order book) | Inferred (most frequent output value) | ≥ 2 participants | All outputs distinct | 20 (n=2) / 49 (n≥3) |
| Wasabi 1.0 | Centralized coordinator (ZeroLink) | Fixed ~0.1 BTC (±5 %) | Variable | All outputs distinct | Dynamic (50–100) |
| Wasabi 1.1 | Centralized coordinator (ZeroLink) | Multi-level 2^i × 0.1 BTC | Variable | All outputs distinct | Dynamic (scored) |
| Wasabi 2.0 | Centralized coordinator (WabiSabi) | Variable (WabiSabi) | Large (≥ 50 inputs) | All outputs distinct | 60 |
| Whirlpool CoinJoin | Centralized coordinator (Samourai) | Fixed (4 known pool sizes) | Exactly 5 × 5 | All inputs and outputs distinct | 60 (structural) / 90 (lineage verified) |
| Whirlpool Tx0 | Centralized coordinator (Samourai) | Pool denomination + ε | Variable (≤ 70 pre-mix) | N/A (has OP_RETURN) | 60 / 90 (forward-verified) |

Key protocol differences

JoinMarket is a peer-to-peer, maker/taker protocol. It has no fixed
denomination — the denomination is simply the most frequent output value.
This makes the heuristic the most permissive: Wasabi 1.x and Whirlpool
CoinJoin transactions also satisfy its structural conditions.
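
As a rough illustration of the "most frequent output value" rule, the following minimal sketch infers a denomination and maps the count of equal outputs to the confidence levels from the comparison table. The function names and the tie-break are assumptions made for this example, and the real heuristic applies further structural checks (distinct scripts, majority post-mix) that are not shown here.

```python
from collections import Counter


def infer_denomination(output_values_sat):
    """Return (denomination_sat, n_equal_outputs) for the most frequent output value."""
    if not output_values_sat:
        return None, 0
    counts = Counter(output_values_sat)
    # Assumption: ties between equally frequent values go to the larger value.
    denomination, n_equal = max(counts.items(), key=lambda kv: (kv[1], kv[0]))
    return denomination, n_equal


def joinmarket_confidence(n_equal_outputs):
    """Map the number of equal-value outputs to the confidence levels from the table."""
    if n_equal_outputs < 2:
        return 0
    return 20 if n_equal_outputs == 2 else 49


# Three equal outputs of 1 000 000 sat plus two change outputs.
denomination, n = infer_denomination([1_000_000, 1_000_000, 1_000_000, 734_210, 52_000])
result = (denomination, n, joinmarket_confidence(n))
```

Because any repeated output value qualifies, the same function would also report a denomination for Wasabi 1.x or Whirlpool transactions, which is the permissiveness described above.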

Wasabi 1.0 (ZeroLink) uses a centralized coordinator and a fixed 0.1 BTC
denomination. Every participant gets exactly one equal-value output; change
goes to a single additional output per participant. One extra output carries
the coordinator fee.

Wasabi 1.1 extends ZeroLink with mixing levels: post-mix outputs appear
at powers-of-two multiples of the base denomination (0.1, 0.2, 0.4 BTC, …).
This allows participants to mix non-standard amounts by combining levels.
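
The mixing-level rule can be sketched as a small lookup that tests whether an output value sits near 2^i × 0.1 BTC. This is only an illustration: the helper name, the level cap, and reusing the ±5 % tolerance from the Wasabi 1.0 row for every level are assumptions, not taken from the PR code.

```python
def mixing_level(value_sat, base_sat=10_000_000, tolerance=0.05, max_level=10):
    """Return i if value_sat is within tolerance of 2**i * base_sat, else None."""
    for i in range(max_level + 1):
        target = base_sat * 2**i
        # Assumption: the relative tolerance is the same at every level.
        if abs(value_sat - target) <= tolerance * target:
            return i
    return None


# 0.1 BTC, ~0.2 BTC, ~0.4 BTC, and a value that matches no level.
levels = [mixing_level(v) for v in (10_000_000, 20_100_000, 39_900_000, 30_000_000)]
```

With a 5 % tolerance the windows around neighbouring levels do not overlap, so each value maps to at most one level.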

Wasabi 2.0 (WabiSabi) drops the fixed denomination entirely. The
coordinator negotiates a variable denomination set per round. Rounds are large
(≥ 50 inputs), all outputs are above a minimum value (5 000 sat), and a
denomination is identified as any output value that appears at least twice.
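
The three structural conditions named above (round size, minimum output value, repeated output values) could be checked roughly as follows. This is a sketch under assumptions: the function and constant names are hypothetical, "above a minimum value" is read as greater than or equal to 5 000 sat, and the script-distinctness check is left out.

```python
from collections import Counter

MIN_INPUTS = 50          # "large round" threshold from the description
MIN_OUTPUT_SAT = 5_000   # minimum output value from the description


def wasabi2_denominations(n_inputs, output_values_sat):
    """Return the sorted denominations if the structural conditions hold, else None."""
    if n_inputs < MIN_INPUTS:
        return None
    # Assumption: "above a minimum value" is read as >= MIN_OUTPUT_SAT.
    if any(v < MIN_OUTPUT_SAT for v in output_values_sat):
        return None
    counts = Counter(output_values_sat)
    # A denomination is any output value that appears at least twice.
    denominations = sorted(v for v, n in counts.items() if n >= 2)
    return denominations or None


outputs = [9_978_200] * 4 + [19_954_262] * 2 + [1_234_567, 7_654_321]
found = wasabi2_denominations(56, outputs)
```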

Whirlpool CoinJoin is highly constrained: exactly 5 inputs and 5 outputs,
with every output at a known pool denomination d (100 k / 1 M / 5 M / 50 M sat). Inputs are
either remixers (value == d) or new entrants (value in (d, d + 100 000 sat]).
The protocol requires at least one of each.
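
A minimal sketch of the value-based part of this check, assuming inputs and outputs are given as plain lists of satoshi values. The function name and return shape are hypothetical (the keys mirror the response fields shown later in this description), and the distinct-script condition is not modelled.

```python
WHIRLPOOL_POOLS_SAT = (100_000, 1_000_000, 5_000_000, 50_000_000)
MAX_ENTRANT_SURPLUS_SAT = 100_000


def classify_whirlpool(input_values_sat, output_values_sat):
    """Return pool/remixer/new-entrant counts if the 5x5 structure matches, else None."""
    if len(input_values_sat) != 5 or len(output_values_sat) != 5:
        return None
    pool = output_values_sat[0]
    if pool not in WHIRLPOOL_POOLS_SAT or any(v != pool for v in output_values_sat):
        return None
    n_remixers = sum(1 for v in input_values_sat if v == pool)
    n_new_entrants = sum(
        1 for v in input_values_sat if pool < v <= pool + MAX_ENTRANT_SURPLUS_SAT
    )
    # Every input must fall in one of the two classes, and both classes must occur.
    if n_remixers + n_new_entrants != 5 or n_remixers == 0 or n_new_entrants == 0:
        return None
    return {
        "pool_denomination_sat": pool,
        "n_remixers": n_remixers,
        "n_new_entrants": n_new_entrants,
    }


match = classify_whirlpool(
    [100_000, 100_000, 100_000, 100_262, 100_311], [100_000] * 5
)
```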

Whirlpool Tx0 is not a CoinJoin — it is the pre-mix preparation step.
It splits funds into equal pre-mix UTXOs (d + ε), pays a coordinator fee, and
encodes pool selection in an OP_RETURN output. It is deliberately excluded from
the CoinJoin consensus signal.


API Changes

New include_heuristics values on GET /{currency}/txs/{tx_hash}

CoinJoin heuristics are opt-in via the existing include_heuristics query
parameter. The following values are new:

| Value | Effect |
|---|---|
| all_coinjoin | Run all four CoinJoin heuristics |
| whirlpool | Whirlpool CoinJoin + Tx0 |
| wasabi | All Wasabi versions (1.0, 1.1, 2.0) |
| wasabi_1_0 | Wasabi 1.0 only |
| wasabi_1_1 | Wasabi 1.1 only |
| wasabi_2_0 | Wasabi 2.0 only |
| joinmarket | JoinMarket only |

Previously available values (one_time_change, direct_change,
multi_input_change, all_change, all) are unchanged.

Response shape

CoinJoin results are returned under heuristics.coinjoin_heuristics.
All sub-fields are optional and only present when the corresponding value
was requested.

GET /btc/txs/{tx_hash}?include_heuristics=all_coinjoin
{
  "tx_hash": "...",
  "heuristics": {
    "coinjoin_heuristics": {
      "consensus": {
        "detected": true,
        "confidence": 90,
        "sources": ["whirlpool_coinjoin"]
      },

      // Only present when detected (or when explicitly requested and not detected)
      "whirlpool_coinjoin": {
        "detected": true,
        "confidence": 90,           // 60 structural / 90 with lineage verification
        "pool_denomination_sat": 1000000,
        "n_remixers": 3,
        "n_new_entrants": 2
      },

      "whirlpool_tx0": {
        "detected": false,
        "confidence": 0,
        "pool_denomination_sat": 0,
        "n_premix_outputs": 0
      },

      "wasabi": {
        "detected": false,
        "confidence": 0,
        "version": "2.0",           // "1.0" | "1.1" | "2.0"
        "n_participants": 0,
        "denominations": []
      },

      "joinmarket": {
        "detected": false,
        "confidence": 0,
        "n_participants": 0,
        "denomination_sat": 0
      }
    }
  }
}
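
For client code, a small helper along these lines could build the request URL and read the consensus defensively, since every sub-field is optional. This is a sketch only: the base URL `https://api.example.org` and both helper names are placeholders invented for the example, and no HTTP request is made here.

```python
from urllib.parse import urlencode

# New include_heuristics values listed in this PR.
COINJOIN_HEURISTIC_VALUES = {
    "all_coinjoin", "whirlpool", "wasabi",
    "wasabi_1_0", "wasabi_1_1", "wasabi_2_0", "joinmarket",
}


def heuristics_url(base_url, currency, tx_hash, value="all_coinjoin"):
    """Build GET /{currency}/txs/{tx_hash}?include_heuristics=<value>."""
    if value not in COINJOIN_HEURISTIC_VALUES:
        raise ValueError(f"unknown CoinJoin include_heuristics value: {value}")
    query = urlencode({"include_heuristics": value})
    return f"{base_url.rstrip('/')}/{currency}/txs/{tx_hash}?{query}"


def consensus_of(response):
    """Return the consensus dict, or None when it is absent from the response."""
    heuristics = response.get("heuristics") or {}
    coinjoin = heuristics.get("coinjoin_heuristics") or {}
    return coinjoin.get("consensus")


url = heuristics_url(
    "https://api.example.org",  # placeholder host, not a real deployment
    "btc",
    "c73f367c8e51513515cb173900f7abe278dc330729bb55182f905b0fca15cd06",
)
```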

Consensus logic

consensus is the single summary field. It is computed
after all requested heuristics have run and reflects the combined signal:

  • Sources: joinmarket, wasabi, and whirlpool_coinjoin contribute.
    whirlpool_tx0 is intentionally excluded — it is a preparation transaction,
    not a privacy-enhancing mix.
  • Confidence: max(confidence) across all firing sources.
  • sources list: every protocol that fired, e.g. ["joinmarket", "wasabi"]
    when both match (JoinMarket is a structural superset, so this is expected).
  • Absent when nothing detected: consensus is null / omitted if no
    protocol fires. It is never {"detected": false, ...}.
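
The rules in this list can be sketched in a few lines. The function name and the dict-based input are assumptions made for illustration; only the behaviour (Tx0 excluded, max confidence, list of firing sources, None when nothing fires) follows the description above.

```python
# Sources that contribute to the consensus; whirlpool_tx0 is deliberately absent.
CONSENSUS_SOURCES = ("joinmarket", "wasabi", "whirlpool_coinjoin")


def build_consensus(results):
    """Combine per-protocol results into a consensus dict, or None if nothing fired."""
    fired = [
        name for name in CONSENSUS_SOURCES
        if results.get(name) and results[name].get("detected")
    ]
    if not fired:
        return None
    return {
        "detected": True,
        "confidence": max(results[name]["confidence"] for name in fired),
        "sources": fired,
    }


consensus = build_consensus({
    "wasabi": {"detected": True, "confidence": 60},
    "joinmarket": {"detected": True, "confidence": 49},
    "whirlpool_tx0": {"detected": True, "confidence": 90},  # ignored by design
})
```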

Detection order and mutual exclusion:

  1. Wasabi 2.0 runs first. If it fires, Wasabi 1.x is skipped (2.0 is more
    specific — large rounds, no fixed denomination).
  2. Wasabi 1.x (_wasabi_11_heuristic) handles both 1.0 and 1.1; the returned
    version field distinguishes them based on whether mixing levels are active.
  3. JoinMarket, Whirlpool CoinJoin, and Whirlpool Tx0 are fully independent and
    run in parallel (no mutual exclusion). Multiple can fire for the same tx.
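
The Wasabi ordering in steps 1 and 2 amounts to a short-circuit, which the following sketch illustrates with stand-in detectors. `detect_wasabi`, `fake_v2`, and `fake_v1x` are hypothetical names; the real detectors take different arguments.

```python
def detect_wasabi(tx, detect_v2, detect_v1x):
    """Run the 2.0 detector first and only fall back to 1.x when it does not fire."""
    result = detect_v2(tx)
    if result is not None:
        return result
    return detect_v1x(tx)


calls = []


def fake_v2(tx):
    # Stand-in: fires only for large rounds.
    calls.append("2.0")
    return {"detected": True, "version": "2.0"} if tx["n_inputs"] >= 50 else None


def fake_v1x(tx):
    # Stand-in: always fires, to show it is skipped when 2.0 matched.
    calls.append("1.x")
    return {"detected": True, "version": "1.1"}


large_round = detect_wasabi({"n_inputs": 56}, fake_v2, fake_v1x)
```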

Example: Wasabi 2.0 + JoinMarket, tx 06f5b0ec... in block 562 464 (56 inputs, 83 outputs)
// GET /btc/txs/06f5b0ec1b298bc96d557d3ff0370b0cd2733204ff8b85c3bfa63f5a15f6fb02
//     ?include_heuristics=all_coinjoin
{
  "tx_hash": "06f5b0ec1b298bc96d557d3ff0370b0cd2733204ff8b85c3bfa63f5a15f6fb02",
  "height": 562464,
  "no_inputs": 56,
  "no_outputs": 83,
  "heuristics": {
    "coinjoin_heuristics": {
      "consensus": {
        "detected": true,
        "confidence": 60,        // max(49 JoinMarket, 60 Wasabi 2.0)
        "sources": ["joinmarket", "wasabi"]
      },
      "wasabi": {
        "detected": true,
        "confidence": 60,
        "version": "2.0",
        "n_participants": 5,
        "denominations": [9961956, 9962612, 9978200, 19954262, 39908524, 79817048, 159634096]
      },
      "joinmarket": {
        "detected": true,
        "confidence": 49,
        "n_participants": 47,
        "denomination_sat": 9978200  // most frequent output value = Wasabi denomination
      }
      // whirlpool_coinjoin / whirlpool_tx0 omitted (not detected)
    }
  }
}

JoinMarket firing alongside Wasabi is expected: the most frequent Wasabi output
value satisfies the JoinMarket denomination condition. The consensus confidence
(60) reflects the more specific Wasabi 2.0 detection.

Example — Whirlpool + JoinMarket: tx c73f367c... block 716 576 (5 inputs, 5 outputs, 100 000 sat pool)
// GET /btc/txs/c73f367c8e51513515cb173900f7abe278dc330729bb55182f905b0fca15cd06
//     ?include_heuristics=all_coinjoin
{
  "tx_hash": "c73f367c8e51513515cb173900f7abe278dc330729bb55182f905b0fca15cd06",
  "height": 716576,
  "no_inputs": 5,
  "no_outputs": 5,
  "heuristics": {
    "coinjoin_heuristics": {
      "consensus": {
        "detected": true,
        "confidence": 60,        // max(49 JoinMarket, 60 Whirlpool)
        "sources": ["joinmarket", "whirlpool_coinjoin"]
      },
      "whirlpool_coinjoin": {
        "detected": true,
        "confidence": 60,        // structural only (no lineage verification in cache)
        "pool_denomination_sat": 100000,
        "n_remixers": 3,
        "n_new_entrants": 2
      },
      "joinmarket": {
        "detected": true,
        "confidence": 49,
        "n_participants": 5,
        "denomination_sat": 100000  // == pool denomination, satisfies JoinMarket condition
      }
    }
  }
}

The 100 000 sat Whirlpool pool denomination is also the most frequent output
value, so JoinMarket fires as well. Consensus confidence (60) reflects the more
specific Whirlpool detection.

Confidence values at a glance

| Field | Not detected | Structural | Verified |
|---|---|---|---|
| whirlpool_coinjoin.confidence | 0 | 60 | 90 (lineage) |
| whirlpool_tx0.confidence | 0 | 60 | 90 (forward) |
| wasabi.confidence | 0 | 50–100 | |
| joinmarket.confidence | 0 | 20 (n=2) / 49 (n≥3) | |
| consensus.confidence | | max of firing sources | |

Caveats

JoinMarket is a superset

Every Wasabi 1.x and Whirlpool CoinJoin transaction structurally satisfies the
JoinMarket conditions (equal-value outputs, distinct scripts, majority
post-mix). The JoinMarket confidence cap (49) is intentionally kept below the
Wasabi and Whirlpool scores so the more specific detection takes precedence in
the consensus. When both fire, the consensus reports both sources.

JoinMarket n=2 false positives

A transaction with exactly two equal-value outputs satisfies the JoinMarket
structural check but can occur by coincidence (e.g. splitting funds equally).
This case is reported with confidence 20.

Wasabi 1.x vs 2.0 disambiguation

Wasabi 2.0 detection runs first. If it fires, the 1.x check is skipped.
This matters because a large Wasabi 2.0 round could in principle also look
like a multi-level Wasabi 1.1 transaction. The 2.0 path is more specific
(≥ 50 inputs, no fixed base denomination) and should be preferred.

Whirlpool confidence boost via lineage and forward verification

The structural Whirlpool checks yield confidence 60. Two additional
verifications can upgrade this:

  • Backward lineage (_verify_whirlpool_lineage): recursively checks that
    every input traces back to a Tx0 (new entrant) or a prior CoinJoin
    (remixer). A single failure anywhere in the input tree cancels the check
    and leaves confidence at 60; if every input verifies, CoinJoin confidence
    is raised to 90.
  • Forward verification for Tx0 (_verify_tx0_forward): confirms that at
    least one pre-mix output was later spent in a Whirlpool CoinJoin. Raises
    Tx0 confidence to 90.

Both checks require the optional get_tx / get_spent_in DB callbacks.
If they are not provided (e.g. lightweight call path), structural confidence
(60) is returned as-is.
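
The fallback behaviour for the Tx0 forward check might look roughly like the sketch below. It is not the PR's implementation: the callbacks are assumed to be synchronous, `get_spent_in` is assumed to map an outpoint to the spending transaction hash (or None), `get_tx` to map a hash to a transaction, and `is_whirlpool_coinjoin` is a hypothetical predicate. Only the constant `WHIRLPOOL_TX0_MAX_FORWARD_CHECKS = 20` appears in the reviewed code.

```python
WHIRLPOOL_TX0_MAX_FORWARD_CHECKS = 20


def tx0_confidence(premix_outputs, is_whirlpool_coinjoin,
                   get_spent_in=None, get_tx=None,
                   max_checks=WHIRLPOOL_TX0_MAX_FORWARD_CHECKS):
    """Return 90 if a pre-mix output was spent in a Whirlpool CoinJoin, else 60."""
    structural, verified = 60, 90
    if get_spent_in is None or get_tx is None:
        # Lightweight call path: no DB callbacks, keep the structural score.
        return structural
    for outpoint in premix_outputs[:max_checks]:
        spending_hash = get_spent_in(outpoint)
        if spending_hash is None:
            continue  # pre-mix output is still unspent
        if is_whirlpool_coinjoin(get_tx(spending_hash)):
            return verified
    return structural


# Toy in-memory "database" standing in for the real callbacks.
spent_in = {("tx0", 1): "mix1"}
txs = {"mix1": {"is_mix": True}}
confidence = tx0_confidence(
    [("tx0", 0), ("tx0", 1)], lambda tx: tx["is_mix"], spent_in.get, txs.get
)
```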


Known Limitations & Future Work

Exchange batch payout false positives (JoinMarket, Wasabi 1.0, Wasabi 1.1)

Problem: Exchange batch payouts can pass all structural checks for
JoinMarket, Wasabi 1.0, and Wasabi 1.1.

Solution:
Check for exchange tags; if any are present, ignore the match.

Statistics

coinjoin_cumsum
============================================================
Blocks scanned          :        781
Total transactions      :  1,783,857
Total candidates        :    421,695  (23.64% of all txs)
Sampled candidates      :    421,695
CoinJoin detections     :      1,070  (0.25% of candidates)
============================================================
  Whirlpool CoinJoin       :    419
  Whirlpool Tx0            :    109
  Wasabi                   :    162
  JoinMarket               :    840

Wasabi version breakdown:
shape: (3, 2)
┌────────────────┬───────┐
│ wasabi_version ┆ count │
│ ---            ┆ ---   │
│ str            ┆ u32   │
╞════════════════╪═══════╡
│ 1.0            ┆ 47    │
│ 1.1            ┆ 29    │
│ 2.0            ┆ 86    │
└────────────────┴───────┘

============================================================
Tag coverage (detected CoinJoins):
  Already tagged as coinjoin :    545 / 1,070 (50.9%)

    Whirlpool CoinJoin       :   404 /   419 (96.4%)
      → 1a9b7a0a8ff4984eb86328dd9a8e7b8d63bfd16d8484f8bb60fca9efc6336665 block=779040 in=5 out=5 pool=5,000,000sat
      → 78d1405c9a34615a30d75828f974f9cd5e520c872c57d29fbbeef7506694f847 block=783136 in=5 out=5 pool=1,000,000sat
    Whirlpool Tx0            :    79 /   109 (72.5%)
      → 74851fe27525348c3de97db0223805f110691b4052b52aaf361973c367f541d5 block=500512 in=2 out=3 pool=100,000sat
      → e6a522fcc2c45658785af4163b79ea2b9e5f384c059255e02c5d6e05ec829fc0 block=550688 in=2 out=3 pool=100,000sat
    Wasabi                   :    39 /   162 (24.1%)
      → f755b00b05bbdce3fc8a7fd377d2021beed0014aaa28b0d061c0c7e6c3402562 block=503584 in=2 out=2 v=1.1
      → 659ea778e69cf04b5ec752d42c0068811b7c8678889d105a869917b7f3dcfd28 block=504608 in=6 out=9 v=1.0
    JoinMarket               :   450 /   840 (53.6%)
      → da5646d37cf7efd4b194decdf77add63bc993357d2ea9c6e243bfc222370f6fc block=506656 in=2 out=4 participants=2
      → 9497ff204bf111679e83aaaacb536b09a732e68a0ab8e2a063894d278b864f1e block=507168 in=65 out=11 participants=10
  Has exchange input         :     17 / 1,070 (1.6%)

AI Disclosure

While the change heuristics were 100% written by me, this time it was a co-working process. However, no auto-edits were allowed and each edit was understood. Tests were 100% created by AI. Documentation was mostly written by AI, but I drafted the structure.

@soad003 (Member) commented Mar 23, 2026

Looks great already. I did not dig into the heuristics too deep, but from a structural perspective, no complaints. Can we annotate the source of the heuristics in the code (pointer to the paper) for future reference?

Are you still planning to add the additional tag verification step?

@soad003 (Member) left a comment

Only nits, otherwise looks great.

Literal[
"all",
"one_time_change", "direct_change", "multi_input_change", "all_change",
"all_coinjoin", "whirlpool", "wasabi", "wasabi_1_0", "wasabi_1_1", "wasabi_2_0", "joinmarket",

also prefix whirlpool etc. with _coinjoin for consistency.

detected: bool
confidence: int
n_participants: int
denomination_sat: int

would it be fair to also call this pool_denomination?

return all(results)


WHIRLPOOL_TX0_MAX_FORWARD_CHECKS = 20

I would add all constants to the beginning of the file. For things we might want to change at some point, please pass them to the function (e.g. the max forward checks might be something we want to play with).

I would probably add some config class for the heuristics that we can already pass to get_tx
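
One possible shape for such a config class is sketched below. The class and field names are hypothetical; the default values are the ones stated in the PR description and in the constant above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoinjoinHeuristicsConfig:
    """Tunable thresholds, bundled so callers can override them per request."""
    whirlpool_pools_sat: tuple = (100_000, 1_000_000, 5_000_000, 50_000_000)
    whirlpool_tx0_max_forward_checks: int = 20
    wasabi2_min_inputs: int = 50
    wasabi2_min_output_sat: int = 5_000


default_config = CoinjoinHeuristicsConfig()
# Experimenting with a deeper forward search only needs a different instance.
experiment = CoinjoinHeuristicsConfig(whirlpool_tx0_max_forward_checks=50)
```

Making the dataclass frozen keeps a shared default instance safe from accidental mutation between requests.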


# forward verification: if Tx0 detected and DB callbacks available,
# check if pre-mix outputs were spent in Whirlpool CoinJoins
if whirlpool_tx0 is not None and get_spent_in and get_tx:

Maybe add a logger.warning if whirlpool_tx0 is not None and get_spent_in or get_tx is not available.
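
A sketch of what that guard could look like, using the standard `logging` module. The helper name, logger name, and message text are made up for the example.

```python
import logging

logger = logging.getLogger("coinjoin_heuristics")  # hypothetical logger name


def can_verify_tx0_forward(whirlpool_tx0, get_spent_in, get_tx):
    """Return True when forward verification can run; warn when callbacks are missing."""
    if whirlpool_tx0 is None:
        return False  # nothing to verify, no warning needed
    if get_spent_in is None or get_tx is None:
        logger.warning(
            "Whirlpool Tx0 detected but get_spent_in/get_tx not provided; "
            "skipping forward verification"
        )
        return False
    return True
```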


  async def calculate_heuristics(
-     tx, currency, get_address, heuristics: list
+     tx, currency, get_address, heuristics: list,

It might make sense to bundle the parameters, e.g. a coinjoin object that holds get_spent_in and get_tx?
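
Bundling could be as simple as a small dataclass passed as one argument. The class name and the `available` property are hypothetical, and the callback signatures are left open because they are not shown in this thread.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class CoinjoinCallbacks:
    """Optional DB callbacks used by the Whirlpool verification steps."""
    get_spent_in: Optional[Callable[..., Any]] = None
    get_tx: Optional[Callable[..., Any]] = None

    @property
    def available(self):
        # Verification needs both callbacks; one alone is not enough.
        return self.get_spent_in is not None and self.get_tx is not None


lightweight = CoinjoinCallbacks()
full = CoinjoinCallbacks(get_spent_in=lambda outpoint: None, get_tx=lambda tx_hash: None)
```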

@soad003 soad003 merged commit 422d748 into develop Mar 26, 2026
3 checks passed
@soad003 soad003 deleted the feature-coinjoin-heuristics branch March 26, 2026 12:27