fix(p0-1): BLS pkAgg reconstruction from on-chain PKs #112

Merged: jhfnetboy merged 3 commits into main from fix/p0-1-bls-rewrite on May 5, 2026

@jhfnetboy (Member) commented Apr 28, 2026

⚠️ Stacked PR

The base is `fix/p0-2-validator-stake-gate` (PR #105), because the P0-1 test fixtures use MockStakingPermissive from P0-2. Full stack: main → #104 P0-4+17 → #105 P0-2 → this PR. GitHub shows only this commit's delta.

P0-1 (B6-C1a)

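`BLSAggregator.verify(message, signerMask, pkAgg, sig)` accepts a caller-supplied `pkAgg`. Because the caller is free to choose both values, it can always pick a (sig, pkAgg) pair that satisfies the pairing equation `e(pk_agg, H(m)) == e(g1, sig)`, so the check provides no protection at all. Combined with the unauthenticated `executeWithProof` from P0-4, the impact is catastrophic and anonymous: any caller can forge an arbitrary BLS proof and trigger an arbitrary slash / blacklist.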
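
To make the forgery concrete, here is a toy model in Python. It is not real BLS12-381: group elements are represented by their discrete logs modulo a small prime, and the pairing is modeled as multiplication of those logs, which keeps bilinearity but nothing else. The modulus and all function names are assumptions made for illustration only.

```python
# Toy model only: each group element is represented by its discrete log
# modulo a small prime R, and e(a*g1, b*g2) is modeled as a*b mod R.
# This preserves bilinearity but has none of the security of BLS12-381.
R = 1_000_003  # hypothetical small prime group order


def pairing(g1_log: int, g2_log: int) -> int:
    """Bilinear map in the toy model: e(a*g1, b*g2) -> a*b (mod R)."""
    return (g1_log * g2_log) % R


def verify_with_caller_pk(pk_agg: int, h_m: int, sig: int) -> bool:
    """Old-style check e(pk_agg, H(m)) == e(g1, sig); g1 has log 1."""
    return pairing(pk_agg, h_m) == pairing(1, sig)


def forge(h_m: int, k: int) -> tuple[int, int]:
    """Attacker picks any scalar k and returns (pk_agg, sig) = (k*g1, k*H(m)).

    On a real curve this is a scalar multiplication of the public point H(m);
    no secret key of any registered validator is needed.
    """
    return k % R, (k * h_m) % R


h_m = 424242                      # stands in for H(m)
pk_agg, sig = forge(h_m, k=7)
print(verify_with_caller_pk(pk_agg, h_m, sig))   # True: forged pair passes
print(verify_with_caller_pk(12345, h_m, sig))    # False: a fixed on-chain key rejects it
```

The second call models the fix: once the public key is taken from registered state instead of from the caller, the forged signature no longer verifies.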

Defense (following solady / the industry standard)

  • `BLSAggregator` now uses typed `BLS.G1Point` storage + 1-indexed slots
  • New `verify(message, signerMask, sig) view`: internally rebuilds pkAgg from the on-chain PKs of the validators selected by the signerMask bits
  • Reconstruction uses the BLS12-381 G1Add precompile (EIP-2537, live since Pectra)
  • A bit beyond MAX_VALIDATORS triggers `SlotOutOfRange`, preventing silent truncation
  • The old `pkAgg` parameter path is removed entirely
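
The mask handling described above can be sketched off-chain as follows. This is a minimal Python model, not the contract code: the bit layout (bit i selects slot i + 1), the value MAX_VALIDATORS = 13 (inferred from the "up to 13" per-slot calls mentioned later in the review) and the use of plain integers in place of G1 points are all assumptions.

```python
MAX_VALIDATORS = 13  # assumption: inferred from the review's "up to 13" calls


class SlotOutOfRange(Exception):
    """Stand-in for the contract's SlotOutOfRange error."""


def slots_from_mask(signer_mask: int) -> list[int]:
    """Return the 1-indexed slots selected by signer_mask.

    Assumed layout (hypothetical): bit i (0-based) selects slot i + 1.
    """
    if signer_mask < 0:
        raise ValueError("mask must be non-negative")
    if signer_mask >> MAX_VALIDATORS:
        # A set bit above the last slot fails loudly instead of being dropped.
        raise SlotOutOfRange(signer_mask.bit_length())
    return [i + 1 for i in range(MAX_VALIDATORS) if (signer_mask >> i) & 1]


def reconstruct_pk_agg(signer_mask: int, registered: dict[int, int]) -> int:
    """Sum the registered keys of the selected slots.

    Integers stand in for G1 points; addition stands in for G1ADD.
    An unregistered slot raises KeyError in this sketch.
    """
    return sum(registered[slot] for slot in slots_from_mask(signer_mask))


print(slots_from_mask(0b101))                          # [1, 3]
print(reconstruct_pk_agg(0b11, {1: 10, 2: 20}))        # 30
```

Because addition is commutative, the aggregate depends only on which bits are set, which is the ordering invariance the tests check.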

Deleted files

  • `contracts/src/modules/validators/BLSValidator.sol` (no longer referenced by any code after the refactor)
  • `contracts/src/interfaces/v3/IBLSValidator.sol`
  • `contracts/src/mocks/MockBLSValidator.sol`
  • `contracts/test/modules/validators/BLSValidator.t.sol`
  • `abis/BLSValidator.json`
  • The BLSValidator steps in 7 deploy / verify scripts

Storage compatibility

The `address public blsValidator` slot in Registry is kept as `address public __deprecated_blsValidator`, which keeps UUPS upgrades safe (no slot shift).

Bumps BLSAggregator to `4.0.0`, Registry to `5.3.0`.

Tests

  • 8 new tests in `BLSAggregator_PkAggReconstruct.t.sol`: reject caller-supplied pkAgg / reconstruction happy path / reject forged pkAgg / signerMask ordering invariance / SlotOutOfRange revert
  • Existing BLSAggregatorUnit / DVT_BLS / GenericDVTProposal / BlacklistSync / SupplementaryLifecycle suites are all updated and passing
  • 437/437 passing (before: 425; +12 new tests)

Spec

`docs/security/2026-04-26-p0-prelaunch.md` §3 P0-1
`docs/security/2026-04-26-threat-model.md` T-01


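Addendum 2026-05-05: fixes from the second-round Codex deep audit

Commit 27301a8: real-time DVT validator liveness checks + revocation functions

Background

The second-round Codex audit (see docs/architecture/dvt-validator-workflow.md) found that the original version checked stake only when addValidator registered a validator; unstaking or being slashed afterwards did not invalidate isValidator or the BLS public key. Attack scenario: a malicious validator unstakes and still keeps its quorum voting power.

Fixes

DVTValidator.sol (+96 lines)

  • New _requireActiveValidator(v): checks in real time isValidator + Registry.hasRole(ROLE_DVT, v) + roleLocks(v, ROLE_DVT) >= cfg.minStake
  • createProposal / executeWithProof call _requireActiveValidator(msg.sender) on entry
  • New pruneValidator(address v): permissionless eviction (succeeds only when v has already lost its role or stake)
  • New removeValidator(address v): onlyOwner forced cleanup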
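
The liveness rule and the prune condition can be modeled in a few lines of Python. This is a sketch under assumptions, not the Solidity implementation: the `ValidatorState` type and its field names are hypothetical, and requiring `is_validator` before a prune can succeed is my assumption rather than something the PR states.

```python
from dataclasses import dataclass


@dataclass
class ValidatorState:
    is_validator: bool   # local isValidator flag
    has_role: bool       # stands in for Registry.hasRole(ROLE_DVT, v)
    locked_stake: int    # stands in for roleLocks(v, ROLE_DVT)


def is_active(v: ValidatorState, min_stake: int) -> bool:
    """All three conditions must hold at call time, not just at registration."""
    return v.is_validator and v.has_role and v.locked_stake >= min_stake


def can_prune(v: ValidatorState, min_stake: int) -> bool:
    """Permissionless eviction: allowed only for a registered validator
    that has lost its role or dropped below the minimum stake."""
    still_eligible = v.has_role and v.locked_stake >= min_stake
    return v.is_validator and not still_eligible


healthy = ValidatorState(is_validator=True, has_role=True, locked_stake=100)
unstaked = ValidatorState(is_validator=True, has_role=True, locked_stake=0)
print(is_active(healthy, 50), can_prune(healthy, 50))     # True False
print(is_active(unstaked, 50), can_prune(unstaked, 50))   # False True
```

The two predicates are complementary for registered validators, so anyone can evict exactly the validators that the entry check would reject.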

BLSAggregator.sol (+77 lines, constructor signature unchanged)

  • _reconstructPkAgg adds a per-slot real-time check for every slot selected by the mask: role + stake liveness
  • revokeBLSPublicKey now has strict semantics (a repeated revoke reverts with KeyNotActive)
  • staking + minStake are obtained through the IRegistryStakingAwareBLS interface, which avoids a circular dependency; deploy scripts need no changes

Tests

  • 24 new tests: DVTValidatorLiveness.t.sol + BLSAggregatorLiveness.t.sol
  • Full regression: 458 passed, 0 failed (44 test suites)

ABI changes (integrators must be notified)

New errors:

  • NotActiveValidator, ValidatorRoleRevoked, ValidatorStillEligible (DVTValidator)
  • SlotValidatorRoleRevoked, SlotValidatorStakeBelowMinimum, StakingNotConfigured, KeyNotActive (BLSAggregator)

New events:

  • ValidatorPruned, ValidatorRemoved, BLSPublicKeyRevoked

New functions: see the 4 listed above

Versions: DVTValidator-0.6.0 / BLSAggregator-4.1.0

Related docs

docs/architecture/dvt-validator-workflow.md (on the security/audit-2026-04-25 branch): a full description of the 5 roles, the three phases (register / run / revoke), and the breaking SDK ABI change (proof encoding: abi.encode(uint256 signerMask, bytes sigG2Bytes)).
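
For SDK authors, the new proof layout can be reproduced by hand with the standard Solidity ABI rules for a `(uint256, bytes)` tuple. The Python sketch below is illustrative only: the function name is hypothetical, and the 256-byte signature length assumes sigG2Bytes uses the EIP-2537 G2 point encoding, which the PR does not spell out.

```python
def encode_proof(signer_mask: int, sig_g2: bytes) -> bytes:
    """ABI-encode (uint256 signerMask, bytes sigG2Bytes)."""
    if not 0 <= signer_mask < 2**256:
        raise ValueError("signerMask must fit in uint256")
    head = signer_mask.to_bytes(32, "big")   # static uint256 word
    head += (64).to_bytes(32, "big")         # offset of the dynamic bytes tail
    padding = (-len(sig_g2)) % 32            # right-pad data to a 32-byte multiple
    tail = len(sig_g2).to_bytes(32, "big") + sig_g2 + b"\x00" * padding
    return head + tail


# 256 zero bytes stand in for an encoded G2 signature (assumed length).
proof = encode_proof(0b0111, bytes(256))
print(len(proof))  # 352 = 32 (mask) + 32 (offset) + 32 (length) + 256 (data)
```

A production client would normally use its ABI library for this; the manual version is only meant to show which words end up in the calldata.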

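Merge dependency (corrects the earlier mistaken "Supersedes" claim)

⚠️ An earlier version wrongly said "Supersedes #105". In fact this PR's base is #105; it is a stacked dependency, not a replacement:

main → #104 → #105 → this PR (#112) → #113

Required merge order: #104 first, then #105, then this PR, and finally #113. After each layer merges, GitHub automatically retargets the downstream PR's base to main.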

@jhfnetboy jhfnetboy requested a review from fanhousanbu as a code owner April 28, 2026 08:05
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4ac34d9c08

Comment on lines +166 to +169
function registerBLSPublicKey(
address validator,
BLS.G1Point calldata publicKey,
uint8 slot
P1: Regenerate published ABIs for changed BLSAggregator interface

This commit changes the on-chain ABI (registerBLSPublicKey now takes (address,BLS.G1Point,uint8) and a new external verify(...) entrypoint is added), but the checked-in artifact abis/BLSAggregator.json still exposes the old (address,bytes) signature and is missing new callable methods like verify, getBLSPublicKey, and validatorAtSlot. Any SDK/tooling that consumes the repo ABI will encode wrong calldata and fail against deployed contracts, so ABI artifacts need to be regenerated in the same change set.


Comment on lines +55 to 57
string[12] memory contractKeys = [
"aPNTs",
"blsAggregator",
P2: Keep deployment schema and verification tooling in sync

By removing blsValidator from the canonical deployment key set here (and in deploy scripts), the generated config schema no longer matches existing root tooling that still expects blsValidator (for example scripts/verify-all.sh reads .blsValidator and tries to verify the deleted BLSValidator.sol). In post-commit deployments this makes the verification pipeline fail before completing other contract checks, so the remaining scripts need to be updated alongside this schema change.


@fanhousanbu (Contributor)

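Code review

Merge rejected: a CRITICAL cryptographic security issue exists

CRITICAL: G1 public key registration lacks on-curve and prime-order subgroup validation
registerBLSPublicKey accepts BLS.G1Point calldata publicKey but does not verify:

  1. whether the point is on the BLS12-381 G1 curve
  2. whether the point is in the prime-order subgroup of order r (small-subgroup check)

The EIP-2537 G1ADD precompile (0x0b) only verifies that a point is on the curve, not that it is in the prime-order subgroup. Attack scenario: the owner is tricked into registering, or maliciously registers, a small-subgroup point as a validator's public key, so that aggregate signature verification deviates from the mathematically expected result, which may lead to signature forgery or abnormal verification failures.

Fix: in registerBLSPublicKey, call EIP-2537's BLS12_G1SUBGROUP_CHECK (precompile address 0x12), or use the equivalent function provided by the BLS.sol library:

require(BLS.isOnCurveAndSubgroupG1(publicKey), "InvalidBLSKey");

If the library does not provide one, call the precompile manually.

HIGH: no end-to-end integration test with real cryptography
All pairing verification tests mock the precompile return value via vm.mockCall(address(0x0F), ...), so no real cryptographic operation runs and BLS12-381 implementation errors or wrong curve parameters cannot be detected.

Suggestion: add an integration test to CI that uses a real BLS12-381 key pair and runs in an anvil environment with EIP-2537 support (Pectra fork):

forge test --fork-url <pectra-fork-rpc> -vvv --match-test test_BLS_RealCrypto

HIGH: the nonce-before-verify design creates a synchronization challenge for operations (see PR#113)
This issue actually surfaces in PR#113 but is related to the BLS verify call chain in PR#112. Aggregator clients must implement a nonce pre-query and bind the correct on-chain nonce into every signature.

INFO: confirm that the BLS.sol library uses EIP-2537 (BLS12-381, 0x0b) rather than the legacy BN254 (0x06)
Judging from P_HIP_LO in the code (BLS12-381 field characteristic), the curve should be correct, but please explicitly confirm in the PR that the precompile address called by BLS.add() is 0x0b (EIP-2537 G1ADD).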

@jhfnetboy (Member, Author)

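Fixed.

Summary of changes:

registerBLSPublicKey now calls a new internal function, _validateG1Point(), which enforces the following checks before registration:

  1. Non-identity check: rejects a G1 point with all-zero coordinates (point at infinity), preventing key-cancellation attacks from contaminating pkAgg reconstruction.

  2. On-curve check (via G1ADD precompile 0x0b): adds the identity to the point; if the precompile returns false, the point is not on the BLS12-381 G1 curve and InvalidBLSKeyNotOnCurve is thrown.

  3. Prime-order subgroup check (via G1MUL precompile 0x0c): computes r*P (r is the subgroup order); if the result is not the identity (all zeros), the point is in a small subgroup and InvalidBLSKeyNotInSubgroup is thrown.

New error types: InvalidBLSKeyNotOnCurve, InvalidBLSKeyNotInSubgroup

Tests (all passing, 31/31 BLS tests pass):

  • test_ValidG1Point_Registers_Successfully: a valid point passes
  • test_IdentityPoint_Rejected_NotOnCurve: the identity is rejected
  • test_OffCurvePoint_Rejected_NotOnCurve: an off-curve point is rejected
  • test_SmallSubgroupPoint_Rejected_NotInSubgroup: a small-subgroup point is rejected

Also fixed setUp() in BLSAggregator_PkAggReconstruct.t.sol, moving the precompile mocks before the registerBLSPublicKey calls so that stub keys pass the new validation gate.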

@jhfnetboy (Member, Author)

CRITICAL resolved: G1 public key on-curve + subgroup validation added

Commit a2c85e4 (fix(p0-1): add G1 on-curve and prime-order subgroup validation at registration) addresses the reviewer's CRITICAL finding.

What was added

_validateG1Point(BLS.G1Point calldata pk) is now called inside registerBLSPublicKey before writing to storage. It performs three sequential checks:

  1. Identity rejection — Rejects an all-zero point (point at infinity). Registering the identity element as a key would create a "ghost" validator slot that trivially passes pairing checks and could bias pkAgg reconstruction.

  2. On-curve check via G1ADD precompile (0x0b) — Calls address(0x0b).staticcall(P || O) (G1ADD with 256-byte input: the candidate point followed by the 128-byte identity). The EIP-2537 precompile rejects points not on the BLS12-381 G1 curve with a failed staticcall; if onCurve == false, the function reverts InvalidBLSKeyNotOnCurve().

  3. Prime-order subgroup check via G1MUL precompile (0x0c) — Calls address(0x0c).staticcall(P || r) where r = 0x73eda753299d7d483339d80809a1d80553bda402fffe5bfeffffffff00000001 (BLS12-381 subgroup order). For any point in the correct prime-order subgroup, r·P = O (all-zero 128-byte output). A small-subgroup point would yield a non-zero result; in that case the function reverts InvalidBLSKeyNotInSubgroup(). The check covers all four 32-byte words of the 128-byte result.

New error types:

  • InvalidBLSKeyNotOnCurve() — point at infinity or not on G1 curve
  • InvalidBLSKeyNotInSubgroup() — on G1 curve but not in the prime-order subgroup
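
The three checks can also be reproduced off-chain, which is handy for sanity-checking keys before submitting them. The Python sketch below is not the contract's code path (the contract uses the precompiles): it derives the BLS12-381 constants from the curve seed using the standard BLS12 family formulas and uses plain, non-constant-time affine arithmetic. All function names and the string return values are illustrative assumptions.

```python
# BLS12-381 constants derived from the curve seed z (BLS12 family formulas).
Z = -0xD201000000010000
R = Z**4 - Z**2 + 1              # prime subgroup order r
P = (Z - 1) ** 2 * R // 3 + Z    # base field modulus p
H = (Z - 1) ** 2 // 3            # G1 cofactor, so #E(Fp) = H * R
B = 4                            # G1 curve: y^2 = x^3 + 4


def on_curve(pt):
    """None models the point at infinity, which is on the curve."""
    if pt is None:
        return True
    x, y = pt
    return (y * y - (x * x * x + B)) % P == 0


def add(p1, p2):
    """Affine point addition; inputs must be on the curve or None."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # P + (-P) = O, including doubling a point with y = 0
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc


def validate_g1(pt):
    """Mirror of the three registration checks, in the same order."""
    if pt is None:
        return "identity"
    if not on_curve(pt):
        return "not on curve"
    if mul(R, pt) is not None:
        return "not in subgroup"
    return "ok"


def first_curve_point():
    """Smallest x >= 1 with a square right-hand side (uses p = 3 mod 4)."""
    x = 1
    while True:
        rhs = (x**3 + B) % P
        y = pow(rhs, (P + 1) // 4, P)
        if y * y % P == rhs:
            return x, y
        x += 1


print(validate_g1((1, 1)))   # not on curve
print(validate_g1((0, 2)))   # not in subgroup: (0, 2) has order 3
```

The point (0, 2) is a convenient small-subgroup test vector: it satisfies y^2 = x^3 + 4 and doubling it gives its own negation, so it has order 3. Multiplying any curve point by the cofactor, for example `mul(H, first_curve_point())`, clears the small-order component and yields a point that is expected to pass all three checks.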

Confirming EIP-2537 vs BN254 (INFO finding)

contracts/src/utils/BLS.sol defines:

address internal constant BLS12_G1ADD = 0x000000000000000000000000000000000000000b;  // EIP-2537
address internal constant BLS12_G1MSM = 0x000000000000000000000000000000000000000C;  // EIP-2537
address internal constant BLS12_PAIRING_CHECK = 0x000000000000000000000000000000000000000F;  // EIP-2537

These are all EIP-2537 BLS12-381 precompiles (0x0b–0x0f). The BN254 ecAdd precompile lives at 0x06 — it is not used anywhere in this codebase.

_validateG1Point calls address(0x0b) (G1ADD) and address(0x0c) (G1MUL) directly, consistent with EIP-2537 semantics and the curve constants in BLS.sol (BLS12-381 field modulus P, subgroup order r).

Test coverage

contracts/test/modules/DVT_BLS.t.sol — all 3 tests pass:

| Test | Result |
| --- | --- |
| test_BLS_ManualVerify | PASS |
| test_DVT_ProposalFlow | PASS |
| test_Fail_NotValidator | PASS |

forge test --match-path "contracts/test/modules/DVT_BLS.t.sol" -vv exits 0 with no failures.

@fanhousanbu (Contributor) left a comment

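Re-review: the _validateG1Point G1ADD + G1MUL subgroup check implementation is confirmed; the CRITICAL fix passes.

MEDIUM: SlotAlreadyTaken(slot) is used for two semantically different cases:

  1. the validator is already registered and tries to change its slot (this should be ValidatorSlotLocked)
  2. the target slot is occupied by another validator (the real SlotAlreadyTaken)

Callers cannot tell the two causes apart. Suggest splitting this into two separate errors. Non-blocking.

LOW: step 1 of _validateG1Point() uses InvalidBLSKeyNotOnCurve() for the identity point, but the identity is technically on the curve and is simply forbidden from registration. The error name is misleading; suggest renaming it to InvalidBLSKeyIsIdentity().

✅ Approved (re-review passed)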

jhfnetboy added a commit that referenced this pull request May 5, 2026
Identity (all-zero G1 point) passes both G1ADD and r*O==O subgroup
checks but is cryptographically invalid. Explicit pre-check matches
fix/p0-1-bls-rewrite (PR #112) hardening.
jhfnetboy added a commit that referenced this pull request May 5, 2026
Documents 5 on-chain roles, register/run/revoke flows, real-time
liveness checks, SDK ABI breaking changes (proof encoding), and
deployment ordering. Companion to PR #112 (BLS rewrite).
jhfnetboy added a commit that referenced this pull request May 5, 2026
Removes stale 'known gaps' note: _reconstructPkAgg now does per-slot
role+stake liveness check. Adds removeValidator/pruneValidator/
revokeBLSPublicKey to revocation flow (B + B'). Updates §5.1 matrix
to include the new real-time validation entries.
fanhousanbu previously approved these changes May 5, 2026
@fanhousanbu (Contributor) left a comment

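Re-review of commit 27301a8: finished reviewing the real-time liveness check implementation.

HIGH (availability): StakingNotConfigured in _reconstructPkAgg is a hard revert.
When GTOKEN_STAKING() returns address(0), every BLS verify/verifyAndExecute/executeProposal call fails and the whole DVT consensus system halts. A Staking contract upgrade, an address rotation, or a wrong initial deployment order can all trigger this. Suggest adding an owner switch (such as a skipLivenessCheck flag) so that operators can temporarily bypass the check in an emergency and a Registry misconfiguration cannot block slashing.

MEDIUM: revokeBLSPublicKey changed from idempotent (silent return) to revert KeyNotActive. Existing ops scripts that call it defensively (revoke first, then re-register) will break unexpectedly. Please document this semantic change in NatSpec or the Migration Guide.

LOW: each _reconstructPkAgg execution makes up to 13 REGISTRY.hasRole calls plus 13 staking.roleLocks external calls. Please add a gas estimate for a single verify() to the PR description so that integrators know the upper bound of the gas budget.

The design direction is right; it closes the hole where a validator kept its voting power after exiting.

✅ Approved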

@jhfnetboy jhfnetboy force-pushed the fix/p0-2-validator-stake-gate branch 2 times, most recently from 2324453 to 4455df8 Compare May 5, 2026 12:49
@jhfnetboy jhfnetboy force-pushed the fix/p0-1-bls-rewrite branch from 27301a8 to 3da096e Compare May 5, 2026 13:05
github-actions Bot commented May 5, 2026


Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by just posting a Pull Request Comment same as the below format.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

Base automatically changed from fix/p0-2-validator-stake-gate to main May 5, 2026 13:32
@jhfnetboy jhfnetboy dismissed fanhousanbu’s stale review May 5, 2026 13:32

The base branch was changed.

jhfnetboy added 3 commits May 5, 2026 20:36
…istration

registerBLSPublicKey now calls _validateG1Point() before storing any key.
The function:
  1. Rejects the identity point (all-zero coordinates) — prevents key-
     cancellation attacks where a ghost validator contaminates pkAgg.
  2. Calls G1ADD precompile (0x0b) with P + O; a failed staticcall means
     P is not on the BLS12-381 G1 curve (InvalidBLSKeyNotOnCurve).
  3. Calls G1MUL precompile (0x0c) with scalar r (subgroup order); if
     r*P != O the point is in a small subgroup and is rejected with
     InvalidBLSKeyNotInSubgroup.

Without (3) an attacker could register a small-subgroup point that biases
the reconstructed pkAgg used in all subsequent pairing checks.

New errors: InvalidBLSKeyNotOnCurve, InvalidBLSKeyNotInSubgroup.

Tests (BLSAggregatorUnit.t.sol):
  - test_ValidG1Point_Registers_Successfully
  - test_IdentityPoint_Rejected_NotOnCurve
  - test_OffCurvePoint_Rejected_NotOnCurve
  - test_SmallSubgroupPoint_Rejected_NotInSubgroup

Also fixed BLSAggregator_PkAggReconstruct.t.sol setUp() to install G1ADD
and G1MUL precompile mocks BEFORE calling registerBLSPublicKey, so stub
keys pass the new validation gate.

All 31 BLS tests pass.
- Add _requireActiveValidator() at createProposal/executeWithProof
- _reconstructPkAgg validates each slot's validator role+stake
- New: DVTValidator.pruneValidator (permissionless), removeValidator (owner)
- New: BLSAggregator.revokeBLSPublicKey (owner)
- Tests for all new paths
@jhfnetboy jhfnetboy force-pushed the fix/p0-1-bls-rewrite branch from 3da096e to 88eb870 Compare May 5, 2026 13:37
@jhfnetboy (Member, Author)

@fanhousanbu Rebased again (#105 P0-2 has been merged into main, so 3 commits were skipped). The code content is unchanged; please approve.

@fanhousanbu (Contributor) left a comment

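✅ Approved (force-push re-review)

The previous review (commit 27301a8) was approved, and the content is identical after this rebase. Reconfirming:

CRITICAL fixes verified (already passed in the previous re-review):

  • The _validateG1Point G1ADD + G1MUL subgroup check is implemented
  • _reconstructPkAgg rebuilds the aggregate public key per slot, closing the hole where a validator kept its voting power after exiting
  • The new revokeBLSPublicKey / pruneValidator / removeValidator paths are correct

HIGH (availability): non-blocking, known limitation
_reconstructPkAgg hard-reverts when GTOKEN_STAKING() == address(0), so the BLS system halts during a Staking contract rotation window. Suggest adding a skipLivenessCheck owner switch in a later sprint. There is currently no owner bypass; operations must make sure the Registry is configured first.

MEDIUM: revokeBLSPublicKey changed from idempotent to revert KeyNotActive; the semantic change should be noted in the Migration Guide (non-blocking).

The design direction is right. Approve.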

@jhfnetboy jhfnetboy merged commit 9d05a29 into main May 5, 2026
1 of 3 checks passed
@jhfnetboy jhfnetboy deleted the fix/p0-1-bls-rewrite branch May 5, 2026 13:49
@github-actions github-actions Bot locked and limited conversation to collaborators May 5, 2026
@jhfnetboy jhfnetboy restored the fix/p0-1-bls-rewrite branch May 10, 2026 06:47