15 changes: 11 additions & 4 deletions internal/tools/clickhouse/0002_clickhouse_create_logs_table.sql
@@ -37,16 +37,23 @@ CREATE TABLE IF NOT EXISTS logs (
transaction_index,
log_index
),
PROJECTION chain_topic0_projection
PROJECTION chain_address_block_number_full_projection
(
SELECT
*
ORDER BY
chain_id,
address,
block_number
),
PROJECTION chain_topic0_full_projection
(
SELECT
_part_offset
*
ORDER BY
chain_id,
topic_0,
block_number,
transaction_index,
log_index,
address
),
PROJECTION address_topic0_state_projection
internal/tools/clickhouse/0006_clickhouse_create_token_transfers.sql
@@ -17,6 +17,7 @@ CREATE TABLE IF NOT EXISTS token_transfers
`insert_timestamp` DateTime DEFAULT now(),
`is_deleted` UInt8 DEFAULT 0,

INDEX idx_block_number block_number TYPE minmax GRANULARITY 1,
⚠️ Potential issue

Add an upgrade migration to materialize the new index on existing clusters.

CREATE TABLE IF NOT EXISTS won’t retrofit this index; existing deployments will miss idx_block_number unless you run ALTER + MATERIALIZE.

Suggested numbered migration (use ON CLUSTER if applicable):

-- add index
ALTER TABLE token_transfers ADD INDEX IF NOT EXISTS idx_block_number block_number TYPE minmax GRANULARITY 1;

-- build index for historical parts
ALTER TABLE token_transfers MATERIALIZE INDEX idx_block_number;
🤖 Prompt for AI Agents
internal/tools/clickhouse/0006_clickhouse_create_token_transfers.sql around line
20: the new idx_block_number index will not be retrofitted into existing
clusters by CREATE TABLE IF NOT EXISTS, so add an upgrade migration that runs an
ALTER TABLE to add the index if missing and then MATERIALIZE the index for
historical parts; the migration should use "ALTER TABLE ... ADD INDEX IF NOT
EXISTS idx_block_number block_number TYPE minmax GRANULARITY 1" and then "ALTER
TABLE ... MATERIALIZE INDEX idx_block_number", and include ON CLUSTER
<cluster_name> if this repo deploys ClickHouse clusters.
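Expanding on the suggestion above, one possible shape for such a numbered upgrade migration (the file name, numbering, and the '{cluster}' macro are placeholders; adapt them to this repo's migration series and deployment topology):

```sql
-- hypothetical file, e.g. internal/tools/clickhouse/00XX_add_token_transfers_block_number_index.sql
-- ON CLUSTER is only needed for replicated/clustered deployments; drop it for single-node.
ALTER TABLE token_transfers ON CLUSTER '{cluster}'
    ADD INDEX IF NOT EXISTS idx_block_number block_number TYPE minmax GRANULARITY 1;

-- Build the index for parts written before the index existed.
-- On large tables this schedules a mutation; watch system.mutations for progress.
ALTER TABLE token_transfers ON CLUSTER '{cluster}'
    MATERIALIZE INDEX idx_block_number;
```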

INDEX idx_block_timestamp block_timestamp TYPE minmax GRANULARITY 1,
INDEX idx_from_address from_address TYPE bloom_filter GRANULARITY 3,
INDEX idx_to_address to_address TYPE bloom_filter GRANULARITY 3,
internal/tools/clickhouse/0008_clickhouse_create_token_balances.sql
@@ -20,6 +20,7 @@ CREATE TABLE IF NOT EXISTS token_balances
`insert_timestamp` DateTime DEFAULT now(),
`is_deleted` UInt8 DEFAULT 0,

INDEX idx_block_number block_number TYPE minmax GRANULARITY 1,
💡 Verification agent

🧩 Analysis chain

Also add an upgrade migration for the new block_number index.

Same issue: CREATE IF NOT EXISTS won’t add idx_block_number to existing tables; add ALTER + MATERIALIZE.

ALTER TABLE token_balances ADD INDEX IF NOT EXISTS idx_block_number block_number TYPE minmax GRANULARITY 1;
ALTER TABLE token_balances MATERIALIZE INDEX idx_block_number;

Optional: keep an eye on storage overhead; minmax with GRANULARITY 1 is usually cheap and helps when queries filter by block_number without token/owner predicates.


🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "PWD: $(pwd)"
echo
echo "== Show file internal/tools/clickhouse/0008_clickhouse_create_token_balances.sql (if present) =="
if [ -f internal/tools/clickhouse/0008_clickhouse_create_token_balances.sql ]; then
  sed -n '1,240p' internal/tools/clickhouse/0008_clickhouse_create_token_balances.sql || true
else
  echo "MISSING: internal/tools/clickhouse/0008_clickhouse_create_token_balances.sql"
fi
echo
echo "== Search for idx_block_number / token_balances / MATERIALIZE INDEX / ALTER TABLE occurrences =="
rg -n --hidden --no-ignore -S "idx_block_number|MATERIALIZE INDEX idx_block_number|MATERIALIZE INDEX|ALTER TABLE token_balances|ADD INDEX idx_block_number|ADD INDEX IF NOT EXISTS idx_block_number|token_balances" || true
echo
echo "== List files in internal/tools/clickhouse (if dir exists) =="
if [ -d internal/tools/clickhouse ]; then
  ls -1 internal/tools/clickhouse | sed -n '1,200p' || true
else
  echo "DIR NOT FOUND: internal/tools/clickhouse"
fi

Length of output: 5878


Add ALTER migration to materialize idx_block_number on existing tables

CREATE TABLE IF NOT EXISTS does not add new indexes to pre-existing tables — add an ALTER + MATERIALIZE migration for token_balances (and token_transfers if already deployed).

Locations:

  • internal/tools/clickhouse/0008_clickhouse_create_token_balances.sql (idx_block_number at line 23)
  • internal/tools/clickhouse/0006_clickhouse_create_token_transfers.sql (same index)

Suggested SQL:

ALTER TABLE token_balances ADD INDEX IF NOT EXISTS idx_block_number block_number TYPE minmax GRANULARITY 1;
ALTER TABLE token_balances MATERIALIZE INDEX idx_block_number;

ALTER TABLE token_transfers ADD INDEX IF NOT EXISTS idx_block_number block_number TYPE minmax GRANULARITY 1;
ALTER TABLE token_transfers MATERIALIZE INDEX idx_block_number;
🤖 Prompt for AI Agents
internal/tools/clickhouse/0008_clickhouse_create_token_balances.sql around line
23: CREATE TABLE IF NOT EXISTS does not materialize new indexes on existing
tables, so add an ALTER migration that adds and materializes idx_block_number
for token_balances (and do the same for token_transfers if deployed). Create a
new migration SQL (or append to existing migration series) that runs ALTER TABLE
token_balances ADD INDEX IF NOT EXISTS idx_block_number block_number TYPE minmax
GRANULARITY 1; followed by ALTER TABLE token_balances MATERIALIZE INDEX
idx_block_number; and likewise for token_transfers (ADD INDEX IF NOT EXISTS ...
and MATERIALIZE INDEX ...) to ensure the index is applied to pre-existing
tables.
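After running the ALTERs above, one way to confirm the index landed on both tables is to query the ClickHouse system tables (a sketch; exact column availability varies slightly across ClickHouse versions):

```sql
-- Should return one row per table once the index is added.
SELECT table, name, type_full, granularity
FROM system.data_skipping_indices
WHERE table IN ('token_transfers', 'token_balances')
  AND name = 'idx_block_number';

-- And check that the MATERIALIZE INDEX mutations have finished.
SELECT table, command, is_done
FROM system.mutations
WHERE command LIKE '%idx_block_number%';
```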

INDEX idx_block_timestamp block_timestamp TYPE minmax GRANULARITY 1,
INDEX idx_token_address token_address TYPE bloom_filter GRANULARITY 3,
INDEX idx_owner_address owner_address TYPE bloom_filter GRANULARITY 3,
@@ -62,6 +63,6 @@ CREATE TABLE IF NOT EXISTS token_balances
)
)
ENGINE = ReplacingMergeTree(insert_timestamp, is_deleted)
PARTITION BY chain_id
PARTITION BY (chain_id, toStartOfQuarter(block_timestamp))

⚠️ Potential issue

Partition change won’t apply to existing deployments; add an explicit migration plan.

This file uses CREATE TABLE IF NOT EXISTS, so clusters that already have token_balances (partitioned by chain_id) will not be altered. You’ll ship schema drift: fresh installs get (chain_id, quarter) while existing deployments stay on chain_id only.

Recommended migration (outline):

  • Create a v2 table with the new partition key.
  • Pause or dual-write ingestion.
  • Backfill via INSERT … SELECT.
  • Atomically swap tables; then drop the old one.
  • Do this ON CLUSTER if applicable.

Example SQL (adapt to your deploy tooling and cluster topology):

-- 1) Create new table
CREATE TABLE token_balances_v2
(
  -- same columns, indexes, projections as current token_balances
) ENGINE = ReplacingMergeTree(insert_timestamp, is_deleted)
PARTITION BY (chain_id, toStartOfQuarter(block_timestamp))
ORDER BY (chain_id, owner_address, token_address, token_id, block_number, transaction_index, log_index, direction)
SETTINGS index_granularity = 8192,
         lightweight_mutation_projection_mode = 'rebuild',
         deduplicate_merge_projection_mode = 'rebuild',
         allow_part_offset_column_in_projections = 1;

-- 2) Backfill
INSERT INTO token_balances_v2 SELECT * FROM token_balances;

-- 3) Swap (atomic)
RENAME TABLE token_balances TO token_balances_old, token_balances_v2 TO token_balances;

-- 4) Resume writes; validate; then drop old
DROP TABLE token_balances_old;

If this file is intended only for fresh installs, add a separate numbered migration that performs the above for upgrades to avoid silent divergence.

🤖 Prompt for AI Agents
internal/tools/clickhouse/0008_clickhouse_create_token_balances.sql around line
65: the CREATE TABLE IF NOT EXISTS change adds a new PARTITION BY (chain_id,
toStartOfQuarter(block_timestamp)) but will not alter existing deployments that
already have token_balances partitioned only by chain_id, causing schema drift;
add an explicit migration that creates a token_balances_v2 with the new
partition key (matching all columns, indexes, projections and engine settings),
pause or dual-write ingestion if needed, backfill data via INSERT … SELECT to
token_balances_v2, perform an atomic swap (RENAME TABLE token_balances TO
token_balances_old, token_balances_v2 TO token_balances) and then resume writes
and validate before dropping token_balances_old, and ensure the migration runs
ON CLUSTER where applicable or provide this as an upgrade-only numbered
migration separate from the fresh-install DDL.
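To reduce the risk of step 3, a few sanity checks before dropping the old table may help (a sketch; what counts as "matching" depends on your validation policy, and under ReplacingMergeTree raw counts can differ transiently until merges deduplicate — compare with FINAL if exactness matters):

```sql
-- Row counts should agree once backfill and any dual-writes have settled.
SELECT
    (SELECT count() FROM token_balances)     AS new_rows,
    (SELECT count() FROM token_balances_old) AS old_rows;

-- Confirm the new table actually carries the quarterly partition key.
SELECT partition, sum(rows) AS rows
FROM system.parts
WHERE table = 'token_balances' AND active
GROUP BY partition
ORDER BY partition;
```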

ORDER BY (chain_id, owner_address, token_address, token_id, block_number, transaction_index, log_index, direction)
SETTINGS index_granularity = 8192, lightweight_mutation_projection_mode = 'rebuild', deduplicate_merge_projection_mode = 'rebuild', allow_part_offset_column_in_projections=1;