Fix disk_info accounting for hash indexes #631

Merged
adsharma merged 3 commits into LadybugDB:main from rahul-iyer:fix-disk-info-hash-accounting
Jun 29, 2026

@rahul-iyer

@rahul-iyer rahul-iyer commented Jun 28, 2026

Contributor

Summary

This PR fixes hash index storage accounting in disk_info() / storage_info() and makes SHOW_INDEXES()
include the built-in _PK hash index so index visibility is consistent with the underlying storage state.

Problem

There were two related inconsistencies:

  1. Hash-backed primary-key index storage was not fully reflected in disk_info() / storage_info().
  2. SHOW_INDEXES() did not surface the built-in _PK hash index, even though that index exists physically
    and shows up in storage-oriented views.

This made it harder to reason about index footprint and led to confusing gaps between catalog and
storage information.

What changed

  1. Hash index storage accounting
  • Expose hash index storage through Index::getStorageEntries()

  • Report logical hash index components such as:

    • hash_index_headers
    • disk_array_headers
    • primary_slots
    • overflow_slots when present
    • string_overflow when present
  • Include the relevant disk-array page ranges so allocated hash index pages are accounted for correctly

  • Aggregate repeated disk_info() entries by logical component name instead of emitting a row per
    page/range

  2. SHOW_INDEXES() visibility
  • Include storage-backed built-in _PK indexes in SHOW_INDEXES()
  • Avoid duplicate rows when a catalog-backed PK index already represents the same logical PK index
  3. Tests
  • Added/updated coverage for:
    • hash index storage reporting in disk_info()
    • _PK visibility in SHOW_INDEXES()
    • ART secondary-index expectations now that _PK is included in SHOW_INDEXES()
    • updated storage_info() row count expectations where PK index rows are now included

Behavior after this change

SHOW_INDEXES() now reflects both user-created indexes and the built-in _PK hash index when present.

Example:

CALL SHOW_INDEXES() WHERE table_name = 't'
RETURN table_name, index_name, index_type, property_names
ORDER BY index_name;

Before:

t | t_name_idx | ART | [name]

After:

t | _PK | HASH | [id]
t | t_name_idx | ART | [name]
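
As a rough illustration of the "avoid duplicate rows" rule, the merge can be pictured as follows. This is only a sketch in Python, not the C++ code in this PR: `IndexRow`, `merge_index_rows`, and the choice of `(table_name, index_name)` as the deduplication key are all assumptions made for the example.

```python
from typing import List, NamedTuple, Tuple


class IndexRow(NamedTuple):
    """Hypothetical stand-in for one SHOW_INDEXES() row."""
    table_name: str
    index_name: str
    index_type: str
    property_names: Tuple[str, ...]


def merge_index_rows(catalog_rows: List[IndexRow],
                     storage_rows: List[IndexRow]) -> List[IndexRow]:
    """Combine catalog-backed and storage-backed rows without duplicates.

    Assumption: two rows describe the same logical index when they share
    (table_name, index_name). Catalog rows take precedence.
    """
    seen = {(r.table_name, r.index_name) for r in catalog_rows}
    merged = list(catalog_rows)
    for row in storage_rows:
        key = (row.table_name, row.index_name)
        if key not in seen:
            seen.add(key)
            merged.append(row)
    # Mirror the ORDER BY index_name used in the example query.
    return sorted(merged, key=lambda r: (r.table_name, r.index_name))


catalog = [IndexRow("t", "t_name_idx", "ART", ("name",))]
storage = [IndexRow("t", "_PK", "HASH", ("id",))]
for row in merge_index_rows(catalog, storage):
    print(row.table_name, row.index_name, row.index_type, list(row.property_names))
```

If the catalog already carried a `_PK` row for `t`, the storage-backed row would be skipped, so the table would still list `_PK` once.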

disk_info() stays summary-oriented and grouped by logical storage component.

storage_info() stays detailed and emits one row per physical storage range.
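
The relationship between the two views can be sketched like this: a storage_info()-style listing has one entry per physical range, and a disk_info()-style summary sums those entries per logical component name. The Python below is an illustration under assumptions, not the PR's implementation; the `(name, start_page, num_pages)` tuple shape, the `summarize_by_component` helper, and the sample page ranges are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical per-range entry: (component name, first page, number of pages).
RangeEntry = Tuple[str, int, int]


def summarize_by_component(ranges: Iterable[RangeEntry]) -> Dict[str, int]:
    """Collapse per-range entries into one page count per component name."""
    totals: Dict[str, int] = defaultdict(int)
    for name, _start_page, num_pages in ranges:
        totals[name] += num_pages
    return dict(totals)


# Detailed view: the same component may appear in several ranges.
detailed = [
    ("t._PK:primary_slots", 10, 16),
    ("t._PK:hash_index_headers", 4, 2),
    ("t._PK:primary_slots", 30, 24),
]

# Summary view: one row per logical component.
for name, pages in summarize_by_component(detailed).items():
    print(name, pages)
```

Grouping by name keeps the summary short even when a component's pages are scattered across many ranges.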

@adsharma

adsharma commented Jun 28, 2026

Contributor

Much needed! Thank you for the contribution.

python3 scripts/run-clang-format.py --clang-format-executable /opt/homebrew/bin/clang-format-18 -i -r src

or something similar should fix the lint. Note that clang-format-18 on Ubuntu is broken/patched and behaves differently, in case you're on that platform.

Will try to get this in before the release since it fills an important gap.

@adsharma

Contributor

Bonus: add the hash index to show_indexes() as well.

lbug> call disk_size_info() return *;
┌────────────┬───────────────────────────────────┬───────────┬────────────┐
│ category   │ name                              │ num_pages │ size_bytes │
│ STRING     │ STRING                            │ UINT64    │ UINT64     │
├────────────┼───────────────────────────────────┼───────────┼────────────┤
│ header     │ database_header                   │ 1         │ 4096       │
│ catalog    │ catalog                           │ 1         │ 4096       │
│ metadata   │ metadata                          │ 2         │ 8192       │
│ node_table │ stats_node                        │ 3         │ 12288      │
│ index      │ stats_node._PK:hash_index_headers │ 2         │ 8192       │
│ index      │ stats_node._PK:disk_array_headers │ 3         │ 12288      │
│ index      │ stats_node._PK:primary_slots      │ 40        │ 163840     │
│ free_space │ free_pages                        │ 0         │ 0          │
│ total      │ file_total                        │ 52        │ 212992     │
└────────────┴───────────────────────────────────┴───────────┴────────────┘
(9 tuples)
(4 columns)
Time: 6.52ms (compiling), 1.00ms (executing)
lbug> call show_indexes() return *;
┌────────────┬────────────┬────────────┬────────────────┬──────────────────┬──────────────────┐
│ table_name │ index_name │ index_type │ property_names │ extension_loaded │ index_definition │
│ STRING     │ STRING     │ STRING     │ STRING[]       │ BOOL             │ STRING           │
├────────────┼────────────┼────────────┼────────────────┼──────────────────┼──────────────────┤
└────────────┴────────────┴────────────┴────────────────┴──────────────────┴──────────────────┘
(0 tuples)
(6 columns)
Time: 0.49ms (compiling), 0.41ms (executing)
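
The numbers in the disk_size_info() output above can be cross-checked with a few lines of Python. The rows are copied from that output; the 4096-byte page size is inferred from the size_bytes / num_pages ratio rather than stated anywhere in this thread, and treating file_total as the sum of the other rows is an observation about this sample only.

```python
PAGE_SIZE = 4096  # inferred from the sample output, not from the source code

# (name, num_pages, size_bytes) copied from the disk_size_info() output above.
rows = [
    ("database_header", 1, 4096),
    ("catalog", 1, 4096),
    ("metadata", 2, 8192),
    ("stats_node", 3, 12288),
    ("stats_node._PK:hash_index_headers", 2, 8192),
    ("stats_node._PK:disk_array_headers", 3, 12288),
    ("stats_node._PK:primary_slots", 40, 163840),
    ("free_pages", 0, 0),
]
file_total = ("file_total", 52, 212992)

# Every row's byte size is its page count times the page size.
for name, pages, size in rows:
    assert size == pages * PAGE_SIZE, name

total_pages = sum(pages for _, pages, _ in rows)
index_pages = sum(pages for name, pages, _ in rows if "._PK:" in name)

print(total_pages, index_pages, index_pages * PAGE_SIZE)  # 52 45 184320
```

In this sample the three `_PK` components account for 45 of the 52 pages, which is the footprint that the empty show_indexes() result gives no hint of.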

@rahul-iyer

Contributor Author

@adsharma Thanks, yes will update it.

@rahul-iyer rahul-iyer force-pushed the fix-disk-info-hash-accounting branch from a9b13cd to 380ebc2 June 28, 2026 21:07
@rahul-iyer

rahul-iyer commented Jun 28, 2026

Contributor Author

@adsharma Could you review this? I also have a couple of features I would like to add. Is there a Discord channel where I can discuss them?

Dependent PR LadybugDB/extensions#19

@rahul-iyer rahul-iyer force-pushed the fix-disk-info-hash-accounting branch from 4e845d6 to d19467e June 29, 2026 00:18
@rahul-iyer rahul-iyer marked this pull request as draft June 29, 2026 00:19
@adsharma

Contributor

Discord channel is linked from the bottom right corner of ladybugdb.com

@adsharma

adsharma commented Jun 29, 2026

Contributor

Merged the extension PR. Please reset the submodule to LadybugDB/extensions@3a45fc4 and the CI should be green after that.

@adsharma adsharma left a comment

Contributor

Looks great. One minor comment.

Comment thread src/storage/overflow_file.cpp Outdated
@rahul-iyer rahul-iyer marked this pull request as ready for review June 29, 2026 03:38
@adsharma adsharma merged commit fe80a22 into LadybugDB:main Jun 29, 2026
4 checks passed

2 participants