`create_node|edge_attr_index` for SQLGraph by yfukai · Pull Request #223 · royerlab/tracksdata

yfukai · 2025-12-10T05:25:40Z

This pull request introduces a new feature for the SQLGraph backend: the ability to create explicit database indexes on node and edge attributes to improve query performance, especially for frequently filtered attributes. The documentation and tests have been updated to reflect and validate this functionality.

SQLGraph indexing improvements:

- Added methods `ensure_node_attr_index` and `ensure_edge_attr_index` to SQLGraph for creating indexes on node and edge attribute columns, including support for composite and unique indexes. (src/tracksdata/graph/_sql_graph.py)
- Updated documentation to describe the new index feature and provide usage examples for creating indexes on attributes. (docs/concepts.md)
- Updated the project README to mention SQLGraph's ability to index frequently queried attributes for faster filtering. (README.md)
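
For readers unfamiliar with what single-column, composite, and unique indexes look like at the database level, the following is a minimal sketch using Python's standard `sqlite3` module rather than tracksdata itself. The table name `nodes`, the column names, and the index names are all hypothetical; the PR text does not show the SQL that SQLGraph actually emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (node_id INTEGER PRIMARY KEY, t INTEGER, label INTEGER, attr1 INTEGER)"
)

# Single-column index: helps filters such as WHERE attr1 = ?
conn.execute("CREATE INDEX ix_nodes_attr1 ON nodes (attr1)")

# Composite index: helps when both columns are filtered together
conn.execute("CREATE INDEX ix_nodes_t_attr1 ON nodes (t, attr1)")

# Unique index: additionally rejects duplicate (t, label) pairs
conn.execute("CREATE UNIQUE INDEX ux_nodes_t_label ON nodes (t, label)")

conn.execute("INSERT INTO nodes (t, label, attr1) VALUES (0, 1, 5)")
try:
    conn.execute("INSERT INTO nodes (t, label, attr1) VALUES (0, 1, 7)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# PRAGMA index_list rows are (seq, name, unique, origin, partial)
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list('nodes')").fetchall())
print(index_names, duplicate_rejected)
```

A unique index is therefore both a performance tool and a constraint, so it should only be requested for attribute combinations that really are unique.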

Testing and validation:

- Added tests to ensure index creation works as expected, including checks for composite and unique indexes, and error handling for missing columns. (src/tracksdata/graph/_test/test_graph_backends.py)
- Added `sqlalchemy` import to support index inspection in tests. (src/tracksdata/graph/_test/test_graph_backends.py)
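
The kind of check described above can be sketched as follows. The PR's tests use sqlalchemy for index inspection; this illustration uses the standard library's `sqlite3` PRAGMAs instead so that it has no dependencies. The `edges` table, its columns, and the `describe_indexes` helper are hypothetical and are not taken from the tracksdata test suite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE edges (edge_id INTEGER PRIMARY KEY, source_id INTEGER, target_id INTEGER, weight REAL)"
)
conn.execute("CREATE UNIQUE INDEX ux_edges_pair ON edges (source_id, target_id)")


def describe_indexes(connection, table):
    """Return {index_name: (is_unique, [column, ...])} for one table."""
    result = {}
    # PRAGMA index_list rows start with (seq, name, unique, ...)
    for _, name, unique, *_ in connection.execute(f"PRAGMA index_list('{table}')").fetchall():
        # PRAGMA index_info rows are (seqno, cid, column_name)
        info = connection.execute(f"PRAGMA index_info('{name}')").fetchall()
        result[name] = (bool(unique), [row[2] for row in info])
    return result


indexes = describe_indexes(conn, "edges")

# Indexing a column that does not exist should raise instead of passing silently
try:
    conn.execute("CREATE INDEX ix_missing ON edges (no_such_column)")
    missing_column_error = None
except sqlite3.OperationalError as exc:
    missing_column_error = str(exc)

print(indexes, missing_column_error)
```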

codecov-commenter · 2025-12-10T05:37:23Z

Codecov Report

❌ Patch coverage is 83.33333% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.50%. Comparing base (c5af9f3) to head (72000bd).
⚠️ Report is 7 commits behind head on main.

Files with missing lines	Patch %	Lines
src/tracksdata/graph/_sql_graph.py	83.33%	3 Missing and 3 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #223      +/-   ##
==========================================
+ Coverage   88.45%   88.50%   +0.04%     
==========================================
  Files          55       55              
  Lines        3890     3993     +103     
  Branches      674      700      +26     
==========================================
+ Hits         3441     3534      +93     
- Misses        267      275       +8     
- Partials      182      184       +2

yfukai · 2025-12-10T10:23:52Z

Maybe we also need functions to drop indexes.
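
As a rough illustration of what a drop operation amounts to in SQL, here is a small sketch with the standard `sqlite3` module. No drop method is shown in this PR, so nothing here reflects the tracksdata API; the table, index name, and `index_names` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (node_id INTEGER PRIMARY KEY, attr1 INTEGER)")
conn.execute("CREATE INDEX ix_nodes_attr1 ON nodes (attr1)")


def index_names(connection):
    """List all index names recorded in the SQLite schema table."""
    rows = connection.execute("SELECT name FROM sqlite_master WHERE type = 'index'").fetchall()
    return [row[0] for row in rows]


before = index_names(conn)
conn.execute("DROP INDEX IF EXISTS ix_nodes_attr1")
conn.execute("DROP INDEX IF EXISTS ix_nodes_attr1")  # second call is a harmless no-op
after = index_names(conn)
print(before, after)
```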

JoOkuma

@yfukai, awesome PR, I hadn't thought of adding this.

JoOkuma · 2025-12-10T17:04:38Z

+SQLGraph lets you create indexes on node or edge attributes to keep repeated
+filters fast:


@yfukai this is awesome. Could you briefly mention what kind of speed-up we can expect with this? 2x, 10x?

yfukai · 2026-01-09

I benchmarked the performance and added the result to the doc!

yfukai

Benchmarked the performance improvement by indexing. Code:

import tempfile
import time

import tracksdata as td

if __name__ == "__main__":
    for node_count in [1_000_000, 100_000_000]:
        print(f"\nBenchmarking SQLGraph with {node_count} nodes")
        # delete=False keeps the temporary database file on disk for SQLGraph to use
        graph_db_file = tempfile.NamedTemporaryFile(suffix=".db", delete=False).name
        graph: td.graph.SQLGraph = td.graph.SQLGraph(
            drivername="sqlite",
            database=graph_db_file,
            overwrite=True,
        )
        # attr1 cycles through 100 distinct values, so each filter matches 1% of the nodes
        graph.add_node_attr_key("attr1", 0)
        graph.bulk_add_nodes([{td.DEFAULT_ATTR_KEYS.T: i, "attr1": i % 100} for i in range(node_count)])
        print("Finished adding nodes.")

        # measure time to filter nodes by attr1 without an index
        start_time = time.perf_counter()
        filtered_graph = graph.filter(td.NodeAttr("attr1") == 0).subgraph()
        time_without_index = time.perf_counter() - start_time
        print(f"Time to filter nodes without index: {time_without_index:.2f} seconds")

        # create the index (this method was renamed to create_node_attr_index before merge)
        graph.ensure_node_attr_index("attr1")

        # repeat the same filter with the index in place
        start_time = time.perf_counter()
        filtered_graph = graph.filter(td.NodeAttr("attr1") == 0).subgraph()
        time_with_index = time.perf_counter() - start_time
        print(f"Time to filter nodes with index: {time_with_index:.2f} seconds")
        print(f"Speedup factor: {time_without_index / time_with_index:.2f}x")
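
To see why an index changes filter cost, independent of wall-clock timing, one can look at the query plan the database chooses. The sketch below uses plain `sqlite3` with a hypothetical `nodes` table that mirrors the benchmark's `attr1 = i % 100` layout at a much smaller scale; it does not use tracksdata and makes no claim about the queries SQLGraph generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (node_id INTEGER PRIMARY KEY, t INTEGER, attr1 INTEGER)")
conn.executemany(
    "INSERT INTO nodes (t, attr1) VALUES (?, ?)",
    [(i, i % 100) for i in range(10_000)],
)

QUERY = "SELECT * FROM nodes WHERE attr1 = 0"


def query_plan(connection, sql):
    """Join the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = connection.execute(f"EXPLAIN QUERY PLAN {sql}").fetchall()
    return " | ".join(row[-1] for row in rows)


plan_without_index = query_plan(conn, QUERY)  # full table scan
conn.execute("CREATE INDEX ix_nodes_attr1 ON nodes (attr1)")
plan_with_index = query_plan(conn, QUERY)  # lookup through the new index

matching_rows = conn.execute("SELECT COUNT(*) FROM nodes WHERE attr1 = 0").fetchone()[0]
print(plan_without_index)
print(plan_with_index)
print(matching_rows)
```

Without the index the plan is a scan over every row; with it, the plan names `ix_nodes_attr1`, so only the matching rows are visited. The exact wording of the plan text varies between SQLite versions.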

JoOkuma · 2026-01-09T17:23:13Z

@yfukai, that's an amazing speedup!
One last comment: do you think we could replace `ensure` with `set` in the method names for clarity?
I can make the change if you agree and are busy.
The docs already make it clear that nothing happens if the attributes are already indexed.

yfukai · 2026-01-10T12:36:07Z

Sure! Can we use "create_{node|edge}_attr_index" then? This agrees with the actual SQL statement.

JoOkuma · 2026-01-12T16:26:08Z

@yfukai, that's even better.
Merged, thanks for this PR.
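
The naming discussion above hinges on the operation being safe to repeat: creating an index that already exists should do nothing. One common way to get that behavior in SQL is `CREATE INDEX IF NOT EXISTS`, sketched here with `sqlite3`. Whether SQLGraph uses this statement or checks for existing indexes some other way is not stated in the PR; `create_attr_index` and the index naming scheme are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (node_id INTEGER PRIMARY KEY, attr1 INTEGER)")


def create_attr_index(connection, table, column):
    """Hypothetical helper: create an index unless one with this name already exists."""
    name = f"ix_{table}_{column}"
    connection.execute(f'CREATE INDEX IF NOT EXISTS "{name}" ON "{table}" ("{column}")')
    return name


first = create_attr_index(conn, "nodes", "attr1")
second = create_attr_index(conn, "nodes", "attr1")  # no error and no duplicate index

count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'index' AND name = ?", (first,)
).fetchone()[0]
print(first, second, count)
```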

JoOkuma and others added 30 commits November 5, 2025 10:56

fixing benchmark results parsing and polars version

fa90cb6

testing another fix

11b04a8

Merge branch 'main' of https://github.com/royerlab/tracksdata into fi…

228224b

…x_benchmark

running?

31400e3

updated workflow

4de0e30

appending comment to PR

19d9da3

added indexing functionality for SQLGraph

3f557bf

added mask regionprops computation

8468ab9

Merge branch 'main' of https://github.com/royerlab/tracksdata

4b62a79

max_id per timepoint fix

13f70ae

Merge branch 'fix_add_node_issue_sql'

c874580

patch spatial_filter

9f828a3

added test

4195520

Merge branch 'sqlgraph_bbox_array'

6a95632

Update src/tracksdata/nodes/_mask.py

1651665

Co-authored-by: Jordão Bragantini <jordao.bragantini@gmail.com>

Update src/tracksdata/nodes/_mask.py

741c15f

Co-authored-by: Jordão Bragantini <jordao.bragantini@gmail.com>

Update src/tracksdata/nodes/_mask.py

c8ed899

Co-authored-by: Jordão Bragantini <jordao.bragantini@gmail.com>

import change and changing to function

9cdff52

removed wrapping of regionprops

4555062

Merge branch 'mask_centroid_calc'

15a22d0

Added caching to spatial_filter

947b1de

updated impl using group_by

73c1584

test update

cd33d22

Merge branch 'fix_add_node_issue_sql'

29d6a55

Merge branch 'cache_spatial_filter'

e849db0

Merge remote-tracking branch 'upstream/main' into fix_benchmark

3fcc6df

Merge branch 'main' of https://github.com/royerlab/tracksdata

58b06e4

Merge branch 'main' of https://github.com/royerlab/tracksdata into fi…

d3814e7

…x_benchmark

removed constraint

da83638

trying conda

800b0d7

yfukai added 8 commits December 9, 2025 15:22

reveerted pyproject

8fa64ee

solved reinstallation issue

241cdd6

update

b5bc04c

Merge branch 'fix_benchmark'

8c614e2

Merge branch 'jookuma/sql-graph-performance-improv'

c2588c2

Merge remote-tracking branch 'upstream/main' into sql_indexing

1f60f0e

updated test for adding index

de518ec

fixed test

1960079

JoOkuma approved these changes Dec 10, 2025

yfukai and others added 11 commits December 11, 2025 09:55

Merge remote-tracking branch 'upstream/main' into sql_indexing

a5a4efd

added benchmark

472be88

Merge branch 'main' into sql_indexing

ac1dc37

fixed bug

6dadf30

formatted

368185b

Update src/tracksdata/graph/_sql_graph.py

278f94a

Co-authored-by: Jordão Bragantini <jordao.bragantini@gmail.com>

Merge branch 'sql_indexing' of https://github.com/yfukai/tracksdata i…

f901779

…nto sql_indexing

update

bbb6b0a

fixed wrong update

673992a

fixed lint

3594699

added performance to concepts.d

8b43eb2

yfukai commented Jan 9, 2026

yfukai added 2 commits January 10, 2026 21:37

renamed from ensure to create

57e7ed9

fixed test

72000bd

JoOkuma changed the title ~~ensure_node|edge_attr_index for SQLGraph~~ create_node|edge_attr_index for SQLGraph Jan 12, 2026

JoOkuma merged commit e4e820f into royerlab:main Jan 12, 2026
7 checks passed

JoOkuma merged 79 commits into royerlab:main from yfukai:sql_indexing

3 participants
