Skip to content

Changed successors and predecessors so that they return IDs.#188

Merged
JoOkuma merged 7 commits intoroyerlab:mainfrom
yfukai:sucessor_predecessor_nodeids
Nov 13, 2025
Merged

Changed successors and predecessors so that they return IDs.#188
JoOkuma merged 7 commits intoroyerlab:mainfrom
yfukai:sucessor_predecessor_nodeids

Conversation

@yfukai
Copy link
Copy Markdown
Contributor

@yfukai yfukai commented Nov 4, 2025

Closes #187.

API and Interface Updates

  • Added a return_attrs parameter (defaulting to False) to successors and predecessors methods in src/tracksdata/graph/_base_graph.py, updating their overloads and documentation to clarify return types for both node IDs and attributes.
  • Updated all graph backend implementations (_rustworkx_graph.py, _sql_graph.py, _graph_view.py) to propagate the return_attrs parameter and handle both return modes, including changes to _get_neighbors and related methods.

Internal Logic and Performance

  • Optimized neighbor retrieval logic so that when return_attrs is False, only node IDs are returned, avoiding unnecessary DataFrame creation and improving efficiency for large graphs.
  • Ensured correct mapping between internal and external node IDs in views and backends for both attribute and ID-only queries.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Nov 4, 2025

Codecov Report

❌ Patch coverage is 88.52459% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.05%. Comparing base (75f28d6) to head (bb50ae3).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
src/tracksdata/graph/_sql_graph.py 64.28% 4 Missing and 1 partial ⚠️
src/tracksdata/graph/_rustworkx_graph.py 93.93% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #188      +/-   ##
==========================================
- Coverage   88.17%   88.05%   -0.13%     
==========================================
  Files          51       51              
  Lines        3612     3632      +20     
  Branches      625      633       +8     
==========================================
+ Hits         3185     3198      +13     
- Misses        256      261       +5     
- Partials      171      173       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread src/tracksdata/graph/_test/test_graph_backends.py Outdated
Comment thread src/tracksdata/graph/_test/test_graph_backends.py Outdated
@yfukai
Copy link
Copy Markdown
Contributor Author

yfukai commented Nov 11, 2025

This is also ready! @JoOkuma

@yfukai yfukai marked this pull request as ready for review November 11, 2025 23:25
Copy link
Copy Markdown
Member

@JoOkuma JoOkuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @yfukai, thanks for the PR.

I left one comment that could reduce code duplication. I think it might make it easier to maintain. What do you think?

Comment thread src/tracksdata/graph/_graph_view.py Outdated
Comment on lines +667 to +688
if isinstance(attr_keys, str):
attr_keys = [attr_keys]
valid_schema = None
neighbors: dict[int, pl.DataFrame] = {}
for node_id in node_ids:
neighbors_indices = neighbors_func(rx_graph, node_id)
neighbors_data: list[dict[str, Any]] = [rx_graph[i] for i in neighbors_indices]

if attr_keys is not None:
neighbors_data = [
{k: edge_data[k] for k in attr_keys if k != DEFAULT_ATTR_KEYS.NODE_ID}
for edge_data in neighbors_data
]

if attr_keys is not None:
neighbors_data = [
{k: edge_data[k] for k in attr_keys if k != DEFAULT_ATTR_KEYS.NODE_ID}
for edge_data in neighbors_data
]

if len(neighbors_data) > 0:
df = pl.DataFrame(neighbors_data)
if attr_keys is None or DEFAULT_ATTR_KEYS.NODE_ID in attr_keys:
df = df.with_columns(
pl.Series(DEFAULT_ATTR_KEYS.NODE_ID, np.asarray(neighbors_indices, dtype=int)),
)
neighbors[node_id] = df
valid_schema = neighbors[node_id].schema
if len(neighbors_data) > 0:
df = pl.DataFrame(neighbors_data)
if attr_keys is None or DEFAULT_ATTR_KEYS.NODE_ID in attr_keys:
df = df.with_columns(
pl.Series(DEFAULT_ATTR_KEYS.NODE_ID, np.asarray(neighbors_indices, dtype=int)),
)
neighbors[node_id] = df
valid_schema = neighbors[node_id].schema
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could reduce the amount of code duplication when return_attrs=True by querying the indices with a call of the same function with return_attrs=False and then querying the dataframes. What do you think?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that was also an option which is better maybe. Let me try on that track!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I just renamed a variable. Once the test passes, I'll merge, thanks for the PR!

yfukai and others added 2 commits November 13, 2025 07:21
Co-authored-by: Jordão Bragantini <jordao.bragantini@gmail.com>
@yfukai
Copy link
Copy Markdown
Contributor Author

yfukai commented Nov 13, 2025

Hi @JoOkuma, this may not be exactly the same as what you meant but I tried to reduce the code duplicates! Could you review the udpated code?

Comment thread src/tracksdata/graph/_rustworkx_graph.py
@JoOkuma JoOkuma merged commit 5d110f2 into royerlab:main Nov 13, 2025
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Function to obtain successor / predecessor node IDs.

3 participants