Skip to content

[IR] introduce slice support on graph #2307

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

Johansmm
Copy link
Contributor

Introduce slice support
Close #2302

Copy link
Collaborator

@justinchuby justinchuby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@justinchuby
Copy link
Collaborator

justinchuby commented May 14, 2025

Could you also update the following text to match the implementation?

        although accessing nodes at either end of the set is O(1). I.e.
        ``linked_set[0]`` and ``linked_set[-1]`` are O(1).

in the class docstring above.

also https://github.com/Johansmm/onnxscript/blob/862de43e60f51ee4833104a254da9d8db80420a6/onnxscript/ir/_core.py#L2319 on complexity?

There are two type annotations to update: https://github.com/Johansmm/onnxscript/blob/862de43e60f51ee4833104a254da9d8db80420a6/onnxscript/ir/_core.py#L2715 and https://github.com/Johansmm/onnxscript/blob/862de43e60f51ee4833104a254da9d8db80420a6/onnxscript/ir/_core.py#L2285 . Please follow the recommendation below to create typing overloads for the methods

Copy link
Collaborator

@justinchuby justinchuby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments, thank you!

@github-project-automation github-project-automation bot moved this from Done to In Progress in ONNX Script Review Board May 14, 2025
@justinchuby justinchuby added topic: api module: IR Intermediate representation labels May 14, 2025
@justinchuby justinchuby changed the title [IR] introduce slice support (#2302) [IR] introduce slice support on graph May 14, 2025
Copy link

codecov bot commented May 14, 2025

Codecov Report

Attention: Patch coverage is 72.41379% with 8 lines in your changes missing coverage. Please review.

Project coverage is 73.76%. Comparing base (644e30c) to head (e29419f).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
onnxscript/ir/_core.py 60.00% 0 Missing and 6 partials ⚠️
onnxscript/ir/_linked_list.py 75.00% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2307      +/-   ##
==========================================
- Coverage   73.76%   73.76%   -0.01%     
==========================================
  Files         239      239              
  Lines       30906    30930      +24     
  Branches     3494     3503       +9     
==========================================
+ Hits        22799    22815      +16     
  Misses       6907     6907              
- Partials     1200     1208       +8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Johansmm Johansmm force-pushed the support-slices-ellipsis-as-index branch from 862de43 to 2937953 Compare May 15, 2025 20:08
@Johansmm
Copy link
Contributor Author

Last push force with suggestions

@Johansmm Johansmm requested a review from justinchuby May 15, 2025 20:09
@Johansmm Johansmm force-pushed the support-slices-ellipsis-as-index branch from 2937953 to 1005aa7 Compare May 15, 2025 20:13
@justinchuby
Copy link
Collaborator

justinchuby commented May 15, 2025

The final thing I wanted to make sure before merge is that accessing nodes on either ends (graph.node[0], graph.node[-1]) is not much slower (better if it turns out to be faster) than before. Would you be able to run a small test?

@Johansmm
Copy link
Contributor Author

@justinchuby, the new complexity is O(n) for any case.
I made the following python script to evaluate this:

import time

from onnxscript.ir import _linked_list


class _TestElement:
    def __init__(self, value):
        self.value = value

    def __repr__(self) -> str:
        return f"_TestElement({self.value})"


N = 100000
linked_list = _linked_list.DoublyLinkedSet()
elems = [_TestElement(i) for i in range(N)]
linked_list.extend(elems)

time_list_1 = []
idx = -1
for _ in range(1000):
    t = time.time()
    x = tuple(linked_list)[idx]
    t = time.time() - t
    time_list_1.append(t)

time_list_2 = []
for _ in range(1000):
    t = time.time()
    for i in reversed(linked_list):
        x = i
        break
    t = time.time() - t
    time_list_2.append(t)


print(sum(time_list_1) / 1000)
print(sum(time_list_2) / 1000)

Results changing N:

N time previous time now
100 0.0 2.5733470916748046e-05
1000 0.0 0.00013534140586853028
10000 0.0 0.0014723334312438964
100000 1.3112068176269532e-05 0.015811803340911864

Therefore:

  1. do you want me to reverse the code as long as I handle the integers separately?
  2. I leave the code as it is, since it is not common to have models with more than 1000 nodes ?

@justinchuby
Copy link
Collaborator

It is in fact quite common to have models with more than 1000 nodes in the case of LLMs. I would prefer it to run faster. Could you do (1) as you suggested, and revert the comment changes? Thanks a lot! I think it makes sense to handle slices in an if branch.

@Johansmm Johansmm force-pushed the support-slices-ellipsis-as-index branch from 1005aa7 to bb10b1c Compare May 16, 2025 21:16
@Johansmm
Copy link
Contributor Author

Johansmm commented May 16, 2025

@justinchuby, I have updated the code to produce a more optimal response. Now __getitem__ is fully O(n).
Please check it and let me know what you think.
Unfortunately, now __getitem__ returns a list instead of a tuple, because items = items + (item,) has O(n) at each iteration, but items.append(item) has O(1) (at each iteration).
I hope these changes are relevant.

@justinchuby
Copy link
Collaborator

Thank you! The changes look very reasonable. Were you able to run the benchmark again with some examples?

@Johansmm
Copy link
Contributor Author

@justinchuby can you explain what benchmark you are talking about? If you are referring to the IR unit tests, I ran them and they passed without any problem.

@justinchuby
Copy link
Collaborator

justinchuby commented May 17, 2025

I was referring to #2307 (comment) this script in your comment

@github-project-automation github-project-automation bot moved this from In Progress to Done in ONNX Script Review Board May 17, 2025
@justinchuby justinchuby requested a review from Copilot May 18, 2025 02:24
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds slice support for accessing elements from the graph‐related collections and linked list implementation. Key changes include:

  • Addition of parameterized tests for slicing in _linked_list_test.py.
  • Implementation of slice overloads and corresponding logic in DoublyLinkedSet in _linked_list.py.
  • Similar overload enhancements for getitem in node collections in _core.py.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
onnxscript/ir/_linked_list_test.py Added parameterized test to validate slice support
onnxscript/ir/_linked_list.py Updated getitem to support slicing with overloads and custom logic
onnxscript/ir/_core.py Added overload annotations for slice support in node collection access

Copy link
Collaborator

@justinchuby justinchuby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry after taking another look I wonder if the added complexity is worth it (line 160-162, 166-167). Would it be a good idea to (1) keep the original logic for int index, and (2) use tuple for slices?

@Johansmm Johansmm force-pushed the support-slices-ellipsis-as-index branch from bb10b1c to e29419f Compare May 20, 2025 22:00
@Johansmm
Copy link
Contributor Author

@justinchuby, I see that you don't agree with my algorithm, so I have gone back to the initial solution. Note that this one is a bit more expensive.
Do not hesitate to ask me for anything else if necessary 👍

Copy link
Collaborator

@justinchuby justinchuby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I think it’s a good trade off for simplicity

@@ -2282,7 +2282,12 @@
def opset_imports(self) -> dict[str, int]:
return self._opset_imports

def __getitem__(self, index: int) -> Node:
@typing.overload
def __getitem__(self, index: int) -> Node: ...

Check notice

Code scanning / CodeQL

Statement has no effect Note

This statement has no effect.
@typing.overload
def __getitem__(self, index: int) -> Node: ...
@typing.overload
def __getitem__(self, index: slice) -> Sequence[Node]: ...

Check notice

Code scanning / CodeQL

Statement has no effect Note

This statement has no effect.
@@ -2712,7 +2717,12 @@
self._metadata_props: dict[str, str] | None = metadata_props
self._nodes: tuple[Node, ...] = tuple(nodes)

def __getitem__(self, index: int) -> Node:
@typing.overload
def __getitem__(self, index: int) -> Node: ...

Check notice

Code scanning / CodeQL

Statement has no effect Note

This statement has no effect.
@typing.overload
def __getitem__(self, index: int) -> Node: ...
@typing.overload
def __getitem__(self, index: slice) -> Sequence[Node]: ...

Check notice

Code scanning / CodeQL

Statement has no effect Note

This statement has no effect.
@@ -2961,7 +2971,12 @@
def attributes(self) -> OrderedDict[str, Attr]:
return self._attributes

def __getitem__(self, index: int) -> Node:
@typing.overload
def __getitem__(self, index: int) -> Node: ...

Check notice

Code scanning / CodeQL

Statement has no effect Note

This statement has no effect.
@typing.overload
def __getitem__(self, index: int) -> Node: ...
@typing.overload
def __getitem__(self, index: slice) -> Sequence[Node]: ...

Check notice

Code scanning / CodeQL

Statement has no effect Note

This statement has no effect.
@@ -136,11 +136,18 @@
)
return self._length

def __getitem__(self, index: int) -> T:
@overload
def __getitem__(self, index: int) -> T: ...

Check notice

Code scanning / CodeQL

Statement has no effect Note

This statement has no effect.
@overload
def __getitem__(self, index: int) -> T: ...
@overload
def __getitem__(self, index: slice) -> Sequence[T]: ...

Check notice

Code scanning / CodeQL

Statement has no effect Note

This statement has no effect.
@justinchuby justinchuby enabled auto-merge (squash) May 21, 2025 17:36
@justinchuby justinchuby merged commit 6d5a135 into microsoft:main May 21, 2025
25 of 29 checks passed
@justinchuby justinchuby mentioned this pull request May 21, 2025
9 tasks
@Johansmm Johansmm deleted the support-slices-ellipsis-as-index branch May 21, 2025 20:19
@justinchuby
Copy link
Collaborator

Hi @Johansmm, we are in the process of migrating the IR component to https://github.com/onnx/ir-py so that it becomes an official onnx project. Would you like to help us duplicate your PR to the new repository as well? We are looking to create an initial release within about a week or less.

@Johansmm
Copy link
Contributor Author

@justinchuby done: onnx/ir-py#25

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: IR Intermediate representation topic: api
Projects
Development

Successfully merging this pull request may close these issues.

[IR] Unsupported slice and ellipsis indexes to iter a graph
2 participants