Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get Subtree through iteration #270

Closed
Nexucis opened this issue Aug 17, 2020 · 14 comments
Closed

get Subtree through iteration #270

Nexucis opened this issue Aug 17, 2020 · 14 comments

Comments

@Nexucis
Copy link

Nexucis commented Aug 17, 2020

Hi,

When using the method iterate on a SubTree, for some cases I'm not able to get the corresponding SubTree where the iteration stopped.

For example, I have the following tree:

Expr(
  BinaryExpr(
    Expr(
      AggregateExpr(
        AggregateOp(sum),
        FunctionCallBody(
          FunctionCallArgs(
            Expr(
              MetricIdentifier(
                Identifier
              )
            )
         )
       )
     )
   )
)

which matched the following expression:

sum(rate)

For this example, FunctionCallBody, FunctionCallArgs, Expr and MetricIdentifier are started at the same position.

I have the following iteration method:

function getNode(node: Subtree) {
    let i = 0;
    const subChildChained = ['AggregateExpr', 'FunctionCallBody', 'FunctionCallArgs', 'Expr'];
    const child = node.iterate<Subtree>({
      enter: (type, start) => {
        if (i >= subChildChained.length) {
          return node.resolve(start, -1);
        }
        if (type.name === subChildChained[i]) {
          i++;
          // continue to loop on this node
          return undefined;
        }
        // go to the next node
        return false;
      },
    });
}

the idea is to give the path to follow thanks to the array subChildChained to the iteration method.
Then if the last node is found, it returns the associated Subtree.

Unfortunately, so far I didn't find how to get it. I tried by using node.resolve or node.childBefore or node.childAfter but since I have multiple node started at the same position, I'm not able to get the Subtree for the node MetricIdentifier but only for the node FunctionCallBody.

Is there a way to get exactly the node where I stopped the iteration ?

@marijnh
Copy link
Member

marijnh commented Aug 17, 2020

Right, resolve works by position, and if multiple nodes start at the same offset it is ambiguous. You could track tree depth during iteration, and if tree.resolve(start, 1) returns a node that's too deep, back out as far as necessary using the parent pointer. But that's not terribly elegant either. If you have ideas for how this could be improved, I'm open to them.

@Nexucis
Copy link
Author

Nexucis commented Aug 17, 2020

ah ok. I was hopping I was missing something.

Could it be possible to do this:

type EnterFunc<T> = (tree: SubTree) => T | false | undefined

instead of doing this:

type EnterFunc<T> = (type: NodeType, start: number, end: number) => T | false | undefined

It will change a bit the method iterInner, since it will have to build at each iteration the SubTree.

@marijnh
Copy link
Member

marijnh commented Aug 17, 2020

Constructing all those objects would make iteration a lot more expensive, so no, not really.

@Nexucis
Copy link
Author

Nexucis commented Aug 17, 2020

ok got it.

But it could also be possible to create another iteration function that is dedicated to iterate through a Tree by manipulating only the tree node and not other value like the position. With a warning saying this iteration is more expensive than the other one and it should be used cautiously.

Another possibility: what about passing the child in the enter function ?

type EnterFunc<T> = (type: NodeType, start: number, end: number, child: Tree | TreeBuffer ) => T | false | undefined

Then like that it should be possible to call the constructor of the NodeSubtree or of BufferSubtree like it is done in the method findChild.

@Nexucis
Copy link
Author

Nexucis commented Aug 17, 2020

Or even simpler, instead of having a generic iterate method, you have the same iterate method that always returns a Subtree. Like that it doesn't cost more and it doesn't increase the number of parameters + the complexity of how they should be used

@Nexucis
Copy link
Author

Nexucis commented Aug 26, 2020

Just in case if someone would like to do the same things, I implemented the workaround you suggested @marijnh and it works.

So here is the code that do that:

export function walkThrough(node: Subtree, ...path: (number | string)[]): Subtree | undefined | null {
  let i = 0;
  // Somehow, the first type is always the given node.
  // So we have to manually move forward before considering the given path.
  // A way to achieve that is to manually add the given node in the path
  path.unshift(node.type.id);
  return node.iterate<Subtree | null>({
    enter: (type, start) => {
      if (type.id === path[i] || type.name === path[i]) {
        i++;
        if (i >= path.length) {
          // We reached the last node. We should return it then.
          // First get the first node resolved at the `start` position.
          let result: Subtree | null = node.resolve(start, -1);
          if (result.type.id === type.id && result.start === start) {
            return result;
          }
          // In case the first node returned is not the one expected,
          // then go to the deepest node at the `start` position and walk backward.
          // Note: workaround suggested here: https://github.com/codemirror/codemirror.next/issues/270#issuecomment-674855519
          result = node.resolve(start, 1);
          while (result && result.type.id !== type.id) {
            result = result.parent;
          }
          return result;
        }
        // continue to loop on this node
        return undefined;
      }
      // go to the next node
      return false;
    },
  });
}

Note the comments at the beginning of the method. I think that's a bit weird that the iteration method doesn't iterate over the children.
We have to add manually the given node to be able to then iterate over the children

@marijnh
Copy link
Member

marijnh commented Aug 26, 2020

I think that's a bit weird that the iteration method doesn't iterate over the children.

I don't understand what you mean by this. Can you elaborate?

@Nexucis
Copy link
Author

Nexucis commented Aug 26, 2020

So imagine you have the following tree:

a (
  b (
     c ()
  )
)

So now I'm starting to iterate at the node a, like that:

a.iterate()

When it will enter for the first time in the method enter and if you print the type.name it will print a and not b

So that's why I'm saying it doesn't iterate at first over the children. It will but after the 2nd occurrence

@juliusv
Copy link

juliusv commented Aug 26, 2020

@Nexucis If I understand correctly, you are surprised by the tree iteration containing the root node itself before iterating down to the children? To me, this is the expected behavior of any tree iteration, the root node should always be included... or am I missing something?

@Nexucis
Copy link
Author

Nexucis commented Aug 26, 2020

yes that's exactly what I mean.

It is certainly the standard, it's just a bit surprising when you are using it. Specially when you are not at the root of the tree but in the middle.
It's not really an issue, it's something to know, specially when you just want to iterate other the children.

@Nexucis
Copy link
Author

Nexucis commented Oct 6, 2020

Actually sometimes the workaround described above doesn't work every time.

For example, I have the following tree:

Expr(
	BinaryExpr(
		Expr(VectorSelector(MetricIdentifier(Identifier))),
		Mul,
		BoolModifier,
		BinModifier(GroupModifiers(OnOrIgnoring(On,GroupingLabels(GroupingLabelList(GroupingLabelList(GroupingLabel(LabelName)),GroupingLabel(LabelName)))))),
		Expr(VectorSelector(MetricIdentifier(Identifier)))
	)
)

that is matching this expression: foo * on(test,blub) bar

I'm starting the iteration with the node BinaryExpr and I try to get the node OnOrIgnoring.
The iteration is correctly founding the node OnOrIgnoring, but it fails when it try to get the corresponding Subtree.

When it finds the node OnOrIgnoring, start is equal to 5, and end to 19.
In this particular case node.resolve(start, -1) is returning the node corresponding to Mul. So the code is going to the next line and will try the workaround.
But node.resolve(start, 1) is actually returning the node BinaryExpr that has start equal to 0. So it won't be possible to get the SubTree of OnOrIgnoring since BinaryExpr is a parent of the node searched ...

I don't get what happen :/ and at this point, I have no idea how to resolve it.

@marijnh
Copy link
Member

marijnh commented Oct 7, 2020

(As a note, I'm working on overhauling the tree iteration interface to be a bit more flexible, using an external iterator (stateful object instead of method that calls a callback). That'll make it a lot easier to get a subtree at a given iteration position.)

@marijnh
Copy link
Member

marijnh commented Oct 23, 2020

The TreeCursor interface in lezer-tree 0.12.0 allow converting between that and a node object, which should cover this.

@marijnh marijnh closed this as completed Oct 23, 2020
@Nexucis
Copy link
Author

Nexucis commented Oct 24, 2020

nice !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants