Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update QueryCursor API to match Rust library to enable filtering predicates by default #2

Merged
merged 9 commits into from
Jan 4, 2023

Conversation

gordon-klotho
Copy link

@gordon-klotho gordon-klotho commented Jan 3, 2023

Updated to match https://docs.rs/tree-sitter/latest/tree_sitter/struct.QueryCursor.html#method.matches https://www.npmjs.com/package/tree-sitter

Most notably: input text is stored on the Tree so Node can use it in Content() to return the node text without extra inputs.

bindings_test.go Outdated Show resolved Hide resolved
})

// test predicate not match
testCaptures(t, js, `((sum left: _ @left) (#not-match? @left "^[0-9]+$"))`, 0, []string{})
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original implementation fails this test case:

--- FAIL: TestQuery (0.00s)
    /home/gordon/workspace/go-tree-sitter/bindings_test.go:397: 
        	Error Trace:	bindings_test.go:397
        	            				bindings_test.go:340
        	Error:      	Not equal: 
        	            	expected: 0
        	            	actual  : 1
        	Test:       	TestQuery
    /home/gordon/workspace/go-tree-sitter/bindings_test.go:398: 
        	Error Trace:	bindings_test.go:398
        	            				bindings_test.go:340
        	Error:      	Not equal: 
        	            	expected: []string{}
        	            	actual  : []string{"1"}

bindings.go Outdated
@@ -877,9 +878,10 @@ func NewQueryCursor() *QueryCursor {
}

// Exec executes the query on a given syntax node.
func (qc *QueryCursor) Exec(q *Query, n *Node) {
func (qc *QueryCursor) Exec(q *Query, n *Node, text []byte) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer a slightly more descriptive name, like program. I thought at first glance that this referred to the query's text.

Also, is this something we expect to publish back up to smacker? If so, this is a pretty big breaking change.

If we're making a big change like that, I'd suggest we go a big bigger and see if we can add the program code to Node (or its Tree or TSNode). That would keep this method the same, plus let us turn Node.Content([]byte) into a more natural Node.Content().

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer a slightly more descriptive name, like program.

program is only what the top-level node is called in the javascript grammar - I don't know what it is for others. text matches what the Rust library uses (albeit in a slightly different form, as text_provider).

Also, is this something we expect to publish back up to smacker? If so, this is a pretty big breaking change.

Yeah, I'd ideally like to but as you mentioned it'd have to be a major version bump.

If we're making a big change like that, I'd suggest we go a big bigger and see if we can add the program code to Node (or its Tree or TSNode).

I'm just mirroring how the Rust library (https://docs.rs/tree-sitter/latest/tree_sitter/struct.QueryCursor.html#method.matches) although it looks like in the node version, they do attack it to the tree (https://github.com/tree-sitter/node-tree-sitter/blob/master/index.js#L288-L289) at parse time. I like the idea of having Content be simpler so I'll try that and update the PR.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this param, with adding Tree.input?


before, ok := qc.NextMatch()
qm, ok := qc.nextMatch(false)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to test nextMatch(false), given that nextMatch(true) is the only exported form?

At minimum, we should test nextMatch(true), I'd think. But given that nextMatch(false) only exists internally to the module, I would argue for just rm'ing it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's mostly for unit testing, otherwise getting a QueryMatch that has satisfiesTextPredicates() == false to test that function would be annoying to construct.

bindings.go Outdated

return qm, true
if filterPredicates && !qm.satisfiesTextPredicates(qc.q, qc.text) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❔ Maybe also add an && qc.text != nil in between, so that if qc.text is nil, it reverts back to the old behavior?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

qc.text should never be nil, but moving this to the tree will resolve this potential discrepancy. Since none of these functions return an error to use, panicing would be the correct behaviour as it is a programming error.

@@ -565,8 +553,8 @@ func (n Node) Edit(i EditInput) {
}

// Content returns node's source code from input as a string
func (n Node) Content(input []byte) string {
return string(input[n.StartByte():n.EndByte()])
func (n Node) Content() string {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep the argument but allow it to be nil, in which case it defaults to n.t.input, then we can keep the API the same. Then, we could propose it upstream in a form that wouldn't require a major version bump.

Definitely not a blocker or anything, just a consideration.

// most probably better save node.id
cache map[C.TSNode]*Node
}

// Copy returns a new copy of a tree
func (t *Tree) Copy() *Tree {
return t.p.newTree(C.ts_tree_copy(t.c))
nt := t.p.newTree(C.ts_tree_copy(t.c))
nt.input = t.input

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to use copy()?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah it accidentally got reverted. I tried to update it to work with Input but it ended up getting too complicated.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, do we actually need to copy it? Copying the tree makes sense because you can edit nodes; but as a user, I would expect either that I shouldn't ever change the underlying source array (in which case we don't need the copy), or that I can and it won't affect the tree (in which case we also need the copy when we first parse).

In other words, it feels inconsistent that the Tree creates its own, defensive copy of the input, but only if you call Copy — not on initial parse.

Personally, I think it makes sense to just always use the reference to the original.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(My approval still stands, btw, above comment notwithstanding)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the discrepancy is weird. I'd rather not make a bunch of potentially expensive defensive copies so I'm inclined to agree with removing it as well.

@gordon-klotho gordon-klotho merged commit f7b52b0 into master Jan 4, 2023
@gordon-klotho gordon-klotho deleted the feature/always_filter_predicates branch January 4, 2023 21:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants