fix: tree-sitter 0.25+ API compatibility in chunking#128
Closed
yevgeniy-ds wants to merge 1 commit into
Closed
Conversation
…ties→methods) tree-sitter >= 0.24 changed the Python bindings API: - Parser.parse() now requires str instead of bytes - tree.root_node is now a method, not a property - Node.children is gone; use node.child(i) + node.child_count() - Node.start_byte/end_byte are now methods, not properties - Node.text attribute removed This caused semble search --include-text-files to crash with: TypeError: argument 'source': 'bytes' object is not an instance of 'str' Fix: add compatibility helpers that detect the tree-sitter API version at runtime and call properties/methods appropriately. This keeps semble compatible with both old (<0.24) and new (>=0.25) tree-sitter versions. Also removes the DownloadError import which was removed from tree-sitter-language-pack in newer versions. Refs: bug-20260521-26ea06
Contributor
|
Thanks for bringing this to our attention. But what is up with the spec file, and why did you remove the download error? We hard pin tree-sitter-language-pack |
Contributor
|
I think this is hallucinated: I did an install from scratch after removing @yevgeniy-ds could you confirm whether this crash actually happened for you? If not, we'll close the PR. |
Contributor
|
Ok now I am getting very confused. I have 0.25.2 installed, which is the latest version, and it just works fine. This is just hallucinated. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
semble search --include-text-filescrashes withTypeError: argument 'source': 'bytes' object is not an instance of 'str'when using tree-sitter >= 0.24.The tree-sitter Python bindings changed their API in 0.24+:
Parser.parse()now requiresstrinstead ofbytestree.root_nodeis now a method (tree.root_node()), not a propertyNode.childrenno longer exists; must usenode.child(i)+node.child_count()Node.start_byte/end_byteare now methods, not propertiesSince
pyproject.tomldeclarestree-sitter>=0.25, all these API changes are in effect and the existing code crashes.Fix
Add runtime-compatible helper functions in
core.pythat detect which API variant is active and call it appropriately:_node_start_byte(node)/_node_end_byte(node)— handles both property and method access_node_child_count(node)— handleslen(node.children)andnode.child_count()_node_child(node, i)— handlesnode.children[i]andnode.child(i)_node_children(node)— returns a list in both API variantschunk()now passesstrtoparser.parse()and callstree.root_node()as a method with property fallbackThis keeps semble compatible with both old (<0.24) and new (>=0.25) tree-sitter versions.
Also removes the
DownloadErrorimport which was removed fromtree-sitter-language-packin newer versions.Testing
semble search --include-text-fileson a real repo (no more TypeError)test_core_chunk_passes_str_to_parserRefs
Feedback ID: bug-20260521-26ea06