Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge master into napi branch #129

Closed

Conversation

MichaelBelousov
Copy link

@MichaelBelousov MichaelBelousov commented Mar 10, 2023

  • converted newer bindings on master since creation of the napi branch to use Napi
  • minor binding.gyp fixes for newer node versions
  • fix linux segfaults for not suppressing destructors of static Napi::Reference

Notes:

  • This PR adds a new contract for providing a the TSLanguage* from the module, which is to export a _language property which is a Napi::External. This allows loading both new and old, but I have not added (except in a manual test) the code to generate such a property in new auto generated bindings, and this new contract warrants some discussion

maxbrunsfeld and others added 30 commits November 4, 2019 12:45
A Range contains:
- startIndex
- endIndex
- startPosition
- endPosition
…tree-sitter#63)

* Fix the TypeScript definition of a Range

A Range contains:
- startIndex
- endIndex
- startPosition
- endPosition

* Add `bufferSize` and `includedRanges` to TS def

The function signature of `Parser.prototype.parse` defined in index.js
`Parser.prototype.parse = function(input, oldTree, {bufferSize,
includedRanges}={})` accepts a third optional parameter.
This commit reflects the signature in TypeScript definition.
* queries: initial implementation

* queries: add predicates

* test: setup query tests

* package: add build script

* query: display TSQueryError name instead of value

* tree-sitter: update to latest

* query_cursor: remove & move .exec to query.cc

* query: continue implementation of .exec

* test: adjust query_test.js to match current API

* lint: remove unused module

* node_methods: expose GetMarshalNodes

* lint: add section comments

* query: implement Query.matches

* query: implement Query.captures

* query: implement predicates filtering

* query: make Query::GetPredicates warmer

* query: fix predicates initialization

* tests: unskip query.captures

* query: implement range searching

* query: remove Query.exec

* query: use arrays for marshalling

* node: fix tree-sitter#66 free transfer_buffer

* tests: fix query_test labels

* query: flatten intermediate array

* query: dont transfer capture_count

* unmarshalNodes: pass nodes as array

* lint

* query: remove unused values

* query: avoid out-of-bound access
* Fix typo

* tree-sitter.d.ts - define missing Parser class fields

* tree-sitter.d.ts - define missing SyntaxNode typeId field

* tree-sitter.d.ts - define missed TreeCursor currentFieldName field

* tree-sitter.d.ts - define missed Tree printDotGraph method

* tree-sitter.d.ts - define missed Query class and related types

* tree-sitter.d.ts - fix Query.captures return type

* tree-sitter.d.ts - sync query parameter names with wasm binding

* fix types checking for node types
fix: prevent Nan::Utf8String destructor call due to out of scope
const TSLanguage *language = static_cast<const TSLanguage *>(
GetInternalFieldPointer(value)
);
const TSLanguage *language
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maxbrunsfeld this is the only thing I'm unsure about. I remember having issues loading my language when I originally made this branch, so I added to that language (somewhere, I forget) a special property _language which is a Napi::External with a pointer to the TSLanguage. I can probably remove this change, tests should still pass, but I do think there is probably some untested issue that I need to recover, unrelated to merging with master. Thoughts on me removing this change in this PR and deferring finding what the issue was and solving it to another PR?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to refresh on how much of this should be generated, but apparently in my fork of tree-sitter-sql I modified the node binding module export to export the TSLanguage as a Napi::External for loading here. I did that here

Copy link
Collaborator

@verhovsky verhovsky Mar 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what @maxbrunsfeld was talking about in #52 (comment)?

Unfortunately, we can't achieve true ABI-stability all in one fell swoop. This module has two binary interfaces that currently depend on Nan / V8 directly:

  1. Retrieving the TSLanguage * pointers out of Language objects from language modules. Currently, the Node.js binding code that the tree-sitter CLI generates for individual languages are reliant on nan and the V8 ABI 😞 .

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost definitely that's it. Well I had a workaround clearly but it's worth discussing.

Copy link
Collaborator

@verhovsky verhovsky May 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the changes suggested in this comment: #52 (comment) ?

@MichaelBelousov MichaelBelousov marked this pull request as ready for review March 15, 2023 17:55
@verhovsky
Copy link
Collaborator

verhovsky commented Apr 12, 2023

@MichaelBelousov node-tree-sitter depends on the superstring library which is also a NAN package. superstring is abandoned, but there's some people that are working on a fork: https://github.com/pulsar-edit/superstring and they're also working on porting it to Node-API https://github.com/pulsar-edit/superstring/tree/napi

@mauricioszabo said on Discord that "We actually are moving to a WASM version of superstring because the move no N-API is going slower than expected".

I guess I'm telling you this in case(/in hopes that) you're interested in helping out with that as well.

@MichaelBelousov
Copy link
Author

I guess I'm telling you this in case(/in hopes that) you're interested in helping out with that as well.

I will at least take a look at what they're doing and figure out what the time commitment would look like.

@verhovsky
Copy link
Collaborator

verhovsky commented May 5, 2023

I tried using your code but it didn't work:

$ cd /tmp
$ git clone --recurse-submodules git@github.com:MichaelBelousov/node-tree-sitter.git
Cloning into 'node-tree-sitter'...
remote: Enumerating objects: 2710, done.
remote: Counting objects: 100% (378/378), done.
remote: Compressing objects: 100% (64/64), done.
remote: Total 2710 (delta 331), reused 323 (delta 314), pack-reused 2332
Receiving objects: 100% (2710/2710), 449.60 KiB | 685.00 KiB/s, done.
Resolving deltas: 100% (1578/1578), done.
Submodule 'vendor/tree-sitter' (https://github.com/tree-sitter/tree-sitter.git) registered for path 'vendor/tree-sitter'
Cloning into '/private/tmp/node-tree-sitter/vendor/tree-sitter'...
remote: Enumerating objects: 36442, done.        
remote: Counting objects: 100% (396/396), done.        
remote: Compressing objects: 100% (188/188), done.        
remote: Total 36442 (delta 226), reused 344 (delta 206), pack-reused 36046        
Receiving objects: 100% (36442/36442), 14.55 MiB | 574.00 KiB/s, done.
Resolving deltas: 100% (25347/25347), done.
Submodule path 'vendor/tree-sitter': checked out '3d554ecf6b68ad2c267c1e90b6ef9aa68ae88bcd'
$ cd node-tree-sitter/
$ git submodule update --recursive --remote
Submodule path 'vendor/tree-sitter': checked out 'a62bac5370dc5c76c75935834ef083457a6dd0e1'
$ git checkout napi-with-query-api
M	vendor/tree-sitter
branch 'napi-with-query-api' set up to track 'origin/napi-with-query-api'.
Switched to a new branch 'napi-with-query-api'
$ npm install
npm WARN deprecated har-validator@5.1.5: this library is no longer supported
npm WARN deprecated uuid@3.4.0: Please upgrade  to version 7 or higher.  Older versions may use Math.random() in certain circumstances, which is known to be problematic.  See https://v8.dev/blog/math-random for details.
npm WARN deprecated request@2.88.2: request has been deprecated, see https://github.com/request/request/issues/3142
npm WARN deprecated tar@2.2.2: This version of tar is no longer supported, and will not receive security updates. Please upgrade asap.
npm WARN deprecated tar@2.2.2: This version of tar is no longer supported, and will not receive security updates. Please upgrade asap.
npm ERR! code 7
npm ERR! path /private/tmp/node-tree-sitter/node_modules/superstring
npm ERR! command failed
npm ERR! command sh -c node-gyp rebuild
npm ERR! gyp info it worked if it ends with ok
npm ERR! gyp info using node-gyp@6.1.0
npm ERR! gyp info using node@20.0.0 | darwin | arm64
npm ERR! gyp info find Python using Python version 3.11.3 found at "/opt/homebrew/opt/python@3.11/bin/python3.11"
npm ERR! gyp ERR! UNCAUGHT EXCEPTION 
npm ERR! gyp ERR! stack TypeError: Cannot assign to read only property 'cflags' of object '#<Object>'
npm ERR! gyp ERR! stack     at createConfigFile (/private/tmp/node-tree-sitter/node_modules/node-gyp/lib/configure.js:118:21)
npm ERR! gyp ERR! stack     at /private/tmp/node-tree-sitter/node_modules/node-gyp/lib/configure.js:85:9
npm ERR! gyp ERR! stack     at /private/tmp/node-tree-sitter/node_modules/mkdirp/index.js:30:20
npm ERR! gyp ERR! stack     at FSReqCallback.oncomplete (node:fs:186:23)
npm ERR! gyp ERR! System Darwin 22.4.0
npm ERR! gyp ERR! command "/opt/homebrew/Cellar/node/20.0.0/bin/node" "/private/tmp/node-tree-sitter/node_modules/.bin/node-gyp" "rebuild"
npm ERR! gyp ERR! cwd /private/tmp/node-tree-sitter/node_modules/superstring
npm ERR! gyp ERR! node -v v20.0.0
npm ERR! gyp ERR! node-gyp -v v6.1.0
npm ERR! gyp ERR! This is a bug in `node-gyp`.
npm ERR! gyp ERR! Try to update node-gyp and file an Issue if it does not help:
npm ERR! gyp ERR!     <https://github.com/nodejs/node-gyp/issues>

npm ERR! A complete log of this run can be found in: /Users/space/.npm/_logs/2023-05-05T14_27_49_593Z-debug-0.log

Usually I've ran into Cannot assign to read only property 'cflags' of object '#<Object>' error when I have a wrong/non-existent filename (because I didn't clone node-tree-sitter without the tree-sitter submodule) in binding.gyp's sources.

@MichaelBelousov
Copy link
Author

I tried using your code but it didn't work:

I've only tried it on Linux. I'm on vacation without a computer until the 17th. I can fix it when I get back, I have a Mac mini to test on

@verhovsky
Copy link
Collaborator

verhovsky commented May 5, 2023

I was able to get it to start building after some minor changes

diff --git a/package.json b/package.json
index da61a08..a1e11b0 100644
--- a/package.json
+++ b/package.json
@@ -17,16 +17,22 @@
   "dependencies": {
     "nan": "^2.14.0",
     "node-addon-api": "git+https://github.com/nodejs/node-addon-api.git",
-    "prebuild-install": "^6.0.1"
+    "prebuild-install": "^7.0.1",
+    "node-gyp": "9.3.1"
   },
   "devDependencies": {
     "@types/node": "^14.14.31",
     "chai": "^4.3.3",
     "mocha": "^8.3.1",
     "prebuild": "^10.0.1",
-    "superstring": "^2.4.2",
+    "@curlconverter/superstring": "^0.0.3",
     "tree-sitter-javascript": "git://github.com/tree-sitter/tree-sitter-javascript.git#master"
   },
+  "overrides": {
+    "prebuild": {
+      "node-gyp": "$node-gyp"
+    }
+  },
   "scripts": {
     "install": "prebuild-install || node-gyp rebuild",
     "build": "node-gyp build",
diff --git a/test/node_test.js b/test/node_test.js
index 8dadc3d..e66f6ce 100644
--- a/test/node_test.js
+++ b/test/node_test.js
@@ -1,7 +1,7 @@
 const Parser = require("..");
 const JavaScript = require('tree-sitter-javascript');
 const { assert } = require("chai");
-const { TextBuffer } = require("superstring");
+const { TextBuffer } = require("@curlconverter/superstring");
 
 describe("Node", () => {
   let parser;
diff --git a/test/parser_test.js b/test/parser_test.js
index b45f1df..19ed1c7 100644
--- a/test/parser_test.js
+++ b/test/parser_test.js
@@ -1,7 +1,7 @@
 const Parser = require("..");
 const JavaScript = require('tree-sitter-javascript');
 const { assert } = require("chai");
-const {TextBuffer} = require('superstring');
+const {TextBuffer} = require('@curlconverter/superstring');
 
 describe("Parser", () => {
   let parser;
diff --git a/vendor/tree-sitter b/vendor/tree-sitter
index c51896d..a62bac5 160000
--- a/vendor/tree-sitter
+++ b/vendor/tree-sitter
@@ -1 +1 @@
-Subproject commit c51896d32dcc11a38e41f36e3deb1a6a9c4f4b14
+Subproject commit a62bac5370dc5c76c75935834ef083457a6dd0e1

but I get a new error:

/tmp/node-tree-sitter$ npm install

> tree-sitter@0.20.0 install
> prebuild-install || node-gyp rebuild

prebuild-install warn install No prebuilt binaries found (target=20.0.0 runtime=node arch=arm64 libc= platform=darwin)
gyp info it worked if it ends with ok
gyp info using node-gyp@9.3.1
gyp info using node@20.0.0 | darwin | arm64
gyp info find Python using Python version 3.11.3 found at "/opt/homebrew/opt/python@3.11/bin/python3.11"
gyp info spawn /opt/homebrew/opt/python@3.11/bin/python3.11
gyp info spawn args [
gyp info spawn args   '/private/tmp/node-tree-sitter/node_modules/node-gyp/gyp/gyp_main.py',
gyp info spawn args   'binding.gyp',
gyp info spawn args   '-f',
gyp info spawn args   'make',
gyp info spawn args   '-I',
gyp info spawn args   '/private/tmp/node-tree-sitter/build/config.gypi',
gyp info spawn args   '-I',
gyp info spawn args   '/private/tmp/node-tree-sitter/node_modules/node-gyp/addon.gypi',
gyp info spawn args   '-I',
gyp info spawn args   '/Users/space/Library/Caches/node-gyp/20.0.0/include/node/common.gypi',
gyp info spawn args   '-Dlibrary=shared_library',
gyp info spawn args   '-Dvisibility=default',
gyp info spawn args   '-Dnode_root_dir=/Users/space/Library/Caches/node-gyp/20.0.0',
gyp info spawn args   '-Dnode_gyp_dir=/private/tmp/node-tree-sitter/node_modules/node-gyp',
gyp info spawn args   '-Dnode_lib_file=/Users/space/Library/Caches/node-gyp/20.0.0/<(target_arch)/node.lib',
gyp info spawn args   '-Dmodule_root_dir=/private/tmp/node-tree-sitter',
gyp info spawn args   '-Dnode_engine=v8',
gyp info spawn args   '--depth=.',
gyp info spawn args   '--no-parallel',
gyp info spawn args   '--generator-output',
gyp info spawn args   'build',
gyp info spawn args   '-Goutput_dir=.'
gyp info spawn args ]
gyp info spawn make
gyp info spawn args [ 'BUILDTYPE=Release', '-C', 'build' ]
  CC(target) Release/obj.target/nothing/node_modules/node-addon-api/nothing.o
  LIBTOOL-STATIC Release/nothing.a
warning: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/libtool: archive library: Release/nothing.a the table of contents is empty (no object file members in the library define global symbols)
  CC(target) Release/obj.target/tree_sitter/vendor/tree-sitter/lib/src/lib.o
  LIBTOOL-STATIC Release/tree_sitter.a
  CXX(target) Release/obj.target/tree_sitter_runtime_binding/src/binding.o
  CXX(target) Release/obj.target/tree_sitter_runtime_binding/src/conversions.o
  CXX(target) Release/obj.target/tree_sitter_runtime_binding/src/language.o
  CXX(target) Release/obj.target/tree_sitter_runtime_binding/src/logger.o
  CXX(target) Release/obj.target/tree_sitter_runtime_binding/src/node.o
  CXX(target) Release/obj.target/tree_sitter_runtime_binding/src/parser.o
In file included from ../src/parser.cc:12:
In file included from ../vendor/superstring/text-buffer-snapshot-wrapper.h:4:
../node_modules/nan/nan.h:686:39: warning: 'IdleNotificationDeadline' is deprecated: Use MemoryPressureNotification() to influence the GC schedule. [-Wdeprecated-declarations]
    return v8::Isolate::GetCurrent()->IdleNotificationDeadline(
                                      ^
/Users/space/Library/Caches/node-gyp/20.0.0/include/node/v8-isolate.h:1291:3: note: 'IdleNotificationDeadline' has been explicitly marked deprecated here
  V8_DEPRECATE_SOON(
  ^
/Users/space/Library/Caches/node-gyp/20.0.0/include/node/v8config.h:550:39: note: expanded from macro 'V8_DEPRECATE_SOON'
# define V8_DEPRECATE_SOON(message) [[deprecated(message)]]
                                      ^
1 warning generated.
  CXX(target) Release/obj.target/tree_sitter_runtime_binding/src/query.o
  CXX(target) Release/obj.target/tree_sitter_runtime_binding/src/tree.o
../src/tree.cc:206:3: error: use of undeclared identifier 'cached_nodes_'
  cached_nodes_[key] = cache_entry;
  ^
../src/tree.cc:207:3: error: void function 'CacheNodeForTree' should not return a value [-Wreturn-type]
  return env.Undefined();
  ^      ~~~~~~~~~~~~~~~
2 errors generated.
make: *** [Release/obj.target/tree_sitter_runtime_binding/src/tree.o] Error 1
gyp ERR! build error 
gyp ERR! stack Error: `make` failed with exit code: 2
gyp ERR! stack     at ChildProcess.onExit (/private/tmp/node-tree-sitter/node_modules/node-gyp/lib/build.js:203:23)
gyp ERR! stack     at ChildProcess.emit (node:events:511:28)
gyp ERR! stack     at ChildProcess._handle.onexit (node:internal/child_process:293:12)
gyp ERR! System Darwin 22.4.0
gyp ERR! command "/opt/homebrew/Cellar/node/20.0.0/bin/node" "/private/tmp/node-tree-sitter/node_modules/.bin/node-gyp" "rebuild"
gyp ERR! cwd /private/tmp/node-tree-sitter
gyp ERR! node -v v20.0.0
gyp ERR! node-gyp -v v9.3.1
gyp ERR! not ok 
npm ERR! code 1
npm ERR! path /private/tmp/node-tree-sitter
npm ERR! command failed
npm ERR! command sh -c prebuild-install || node-gyp rebuild

npm ERR! A complete log of this run can be found in: /Users/space/.npm/_logs/2023-05-05T17_27_10_419Z-debug-0.log

enjoy the vacation

@maxbrunsfeld
Copy link
Contributor

Thanks so much for working on this @MichaelBelousov. It seems like it's very close.

Thoughts on two of the design questions:

  1. superstring - Now that the Atom editor is completely dead. I think that the superstring support in this module can be completely deprioritized. I think we should just remove the dependency on superstring and all support for parsing superstring TextBuffers. I don't think it's worth maintaining that anymore.
  2. exporting the language pointers - I would be ok with bumping the major version of this library, and compeltely changing how the language objects are exported (the way that you suggested, with an NAPI::External seems fine). We'll need to make a corresponding change in the tree-sitter CLI, so that when it generates the Node binding, it stops using nan. It's a breaking change, but at this point, this node library has been frozen for so long that it's better to make a breaking change than to leave it.

@mauricioszabo
Copy link

I'm not sure if it helps, but I was working on this migration in parallel, before I found out that all grammars would suffer from the same problem and will not load on newer Electron versions anyway: pulsar-edit#1

@verhovsky
Copy link
Collaborator

Why won't they load on Electron?

@mauricioszabo
Copy link

@verhovsky After Electron 13, all modules need to be "context-aware", meaning that they need to offer some guarantees of not keeping global "node context" objects. node-tree-sitter in NaN is not, but in N-API it is.

The problem is that none of the grammars are context-aware, and they all need to be. So even if I could finish the migration to N-API, all grammars would need to be converted...

@MichaelBelousov
Copy link
Author

The problem is that none of the grammars are context-aware, and they all need to be. So even if I could finish the migration to N-API, all grammars would need to be converted...

But the bindings for languages are both generated and versioned... so can't we just move up a major version and have the newer version generate incompatible bindings? The version check will still be ABI compatible ofc

@sualehasif
Copy link

We use tree-sitter a ton in production and i would love to help get this merged so that tree-sitter can transition to the NAPI. @MichaelBelousov @maxbrunsfeld wondering if you two have some sense of the work that is required to get this PR merged. I really appreciate the work that both of you have put in to get this PR so close.

@MichaelBelousov
Copy link
Author

@MichaelBelousov @maxbrunsfeld wondering if you two have some sense of the work that is required to get this PR merged.

I can do the work and provide a schedule if @maxbrunsfeld has a plan for getting it merged/reviewed

@sualehasif
Copy link

I can help out too! It just so happens that @maxbrunsfeld has the best context on getting it merged

@segevfiner
Copy link
Contributor

segevfiner commented Dec 27, 2023

Another issue is that unless this module is made fully context aware, as in, using SetInstanceData for all module level data. It seems to segfault in vitest with threads enabled. 😢

I'm now stuck with deciding what to do... Whether I want to re-implement stuff in my own native node addon module, maintain a fork and complete the work on this there, etc... which also includes having to port the grammers binding, and regenerate and rebuild any that I want to use... Or maybe using the slower wasm version...

But I'm not sure if @maxbrunsfeld, or anyone else, is still actively maintaining this to get this merged and done all the way through...

@segevfiner
Copy link
Contributor

Oh great. superstring even fails to build in newer macOS:

npm ERR! ../src/core/encoding-conversion.cc:64:13: error: no matching function for call to 'iconv_close'
npm ERR!   if (data) iconv_close(data);
npm ERR!             ^~~~~~~~~~~
npm ERR! /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/iconv.h:74:5: note: candidate function not viable: cannot convert argument of incomplete type 'void *' to 'iconv_t' (aka '__tag_iconv_t *') for 1st argument
npm ERR! int     iconv_close(iconv_t);
npm ERR!         ^
npm ERR! ../src/core/encoding-conversion.cc:120:32: error: no matching function for call to 'iconv'
npm ERR!       auto conversion_result = iconv(
npm ERR!                                ^~~~~
npm ERR! /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/iconv.h:71:8: note: candidate function not viable: no known conversion from 'iconv_t *' (aka '__tag_iconv_t **') to 'iconv_t' (aka '__tag_iconv_t *') for 1st argument; dereference the argument with *
npm ERR! size_t  iconv(iconv_t, char ** __restrict,
npm ERR!         ^
npm ERR! 2 errors generated.

@verhovsky
Copy link
Collaborator

superstring is unmaintained and should be removed.

#141

@segevfiner segevfiner mentioned this pull request Feb 25, 2024
@amaanq amaanq closed this Mar 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet