Emacs crashes due to SIGABRT when using new scanner #38
Comments
Hey, thanks for the report! It's weird that this doesn't happen during CI, since that example.hcl is also parsed there, I think! I'll give it a look tomorrow! |
Thanks, your repro script worked fine!
|
When I provide it the library that gets generated by tree-sitter when you do
|
An Emacs maintainer has also found that the grammar works fine if it is built manually: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=66549#11
I imagine that means the problem is in the way Emacs builds the grammar? Here is the relevant code from Emacs'
;; We need to go into the source directory because some
;; header files use relative path (#include "../xxx").
;; cd "${sourcedir}"
(setq default-directory source-dir)
(message "Compiling library")
;; cc -fPIC -c -I. parser.c
(treesit--call-process-signal
 cc nil t nil "-fPIC" "-c" "-I." "parser.c")
;; cc -fPIC -c -I. scanner.c
(when (file-exists-p "scanner.c")
  (treesit--call-process-signal
   cc nil t nil "-fPIC" "-c" "-I." "scanner.c"))
;; c++ -fPIC -I. -c scanner.cc
(when (file-exists-p "scanner.cc")
  (treesit--call-process-signal
   c++ nil t nil "-fPIC" "-c" "-I." "scanner.cc"))
;; cc/c++ -fPIC -shared *.o -o "libtree-sitter-${lang}.${soext}"
(apply #'treesit--call-process-signal
       (if (file-exists-p "scanner.cc") c++ cc)
       nil t nil
       `("-fPIC" "-shared"
         ,@(directory-files
            default-directory nil
            (rx bos (+ anychar) ".o" eos))
         "-o" ,lib-name))
;; Copy out.
So it looks like the shared object is built with
git clone --depth=1 https://github.com/MichaHoffmann/tree-sitter-hcl.git
cd tree-sitter-hcl/src
cc -fPIC -I. -c parser.c
cc -fPIC -I. -c scanner.c
cc -fPIC -shared *.o -o libtree-sitter-hcl.so
If I run this and then mount the resulting object into the container at I don't see any shared object produced when running |
It's in $HOME/.cache/tree-sitter for me, but I also compiled with the Emacs commands and it worked on my machine. I'll try again tomorrow, maybe I made a mistake. I also patched Emacs to compile the parser with -g, and during serialization and deserialization the state of the scanner looks pretty wrong. If I compile outside of the container in my checkout of this repo, it works and the state looks fine though. Cc @amaanq do you know if any other parser had similar issues? edit: did you also change the minimal-reproduce.el so that it does not compile and overwrite your mounted file? |
I did, yes; that's the most recent commit in my repo. I can confirm that the .so object created by The shared object produced by
Anyway, thanks for looking at this bug. I'm now more convinced that the problem is on Emacs's side. I'll leave it up to you whether or not to keep this issue open. |
No, Edit - didn't read enough |
Thank you @amaanq ! |
@erik-overdahl I'll keep this open. It feels like the library should not fail with invalid memory access, regardless of compilation flags or compiler. Also, the way Emacs compiles it does not look offensive at all, so I would really like to know what's happening before I close the issue. I'm happy that we found a workaround for you though! |
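To make the "should not fail with invalid memory access" point concrete, here is a minimal sketch of a deserialize routine that checks every length before copying. The `DemoState` struct, its fields, and `demo_deserialize` are hypothetical names made up for illustration; they are not taken from this grammar's scanner.c, and the layout (two 4-byte integers followed by a short byte string) is only an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scanner state, NOT the real layout used by the HCL scanner. */
typedef struct {
    uint32_t depth;      /* e.g. some nesting depth */
    uint32_t marker_len; /* number of valid bytes in marker */
    char marker[64];     /* e.g. a delimiter string */
} DemoState;

/*
 * Restore a DemoState from a buffer of `length` bytes.
 * Returns false and leaves the state zeroed instead of reading or
 * writing out of bounds when the buffer is short or inconsistent.
 */
static bool demo_deserialize(DemoState *st, const char *buffer, unsigned length) {
    const unsigned header = 2 * (unsigned)sizeof(uint32_t);
    uint32_t depth, marker_len;

    memset(st, 0, sizeof *st);
    if (length == 0) {
        return true; /* empty buffer means "initial state" */
    }
    if (length < header) {
        return false; /* not enough bytes for the two integers */
    }

    /* Copy all 4 bytes of each integer; no pointer casts, no truncation. */
    memcpy(&depth, buffer, sizeof depth);
    memcpy(&marker_len, buffer + sizeof depth, sizeof marker_len);

    if (marker_len > sizeof st->marker) {
        return false; /* would overflow the destination */
    }
    if (marker_len > length - header) {
        return false; /* would read past the end of the source */
    }

    st->depth = depth;
    st->marker_len = marker_len;
    memcpy(st->marker, buffer + header, marker_len);
    return true;
}
```

With checks like these, a corrupted or truncated buffer leads to a reset state rather than a bad pointer or an oversized copy, whatever the optimization level.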
Unfortunately, I have discovered that using the object file produced by |
The Emacs maintainers have decided that this isn't a problem on their side https://debbugs.gnu.org/cgi/bugreport.cgi?bug=66549#35 Unless you want to keep digging further, the easiest "fix" is probably to put a note in the readme that HEAD doesn't work with Emacs 29.1 builtin treesit mode. If you look at the last few messages in the Emacs bugreport thread, you'll see that the crash only happens when compiled at -O0 (which Emacs's |
Hey, sorry for not responding, but I had no time the last few days. Let's dig a bit more over the weekend, it's definitely something I want to understand |
@amaanq do you have an idea? I have not been able to reproduce this outside of Emacs, even after fuzzing for a while. Because I have no idea how to fix this going forward, I'm starting to lean towards rolling back the move to C. Would that be acceptable from your side? |
I have an idea, but let me get to a computer first in a couple of hours and I'll send a patch for what I think a fix could be. If you'd like to implement my idea now: replace the serialized byte lengths that are cast to char/uint8_t with a memcpy of all 4 bytes instead, or just write each byte properly with shifts
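A minimal sketch of that suggestion, under the assumption that the lengths in question are 32-bit values written into the `char` serialization buffer. All function names here are hypothetical and this is not the patch that was later committed. It contrasts a single-byte cast, which silently drops the upper 24 bits, with the two alternatives mentioned above: a `memcpy` of all 4 bytes, or writing each byte with shifts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Lossy: only the low 8 bits of the length survive the cast. */
static unsigned write_len_truncated(char *buf, uint32_t len) {
    unsigned char *p = (unsigned char *)buf;
    p[0] = (unsigned char)len;
    return 1;
}

static uint32_t read_len_truncated(const char *buf) {
    return (uint32_t)(unsigned char)buf[0];
}

/* Option 1: memcpy all 4 bytes (native byte order, which is enough when
 * the buffer is written and read back by the same build in one process). */
static unsigned write_len_memcpy(char *buf, uint32_t len) {
    memcpy(buf, &len, sizeof len);
    return (unsigned)sizeof len;
}

static uint32_t read_len_memcpy(const char *buf) {
    uint32_t len;
    memcpy(&len, buf, sizeof len);
    return len;
}

/* Option 2: write each byte with shifts (explicit little-endian order). */
static unsigned write_len_shifts(char *buf, uint32_t len) {
    unsigned char *p = (unsigned char *)buf;
    p[0] = (unsigned char)(len & 0xFF);
    p[1] = (unsigned char)((len >> 8) & 0xFF);
    p[2] = (unsigned char)((len >> 16) & 0xFF);
    p[3] = (unsigned char)((len >> 24) & 0xFF);
    return 4;
}

static uint32_t read_len_shifts(const char *buf) {
    const unsigned char *p = (const unsigned char *)buf;
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Round-trip check: a length of 300 does not fit in one byte. */
static void len_roundtrip_selfcheck(void) {
    char buf[4];

    write_len_truncated(buf, 300);
    assert(read_len_truncated(buf) == 44); /* 300 mod 256 */

    write_len_memcpy(buf, 300);
    assert(read_len_memcpy(buf) == 300);

    write_len_shifts(buf, 300);
    assert(read_len_shifts(buf) == 300);
}
```

Whichever variant is used, the reader has to mirror the writer exactly: a 4-byte write paired with a 1-byte read (or the reverse) shifts every later field in the buffer, which is one way a deserialized state can end up looking wrong.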
Sorry @amaanq ; are you able to send a PR please? |
Sorry, yes, let me do that. It'd be nice @erik-overdahl if you can try those changes out and see if the crashing is fixed
Hey @erik-overdahl ; latest main solves it for me with your repro instructions! Can you check it out please? |
The commit fixes the crash that occurred when the grammar was compiled with
optimization level 0 (as Emacs does by default). The parse tree of the
example file is now closer to correct but still contains a large number of
error tokens. So the originally reported issue here is fixed, but the
grammar is not completely usable in Emacs.
|
Can you help me reproduce that? It should not contain errors. Is there a way to dump the parse tree from Emacs? [ sorry, not an Emacs user here ] |
But not crashing is a good first step ! |
I managed with
|
Yeah, strings look pretty messed up; let me try to fix it! |
@erik-overdahl I have a version that parses successfully! I'll push in a minute |
@erik-overdahl can you try locally from |
Since I was able to reproduce initially but not on latest main, I'll close this. @erik-overdahl if you still have the issue, please reopen! If you can confirm that this is also solved for you, it would be cool to know too! |
When building the grammar from HEAD (currently b553906) and attempting to use the grammar to highlight an HCL file in Emacs 29.1, the grammar causes Emacs to crash with a munmap_chunk(): invalid pointer error.
I am unsure of whether this bug resides in the grammar or within Emacs. It only occurs when Emacs is configured with the flag --with-pgtk, which tends to be finicky. However, it does not occur with the v1.1.0 release from this repo, and so may be a problem with the rewritten scanner. I have also submitted a report to the GNU Emacs bug tracker.
I've written up a full reproduction of the crash here: https://github.com/erik-overdahl/emacs-29-pgtk-ts-crash-bugreport. Please let me know if you need more information.