Internal exception in ruby-pg CI with truffleruby-head #3478

Closed
larskanis opened this issue Mar 4, 2024 · 12 comments

@larskanis
Contributor

larskanis commented Mar 4, 2024

It happens at every run here in the pg specs when returning a float NaN value.

This error has been raised for several weeks in truffleruby-head "24.1.0-dev-3a920de7, like ruby 3.2.2, GraalVM CE Native [x86_64-linux]". It doesn't happen in truffleruby "23.1.2, like ruby 3.2.2, Oracle GraalVM Native [x86_64-linux]".

Here is a failing CI run: https://github.com/ged/ruby-pg/actions/runs/8115487309/job/22183459895#step:12:458

The output:

truffleruby: an internal exception escaped out of the interpreter,
please report it to https://github.com/oracle/truffleruby/issues

dead handle 0xbad0000000198f0 (com.oracle.truffle.api.CompilerDirectives.ShouldNotReachHere)
	from com.oracle.truffle.api.CompilerDirectives.shouldNotReachHere(CompilerDirectives.java:574)
	from com.oracle.truffle.api.CompilerDirectives.shouldNotReachHere(CompilerDirectives.java:520)
	from org.truffleruby.cext.UnwrapNode$UnwrapNativeNode.raiseError(UnwrapNode.java:107)
	from org.truffleruby.cext.UnwrapNode$UnwrapNativeNode.unwrapTaggedObject(UnwrapNode.java:92)
	from org.truffleruby.cext.UnwrapNodeGen$UnwrapNativeNodeGen$Inlined.execute(UnwrapNodeGen.java:382)
	from org.truffleruby.cext.UnwrapNode.unwrapGeneric(UnwrapNode.java:286)
	from org.truffleruby.cext.UnwrapNodeGen$Inlined.execute(UnwrapNodeGen.java:156)
	from org.truffleruby.cext.CExtNodes$CallWithCExtLockAndFrameAndUnwrapNode.callWithCExtLockAndFrame(CExtNodes.java:261)
	from org.truffleruby.cext.CExtNodesFactory$CallWithCExtLockAndFrameAndUnwrapNodeFactory$CallWithCExtLockAndFrameAndUnwrapNodeGen.execute(CExtNodesFactory.java:564)
	from org.truffleruby.language.locals.WriteLocalVariableNode.execute(WriteLocalVariableNode.java:28)
	from org.truffleruby.language.RubyContextSourceNode.executeVoid(RubyContextSourceNode.java:23)
	from org.truffleruby.language.control.SequenceNode.execute(SequenceNode.java:35)
	from org.truffleruby.core.module.ModuleNodes$DefineMethodNode$CallMethodWithLambdaBody.execute(ModuleNodes.java:1373)
	from org.truffleruby.language.RubyLambdaRootNode.execute(RubyLambdaRootNode.java:84)
/home/runner/.rubies/truffleruby-head/lib/truffle/truffle/cext_ruby.rb:24:in `getvalue'
	from /home/runner/work/ruby-pg/ruby-pg/spec/pg/basic_type_map_for_results_spec.rb:142:in `block (5 levels) in <top (required)>'
	from /home/runner/work/ruby-pg/ruby-pg/spec/pg/basic_type_map_for_results_spec.rb:129:in `each'
	from /home/runner/work/ruby-pg/ruby-pg/spec/pg/basic_type_map_for_results_spec.rb:129:in `block (4 levels) in <top (required)>'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example.rb:263:in `instance_exec'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example.rb:263:in `block in run'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example.rb:511:in `block in with_around_and_singleton_context_hooks'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example.rb:468:in `block in with_around_example_hooks'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/hooks.rb:486:in `block in run'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/hooks.rb:626:in `block in run_around_example_hooks_for'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example.rb:352:in `call'
	from /home/runner/work/ruby-pg/ruby-pg/spec/helpers.rb:56:in `block in included'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example.rb:457:in `instance_exec'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example.rb:457:in `instance_exec'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/hooks.rb:390:in `execute_with'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/hooks.rb:628:in `block (2 levels) in run_around_example_hooks_for'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example.rb:352:in `call'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/hooks.rb:627:in `run_around_example_hooks_for'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/hooks.rb:486:in `run'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example.rb:468:in `with_around_example_hooks'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example.rb:511:in `with_around_and_singleton_context_hooks'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example.rb:259:in `run'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example_group.rb:646:in `block in run_examples'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example_group.rb:642:in `map'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example_group.rb:642:in `run_examples'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example_group.rb:607:in `run'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example_group.rb:608:in `block in run'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example_group.rb:608:in `map'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example_group.rb:608:in `run'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example_group.rb:608:in `block in run'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example_group.rb:608:in `map'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/example_group.rb:608:in `run'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:121:in `block (3 levels) in run_specs'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:121:in `map'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:121:in `block (2 levels) in run_specs'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/configuration.rb:2091:in `with_suite_hooks'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:116:in `block in run_specs'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/reporter.rb:74:in `report'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:115:in `run_specs'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:89:in `run'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:71:in `run'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:45:in `invoke'
	from /home/runner/.rubies/truffleruby-head/lib/gems/gems/rspec-core-3.13.0/exe/rspec:4:in `<top (required)>'
	from <internal:core> core/kernel.rb:378:in `load'
	from /home/runner/.rubies/truffleruby-head/bin/rspec:25:in `<main>'
@flavorjones
Contributor

flavorjones commented Mar 7, 2024

I've seen similar failures in Nokogiri's test suite, though it appears to be sporadic for me and for an earlier version of TR. I'm happy to open a separate bug report, but the stack trace is so similar I thought I'd start here.

Example test output: https://github.com/sparklemotion/nokogiri/actions/runs/8194425716/job/22410492610#step:7:114

Version: truffleruby 23.1.2, like ruby 3.2.2, Oracle GraalVM Native [x86_64-linux]

dead handle 0xbad000000023028 (com.oracle.truffle.api.CompilerDirectives.ShouldNotReachHere)
	from com.oracle.truffle.api.CompilerDirectives.shouldNotReachHere(CompilerDirectives.java:574)
	from com.oracle.truffle.api.CompilerDirectives.shouldNotReachHere(CompilerDirectives.java:520)
	from org.truffleruby.cext.UnwrapNode$UnwrapNativeNode.raiseError(UnwrapNode.java:107)
	from org.truffleruby.cext.UnwrapNode$UnwrapNativeNode.unwrapTaggedObject(UnwrapNode.java:92)
	from org.truffleruby.cext.UnwrapNodeGen$UnwrapNativeNodeGen$Inlined.executeAndSpecialize(UnwrapNodeGen.java:421)
	from org.truffleruby.cext.UnwrapNodeGen$UnwrapNativeNodeGen$Inlined.execute(UnwrapNodeGen.java:387)
	from org.truffleruby.cext.UnwrapNode.longToWrapper(UnwrapNode.java:270)
	from org.truffleruby.cext.UnwrapNodeGen$Inlined.executeAndSpecialize(UnwrapNodeGen.java:183)
	from org.truffleruby.cext.UnwrapNodeGen$Inlined.execute(UnwrapNodeGen.java:158)
	from org.truffleruby.cext.CExtNodes$CallWithCExtLockAndFrameAndUnwrapNode.callWithCExtLockAndFrame(CExtNodes.java:258)
	from org.truffleruby.cext.CExtNodesFactory$CallWithCExtLockAndFrameAndUnwrapNodeFactory$CallWithCExtLockAndFrameAndUnwrapNodeGen.executeAndSpecialize(CExtNodesFactory.java:577)
	from org.truffleruby.cext.CExtNodesFactory$CallWithCExtLockAndFrameAndUnwrapNodeFactory$CallWithCExtLockAndFrameAndUnwrapNodeGen.execute(CExtNodesFactory.java:556)
	from org.truffleruby.language.locals.WriteLocalVariableNode.execute(WriteLocalVariableNode.java:28)
	from org.truffleruby.language.RubyNode.doExecuteVoid(RubyNode.java:64)
	from org.truffleruby.language.control.SequenceNode.execute(SequenceNode.java:34)
	from org.truffleruby.core.module.ModuleNodes$DefineMethodNode$CallMethodWithLambdaBody.execute(ModuleNodes.java:1373)
	from org.truffleruby.language.RubyLambdaRootNode.execute(RubyLambdaRootNode.java:84)
/home/runner/.rubies/truffleruby-23.1.2/lib/truffle/truffle/cext_ruby.rb:23:in `parent'
	from /home/runner/work/nokogiri/nokogiri/test/xml/test_node_set.rb:602:in `block (4 levels) in <class:TestNodeSet>'
	from /home/runner/work/nokogiri/nokogiri/lib/nokogiri/xml/node_set.rb:237:in `block in each'
	from /home/runner/work/nokogiri/nokogiri/lib/nokogiri/xml/node_set.rb:236:in `upto'
	from /home/runner/work/nokogiri/nokogiri/lib/nokogiri/xml/node_set.rb:236:in `each'

@eregon
Member

eregon commented Mar 11, 2024

dead handle 0xbad... (com.oracle.truffle.api.CompilerDirectives.ShouldNotReachHere) usually means that some VALUE field was not marked properly by a C extension. TruffleRuby may call the marking functions more often or at different places than CRuby: CRuby calls them during GC, which is not possible on the JVM, so TruffleRuby calls them e.g. after returning from a method defined in C using DATA_PTR.
So either some field is missing from the marking function, or an RB_GC_GUARD is missing.
The Ruby part of the backtrace usually gives some hint about which field or local variable is involved.
Since 24.0, native extensions are executed natively, which means all VALUE variables are handles. Before that, only the VALUEs that escaped to the native heap not managed by Sulong (e.g. stored in malloc'd memory or passed to some system library) were handles. So it is now more likely that such issues are discovered.

@eregon eregon added the cexts label Mar 11, 2024
@flavorjones
Contributor

Interesting. The two most recent Nokogiri errors I have found have Ruby backtraces pointing to tests that deal with duplicating nodes:

/home/runner/.rubies/truffleruby-23.1.2/lib/truffle/truffle/cext_ruby.rb:23:in `parent'
	from /home/runner/work/nokogiri/nokogiri/test/xml/test_node_set.rb:602:in `block (4 levels) in <class:TestNodeSet>'
	from /home/runner/work/nokogiri/nokogiri/lib/nokogiri/xml/node_set.rb:237:in `block in each'
	from /home/runner/work/nokogiri/nokogiri/lib/nokogiri/xml/node_set.rb:236:in `upto'
	from /home/runner/work/nokogiri/nokogiri/lib/nokogiri/xml/node_set.rb:236:in `each'
	from /home/runner/work/nokogiri/nokogiri/test/xml/test_node_set.rb:601:in `test_0002_wraps each node within a dup of the Node argument'

and

/home/runner/.rubies/truffleruby-head/lib/truffle/truffle/cext_ruby.rb:24:in `[]'
	from /home/runner/work/nokogiri/nokogiri/lib/nokogiri/xml/node_set.rb:237:in `block in each'
	from /home/runner/work/nokogiri/nokogiri/lib/nokogiri/xml/node_set.rb:236:in `upto'
	from /home/runner/work/nokogiri/nokogiri/lib/nokogiri/xml/node_set.rb:236:in `each'
	from <internal:core> core/enumerable.rb:594:in `any?'
	from /home/runner/work/nokogiri/nokogiri/lib/nokogiri/xml/document_fragment.rb:103:in `css'
	from /home/runner/work/nokogiri/nokogiri/lib/nokogiri/xml/searchable.rb:144:in `at_css'
	from /home/runner/work/nokogiri/nokogiri/test/xml/test_document_fragment.rb:304:in `test_dup_creates_mutable_tree'

I'll take a deeper look when I get a chance.

@ntkme
Contributor

ntkme commented Mar 19, 2024

Observed a very similar error for google-protobuf on truffleruby+graalvm-24.0.0.

What's interesting is that it is throwing from just loading the cext with require 'google/protobuf_c': https://github.com/protocolbuffers/protobuf/blob/v26.0/ruby/lib/google/protobuf_native.rb#L15

dead handle 0xbad000000018070 (com.oracle.truffle.api.CompilerDirectives.ShouldNotReachHere)
	from com.oracle.truffle.api.CompilerDirectives.shouldNotReachHere(CompilerDirectives.java:574)
	from com.oracle.truffle.api.CompilerDirectives.shouldNotReachHere(CompilerDirectives.java:520)
	from org.truffleruby.cext.UnwrapNode$UnwrapNativeNode.raiseError(UnwrapNode.java:107)
	from org.truffleruby.cext.UnwrapNode$UnwrapNativeNode.unwrapTaggedObject(UnwrapNode.java:92)
	from org.truffleruby.cext.UnwrapNodeGen$UnwrapNativeNodeGen$Inlined.execute(UnwrapNodeGen.java:377)
	from org.truffleruby.cext.UnwrapNode.longToWrapper(UnwrapNode.java:270)
	from org.truffleruby.cext.UnwrapNodeGen$Inlined.execute(UnwrapNodeGen.java:143)
	from org.truffleruby.cext.ValueWrapperManager$UnwrapperFunction.execute(ValueWrapperManager.java:401)
	from org.truffleruby.cext.UnwrapperFunctionGen$InteropLibraryExports$Cached.execute(UnwrapperFunctionGen.java:117)
	from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNode$LLVMLookupDispatchForeignNode.doGeneric(LLVMDispatchNode.java:459)
	from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNode$LLVMLookupDispatchForeignNode.doUnknownType(LLVMDispatchNode.java:487)
	from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNodeGen$LLVMLookupDispatchForeignNodeGen.execute(LLVMDispatchNodeGen.java:1471)
	from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNode.doForeignExecutable(LLVMDispatchNode.java:380)
	from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNodeGen.executeDispatch(LLVMDispatchNodeGen.java:272)
	from com.oracle.truffle.llvm.runtime.nodes.func.LLVMCallNode.doCall(LLVMCallNode.java:82)
	from com.oracle.truffle.llvm.runtime.nodes.func.LLVMCallNodeGen.executeGeneric(LLVMCallNodeGen.java:37)
	from com.oracle.truffle.llvm.runtime.nodes.api.LLVMFrameNullerExpression.doGeneric(LLVMFrameNullerExpression.java:71)
	from com.oracle.truffle.llvm.runtime.nodes.api.LLVMFrameNullerExpressionNodeGen.executeGeneric(LLVMFrameNullerExpressionNodeGen.java:29)
	from com.oracle.truffle.llvm.runtime.nodes.vars.LLVMWriteNodeFactory$LLVMWritePointerNodeGen.execute_generic1(LLVMWriteNodeFactory.java:1370)
	from com.oracle.truffle.llvm.runtime.nodes.vars.LLVMWriteNodeFactory$LLVMWritePointerNodeGen.execute(LLVMWriteNodeFactory.java:1344)
	from com.oracle.truffle.llvm.runtime.nodes.base.LLVMBasicBlockNode$InitializedBlockNode.execute(LLVMBasicBlockNode.java:154)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNode.dispatchFromBasicBlock(LLVMDispatchBasicBlockNode.java:116)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNode.doDispatch(LLVMDispatchBasicBlockNode.java:87)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNodeGen.executeGeneric(LLVMDispatchBasicBlockNodeGen.java:33)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMFunctionRootNode.doRun(LLVMFunctionRootNode.java:81)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMFunctionRootNodeGen.executeGeneric(LLVMFunctionRootNodeGen.java:34)
	from com.oracle.truffle.llvm.runtime.nodes.func.LLVMFunctionStartNode.execute(LLVMFunctionStartNode.java:102)
/home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/truffle/truffle/cext.rb:2248:in `block in resolve_registered_addresses'
	from /home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/truffle/truffle/cext.rb:2247:in `each'
	from /home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/truffle/truffle/cext.rb:2247:in `resolve_registered_addresses'
	from /home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/truffle/truffle/cext.rb:220:in `init_extension'
	from <internal:core> core/kernel.rb:229:in `gem_original_require'
	from <internal:/home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/mri/rubygems/core_ext/kernel_require.rb>:37:in `require'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/google-protobuf-4.26.0/lib/google/protobuf_native.rb:15:in `<top (required)>'
	from <internal:core> core/kernel.rb:229:in `gem_original_require'
	from <internal:/home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/mri/rubygems/core_ext/kernel_require.rb>:37:in `require'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/google-protobuf-4.26.0/lib/google/protobuf.rb:57:in `<module:Protobuf>'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/google-protobuf-4.26.0/lib/google/protobuf.rb:15:in `<module:Google>'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/google-protobuf-4.26.0/lib/google/protobuf.rb:14:in `<top (required)>'
	from <internal:core> core/kernel.rb:229:in `gem_original_require'
	from <internal:/home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/mri/rubygems/core_ext/kernel_require.rb>:37:in `require'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/ext/sass/embedded_sass_pb.rb:5:in `<top (required)>'
	from <internal:core> core/kernel.rb:292:in `require_relative'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/lib/sass/embedded_protocol.rb:6:in `<module:EmbeddedProtocol>'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/lib/sass/embedded_protocol.rb:5:in `<module:Sass>'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/lib/sass/embedded_protocol.rb:3:in `<top (required)>'
	from <internal:core> core/kernel.rb:292:in `require_relative'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/lib/sass/compiler.rb:11:in `<top (required)>'
	from <internal:core> core/kernel.rb:292:in `require_relative'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/lib/sass/embedded.rb:3:in `<top (required)>'
	from <internal:core> core/kernel.rb:292:in `require_relative'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/lib/sass-embedded.rb:4:in `<top (required)>'
	from <internal:core> core/kernel.rb:229:in `gem_original_require'
	from <internal:/home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/mri/rubygems/core_ext/kernel_require.rb>:37:in `require'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/spec/spec_helper.rb:3:in `<top (required)>'
	from <internal:core> core/kernel.rb:229:in `gem_original_require'
	from <internal:/home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/mri/rubygems/core_ext/kernel_require.rb>:37:in `require'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/spec/sass/compile_error_spec.rb:3:in `<top (required)>'
	from <internal:core> core/kernel.rb:378:in `load'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/configuration.rb:2138:in `load_file_handling_errors'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/configuration.rb:1638:in `block in load_spec_files'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/configuration.rb:1636:in `each'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/configuration.rb:1636:in `load_spec_files'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:102:in `setup'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:86:in `run'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:71:in `run'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:45:in `invoke'
	from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/exe/rspec:4:in `<main>'

@eregon
Member

eregon commented Mar 20, 2024

@ntkme Could you file a separate issue for that one?
Given that it happens in resolve_registered_addresses, it seems quite different, and might possibly happen every time rather than being transient?

Going forward, it seems best to file separate issues (one per gem) for dead handle errors, because it's very likely to be specific to some code in the gem, and it makes it much easier to track & investigate & keep the information together.
@flavorjones Could you also file a separate issue for nokogiri, so then this one is only about ruby-pg?

@ntkme
Contributor

ntkme commented Mar 20, 2024

@eregon It happens rarely. Most of the time it works just fine. I created a new issue here: #3500

@larskanis
Contributor Author

I finally got truffleruby-24.x running locally and now it looks like a bug in Truffleruby-24.0.
The code is fairly simple in ruby-pg. Three Float numbers are calculated in the init of the extension like so:

	s_nan = rb_eval_string("0.0/0.0");
	rb_global_variable(&s_nan);
	s_pos_inf = rb_eval_string("1.0/0.0");
	rb_global_variable(&s_pos_inf);
	s_neg_inf = rb_eval_string("-1.0/0.0");
	rb_global_variable(&s_neg_inf);

Then in a C-func the value is returned like so:

static VALUE
pg_text_dec_float(...){
	return s_neg_inf;
}

But by that time the returned Float object is no longer valid, resulting in the dead handle error on truffleruby-24.x.

The same happens with any floating point number. But nothing crashes when I use some other Ruby object.
For instance, if I use a String but do not register the global variable, like so:

	s_nan = rb_eval_string("'xyz'");
	// rb_global_variable(&s_nan);

then TruffleRuby crashes with very much the same error as with a Float object. But when I uncomment the rb_global_variable call, no crash happens, because the String is properly marked.

This is in contrast to Float objects. They fail with the dead handle error regardless of the rb_global_variable call. It also doesn't matter if I change rb_global_variable to rb_gc_register_mark_object or rb_define_const. None of them seems to mark a Float object.

@eregon
Member

eregon commented Mar 22, 2024

@larskanis Thank you for the investigation and details, this makes it a lot easier to look into it.

@eregon
Member

eregon commented Mar 22, 2024

I can reproduce the issue reliably in a C API spec, with both bignums and floats (v = LONG2NUM(INT64_MAX);, v = DBL2NUM(0.0/0.0);, v = rb_eval_string("0.0/0.0")).

@eregon
Member

eregon commented Mar 22, 2024

The issue for the Float case seems to be that the ValueWrapper is not kept alive after the Init_ function has returned.
And when that wrapper is garbage-collected we lose the mapping from the native address to the Float instance (a java.lang.Double).
rb_global_variable() etc. add the relevant objects (e.g. the Float instance) to GC_REGISTERED_ADDRESSES. Regular Ruby objects hold onto their ValueWrapper too, so that works fine, but for primitives like Float there is no way to store the ValueWrapper in a java.lang.Double instance.
So I think GC_REGISTERED_ADDRESSES should hold onto the ValueWrappers instead of the actual objects they refer to, and that should fix it.

For bignums like INT64_MAX this actually fits in a Java long so is the same case as Float (cannot store a ValueWrapper in a java.lang.Long instance).
For "true" bignums that don't fit in a Java long it works fine already and it's like e.g. Symbols.
Fixnums are not affected because those are tagged pointers like in CRuby, so VALUE->long is done only from the VALUE address, without needing any ValueWrapper/HandleBlock/etc.
true/false/nil/Qundef are fine, those wrappers are always held alive and have special addresses.
And there are no other kinds of "primitives", the rest is all RubyDynamicObject/ImmutableRubyObject.

A quick workaround until this is fixed is to run with TRUFFLERUBYOPT="--experimental-options --keep-handles-alive".
That will keep all handles (VALUE) alive by leaking them, so obviously just a workaround but could be useful in CI until this is fixed.
We should have a fix very soon hopefully.

@flavorjones
Contributor

I've opened a new issue for the nokogiri errors above at #3503

@eregon
Member

eregon commented Mar 27, 2024

This fix should be included for the 24.0.1 Release (Apr 16, 2024).
(and of course it's fixed on master and in truffleruby-dev/head)
