Facing `compile_file': "\\xE2" on US-ASCII (Encoding::InvalidByteSequenceError) error when trying to run any rails command #444

Closed
Physium opened this issue Jun 22, 2023 · 14 comments

Comments

@Physium

Physium commented Jun 22, 2023

So I recently upgraded my project from Ruby 2.5 to 2.7.8 because it had obviously been out of date for the longest time. After the upgrade, I'm faced with the following bootsnap error while trying to deploy it as a Docker container.

I'm currently on:
rails (6.0.6.1)
ruby (2.7.8) previously 2.5.0
bundler (2.1.4) previously 1.17.3
bootsnap 1.16.0

Are there any changes in how newer Ruby versions work with bootsnap?

Also, I noticed that when I run rails db:migrate, for example, bundle exec rails ... would then work. Not sure what's going on.

I would appreciate it if anyone could shed some light.

Traceback (most recent call last):
	19: from bin/rails:10:in `<main>'
	18: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:32:in `require'
	17: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:32:in `require'
	16: from /app/src/vendor/bundle/ruby/2.7.0/gems/railties-6.0.6.1/lib/rails/commands.rb:3:in `<main>'
	15: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:32:in `require'
	14: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:32:in `require'
	13: from /app/src/vendor/bundle/ruby/2.7.0/gems/railties-6.0.6.1/lib/rails/command.rb:3:in `<main>'
	12: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:32:in `require'
	11: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:32:in `require'
	10: from /app/src/vendor/bundle/ruby/2.7.0/gems/activesupport-6.0.6.1/lib/active_support.rb:27:in `<main>'
	 9: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:32:in `require'
	 8: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:32:in `require'
	 7: from /app/src/vendor/bundle/ruby/2.7.0/gems/activesupport-6.0.6.1/lib/active_support/dependencies/autoload.rb:3:in `<main>'
	 6: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:32:in `require'
	 5: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:32:in `require'
	 4: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/compile_cache/iseq.rb:85:in `load_iseq'
	 3: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/compile_cache/iseq.rb:60:in `fetch'
	 2: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/compile_cache/iseq.rb:60:in `fetch'
	 1: from /app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/compile_cache/iseq.rb:42:in `input_to_storage'
/app/src/vendor/bundle/ruby/2.7.0/gems/bootsnap-1.16.0/lib/bootsnap/compile_cache/iseq.rb:42:in `compile_file': "\\xE2" on US-ASCII (Encoding::InvalidByteSequenceError)
@casperisfine
Contributor

I'm not convinced this is a bootsnap error. Have you tried disabling it to confirm it's not just a syntax error with Ruby itself?

@Physium
Author

Physium commented Jun 22, 2023

yes i tried disabling it and it works as per normal

@casperisfine
Contributor

Hum, ok. So:

  • \xE2 (or 226) is a common UTF-8 lead byte, suggesting there are non-ASCII Unicode characters in that source file.
  • However, the exception clearly states the compiler is dealing with a US-ASCII string.
  • So somehow RubyVM::InstructionSequence.compile_file doesn't behave like Kernel.load here; something must make it think it should parse the file as pure ASCII.
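
To make the first two points concrete, here is a minimal sketch. It uses U+2019 (RIGHT SINGLE QUOTATION MARK) purely as an example of a character whose UTF-8 encoding starts with \xE2; `QUOTE` and `decode_as_us_ascii` are hypothetical names, and nothing here claims to be what bootsnap or the Ruby parser does internally.

```ruby
# U+2019 serves only as an example of a character whose UTF-8 form starts with 0xE2.
QUOTE = "\u2019"

# Relabel the raw bytes as US-ASCII (without changing them), then transcode to UTF-8.
# Any byte above 0x7F is not valid US-ASCII, so this raises for non-ASCII input.
def decode_as_us_ascii(string)
  string.b.force_encoding(Encoding::US_ASCII).encode(Encoding::UTF_8)
end

puts QUOTE.bytes.map { |byte| format("0x%02X", byte) }.join(" ")

begin
  decode_as_us_ascii(QUOTE)
rescue Encoding::InvalidByteSequenceError => e
  puts "#{e.class}: #{e.message}"
end
```

The same bytes are fine as UTF-8 and invalid as US-ASCII, so the error says more about which encoding was assumed than about the file itself.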

It's really unclear what's going on, I tried a few local repros but no luck.

So I prepared a branch for you to try, which should tell us which file is causing this. Please try it:

gem "bootsnap", github: "Shopify/bootsnap", branch: "debug-encoding"

It should print [:compile_file, "path/to/file.rb"] when it hits this error. Ideally, please share that file's content in a gist once it's identified.

@Physium
Author

Physium commented Jun 22, 2023

I'm facing a different error now with the branch: I can't load bootsnap at all. Am I missing something?

LoadError: cannot load such file -- bootsnap/bootsnap
/app/src/vendor/cache/bootsnap-3dfa6ea49654/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:29:in `require'
/app/src/vendor/cache/bootsnap-3dfa6ea49654/lib/bootsnap/compile_cache/iseq.rb:3:in `<top (required)>'
/app/src/vendor/cache/bootsnap-3dfa6ea49654/lib/bootsnap/compile_cache.rb:16:in `require_relative'
/app/src/vendor/cache/bootsnap-3dfa6ea49654/lib/bootsnap/compile_cache.rb:16:in `setup'
/app/src/vendor/cache/bootsnap-3dfa6ea49654/lib/bootsnap.rb:57:in `setup'
/app/src/vendor/cache/bootsnap-3dfa6ea49654/lib/bootsnap.rb:100:in `default_setup'
/app/src/vendor/cache/bootsnap-3dfa6ea49654/lib/bootsnap/setup.rb:5:in `<top (required)>'
/app/src/config/boot.rb:5:in `require'
/app/src/config/boot.rb:5:in `<top (required)>'
/app/src/config/application.rb:1:in `require_relative'
/app/src/config/application.rb:1:in `<top (required)>'
/app/src/Rakefile:4:in `require_relative'
/app/src/Rakefile:4:in `<top (required)>'
/app/src/vendor/bundle/ruby/2.7.0/gems/rake-12.3.3/exe/rake:27:in `<top (required)>'
/app/src/bin/bundle:3:in `load'
/app/src/bin/bundle:3:in `<main>'
(See full trace by running task with --trace)

@casperisfine
Contributor

This suggests the native extension wasn't compiled. Not sure how you got into such a state, but there is clearly a problem with the way you build your application.

But whatever, let's try another strategy. Back on the regular gem, please insert this in your config/boot.rb, before the require of bootsnap/setup:

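# Temporary debugging aid: report which file fails to compile, then re-raise.
module DebugIseqCompile
  def compile_file(path, *args)
    super
  rescue Encoding::InvalidByteSequenceError
    p [:compile_file, path] # print the offending path
    raise
  end
end
# Prepend so this wrapper runs before the original singleton method.
RubyVM::InstructionSequence.singleton_class.prepend(DebugIseqCompile)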

The effect should be the same: it should print the path of the file causing the issue.
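
For anyone who wants to see the pattern in isolation, here is a minimal sketch of the same prepend-and-re-raise technique applied to a stand-in class, so it runs without bootsnap or a broken file. `FakeCompiler`, `RecordFailingPath` and `FAILED_PATHS` are hypothetical names invented for this example; the simulated failure is just a US-ASCII decode of a non-ASCII byte.

```ruby
# Stand-in for RubyVM::InstructionSequence (hypothetical, for illustration only).
class FakeCompiler
  def self.compile_file(path)
    # Simulate the failure: decode a non-ASCII byte as US-ASCII for "bad" files.
    "\xE2".b.force_encoding(Encoding::US_ASCII).encode(Encoding::UTF_8) if path.end_with?("bad.rb")
    :compiled
  end
end

# Collects the paths that failed, instead of printing them.
FAILED_PATHS = []

module RecordFailingPath
  def compile_file(path, *args)
    super # call the original singleton method with the same arguments
  rescue Encoding::InvalidByteSequenceError
    FAILED_PATHS << path # remember which file triggered the error
    raise                # re-raise so behavior is otherwise unchanged
  end
end

# Prepending to the singleton class wraps the class-level method.
FakeCompiler.singleton_class.prepend(RecordFailingPath)
```

Because the wrapper re-raises, callers see the same exception as before; the only side effect is that the failing path is recorded.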

@Physium
Author

Physium commented Jun 22, 2023

Thank you so much for doing this.

This is what is being logged:

[:compile_file, "/app/src/vendor/bundle/ruby/2.7.0/gems/activesupport-6.0.6.1/lib/active_support/inflector/methods.rb"]

@casperisfine
Contributor

Ok, so looking at that file:

>> File.binread("/tmp/methods.rb").index("\xE2".b)
=> 7436
>> puts File.binread("/tmp/methods.rb").slice(7420..7460)
gsub(/\b(?<!\w['’`])[a-z]/) do |match|

So it's choking on that regexp. I only had Ruby 2.7.7 locally, but I'm installing 2.7.8 to see if I can repro. I suspect it has something to do with Encoding.default_internal or Encoding.default_external.
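
The lookup above can be generalized to any file. The following is a small sketch, assuming you only need the first non-ASCII byte and a few bytes around it; `first_non_ascii` is a hypothetical helper name, not part of Ruby or bootsnap.

```ruby
# Hypothetical helper: returns the byte offset of the first non-ASCII byte in
# the file, plus some surrounding bytes for context, or nil if the file is
# pure ASCII. The file is read in binary mode so no encoding is assumed.
def first_non_ascii(path, context: 16)
  data = File.binread(path)
  offset = data.bytes.index { |byte| byte > 0x7F }
  return nil if offset.nil?

  start = [offset - context, 0].max
  { offset: offset, context: data.byteslice(start, context * 2) }
end
```

Calling `p first_non_ascii("/tmp/methods.rb")` would report the offset and a raw byte snippet for the file inspected in this thread.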

While I'm digging more: you mentioned Docker, can you give me more info? Which base image are you using? Is it Alpine-based by any chance? How do you install Ruby?

@casperisfine
Contributor

>> "’".b
=> "\xE2\x80\x99"

Here's our \xE2.

@casperisfine
Contributor

Ok, I can repro with:

>> Encoding.default_internal = Encoding::US_ASCII
=> #<Encoding:US-ASCII>
>> RubyVM::InstructionSequence.compile_file("/tmp/methods.rb")
Traceback (most recent call last):
        5: from /opt/rubies/2.7.8/bin/irb:23:in `<main>'
        4: from /opt/rubies/2.7.8/bin/irb:23:in `load'
        3: from /opt/rubies/2.7.8/lib/ruby/gems/2.7.0/gems/irb-1.2.6/exe/irb:11:in `<top (required)>'
        2: from (irb):4
        1: from (irb):4:in `compile_file'
Encoding::UndefinedConversionError (U+2019 from UTF-8 to US-ASCII)
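
Note that the exception class in this repro (UndefinedConversionError) differs from the one in the original report (InvalidByteSequenceError). A minimal sketch of how the same character can produce either error, depending on the direction of the conversion, follows. Which direction applies in the reporter's container is not established in this thread, and `CURLY_QUOTE` and `encoding_error_class` are hypothetical names.

```ruby
CURLY_QUOTE = "\u2019" # UTF-8 bytes: E2 80 99

# Returns the class of the encoding error raised by the block, or nil if none.
def encoding_error_class
  yield
  nil
rescue EncodingError => e
  e.class
end

# Valid UTF-8 text converted *to* US-ASCII: the character has no ASCII equivalent.
to_ascii = encoding_error_class { CURLY_QUOTE.encode(Encoding::US_ASCII) }

# The same bytes labelled *as* US-ASCII and converted to UTF-8: the bytes are invalid.
from_ascii = encoding_error_class do
  CURLY_QUOTE.b.force_encoding(Encoding::US_ASCII).encode(Encoding::UTF_8)
end

p to_ascii
p from_ascii
```

Either way, the root cause is the same: something in the process treats UTF-8 source as US-ASCII.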

This generally means LANG (or LC_ALL) is missing from your environment, see the Encoding section in https://hub.docker.com/_/ruby#Encoding

Setting ENV LANG en_US.UTF-8 in your container will very likely fix your problem.
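
To check what Ruby actually sees inside the container, a short diagnostic like the one below can be run before and after setting LANG. It is only a sketch: `encoding_report` is a hypothetical helper name, and the values it prints depend entirely on the environment it runs in.

```ruby
# Hypothetical diagnostic: collects the locale-related environment variables and
# the encodings Ruby derived from them. Pass a Hash to inspect a different
# environment than the current process's ENV.
def encoding_report(env = ENV)
  {
    lang: env["LANG"],
    lc_all: env["LC_ALL"],
    locale: Encoding.find("locale").name,
    default_external: Encoding.default_external.name,
    default_internal: Encoding.default_internal&.name # nil when unset
  }
end

p encoding_report
```

If `default_external` or `locale` comes back as US-ASCII, the locale variables are the first thing to look at.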

@Physium
Author

Physium commented Jun 22, 2023

My base image is a CentOS 7 image where Ruby is installed with rbenv.

Is there a reason why this is happening now? I'm assuming something changed in Ruby 2.7.8?

@casperisfine
Contributor

It's possible that between 2.5.0 and 2.7.8, the way Encoding.default_internal is initialized changed. That vaguely rings a bell, but it was so long ago I can't quite remember.

@casperisfine
Contributor

It could also be a problem with CentOS, given that the Fedora maintainer had a similar issue a while ago: https://bugs.ruby-lang.org/issues/12127#note-3

@Physium
Author

Physium commented Jun 22, 2023

Nonetheless, I appreciate your effort! This is probably the first time I've received a response on an open-source issue that quickly.

@casperisfine
Contributor

o7
