Skip to content


Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP


Fib the encoding to File.readlines to prevent failure on UTF-8 code lines #1238

wants to merge 3 commits into from

3 participants


While working on brianmario/mysql2#470 I came across this problem. I have a UTF-8 encoded Chinese character in the Ruby source file, I'm using this to test encoding support for MySQL error messages. I turned up a problem in rspec, consistent from 2.8.1 (the versions we're pinned at in mysql2) up through 2.14.7 (I was testing if this had been fixed recently, nope). With this change in place, I get the expected failed spec output.

Backtrace without this change:

  1) Mysql2::Error behaves like mysql2 error encoding (MySQL < 5.5) #message   should transcode from nil to ASCII-8BIT
mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_formatter.rb:198:in `readlines': U+5B57 from UTF-8 to US-ASCII (Encoding::UndefinedConversionError)
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_formatter.rb:198:in `read_failed_line'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_text_formatter.rb:299:in `dump_failure_info'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_text_formatter.rb:293:in `dump_failure'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_text_formatter.rb:24:in `block in dump_failures'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_text_formatter.rb:22:in `each'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_text_formatter.rb:22:in `each_with_index'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_text_formatter.rb:22:in `dump_failures'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/reporter.rb:127:in `block in notify'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/reporter.rb:126:in `each'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/reporter.rb:126:in `notify'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/reporter.rb:109:in `finish'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/reporter.rb:60:in `report'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/command_line.rb:25:in `run'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/runner.rb:80:in `run'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/runner.rb:17:in `block in autorun'
ruby -S rspec ./spec/em/em_spec.rb ./spec/mysql2/client_spec.rb ./spec/mysql2/error_spec.rb ./spec/mysql2/result_spec.rb failed

Bugger, have to make this conditional for Ruby 1.8, it's not happy about extra arguments to File.readlines.


Thanks for the assist, we'd need some specs covering this behaviour and I'd also like to check what happens when you force a differently encoded file (say UTF16-LE) to binary with this method.


Also we traditionally like to not run those sorts of checks at run time, if you extract the action to a private method you can conditionally define the implementation.



  if defined?(Encoding)
    def file_lines(file_path)
      File.readlines(file_path, :encoding => 'BINARY')
    def file_lines(file_path)

If File.readlines were used more than once, I'd surely make it a function. I'll check if there are other uses.
Yep, needs tests. I'm hacking directly in the GitHub web interface right now, just getting this started :).


I'd still prefer you refactor this so that it doesn't run the conditional check every time it prints a failed line, extracting a single use LOC into a method is perfectly acceptable in ruby for readability and things like this.


Oh, got it - I didn't realize the frequency of use.

Something else occurred to me: I may be causing this problem by changing the Encoding.default_internal to control behavior in MySQL error message handling. I'll try changing the encoding closer to where I need it, and see if that helps.


I tried restoring Encoding.default_internal immediately after triggering the expected exception, still falls over with this encoding error in rspec :(


Encoding is tricky, which is why things like this need specs before changing ;) I'm not sure forcing 'BINARY' will solve this for all, or even the most common encoding situations.

Have you tried using the # encoding: directive in your files rather than monkeying around with Encoding.default_internal?


My file contains UTF-8 and has encoding: utf8. I need to change the Encoding.default_internal in a spec in order to test behavior of the code under test. This is a weird case, admittedly.

The workaround of forcing your backtrace code to always operate in binary certainly ought to work for all encodings; in any event, your code is not sensitive to the encoding of my ruby files, so there's an assumption one way or the other (i.e. the assumption now is that the encoding of the file that might fail a test is the same as the spec script!)


Historically within rspec binary has been the least compatible encoding, as it means "no encoding", remember this output is going to be mixed into peoples consoles, and that this will potentially corrupt the output. Replacing UTF-8 chars with there code points equivalents is not a solution.


Looking at your code you're actually setting default_internal when you load your shared examples, so it won't have the desired effect anyway; could you try doing it in a before block, or even better stubbing it directly (so it's automatically cleaned up around your examples)


The contents of the shared example runs only when the shared example is executed with arguments, no?


No, the contents of the outer block are evaluated when you call it_should_behave_like but the examples won't be executed until later, meaning that the last value evaluated will be used in all examples.

Demonstrated thusly:


Closing, stale and it's not clear to me what the next step is.

@xaviershay xaviershay closed this

No problem, the tests ended up being refactored take a different approach in any event - we actually had to implement a character sanitization routine in mysql2 because Ruby (before 2.1) does not have a best-effort option :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
This page is out of date. Refresh to see the latest.
Showing with 2 additions and 1 deletion.
  1. +2 −1  lib/rspec/core/formatters/base_formatter.rb
3  lib/rspec/core/formatters/base_formatter.rb
@@ -204,7 +204,8 @@ def read_failed_line(exception, example)
file_path, line_number = matching_line.match(/(.+?):(\d+)(|:\d+)/)[1..2]
if File.exist?(file_path)
- File.readlines(file_path)[line_number.to_i - 1] ||
+ lines = defined?(Encoding) ? File.readlines(file_path, :encoding => 'BINARY') : File.readlines(file_path)
+ lines[line_number.to_i - 1] ||
"Unable to find matching line in #{file_path}"
"Unable to find #{file_path} to read failed line"
Something went wrong with that request. Please try again.