Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Fib the encoding to File.readlines to prevent failure on UTF-8 code lines #1238

Closed
wants to merge 3 commits into from

3 participants

@sodabrew

While working on brianmario/mysql2#470 I came across this problem. I have a UTF-8 encoded Chinese character in the Ruby source file, I'm using this to test encoding support for MySQL error messages. I turned up a problem in rspec, consistent from 2.8.1 (the versions we're pinned at in mysql2) up through 2.14.7 (I was testing if this had been fixed recently, nope). With this change in place, I get the expected failed spec output.

Backtrace without this change:

  1) Mysql2::Error behaves like mysql2 error encoding (MySQL < 5.5) #message   should transcode from nil to ASCII-8BIT
mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_formatter.rb:198:in `readlines': U+5B57 from UTF-8 to US-ASCII (Encoding::UndefinedConversionError)
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_formatter.rb:198:in `read_failed_line'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_text_formatter.rb:299:in `dump_failure_info'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_text_formatter.rb:293:in `dump_failure'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_text_formatter.rb:24:in `block in dump_failures'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_text_formatter.rb:22:in `each'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_text_formatter.rb:22:in `each_with_index'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/formatters/base_text_formatter.rb:22:in `dump_failures'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/reporter.rb:127:in `block in notify'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/reporter.rb:126:in `each'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/reporter.rb:126:in `notify'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/reporter.rb:109:in `finish'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/reporter.rb:60:in `report'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/command_line.rb:25:in `run'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/runner.rb:80:in `run'
    from mysql2/vendor/bundle/ruby/2.1.0/gems/rspec-core-2.14.7/lib/rspec/core/runner.rb:17:in `block in autorun'
ruby -S rspec ./spec/em/em_spec.rb ./spec/mysql2/client_spec.rb ./spec/mysql2/error_spec.rb ./spec/mysql2/result_spec.rb failed
@sodabrew

Bugger, have to make this conditional for Ruby 1.8, it's not happy about extra arguments to File.readlines.

@JonRowe
Owner

Thanks for the assist, we'd need some specs covering this behaviour and I'd also like to check what happens when you force a differently encoded file (say UTF16-LE) to binary with this method.

@JonRowe
Owner

Also we traditionally like to not run those sorts of checks at run time, if you extract the action to a private method you can conditionally define the implementation.

e.g.

private

  if defined?(Encoding)
    def file_lines(file_path)
      File.readlines(file_path, :encoding => 'BINARY')
    end
  else
    def file_lines(file_path)
      File.readlines(file_path) 
    end
  end
@sodabrew

If File.readlines were used more than once, I'd surely make it a function. I'll check if there are other uses.
Yep, needs tests. I'm hacking directly in the GitHub web interface right now, just getting this started :).

@JonRowe
Owner

I'd still prefer you refactor this so that it doesn't run the conditional check every time it prints a failed line, extracting a single use LOC into a method is perfectly acceptable in ruby for readability and things like this.

@sodabrew

Oh, got it - I didn't realize the frequency of use.

Something else occurred to me: I may be causing this problem by changing the Encoding.default_internal to control behavior in MySQL error message handling. I'll try changing the encoding closer to where I need it, and see if that helps.

@sodabrew

I tried restoring Encoding.default_internal immediately after triggering the expected exception, still falls over with this encoding error in rspec :(

@JonRowe
Owner

Encoding is tricky, which is why things like this need specs before changing ;) I'm not sure forcing 'BINARY' will solve this for all, or even the most common encoding situations.

Have you tried using the # encoding: directive in your files rather than monkeying around with Encoding.default_internal?

@sodabrew

My file contains UTF-8 and has encoding: utf8. I need to change the Encoding.default_internal in a spec in order to test behavior of the code under test. This is a weird case, admittedly.

The workaround of forcing your backtrace code to always operate in binary certainly ought to work for all encodings; in any event, your code is not sensitive to the encoding of my ruby files, so there's an assumption one way or the other (i.e. the assumption now is that the encoding of the file that might fail a test is the same as the spec script!)

@JonRowe
Owner

Historically within rspec binary has been the least compatible encoding, as it means "no encoding", remember this output is going to be mixed into peoples consoles, and that this will potentially corrupt the output. Replacing UTF-8 chars with there code points equivalents is not a solution.

@JonRowe
Owner

Looking at your code you're actually setting default_internal when you load your shared examples, so it won't have the desired effect anyway; could you try doing it in a before block, or even better stubbing it directly (so it's automatically cleaned up around your examples)

@sodabrew

The contents of the shared example runs only when the shared example is executed with arguments, no?

@JonRowe
Owner

No, the contents of the outer block are evaluated when you call it_should_behave_like but the examples won't be executed until later, meaning that the last value evaluated will be used in all examples.

Demonstrated thusly: https://gist.github.com/JonRowe/f19eb6f69692f90a1b26

@xaviershay
Collaborator

Closing, stale and it's not clear to me what the next step is.

@xaviershay xaviershay closed this
@sodabrew

No problem, the tests ended up being refactored take a different approach in any event - we actually had to implement a character sanitization routine in mysql2 because Ruby (before 2.1) does not have a best-effort option :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
This page is out of date. Refresh to see the latest.
Showing with 2 additions and 1 deletion.
  1. +2 −1  lib/rspec/core/formatters/base_formatter.rb
View
3  lib/rspec/core/formatters/base_formatter.rb
@@ -204,7 +204,8 @@ def read_failed_line(exception, example)
file_path, line_number = matching_line.match(/(.+?):(\d+)(|:\d+)/)[1..2]
if File.exist?(file_path)
- File.readlines(file_path)[line_number.to_i - 1] ||
+ lines = defined?(Encoding) ? File.readlines(file_path, :encoding => 'BINARY') : File.readlines(file_path)
+ lines[line_number.to_i - 1] ||
"Unable to find matching line in #{file_path}"
else
"Unable to find #{file_path} to read failed line"
Something went wrong with that request. Please try again.