Tar input needs to be encoding aware #149

Closed
tenderlove opened this Issue Aug 8, 2011 · 1 comment

tenderlove commented Aug 8, 2011

YAML files in RubyGems are written as UTF-8 documents. RubyGems does not specify an encoding when reading the gem, so the YAML file can be read in as US-ASCII even though it contains UTF-8 characters. When that happens, an "invalid byte sequence" error is raised.
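A minimal sketch of the failure mode, independent of RubyGems itself. The sample string is made up for illustration, and the regexp match stands in for whatever string processing the reader performs. It assumes that any regexp match against a string whose bytes are invalid for its tagged encoding raises `ArgumentError`:

```ruby
# Bytes of a UTF-8 YAML line, tagged as US-ASCII the way a reader
# with no explicit encoding might hand them over. (Made-up sample data.)
yaml = "summary: caf\xC3\xA9\n".dup.force_encoding(Encoding::US_ASCII)

# 0xC3 and 0xA9 are not valid US-ASCII bytes.
valid_as_ascii = yaml.valid_encoding?

# A regexp match has to scan the bytes, so it raises on the broken string.
error_message = nil
begin
  yaml =~ /\A--- /
rescue ArgumentError => e
  error_message = e.message
end

# Re-tagging the same bytes as UTF-8 changes no data and makes them valid.
yaml.force_encoding(Encoding::UTF_8)
valid_as_utf8 = yaml.valid_encoding?
match_result = (yaml =~ /caf/)
```

Note that `force_encoding` only changes the encoding tag; it does not transcode the bytes, which is what is wanted when the bytes are already UTF-8.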

Steps to reproduce:

  1. Download bundler 1.0.16. It contains a YAML file with UTF-8 characters in it.

  2. Try to install the gem using Ruby 1.9.2+ with the locale (LANG) set to ISO-8859-1.

Here is my output:

$ LANG=ISO8859-1 gem install ~/Downloads/bundler-1.0.16.gem --backtrace
ERROR:  While executing gem ... (ArgumentError)
    invalid byte sequence in US-ASCII
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/specification.rb:575:in `normalize_yaml_input'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/specification.rb:487:in `from_yaml'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/package/tar_input.rb:190:in `load_gemspec'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/package/tar_input.rb:55:in `block in initialize'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/package/tar_reader.rb:64:in `block in each'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/package/tar_reader.rb:55:in `loop'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/package/tar_reader.rb:55:in `each'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/package/tar_input.rb:35:in `initialize'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/package/tar_input.rb:20:in `new'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/package/tar_input.rb:20:in `open'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/package.rb:44:in `open'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/format.rb:62:in `from_io'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/format.rb:46:in `block in from_file_by_path'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/format.rb:45:in `open'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/format.rb:45:in `from_file_by_path'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/dependency_installer.rb:218:in `block in find_spec_by_name_and_version'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/dependency_installer.rb:215:in `each'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/dependency_installer.rb:215:in `find_spec_by_name_and_version'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/dependency_installer.rb:259:in `install'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/commands/install_command.rb:121:in `block in execute'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/commands/install_command.rb:115:in `each'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/commands/install_command.rb:115:in `execute'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/command.rb:278:in `invoke'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/command_manager.rb:147:in `process_args'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/command_manager.rb:117:in `run'
    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/gem_runner.rb:65:in `run'
    /Users/aaron/.local/bin/gem:21:in `<main>'
$

Here is my gem env:

$ gem env
RubyGems Environment:
  - RUBYGEMS VERSION: 1.8.6.1
  - RUBY VERSION: 1.9.4 (2011-08-05 patchlevel -1) [x86_64-darwin11.0.0]
  - INSTALLATION DIRECTORY: /Users/aaron/.local/lib/ruby/gems/1.9.1
  - RUBY EXECUTABLE: /Users/aaron/.local/bin/ruby
  - EXECUTABLE DIRECTORY: /Users/aaron/.local/bin
  - RUBYGEMS PLATFORMS:
    - ruby
    - x86_64-darwin-11
  - GEM PATHS:
     - /Users/aaron/.local/lib/ruby/gems/1.9.1
     - /Users/aaron/.gem/ruby/1.9.1
  - GEM CONFIGURATION:
     - :update_sources => true
     - :verbose => true
     - :benchmark => false
     - :backtrace => false
     - :bulk_threshold => 1000
  - REMOTE SOURCES:
     - http://rubygems.org/
$

tenderlove commented Aug 8, 2011

I forgot to mention: it should be safe to assume the encoding of the YAML file is UTF-8. RubyGems doesn't specify an encoding when constructing the YAML file, so Psych will default the encoding to UTF-8.
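One way to act on that assumption might look like the sketch below. `read_utf8_yaml` is a hypothetical helper, not RubyGems code and not necessarily the fix that was applied; the `StringIO` stands in for a tar entry, and the metadata content is made up:

```ruby
require 'stringio'
require 'yaml'

# Hypothetical helper: read everything from an IO-like object and tag the
# bytes as UTF-8, on the assumption that gem metadata is always UTF-8.
def read_utf8_yaml(io)
  data = io.read
  data.force_encoding(Encoding::UTF_8)
  raise ArgumentError, 'metadata is not valid UTF-8' unless data.valid_encoding?
  data
end

# Binary-tagged bytes, roughly what reading from an archive would give.
raw = "--- \nsummary: caf\xC3\xA9\n".b
text = read_utf8_yaml(StringIO.new(raw))

# With the correct encoding tag, parsing succeeds regardless of LANG.
spec = YAML.load(text)
summary = spec['summary']
```

Tagging the data once, at the point where it is read, keeps every later regexp or parser call from depending on the process locale.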
