encoding problems #31


@eloyesp

I'm not sure whether this is a bug or whether I'm doing something wrong.

I have a UTF-8 database containing characters like á, ñ and ü, and I'm using the seed-fu writer to write seeds for this data.

When I try to load the generated seed file, I get errors:

Localidad {:id=>26, :name=>"ALBARI\xC3\x91O", :departamento=>"PEHUAJO",
  :lat=>nil, :lng=>nil, :zoom=>nil, :provincia_id=>1}
Encoding::UndefinedConversionError: "\xC3" from ASCII-8BIT to UTF-8: INSERT INTO "localidades" ("created_at", "departamento", "id", "lat", "lng", "name", "provincia_id", "updated_at", "zoom") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)

Is there something I can do?
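
The error above is what Ruby raises when a string tagged as ASCII-8BIT (binary) but holding UTF-8 bytes is transcoded to UTF-8. Here is a minimal sketch of that failure, assuming Ruby 1.9 or later and a UTF-8 source file. The variable names are illustrative and not part of seed-fu:

```ruby
# encoding: utf-8

# The bytes are valid UTF-8 for "ALBARIÑO", but the string is tagged as binary,
# which is the situation the error message describes.
name = "ALBARI\xC3\x91O".dup.force_encoding(Encoding::ASCII_8BIT)

error = begin
  name.encode(Encoding::UTF_8) # binary -> UTF-8 has no mapping for byte \xC3
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

# Relabelling the same bytes as UTF-8 (no transcoding) recovers the text.
fixed = name.dup.force_encoding(Encoding::UTF_8)
```

Relabelling with force_encoding is only safe when the bytes are known to be UTF-8 already. It does not validate or convert anything.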

@Fodoj

I have the same problem, with Russian characters.

@eloyesp eloyesp Adding compatibility for different encodings - fix #31
I change the default internal encoding because seed.inspect need to be
properly encoded.
174f796
@jonleighton
Collaborator

Thanks for the patch.

Any chance you could rework it to simply use Encoding.default_external? I think it's fine to put in the coding comment, but it doesn't need to be an option. (We should then make sure that anything that gets written to the IO is encoded as default external.)

Also, it needs to retain Ruby 1.8 compatibility.

Cheers
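
A rough sketch of what such a coding comment could look like when derived from Encoding.default_external rather than from an option. The helper below is hypothetical (it is not the code in this pull request) and assumes that a Ruby without the Encoding constant, such as 1.8, should simply emit no comment:

```ruby
# Hypothetical helper, not the code from this pull request.
def encoding_comment
  # Ruby 1.8 has no Encoding constant, so emit nothing there.
  return "" unless defined?(::Encoding)

  "# encoding: #{Encoding.default_external.name.downcase}\n"
end

comment = encoding_comment
```

Reading the global instead of writing it keeps the writer from changing encoding behavior elsewhere in the process.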

@eloyesp

I first tried to use default_external, but it didn't work: without default_internal set, buffer << ' ' + seed.inspect generated ugly lines. (I don't remember exactly what the problem was, but it was in this line.)

If it shouldn't be an option, what should it be?

Why is it not 1.8 compatible? Because Encoding isn't there? Should I then raise if options[:encoding] is given and the Ruby version is older than 1.9?

Thanks

@jonleighton
Collaborator

Yeah it looks like the seed.inspect will return a string in US-ASCII.

I think we need our own version of inspect that encodes the key/value, if it is a string. So something like:

def inspect_seed(seed)
  "{" + seed.map { |k, v| "#{encode_string k}=>#{encode_string v}" }.join(', ') + "}"
end

def encode_string(s)
  s.respond_to?(:encode) ? s.encode : s
end

That will deal with ruby 1.8 compat too as the string won't respond to encode.
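
As posted, the snippet above interpolates keys and values without quoting them, so string values would appear bare in the output. Here is a hedged variant, assuming UTF-8 as the target encoding and calling inspect on each key and value. Whether String#inspect keeps or escapes non-ASCII characters depends on the Ruby version and the default encodings, so this sketch only relies on the result being loadable Ruby:

```ruby
# encoding: utf-8

# Encode strings to UTF-8 (an assumption); leave everything else untouched.
# On Ruby 1.8, strings do not respond to encode and pass through unchanged.
def encode_string(value)
  value.respond_to?(:encode) ? value.encode(Encoding::UTF_8) : value
end

# Build a hash literal by inspecting each key and value individually.
def inspect_seed(seed)
  pairs = seed.map { |key, value| "#{key.inspect}=>#{encode_string(value).inspect}" }
  "{" + pairs.join(", ") + "}"
end

seed = { :id => 2, :title => "Máster" }
line = inspect_seed(seed)

# Evaluating the generated literal should give back an equal hash.
round_tripped = eval(line)
```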

@eloyesp
@jonleighton
Collaborator

The reason I don't want to mess with Encoding.default_internal is that it's a global variable and changing it could impact other code outside seed-fu. So I think it is more appropriate to just respect whatever default_internal and default_external are already set to and just generate the file with that encoding.
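
One way to choose the output encoding per file without touching the global is the external-encoding suffix in the File.open mode string, which the patch also uses. Here is a minimal sketch, assuming Ruby 1.9 or later and a throwaway temporary directory. The file name and the line written are illustrative only:

```ruby
# encoding: utf-8
require "tmpdir"

default_internal_before = Encoding.default_internal
contents = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "seeds.rb")

  # "w:utf-8" sets the external encoding for this IO only.
  File.open(path, "w:utf-8") do |file|
    file.write("# encoding: utf-8\n")
    file.write(%({:id=>2, :title=>"Máster"}\n))
  end

  contents = File.read(path, :encoding => "utf-8")
end

# The process-wide setting is exactly what it was before.
default_internal_after = Encoding.default_internal
```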

Commits on Dec 11, 2011
  1. @eloyesp

    Adding compatibility for different encodings - fix #31

    eloyesp authored
    I change the default internal encoding because seed.inspect need to be
    properly encoded.
Showing with 37 additions and 1 deletion.
  1. +23 −1 lib/seed-fu/writer.rb
  2. +14 −0 spec/writer_spec.rb
24 lib/seed-fu/writer.rb
@@ -31,9 +31,11 @@ class Writer
     # @option options [:seed, :seed_once] :seed_type (:seed) The method to use when generating
     # seeds. See {ActiveRecordExtension} for details.
     # @option options [Array<Symbol>] :constraints ([:id]) The constraining attributes for the seeds
+    # @option options [String] :encoding The encoding to use in the generated file.
     def initialize(options = {})
       @options = self.class.default_options.merge(options)
       raise ArgumentError, "missing option :class_name" unless @options[:class_name]
+      set_encoding if @options[:encoding]
     end
     # Creates a new instance of {Writer} with the `options`, and then calls {#write} with the
@@ -56,10 +58,11 @@ def write(io_or_filename, &block)
       if io_or_filename.respond_to?(:write)
         write_to_io(io_or_filename, &block)
       else
-        File.open(io_or_filename, 'w') do |file|
+        File.open(io_or_filename, "w#{encoding_string}") do |file|
           write_to_io(file, &block)
         end
       end
+      unset_encoding if @old_encoding
     end
     # Add a seed. Must be called within a block passed to {#write}.
@@ -88,6 +91,7 @@ def <<(seed)
     def write_to_io(io)
       @io, @count = io, 0
+      @io.write(encoding_comment) if @options[:encoding]
       @io.write(file_header)
       @io.write(seed_header)
       yield(self)
@@ -128,5 +132,23 @@ def seed_footer
     def chunk_this_seed?
       @count != 0 && (@count % @options[:chunk_size]) == 0
     end
+
+    def encoding_string
+      ":#{@options[:encoding]}" if @options[:encoding]
+    end
+
+    def encoding_comment
+      "# encoding: #{@options[:encoding]}\n"
+    end
+
+    def set_encoding
+      @old_encoding = Encoding.default_internal
+      Encoding.default_internal = @options[:encoding]
+    end
+
+    def unset_encoding
+      Encoding.default_internal = @old_encoding if @old_encoding
+    end
   end
 end
+
14 spec/writer_spec.rb
@@ -1,3 +1,4 @@
+# encoding: utf-8
 require 'spec_helper'
 describe SeedFu::Writer do
@@ -42,4 +43,17 @@
     SeededModel.find(1).title.should == "Dr"
   end
+
+  it "should support specifying the output encoding to use" do
+    SeedFu::Writer.write(@file_name, :class_name => 'SeededModel', :encoding => "utf-8") do |writer|
+      writer << { :id => 1, :title => "Mr" }
+      writer << { :id => 2, :title => "Máster" }
+    end
+
+    File.read(@file_name).should include("# encoding: utf-8\n")
+
+    load @file_name
+    SeededModel.find(2).title.should == "Máster"
+  end
 end
+