encoding problems #31

Open

@eloyesp
eloyesp commented Dec 5, 2011

I'm not sure whether this is a bug or whether I'm doing something wrong.

I have a UTF-8 database containing symbols like á ñ ü, and I'm using seed-fu:writer to write seeds for this data.

When I try to use this seed, I get errors like this:

Localidad {:id=>26, :name=>"ALBARI\xC3\x91O", :departamento=>"PEHUAJO",
  :lat=>nil, :lng=>nil, :zoom=>nil, :provincia_id=>1}
Encoding::UndefinedConversionError: "\xC3" from ASCII-8BIT to UTF-8: INSERT INTO "localidades" ("created_at", "departamento", "id", "lat", "lng", "name", "provincia_id", "updated_at", "zoom") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)

Is there something I can do?
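
For anyone hitting the same error: the message indicates that the value reaching the INSERT is tagged ASCII-8BIT (binary) while actually holding UTF-8 bytes. Below is a minimal sketch of that situation in Ruby 1.9+; retag_as_utf8 is a hypothetical helper used only for illustration, not part of seed-fu.

```ruby
# Hypothetical helper (not seed-fu API): re-tag a binary string as UTF-8
# when its bytes form valid UTF-8, otherwise return the string unchanged.
def retag_as_utf8(str)
  copy = str.dup.force_encoding(Encoding::UTF_8)
  copy.valid_encoding? ? copy : str
end

# The bytes of "ALBARIÑO" in UTF-8, tagged as binary like the value in the error.
raw = "ALBARI\xC3\x91O".dup.force_encoding(Encoding::ASCII_8BIT)

puts raw.inspect                  # bytes above 0x7F are shown as \x escapes
puts retag_as_utf8(raw).encoding  # the same bytes, now labelled UTF-8
```

Re-tagging only changes the encoding label; the bytes themselves are untouched, which is why it succeeds where a real conversion from ASCII-8BIT to UTF-8 raises.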

@Fodoj
Fodoj commented Dec 5, 2011

I have the same problem, with Russian symbols.

@eloyesp eloyesp Adding compatibility for different encodings - fix #31
I change the default internal encoding because seed.inspect need to be
properly encoded.
174f796
@eloyesp eloyesp added a commit to eloyesp/seed-fu that referenced this pull request Dec 11, 2011
@eloyesp eloyesp Adding compatibility for different encodings - fix #31
I change the default internal encoding because seed.inspect need to be
properly encoded.
0138fd3
@jonleighton
Collaborator

Thanks for the patch.

Any chance you could rework it to simply use Encoding.default_external? I think it's fine to put in the coding comment, but it doesn't need to be an option. (We should then make sure that anything that gets written to the IO is encoded as default external.)

Also, it needs to retain Ruby 1.8 compatibility.

Cheers

@eloyesp
eloyesp commented Jan 23, 2012

I first tried to use default_external, but it didn't work, because buffer << ' ' + seed.inspect generated ugly lines when default_internal was not set. (I don't remember exactly what the problem was, but it was this line.)

What should it be, if not an option?

Why is it not 1.8 compatible? Is Encoding not available there? Then should I add a raise if option[:encoding] && Ruby.version < 1.9?

Thanks

@jonleighton
Collaborator

Yeah, it looks like seed.inspect will return a string in US-ASCII.

I think we need our own version of inspect that encodes the key/value, if it is a string. So something like:

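def inspect_seed(seed)
  # Build the hash literal by hand so each key/value is encoded before being inspected.
  "{" + seed.map { |k, v| "#{encode_string(k).inspect}=>#{encode_string(v).inspect}" }.join(', ') + "}"
end

def encode_string(s)
  # Ruby 1.8 strings (and non-strings) don't respond to encode, so they pass through unchanged.
  s.respond_to?(:encode) ? s.encode : s
end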

That will deal with ruby 1.8 compat too as the string won't respond to encode.
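
A slightly fuller sketch of the same idea, for illustration only. It assumes the output should be in Encoding.default_external and that binary-tagged strings already hold bytes in that encoding (an assumption that may not hold for every database adapter). encode_value is a hypothetical name, and neither method is existing seed-fu API.

```ruby
# Hypothetical helper: return strings in the default external encoding and
# leave everything else (symbols, numbers, nil, Ruby 1.8 strings) alone.
def encode_value(value)
  return value unless value.is_a?(String) && value.respond_to?(:encode)
  if value.encoding == Encoding::ASCII_8BIT
    # Assumption: binary-tagged bytes are already in the external encoding,
    # so only the label is changed; no bytes are converted.
    value.dup.force_encoding(Encoding.default_external)
  else
    value.encode(Encoding.default_external)
  end
end

# Build the hash literal by hand instead of relying on Hash#inspect.
def inspect_seed(seed)
  pairs = seed.map { |key, value| "#{key.inspect}=>#{encode_value(value).inspect}" }
  "{" + pairs.join(", ") + "}"
end

raw = "ALBARI\xC3\x91O".dup.force_encoding(Encoding::ASCII_8BIT)
puts inspect_seed({ :id => 26, :name => raw, :lat => nil })
```

Whether the non-ASCII characters come out literally or escaped still depends on the process's default encodings, because String#inspect escapes anything it cannot represent in the result encoding.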

@eloyesp
eloyesp commented Jan 26, 2012

Isn't that more complicated than changing internal_encoding? What is the problem with that approach?

The warning approach for Ruby 1.8 seems simpler and easier to maintain.

@jonleighton
Collaborator

The reason I don't want to mess with Encoding.default_internal is that it's a global variable and changing it could impact other code outside seed-fu. So I think it is more appropriate to just respect whatever default_internal and default_external are already set to and just generate the file with that encoding.
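
To make the concern concrete, here is a small sketch (Ruby 2.1+ for Tempfile.create) of how changing Encoding.default_internal alters the result of an unrelated file read in the same process. read_latin1 and with_default_internal are hypothetical helpers written for this illustration; they are not part of seed-fu.

```ruby
require "tempfile"

# Reads a Latin-1 file the way unrelated code in the same process might.
def read_latin1(path)
  File.open(path, "r:ISO-8859-1") { |f| f.read }
end

# Runs the block with Encoding.default_internal temporarily changed,
# then restores the previous value.
def with_default_internal(encoding)
  previous = Encoding.default_internal
  Encoding.default_internal = encoding
  yield
ensure
  Encoding.default_internal = previous
end

Tempfile.create("latin1") do |file|
  file.binmode
  file.write("\xD1".b) # the single byte for "Ñ" in ISO-8859-1
  file.flush

  # With no default_internal set, the string keeps the file's encoding.
  puts read_latin1(file.path).encoding

  # With default_internal set, the very same call is transcoded on read.
  with_default_internal(Encoding::UTF_8) do
    puts read_latin1(file.path).encoding
  end
end
```

The read itself is identical in both cases; only the global setting differs, which is the kind of side effect on other code that the comment above is wary of.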
