Treat GB2312 encodings as GB18030 #504

Merged: 1 commit into mikel:master

2 participants

@bpot
GB 18030 character assignments are backwards compatible with the GB 2312-1980 standard and the GBK specification.

http://icu-project.org/docs/papers/gb18030.html
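The compatibility claim can be checked directly in Ruby. The following is a minimal sketch, not part of the patch: it takes a byte sequence that is valid GB 2312 (the sample text "中文" is my own choice, not from this PR) and shows that labelling it as GB18030 decodes to the same characters.

```ruby
# "中文" in GB 2312 (EUC-CN) is the byte sequence D6 D0 CE C4.
# The sample text is an illustrative assumption, not taken from the PR.
gb2312_bytes = "\xD6\xD0\xCE\xC4".b

# Interpret the same bytes under each label, then transcode to UTF-8.
as_gb2312  = gb2312_bytes.dup.force_encoding(Encoding::GB2312).encode(Encoding::UTF_8)
as_gb18030 = gb2312_bytes.dup.force_encoding(Encoding::GB18030).encode(Encoding::UTF_8)

puts "GB2312  label: #{as_gb2312}"
puts "GB18030 label: #{as_gb18030}"
puts "same result:   #{as_gb2312 == as_gb18030}"
```

Because correctly labelled GB 2312 input decodes identically under GB18030, widening the label should not change the output for well-formed messages.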

@bpot

This seems to be a relatively common issue.

@jeremy jeremy merged commit c299cda into mikel:master
Commits on Jan 30, 2013
  1. @bpot

    Treat GB2312 encodings as GB18030

    bpot authored
6 lib/mail/version_specific/ruby_1_9.rb
@@ -132,6 +132,12 @@ def Ruby19.pick_encoding(charset)
       when 'shift-jis'
         Encoding::Shift_JIS
+      # Many encoded fields which self identify as GB2312 are
+      # actually GB18030. Just use GB18030 since it is a superset
+      # of GB2312.
+      when /gb2312/i
+        Encoding::GB18030
+
       else
         charset
       end
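To see how the new branch behaves in isolation, here is a rough standalone sketch of the lookup. `pick_encoding_sketch` is a hypothetical name, and only the branches visible in this hunk are reproduced; the real `Ruby19.pick_encoding` has other cases that are not shown here.

```ruby
# Hypothetical stand-in for the charset lookup in the hunk above.
# Only the branches visible in the diff are reproduced.
def pick_encoding_sketch(charset)
  case charset
  when 'shift-jis'
    Encoding::Shift_JIS
  when /gb2312/i
    # `when` uses Regexp#===, so any label containing "gb2312"
    # (in any letter case) is mapped to GB18030.
    Encoding::GB18030
  else
    # Unrecognised labels are passed through unchanged.
    charset
  end
end

%w[GB2312 gb2312 shift-jis utf-8].each do |label|
  puts "#{label} -> #{pick_encoding_sketch(label).inspect}"
end
```

Note that the pattern is unanchored and case-insensitive, so variant spellings of the label that merely contain "gb2312" take the same branch.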
5 spec/mail/encodings_spec.rb
@@ -165,6 +165,11 @@
       string = '=?shift-jis?Q?=93=FA=96{=8C=EA=?='.force_encoding('us-ascii')
       Mail::Encodings.value_decode(string).should == "日本語"
     end
+
+    it "should decode GB18030 encoded string misidentified as GB2312" do
+      string = '=?GB2312?B?6V8=?='.force_encoding('us-ascii')
+      Mail::Encodings.value_decode(string).should == "開"
+    end
   end
 end
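For readers wondering why this particular payload exercises the fix, the sketch below decodes the Base64 part of the encoded word by hand and checks it against both encodings. It uses only core Ruby and does not go through `Mail::Encodings`, so it illustrates the byte-level problem rather than the gem's decoding path.

```ruby
# Base64 payload of the encoded word '=?GB2312?B?6V8=?=' from the spec.
raw = "6V8=".unpack("m").first # binary string holding the bytes E9 5F

puts raw.bytes.map { |b| format("%02X", b) }.join(" ")

# GB 2312 (EUC-CN) requires a trail byte in A1..FE, so 5F is out of range.
valid_as_gb2312 = raw.dup.force_encoding(Encoding::GB2312).valid_encoding?

# GB18030 also accepts two-byte sequences with trail bytes 40..7E and 80..FE.
valid_as_gb18030 = raw.dup.force_encoding(Encoding::GB18030).valid_encoding?

puts "valid as GB2312:  #{valid_as_gb2312}"
puts "valid as GB18030: #{valid_as_gb18030}"

# Transcoding under the declared label fails; under GB18030 it succeeds.
begin
  raw.dup.force_encoding(Encoding::GB2312).encode(Encoding::UTF_8)
rescue Encoding::InvalidByteSequenceError => e
  puts "GB2312 transcoding failed: #{e.class}"
end

decoded = raw.dup.force_encoding(Encoding::GB18030).encode(Encoding::UTF_8)
puts "GB18030 transcoding gave #{decoded.length} character(s)"
```

The payload is a single two-byte character that lies outside the GB 2312 range but inside the GBK/GB18030 extension area, which is the mislabelling case the spec is meant to cover.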