Treat GB2312 encodings as GB18030 #504

Merged: 1 commit

2 participants: Bob Potter, Jeremy Kemper
Bob Potter
GB 18030 character assignments are backwards compatible with the GB 2312-1980 standard and the GBK specification.

http://icu-project.org/docs/papers/gb18030.html
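
As a rough illustration of the compatibility claim, the Ruby sketch below encodes a short string as GB2312 and then decodes the very same bytes as GB18030. The helper name `decode_gb2312_as_gb18030` and the sample text are assumptions made for this example; neither is part of the pull request.

```ruby
# Hypothetical helper (not from the mail gem): treat bytes that were
# produced as GB2312 as if they were GB18030 and convert them to UTF-8.
def decode_gb2312_as_gb18030(bytes)
  bytes.dup.force_encoding(Encoding::GB18030).encode(Encoding::UTF_8)
end

# Sample text "中文", written with escapes so the source file's own
# encoding does not matter.
text = "\u4E2D\u6587"

# Encode with the older standard, then decode with the newer one.
gb2312_bytes = text.encode(Encoding::GB2312)
puts decode_gb2312_as_gb18030(gb2312_bytes) == text
```

If the two standards agree on these code points, the comparison holds and nothing is lost by always decoding with the larger character set.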

Bob Potter

Mail that identifies itself as GB2312 but actually contains GB18030 data seems to be a relatively common issue.

Jeremy Kemper (jeremy) merged commit c299cda.
Commits on Jan 30, 2013
  1. Bob Potter

    Treat GB2312 encodings as GB18030

    bpot authored
lib/mail/version_specific/ruby_1_9.rb (6 changes)
@@ -132,6 +132,12 @@ def Ruby19.pick_encoding(charset)
       when 'shift-jis'
         Encoding::Shift_JIS
+      # Many encoded fields which self identify as GB2312 are
+      # actually GB18030. Just use GB18030 since it is a superset
+      # of GB2312.
+      when /gb2312/i
+        Encoding::GB18030
+
       else
         charset
       end
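
To see the effect of the new branch in isolation, here is a minimal sketch that reproduces only the added `when` clause. The method name `pick_encoding_sketch` and the sample labels are assumptions for illustration; the real `Ruby19.pick_encoding` handles more charsets than shown here.

```ruby
# Hypothetical, cut-down stand-in for the charset lookup. Only the branch
# added by this change is reproduced; everything else falls through.
def pick_encoding_sketch(charset)
  case charset
  when /gb2312/i
    # The pattern is unanchored and case-insensitive, so any label that
    # merely contains "gb2312" is mapped to GB18030.
    Encoding::GB18030
  else
    charset
  end
end

# Sample labels chosen for this example.
%w[GB2312 gb2312 csGB2312 utf-8].each do |label|
  puts "#{label} -> #{pick_encoding_sketch(label)}"
end
```

Because the pattern is not anchored, variants such as `csGB2312` take the same path as the plain label, while unrelated charsets are returned unchanged.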
spec/mail/encodings_spec.rb (5 changes)
@@ -165,6 +165,11 @@
       string = '=?shift-jis?Q?=93=FA=96{=8C=EA=?='.force_encoding('us-ascii')
       Mail::Encodings.value_decode(string).should == "日本語"
     end
+
+    it "should decode GB18030 encoded string misidentified as GB2312" do
+      string = '=?GB2312?B?6V8=?='.force_encoding('us-ascii')
+      Mail::Encodings.value_decode(string).should == "開"
+    end
   end
 end
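
The payload in the new spec can be inspected by hand to see why the declared charset fails. The sketch below is only an approximation: `split_b_encoded_word` is a hypothetical helper written for this example, not the mail gem's parser, and it handles only Base64 ("B") encoded-words.

```ruby
# Hypothetical helper: split a Base64 ("B") encoded-word into its charset
# label and raw payload bytes.
def split_b_encoded_word(word)
  match = /\A=\?([^?]+)\?[Bb]\?([^?]*)\?=\z/.match(word)
  raise ArgumentError, "not a B encoded-word: #{word}" unless match
  # 'm' is the Base64 directive; the result is a binary string.
  [match[1], match[2].unpack1('m')]
end

charset, raw = split_b_encoded_word('=?GB2312?B?6V8=?=')
puts raw.bytes.map { |b| format('%02X', b) }.join(' ')

# GB2312 trail bytes must lie in 0xA1..0xFE, so the second byte (0x5F)
# makes the payload invalid under the declared charset.
puts raw.dup.force_encoding(charset).valid_encoding?

# GB18030 also allows trail bytes in 0x40..0x7E, so the same two bytes
# form a single valid character there.
as_gb18030 = raw.dup.force_encoding(Encoding::GB18030)
puts as_gb18030.valid_encoding?
puts as_gb18030.encode(Encoding::UTF_8).length
```

This is the situation the spec exercises: the header claims GB2312, but the bytes only make sense once they are read as GB18030.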