Decode quoted-printable with bad line breaks #493

Merged
merged 1 commit into from

@jeremy
Collaborator

Many clients hex-encode the \r, \n, or \r\n line break when they're supposed to use a literal CRLF (quoted-printable was meant to be line-break agnostic). To make matters worse, they also include the literal CRLF after the encoded line break, so decoding it results in a double line break. If anything, they should follow the encoded break with a soft line break.

"=0D\r\n" should decode as "\r\n" not "\r\r\n"
"=0A\r\n" should decode as "\r\n" not "\n\r\n"
"=0D=0A\r\n" should decode as "\r\n" not "\r\n\r\n"

Clients using hex-encoded line breaks along with quoted-printable soft line breaks still work as expected.
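
The behavior described above can be reproduced outside the library with a minimal sketch. It assumes only core Ruby: `to_lf`, `naive_decode`, and `tolerant_decode` are hypothetical helper names, and `to_lf` is a stand-in for the mail gem's `String#to_lf` extension (assumed here to normalize CRLF and bare CR to LF). The regular expression is the one used in this pull request.

```ruby
# Hypothetical stand-in for the mail gem's String#to_lf extension:
# normalize CRLF and bare CR to LF (an assumption about its behavior).
def to_lf(str)
  str.gsub(/\r\n|\r/, "\n")
end

# Plain quoted-printable decode, with no special handling of
# hex-encoded line breaks.
def naive_decode(str)
  to_lf(str.unpack("M*").first)
end

# Collapse a hex-encoded line break that is immediately followed by a
# literal CRLF into a single CRLF, then decode as usual.
def tolerant_decode(str)
  to_lf(str.gsub(/(?:=0D=0A|=0D|=0A)\r\n/, "\r\n").unpack("M*").first)
end

broken = "one=0D=0A\r\ntwo"   # encoded break followed by a literal CRLF
soft   = "one=0D=0A=\r\ntwo"  # encoded break followed by a soft line break

p naive_decode(broken)     # => "one\n\ntwo" (double line break)
p tolerant_decode(broken)  # => "one\ntwo"
p tolerant_decode(soft)    # => "one\ntwo" (soft break input is untouched by the gsub)
```

The pattern only matches when the literal CRLF directly follows the encoded break, so input that ends the line with `=` (a soft line break) passes through the substitution unchanged.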

@jeremy jeremy Decode quoted-printable with bad line breaks.
Many clients hex-encode the \r, \n, or \r\n line break when they're
supposed to use a literal CRLF (quoted-printable was meant to be
line-break agnostic). To make matters worse, they also include the
literal CRLF after the encoded line break, so decoding it results in a
double line break. If anything, they should use a soft line break.

"=0D\r\n" should decode as "\n" not "\r\r\n"
"=0A\r\n" should decode as "\n" not "\n\r\n"
"=0D=0A\r\n" should decode as "\n" not "\r\n\r\n"

Clients using hex-encoded line breaks along with quoted-printable soft
line breaks still work as expected.
e8f42b2
@jeremy jeremy merged commit af770e2 into from
@jeremy jeremy referenced this pull request from a commit
@jeremy jeremy Update CHANGELOG for #493 83c0321
Commits on Jan 28, 2013
  1. @jeremy

    Decode quoted-printable with bad line breaks.

    jeremy authored
5 lib/mail/encodings/quoted_printable.rb
@@ -12,9 +12,10 @@ def self.can_encode?(str)
         EightBit.can_encode? str
       end
 
-      # Decode the string from Quoted-Printable
+      # Decode the string from Quoted-Printable. Cope with hard line breaks
+      # that were incorrectly encoded as hex instead of literal CRLF.
       def self.decode(str)
-        str.unpack("M*").first.to_lf
+        str.gsub(/(?:=0D=0A|=0D|=0A)\r\n/, "\r\n").unpack("M*").first.to_lf
       end
 
       def self.encode(str)
16 spec/mail/encodings/quoted_printable_spec.rb
@@ -31,5 +31,21 @@
     result = "\000\000\000\000"
     Mail::Encodings::QuotedPrintable.decode("=00=00=00=00").should eq result
   end
+
+  %w(=0D =0A =0D=0A).each do |linebreak|
+    expected = "first line wraps\n\nsecond paragraph"
+    it "should cope with inappropriate #{linebreak} line break encoding" do
+      body = "first line=\r\n wraps#{linebreak}\r\n#{linebreak}\r\nsecond paragraph=\r\n"
+      Mail::Encodings::QuotedPrintable.decode(body).should eq expected
+    end
+  end
+
+  [["\r", "=0D"], ["\n", "=0A"], ["\r\n", "=0D=0A"]].each do |crlf, linebreak|
+    expected = "first line wraps\n\nsecond paragraph"
+    it "should allow encoded #{linebreak} line breaks with soft line feeds" do
+      body = "first line=\r\n wraps#{linebreak}=\r\n#{linebreak}=\r\nsecond paragraph=\r\n"
+      Mail::Encodings::QuotedPrintable.decode(body).should eq expected
+    end
+  end
 
 end