Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

One go parsing #3

Open
wants to merge 2 commits into from

2 participants

@ruz
ruz commented

This pull request is a request for review and comments.

It fixes several issues reported earlier, but a few test start to fail. Could you please review failing tests and decide how to change tests and/or code?

ruz added some commits
@ruz ruz properly escape chars when quoting phrase
Not only " should be escaped, but \ as well.
d21f709
@ruz ruz parse in one go using capturing regexp
Before this change each matched mailbox in the string
was re-matched with separate regexps to extract parts.
This could lead to very suprising results. See ticket #52102
for details [1].

Also, quoted phrases was not de-quoted properly.

[1] https://rt.cpan.org/Ticket/Display.html?id=52102
89fddb4
@rjbs
Owner

I believe that the failing test cases are test bugs, not code bugs in your changes.

Also, I loathe these tests. For example, this is terrible:

[
'"Name"',
...
]

The first argument there is fed to build a new address against which to compare the parsed one, so the double-quotes are redundant. It would be much better if the test values were tests for the results of the phrase, address, comment, as_string, and name methods. This would break tests with dumb extra quotes, but would make the meaning of the tests much easier. I'd actually like the tests to be more in the form:

$tests = {
  q{"Some Input" <string@example.com>} => [
    {
      as_string => ...defaults to "should be $input"...,
      phrase => "Some Input",
      address => 'string@example.com',
    }
  ],
  ...,
}

...maybe I'll do that after you're done, unless you build something from that Dominic Sayers corpus. ;)

@rjbs
Owner

I pushed a few small changes, primarily to make it easier for me to skim test output.

@rjbs
Owner

You wrote:

This pull request is a request for review and comments.

I reviewed, I think. What's left for this to be merged? Test fixes?

@ruz

test fixes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Nov 6, 2012
  1. @ruz

    properly escape chars when quoting phrase

    ruz authored
    Not only " should be escaped, but \ as well.
  2. @ruz

    parse in one go using capturing regexp

    ruz authored
    Before this change each matched mailbox in the string
    was re-matched with separate regexps to extract parts.
    This could lead to very suprising results. See ticket #52102
    for details [1].
    
    Also, quoted phrases was not de-quoted properly.
    
    [1] https://rt.cpan.org/Ticket/Display.html?id=52102
This page is out of date. Refresh to see the latest.
Showing with 60 additions and 22 deletions.
  1. +21 −20 lib/Email/Address.pm
  2. +8 −1 t/quoting.t
  3. +31 −1 t/tests.t
View
41 lib/Email/Address.pm
@@ -144,6 +144,12 @@ $angle_addr = qr/$cfws*<$addr_spec>$cfws*/;
$name_addr = qr/$display_name?$angle_addr/;
$mailbox = qr/(?:$name_addr|$addr_spec)$comment*/;
+my $capturing_mailbox = qr/
+ (?:($display_name)?($cfws*)<($addr_spec)>($cfws*)
+ |($addr_spec)
+ )($comment*)
+/x;
+
sub _PHRASE () { 0 }
sub _ADDRESS () { 1 }
sub _COMMENT () { 2 }
@@ -213,36 +219,31 @@ sub parse {
return @cached;
}
- my (@mailboxes) = ($line =~ /$mailbox/go);
+ my (@mailboxes) = ($line =~ /($capturing_mailbox)/go);
my @addrs;
- foreach (@mailboxes) {
- my $original = $_;
-
- my @comments = /($comment)/go;
- s/$comment//go if @comments;
+ while (my @list = splice @mailboxes, 0, 7) {
+ my ($original, $phrase, $address, @comments)
+ = ($list[0], $list[1], $list[3]||$list[5], @list[2,4,6]);
- my ($user, $host, $com);
- ($user, $host) = ($1, $2) if s/<($local_part)\@($domain)>//o;
- if (! defined($user) || ! defined($host)) {
- s/($local_part)\@($domain)//o;
- ($user, $host) = ($1, $2);
- }
+ return if $address =~ /\P{ASCII}/;
- return if $user =~ /\P{ASCII}/;
- return if $host =~ /\P{ASCII}/;
+ if ( defined $phrase ) {
+ # for backwards compatibility
+ unshift @comments, $phrase =~ /($comment)/go;
+ $phrase =~ s/$comment//go;
- my ($phrase) = /($display_name)/o;
+ $phrase =~ s/\\(.)/$1/g if $phrase =~ s/\A\s*"(.*)"\s*\z/$1/;
+ }
- for ( $phrase, $host, $user, @comments ) {
+ for ( $phrase, $address, @comments ) {
next unless defined $_;
s/^\s+//;
s/\s+$//;
$_ = undef unless length $_;
}
- my $new_comment = join q{ }, @comments;
- push @addrs,
- $class->new($phrase, "$user\@$host", $new_comment, $original);
+ my $new_comment = join q{ }, grep defined, @comments;
+ push @addrs, $class->new($phrase, $address, $new_comment, $original);
$addrs[-1]->[_IN_CACHE] = [ \$line, $#addrs ]
}
@@ -441,7 +442,7 @@ sub _enquoted_phrase {
return $phrase if $phrase =~ /\A=\?.+\?=\z/;
$phrase =~ s/\A"(.+)"\z/$1/;
- $phrase =~ s/\"/\\"/g;
+ $phrase =~ s/([\\"])/\\$1/g;
return qq{"$phrase"};
}
View
9 t/quoting.t
@@ -2,7 +2,7 @@
use strict;
use Email::Address;
-use Test::More tests => 6;
+use Test::More tests => 8;
my $phrase = q{jack!work};
my $email = 'jack@work.com';
@@ -36,3 +36,10 @@ is(
);
is($ea3->phrase, $phrase, "the phrase method returns the right thing");
+
+{
+ my $mailbox = q{"jack \\"\\\\\\" robinson" <jack@work.com>};
+ my ($ea) = Email::Address->parse($mailbox);
+ is $ea->phrase, q{jack "\\" robinson};
+ is $ea->format, q{"jack \\"\\\\\\" robinson" <jack@work.com>};
+}
View
32 t/tests.t
@@ -1618,7 +1618,37 @@ my @list = (
undef,
],
],
- ]
+ ],
+ [
+ '"Newsletter from <superm-- ATAT --rket>" <newsletter-- ATAT --example.com>',
+ [
+ [
+ "Newsletter from <superm-- ATAT --rket>",
+ 'newsletter-- ATAT --example.com',
+ undef,
+ ],
+ ],
+ ],
+ [
+ '"Lawrence \\"Yogi\\" Berra" <yogi-- ATAT --example.com>',
+ [
+ [
+ 'Lawrence "Yogi" Berra',
+ 'yogi-- ATAT --example.com',
+ undef,
+ ],
+ ],
+ ],
+ [
+ '"Peter \Sales Department" <peter-- ATAT --example.com>',
+ [
+ [
+ "Peter Sales Department",
+ 'peter-- ATAT --example.com',
+ undef,
+ ],
+ ],
+ ],
);
my $tests = 1;
Something went wrong with that request. Please try again.