Fixing boundaries #5

Merged
merged 10 commits into from Feb 22, 2017

Projects

None yet

2 participants

@MartijnBraam

This fixes the regex that checks for boundaries (The end of line detection was wrong)
It also makes the regex strip the quotes if they exist and made the regex more strict (and no longer contains unbound searches)

This pullrequest also contains the chaines for #4

@djmattyg007

Thank you so much for putting together PHPUnit test cases. I'm incredibly thankful for your contribution. There's a few things that need updating, but they're mostly small things. All of the logic changes look fine to me.

PlancakeEmailParser.php
@@ -412,16 +412,14 @@ public function getBody($returnType = self::PLAINTEXT)
}
// there could be more than one boundary
- preg_match_all('!boundary=(.*?)[;$]!mi', $this->emailRawContent, $matches);
+ preg_match_all('/boundary=(?:|")([a-zA-Z0-9\(\)_\/+-]+)(?:|")(?:$|;)/mi', $this->emailRawContent, $matches);
@djmattyg007
djmattyg007 Feb 18, 2017 Owner

Could you please outline exactly what was wrong with the previous regex? This definitely looks more comprehensive than the previous one and more correct, but I'd appreciate it if you could outline some cases where the old regex didn't work but the new one does.

Also, can you please update the comment on line 414 to include the detail in the comment that was on line 419 but has been removed by your changes.

@MartijnBraam
MartijnBraam Feb 18, 2017

The [;$] part didn't seem to work on any of the test cases (at least in my php7 installation)

tests/PlancakeEmailParserTest.php
@@ -0,0 +1,104 @@
+<?php
+declare(strict_types = 1);
@djmattyg007
djmattyg007 Feb 18, 2017 Owner

As much as I love PHP 7 and its strict typing, technically this library still supports PHP5. I'm fine with dropping support for anything prior to PHP 5.6, but ideally I still need to be able to run the tests on PHP 5.6 for the foreseeable future. Could you therefore please remove this.

@MartijnBraam
MartijnBraam Feb 18, 2017

Hmm yes, I copied that straight from the phpunit docs without thinking.

tests/PlancakeEmailParserTest.php
+class PlancakeEmailParserTest extends TestCase {
+
+ public function testSubject() {
+ foreach (glob(__DIR__ . '/emails/*.txt') as $testFile) {
@djmattyg007
djmattyg007 Feb 18, 2017 Owner

Brace placement and indentation needs to follow the existing conventions of this codebase (which generally follow PSR-2).

tests/PlancakeEmailParserTest.php
+ foreach (glob(__DIR__ . '/emails/*.txt') as $testFile) {
+
+ $answerFile = str_replace('.txt', '.yml', $testFile);
+ $answers = \Symfony\Component\Yaml\Yaml::parse(
@djmattyg007
djmattyg007 Feb 18, 2017 Owner

Add a use statement for Symfony\Component\Yaml\Yaml at the top of the file.

tests/bootstrap.php
@@ -0,0 +1,2 @@
+<?php
+require_once(__DIR__ . '/../vendor/autoload.php');
@djmattyg007
djmattyg007 Feb 18, 2017 Owner

I prefer this style of writing paths:

require_once(dirname(__DIR__) . '/vendor/autoload.php');
+ All Polish diacritics included ;-) tag also included for additional tests.
+ It should look like in the picture attached.
+ Best regards
+ Pawel
@djmattyg007
djmattyg007 Feb 18, 2017 Owner

Can you please ensure that all of these files have a trailing newline.
See here for context: https://unix.stackexchange.com/questions/18743/whats-the-point-in-adding-a-new-line-to-the-end-of-a-file

@MartijnBraam
MartijnBraam Feb 18, 2017

I didn't give them newlines since that line is included in the multiline field, making the test fail.

Fixed it with another syntax in yaml.

@djmattyg007
Owner

Also, could you please close the other PR in favour of this one.

@MartijnBraam

I changed all the points in your review.

@djmattyg007 djmattyg007 merged commit 2c79883 into djmattyg007:master Feb 22, 2017
@djmattyg007
Owner

I've merged all of your changes. Thanks for your contribution! I'll be issuing a new release within the next 24 hours with your improvements.

@djmattyg007
Owner

I've just released v4.0.0 https://github.com/djmattyg007/official-library-php-email-parser/releases/tag/4.0.0

Two of the subject-parsing test cases were failing on my machine, as I have the IMAP PHP extension installed and enabled. It turned out that not using the IMAP extension produced incorrect results for any encoded non-ASCII text. As a result, I've made the decision to require the IMAP extension to ensure correct, consistent results across all environments. I apologise if this requires you to make changes to your runtime environments, but I figured it was best not to persist with a solution that could produce two different results in two different environments due to what amounts to a misconfiguration.

@MartijnBraam

I chosen this class because I couldn't install the imap extension. Wouldn't it be better to just fix the encoding issue?

@djmattyg007
Owner

The issue isn't with the input (which is perfecrly valid). The iconv_mime_decode function quite simply produces incorrect results for text with various diacritics, and my preference is that this library cannot produce incorrect results under any circumstances.

I've opened an issue at the Symfony Polyfill repository to see if anyone is aware of a proper polyfill: symfony/polyfill#83

A possible interim solution may be to allow forcing the undesirable behaviour by making the user set a variable on the object. What sort of override would suit you best?

@djmattyg007
Owner

It looks like an override is unnecessary. The mb_decode_mimeheader() function works perfectly as a replacement, so I'll be pushing a new version shortly that uses that instead.

@djmattyg007
Owner

I've just pushed a new version that removes the dependency on the IMAP PHP extension https://github.com/djmattyg007/official-library-php-email-parser/releases/tag/5.0.0

Given you started off by using PHP7 syntax in the PHPUnit test, I hope the fact that I've dropped support for PHP <= 5.5 won't be a problem.

I've also set up CI on Travis: https://travis-ci.org/djmattyg007/official-library-php-email-parser

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment