EmailReplyParser


EmailReplyParser is a PHP library for parsing plain text email content, based on GitHub's email_reply_parser library written in Ruby.

Installation

The recommended way to install EmailReplyParser is through Composer:

composer require willdurand/email-reply-parser

Usage

Instantiate an EmailParser object and parse your email:

<?php

use EmailReplyParser\Parser\EmailParser;

$email = (new EmailParser())->parse($emailContent);

You get an Email object that contains a set of Fragment objects. The Email class exposes two methods:

  • getFragments(): returns all fragments;
  • getVisibleText(): returns a string which represents the content considered as "visible".

The Fragment represents a part of the full email content, and has the following API:

<?php

$fragment = current($email->getFragments());

$fragment->getContent();

$fragment->isSignature();

$fragment->isQuoted();

$fragment->isHidden();

$fragment->isEmpty();
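
The flags above can be combined to filter fragments by hand. The following is a minimal sketch, not part of the library: `collectReplyFragments` is a hypothetical helper name, and it only assumes that each item exposes the `getContent()`, `isQuoted()`, `isSignature()` and `isEmpty()` methods listed above.

```php
<?php

/**
 * Hypothetical helper (not provided by EmailReplyParser): keep the content of
 * every fragment that is neither quoted, nor a signature, nor empty.
 *
 * @param iterable $fragments objects exposing the Fragment methods shown above
 * @return string[]
 */
function collectReplyFragments(iterable $fragments): array
{
    $kept = [];

    foreach ($fragments as $fragment) {
        // Skip anything the parser flagged as quoted text, a signature, or blank.
        if ($fragment->isQuoted() || $fragment->isSignature() || $fragment->isEmpty()) {
            continue;
        }

        $kept[] = $fragment->getContent();
    }

    return $kept;
}

// Usage, assuming $email was returned by (new EmailParser())->parse($emailContent):
// $replyParts = collectReplyFragments($email->getFragments());
```

In most cases `getVisibleText()` is likely all you need; a loop like this is only useful when you want to apply your own rules on top of the parser's flags.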

Alternatively, you can use the static methods of the EmailReplyParser class to either parse an email or get its visible content in a single line of code:

$email = \EmailReplyParser\EmailReplyParser::read($emailContent);

$visibleText = \EmailReplyParser\EmailReplyParser::parseReply($emailContent);

Known Issues

Quoted Headers

Quoted headers aren't picked up if there's an extra line break:

On <date>, <author> wrote:

> blah

They are also not picked up if the email client breaks the header across multiple lines. Gmail, for example, wraps any line longer than 80 characters.

On <date>, <author>
wrote:
> blah

The "On ... wrote:" header above can be stripped with the following regex:

$fragment_without_date_author = preg_replace(
    '/\nOn(.*?)wrote:(.*?)$/si',
    '',
    $fragment->getContent()
);

Note, though, that we're searching for "On" and "wrote". Therefore, it won't work with other languages.
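
To make the behaviour of the pattern above concrete, here is a small standalone sketch that applies it to a plain string instead of a Fragment. The sample text is made up for illustration and reuses the `<date>` and `<author>` placeholders from the examples above.

```php
<?php

// Illustrative input: a reply followed by a quoted header wrapped onto two lines.
$content = "Thanks!\n\nOn <date>, <author>\nwrote:\n> blah";

// Same pattern as above. The "s" flag lets "." match line breaks, and because
// the "m" flag is absent, "$" anchors at the end of the whole string. The match
// therefore runs from the first line starting with "On" to the end of the text.
$cleaned = preg_replace('/\nOn(.*?)wrote:(.*?)$/si', '', $content);

var_dump($cleaned);
```

Because of the `i` flag, any line that starts with "on" in any letter case and is followed somewhere later by "wrote:" also triggers the match, so the pattern can remove more than intended on ordinary prose.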

Possible solution: Remove "reply@reply.github.com" lines...

Weird Signatures

Lines starting with - or _ sometimes mark the beginning of signatures:

Hello

--
Rick

Not everyone follows this convention:

Hello

Mr Rick Olson
Galactic President Superstar Mc Awesomeville
GitHub

**********************DISCLAIMER***********************************
* Note: blah blah blah                                            *
**********************DISCLAIMER***********************************

Strange Quoting

Apparently, prefixing lines with > isn't universal either:

Hello

--
Rick

________________________________________
From: Bob [reply@reply.github.com]
Sent: Monday, March 14, 2011 6:16 PM
To: Rick
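
One possible workaround for this particular layout is to cut the text at the line of underscores before handing it to the parser. This is only a sketch under stated assumptions: `stripUnderscoreQuote` is a hypothetical helper, not part of the library, and it assumes the quoted block always starts with a line of at least five underscores directly followed by an English "From:" line.

```php
<?php

/**
 * Hypothetical helper (not provided by EmailReplyParser): drop everything from
 * a line of underscores that is immediately followed by a "From:" line.
 */
function stripUnderscoreQuote(string $text): string
{
    // "m" makes "^" match at the start of each line; "\R" matches any line break.
    $parts = preg_split('/^_{5,}[ \t]*\R(?=From:)/m', $text, 2);

    // preg_split() returns false on a regex error; fall back to the input then.
    if ($parts === false) {
        return $text;
    }

    return rtrim($parts[0]);
}

$body = "Hello\n\n--\nRick\n\n"
    . "________________________________________\n"
    . "From: Bob [reply@reply.github.com]\n"
    . "Sent: Monday, March 14, 2011 6:16 PM\n"
    . "To: Rick";

var_dump(stripUnderscoreQuote($body));
```

Like the "On ... wrote:" regex, this relies on an English keyword and will not help with clients that use a different separator.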

Unit Tests

Set up the test suite using Composer:

$ composer install

Run it using PHPUnit:

$ phpunit

Contributing

See the CONTRIBUTING file.

Credits

License

EmailReplyParser is released under the MIT License. See the bundled LICENSE file for details.
