Properly parse FROM email headers with colon, comma characters #13
Comments
I believe this is correct. The name before angle brackets is produced by an
That said, if you knew for certain that you were parsing a single address (where a comma should not appear as a tokenization point), you could escape the comma with a unique string and then unescape it in the resulting output without too much difficulty.
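A minimal sketch of that escape-and-unescape idea, written as app-side preprocessing rather than anything the library provides. The placeholder string, the helper names, and the assumption that the parser returns either null or an object with a string `name` field are all illustrative; the parser itself is passed in as `parseFn`.

```javascript
// Hypothetical placeholder; assumed not to occur in real headers and made only
// of characters that are legal in an unquoted display-name word.
const COMMA_TOKEN = "__COMMA__";

// Replace commas before parsing a header known to hold a single address.
function escapeCommas(header) {
  return header.split(",").join(COMMA_TOKEN);
}

// Restore commas in a string field taken from the parsed result.
function unescapeCommas(value) {
  return value.split(COMMA_TOKEN).join(",");
}

// parseFn is whatever single-address parser the app uses; it is assumed to
// return null on failure or an object with a string `name` property.
function parseSingleWithCommas(header, parseFn) {
  const parsed = parseFn(escapeCommas(header));
  if (parsed == null || typeof parsed.name !== "string") {
    return parsed;
  }
  // Copy the result so the parser's own object is not mutated.
  return Object.assign({}, parsed, { name: unescapeCommas(parsed.name) });
}
```

This only covers commas; a colon in the display name would need the same treatment with its own placeholder, and any other field that echoes the raw input would also need unescaping.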
hlian is right, but I'd also say "or they have to be quoted". The "display name" portion of an email address can be made up of words, which can only contain commas if they are quoted strings. So it may be an assumption of the data (against the RFC) that since this is a single email address, it can drop the quotes. There are many imaginable hacks to get around that. For instance, if the email doesn't have a quote in it, but does have an angle bracket, start with a quote and replace '<' with '"<'. Unfortunate.
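The quoting hack described above could be sketched roughly as follows. `quoteDisplayName` is a hypothetical helper, not part of the library, and it varies the suggestion slightly by trimming the name and keeping a space before the angle bracket. It assumes the header holds a single address and that the display name contains no backslashes or quotes that would need escaping.

```javascript
// Hypothetical helper (not part of the library): wrap an unquoted display
// name in double quotes so commas and colons inside it are treated as text.
function quoteDisplayName(header) {
  const lt = header.lastIndexOf("<");
  // Leave the header untouched if it has no display name before the angle
  // bracket or already contains a quote; this sketch only handles the
  // simple case.
  if (lt <= 0 || header.includes('"')) {
    return header;
  }
  const name = header.slice(0, lt).trim();
  if (name === "") {
    return header;
  }
  return '"' + name + '" ' + header.slice(lt);
}

// Example input invented for illustration.
console.log(quoteDisplayName("Acme, Inc: Support <support@example.com>"));
// prints: "Acme, Inc: Support" <support@example.com>
```

The rewritten header can then be handed to the parser as usual; headers that are already quoted or that have no display name pass through unchanged.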
Thanks for the responses. An app using the library can definitely take some of the scrubbing/cleanup actions you're describing before parsing, but it'd be great if the lib had something like a There's a certain elegance to this library that I can pass reasonably sensible headers to it and it just works, but having to precondition all of the data makes that feel a little kludgy. This is all nit-picking, though... thanks for the responses! :-)
We've encountered a few email address headers that are not properly parsed, likely because they contain comma and colon characters. These return null when parsed via parseOneAddress.
Test code:
Here are some sample inputs and the null failure output:
Removing the colon and comma characters generates successful output:
Are these not parsing properly because they don't meet the RFC 5322 standard?
It'd be great to be able to support these and headers like them, particularly on calls to parseOneAddress, as delimiter checking shouldn't be necessary for a single address. Headers like these appear often enough in the email we see in practice that it may make sense to extend support.