RFC 5322 compliant parser for email addresses

Email Address Parsing

This library provides a fully RFC 5322 compliant parser for email addresses. It also provides a few small helper functions that format email addresses in a way that ensures they are RFC 5322 compliant.

This library has been tested under

  • SBCL
  • Clozure Common Lisp
  • LispWorks
  • ABCL

Package DARTS.LIB.EMAIL-ADDRESS

  • Variable: *allow-unicode*

    If true, the parser functions accept arbitrary characters (with char-code > 127) in addition to what they accept otherwise. This affects the productions of ctext, atext, qtext, and dtext. In other words: something like

    Däsiree Äßeldahl <d.äßeldahl@secret-äskulap.com>

    becomes a valid email address. All base parser functions take an :allow-unicode keyword argument, whose default value is the value of this variable.
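
    The two ways of enabling this behaviour can be sketched as follows. This is a minimal example under assumptions: ASDF can locate the system named after the .asd file, the documented symbols are exported from DARTS.LIB.EMAIL-ADDRESS, and the source file is read as UTF-8. The variable *sample* is made up for illustration.

    ```lisp
    (require "asdf")
    (asdf:load-system "darts.lib.email-address")

    ;; Hypothetical sample input containing non-ASCII characters.
    (defparameter *sample*
      "Däsiree Äßeldahl <d.äßeldahl@secret-äskulap.com>")

    ;; Variant 1: enable the behaviour for a single call.
    (multiple-value-bind (local-part domain display-name error)
        (darts.lib.email-address:parse-rfc5322-mailbox *sample* :allow-unicode t)
      (format t "~S ~S ~S ~S~%" local-part domain display-name error))

    ;; Variant 2: rebind the special variable, which supplies the default
    ;; for parser calls made within the dynamic extent of the LET.
    (let ((darts.lib.email-address:*allow-unicode* t))
      (multiple-value-bind (local-part domain display-name error)
          (darts.lib.email-address:parse-rfc5322-mailbox *sample*)
        (format t "~S ~S ~S ~S~%" local-part domain display-name error)))
    ```

    Rebinding the variable is convenient when several parser calls should share the setting; the keyword argument keeps the choice local to one call.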

  • Variable: *allow-obsolete-syntax*

    If true, enables support for a few of the obs- productions in the RFC. This is disabled by default. Right now, enabling this option makes

    R. L. Stephenson <r.l.stephenson@literature-and-coffee.cookies>

    a well-formed mailbox spec. Without this option enabled, the address must be written as (e.g.)

    "R. L. Stephenson" <r.l.stephenson@literature-and-coffee.cookies>

  • Function: parse-rfc5322-addr-spec string &key start end allow-unicode allow-trailing-junk → local-part domain error position

    Parse string (or a subsequence of it) as an RFC 5322 addr-spec. If allow-unicode is true, characters outside of the ASCII range (i.e., with codes > 127) are allowed virtually anywhere. See *ALLOW-UNICODE* for details; its value is also the default for this argument.

    The values of start and end are bounding index designators for the part of string to work on.

    If allow-trailing-junk is false (the default), the parser makes sure that no unprocessed characters remain in the designated input region of string after a full address has been parsed successfully. If the value is true, this function does not check for unprocessed characters; the caller may inspect the returned position value to determine whether the string was processed fully or whether unprocessed characters remain.

    This function returns four values:

    • local-part is the value of the address' local part. If parsing fails early enough, this value is nil.

    • domain is the value of the address' domain part. If parsing fails before the domain is encountered, this value is nil.

    • error is nil if the string could be parsed successfully. Otherwise, it is a keyword symbol that indicates why the parser stopped.

    • position is an integer that identifies the first character in string that has not been processed by this function.
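
    A caller would typically receive the four values with multiple-value-bind and check error first. The following is a minimal sketch; split-addr-spec is a hypothetical helper, not part of the library, and the system is assumed to be loadable through ASDF.

    ```lisp
    (require "asdf")
    (asdf:load-system "darts.lib.email-address")

    ;; Hypothetical helper: return the local part and the domain of STRING
    ;; as two values, or signal an error describing where parsing stopped.
    (defun split-addr-spec (string)
      (multiple-value-bind (local-part domain error position)
          (darts.lib.email-address:parse-rfc5322-addr-spec string)
        (if error
            (error "Cannot parse ~S as addr-spec: ~S at position ~D"
                   string error position)
            (values local-part domain))))

    (multiple-value-bind (local-part domain)
        (split-addr-spec "jane.doe@example.com")
      (format t "local part: ~S, domain: ~S~%" local-part domain))
    ```

    Turning the error keyword into a condition is only one possible design; code that expects malformed input frequently may prefer to return the keyword to its own caller.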

  • Function: parse-rfc5322-mailbox string &key start end allow-unicode allow-obsolete-syntax allow-trailing-junk → local-part domain display-name error position

    Parse string (or a subsequence of it) as an RFC 5322 mailbox. If allow-unicode is true, characters outside of the ASCII range (i.e., with codes > 127) are allowed virtually anywhere. See *ALLOW-UNICODE* for details; its value is also the default for this argument.

    If allow-obsolete-syntax is false (the default), this function is very strict with respect to the accepted input. In particular, none of the obs- productions is recognized in any of the address components. Supplying a true value for this argument makes the parser more lenient: it then accepts input that has historically been accepted as a well-formed address. See *ALLOW-OBSOLETE-SYNTAX* for details.

    The values of start and end are bounding index designators for the part of string to work on.

    If allow-trailing-junk is false (the default), the parser makes sure that no unprocessed characters remain in the designated input region of string after a full address has been parsed successfully. If the value is true, this function does not check for unprocessed characters; the caller may inspect the returned position value to determine whether the string was processed fully or whether unprocessed characters remain.

    This function returns five values:

    • local-part is the value of the address' local part. If parsing fails early enough, this value is nil.

    • domain is the value of the address' domain part. If parsing fails before the domain is encountered, this value is nil.

    • display-name is the display name found, or nil if the address did not contain a display name part.

    • error is nil if the string could be parsed successfully. Otherwise, it is a keyword symbol that indicates why the parser stopped.

    • position is an integer that identifies the first character in string that has not been processed by this function.
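
    The effect of allow-obsolete-syntax and of the start bound can be illustrated with a short sketch. describe-mailbox is a hypothetical helper introduced here for printing; the system is assumed to be loadable through ASDF. No output is shown, because the exact error keywords and position values are not specified above.

    ```lisp
    (require "asdf")
    (asdf:load-system "darts.lib.email-address")

    ;; Hypothetical helper: parse STRING as a mailbox, print what was found,
    ;; and return the error value (NIL on success).
    (defun describe-mailbox (string &rest options)
      (multiple-value-bind (local-part domain display-name error position)
          (apply #'darts.lib.email-address:parse-rfc5322-mailbox string options)
        (if error
            (format t "~S: error ~S at position ~D~%" string error position)
            (format t "~S: ~A@~A, display name ~S~%"
                    string local-part domain display-name))
        error))

    (defparameter *unquoted*
      "R. L. Stephenson <r.l.stephenson@literature-and-coffee.cookies>")

    ;; Strict parsing is described above as rejecting the unquoted display name.
    (describe-mailbox *unquoted* :allow-obsolete-syntax nil)

    ;; The lenient mode is described as accepting it.
    (describe-mailbox *unquoted* :allow-obsolete-syntax t)

    ;; START restricts parsing to the part after the "From: " prefix (index 6).
    (describe-mailbox "From: jane@example.com" :start 6)
    ```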

  • Function: parse-rfc5322-mailbox-list string &key start end allow-unicode allow-obsolete-syntax → list error position

    Parse string (or a subsequence of it) as a comma-separated list of RFC 5322 mailbox specifications. If allow-unicode is true, characters outside of the ASCII range (i.e., with codes > 127) are allowed virtually anywhere. See *ALLOW-UNICODE* for details; its value is also the default for this argument.

    If allow-obsolete-syntax is false (the default), this function is very strict with respect to the accepted input. In particular, none of the obs- productions is recognized in any of the address components. Supplying a true value for this argument makes the parser more lenient: it then accepts input that has historically been accepted as a well-formed address. See *ALLOW-OBSOLETE-SYNTAX* for details.

    The values of start and end are bounding index designators for the part of string to work on.

    If allow-trailing-junk is false (the default), the parser makes sure that no unprocessed characters remain in the designated input region of string after a full address has been parsed successfully. If the value is true, this function does not check for unprocessed characters; the caller may inspect the returned position value to determine whether the string was processed fully or whether unprocessed characters remain.

    This function returns three values:

    • list is a list of sublists of the form (local-part domain display-name), one sublist for each successfully parsed mailbox specification in the input string. The elements appear in the order in which they are found in the input.

    • error is nil if the string could be parsed successfully. Otherwise, it is a keyword symbol that indicates why the parser stopped.

    • position is an integer that identifies the first character in string that has not been processed by this function.
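
    Because each element of list has the shape (local-part domain display-name), the result can be destructured directly in a loop. A minimal sketch, assuming the system is loadable through ASDF; *recipients* is a made-up sample value:

    ```lisp
    (require "asdf")
    (asdf:load-system "darts.lib.email-address")

    ;; Hypothetical sample input with two mailboxes.
    (defparameter *recipients*
      "Jane Doe <jane@example.com>, john@example.org")

    (multiple-value-bind (entries error position)
        (darts.lib.email-address:parse-rfc5322-mailbox-list *recipients*)
      ;; Each entry is a sublist (local-part domain display-name).
      (loop for (local-part domain display-name) in entries
            do (format t "~A@~A~@[ (~A)~]~%" local-part domain display-name))
      ;; ERROR is NIL on success; otherwise report where parsing stopped.
      (when error
        (format t "Parsing stopped at position ~D: ~S~%" position error)))
    ```

    The list is described as containing the successfully parsed mailboxes, so it is worth checking error even when list is non-empty.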

  • Function: escape-local-part string &key start end → result

    Ensures that string is properly escaped for use as the local part of an email address. If necessary, this function adds quotes and backslashes. Note that non-ASCII characters (codes > 127) are not special-cased by this function, i.e., they are implicitly allowed.

    The values of start and end are bounding index designators for the part of string to work on.

  • Function: escape-display-name string &key start end → result

    Ensures that string is properly escaped for use as the display name of a mailbox. If necessary, this function adds quotes and backslashes. Note that non-ASCII characters (codes > 127) are not special-cased by this function, i.e., they are implicitly allowed.

    The values of start and end are bounding index designators for the part of string to work on.
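
    The two escape functions can be combined to assemble a mailbox string by hand. The sketch below assumes that both functions return strings and that the system is loadable through ASDF; format-mailbox is a hypothetical helper, and the domain is inserted as given because no escape function for domains is described here.

    ```lisp
    (require "asdf")
    (asdf:load-system "darts.lib.email-address")

    ;; Hypothetical helper: build "display-name <local-part@domain>" from
    ;; raw components, escaping the display name and the local part.
    (defun format-mailbox (display-name local-part domain)
      (format nil "~A <~A@~A>"
              (darts.lib.email-address:escape-display-name display-name)
              (darts.lib.email-address:escape-local-part local-part)
              domain))

    ;; The comma and the space would be problematic if left unescaped.
    (defparameter *formatted*
      (format-mailbox "Doe, Jane" "jane doe" "example.com"))

    (format t "~A~%" *formatted*)

    ;; Feed the result back into the parser; the error value is printed last.
    (format t "~S~%"
            (multiple-value-list
             (darts.lib.email-address:parse-rfc5322-mailbox *formatted*)))
    ```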

  • Structure: address

    Instances of this structure represent email addresses. Basically, an address is a pair of "local part" and "domain". After construction, address instances are immutable.

    This library defines a total ordering over all addresses, which is derived from the lexicographic orderings of the components. When comparing for order (i.e., using address<, address<=, address>= or address>), the domain part is always compared first. If that comparison is a tie (i.e., if both address instances have equal domains), the local parts are compared.

    Regardless of whether the comparison is for order or for equality, the domain parts are always compared disregarding the letter case, and the local parts are always compared case-sensitively.
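
    These rules can be observed with a few sample addresses. A minimal sketch, assuming the system is loadable through ASDF; it relies on the address, address<, address= and address-string functions documented in this package, and the sample addresses are made up.

    ```lisp
    (require "asdf")
    (asdf:load-system "darts.lib.email-address")

    ;; Sort a fresh list of address instances using the library's ordering.
    (defparameter *sorted*
      (sort (mapcar #'darts.lib.email-address:address
                    (list "zoe@alpha.example"
                          "adam@beta.example"
                          "adam@alpha.example"))
            #'darts.lib.email-address:address<))

    ;; Domains are compared first, so both alpha.example entries precede
    ;; the beta.example entry; the tie is broken by the local part.
    (dolist (a *sorted*)
      (format t "~A~%" (darts.lib.email-address:address-string a)))

    ;; Domain comparison ignores letter case ...
    (format t "~S~%"
            (darts.lib.email-address:address=
             (darts.lib.email-address:address "jane@EXAMPLE.com")
             (darts.lib.email-address:address "jane@example.com")))

    ;; ... while local parts are compared case-sensitively.
    (format t "~S~%"
            (darts.lib.email-address:address=
             (darts.lib.email-address:address "Jane@example.com")
             (darts.lib.email-address:address "jane@example.com")))
    ```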

  • Function: address object → address

    Tries to coerce its argument object into an instance of structure class address, according to the following rules:

    • if object is already an instance of address, it is returned directly

    • if object is a string, it is parsed according to the RFC mailbox production, and the results are used to construct a new address. If a display name part is present in object, it will be ignored.

    If this function cannot convert its argument into an address, it signals an error of type type-error.
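
    Since failure is reported as a type-error, callers that deal with untrusted input may want to wrap the coercion. The following sketch assumes the system is loadable through ASDF; try-address is a hypothetical helper, not part of the library.

    ```lisp
    (require "asdf")
    (asdf:load-system "darts.lib.email-address")

    ;; Hypothetical helper: coerce OBJECT into an address, or return NIL
    ;; instead of signalling when the coercion is not possible.
    (defun try-address (object)
      (handler-case (darts.lib.email-address:address object)
        (type-error () nil)))

    ;; A plain addr-spec and a full mailbox string; the display name of
    ;; the latter is ignored according to the rules above.
    (format t "~S~%" (try-address "jane@example.com"))
    (format t "~S~%" (try-address "Jane Doe <jane@example.com>"))

    ;; Neither an address nor a string, so the coercion fails.
    (format t "~S~%" (try-address 42))
    ```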

  • Function: address-local-part object → string

    Answers the string that is the local part of the email address object.

  • Function: address-domain object → string

    Answers the string that is the domain part of the email address object.

  • Function: address-string object → string

    Answers the fully escaped string representation of the email address object. The string returned by this function may be parsed back into an address instance (e.g., by calling the address function), and the resulting address instance should be equivalent to object under address=.
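
    The round trip described here can be written down as a small check. A minimal sketch, assuming the system is loadable through ASDF; the variable names are made up for illustration.

    ```lisp
    (require "asdf")
    (asdf:load-system "darts.lib.email-address")

    ;; Start from a mailbox string; the display name is dropped by ADDRESS.
    (defparameter *original*
      (darts.lib.email-address:address "Jane Doe <jane@example.com>"))

    ;; Render the address and parse the rendered form back.
    (defparameter *text*
      (darts.lib.email-address:address-string *original*))

    (defparameter *copy*
      (darts.lib.email-address:address *text*))

    (format t "~A~%" *text*)

    ;; The copy should be equivalent to the original under ADDRESS=.
    (format t "~S~%" (darts.lib.email-address:address= *original* *copy*))
    ```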

  • Function: address-hash object → result

    Answers a hash code for the address instance object.

  • Function: address= address1 address2 → result

    Compares the addresses address1 and address2, and answers true if both represent the same email address, and false otherwise.

  • Function: address/= address1 address2 → result

    Compares the addresses address1 and address2, and answers true if both represent different email addresses, and false otherwise.

  • Function: address< address1 address2 → result

    Compares the addresses address1 and address2, and answers true if address1 is strictly less than address2. See the description of structure class address for details about address ordering.

  • Function: address<= address1 address2 → result

    Compares the addresses address1 and address2, and answers true if address1 is less than or equal to address2. See the description of structure class address for details about address ordering.

  • Function: address>= address1 address2 → result

    Compares the addresses address1 and address2, and answers true if address1 is greater than or equal to address2. See the description of structure class address for details about address ordering.

  • Function: address> address1 address2 → result

    Compares the addresses address1 and address2, and answers true if address1 is strictly greater than address2. See the description of structure class address for details about address ordering.

  • Class: mailbox

    A mailbox is basically an address combined with a display name. This class itself does not actually provide anything interesting. It merely exists for the purpose of type discrimination.

  • Class: basic-mailbox

    This is a concrete implementation of the mailbox protocol. Instances have two slots, mailbox-address and mailbox-display-name.

  • Function: mailbox object → mailbox

    Tries to coerce its argument object into an instance of class mailbox, according to the following rules:

    • if object is already an instance of mailbox, it is returned directly

    • if object is an address, a new basic-mailbox is created, whose address part is object, and whose display name is nil.

    • if object is a string, it is parsed according to the RFC mailbox production, and the results are used to construct a new basic-mailbox.

    If this function cannot convert its argument into a mailbox, it signals an error of type type-error.
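
    The three coercion rules can be tried out as follows. This is a minimal sketch, assuming the system is loadable through ASDF; it also uses the mailbox-address and mailbox-display-name generic functions documented in this package, and the variable names are made up.

    ```lisp
    (require "asdf")
    (asdf:load-system "darts.lib.email-address")

    ;; Rule 3: a string is parsed as an RFC mailbox.
    (defparameter *from-string*
      (darts.lib.email-address:mailbox "Jane Doe <jane@example.com>"))

    ;; Rule 2: an address is wrapped; the display name is NIL.
    (defparameter *from-address*
      (darts.lib.email-address:mailbox
       (darts.lib.email-address:address "jane@example.com")))

    (format t "~S~%" (darts.lib.email-address:mailbox-display-name *from-string*))
    (format t "~S~%" (darts.lib.email-address:mailbox-display-name *from-address*))

    ;; Rule 1: an existing mailbox is returned directly.
    (format t "~S~%"
            (eq *from-string* (darts.lib.email-address:mailbox *from-string*)))

    ;; Both mailboxes refer to the same address.
    (format t "~S~%"
            (darts.lib.email-address:address=
             (darts.lib.email-address:mailbox-address *from-string*)
             (darts.lib.email-address:mailbox-address *from-address*)))
    ```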

  • Generic Function: mailboxp object → result

    Tests whether object fulfills the mailbox protocol. This condition is always true by definition for subclasses of class mailbox. It may additionally be true for other objects.

  • Generic Function: mailbox-address object → address

    Answers the address instance that describes the actual email address associated with the mailbox object. This function is part of the core mailbox protocol and must be implemented by all objects that want to participate in that protocol.

  • Generic Function: mailbox-display-name object → result

    Answers the display name associated with the given mailbox instance object. This function is part of the core mailbox protocol and must be implemented by all objects that want to participate in that protocol.

  • Generic Function: mailbox-local-part object → result

    Answers the local part string of this mailbox's address. The default method simply extracts the address-local-part from the object returned by mailbox-address when applied to the given object.

  • Generic Function: mailbox-domain object → result

    Answers the domain string of this mailbox's address. The default method simply extracts the address-domain from the object returned by mailbox-address when applied to the given object.

  • Generic Function: mailbox-string object → result

    Constructs a string representation of the given mailbox instance. The result is required to be a well-formed RFC 5322 email address parsable using the mailbox production. The default method should be usable by almost all concrete mailbox implementations.
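
    An application class can take part in the mailbox protocol by implementing the two core generic functions. The sketch below is an assumption-laden illustration: the contact class, its slots and its readers are hypothetical, it assumes that mailbox can be subclassed with defclass, and it relies on the default methods described above for the remaining functions.

    ```lisp
    (require "asdf")
    (asdf:load-system "darts.lib.email-address")

    ;; Hypothetical application class; inheriting from MAILBOX makes
    ;; MAILBOXP true by definition.
    (defclass contact (darts.lib.email-address:mailbox)
      ((name  :initarg :name  :reader contact-name)
       (email :initarg :email :reader contact-email)))

    ;; Core protocol, part 1: the address of the mailbox.
    (defmethod darts.lib.email-address:mailbox-address ((object contact))
      (darts.lib.email-address:address (contact-email object)))

    ;; Core protocol, part 2: the display name of the mailbox.
    (defmethod darts.lib.email-address:mailbox-display-name ((object contact))
      (contact-name object))

    (defparameter *contact*
      (make-instance 'contact :name "Jane Doe" :email "jane@example.com"))

    ;; These calls rely on the default methods provided by the library.
    (format t "~S~%" (darts.lib.email-address:mailboxp *contact*))
    (format t "~S~%" (darts.lib.email-address:mailbox-local-part *contact*))
    (format t "~S~%" (darts.lib.email-address:mailbox-domain *contact*))
    (format t "~A~%" (darts.lib.email-address:mailbox-string *contact*))
    ```

    Parsing the email string on every call to mailbox-address keeps the sketch short; a real class would more likely store an address instance in the slot.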