should email.utils.parseaddr treat a@b. as invalid email ? #81673

Closed
jpic mannequin opened this issue Jul 3, 2019 · 6 comments
Labels
topic-email type-bug An unexpected behavior, bug, or error

Comments

@jpic
Mannequin

jpic mannequin commented Jul 3, 2019

BPO 37492
Nosy @warsaw, @ericvsmith, @bitdancer, @jpic

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2019-07-03.12:42:10.992>
created_at = <Date 2019-07-03.10:12:20.730>
labels = ['type-bug', 'invalid', 'expert-email']
title = 'should email.utils.parseaddr treat a@b. as invalid email ?'
updated_at = <Date 2019-07-13.22:08:56.303>
user = 'https://github.com/jpic'

bugs.python.org fields:

activity = <Date 2019-07-13.22:08:56.303>
actor = 'jpic'
assignee = 'none'
closed = True
closed_date = <Date 2019-07-03.12:42:10.992>
closer = 'jpic'
components = ['email']
creation = <Date 2019-07-03.10:12:20.730>
creator = 'jpic'
dependencies = []
files = []
hgrepos = []
issue_num = 37492
keywords = []
message_count = 6.0
messages = ['347207', '347219', '347221', '347225', '347856', '347858']
nosy_count = 4.0
nosy_names = ['barry', 'eric.smith', 'r.david.murray', 'jpic']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue37492'
versions = []

@jpic
Mannequin Author

jpic mannequin commented Jul 3, 2019

Following up on bpo-34155 [0] and PR #13079 [1], which changes:

>>> parseaddr('a@malicious@good')

From returning:

('', 'a@malicious')

To return:

('', '')

As such, parseaddr behaves more like documented:

email.utils.parseaddr(address)
Parse address, which should be the value of some address-containing field such as To or Cc, into its constituent realname and email address parts. Returns a tuple of that information, unless the parse fails, in which case a 2-tuple of ('', '') is returned.

The pull request discussion suggested that it would be good to open a new bpo to discuss changing the following behaviour:

    parseaddr('a@b.')

From returning:

('', 'a@b.')

To return a tuple of empty strings as well.
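For reference, the behaviour under discussion can be reproduced with a minimal sketch like the one below, assuming a Python 3 interpreter and the standard `email.utils` module. The variable names are illustrative only.

```python
from email.utils import parseaddr

# The trailing-dot domain is kept as-is in the address part.
realname, address = parseaddr('a@b.')
print(repr(realname), repr(address))  # '' 'a@b.'

# A display name does not change how the domain is handled.
print(parseaddr('Alice <a@b.>'))
```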

We have not found an RFC stating that a@b. is not a valid email address; however, RFC 1034 states that dots separate labels:

When a user needs to type a domain name, the length of each label is
omitted and the labels are separated by dots (".").

As such, my understanding is that a valid domain must not end with a dot.

[0] https://bugs.python.org/issue34155
[1] #13079

@jpic jpic mannequin added topic-email type-bug An unexpected behavior, bug, or error labels Jul 3, 2019
@ericvsmith
Member

RFC 1034 defines absolute domain names as ending with dot:

------------
When a user needs to type a domain name, the length of each label is omitted and the labels are separated by dots ("."). Since a complete domain name ends with the root label, this leads to a printed form which ends in a dot. We use this property to distinguish between:

  • a character string which represents a complete domain name
    (often called "absolute"). For example, "poneria.ISI.EDU."

  • a character string that represents the starting labels of a
    domain name which is incomplete, and should be completed by
    local software using knowledge of the local domain (often
    called "relative"). For example, "poneria" used in the
    ISI.EDU domain.
------------
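The distinction in the quoted passage can be sketched in a few lines of Python. This is only an illustration of the printed-form rule; the helper names `is_absolute` and `labels` are hypothetical and not part of any library.

```python
def is_absolute(name):
    """Return True if the printed form ends with the root label's dot."""
    return name.endswith('.')


def labels(name):
    """Split a printed domain name into its non-root labels."""
    # Drop the single trailing root dot, if present, then split on dots.
    stem = name[:-1] if name.endswith('.') else name
    return stem.split('.') if stem else []


for name in ('poneria.ISI.EDU.', 'poneria'):
    kind = 'absolute' if is_absolute(name) else 'relative'
    print(name, kind, labels(name))
```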

I'll admit that it isn't common to specify absolute domain names, and many resolvers treat a domain name with an internal dot, but no terminal dot, as an absolute name.

I doubt that in practice there are any email addresses whose domain is a bare TLD.

There's some bpo issue where this was discussed in reference to the ipaddress module. I think the issue was canonicalizing names, and it was decided not to add a trailing dot to make them absolute. I realize that logic doesn't directly apply here.

In spite of "com." being a valid domain name, I think it's reasonable to reject it as the domain part of an email address. But there should be a comment in the code as such.

@ericvsmith
Member

Counterpoint: I just sent an email to "info@info.", and Thunderbird and my MTA (postfix) and my mail relay all accepted it. I guess it's possible that a TLD (especially one of the newer ones) could accept email addresses in the TLD itself.

It turns out that "info@info." isn't a mailbox as of right now, but I think it's a valid and accepted address, at least by the software listed above. And it could be a valid mailbox, just isn't in this particular case.

Maybe the more conservative approach is to say that "info@info." (and "a@b.", etc.) should be considered valid email addresses.

If you were actually trying to send email to a mailbox in the "info" TLD, I think most resolvers would resolve "info" as a relative domain name, which isn't what we'd want to happen: you'd have to specify the domain as "info.".

@jpic
Mannequin Author

jpic mannequin commented Jul 3, 2019

Thanks a heap Eric, I feel a bit silly I missed it.

Closing the issue as not a bug, please feel free to reopen if necessary.

@jpic jpic mannequin closed this as completed Jul 3, 2019
@jpic jpic mannequin added the invalid label Jul 3, 2019
@bitdancer
Member

Right, those absolutely are valid addresses. A resolver will normally look up a name with an internal dot first as if it were an FQDN, but if it does so and does not get an answer it will then look it up again as a "local" address (appending in turn the strings from the 'search' directive in resolv.conf or equivalent) *if* it does not end in a final dot. If it does end in a final dot, no further lookup as local is done.
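The lookup order described above can be sketched roughly as follows. This is a simplified model, not how any particular resolver is implemented: `candidate_names` and its arguments are hypothetical, and the ordering for single-label names (search list first) is an assumption, since the comment only covers names with an internal dot.

```python
def candidate_names(name, search_domains):
    """Return the names a stub resolver might try, in order (simplified)."""
    if name.endswith('.'):
        # Absolute name: looked up as-is, the search list is never applied.
        return [name]
    with_search = [f"{name}.{suffix.strip('.')}." for suffix in search_domains]
    as_is = name + '.'
    if '.' in name:
        # Internal dot: try it as a fully qualified name first,
        # then fall back to the search list.
        return [as_is] + with_search
    # Assumption: single-label names go through the search list first.
    return with_search + [as_is]


print(candidate_names('info.', ['corp.test']))
print(candidate_names('mail.example', ['corp.test']))
print(candidate_names('info', ['corp.test']))
```

Under this model, only the trailing-dot form `info.` is guaranteed to be looked up without any search domain appended.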

While it isn't *normal* to send email to a TLD using a trailing dot, it is *legal*. In theory the address 'postmaster@com.' ought to be a valid email address (I doubt that it actually is, though). On the other hand, I will be very surprised if *all other* TLDs are without valid email addresses, especially the new ones. It is also easy to imagine an environment using email with private single label domain names using trailing dots specifically to suppress appending of search domains for sandboxing reasons. Thus the email library must support it as valid, both for RFC reasons and for practical reasons.

@jpic
Mannequin Author

jpic mannequin commented Jul 13, 2019

Thanks for the heads up.

There is still one last case where maybe parseaddr should return a tuple of
empty strings, currently:

>>> parseaddr('a@')
('', 'a@')

Is this worth changing ?
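Whatever `parseaddr` returns for this input, a caller that needs a non-empty domain can check for it explicitly. A minimal sketch, assuming only the standard `email.utils` module; the helper name `domain_of` is hypothetical:

```python
from email.utils import parseaddr


def domain_of(address):
    """Return the text after the last '@' of the parsed address, or ''."""
    _, addr_spec = parseaddr(address)
    # rpartition yields ('', '', addr_spec) when there is no '@' at all.
    _, sep, domain = addr_spec.rpartition('@')
    return domain if sep else ''


for candidate in ('a@b.', 'a@', 'a'):
    print(repr(candidate), '->', repr(domain_of(candidate)))
```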

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022