Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discrepancy between IPv4 and IPv6 #180

Closed
Marsup opened this issue Aug 11, 2018 · 22 comments
Closed

Discrepancy between IPv4 and IPv6 #180

Marsup opened this issue Aug 11, 2018 · 22 comments
Assignees
Milestone

Comments

@Marsup
Copy link
Contributor

Marsup commented Aug 11, 2018

Hi!

While doing a change on joi to default on minDomainAtoms=2, I noticed a weird thing in the unit tests.

Apparently joe@[IPv6:2a00:1450:4001:c02::1b] is not valid with that option. So I thought it could make sense, or at least it's not a hugely common use case to have an IP as domain, but then I noticed joe@127.0.0.1 is valid under the same conditions. So are joe@127.0.0 and joe@127.0, since they're not a full IPv4 nor a valid domain, I don't think they should validate.

I'm not sure what to make of this. Would you consider it a bug ?

@skeggse skeggse self-assigned this Aug 11, 2018
@skeggse
Copy link
Owner

skeggse commented Aug 11, 2018

Oh, those are some cool edge-cases!

I think there are a few different things going on here.

The original DNS RFC specified that domains could use [0-9A-Za-z-], with more specific provisions requiring that the hyphens not fall at the ends of labels, and numbers be forbidden at the start of labels. Since then, domain names RFC 1123 relaxed the definition of a domain name to permit labels that start with a number. This introduces some ambiguity, so we include a warning when the top-level domain provided starts with a number: rfc5321TLDNumeric. Per #162/#4, it's not really easy to work with those warnings, though.

Anyway, the examples joe@127.0.0.1, joe@127.0.0 and joe@127.0 work because isemail is treating them as domain names and not IP addresses, and it correctly rejects joe@[127.0.0] and joe@[127.0], while providing the rfc5321AddressLiteral warning for joe@[127.0.0.1].

Ideally we wouldn't even have minDomainAtoms, but I understand the desire to weed out a category of false-positives. A "real" application using isemail would want to also run a DNS check to determine whether the server is actually reachable, and that would give them whatever assurance they need that wow@png isn't a valid domain.

We could still stand for a better API to handle warnings/errors, and parse addresses to make it easier to run DNS queries against the domain name. I guess it probably makes sense to skip the minDomainAtoms if the domain part is actually an address literal.

@Marsup
Copy link
Contributor Author

Marsup commented Aug 12, 2018

I agree that numbers can be part of domain names, but when it's only numbers and dots shouldn't it be considered an ip and validated as is? It doesn't need a DNS check for that.

@skeggse
Copy link
Owner

skeggse commented Aug 12, 2018

When it's only numbers and dots it won't (or shouldn't, at least) be considered an IP address. We don't really decide how the mail transfer agent handles the address in that case. If it's wrapped in square brackets, that's a different story, and most MTAs should deliver it as an IP address.

@Marsup
Copy link
Contributor Author

Marsup commented Aug 12, 2018

So you're saying that something like 127.0.0 could eventually be a valid domain and is left to the MTA to decide? I don't see how the ICANN would allow that with the current rules.

@skeggse
Copy link
Owner

skeggse commented Aug 12, 2018

We handle that case with the rfc5321TLDNumeric warning/error - there aren't any gTLDs that are numeric (currently, and in the foreseeable future) and this doesn't fail the default threshold because it's possible you might want to use isemail in a context with network-internal domain names.

You could set the default Joi errorLevel to 1, as I would consider all the RFC5321 warnings not-good-for-web-applications.

@skeggse
Copy link
Owner

skeggse commented Aug 12, 2018

And I'm pretty sure it's not actually up to the MTA to decide - I believe they're required to treat 127.0.0 and 127.0.0.1 as hostnames unless in the address literal syntax but I didn't know the appropriate RFC offhand.

@Marsup
Copy link
Contributor Author

Marsup commented Aug 12, 2018

If I set it to 1, something like "joe"@example.com or joe@123.4.5.6 becomes invalid. It is admittedly uncommon but still valid, right ? It's hard to find the sweet spot of "what people would expect" :)

@skeggse
Copy link
Owner

skeggse commented Aug 12, 2018

For sure. joe@123.4.5.6 is undeliverable and the domain is unregisterable as specified and will remain so for the foreseeable future, though. Does it not make sense to reject that address and similar ones for most webapp purposes?

@skeggse
Copy link
Owner

skeggse commented Aug 14, 2018

Correct me if I'm wrong, but here's your contention as I understand it. You expect joe@123.4.5.6 to be interpreted as the mailbox joe at the IP address 123.4.5.6.

The RFCs (in particular, I'm referencing RFC 5321) specify that the part after the @ be parsed either as a domain (one that requires the use of the Domain Name System to resolve), or as an address literal wrapped in square brackets. An address literal is either an IPv4 address or a tagged literal such as IPv6:::. This means that joe@123.4.5.6 must be resolved by checking the DNS, which under current ICANN methodology will be unresolvable.

On a related note, my interpretation of the rfc5321TLDNumeric code that Dominic Sayers used in the original tests we included is to handle this particular case - generally a domain that with a final atom that is numeric won't be addressable.

Apologies if this was already clear to you, but I wanted to make sure. If I've misunderstood your concern, could you help me understand?

@Marsup
Copy link
Contributor Author

Marsup commented Aug 14, 2018

Your assumption is correct. I expected it to work because I already emailed such an address, but maybe the client interpreted it with brackets ?

My expectations are that any email that validates should be emailable if it exists, is there any setting that would fit that description ? Is there a way to deny IPs (with or without brackets, nobody uses those) ?

Indeed I don't think a TLD can only have numbers in its extension part, so I'm surprised an IP would be considered as a valid domain here.

@skeggse
Copy link
Owner

skeggse commented Aug 18, 2018

I expected it to work because I already emailed such an address, but maybe the client interpreted it with brackets ?

This is super surprising to me. I'd be interested to see the relevant subset of the mail envelope headers.

My expectations are that any email that validates should be emailable if it exists, is there any setting that would fit that description ?

Well, that's what it currently does. The crux of the issue seems to be the "if it exists" part, because joe@123.4.5.6 would be emailable if it existed, but it doesn't (right?).

Is there a way to deny IPs (with or without brackets, nobody uses those) ?

With brackets it would seem that minDomainAtoms already covers that, and the rfc5321AddressLiteral warning would also cover it. Without brackets, there's not much we can do because IPv4 addresses are also valid domain names, even if policy-wise ICANN forbids them (for good reason slash because they're frequently used in the same context as a domain name). rfc5321TLDNumeric would be the best candidate for screening those out.

Indeed I don't think a TLD can only have numbers in its extension part, so I'm surprised an IP would be considered as a valid domain here.

Syntactically, it can. Policy-wise, ICANN forbids it. As a result, I haven't directly implemented support for specifically forbidding numeric TLDs, although the rfc5321TLDNumeric error code does explicitly flag that. Making that usable independent of other error levels falls under #4. Moreover, this kind of business logic would be better implemented when we have a parse function than with the existing validate function.

I know this isn't the answer you're looking for, but we need a better isemail API because the error-level-based version doesn't provide tight enough control for your use-case.

@Marsup
Copy link
Contributor Author

Marsup commented Aug 29, 2018

The crux of the issue seems to be the "if it exists" part, because joe@123.4.5.6 would be emailable if it existed, but it doesn't (right?).

I'm not asking for a DNS or reachability check, just for a purely theoretical emailability.

--

Let's get back to the original issue for a moment. The default (no option) isn't really satisfactory for joi because it allows stuff like root@localhost, which is valid but is not what people expect by default. minDomainAtoms: 2 is what we advised everyone to do for the past few years but it seems a bit flawed as well. But would you say it's the lesser evil for now? It doesn't seem perfect but it's the closest check we have for a valid email (in the common sense people think about email addresses).

So far I was going to go for isemail(string, { minDomainAtoms: 2 }) === true in the default case, would you have some better suggestion? Maybe playing with the errorLevel?

@skeggse
Copy link
Owner

skeggse commented Sep 3, 2018

minDomainAtoms: 2 as a default should give you a better default for most joi users. I'll look into a stop-gap for a granular error exclusion parameter.

@skeggse
Copy link
Owner

skeggse commented Oct 7, 2018

I just issued pull requests #187 and #188. Please let me know if you think those will cover your concerns.

@Marsup
Copy link
Contributor Author

Marsup commented Oct 10, 2018

Just to be sure, suitable default options would be { errorLevel: Isemail.diagnoses.rfc5321TLD, minDomainAtoms: 2 } ?

@skeggse
Copy link
Owner

skeggse commented Oct 11, 2018

Assuming you want to allow "marsup"@example.com (which is identical to marsup@example.com), and marsup@[0.0.0.0], I think the suitable options for joi would be:

{
  errorLevel: Isemail.diagnoses.rfc5321TLD,
  excludeDiagnoses: [
    Isemail.diagnoses.rfc5321QuotedString, // for "marsup"@example.com
    Isemail.diagnoses.rfc5321AddressLiteral, // for marsup@[0.0.0.0]
  ],
  minDomainAtoms: 2
}

@Marsup
Copy link
Contributor Author

Marsup commented Oct 11, 2018

I don't think quotes and IPs are common cases, which is why I didn't exclude anything. Was it already possible before?

@skeggse
Copy link
Owner

skeggse commented Oct 11, 2018

I was going based on this comment. Personally I think all three forms should be accepted by email address validators.

@Marsup
Copy link
Contributor Author

Marsup commented Oct 11, 2018

Sure, I was thinking about joi's defaults, which have somewhat different expectations than a general purpose email validator.

@skeggse
Copy link
Owner

skeggse commented Oct 11, 2018

I'm not sure I follow. I think joi should allow these email address forms by default, and also think it's reasonable to forbid a@4.5. The config I recommended would accomplish this, and with the two referenced PRs would resolve the original issue here. If this doesn't match your requirements for joi, can you reiterate?

@skeggse
Copy link
Owner

skeggse commented Oct 18, 2018

I released #187 and #188 in 3.2.0. I'm sorry I never managed to understand your contention perfectly, but I think some combination of the minDomainAtoms exclusion and the excludeDiagnoses parameter will solve your problem. If not, please re-open this issue, or open a new one. Cheers!

@Marsup
Copy link
Contributor Author

Marsup commented Nov 11, 2018

No problem Eli, I'm lacking time to explain and work on it as well. Some day I'll have time to create sane defaults but for now people will have to figure it out by themselves :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants