-
Notifications
You must be signed in to change notification settings - Fork 41
Defamiliarization #30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…s various signatures. All tests passing, no problem.
Codecov Report
@@ Coverage Diff @@
## master #30 +/- ##
=========================================
+ Coverage 96.96% 97.6% +0.63%
=========================================
Files 6 6
Lines 1023 1044 +21
Branches 123 120 -3
=========================================
+ Hits 992 1019 +27
+ Misses 17 13 -4
+ Partials 14 12 -2
Continue to review full report at Codecov.
|
glyph
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly, yes, but I am definitely concerned about compatibility issues.
hyperlink/test/test_url.py
Outdated
|
|
||
| u = URL.from_text('http://[::1]/a/b/?c=d') | ||
| self.assertEqual(u.family, socket.AF_INET6) | ||
| # def test_ip_family_detection(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hyperlink/test/test_url.py
Outdated
| assert url.host == '2001:0db8:85a3:0000:0000:8a2e:0370:7334' | ||
| assert url.port == 80 | ||
| assert url.family == socket.AF_INET6 | ||
| # assert url.family == socket.AF_INET6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙄
| port=_optional(port, self.port), | ||
| rooted=_optional(rooted, self.rooted), | ||
| userinfo=_optional(userinfo, self.userinfo), | ||
| family=_optional(family, self.family), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think we probably need a deprecation warning here. This is public, documented functionality that's disappearing, and an exception is a rude way to find out about that.
That said, this is a very easy deprecation to do, because the passed value can more or less just be ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have any examples handy? Just that family's not been part of the public API for very long (and indeed in the case of replace, hasn't been released at all, speaking of rude).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, good point on replace :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, further, upon remembering that replace never even had it, I suppose I could be convinced that this is an outlier sufficiently extreme in its low level of usefulness and short lifetime that it's not worth the additional effort to deprecate. (I really wish github had a code search which could help open source maintainers with this kind of thing though.)
hyperlink/_url.py
Outdated
| raise URLParseError('expected integer for port, not %r' | ||
| % port_str) | ||
| family, host = parse_host(host) | ||
| # _, host = parse_host(host) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hyperlink/_url.py
Outdated
| family, host = parse_host(host) | ||
| # _, host = parse_host(host) | ||
| if host: | ||
| host = host.lstrip('[').rstrip(']') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this just be represented as a modification to the RE? I would prefer to see it there than here, because here it seems like potentially invalid uses of [ might sneak in. (For example, a host that begins [[[[[). If the host group only contained the appropriate text it would be cleaner to read as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I can swing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it's definitely not a "just". The RE is pretty directly translated from the RFC, which does not break out userinfo, host, and port. It's all "authority".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh. There should be an online swear jar for anyone who ever says "just" in a code review :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After about an hour and a half, I've got an RE and code that mostly does this. It may be better overall, but when I push you'll see how many changes are necessary, especially to keep the error messages friendly (regexes are pretty all-or-nothing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can now witness the new _AUTHORITY_RE action, with improved "bad host" detection. Adding a few more tests before I'd consider it final. (Note that this doesn't use regex IPv6 validation, just the host syntax)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(tests added)
|
|
||
| def replace(self, scheme=_UNSET, host=_UNSET, path=_UNSET, query=_UNSET, | ||
| fragment=_UNSET, port=_UNSET, rooted=_UNSET, userinfo=_UNSET, | ||
| family=_UNSET, uses_netloc=_UNSET): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another interesting concern about compatibility - someone attempting to pass either family positionally will now instead, without any error, give that value to uses_netloc instead, which might not have the desired effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Positional arguments are kind of a vice of the API held over for consistency. The order's kinda funky, too. I honestly believe if someone's used >4 positional arguments in an invocation, they've set themselves up the bomb.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's the case, then we need to rewrite this constructor to take **kw to avoid self-bombing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(My kingdom a kwonly arg on py2.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, if it were me starting from scratch, I'd take everything after fragment and **kw it, I think. In this case I'm not sure how to rewrite it, except that it's probably for another PR :)
…reserving all the previous error messaging for common URL malformations (e.g., typos). this also reverts the primary URL regex to be much closer to the old RFC3986-based pattern
glyph
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good. Please address the minor coverage issue reported at https://codecov.io/gh/python-hyper/hyperlink/pull/30/diff#D2-945 (no test for the new "invalid authority" exception) and land.
OK, in a classic "this, but-not-this" sort of early review step, here is a sketch of URL without
family, per #26. I think this is probably pretty close to what we want for that issue, but there's also probably feedback, particularly from @glyph, so let's have it :)