-
Notifications
You must be signed in to change notification settings - Fork 741
feat: introduce URI data type for HTTP
#12128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: sofia/async-http-headers
Are you sure you want to change the base?
Conversation
4e9af59 to
ad7fa87
Compare
|
Mathlib CI status (docs):
|
|
Reference manual CI status:
|
9882dba to
c77b3ee
Compare
9b737c6 to
6e26b90
Compare
c77b3ee to
bad70e3
Compare
| -/ | ||
| def withQuery (uri : URI) (query : URI.Query) : URI := | ||
| { uri with query } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RFC 7230 Section 5.3.1 defines origin-form as:
origin-form = absolute-path [ "?" query ]
There's no fragment component—fragments are client-side only and should not be sent to servers in HTTP request targets. Should originForm exclude the fragment field, or is there a use case I'm missing?
|
|
||
| /-- | ||
| TCP port number. | ||
| -/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The username and password are stored decoded (per UserInfo docstring), but this ToString instance doesn't percent-encode them before serialization. This can produce invalid/ambiguous URIs.
Test case:
-- Expected: "user%40example:pass%3Aword@host.com"
-- Actual: "user@example:pass:word@host.com" (ambiguous - multiple @ and :)
#eval toString ({
userInfo := some ⟨"user@example", some "pass:word"⟩,
host := .name ⟨"host.com", by decide⟩
} : URI.Authority)Should this use EncodedString.encode for the userinfo components?
| if (← peekWhen? (· == '['.toUInt8)).isSome then | ||
| return .ipv6 (← parseIPv6) | ||
| else if (← peekWhen? isDigit).isSome then | ||
| return .ipv4 (← parseIPv4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Domain names can start with digits (RFC 1123 relaxed RFC 952). A host like 1example.com would trigger the IPv4 branch, fail validation in Net.IPv4Addr.ofString, and not fall back to reg-name parsing.
Should there be a fallback to reg-name if IPv4 parsing fails? Or is rejecting digit-leading domain names intentional?
| -/ | ||
| fragment : Option String := none | ||
| deriving Inhabited | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstring says "The domain name will be automatically percent-encoded" but the implementation just lowercases—no percent-encoding happens. (Domain names typically use punycode for internationalization rather than percent-encoding anyway.) Should the docstring be corrected?
This PR introduces the
URIdata type.This contains the same code as #10478, divided into separate pieces to facilitate easier review.
The pieces of this feature are:
Headersdata type for HTTP #12127URIdata type for HTTP #12128Bodytype class and some Body types for HTTP #12144