Make `parseUrlHttp[s]` take in a `Text` value #67
Comments
The current implementation seems to already basically do the right thing on UTF-8 encoded It should just be more transparent about the fact that it works on unicode/utf-8 by working on |
Even so, The type which we take as the argument of Anyway, AFAIU, both
Since both options are not ideal and |
The flaws in the types are more or less what you said, but I would adjust the phrasing slightly to show why `Text` really is much less problematic than `ByteString`. The two main issues are:
The |
`ByteString` semantically represents a sequence of 8-bit octets, not a series of ASCII characters, and URLs are semantically sequences of characters. You can see this distinction in how `Char8` is too big to store only ASCII, in how often `ByteString` is used to store non-ASCII data, and in the non-ASCII encoding typically used when outputting to stdout.

Currently, if we have a `Text` value that we want to convert into a URL, we are in the awkward position of deciding how to convert it to a `ByteString`. We can `T.unpack` it, and any non-ASCII text essentially becomes garbage. We can `T.encodeUtf8` it, which is probably the most sensible thing to do, as it appears that `parseUrlHttps` will re-pack that UTF-8 back into the appropriate `Text` value. We could also mistakenly use any other non-ASCII-compatible encoding like UTF-16 and break everything.

It seems to make the most sense to just have it be a `Text` value from the start, particularly given that `parseUrlHttp[s]` already seems to do proper UTF-8 decoding.
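To make the three conversion choices concrete, here is a minimal sketch using only the `text` and `bytestring` packages. The names `viaChar8`, `viaUtf8`, `viaUtf16`, `survivesUtf8Decode`, and `withTextArg` are hypothetical helpers invented for illustration, and the sketch assumes (as the issue suggests, but does not confirm) that the parser decodes its input as UTF-8. It does not call `parseUrlHttp[s]` itself; `withTextArg` only shows how a `Text` front end could wrap any `ByteString`-taking parser.

```haskell
module Main where

import qualified Data.ByteString       as B
import qualified Data.ByteString.Char8 as B8
import           Data.Text             (Text)
import qualified Data.Text             as T
import qualified Data.Text.Encoding    as TE

-- Lossy route: Char8.pack keeps only the low 8 bits of every Char,
-- so anything outside ASCII no longer forms valid UTF-8.
viaChar8 :: Text -> B.ByteString
viaChar8 = B8.pack . T.unpack

-- The route the issue calls "probably the most sensible".
viaUtf8 :: Text -> B.ByteString
viaUtf8 = TE.encodeUtf8

-- An easy mistake: an encoding that is not ASCII-compatible.
viaUtf16 :: Text -> B.ByteString
viaUtf16 = TE.encodeUtf16LE

-- Does a UTF-8 decoding consumer (assumed here to model the parser)
-- recover the original text from the encoded bytes?
survivesUtf8Decode :: (Text -> B.ByteString) -> Text -> Bool
survivesUtf8Decode encode t =
  either (const False) (== t) (TE.decodeUtf8' (encode t))

-- Hypothetical adapter: give any ByteString-taking parser a Text argument.
withTextArg :: (B.ByteString -> Maybe a) -> Text -> Maybe a
withTextArg parse = parse . TE.encodeUtf8

main :: IO ()
main = do
  -- "\233" is U+00E9 (e with acute accent), written as an escape
  -- to avoid depending on the source file encoding.
  let url = T.pack "https://example.com/caf\233"
  putStrLn ("Char8 pack: " ++ show (survivesUtf8Decode viaChar8 url))
  putStrLn ("UTF-8:      " ++ show (survivesUtf8Decode viaUtf8 url))
  putStrLn ("UTF-16LE:   " ++ show (survivesUtf8Decode viaUtf16 url))
```

Under these assumptions, only the UTF-8 route should survive the round trip for a URL containing non-ASCII characters, which is the practical argument for accepting `Text` directly and performing the encoding in one place.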