ByteArray Literals #292
Thanks for the proposal. I would also be very interested to have
My implementation plan would be:
|
Responded to some concerns.
@hsyl20 That is a very reasonable alternative. I will include your suggested syntax as an alternative and leave that question to the committee. I've never used quasiquotation (outside of working with TH-heavy libraries that require it), so it is unclear to me whether or not there are any drawbacks to that approach. If this is the decided route, wired-in quasiquoters would be essential since TH is a pain for users who cross-compile. Also, wired-in quasiquoters would be essential for any future change in the desugaring of string literals (that is, changing how |
Is there a link to the current rendered proposal? |
I've added a top-level link. I still need to make some changes to this proposal. |
I like the notations you suggest, afaict:
[octets#|fe01bce8|] -- ByteArray# (four bytes)
[utf8#|Araña|] -- ByteArray# (UTF-8)
[utf16#|Araña|] -- ByteArray# (UTF-16, native endian)
[utf16le#|Araña|] -- ByteArray# (UTF-16, little endian)
[utf16be#|Araña|] -- ByteArray# (UTF-16, big endian)
My initial reading of the proposal and discussion didn't highlight providing a binary/hex-encoded data syntax :) |
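To make the byte counts in the comments above concrete, here is a minimal sketch of what the `octets#` and `utf8#` bodies would denote as plain byte lists. The names `decodeOctets` and `encodeUtf8` are hypothetical helpers written for this illustration; they are not part of the proposal or of GHC, and the sketch uses only `base`.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (digitToInt, isHexDigit, ord)
import Data.Word (Word8)

-- | Decode pairs of hex digits, as in the body of [octets#|fe01bce8|].
-- Hypothetical helper: returns Nothing on odd length or a non-hex character.
decodeOctets :: String -> Maybe [Word8]
decodeOctets [] = Just []
decodeOctets (hi : lo : rest)
  | isHexDigit hi && isHexDigit lo =
      (fromIntegral (digitToInt hi * 16 + digitToInt lo) :) <$> decodeOctets rest
decodeOctets _ = Nothing

-- | UTF-8 encode a String, as the body of [utf8#|...|] would be encoded.
-- Hypothetical helper: surrogate code points are not rejected here.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap (map fromIntegral . go . ord)
  where
    go :: Int -> [Int]
    go c
      | c < 0x80 = [c]
      | c < 0x800 =
          [0xC0 .|. shiftR c 6, 0x80 .|. (c .&. 0x3F)]
      | c < 0x10000 =
          [ 0xE0 .|. shiftR c 12
          , 0x80 .|. (shiftR c 6 .&. 0x3F)
          , 0x80 .|. (c .&. 0x3F)
          ]
      | otherwise =
          [ 0xF0 .|. shiftR c 18
          , 0x80 .|. (shiftR c 12 .&. 0x3F)
          , 0x80 .|. (shiftR c 6 .&. 0x3F)
          , 0x80 .|. (c .&. 0x3F)
          ]
```

Under these definitions `decodeOctets "fe01bce8"` yields four bytes, and `encodeUtf8 "Araña"` yields six bytes because `ñ` takes two bytes in UTF-8.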
@hsyl20 Is there any precedent for the wired-in quasiquoters you describe? Are there any wired-in quasiquoters that exist today in GHC that can be used without turning on template haskell? |
@andrewthad I don't think so. But it should be straightforward to implement. In Quasi-quoters already have their own extension, so we would just have to allow it without also enabling TH. |
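For comparison, this is roughly what such a quoter looks like when written in user land today. It is only a sketch: `octets`, `hexBytes`, and `previewOctets` are hypothetical names, and this version produces a `StringPrimL` literal (an `Addr#`), not the `ByteArray#` literal the proposal is after. It depends on the `template-haskell` package, which is exactly the dependency a wired-in quoter would avoid.

```haskell
import Data.Char (digitToInt, isHexDigit)
import Data.Word (Word8)
import Language.Haskell.TH (Exp (LitE), Lit (StringPrimL), Q, runQ)
import Language.Haskell.TH.Quote (QuasiQuoter (..))

-- | Hypothetical helper: parse an even-length run of hex digits into bytes.
hexBytes :: String -> Maybe [Word8]
hexBytes [] = Just []
hexBytes (hi : lo : rest)
  | isHexDigit hi && isHexDigit lo =
      (fromIntegral (digitToInt hi * 16 + digitToInt lo) :) <$> hexBytes rest
hexBytes _ = Nothing

-- | A user-land quoter (an assumption for illustration, not the wired-in one).
-- It is usable only in expression context and splices an Addr# literal.
octets :: QuasiQuoter
octets =
  QuasiQuoter
    { quoteExp = \body -> case hexBytes body of
        Just bytes -> pure (LitE (StringPrimL bytes))
        Nothing -> fail "octets: expected an even number of hex digits"
    , quotePat = unsupported "pattern"
    , quoteType = unsupported "type"
    , quoteDec = unsupported "declaration"
    }
  where
    unsupported :: String -> String -> Q a
    unsupported ctx _ = fail ("octets: not usable in a " ++ ctx ++ " context")

-- | Run the quoter in IO to inspect the syntax tree it would splice.
previewOctets :: String -> IO Exp
previewOctets = runQ . quoteExp octets
```

A use site in another module would enable `QuasiQuotes` and `MagicHash` and write `[octets|fe01bce8|]`. Because the quoter runs at compile time, it still needs the Template Haskell machinery, which is the cross-compilation pain point mentioned earlier in the thread.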
@hsyl20 Since the wired-in quasiquoters would have type
To everyone, I've laid some groundwork for this feature in MR 2971. There is still a lot of work, particularly cmm-to-cmm and cmm-to-asm, that needs to happen to make this actually work. I'm not going to have time to implement this personally, but this might be a Summer of Haskell project if an interested student would like to do this.
I've cleaned this up a little more. I'll submit this to the committee in a week if there is no further discussion. |
I would like to submit this proposal to the committee. |
Thanks for the proposal and sorry for the late review! :)
I'd love to see GHC produce fast code for case-expressions on (byte)string literals. (I don't have a clear understanding of the status quo though.) Since this proposal seems to move things in that direction, I'm in favour! :)
Correct a sentence Co-authored-by: Simon Jakobi <simon.jakobi@gmail.com>
Correct spelling of scrutinize Co-authored-by: Simon Jakobi <simon.jakobi@gmail.com>
Make effects and interactions agree in number Co-authored-by: Simon Jakobi <simon.jakobi@gmail.com>
/remind @bravit to check progress in two weeks |
@nomeata set a reminder for Jun 7th 2020 |
Hi Andrew, thanks for the proposal! Unfortunately, I'm not very excited about it. I believe that such substantial changes as extending GHC Core, dealing with code generation, moving QuasiQuotes-guarded syntax to plain GHC/Haskell, etc. require much stronger motivation. Moreover, it's not immediately clear to me how many performance benefits we could get after implementing this proposal and whether they outweigh the significance of the changes you propose. As for now, I have a couple of suggestions. First, could you please elaborate on how this proposal solves the following problem:
Second, GHC proposals usually have the "Alternatives" section. It would be nice to see there some traces of not so syntactically simple ways to get around things. I think that having such a section could help Committee members to decide on this proposal. |
Hi @andrewthad, do you have any comments or objections? |
…als into bytearray-literals
I've elaborated more in the motivations section. The hyperlinks that I've added have the real information though. Reading those should provide some sense of what there is to gain. Basically, it's unlocking Core-to-Core optimizations and reducing bloat in generated binaries. I've removed the comment about "There is no O(1) way to get the length of a primitive string literal". I had copied that from an older version of this proposal. I resolved that issue in GHC MR 2165, so it is no longer relevant.
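As an illustration of the literal-length point: GHC 9.0 and later export `cstringLength#` from `GHC.CString`. That this is the mechanism MR 2165 refers to is an assumption made here, not something stated in this thread, and `greetingLength` is a hypothetical name for the example.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.CString (cstringLength#)
import GHC.Exts (Int (I#))

-- | Length of a primitive string literal (an Addr#).
-- Assumes GHC >= 9.0. cstringLength# behaves like strlen, so it is
-- only meaningful for literals that contain no embedded NUL bytes.
greetingLength :: Int
greetingLength = I# (cstringLength# "hello"#)
```

Here `greetingLength` evaluates to 5. Whether the call is folded to a constant at compile time depends on the optimisation level, so treat that part as something to check in the generated Core.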
I did not intend to suggest this. This syntax would still require the
I don't think this change is as significant as you interpret it to be. This isn't extending GHC Core in the same sense that join points or linear types do. It's adding a data constructor to
The impact on Core is minimal. I can think about alternatives more, but the only alternatives I'm aware of are different user-facing syntaxes for this. Concerning Core, there's only one way to go about this. If you have another idea for how to resolve the motivations issues, write it up and I can add it as an alternative. |
The link "This proposal is discussed at this pull request." in the rendered proposal is broken |
I'm wondering: what makes this a GHC Proposal. After all, it doesn't change the language, only the libraries, and perhaps the compiler implementation. In particular, we have accepted proposal #125, Type annotated quoters. I think this is the canonical link, or maybe this. (I wish there was an easy way to tell.) All is good, except that the type Now the only issue is: will GHC compile certain idioms efficiently. Maybe, maybe not. Let's see. And if not, let's see if we can modify GHC until it does. Only if that proves impossible, and we need some language design change, should a GHC proposal be necessary. TL;DR: maybe you can skip the GHC Proposal process, and just make GHC optimise better! |
👋 @bravit, check progress |
Hi @andrewthad, I think that @simonpj is right. This looks like implementing several quasi quoters, guarded by QuasiQuotes extension, available by default through Prelude, and supported by GHC implementation. So it turns out there are no GHC/Haskell changes proposed. I think we can label this proposal as dormant and see if there is something that we should consider here later. |
Actually, I wasn't quite right. I pointed to a proposal about Typed Template Haskell, but we want these bytearray literals to appear in patterns, so it must be an untyped TH thing. So there's no issue with instantiating a polymorphic type with TExp. Rather the issue is this: how do you write the quoters Or are we? In fact the
Now, that
I'm not sure if there was a GHC proposal about it -- the commit message is silent. I'm a bit lost in ByteString vs ByteArray land, but this looks like what you need, right? I'm not sure what it turns into when converting to Core, nor about the efficiency of the Core but you could check that. |
@simonpj What I've done is extend |
This is what I am doubtful about. Now we have THREE ways in TH Lit for describing a sequence of bytes. Do we need three? |
In considering this proposal, the committee is struggling to figure out whether this really needs to be a proposal, or whether the goals could be accomplished just by adding quasi-quoters to some library (perhaps So: could I ask the author to clarify, in the proposal text itself (not just in a comment here), what makes this a proper proposal? If a change to Core is warranted, could that be spelled out, too? I recognize that the proposal format doesn't really have a spot for a proposed change to Core (as these should be very rare), but something would have to go in the Proposed Change Specification. Thanks! |
I'll ask here before clarifying in the proposal, since I myself am not really sure what counts as changing Core. Does adding a data constructor to |
Yes, I think it does. Every GHC API user might need to update their code. And the same goes for changing TH syntax, if you plan to do that. The latter especially is a user-facing change. As I remark above, we already have two ways of describing byte-array literals in TH syntax, and one ( |
We have the following description of our scope:
So, affecting GHC API is not an immediate reason to consider this change. As for changes to GHC Core, I'm not sure that the proposed change accounts for a major feature unless we consider any change to Core as major. Anyway, I think that I should put the needs revision label on this proposal. |
Rendered Link
This is a variant of the now-closed ByteArray Literals proposal written by @phadej some time ago. It copies a lot of the text from that proposal, and I'd like to make sure @phadej is credited for that. It deviates considerably from the original proposal:
"foo"utf8## instead of "foo"#butf8 for a ByteArray#.
ByteArray# literals is required.
All of this can be summarized by saying that this proposal is less ambitious than the original, and most of the benefit this variant provides is for users who are comfortable writing out ByteArray# and Addr# literals. It does lay the groundwork for improving the desugaring of string literals, but the specifics of that are left for a future proposal.