RFC: List append operator proposal #68
Comments
Sorry to hijack the discussion but would you mind explaining a bit more how the haskell semigroup |
@PierreR:

    class Alternative f where
        (<|>) :: f a -> f a -> f a
        empty :: f a

The reason that it relates to

    -- Assuming that we have:
    instance Alternative F where ...
    -- ... that implies that we also have:
    instance Applicative F where ...
    -- ... because `Applicative` is a superclass of `Alternative`
    -- ... which means that we can also define this `Monoid` instance for `F`:
    instance Monoid a => Monoid (F a) where
        mempty = pure mempty
        mappend = liftA2 mappend

... and if you have the above
In other words,
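As a concrete illustration of the `liftA2`-based `Monoid` instance above: the `Ap` newtype from `Data.Monoid` in `base` provides an instance of the same shape for any `Applicative`. The sketch below uses it with lists and `Maybe`; the value names are only illustrative and not part of the proposal.

```haskell
import Data.Monoid (Ap (..))

-- For lists, liftA2 (<>) pairs every element on the left with every element
-- on the right, which is the "outer product" behaviour discussed in this thread.
outerProduct :: [String]
outerProduct = getAp (Ap ["a", "b"] <> Ap ["x", "y"])

-- The identity element is `pure mempty`, i.e. a one-element list holding "".
identityElement :: [String]
identityElement = getAp mempty

-- For Maybe, the same instance appends only when both sides are present.
bothPresent :: Maybe String
bothPresent = getAp (Ap (Just "a") <> Ap (Just "b"))

oneMissing :: Maybe String
oneMissing = getAp (Ap (Just "a") <> Ap Nothing)
```

Here `outerProduct` evaluates to `["ax","ay","bx","by"]` and `identityElement` to `[""]`.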
josefs commented Jun 28, 2017
I think the proposal looks fine. I just want to comment briefly on your remark on overloading. Overloading the operators the way you suggest above would mean that the operator (**) would behave as bog standard concatenation for text but as an outer product for lists. That would be terribly confusing. If you want to introduce overloading, I'd rather you made sure that each operator intuitively has the same behavior for every type it is defined for.
Having

    x ** (y ++ z) = (x ** y) ++ (x ** z)

Giving it outer product semantics is consistent with that law
The semantics of
josefs commented Jun 28, 2017
Perhaps you can explain why you think that the distributivity law is important in this context. I certainly don't understand why it needs to be there. To be clear, here's the law that I think would make sense:
It involves the as yet undefined, but intuitively sensible, function |
My reasoning goes something like this: For the type
We obviously have to support appending lists (that's the whole point of this proposal!), but some users (like myself) would also like to be able to append elements. For example, I would like to be able to write something like this:

    let Text/concat = ...
    in let names = ["John", "Lucy", "Brad"]
    in let greeting = ["Hello", "Goodbye"]
    in Text/concat (greeting ** [" "] ** names ** ["!\n"])

The whole reason for having two separate operators for appending
As a bonus,
Regarding functor laws, the laws that I am proposing if there were a hypothetical

    textToList (x ** y) = textToList x ++ textToList y
    textToList "" = [] : List Char
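For readers who want to see what the greeting example would produce, here is a minimal Haskell sketch. It assumes the proposed list `(**)` behaves like `liftA2 (++)` on Haskell lists and that `Text/concat` behaves like `concat`; both are stand-ins for the Dhall semantics, which this thread has not settled.

```haskell
import Prelude hiding ((**))
import Control.Applicative (liftA2)

-- Hypothetical stand-in for the proposed (**) on lists of text:
-- append every element of the left list to every element of the right list.
(**) :: [String] -> [String] -> [String]
(**) = liftA2 (++)

names :: [String]
names = ["John", "Lucy", "Brad"]

greeting :: [String]
greeting = ["Hello", "Goodbye"]

-- One line per (greeting, name) pair: 2 * 1 * 3 * 1 = 6 lines.
greetingLines :: [String]
greetingLines = greeting ** [" "] ** names ** ["!\n"]

-- Stand-in for Text/concat applied to the combined list.
message :: String
message = concat greetingLines
```

Under these assumptions `greetingLines` has six entries, starting with `"Hello John!\n"` and ending with `"Goodbye Brad!\n"`.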
josefs commented Jun 29, 2017
I agree with everything you say up to the point of reusing the Furthermore, the fact that |
My reasoning is that operator reuse is good as long as the operator obeys mathematical laws. The reason why is that if the operator obeys laws then you can reason abstractly about the code's behavior independent of what type you instantiate the operator to work on. The point of reuse is not just to avoid wasting operator namespace. It's also about preserving mathematical intuitions as we transition between different types. The whole reason that they are named

    -- Given:
    zero = [] : List Text
    one = [""]
    x, y : List Text

    List/length Text (x ++ y) = List/length Text x + List/length Text y
    List/length Text zero = +0
    List/length Text (x ** y) = List/length Text x * List/length Text y
    List/length Text one = +1

However, that still doesn't address the separate question of whether
In this case, I prefer to view
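The length equations above can be spot-checked in Haskell. This is a sketch under the assumption that the list `(**)` is modelled by `liftA2 (++)`; the function names are hypothetical. One caveat worth noting: with that element ordering, `(x ++ y) ** z = (x ** z) ++ (y ** z)` holds exactly, while the form `x ** (y ++ z) = (x ** y) ++ (x ** z)` quoted earlier in the thread holds only up to the order of the elements. A different iteration order in the real implementation would swap which side is exact.

```haskell
import Prelude hiding ((**))
import Control.Applicative (liftA2)
import Data.List (sort)

-- Hypothetical stand-in for the proposed list (**).
(**) :: [String] -> [String] -> [String]
(**) = liftA2 (++)

zero, one :: [String]
zero = []
one = [""]

-- The four List/length equations, using Haskell's length.
lengthLawsHold :: [String] -> [String] -> Bool
lengthLawsHold x y =
  and
    [ length (x ++ y) == length x + length y
    , length zero == 0
    , length (x ** y) == length x * length y
    , length one == 1
    ]

-- Distribution over a sum on the left-hand operand holds exactly.
distributesRight :: [String] -> [String] -> [String] -> Bool
distributesRight x y z = (x ++ y) ** z == (x ** z) ++ (y ** z)

-- Distribution over a sum on the right-hand operand yields the same
-- elements, but not necessarily in the same order, so compare sorted lists.
distributesLeftUpToOrder :: [String] -> [String] -> [String] -> Bool
distributesLeftUpToOrder x y z =
  sort (x ** (y ++ z)) == sort ((x ** y) ++ (x ** z))
```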
josefs commented Jun 29, 2017
I still disagree. But if I were to argue my position I'd just be repeating myself. I'm not swayed by what you're saying and you don't seem to be swayed by my arguments either. Let's just agree to disagree.
Alright, but I have just one last question before you go: the strongest alternative approach that I'm still considering is using |
josefs commented Jun 29, 2017
Using |
I still might go for |
Gabriel439 added a commit that referenced this issue Jun 30, 2017
So I put up a pull request to just overload The deciding factor in my decision was the desire to not break existing Dhall code. I'll let that pull request sit for a week and if nobody objects then I'll merge
Gabriel439 closed this in 8c340c1 Jul 7, 2017
Gabriel439 added a commit that referenced this issue Jul 15, 2017
I ended up reverting this change temporarily because using the same operator name is problematic for downstream compilers (such as For now I'll just reopen this issue until I figure out how to resolve this. Most likely I will just end up selecting a unique operator and constructor name for appending |
Gabriel439 reopened this Jul 15, 2017
Gabriel439 added a commit that referenced this issue Jul 22, 2017
I introduced #90 to reintroduce the same change except with |
Gabriel439 commented Jun 15, 2017
This is a request for comments on a proposal to add support for a list append operator.
Right now there is no list append operator. The closest thing Dhall has to such an operator is ./Prelude/List/concat (which is defined in terms of List/build and List/fold). The original reason for this was to encourage efficient concatenation of lists, since lists are stored under the hood as Vectors, and if you want to concatenate N vectors the most efficient way to do so is to concatenate them all at once (which is O(N) time complexity) instead of pair-wise (which is O(N^2) time complexity).
However, there are some times when all you want to do is to append two lists. For example, I ran into this issue when trying to translate the second Jsonnet example on this page (and that was the original motivation for this RFC), where I had to append two lists and awkwardly had to use something like ./Prelude/List/concat Text [x, y] when I would have preferred x ++ y or something similar.
The reason that Text provides a built-in ++ operator is that there is no similar efficiency concern. Text is implemented under the hood as a Builder which has O(1) time complexity for append.
There is also a separate and orthogonal issue, which is what to name this list operator if Dhall were to provide support for one. ++ is already taken for appending Text, even though it is also the most natural thing to name the operator to append two lists.
So I will structure this as three sets of orthogonal proposals for:
* (A) How to support the list operator
* (B) Name of the append operator
* (C) Performance
In each set of proposals, I will also indicate which one I prefer. I also invite people to submit proposals of their own if they can think of a better way to support this.
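The O(N) versus O(N^2) claim for concatenating N vectors can be made concrete with a simple copy count. The sketch below is a cost model, not a benchmark: it assumes N lists of k elements each, stored as flat arrays, where every append allocates a fresh array and copies both operands into it. The function names are hypothetical.

```haskell
-- Elements copied when appending n lists of k elements one pair at a time,
-- left to right: the append that produces the first i lists copies i * k elements.
copiesPairwise :: Int -> Int -> Int
copiesPairwise n k = sum [i * k | i <- [2 .. n]]

-- Elements copied when all n lists are written into one result in a single pass.
copiesAtOnce :: Int -> Int -> Int
copiesAtOnce n k = n * k
```

For example, with 4 lists of 10 elements, `copiesPairwise 4 10` is 90 while `copiesAtOnce 4 10` is 40, and the gap grows quadratically with the number of lists.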
Proposal set (A) - How to support the list operator

Proposal (A0) - Do nothing
Maybe this isn't such a big deal and users can live with using either ./Prelude/List/concat or a newly added ./Prelude/List/append that is not used as an infix operator.

Proposal (A1) - Add support for user-defined infix operators
Allow users to define new operators using the syntax:
... and add ./Prelude/List/append to the ./Prelude

Proposal (A2) - Add a built-in list concatenation operator
This would add a new ListAppend constructor to the core syntax tree. See Proposal set (B) which covers what to name this operator

My preference
I prefer proposal (A2).
I don't recommend proposal (A1) because a user-defined list operator cannot take advantage of type inference. The type of ./Prelude/List/append would be:
... so you would need to instantiate a to a specific type each time you bound ./Prelude/List/append to the operator name. A built-in operator could take advantage of type inference when type-checking to infer the type of the list being concatenated.
I also don't think it hurts to add a new list append operator to the language since it's very simple to implement and I'm pretty sure some users will want to use this operator. It also is associative and has an identity, so it passes the bare minimum criteria for being an operator in Dhall. So this is why I don't think we should do nothing as in Proposal (A0).

Proposal set (B) - Name of the append operator
If we go with proposal (A1) then we should recommend a naming convention for this operator and if we go with proposal (A2) then we have to pick a name.

Proposal (B0) - Reuse the operator (++)
This would add new type-checking logic to infer which (++) operator the user meant by the types of the arguments and provide a helpful error message if there is a type mismatch.

Proposal (B1) - Name list append (++) and rename text append to (**)
This would make it unambiguous which append the user intended, which would improve error messages

Proposal (B2) - Name the list append or text append operator something else
Some other ideas for what to name either operator to avoid a conflict:
* (<|>) (which is a valid list concatenation operator in Haskell, too)
* (+++)
* (#)
* (%)

My preference
I prefer proposal (B1). The rationale behind these two names is to open the door in the future to (++) being overloaded to work on both List/Optional and (**) working on Text/List Text/Optional Text/List (List Text)/...
Or more generally, using a mix of Dhall and Haskell pseudo-code, this is what I envision
These would have the nice property that (**) and (++) behave like multiplication and addition (thus the operator names):
... and all the other semiring laws.
I'm not sure that I will ever implement these more general overloaded versions of (++) and (**) but I would still like to have some underlying consistency explaining why they are named the way they are.
The other reason I prefer distinct names is that Dhall tries to encourage the user to express their intent as much as possible in order to improve error messages. Distinct operator names would make it clearer which concatenation operator the user meant to use.
The downside of this is that it would break existing code since it's definitely not a backwards-compatible change. However, Dhall is still a pretty young language so I think it's okay to still make breaking changes at this point.

Proposal set (C) - Performance
This section will discuss how to represent Dhall Lists internally and how that affects the time complexity of the most performance-sensitive operations that Dhall currently supports. All of the time complexities are in terms of E, the total number of list elements.
This decision is not as important to get right now since we can always change it later without changing the Dhall language. This only affects the Dhall API.

Proposal (C0) - Continue to use Vector to store Lists
* (++) - O(E)
* List/length - O(1)
* List/indexed - O(E)
* List/last - O(1)

Proposal (C1) - Use Data.Sequence.Seq to store Lists
Better time complexity than Vector for append, but possibly worse constant factors:
* (++) - O(log E)
* List/length - O(1)
* List/indexed - O(E) (Using Data.Sequence.mapWithIndex)
* List/last - O(1)

Proposal (C2) - Use VectorBuilder.Builder.Builder to store Lists
Most efficient for append, but worst in other respects:
* (++) - O(1)
* List/length - O(E) (Converting to a Vector first)
* List/indexed - O(E) (Converting to a Vector intermediate)
* List/last - O(E) (Converting to a Vector first)

Proposal (C3) - Use a wrapper around VectorBuilder.Builder.Builder
This takes advantage of the fact that Dhall doesn't support very many operations, so we can use a final encoding to efficiently cache their results. This takes advantage of Haskell's laziness to only compute what we actually need:
The main operation that this doesn't improve the efficiency of is List/indexed. However, it might be possible to add more efficient support for that with an upstream patch to vector-builder

My preference
I'm partial to both (C1) and (C3) but I slightly lean towards (C1) (using Seq) because it's a simple and versatile data structure, and it's an improvement over Vector from a time-complexity standpoint for append. Constant factors don't matter as much for Dhall since it's not intended to be a high performance language.

Conclusion
So to summarize, my proposal is to:
* Rename the text append operator to (**)
* Add a built-in list append operator named (++)
* Use Seq instead of Vector internally to represent Dhall lists to improve the efficiency of append
Anybody who is interested in this can suggest any other proposals. If nobody objects to the above proposal or suggests any other alternatives then I'll implement this after a week has passed.
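To make proposal (C1) more tangible, here is a small sketch of the four operations costed above, written against `Data.Sequence` from the `containers` package. It is not how Dhall implements lists; the names `appended`, `indexed` and `lastElement` are illustrative only.

```haskell
import Data.Sequence (Seq, (><))
import qualified Data.Sequence as Seq

xs, ys :: Seq String
xs = Seq.fromList ["a", "b"]
ys = Seq.fromList ["c"]

-- List append: (><) concatenates two sequences in logarithmic time.
appended :: Seq String
appended = xs >< ys

-- List/indexed: pair each element with its position in one linear pass.
indexed :: Seq (Int, String)
indexed = Seq.mapWithIndex (\i x -> (i, x)) appended

-- List/last: inspect the right end of the sequence without traversing it.
lastElement :: Seq a -> Maybe a
lastElement s = case Seq.viewr s of
  Seq.EmptyR -> Nothing
  _ Seq.:> x -> Just x
```

`Seq.length appended` covers `List/length`; for the values above it is 3, and `lastElement appended` is `Just "c"`.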