Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lists #39

Open
qsctr opened this issue Feb 20, 2021 · 4 comments
Open

Lists #39

qsctr opened this issue Feb 20, 2021 · 4 comments

Comments

@qsctr
Copy link
Contributor

qsctr commented Feb 20, 2021

We are adding lists parameterized by the type of the contents and the maximum length.

For the syntax for the list type, given that the syntax for the int type is written like a function call, should the list type then be something like list(n, T) or list(T, n)? Or maybe we should have different kinds of type arguments be separate, so type-level numbers go in parentheses but types go somewhere else, so something along the lines of list(n) T or [T](n) or T list(n).

For the list literal syntax I was thinking we could just have something like [expr, expr, ..., expr], but if we are parameterizing by max length then we would also need the max length in there if we want to avoid type inference. So it would have to be like [a, b, c](n) or (n)[a, b, c], or if we follow the existing int value syntax, list(n, [a, b, c]) or maybe if we flatten it a bit list(n, a, b, c)? This would be kind of annoying to type out every time though. Maybe we can assume the max size is the length of the list literal by default if it is not explicitly specified, although I have no idea how common that will be the case in actual code.

As for semantics I thought about how to desugar lists to existing constructs but I'm not sure if it fully works out. We could desugar [a, b, c](3) to something like (3, (a, (b, c))) where the first thing is the length and the rest are the elements, or maybe (3, (a, (b, (c, false)))) to be more uniform. And if we have something where the actual length is less than the max length such as [a, b, c](5) then it could be like (3, (a, (b, (c, (def, (def, false)))))) where def is some default value for the type and doesn't really matter. But then the problem with doing it this way is that if we do a cons operation then it's not just a simple O(1) thing but rather we would have to shift everything inwards. And if we put the elements on the inside and the default values on the outside then we would have to go through the default values to get to the elements as well. So I'm not sure if there's a good way to represent things with tuples.

One thing I was thinking was if we don't parameterize by the maximum length but rather just by the actual length. So it would be similar to the vector types you have in dependently typed languages. Then lists could really just be simple syntactic sugar over nested tuples. But then you would probably really need polymorphism over length of lists to do anything useful that you couldn't currently do.

@millstein
Copy link

Great questions and thoughts Bretton!

For the syntax: I don't particularly like the function syntax for types, but given that we have it already for integers, I think list(T, n) makes the most sense. We can always change it later (I would like to review the syntax more generally at some point, but that is a separate issue). For list literals, I like list(n, [a,b,c]) with a default that the max is the length if not specified. Again this is changeable later.

For the semantics: I think we need to first step back and understand the static and dynamic semantics of these lists. Then we can figure out how to best represent and compile them. For example, it's not clear to me that the actual value needs to explicitly include the length, in the same way that bounded integers values do not "know" their length -- the compilation simply leverages the type, which knows the length.

So in particular, what is the semantics of cons? Operationally, consing a value v on to list l of type list(T, n) gives v::l if length(l) < n. What is the semantics if length(l) = n? I think what makes most sense is to think of the list like a bounded queue, so consing v onto l puts v onto the front of l and removes the last element of l. Another option would be in that case for the cons operation to simply produce the original list l, but that seems more error prone for programmers.

Statically, consing t:T onto l:list(T, n) gives a list of type list(T,n).

@millstein
Copy link

As you said, the dependent type approach could be considered but would significantly complicate things in terms of the type system. So I don't think that's the way to go, at least initially.

@guyvdbroeck
Copy link
Contributor

The statement "consing a value v on to list l of type list(T, n) gives v::l if length(l) < n" doesn't really make sense to me: a probabilistic list doesn't have a length; it represents many possible lists, all having different lengths, and we only know the maximum length from the type? So cons should just increase the maximum length by 1 and increase the length of each possible-world-list by 1, whatever it was originally?

@millstein
Copy link

Right. My question is about the semantics in each possible world. I was doing a truncation semantics similar to how we deal with arithmetic currently. You're suggesting an alternate semantics, which is just the normal one for cons, but then requires that the result type have an increased maximum by 1, as you said.

As long as the only operations for creating lists have the property that a reasonable maximum length can be figured out at compile time, then that seems like a nicer way to go. Then we'd have something like this:

[a1, ... , an] : list(T, n) (no need to specify the maximum length)
cons: T * list(T, n) -> list(T, n+1)
head: list(T, n) -> T where n>0 (the list still might be empty in some worlds so what is the semantics?)
tail: list(T, n) -> list(T, n-1) where n>0 (same issue here)
append: list(T, n) -> list(T, m) -> list(T, n+m)

And we would need a simple form of subtyping in the type system, so that if m < n then list(T,m) is a subtype of list(T, n), so that you can create probability distributions over lists of different lengths.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants