Static arrays of numbers #209

Open
wants to merge 2 commits into
base: main

Conversation

chalcolith
Member

@ponylang-main added the "status - new" (The RFC is new and ready for discussion.) and "discuss during sync" (Should be discussed during an upcoming sync) labels Apr 24, 2023
@SeanTAllen
Member

I would like to see the compile time expressions from Luke's work (or something like them) make it into Pony. Would this syntax that builds on it conflict with the work that was in Ponyta? If we decided to bring over something like what was in Ponyta, would we be breaking the change in this RFC?

@chalcolith
Member Author

Luke's work uses syntax like #(1+2) to denote compile-time expressions. I thought that #([1;2]) seemed noisy, which is why I suggest #[1;2]. Bringing in Luke's work would not break anything in this PR.

@SeanTAllen
Member

I would only be in favor of this if it is a stepping stone to compile time expressions and would want the RFC written as such. Given that, I would probably want to see this written as a compile time expression RFC with it stated what from it can be added independently of one another and that we are ok with "groups" going in at the same time without all features.

@SeanTAllen
Member

> Luke's work uses syntax like #(1+2) to denote compile-time expressions. I thought that #([1;2]) seemed noisy, which is why I suggest #[1;2]. Bringing in Luke's work would not break anything in this PR.

I'm not really concerned with noisiness. This seems like a compile time expression, so having an idea of how we would do compile time expressions and making this fit is a very good idea. If we add #() later, it feels like we wouldn't want an entirely separate mechanism for static arrays, which I think fall under "compile time expression".

Do you see this feature as a subset of compile time expressions or independent?

@chalcolith
Member Author

chalcolith commented Apr 25, 2023

It's a very minor subset of compile-time expressions. Luke's work involves implementing an interpreter in the compiler that can evaluate expressions (including constructing objects and calling their methods). This PR only involves storing tables a la string literals. Further compile-time expression work can use this PR's work without change.

@SeanTAllen
Member

I think it is important for this RFC to explain why this isn't "just an optimization" like String literals.

If it isn't "just an optimization", what happens if someone tries to declare an iso or ref etc static array? There seem to be a lot of edge cases with this approach vs "just an optimization".

I'd like to see the reasoning for this approach over an optimization (which is what I consider "string literals" as you mention them to be).

@SeanTAllen
Member

> It's a very minor subset of compile-time expressions. Luke's work involves implementing an interpreter in the compiler that can evaluate expressions (including constructing objects and calling their methods). This PR only involves storing tables a la string literals. Further compile-time expression work can use this PR's work without change.

If this is a subset of compile time expressions and not an optimization, then I think that what compile time expressions would look like needs to be part of this PR. Without doing the work in the RFC, I don't think it is safe to say that this will have no impact on an as yet unspecified compile time expression support.

@chalcolith
Member Author

The "Alternatives" section addresses the "just optimization" option. If I'm implementing an algorithm that requires static data for performance reasons, I want to explicitly know that it's going to be static.

The design section explicitly says that compile-time expressions are denoted by # in Luke's work. What sort of extra discussion would you like to see?

@chalcolith
Member Author

If someone tries to declare a ref or iso static array, the compiler will tell them they can't do that. If we make this "just an optimization", their code will succeed and then their algorithm will suffer performance issues.

@SeanTAllen
Member

> If someone tries to declare a ref or iso static array, the compiler will tell them they can't do that. If we make this "just an optimization", their code will succeed and then their algorithm will suffer performance issues.

The RFC is lacking in details such as when this feature will succeed, when it won't etc. Those need to be detailed in the RFC.

Also the trade-offs between optimization vs adding new syntax need to be covered. There appears to be a real "simplicity" vs "I know it was done" tradeoff here. I'm not sure I agree with your "just an optimization will succeed" argument. I mean, yes, it will, but it's a ref array, so that feels like a teaching moment in documentation, not a requirement for new syntax.

Having to do:

    let y: Array[U8] val = recover val [1;2] end

seems a lot more straightforward.

It's a little different than a String in that Array defaults to ref rather than val but otherwise, this seems nice and straightforward. And all existing code where this optimization could be done, would be done.

At the moment, I'm against new syntax for this. I find the "they need to get an error" if the optimization won't be applied argument to be not particularly good in terms of tradeoffs.
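
For reference, a minimal sketch of the existing-syntax form mentioned above, written as a complete program. The element type is spelled out with `as U8:` so the example does not depend on literal inference, and whether the compiler places such an array in static data is exactly the open question of this thread, so nothing here should be read as a claim about where the data ends up.

```pony
actor Main
  new create(env: Env) =>
    // Existing syntax: build the literal inside a recover block so the
    // resulting array is immutable (val) rather than the default ref.
    let y: Array[U8] val = recover val [as U8: 1; 2] end

    // A val array can be read freely and shared with other actors.
    env.out.print("size: " + y.size().string())
```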

@SeanTAllen
Member

> The "Alternatives" section addresses the "just optimization" option. If I'm implementing an algorithm that requires static data for performance reasons, I want to explicitly know that it's going to be static.

I had to reread that several times to see how that is the case. It sounds like it is talking about optimizing a new type. I think both alternatives need to be expanded on.

I don't think editorializing like "magical" should be in the alternatives. I think laying out pros and cons of alternatives should be done.

@SeanTAllen
Member

I'd like the RFC to address why "as an optimization" is fine for String literals but not Array literals.

@chalcolith
Member Author

I'm not sure I understand what "when this feature will succeed and when it won't" means. The RFC says "The value of a static array literal can only be Array[T] val, where T is a floating-point or integral numeric type."

@SeanTAllen
Member

To make sure I am reading this correctly, this would only be for arrays of numbers, I couldn't use this for arrays of String literals or other types in the future?

@SeanTAllen
Member

Is having an annotation that will cause a compiler error if an optimization isn't applied an alternative?

@SeanTAllen
Member

> I'm not sure I understand what "when this feature will succeed and when it won't" means. The RFC says "The value of a static array literal can only be Array[T] val, where T is a floating-point or integral numeric type."

The RFC does not detail what happens for other cases. Are we expecting compiler errors? That say what? This can only be for val, so how does this work with something that "becomes a val"? What's the analysis that will need to be done? "Only a val" feels very loosey-goosey to me in terms of working through the implications of this.

@SeanTAllen
Member

"The value of a static array literal can only be Array[T] val, where T is a floating-point or integral numeric type."

What specifically would the check be that the compiler does? It will need to have knowledge of the standard library. What's the check we would do? Array[Number]?

@chalcolith
Member Author

If you tried to do let foo: Array[U32] ref = #[1;2] the compiler would say "right side must be a subtype of left side; Array[U32 val] val is not a subtype of Array[U32 val] ref^".

The compiler would check that T is one of { F32, F64, ISize, ILong, I8, I16, I32, I64, I128, USize, ULong, U8, U16, U32, U64, U128 }.
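
As a rough illustration of the capability rule described here, using only syntax that exists today (the proposed `#[...]` literal is not assumed to compile): a val array cannot be assigned to a ref binding, and the compiler reports a subtype error. The exact message wording quoted above is the author's expectation for the new syntax, so the commented-out line below is a sketch of the analogous case, not a reproduction of it.

```pony
actor Main
  new create(env: Env) =>
    // Accepted today: the recovered literal is an Array[U32] val.
    let ok: Array[U32] val = recover val [as U32: 1; 2] end

    // Rejected today if uncommented, because val is not a subtype of ref:
    // let bad: Array[U32] ref = ok

    env.out.print("size: " + ok.size().string())
```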

@SeanTAllen
Member

> If you tried to do let foo: Array[U32] ref = #[1;2] the compiler would say "right side must be a subtype of left side; Array[U32 val] val is not a subtype of Array[U32 val] ref^".
>
> The compiler would check that T is one of { F32, F64, ISize, ILong, I8, I16, I32, I64, I128, USize, ULong, U8, U16, U32, U64, U128 }.

All of this should be in the RFC.

@SeanTAllen
Member

> If you tried to do let foo: Array[U32] ref = #[1;2] the compiler would say "right side must be a subtype of left side; Array[U32 val] val is not a subtype of Array[U32 val] ref^".

So #[] means "create a val array" and apply a specific optimization to it?

Well, not really, because it is more constrained than val array, why is the specific type for the array constrained?

@SeanTAllen
Member

SeanTAllen commented Apr 25, 2023

If this was done as an optimization, could we not have more types that could play? For example, you could create Maps that could be optimized as such without needing special syntax, and other user types could potentially play as well. If something like the Complex number idea that Ryan was thinking about was added, it could play in the optimization game as well.

The question might be "should all the possible options be done"?

This feels a lot more like something for an annotation to me. Where you can annotate the expectation of an optimization (or lack of).

@chalcolith
Member Author

#[] means point a val array to a static section of data. Allowing arbitrary object types would mean interpreting them at compile-time and then serializing them into the binary.

@jumpnbrownweasel

In case it helps to have another way of saying it, I think I understand the direction Sean's describing: recovered val arrays would be optimized whenever possible to store them in the data segment. This would at first be only when the array elements are primitive numeric vals, but could later be expanded to elements that are string constants, non-numeric primitives, instances of classes with only embed fields, and so on. For predictable performance a new annotation on an array type would require this optimization, causing a compile error if it could not be done. Is that roughly what you're thinking, Sean?

@SeanTAllen
Member

We discussed this during the sync call https://sync-recordings.ponylang.io/r/2023_04_25.m4a.

In general, Joe is in favor. I am still against.

I raised my concerns about things I would want to see addressed to make sure we get this right for the future, related to interning in general (not just arrays of numbers) and the possible addition of compile time expressions.

The first 30 minutes of the linked sync call is about this RFC.

@jemc anything you want to add?

@SeanTAllen
Member

@mfelsche noted to me after the sync call that he has some thoughts on adding SIMD support to Pony that he sees as related to this RFC and would like to discuss at an upcoming sync in relation to this RFC.

@pmetras

pmetras commented May 4, 2023

Implementing static data can be tricky... To better understand the proposal:

  1. Can we have a generic static array definition like

         class Foo[A: UnsignedInteger[A] val = USize]
           let a: Array[A] = #[1; 2; 3]

  2. Why only arrays? I think there could be a relation to introducing let fields in primitive too.

@SeanTAllen
Member

> Can we have a generic static array definition like
>
>     class Foo[A: UnsignedInteger[A] val = USize]
>       let a: Array[A] = #[1; 2; 3]

Without compile time expression support, UnsignedInteger would allow for classes, which require compile time expressions. So, given the scope of this RFC, not like that. But if it was a generic that was constrained specifically to a single primitive number type, then I believe under the ideas in this RFC, yes.

@SeanTAllen
Member

> Why only arrays? I think there could be a relation to introducing let fields in primitive too.

primitives don't have fields, did you mean, primitives that are let fields in classes/actors?

@pmetras

pmetras commented May 9, 2023

> primitives don't have fields, did you mean, primitives that are let fields in classes/actors?

No. Because a primitive creates a unique instance, one can consider a field value to be a constant value because it'll exist once. If let fields were added to primitive, this would be static data and we could have static USize or F64 constants too. Or we could have static arrays if the compiler optimized let in primitives as static data. I don't know if I can explain correctly what I mean...

But I'd rather have the Ponyta solution instead of this solution, which appears more as a (temporary) hack than a global solution. The advantage is that the syntax is compatible with Ponyta.

@SeanTAllen removed the "discuss during sync" (Should be discussed during an upcoming sync) label May 30, 2023
@SeanTAllen
Member

@kulibali will be updating this based on feedback and will let us know when he is ready for more review.

@ponylang-main added the "discuss during sync" (Should be discussed during an upcoming sync) label May 30, 2023
@SeanTAllen removed the "discuss during sync" (Should be discussed during an upcoming sync) label May 30, 2023
Labels
status - new The RFC is new and ready for discussion.

5 participants