Document design about implicit casting and implicit typing #2168

Open
certik opened this issue Jul 17, 2023 · 4 comments

Comments

@certik
Contributor

certik commented Jul 17, 2023

Implicit casting covers things like 2 * x where x is f64, as well as automatic casting of arguments passed to functions. Some languages allow it, but not all. For example, Rust allows no implicit casting. We currently do not allow any implicit casting either.
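To make the rule concrete, here is a minimal sketch that mimics "no implicit casting" at runtime under plain CPython. The helper `strict_mul` is hypothetical, and `f64` / `i32` are local stand-ins (an assumption: in LPython these names would come from the `lpython` module and the check would happen at compile time).

```python
# Stand-ins so the sketch runs under plain CPython (assumption: in LPython
# these names would be imported from the `lpython` module instead).
f64 = float
i32 = int


def strict_mul(a, b):
    """Hypothetical helper: multiply only when both operands share a type."""
    if type(a) is not type(b):
        raise TypeError(
            f"no implicit cast between {type(a).__name__} and {type(b).__name__}"
        )
    return a * b


x: f64 = f64(1.5)

# The integer literal is cast explicitly, so both operands are f64: allowed.
y: f64 = strict_mul(f64(2), x)

# Mixed i32 * f64 without a cast: rejected by the strict rule.
try:
    strict_mul(i32(2), x)
    mixed_allowed = True
except TypeError:
    mixed_allowed = False
```

Plain CPython would happily evaluate `2 * x` here; the sketch only shows what the stricter rule asks the user to write instead.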

A separate question is implicit typing, where the compiler infers the type from the right-hand side. In Rust, function signatures always have to be typed, but inside functions one can use implicit typing. The author of Rust expresses some regret about this in https://graydon2.dreamwidth.org/307291.html:

Complex inference. Long ago Rust's website was just a feature list, and it had two entries that read "Type inference: yes, only local variables" and "Generic types: yes, only simple, non-turing-complete substitution". The first should actually have read "statement-at-a-time" but obviously that wouldn't have held up any more than the rest of it. I don't like "playing type tetris" and I would prefer it went back to this position, though probably the substitution would have needed to at least support bounds on module-typed parameters. I think there's a design space that avoids having to make the user think about unification or at least recursion in their types. Anyway we got a lot of type-system people who think complex types are a good thing, or at least an excusable one, and I completely lost this argument.

An example of this would be https://doc.rust-lang.org/rust-by-example/types/inference.html:

fn main() {
    // Because of the annotation, the compiler knows that `elem` has type u8.
    let elem = 5u8;

    // Create an empty vector (a growable array).
    let mut vec = Vec::new();
    // At this point the compiler doesn't know the exact type of `vec`, it
    // just knows that it's a vector of something (`Vec<_>`).

    // Insert `elem` in the vector.
    vec.push(elem);
    // Aha! Now the compiler knows that `vec` is a vector of `u8`s (`Vec<u8>`)
    // TODO ^ Try commenting out the `vec.push(elem)` line

    println!("{:?}", vec);
}

This is similar to implicit typing in LFortran, where we have to come back and fix up the type depending on how it is used (there, the question is whether f is a variable or a function). This is hard to implement correctly in all cases, and it is also confusing to the user, since the type of f can change retroactively based on how it is used later in the code.

Right now LPython does not do any implicit typing.

The main advantage of the current design is that we can always add this later, but if we add some now and later realize we added too much, we cannot take it away without breaking compatibility. The second advantage is that it might also be the right design for us, or close to it.

The third case is implicit declarations. I think Rust doesn't allow them; you always have to declare explicitly using let. In LPython, we currently have to write f: f32 = f32(5). Here the f32 on the left serves two functions: one is the explicit type, the other is the explicit declaration. If we relax it to just f = f32(5), then we have an implicit declaration and an implicit type. Finally, we could do something like f: Var = f32(5), which would be an explicit declaration with an implicit type. The issue with implicit declarations is that it's easy to make a mistake and reassign an already existing variable when the user did not intend that. By always requiring an explicit declaration (whether with an explicit type like f: f32 or an implicit type like f: Var), the compiler can check that the variable is not redeclared, avoiding such mistakes. LPython currently does not allow implicit declarations.
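The redeclaration check described above can be sketched as a tiny symbol table. This is only an illustration under plain CPython: the `Scope` class and its methods are hypothetical names, not part of LPython, and the real check would run at compile time.

```python
class Scope:
    """Hypothetical symbol table separating declaration from assignment."""

    def __init__(self) -> None:
        # Maps a variable name to (declared type, current value).
        self._vars: dict = {}

    def declare(self, name, declared_type, value):
        # An explicit declaration must introduce a new name.
        if name in self._vars:
            raise NameError(f"{name!r} is already declared")
        if type(value) is not declared_type:
            raise TypeError(f"{name!r} expects {declared_type.__name__}")
        self._vars[name] = (declared_type, value)

    def assign(self, name, value):
        # A plain assignment must target an existing name of the same type.
        if name not in self._vars:
            raise NameError(f"{name!r} is not declared")
        declared_type, _ = self._vars[name]
        if type(value) is not declared_type:
            raise TypeError(f"{name!r} expects {declared_type.__name__}")
        self._vars[name] = (declared_type, value)

    def get(self, name):
        return self._vars[name][1]


scope = Scope()
scope.declare("f", float, 5.0)   # like `f: f32 = f32(5)`
scope.assign("f", 6.0)           # like `f = f32(6)` on an existing variable

# Declaring `f` a second time is caught instead of silently overwriting it.
try:
    scope.declare("f", float, 7.0)
    redeclared = True
except NameError:
    redeclared = False
```

With implicit declarations, the last step would be indistinguishable from an ordinary assignment, which is the mistake the explicit form is meant to catch.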

In conclusion, LPython currently does not allow:

  • implicit casting
  • implicit typing
  • implicit declarations

In comparison, Rust doesn't allow implicit casting and implicit declarations, but it allows implicit typing, although I think it went too far, see above.

@rebcabin
Contributor

rebcabin commented Jul 17, 2023 via email

@certik
Contributor Author

certik commented Jul 17, 2023

The issue with some_var : MyClassFactoryProxyStubInterface(my_arg1, my_arg2) is that it won't work in CPython, since CPython ignores annotations. The issue with some_var = MyClassFactoryProxyStubInterface(my_arg1, my_arg2) is that now you have implicit declaration of the variable some_var, but only if it wasn't already declared before; if it was, then it's just a regular assignment. This seems very error prone. A solution thus could be:

from lpython import var

some_var: var = MyClassFactoryProxyStubInterface(my_arg1, my_arg2)

Where the var annotation tells LPython to declare the some_var variable (thus this declaration is explicit), and the type is inferred from the RHS. However, now this opens up the door to things like f: var = f32(5), which in principle we can also allow. But it's as long as f: f32(5), so we might as well do that for those simple types.
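For reference, the CPython behavior behind the first issue can be checked with a small sketch: an annotation without an assignment binds nothing, while the proposed `some_var: var = ...` form still runs as ordinary Python. The `var` class below is a hypothetical placeholder for the proposed `lpython.var`, and `Point` is an illustrative struct, not something from the LPython code base.

```python
class var:
    """Hypothetical marker standing in for the proposed `lpython.var`."""


class Point:
    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y


def annotation_only() -> str:
    # An annotation without a value makes `p` local but binds nothing,
    # so reading it fails in CPython.
    p: Point
    try:
        return type(p).__name__
    except UnboundLocalError:
        return "unbound"


def declare_with_var() -> Point:
    # Under CPython the annotation is inert here; the assignment does the
    # work. LPython would be the one inferring the type from the RHS.
    p: var = Point(1.0, 2.0)
    return p
```

So the `var` form stays valid CPython, which is presumably the property the proposal is after.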

For the long struct names, another idea is:

T: Type = MyClassFactoryProxyStubInterface
some_var: T = T(my_arg1, my_arg2)

Which makes it a lot shorter.
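A minimal sketch of how this alias idea behaves under plain CPython, assuming `Type` means `typing.Type` and using a dataclass with two made-up integer fields in place of the real struct:

```python
from dataclasses import dataclass
from typing import Type


@dataclass
class MyClassFactoryProxyStubInterface:
    # Field names and types are hypothetical, chosen only for the sketch.
    arg1: int
    arg2: int


# Short alias for the long struct name.
T: Type[MyClassFactoryProxyStubInterface] = MyClassFactoryProxyStubInterface


def make() -> MyClassFactoryProxyStubInterface:
    # The alias appears both as the annotation and as the constructor.
    some_var: T = T(1, 2)
    return some_var
```

CPython runs this as written; a static type checker may prefer the un-annotated form `T = MyClassFactoryProxyStubInterface` for the alias, which is the spelling used for type aliases later in this thread.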

One advantage of requiring some_var : MyClassFactoryProxyStubInterface = MyClassFactoryProxyStubInterface(my_arg1, my_arg2) or the shorter some_var: T = T(my_arg1, my_arg2) is that it is equivalent to:

some_var : MyClassFactoryProxyStubInterface
...
some_var = MyClassFactoryProxyStubInterface(my_arg1, my_arg2)

and

some_var: T
...
some_var = T(my_arg1, my_arg2)

But the "var" syntax some_var: var = MyClassFactoryProxyStubInterface(my_arg1, my_arg2) is not equivalent to:

some_var: var
...
some_var = MyClassFactoryProxyStubInterface(my_arg1, my_arg2)

Because now we have "action at a distance": we should only allow implicit typing within a single statement, and this spans multiple statements.

Instead of "var", maybe we can do:

some_var: struct = MyClassFactoryProxyStubInterface(my_arg1, my_arg2)

As an alternative syntax for:

some_var: MyClassFactoryProxyStubInterface  = MyClassFactoryProxyStubInterface(my_arg1, my_arg2)

That way it can't be used for anything other than structs, and you HAVE to initialize it, i.e., some_var : struct on its own is not allowed. It's essentially equivalent to C's MyClassFactoryProxyStubInterface some_var = {my_arg1, my_arg2}.

It's not clear to me that it is worth doing, but we can. For now, I recommend we type everything explicitly and see. Once we gain more experience, let's find a simpler syntax for some of these more annoying cases.

@rebcabin
Contributor

"Action at a distance" is a violation of the Law of Demeter, so I don't like it.

I much prefer "type aliases," i.e.,

MyTypeAlias = SomeGiganticUglyCombinationOfLongNamesAndTypeCombinatorsLikeUnionListDictAndTuple
some_var : MyTypeAlias = MyTypeAlias(arg1, arg2, ...)

@rebcabin
Contributor

btw i use type aliases heavily in the pure-Python version of lasr https://github.com/rebcabin/lpython/tree/brian-lasr/lasr/cpython, e.g.,

from typing import Any, Dict, Optional, Union

Line        = int
Col         = int
LTType      = Optional[str]
LTVal       = Optional[Union[str, int]]
FullForm    = Dict[str, Any | Dict[str, Any]]
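As a small illustration of how such aliases read at a use site, here is a sketch that reuses three of the alias names above. The function `describe_token` is hypothetical and not taken from the lasr code.

```python
from typing import Optional, Union

Line = int
Col = int
LTVal = Optional[Union[str, int]]


def describe_token(line: Line, col: Col, val: LTVal) -> str:
    """Hypothetical helper: format a token position and its optional value."""
    shown = "<none>" if val is None else str(val)
    return f"{line}:{col} {shown}"
```

The signature stays short and says what each integer means, without introducing any implicit typing.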
