Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement type checking, orthogonally to everything else #33

Open
masak opened this issue Oct 10, 2015 · 16 comments
Open

Implement type checking, orthogonally to everything else #33

masak opened this issue Oct 10, 2015 · 16 comments

Comments

@masak
Copy link
Owner

masak commented Oct 10, 2015

It would be way cool if we could add type checking as basically an optional module on top of 007 core. It would be both a "battle test" of the macro primitives we provide (and provide clues as to whether they're powerful enough), and a prestigious thing to pull off.

I'd be totally fine with us faking much of it first, including macros, modules, and module imports/exports, as long as we were converging on having it be basically an add-on thing, expressible in 007 itself.

Syntax

The syntax extensions are the easy part. We'd want to extend:

  • my and constant declarations: my count: Int = 0;
  • Parameters (including to if and loop blocks): sub add(lhs: Int, rhs: Int) { ... } and -> s: Str { ... }
  • sub and macro declarations (return types): sub add(lhs, lhs): Int { ... }
  • the identifier in catch blocks (see Implement exceptions, throw, and try/catch/finally #65)

Basically, all the places in the language where an identifier signifies the introduction of that identifier into some scope, we'd want to also introduce a : <type> parse.

I'm totally fine with cluing the core parser in to where those places are where we'd want to introduce a typing. From the list above, that looks like about five places. We just place a rule there or something that's empty in the standard 007 grammar. If that's not enough, then I guess we'll have to re-implement enough of Perl 6 grammars inside of 007 to make that possible.

By the way, it would always be optional to add a colon and a type, and leaving a type out would need to mean the same thing as before we introduced types. (If someone still wants to put something there, they can write : Any, which means "do no type checking on this thing".)

Array is a generic type, and just writing it like that would mean Array<Any>. But you can also write Array<Int> or even Array<Array<Array<None>>>.

Function types have a -> in them to separate the types of the parameters from the return type, like Int -> Str. If you have several parameters (or zero), you need to surround them with parentheses: (Int, Int, Int) -> Str. You cannot have several things on the right-hand side of the arrow, though. (By the way, TypeScript uses => for the arrow, but we're going with Python's proposal for type annotations, and it uses ->.)

For objects, the type looks like this: { name: Str, age: Int, greet: () -> Str }.

Semantics

  • The numeric operators, like infix:<+> et al, would only be allowed on Int literals and variables. (Any would work too; see above. This fact will be assumed from now on in this text.)
  • The string operators, like infix:<~> would only be allowed on Str things.
  • If the compiler can infer the type of a sub by just bubbling up information from its constituent parts, it will do so. Thus you might actually get your subroutine typed by the compiler. (If this clashes with a type annotation you made yourself, you get a compile-time error.)
  • Similarly, if you initialize a variable as you declare it, the compiler will deduce the type for you.
  • However, both variable declarations without assignments and parameters are typed as Any if no type is provided. No guessing is being done from later handling of the variable or parameter.
  • Assignment is only allowed if the types match.
  • When a call is made to something annotated with a return type, the call expression is assumed to have that return type.
  • Indexing is only allowed on Array. The type of the thing we get back is the generic type parameter T of Array[T].
  • Typed parameters in if, for and while pointy blocks are checked as early as possible against their expression. If something is obviously impossible, that's a failure at compile time. Some things will have to wait until run time, unfortunately. (For example, my a: Any = [...]; for a -> e: Str { ... }.)
  • Control flow within a sub is analyzed enough that we make sure we always return a type we declared we would.
  • We are quite likely to need/want to do type guards in practice.
  • TypeScript does something called contextual typing which we might also start caring about once we run out of simple things to do.

Other worthy ideas

  • Union types: Int | Str
  • Optional types: Maybe<Int>
  • Fixed array types: [Str, Str, Int]
  • Generics
@masak
Copy link
Owner Author

masak commented Oct 11, 2015

We'd want to extend:

  • my and constant declarations: my count: Int = 0;
  • Parameters: sub add(lhs: Int, rhs: Int) { ... } and -> s: Str { ... }
  • sub and macro declarations: sub add(lhs, lhs): Int { ... }

I don't know whether to call it pure blind luck or language design brilliance, but neither #25 nor #32 introduces any new places to put type annotations, despite being fairly sizable additions to the syntax.

  • Object literal properties don't expect to be typed. Just as array literal elements don't. Which is lucky, because the colon is taken already for properties in the object literal DSL, separating as they do the key and the value.
  • Anything that can occur inside the class block is either a my or a sub, both of which were handled above.

@masak
Copy link
Owner Author

masak commented Oct 11, 2015

Function types have a -> in them to separate the types of the parameters from the return type, like Int -> Str.

Following tradition, the -> separator is right-associative, so Int -> Int -> Int means the same as Int -> (Int -> Int). That is "this function takes and Int and returns a function that takes an Int and returns an Int".

Do note though that Int -> Int -> Int is distinct from (Int, Int) -> Int, and that functions in 007 are not curried by default, and something like sub add(x: Int, y: Int): Int { return x + y } would be of the latter type, not the former.

@masak
Copy link
Owner Author

masak commented Oct 11, 2015

One great way to map this out would probably be to implement the whole type checker as an honest-to-blog 007 program which (magically) got the Qtree of the program-to-be-type-checked as input.

@masak
Copy link
Owner Author

masak commented Oct 14, 2015

Hm, Which one of these makes more sense?

sub foo(x: Int): Int { return x }
sub foo(x: Int) -> Int { return x }

In all the above discussion, I went with the former, because:

  • It's consistent with everything else that's type annotated and uses a :
  • It's what TypeScript does

But, taking the other side:

  • It's more consistent with the way function types look, and it speaks "return type" more
  • In that vein, it doesn't do a mental "cheat" by mixing together the return type with the type of the sub itself (or of the signature)
  • I just noticed that Swift uses -> in that position
  • And I just remembered that Python's proposal does too

I honestly don't know which one to go with. All I know so far is that I want to choose one of them, not allow both. And I want to make whatever choice I make for compelling reasons.

@masak
Copy link
Owner Author

masak commented Oct 14, 2015

One strike against using -> is that this would be allowed:

sub foo(x: Int) -> Int -> Int {
    return sub(y: Int) -> Int {    # assumed: `sub` expressions
        return x + y
    }
}

The -> Int -> Int looks ugly to me. These are two quite different -> arrows (one introduces the return type annotation; the other is part of the function type syntax).

Of course, we could arbitrarily require parentheses:

sub foo(x: Int) -> (Int -> Int) {

But I have no sense of whether that's a better solution than "just get used to reading it the 'right' way!"

All in all, this counts as a strike against switching from the status quo to ->.

@vendethiel
Copy link
Collaborator

That last one sounds Haskell-ish :P.

@masak
Copy link
Owner Author

masak commented Nov 29, 2015

It would be way cool if we could add type checking as basically an optional module on top of 007 core.

A good way to have this happen is to do all of the type-related development in a (public) branch. Then, when that branch is relatively feature-complete, find a way to integrate the whole thing into master as a pragma. Using #53's syntax, that'd be use types; or something.

@raiph
Copy link

raiph commented Feb 6, 2017

Am I right that you're proposing an optional type system as proposed by Gilad Bracha et al? That is, these types can never affect the compilation of a program save to possibly make it fail to compile if the type checker is run. If it does compile, then not running the checker, or removing some or all optional types, makes absolutely no difference to whether it compiles and what the compiled code does. So these optional types are type constraints on variables not types of values. Is that about right?

So this precludes use of these types in multiple dispatch, for example, right?

@masak
Copy link
Owner Author

masak commented Feb 6, 2017

Bracha's idea seems to me more about allowing one language to plug in many different type systems (or none at all). But yes, I would like to make the type system optional in some sense. First sentence of this issue:

It would be way cool if we could add type checking as basically an optional module on top of 007 core.

We're quite a far way from being able to do that; 007's grammar is still all in Perl 6, and we'd probably need a pure-007 parser to be able to hack type annotations into it. Beyond that it also feels like we're missing some theory/experience.

But yes, it's a goal.

@masak
Copy link
Owner Author

masak commented May 9, 2018

Array is a generic type, and just writing it like that would mean Array<Any>.

Looking at TypeScript prior art, notably the spec section about type references, it seems Array on its own is simply not a valid type reference.

I think I like that. I like the simplicity of that. I have a feeling the generics system will be tricky enough to get right without us introducing more DWIM on top of it.

Also, these days, I'm much more tempted to spell it any. Things like Int get to stay upper-cased, but any is special enough to get a lower-case character. I don't expect any (unlike basically all other types) to be accessible outside type-land either.

@masak
Copy link
Owner Author

masak commented May 9, 2018

  • Typed parameters in if, for and while pointy blocks are checked as early as possible against their expression. If something is obviously impossible, that's a failure at compile time. Some things will have to wait until run time, unfortunately. (For example, my a: Any = [...]; for a -> e: Str { ... }.)

Interestingly, it seems that in the analogous case with for-of loops, TypeScript refuses to put a type annotation on the e.

That sort of makes sense. TypeScript is not in a position to do runtime typechecking, and so it won't do any here. If you were able to put a type annotation there, you still wouldn't be able to add anything of significance compared to just typing the container you're iterating over.

Let's go with that. for loop block parameters are not typeable.

@masak
Copy link
Owner Author

masak commented May 12, 2018

In TypeScript (which much of the described type system is based on) you can't do this:

my x = 5;
x = "foo";

That is, you can't suddenly change the type of a variable after the type inference engine has decided what type it has.

It feels odd that we're being punished for changing the type of a variable, when that's a perfectly fine thing to do in the absence of a type inferencer. If type inference is opt-in, it shouldn't do anything here when we haven't opted in yet!

On the other hand, this is completely fair game:

my x: Int = 5;
x = "foo";    # type error: Str not assignable to Int

My guess is that TypeScript went the stricter way because (a) it's a lot easier to write a typechecker if it can make the assumption that the type inferred at declaration is in effect throughout the scope, and (b) because making that assumption is "good for you", the developer, anyway, in the sense that your code doesn't gain anything from you changing the type of a variable mid-flight.

Given both of those points, consider the wish in this comment a "strong opinion, loosely held". 😄

@vendethiel
Copy link
Collaborator

(b) because making that assumption is "good for you", the developer, anyway, in the sense that your code doesn't gain anything from you changing the type of a variable mid-flight.

Definitely siding with the "punish the user" camp myself.

@masak
Copy link
Owner Author

masak commented May 12, 2018

I mean, I think I can live with that. It just means in practice that, if you only add a pragma use typechecker; at the top of your program (thinking you'll gradually introduce types), you might go from a program that compiles to one that doesn't.

It's not the end of the world, and again, that's what TypeScript gives you. (Although they also claim to be a "strict superset" of JavaScript, which I feel is less true because of behavior like this.)

@vendethiel
Copy link
Collaborator

you might go from a program that compiles to one that doesn't.

You mean, a program that is well-typed even though it has no type ascriptions? :P.

(Although they also claim to be a "strict superset" of JavaScript, which I feel is less true because of behavior like this.)

That's a low blow. You could say that for any program the TS typechecker rejects.

@masak
Copy link
Owner Author

masak commented May 12, 2018

@vendethiel As an answer to both of your comments, I don't think I'm being unfair in this case. There are ways to code JavaScript that are perfectly justifiable, but that are rejected by the TypeScript parser. I believe it's a post-rationalization to say "well, maybe you shouldn't code that way anyway" — I mean, there's certainly a sense in which TypeScript does lead one to better habits... but then one cannot also claim that it's just normal, unmodified JavaScript as we know it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants