Primitive type implementation #875

Closed
wangyanxing opened this issue Oct 11, 2014 · 35 comments
Labels
Discussion Issues which may not have code impact

Comments

@wangyanxing

Primitive Type Implementation

We have been working on adding support for primitive types to TypeScript as part of a research project at San Francisco State University. This issue documents our efforts, and we would like to solicit feedback from the community. The following fork of the TypeScript repo contains a complete reference implementation of our proposal:

https://github.com/wangyanxing/TypeScript

The following Google Doc contains a formal specification of the syntax and semantics of primitive types in TypeScript:

https://docs.google.com/a/puder.org/document/d/120hs8AJ0-WGwHbvV7D3XOJ2b18ZceBvYjOlDcR1Dn_A/edit?usp=sharing

We have activated the commenting feature for the Google Doc. The following gives a quick overview of the changes we have made to the TypeScript language.

There are ten new primitive types in our implementation:
int, uint, i8, u8, i16, u16, i32, u32, float, double.

The value ranges of these types are documented in the Google Doc.

Here are some examples:

Variables and Arithmetic

The following code:

var var1 : u8 = 256;
var var2 : i8 = 200;
var var3 : u16 = 600000;
var var4 : i16 = 100000;
var var5 : int = 3.14;
var var6 : uint = var3 * 10000;
var var7 : double = 0.123456789;
var var8 : float = var7;
var var9 : u8[] = [var4,var5,var6];
var var10 : int = 3 / 4;
var var11 : int = 1 + 2;
var var12 : int = var5 + 2;

would emit:

var var1 = 256 & 255;
var var2 = (200 & 255) << 24 >> 24;
var var3 = 600000 & 65535;
var var4 = (100000 & 65535) << 16 >> 16;
var var5 = 3.14 | 0;
var var6 = var3 * 10000 >>> 0;
var var7 = 0.123456789;
var var8 = Math.fround(var7);
var var9 = [var4 & 255, var5 & 255, var6 & 255];
var var10 = 3 / 4 | 0;
var var11 = 1 + 2 | 0;
var var12 = var5 + 2 | 0;
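
The masks and shifts in the emitted code can be checked in plain TypeScript. The sketch below wraps each coercion in a helper and works on ordinary `number` values; the helper names (`toU8`, `toI8`, and so on) are hypothetical and are not part of the proposal or the fork.

```typescript
// Hypothetical helpers mirroring the coercions in the emitted code above.
const toU8 = (x: number): number => x & 255;                  // keep the low 8 bits
const toI8 = (x: number): number => (x & 255) << 24 >> 24;    // sign-extend bit 7
const toU16 = (x: number): number => x & 65535;               // keep the low 16 bits
const toI16 = (x: number): number => (x & 65535) << 16 >> 16; // sign-extend bit 15
const toInt = (x: number): number => x | 0;                   // signed 32-bit integer
const toUint = (x: number): number => x >>> 0;                // unsigned 32-bit integer

const var1 = toU8(256);            // 0
const var2 = toI8(200);            // -56
const var3 = toU16(600000);        // 10176
const var4 = toI16(100000);        // -31072
const var5 = toInt(3.14);          // 3
const var6 = toUint(var3 * 10000); // 101760000
const var9 = [toU8(var4), toU8(var5), toU8(var6)]; // [160, 3, 0]
const var10 = toInt(3 / 4);        // 0
```

Every out-of-range initializer wraps around silently, which is the behavior questioned later in the thread.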

Note: some platforms, such as Node.js, do not support the Math.fround() function. Our version of the TypeScript compiler emits a compatibility version of that function in the generated code.
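
The compatibility function emitted by the fork is not shown in this thread. As a rough sketch, assuming typed arrays are available, one common way to round a double to single precision is to store it in a `Float32Array`; the name `froundCompat` is hypothetical.

```typescript
// Hypothetical fallback for Math.fround; not the code the fork actually emits.
const scratch = new Float32Array(1);

function froundCompat(x: number): number {
    scratch[0] = x;    // storing rounds to the nearest float32 value
    return scratch[0]; // reading widens the result back to a double
}

const var7 = 0.123456789;
const var8 = froundCompat(var7); // close to var7, but with float32 precision
```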

Functions

The following code:

function fun(x : i32) : u8 {
    x++;
    return x;
}
fun(255.9); // the function will return 0

would emit:

function fun(x) {
    x = x | 0;
    var _$0;
    (_$0 = x, x = x + 1 | 0, _$0);
    return x & 255;
}
fun(255.9);
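
The comma expression in the emitted code keeps the old value of x so that x++ still evaluates to the pre-increment value, while x itself wraps as a 32-bit integer. A plain-TypeScript rendering of the same steps, using `number` throughout and a hypothetical temporary named `saved`, looks like this:

```typescript
// Plain-TypeScript rendering of the emitted function above.
function fun(x: number): number {
    x = x | 0;                         // coerce the i32 parameter
    let saved: number;
    (saved = x, x = x + 1 | 0, saved); // x++: yields the old value, x wraps as i32
    return x & 255;                    // coerce the u8 return value
}

const r1 = fun(255.9);      // 255.9 -> 255 -> 256 -> 0
const r2 = fun(2147483647); // the increment wraps to -2147483648, whose low byte is 0
```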

The following code:

var func1 = (x: int): u8 => { return x * 2; };
var func2 = function(x: int, y: int): u8 { return x + y; };
func1(200.4);
func2(255.7, 1.9);

would emit:

var func1 = function (x) {
    x = x | 0;
    return x * 2 & 255;
};
var func2 = function (x, y) {
    y = y | 0;
    x = x | 0;
    return x + y & 255;
};
func1(200.4);
func2(255.7, 1.9);
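
The thread does not state the results of these two calls. Running the emitted bodies as ordinary TypeScript (with `number` in place of int and u8) gives the values below; the variable names `a` and `b` are only for illustration.

```typescript
const func1 = (x: number): number => {
    x = x | 0;          // int parameter
    return x * 2 & 255; // u8 return value
};
const func2 = (x: number, y: number): number => {
    y = y | 0;
    x = x | 0;
    return x + y & 255;
};

const a = func1(200.4);      // 200 * 2 = 400, and 400 & 255 = 144
const b = func2(255.7, 1.9); // 255 + 1 = 256, and 256 & 255 = 0
```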

The following code:

function fun_opt_arg(x? : int) : int {
    return x;
}

would raise an error since we forbid primitive type variables to be optional arguments:

error TS8001: Primitive type declaration 'x' cannot be optional.
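
The post does not say why optional primitive parameters are rejected. One plausible reason, offered here only as an assumption, is that callee-side coercion would turn a missing argument into 0, making it indistinguishable from an explicit 0:

```typescript
// Illustration only: what callee-side coercion would do to an optional parameter.
function funOptArg(x?: number): number {
    // At run time `undefined | 0` evaluates to 0; the cast only satisfies the type checker.
    return (x as number) | 0;
}

const omitted = funOptArg();   // 0
const explicit = funOptArg(0); // 0, the same result as the omitted call
```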

Generics

The following code:

class GenericNumber<T> {
    add: (x: T, y: T) => T;
}

var geni8 = new GenericNumber<i8>();
geni8.add = function(x:i8, y:i8):i8 { return x + y; };

var genu8 = new GenericNumber<u8>();
genu8.add = function(x:u8, y:u8):u8 { return x + y; };

geni8.add(1.9, 2.9); // it will return 3
genu8.add(255, 1); // it will return 0

would emit:

var GenericNumber = (function () {
    function GenericNumber() {
    }
    return GenericNumber;
})();
var geni8 = new GenericNumber();
geni8.add = function (x, y) {
    y = (y & 255) << 24 >> 24;
    x = (x & 255) << 24 >> 24;
    return (x + y & 255) << 24 >> 24;
};
var genu8 = new GenericNumber();
genu8.add = function (x, y) {
    y = y & 255;
    x = x & 255;
    return x + y & 255;
};
geni8.add(1.9, 2.9); // it will return 3
genu8.add(255, 1); // it will return 0
@ivogabe
Contributor

ivogabe commented Oct 11, 2014

Looks good, I have a few suggestions:

I think it should be disallowed to do an implicit dangerous type cast. i8 to int is safe, because we know that every number that fits in an i8 will also fit in an int. int to double is also safe. But a double to an int is not safe, because there are numbers that fit in a double but not in an int (like 8.5). C# disallows this. You need to do an explicit type cast.

I would also like to have a syntax for typecasts. Some proposals:

double d = 8.5;
int i;
i = int(d); // swift
i = (int) d; // C#
i = <int> d; // TypeScript casts

That would generate something like

i = d | 0;
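
For reference, here is what `d | 0` does to a few values, written as an ordinary TypeScript function. The name `castToInt` is hypothetical; none of the cast syntaxes above exists in TypeScript today.

```typescript
// Hypothetical explicit cast, expressed as a plain function over `number`.
function castToInt(d: number): number {
    return d | 0; // drops the fraction and wraps modulo 2^32
}

const i = castToInt(8.5);        // 8
const j = castToInt(-8.5);       // -8 (truncation toward zero, not floor)
const k = castToInt(2147483648); // -2147483648 (out-of-range values wrap)
```

The last case shows that even an explicit cast is lossy for values outside the 32-bit range.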

Do you have a reason why you chose i8 instead of int8?

@wangyanxing
Author

Thanks for the comment. I've also read your proposal in issue #195; your idea is very practical.

Since I'm a C++ programmer, I'm accustomed to using such implicit type casts. Yes, they are dangerous, but sometimes useful. I also took a lot of ideas from LLJS, which does implicit casts as well.

As for i8, I use it just because it's short. Some old C++ programmers, like me, are too lazy to type more. :)

@ivogabe
Contributor

ivogabe commented Oct 11, 2014

I understand your laziness but when you're lazy you shouldn't use static typing. One of TypeScript's design goals is to "Statically identify constructs that are likely to be errors." Such casts are likely mistakes and can give unexpected behavior:

var x: int;
// Lots of code
x = 3.5;

The user might expect x to be 3.5, but it would actually be 3.

Also, how does this work with unions? How does this compile:

var x: int | string;

x = 5;
x = "foo";
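
The concern about silent narrowing can be illustrated in today's TypeScript with a branded number type. This is a different technique from the proposal (there is no compiler-inserted coercion), and the names `Int` and `asInt` are hypothetical.

```typescript
// Hypothetical branded type: a number that has passed through asInt().
type Int = number & { readonly __intBrand: true };

// The only way to obtain an Int, so every narrowing is visible at the call site.
function asInt(value: number): Int {
    return (value | 0) as Int;
}

let x: Int = asInt(3.5);   // explicit: the reader can see that truncation happens
// x = 3.5;                // rejected by the type checker: number is not assignable to Int
const widened: number = x; // widening back to number needs no cast
```

With this shape, the "lots of code, then x = 3.5" mistake becomes a compile-time error instead of a silent truncation.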

@joewood

joewood commented Oct 11, 2014

Interesting. Why not take this further and define the primitive types using custom ranges (subrange types) rather than a predefined set of types, as in Pascal, Ada, Modula-2, etc.?
With typedef support, the basic integer types from the spec could be defined in terms of ranges, and the subrange information would allow more validation to be applied to the code at compile time.
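
As a rough sketch of the subrange idea, a range can be modelled as a pair of bounds, and an assignment is safe when the source range lies inside the target range. Everything below (`Subrange`, `fitsIn`, `Percent`) is hypothetical and only models at run time the check a compiler would perform statically.

```typescript
// Hypothetical model of a Pascal-style subrange type.
interface Subrange {
    readonly min: number;
    readonly max: number;
}

const U8: Subrange = { min: 0, max: 255 };
const I8: Subrange = { min: -128, max: 127 };
const Percent: Subrange = { min: 0, max: 100 }; // a user-defined range

// True when every value of `inner` also fits in `outer`, so the assignment is safe.
function fitsIn(inner: Subrange, outer: Subrange): boolean {
    return inner.min >= outer.min && inner.max <= outer.max;
}

const percentToU8 = fitsIn(Percent, U8); // true: no cast needed
const u8ToI8 = fitsIn(U8, I8);           // false: 128..255 do not fit
```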

@apuder

apuder commented Oct 11, 2014

The types are inspired by LLJS and can be mapped nicely to asm.js. Custom ranges are an interesting idea. As you pointed out, it would also require support for typedef, which might open another can of worms. Let's see what other people have to say.

@saschanaz
Contributor

What would this emit?

var num = 512;
var u8array: u8[] = [];
u8array.push(num);

The current proposal seems to do nothing here. Would u8array.push(num % 255) be the result?

@RyanCavanaugh added the "Discussion" label (Issues which may not have code impact) on Oct 24, 2014
@wangyanxing
Author

@saschanaz

The document has been updated.

var num = 512;
var u8array: u8[] = [];
u8array.push(num);

will output:

var num = 512;
var u8array = [];
u8array.push(num & 255);
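
For comparison, the mask `& 255` agrees with `% 256` for non-negative integers, while `% 255` is a different operation. A small check in plain TypeScript (variable names are illustrative):

```typescript
const num = 512;
const masked = num & 255;   // 0: what the updated proposal emits
const mod256 = num % 256;   // 0: agrees with the mask for non-negative integers
const mod255 = num % 255;   // 2: not a u8 wrap-around

// For negative inputs the mask and the remainder differ.
const maskedNeg = -1 & 255; // 255
const mod256Neg = -1 % 256; // -1

const u8array: number[] = [];
u8array.push(masked);       // stores 0
```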

@wangyanxing
Author

@ivogabe
It seems type unions are still under development in a TypeScript branch, so we haven't spent much time considering them. The recently added tuple types are already supported, though.
Thanks for the reminder about type unions; we are considering them carefully. However, from my point of view, type unions don't make sense for primitive types. For example,
if you have

var x: int | double;
x = 1.1; // what's the behavior of this statement?

int|string is ok but int|u8 or int|float is not a good idea because it's ambiguous.

@wangyanxing
Author

@ivogabe
For type casting, simply forbidding int-to-float casts is worth considering.
Currently my proposal is to introduce a "strict mode": when this flag is turned on, type casting for primitive types is forbidden.

Just a proposal.

@wangyanxing
Author

On the other hand, since our final goal is to support asm.js, fixed-size arrays and memory management built on top of such arrays are also under consideration. I'll update the document once the proposal has been decided.

@saschanaz
Contributor

The updated specification now forces the compiler to distinguish built-in functions from others, which can be dangerous when it misses one.

interface Array<T> {
    // New ES6 built-in method
    fill(value: T, start?: number, end?: number): void;
}
/*
if (!Array.prototype.fill)
    Array.prototype.fill = ... (polyfill)
*/

var u8array: u8[] = new Array<u8>(5);
u8array.fill(1024); // What will our compiler do?

@wangyanxing
Author

Yes, that's right.
So I won't distinguish built-in functions from others;
I'll just convert at every function call (with primitive type arguments).

(I'll update the document soon).

But I still need to convert the arguments inside the function if I plan to generate asm.js (ongoing).
The asm.js specification requires such argument conversion at the top of the function body, like this:

function func(arg : int, ........) {
    "use asm";
    arg = arg | 0; // asm.js! convert again
}

func(x | 0, ........); // convert, it's general situation

The situation is subtle; maybe I'll do some detection when compiling. For example, if the asm.js generation flag is turned on, I'll do the conversion arg = arg | 0 inside the function.

Anyway, asm.js is a different story. Without asm.js code generation, it's not necessary to convert arguments inside the function; converting on the caller side is enough.
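
The two placements can be compared in plain TypeScript. This is only a sketch using `number` in place of the proposed types; `doubleIt` is a hypothetical function, and `Array.prototype.push` stands in for any callee the compiler cannot rewrite.

```typescript
// Callee-side coercion: the function normalizes its own int parameter.
function doubleIt(arg: number): number {
    arg = arg | 0;
    return arg * 2 | 0;
}

// Caller-side coercion: needed when the callee cannot be rewritten,
// for example a built-in such as Array.prototype.push.
const u8array: number[] = [];
const value = 1024 + 7;
u8array.push(value & 255); // the caller applies the u8 mask before the call

const viaCallee = doubleIt(3.9); // 6, even though the caller passed 3.9
```

Caller-side conversion covers built-ins and external libraries, at the cost of repeating the coercion at every call site.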

@saschanaz
Contributor

That's good. :D

BTW, the type union problem can possibly be solved by some type of subtype collapsing. The range of double includes the range of int, so int|double should be equivalent to double.
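
A minimal sketch of such collapsing, under several assumptions: int and uint are taken to be 32-bit, double is modelled only by the integers it represents exactly, float is left out, and a union with no containing member yields null. The names `Prim`, `RANGES`, and `collapse` are hypothetical.

```typescript
// Assumed integer value ranges for the proposed types (see the assumptions above).
type Prim = "i8" | "u8" | "i16" | "u16" | "i32" | "u32" | "int" | "uint" | "double";

const RANGES: Record<Prim, [number, number]> = {
    i8: [-128, 127],
    u8: [0, 255],
    i16: [-32768, 32767],
    u16: [0, 65535],
    i32: [-2147483648, 2147483647],
    u32: [0, 4294967295],
    int: [-2147483648, 2147483647],
    uint: [0, 4294967295],
    double: [-9007199254740991, 9007199254740991],
};

// Returns the member whose range contains every other member, or null if none does.
function collapse(members: Prim[]): Prim | null {
    for (const candidate of members) {
        const [lo, hi] = RANGES[candidate];
        const containsAll = members.every(
            (m) => RANGES[m][0] >= lo && RANGES[m][1] <= hi
        );
        if (containsAll) return candidate;
    }
    return null;
}

const c1 = collapse(["int", "double"]); // "double"
const c2 = collapse(["i8", "int"]);     // "int"
const c3 = collapse(["int", "uint"]);   // null: neither range contains the other
```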

@wangyanxing
Author

Nice! Thanks for the link.

@ivogabe
Contributor

ivogabe commented Oct 25, 2014

Another part of the union problem is that normal (non-union) vars get a conversion (var number = ... | 0) but how would that be for unions?

var stringOrInt = "string" | 0; // Doesn't work

@saschanaz
Contributor

@ivogabe Would you explain it more? Are we going to support explicit type casting by ... | 0?

@ivogabe
Contributor

ivogabe commented Oct 25, 2014

The code above was actually javascript, so I meant: how would this compile?

var stringOrInt: string | int = "string";

or

var stringOrInt: string | int = 3.14;

Because as you can see in the first post, assignments to vars get a conversion, but that wouldn't be possible with union types. (That's also a reason I wouldn't prefer implicit casting)

@saschanaz
Contributor

I think the compiler would be able to read var stringOrInt: string|int = 3.14; as var stringOrInt: string|int = (int)3.14; (so 3.14 | 0), while reading stringOrInt = "string"; as is. I still prefer explicit casting over implicit casting, however...

@wangyanxing
Author

Yep, this is an issue if working with type union.
Thanks for the explanation. I'll reconsider it.

@saschanaz
Contributor

Returning to the first type union problem, I'm not sure what will happen here.

var num: int | uint = 12.1; // This cannot be solved by subtype collapsing.

I think we should give an error in this case and force explicit type casting as @ivogabe said.

var num: int | uint = <int>12.1; // Now we know what will happen here.

@wangyanxing
Author

Yes, if using type union, implicit cast by the compiler is really a bad idea.

@saschanaz
Contributor

I also have a question about calculations:

var u1: i8 = 127;
var m = u1 * u1; // What's the type of m? Still i8, or implicitly converted to i16?
var d = u1 / 3; // int or float?

@wangyanxing
Author

As in C, m and d are both i32.
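
Assuming an i32 result is coerced with `| 0`, as for the i32 parameter in the first post, the two expressions would evaluate as follows in plain TypeScript. The last line is a hypothetical comparison showing what an i8 result would have been instead.

```typescript
const u1 = (127 & 255) << 24 >> 24; // i8 value 127
const m = u1 * u1 | 0;              // 16129: no wrap-around at the i8 boundary
const d = u1 / 3 | 0;               // 42: the fractional part of 42.33... is dropped
const wrappedI8 = (u1 * u1 & 255) << 24 >> 24; // 1: the result if m were kept as i8
```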

@apuder

apuder commented Oct 31, 2014

We discussed some of the issues that were mentioned in earlier comments. Type union can be supported in certain circumstances where type collapsing is possible. For something like (i8 | u8) where type collapsing cannot be applied, we propose to generate a compile time error. Also the example (string | int) is not a problem. In this case it will just not be possible to emit asm.js compatible code.

Lastly, type coercion for primitive types during function calls can be done on the side of the caller or callee. It is more efficient (in terms of code size as well as asm.js compatibility) to do this on the side of the callee. For builtin functions (such as Array.push()) or external type libraries this needs to happen on the side of the caller. This can easily be implemented in the compiler.

@benjcooley

Hi @wangyanxing. I've experimented with your branch and there are a few issues, but it looks very promising.

One issue is that for array types, the indexer of the array is assigned the type of the array, which I think is unintended. Other issues revolve around inference and coercions to basic types, which do not seem to work correctly in object/array initializers.

In any case, I think this is a worthy addition. There are several benefits:

  1. This ultimately allows the JS VM to emit more efficient code for integer operations.
  2. It allows TS code to more easily integrate with "emscripten" C++ code, and more effectively strongly type method parameters that call into C++.
  3. It allows TS code to be compiled more efficiently to mobile platforms and allows TS to be used as a cross Web/Mobile development language that can offer performance closer to a native application on mobile.

I would like to assist you potentially with moving this branch forward and making this more functional. I think even if this was out of scope for the MS team, it would make a fairly worthy branch to maintain over the long term for teams interested in C++ integration and high performance mobile applications.

@Griffork

I'm curious, how does:

var a: string|int = "mystring";
var b: string|int = a; //string or int?

get resolved?

@apuder

apuder commented Dec 23, 2014

I think your question is not related to primitive types but how TypeScript plans to define type compatibility for type unions. To my understanding, the type of b is just as you declared it: string | int which is compatible with the type of a, so the assignment should be fine. Note that it won't be possible to emit asm.js for such a declaration though.

@Griffork

But in this proposal, assignments and arithmetic on integers are supposed to have extra code inserted (for asm.js). I was wondering what would happen when you can't be sure the type is an int (or another type that needs special treatment).

Surely in this case you might get unexpected results?

Better example :

var a: string|number = "mystring";
var b: string|int8 = <string|int8>a; //string or int8?

@Griffork

An addition since I can't edit on my phone.

What are the inherent problems in introducing types that won't have the same behaviour in all cases, and is it worth it?

Should there be a compiler error if you try to union these primitive types since they can't actually function in a union?

Is there ever a benefit to being able to union int8 and string?

@benjcooley

There are a few possible ways to handle this.

The first is simply to disallow unions of primitive types. The primitive type would be erased when assigned to a union of string | number and would need to be explicitly coerced back to int.

The second would be to allow one numeric type per union which could be number or a primitive. Number values are then coerced implicitly to the primitive type when assigned to the union and also when read from a type guarded expression of that primitive type. This would mean that typeguards of a primitive type would translate to "number".

The third is to allow any mix of primitive and number types and either not coerce values on assignment, or coerce to the least restrictive type. Then, in type guard expressions, match the first guard whose primitive type's range fits the value stored.

@saschanaz
Contributor

The fourth is to emit this:

var a = "mystring";
var b = typeof a === "number" ? (a & 255) << 24 >> 24 : a;

One may think this is too long...
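
A runnable sketch of this kind of typeof-guarded coercion, written in plain TypeScript with `number` in place of int8: only number values are wrapped into the i8 range, and strings pass through unchanged. The function name `toStringOrI8` is hypothetical.

```typescript
// Hypothetical run-time coercion for a value typed string | int8.
function toStringOrI8(value: string | number): string | number {
    return typeof value === "number"
        ? (value & 255) << 24 >> 24 // numbers are wrapped into the i8 range
        : value;                    // strings pass through untouched
}

const s = toStringOrI8("mystring"); // "mystring"
const n = toStringOrI8(200);        // -56
```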

@wangyanxing
Author

Hi @benjcooley, I'm quite excited that you are interested in our project.
Yes, I've noticed the issues you pointed out, and I'm working on fixing them.
Regarding the benefits you mentioned, we all think low-level support for JS/TS is necessary for future web applications. We'd be happy to work with you.

@wangyanxing
Author

For type unions, we now simply disallow them for primitive types, since they easily lead to ambiguous situations.

@wangyanxing
Author

We've considered @benjcooley's second solution. It works, but the inconsistent semantics feel like a code smell to us.
I don't think type unions make much sense for primitive types, so in the end we simply decided to forbid them.

@benjcooley

@wangyanxing - I think forbidding it is a good solution for now, possibly the future as well. In any case it would be a better idea to concentrate on ensuring that the base implementation is robust and has a complete set of tests.
