Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Typescript Support? #33

Open
keeslinp opened this issue Aug 15, 2019 · 13 comments
Open

Typescript Support? #33

keeslinp opened this issue Aug 15, 2019 · 13 comments

Comments

@keeslinp
Copy link

Hey, I love the idea behind this and RESSA. I'm hoping to use this in a project of mine and I'm mostly working with typescript. Would you be opposed to a PR adding typescript support? It might be a while until I get to that portion of the project but I figured I'd float it out there just in case you don't want to support typescript.

@FreeMasen
Copy link
Collaborator

Thanks for the question!

Overall I would love to be able to support typescript, the big question is what would that look like?

Currently ress can handle tokenizing a typescript file barring the use of experimental decorators, which would probably be the first step.

I can see a few ways this could go

  • a new crate could be developed with a TS parser
  • this project could export a new parser for TS
  • the current parser could provide a config flag for allowing types

Overall I am not sure what the feasibility of each option might be, I’d be happy for any insight on the matter.

@keeslinp
Copy link
Author

I'm not super familiar with how the crate works in its entirety, but from the brief digging I did when I first looked it shouldn't be too hard to add typescript support. The nice thing about typescript is that it is a superset of javascript, so we would just need to add new tokens and enhance the parser to recognize those tokens. If we don't want those to be allowed in vanilla js then a config flag could prevent it from being allowed in vanilla js.

I don't think we'd need to support decorators right off the bat because that is experimental even in typescript.

@FreeMasen
Copy link
Collaborator

I might throw together some tests this week but I believe in its current state this crate should be able to tokenize a typescript file with no problem. Some parts of it might be slightly less ergonomic than a tokenizer built for typescript directly, it should be enough to build a parser around.

Take this example

export function stuff<T>(a: T): T {
    return a * 10;
}

would tokenize to the following

[
Token::Keyword(Keyword::Export),
Token::Keyword(Keyword::Function),
Token::Ident(Ident("stuff")),
Token::Punct(Punct::LessThan),
Token::Ident(Ident("T"),
Token::Punct(Punct::GreaterThan),
Token::Punct(Punct::OpenParen),
Token::Ident(Ident("a")),
Token::Punct(Punct::Colon),
Token::Ident(Ident("T"),
Token::Punct(Punct::CloseParen),
Token::Punct(Punct::Colon),
Token::Ident(Ident("T"),
Token::Punct(Punct::OpenBrace),
Token::Keyword(Keyword::Return),
Token::Ident(Ident("a")),
Token::Punct(Punct::Asterisk),
Token::Number(Number("10")),
Token::Punct(Punct::SemiColon),
Token::Punct(Punct::CloseBrace),
]

So, generics would be kind of a pain but I don't see any reason it shouldn't work.

@FreeMasen
Copy link
Collaborator

So, I am sure there are more edge cases to explore but I threw together a preliminary test today.

this repo is an example project that pulls down the typescript repo, unzips it and then loops over the files in this directory

You can see the output from running it here

@keeslinp
Copy link
Author

keeslinp commented Aug 27, 2019 via email

@FreeMasen
Copy link
Collaborator

I am curious what keywords you believe are missing?

If you are interested in having RESSA be able to parse TS, I will move this issue over to that repo, once any remaining issues are solved at the tokenization step.

@keeslinp
Copy link
Author

I was looking through the output you linked and it appears that type is being recognized as an ident instead of a keyword. I'm not sure what keywords could be missing.

https://github.com/FreeMasen/ress_ts_test/blob/1225973aebbb806d3710f520d72fbe3849af0652/out.log#L170

Yeah, moving over there seems like a good idea.

@FreeMasen
Copy link
Collaborator

I'm going to go ahead and move this over.

My initial reaction is that type should be treated as a contextual keyword (like async is), unless it is eventually added to the TC39 spec

@FreeMasen FreeMasen transferred this issue from rusty-ecma/RESS Aug 27, 2019
@allsey87
Copy link

Depending on how much work this is involved, I might be able to work on this a bit if we can also handle typescript definition files. This is what I have found so far by probing the code a bit.

export let SomeVariable: number;

the parser currently chokes in parse_lexical_decl when it encounters a column denoting the start of the type specification. Similarly, when a type specifier on a class field is encounted, e.g., someField: SomeType; the parser chokes in parse_class_el.

For a class function (like the following), the parser chokes on the column after someArgument and if removed, again just before SomeReturnType

someMethod( someArgument: SomeType ): SomeReturnType;

For the former, parsing breaks in parse_property_method and for the latter, in one of the methods for parsing the method body.

So basically we need to extend the parser to handle the colon in these contexts. @FreeMasen do you have any recommendations on how I could proceed?

@allsey87
Copy link

These are a few tests that I have put together that currently fail

extern crate ressa;

#[test]
fn type_annotation_in_export() {
    let tsd = "
export let someNumber: number;
";
    let _ = ressa::Parser::builder()
        .module(true)
        .js(tsd)
        .build()
        .expect("failed to create parser")
        .parse()
        .expect("failed to parse tsd");
}

#[test]
fn type_annotation_in_field() {
    let tsd = "
export class SomeClass {
   someField : number;
};
";
    let _ = ressa::Parser::builder()
        .module(true)
        .js(tsd)
        .build()
        .expect("failed to create parser")
        .parse()
        .expect("failed to parse tsd");
}

#[test]
fn type_annotation_in_method_args() {
    let tsd = "
export class SomeClass {
   someMethod(someArgument: number);
};
";
    let _ = ressa::Parser::builder()
        .module(true)
        .js(tsd)
        .build()
        .expect("failed to create parser")
        .parse()
        .expect("failed to parse tsd");
}

#[test]
fn type_annotation_in_method_return() {
    let tsd = "
export class SomeClass {
   someMethod(): number;
};
";
    let _ = ressa::Parser::builder()
        .module(true)
        .js(tsd)
        .build()
        .expect("failed to create parser")
        .parse()
        .expect("failed to parse tsd");
}

@allsey87
Copy link

So I am willing to put several hours into getting typescript support into RESSA, but I do need some support or at least acknowledgement that this feature is still desirable. As the situation stands, I am currently having to look into using swc but I still have a strong preference to use/contribute to this project.

@FreeMasen
Copy link
Collaborator

@allsey87 Thank you for getting started on this. I would be happy to work with you to implement a typescript parsing option for this project! Sorry it has taken me so long to respond.

One of the big questions that I still don't feel like I have a good answer for is how do we output something that is meaningful for a typescript file but doesn't complicate the JavaScript user experience. Currently the AST for javascript isn't going to have the option of adding a "type ident" for expressions.

If we choose to extend the AST in place and add an optional type_identifier this would drastically change the AST. If we define a seperate TypeScript AST, we would need to maintain a second parser to that returns this new AST from next.

I have a few other ideas about how this could go but I haven't had a chance to really flesh them out. If you have an idea about how to solve this I'd love to hear it!

@allsey87
Copy link

allsey87 commented May 18, 2020

These design issues can be tricky to solve and to be honest, coming from a background of microcontroller programming and C/C++, I only understand and use Javascript and Typescript at a very high level and I am not in a knowledgeable-enough position to be making any suggestions on this front. I was under the impression initially that Typescript would just involve adding an additional field for a type identifier against several node types in the AST (and that typescript could be just enabled as a dialect), however, having looked at swc a bit, I can see now that it is not so straightforward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants