Add AssemblyScript #5640
- '@unmanaged'
- '// @ts-ignore: decorator'
- '[iuf](8|16|32|64)'
- 'usize'
- 'usize'
- '\b[iu]size'
why only `\b` at the beginning and not the end?
because we should accept `isize` or `usize` but not `iusize` or `uisize`
yes, that's the `\b` at the front, but why not `\b[iu]size\b`?
You could add this at the end, but usually it's unnecessary. It depends on how tokens are matched: per line or per word.
So something like `[\s:<][iu]size[\s>)]` would be more accurate. (The `:` is for variable definitions like `const ptr:usize`, the `<>` for generics like `Array<usize>` and the `)` for `(ptr: usize)=>void`.)
That's true. It could be a false positive. But at the same time this should also be valid: `:isize`, for example `function foo():isize`.
Also `=>usize`
All possible variants: `Array<usize>`, `()=>usize`, `(x:isize)`, `:isize`, `: isize{`, `as isize;`, `isize(x)`, `<isize>x`, `[isize,isize]`, `{x:isize}`
It seems like a more appropriate border would be `[^.\w]`, so it doesn't match inside other words or as a property.
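The boundary options weighed in this thread can be compared mechanically. The following is a minimal sketch in TypeScript, assuming JavaScript regex semantics are close enough to the engine the heuristics run on for these particular patterns. The `candidates` map, the `hits` helper and the sample strings are made up for illustration; the positive samples embed the variants listed above in full statements.

```typescript
// Candidate patterns proposed in the review comments.
const candidates = {
  leadingBoundary: /\b[iu]size/,
  bothBoundaries: /\b[iu]size\b/,
  explicitDelimiters: /[\s:<][iu]size[\s>)]/,
  notDotOrWord: /[^.\w][iu]size[^.\w]/,
};

type CandidateName = keyof typeof candidates;

// Statements that use isize/usize as a type and should be detected.
const shouldMatch: string[] = [
  "let a: Array<usize> = [];",
  "let f: ()=>usize;",
  "function g(x:isize): void {}",
  "function foo():isize {",
  "let b = x as isize;",
  "let c = isize(x);",
  "let d = <isize>x;",
  "let e: [isize,isize];",
  "let o: {x:isize};",
];

// Hypothetical lines that should NOT be detected: a malformed type name,
// a property access, and a longer identifier that merely starts with "usize".
const shouldNotMatch: string[] = [
  "const v: iusize = 0;",
  "obj.usize = 1;",
  "const usizes = 3;",
];

// Returns the samples that a given candidate pattern matches.
function hits(name: CandidateName, samples: string[]): string[] {
  return samples.filter((sample) => candidates[name].test(sample));
}

for (const name of Object.keys(candidates) as CandidateName[]) {
  const found = hits(name, shouldMatch).length;
  const falsePositives = hits(name, shouldNotMatch).length;
  console.log(`${name}: ${found}/${shouldMatch.length} found, ${falsePositives} false positives`);
}
```

On these samples the `[^.\w]` border finds every variant without flagging the property access or the longer identifier, while the explicit delimiter list misses forms such as `as isize;` and `[isize,isize]`. One caveat of the `[^.\w]` form is that it needs a character on both sides, so a type name at the very start or end of the input would not match.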
- '@final'
- '@unmanaged'
- '// @ts-ignore: decorator'
- '[iuf](8|16|32|64)'
- '[iuf](8|16|32|64)'
- '\b[iuf](8|16|32|64)'
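To illustrate what the leading `\b` in this suggestion changes, here is a small sketch in TypeScript. The sample lines and the names `unanchored`, `anchored` and `sizedSamples` are hypothetical, and JavaScript regex behaviour is assumed to match the target engine for these simple patterns.

```typescript
// The pattern before and after the suggested change.
const unanchored = /[iuf](8|16|32|64)/;
const anchored = /\b[iuf](8|16|32|64)/;

// Hypothetical source lines.
const sizedSamples = {
  typed: "let x: i32 = 1;",                  // a real sized type
  identifier: "let buf16 = new Uint8Array(2);", // "f16" only appears inside "buf16"
  suffixed: "const f64s = [1.5];",           // identifier that starts with "f64"
};

for (const [label, line] of Object.entries(sizedSamples)) {
  console.log(label, unanchored.test(line), anchored.test(line));
}
```

The word boundary stops the pattern from firing inside an identifier such as `buf16`, but with no border on the right-hand side it still matches the start of `f64s`.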
pattern:
- '@inline'
- '@final'
- '@unmanaged'
Also
- '@operator'
- '@global'
and `@unsafe` and maybe even `@builtin`
Wow!! Over 20 samples 😱 We really don't need that many, especially the really big files, which seem to have a lot of duplicate content from the others. Please cut down the samples to only those that are most representative of the language and real-world use. 2-5 is plenty.
Isn't AssemblyScript quite literally a clean subset of TypeScript? That doesn't make it its own language, and is going to be a nightmare to disambiguate from "regular" TypeScript; i.e., you'll wind up with projects classified as 80% AssemblyScript and perhaps 20% TypeScript due to the presence of `.ts` files that didn't match a heuristic.
@Alhadis Yes it is, except for a few very small edge cases. @lildude I included so many examples because it is so hard to distinguish, and I even removed some of them because they were valid TS and AS, even for me.
This should tell you that no real (language-level) difference exists between AssemblyScript and TypeScript; the former is simply a restrictive and specialised application of the latter, not unlike asm.js. From my reading, the only difference between the two technologies is that AssemblyScript mandates strictly-typed code and abstinence from JavaScript's more dynamic features (I'm guessing stuff like
AssemblyScript has not only a stricter type system, but also some semantic features and fixes that TypeScript cannot afford. Some of these differences are presented in this article: https://blog.suborbital.dev/assemblyscript-vs-typescript. Btw good point about
All of those differences are semantic in nature, and have nothing to do with syntax. That is, they pertain to the interpretation of TypeScript, and many of these differences can also be enforced by a conventional TypeScript compiler, IIRC. Sorry, I'm not seeing anything that warrants recognition as a discrete language.
Perhaps for reference, the most prominent syntactic difference is that decorators like
This is what I'm currently at:
- extensions: ['.ts']
  rules:
  - language: XML
    pattern: '<TS\b'
  - language: AssemblyScript
    and:
    - negative_pattern:
      # Invalid types in AssemblyScript
      - '[^"`''.\w]undefined[^"`''.\w]'
      - '[^.\w]any[^.\w]'
      - '[^.\w]unknown[^.\w]'
      # No eval in AssemblyScript
      - '[^.\w]eval\s*\('
    - pattern:
      # Builtin decorators
      - '@inline'
      - '@final'
      - '@unmanaged'
      - '@operator'
      - '@global'
      - '@unsafe'
      - '@builtin'
      - '^// @ts-ignore: decorator$'
      # number types only available in AssemblyScript
      - '[^"`''.\w][iuf](8|16|32|64)[^"`''.\w]'
      - '[^"`''.\w][iu]size[^"`''.\w]'
  - language: TypeScript
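To make the intent of this draft rule easier to follow, here is a rough model of it in TypeScript. This is not Linguist's implementation (Linguist is written in Ruby); it assumes that rules are tried in order, that `and` requires every sub-rule to hold, that a `negative_pattern` holds when none of its entries match, and that a `pattern` holds when at least one entry matches. The names `forbidden`, `required` and `classifyTsFile` are hypothetical.

```typescript
// Assumed reading of negative_pattern: constructs that rule out AssemblyScript.
const forbidden: RegExp[] = [
  /[^"`'.\w]undefined[^"`'.\w]/,
  /[^.\w]any[^.\w]/,
  /[^.\w]unknown[^.\w]/,
  /[^.\w]eval\s*\(/,
];

// Assumed reading of pattern: at least one AssemblyScript-only construct.
const required: RegExp[] = [
  /@inline/, /@final/, /@unmanaged/, /@operator/, /@global/, /@unsafe/, /@builtin/,
  // The "m" flag makes ^ and $ match at line boundaries, as Ruby does by default.
  /^\/\/ @ts-ignore: decorator$/m,
  /[^"`'.\w][iuf](8|16|32|64)[^"`'.\w]/,
  /[^"`'.\w][iu]size[^"`'.\w]/,
];

type Language = "XML" | "AssemblyScript" | "TypeScript";

// Tries the three rules in the order they appear in the draft.
function classifyTsFile(source: string): Language {
  const anyMatch = (patterns: RegExp[]): boolean => patterns.some((re) => re.test(source));
  if (/<TS\b/.test(source)) return "XML";
  if (!anyMatch(forbidden) && anyMatch(required)) return "AssemblyScript";
  return "TypeScript"; // fallback rule without a pattern
}

console.log(classifyTsFile("export function add(a: i32, b: i32): i32 {\n  return a + b;\n}\n"));
console.log(classifyTsFile("export function add(a: number, b: number): number {\n  return a + b;\n}\n"));
console.log(classifyTsFile("function f(x: any): i32 { return 0; }\n"));
```

Under this reading, a single `any` anywhere in a file sends it to TypeScript even when sized types are present, which trades false negatives for AssemblyScript against fewer TypeScript files being mislabelled.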
I think AssemblyScript counts as its own language, but I can see that if there is no way to safely distinguish it, it can't be added here. Also, I think that in cases where a file in an AssemblyScript project is misclassified (which will surely happen) and is valid TypeScript, the classification is still correct. For me it would be enough to get the heuristics to a point where no TypeScript file is detected as AS, because that would probably confuse a lot of people, but false negatives in AS are not that big of a deal (for me).
That's only because the project's name makes it sound like a language. 😉 Would you hold the same perspective if the project were named
You're right, I'm afraid. However,
```assemblyscript
export function negate(n: i32): i32 {
  return -n;
}
```
```typescript
export function negate(n: i32): i32 {
  return -n;
}
```
negative_pattern:
- '\s+undefined\s+'
- '\s+any\s+'
- '\s+unknown\s+'
Btw I'm not sure about `unknown`. It is quite likely that it will still be used, at least for variadic function declarations.
Also some people in the AS community have implemented various forms of `unknown`, and it could conceivably be supported officially later once semantics are ironed out.
I'm not sure an alias of TypeScript would be correct, because we already have users who expect to be able to use any TS library in their AS project, which is sadly not the case. And letting them find mostly TS code when searching for AS wouldn't help with this. Would it be possible to add it, but without detection via heuristics, so it has to be set via overrides?
Just to be clear: AssemblyScript is not a language. It's a compilation target for TypeScript with the same syntax, semantics, and compiler. Certain aspects of the runtime may differ (predefined types, a more restrictive standard library, etc), but runtimes aren't languages. I noticed this PR was submitted in response to @comtechnet's issue at AssemblyScript/assemblyscript#2127:
The most we could do is add AssemblyScript as an alias of TypeScript, for the reasons I outlined earlier. However, this will not show up in repository listings as a project's language; i.e., here: Long story short, GitHub doesn't support user-defined languages (including arbitrary labels or descriptions). Instead, @comtechnet (and any other AssemblyScript users) are encouraged to use topics to improve "discoverability". Just tag your project with #assemblyscript.
That's totally false. AssemblyScript is a TypeScript-like language and a compiler to WebAssembly, which is the compilation target. It has its own syntax, a subset close to TS but with some limitations and extensions (like global function decorators, operator overloads, and native types like i64/u64/i8/u16, which TS doesn't support). It also has its own parser, compiler and emitter. It has similar syntax but sometimes totally different semantics.
@MaxGraey A compiler and its output are completely independent of its input.
It can output binaries or shared objects for a variety of architectures, which serves as the analogue of a compiler targeting JavaScript or WASM as its output.
Yes, that ties in with what I said about runtime details:
Yes, but how does that relate to the claim that TS and AS are the same? Every language should have at least one compiler. Some of them, like Clang or GCC, can compile C and C++ at the same time. But I don't really see how this cancels out the fact that AS and TS have pretty serious differences in semantics.
For example, Flow and TS have many more similarities with each other. They have similar types, syntax, and compilation target (JS). Same extension ( And here:
When I say "semantics", I'm talking about the exact behaviour and logic of language-level elements: classes, functions, variables, expressions, scoping, etc. Fundamental elements of JavaScript, and by extension, TypeScript.
No, we're not distinguishing syntax here.
I also meant that cancelled PR #4515
That's a different language of the same name.
Ah, you're right!
I agree that it's a nightmare to detect. Moreover, as a programmer I prefer a different extension for AssemblyScript in my personal projects. Here is an issue about supporting the different extension that I am using
AssemblyScript is still a pretty young language. It follows the WebAssembly specification and has implemented things like operator overloading, SIMD, reference types, its own standard library, and other interfaces that will never be in TypeScript. It is also incompatible with TypeScript, and it is better to write a new project from scratch than to adapt an existing one. WebAssembly technology is growing and the JavaScript community is huge. Here are statistics from Google Trends, NPM downloads over 5 years, and a video from GitHub itself about it: https://www.youtube.com/watch?v=97ej9-CE3Gc Will you add it to GitHub if we change the extension to a unique one? @Alhadis, please give it a try! Because I think we can make this decision, and it's the right time to finally change the extension (the community used .ts just for linting LOL).
As an AssemblyScript contributor, changing the extension and thus having to replicate the hundreds of errors that TypeScript lints for in the AssemblyScript compiler would be both costly performance-wise and would bloat WASM bootstrapping sizes. I don't intend to hijack this discussion and turn this into a pros-and-cons-of-changing-the-extension discussion so I'd be happy to voice and reiterate my stance on this in the Discord.
Description
Adds AssemblyScript language and heuristics.
The heuristics are really rough for now and will be changing.
Checklist:
I am adding a new language.
I am changing the color associated with a language
Still discussing