Add TextMate Language Pack feature #374

Merged
merged 1 commit into from Jun 13, 2022

howlger
Contributor

@howlger howlger commented May 1, 2022

Syntax highlighting and more for programming languages and file formats, using TextMate grammar and language configuration files taken from the Visual Studio Code project.

I don't know whether it is allowed and legal to take files from the Visual Studio Code project in this way, and if so, what information has to be provided and where. Could this please be checked by intellectual property experts? For details on which files are taken from where, see below.

The content is created and can be updated via the Maven build script org.eclipse.tm4e.language_pack/_update/pom.xml.

Some languages have been commented out due to issues.

See also my Language Pack for Eclipse project.

List of files taken from Visual Studio Code

The following TextMate grammar and language configuration files have been taken from
Visual Studio Code's built-in extensions and slightly modified.
Most of the grammar files were in turn converted by the Visual Studio Code project from other projects (see column Original).

| File | Taken from extension | Taken from file | Original |
| --- | --- | --- | --- |
| bat/language-configuration.json | bat | language-configuration.json | - |
| bat/batchfile.tmLanguage.json | bat | batchfile.tmLanguage.json | batchfile.cson |
| clojure/language-configuration.json | clojure | language-configuration.json | - |
| clojure/clojure.tmLanguage.json | clojure | clojure.tmLanguage.json | clojure.cson |
| coffeescript/language-configuration.json | coffeescript | language-configuration.json | - |
| coffeescript/coffeescript.tmLanguage.json | coffeescript | coffeescript.tmLanguage.json | coffeescript.cson |
| cpp/language-configuration.json | cpp | language-configuration.json | - |
| cpp/c.tmLanguage.json | cpp | c.tmLanguage.json | c.tmLanguage.json |
| cpp/cpp.embedded.macro.tmLanguage.json | cpp | cpp.embedded.macro.tmLanguage.json | cpp.embedded.macro.tmLanguage.json |
| cpp/cpp.tmLanguage.json | cpp | cpp.tmLanguage.json | cpp.tmLanguage.json |
| cpp/platform.tmLanguage.json | cpp | platform.tmLanguage.json | Platform.tmLanguage |
| cpp/cuda-cpp.tmLanguage.json | cpp | cuda-cpp.tmLanguage.json | cuda-cpp.tmLanguage.json |
| csharp/language-configuration.json | csharp | language-configuration.json | - |
| csharp/csharp.tmLanguage.json | csharp | csharp.tmLanguage.json | csharp.tmLanguage |
| css/language-configuration.json | css | language-configuration.json | - |
| css/css.tmLanguage.json | css | css.tmLanguage.json | css.cson |
| dart/language-configuration.json | dart | language-configuration.json | - |
| dart/dart.tmLanguage.json | dart | dart.tmLanguage.json | dart.json |
| diff/language-configuration.json | diff | language-configuration.json | - |
| diff/diff.tmLanguage.json | diff | diff.tmLanguage.json | Diff.plist |
| docker/language-configuration.json | docker | language-configuration.json | - |
| docker/docker.tmLanguage.json | docker | docker.tmLanguage.json | Dockerfile.tmLanguage |
| fsharp/language-configuration.json | fsharp | language-configuration.json | - |
| fsharp/fsharp.tmLanguage.json | fsharp | fsharp.tmLanguage.json | fsharp.json |
| git-base/git-commit.language-configuration.json | git-base | git-commit.language-configuration.json | - |
| git-base/git-rebase.language-configuration.json | git-base | git-rebase.language-configuration.json | - |
| git-base/ignore.language-configuration.json | git-base | ignore.language-configuration.json | - |
| git-base/git-commit.tmLanguage.json | git-base | git-commit.tmLanguage.json | Git Commit Message.tmLanguage |
| git-base/git-rebase.tmLanguage.json | git-base | git-rebase.tmLanguage.json | Git Rebase Message.tmLanguage |
| git-base/ignore.tmLanguage.json | git-base | ignore.tmLanguage.json | |
| go/language-configuration.json | go | language-configuration.json | - |
| go/go.tmLanguage.json | go | go.tmLanguage.json | generated.tmLanguage.json |
| groovy/language-configuration.json | groovy | language-configuration.json | - |
| groovy/groovy.tmLanguage.json | groovy | groovy.tmLanguage.json | Groovy.tmLanguage |
| handlebars/language-configuration.json | handlebars | language-configuration.json | - |
| handlebars/Handlebars.tmLanguage.json | handlebars | Handlebars.tmLanguage.json | Handlebars.json |
| hlsl/language-configuration.json | hlsl | language-configuration.json | - |
| hlsl/hlsl.tmLanguage.json | hlsl | hlsl.tmLanguage.json | hlsl.json |
| html/language-configuration.json | html | language-configuration.json | - |
| html/html.tmLanguage.json | html | html.tmLanguage.json | HTML.plist |
| html/html-derivative.tmLanguage.json | html | html-derivative.tmLanguage.json | HTML %28Derivative%29.tmLanguage |
| ini/ini.language-configuration.json | ini | ini.language-configuration.json | - |
| ini/properties.language-configuration.json | ini | properties.language-configuration.json | - |
| ini/ini.tmLanguage.json | ini | ini.tmLanguage.json | Ini.plist |
| java/language-configuration.json | java | language-configuration.json | - |
| java/java.tmLanguage.json | java | java.tmLanguage.json | java.cson |
| javascript/javascript-language-configuration.json | javascript | javascript-language-configuration.json | - |
| javascript/JavaScriptReact.tmLanguage.json | javascript | JavaScriptReact.tmLanguage.json | TypeScriptReact.tmLanguage |
| javascript/JavaScript.tmLanguage.json | javascript | JavaScript.tmLanguage.json | TypeScriptReact.tmLanguage |
| javascript/Regular Expressions (JavaScript).tmLanguage | javascript | Regular Expressions (JavaScript).tmLanguage | |
| json/language-configuration.json | json | language-configuration.json | - |
| json/JSON.tmLanguage.json | json | JSON.tmLanguage.json | JSON.tmLanguage |
| json/JSONC.tmLanguage.json | json | JSONC.tmLanguage.json | JSON.tmLanguage |
| julia/language-configuration.json | julia | language-configuration.json | - |
| julia/julia.tmLanguage.json | julia | julia.tmLanguage.json | julia_vscode.json |
| latex/latex-language-configuration.json | latex | latex-language-configuration.json | - |
| latex/TeX.tmLanguage.json | latex | TeX.tmLanguage.json | TeX.tmLanguage.json |
| latex/LaTeX.tmLanguage.json | latex | LaTeX.tmLanguage.json | LaTeX.tmLanguage.json |
| latex/Bibtex.tmLanguage.json | latex | Bibtex.tmLanguage.json | Bibtex.tmLanguage.json |
| latex/markdown-latex-combined.tmLanguage.json | latex | markdown-latex-combined.tmLanguage.json | markdown-latex-combined.tmLanguage.json |
| latex/cpp-grammar-bailout.tmLanguage.json | latex | cpp-grammar-bailout.tmLanguage.json | cpp-grammar-bailout.tmLanguage.json |
| less/language-configuration.json | less | language-configuration.json | - |
| less/less.tmLanguage.json | less | less.tmLanguage.json | less.cson |
| log/log.tmLanguage.json | log | log.tmLanguage.json | log.tmLanguage |
| lua/language-configuration.json | lua | language-configuration.json | - |
| lua/lua.tmLanguage.json | lua | lua.tmLanguage.json | Lua.plist |
| make/language-configuration.json | make | language-configuration.json | - |
| make/make.tmLanguage.json | make | make.tmLanguage.json | Makefile.plist |
| markdown-basics/language-configuration.json | markdown-basics | language-configuration.json | - |
| markdown-basics/markdown.tmLanguage.json | markdown-basics | markdown.tmLanguage.json | markdown.tmLanguage |
| markdown-math/md-math.tmLanguage.json | markdown-math | md-math.tmLanguage.json | TeX.tmLanguage.json |
| markdown-math/md-math-block.tmLanguage.json | markdown-math | md-math-block.tmLanguage.json | |
| markdown-math/md-math-inline.tmLanguage.json | markdown-math | md-math-inline.tmLanguage.json | |
| objective-c/language-configuration.json | objective-c | language-configuration.json | - |
| objective-c/objective-c.tmLanguage.json | objective-c | objective-c.tmLanguage.json | objc.tmLanguage.json |
| objective-c/objective-c++.tmLanguage.json | objective-c | objective-c++.tmLanguage.json | objcpp.tmLanguage.json |
| perl/perl.language-configuration.json | perl | perl.language-configuration.json | - |
| perl/perl6.language-configuration.json | perl | perl6.language-configuration.json | - |
| perl/perl.tmLanguage.json | perl | perl.tmLanguage.json | Perl.plist |
| perl/perl6.tmLanguage.json | perl | perl6.tmLanguage.json | Perl 6.tmLanguage |
| php/language-configuration.json | php | language-configuration.json | - |
| php/php.tmLanguage.json | php | php.tmLanguage.json | php.cson |
| php/html.tmLanguage.json | php | html.tmLanguage.json | html.cson |
| powershell/language-configuration.json | powershell | language-configuration.json | - |
| powershell/powershell.tmLanguage.json | powershell | powershell.tmLanguage.json | PowerShellSyntax.tmLanguage |
| pug/language-configuration.json | pug | language-configuration.json | - |
| pug/pug.tmLanguage.json | pug | pug.tmLanguage.json | Pug.JSON-tmLanguage |
| python/language-configuration.json | python | language-configuration.json | - |
| python/MagicPython.tmLanguage.json | python | MagicPython.tmLanguage.json | MagicPython.tmLanguage |
| python/MagicRegExp.tmLanguage.json | python | MagicRegExp.tmLanguage.json | MagicRegExp.tmLanguage |
| r/language-configuration.json | r | language-configuration.json | - |
| r/r.tmLanguage.json | r | r.tmLanguage.json | r.json |
| razor/language-configuration.json | razor | language-configuration.json | - |
| razor/cshtml.tmLanguage.json | razor | cshtml.tmLanguage.json | cshtml.json |
| restructuredtext/language-configuration.json | restructuredtext | language-configuration.json | - |
| restructuredtext/rst.tmLanguage.json | restructuredtext | rst.tmLanguage.json | rst.tmLanguage.json |
| ruby/language-configuration.json | ruby | language-configuration.json | - |
| ruby/ruby.tmLanguage.json | ruby | ruby.tmLanguage.json | Ruby.plist |
| rust/language-configuration.json | rust | language-configuration.json | - |
| rust/rust.tmLanguage.json | rust | rust.tmLanguage.json | rust.tmLanguage.json |
| scss/language-configuration.json | scss | language-configuration.json | - |
| scss/scss.tmLanguage.json | scss | scss.tmLanguage.json | scss.cson |
| scss/sassdoc.tmLanguage.json | scss | sassdoc.tmLanguage.json | sassdoc.cson |
| search-result/searchResult.tmLanguage.json | search-result | searchResult.tmLanguage.json | |
| shaderlab/language-configuration.json | shaderlab | language-configuration.json | - |
| shaderlab/shaderlab.tmLanguage.json | shaderlab | shaderlab.tmLanguage.json | shaderlab.json |
| shellscript/language-configuration.json | shellscript | language-configuration.json | - |
| shellscript/shell-unix-bash.tmLanguage.json | shellscript | shell-unix-bash.tmLanguage.json | shell-unix-bash.cson |
| sql/language-configuration.json | sql | language-configuration.json | - |
| sql/sql.tmLanguage.json | sql | sql.tmLanguage.json | SQL.plist |
| swift/language-configuration.json | swift | language-configuration.json | - |
| swift/swift.tmLanguage.json | swift | swift.tmLanguage.json | Swift.tmLanguage |
| typescript-basics/language-configuration.json | typescript-basics | language-configuration.json | - |
| typescript-basics/TypeScript.tmLanguage.json | typescript-basics | TypeScript.tmLanguage.json | TypeScript.tmLanguage |
| typescript-basics/TypeScriptReact.tmLanguage.json | typescript-basics | TypeScriptReact.tmLanguage.json | TypeScriptReact.tmLanguage |
| typescript-basics/jsdoc.ts.injection.tmLanguage.json | typescript-basics | jsdoc.ts.injection.tmLanguage.json | |
| typescript-basics/jsdoc.js.injection.tmLanguage.json | typescript-basics | jsdoc.js.injection.tmLanguage.json | |
| vb/language-configuration.json | vb | language-configuration.json | - |
| vb/asp-vb-net.tmlanguage.json | vb | asp-vb-net.tmlanguage.json | ASP VB.net.plist |
| xml/xml.language-configuration.json | xml | xml.language-configuration.json | - |
| xml/xsl.language-configuration.json | xml | xsl.language-configuration.json | - |
| xml/xml.tmLanguage.json | xml | xml.tmLanguage.json | xml.cson |
| xml/xsl.tmLanguage.json | xml | xsl.tmLanguage.json | xsl.cson |
| yaml/language-configuration.json | yaml | language-configuration.json | - |
| yaml/yaml.tmLanguage.json | yaml | yaml.tmLanguage.json | YAML.tmLanguage |

Syntax highlighting and more for programming languages and file formats,
using TextMate grammar and language configuration files taken from the
Visual Studio Code project
(<https://github.com/microsoft/vscode/tree/main/extensions>). For
details on which file is taken from where, see
"org.eclipse.tm4e.language_pack/about.md".

The content is created and can be updated via the Maven build script
"/org.eclipse.tm4e.language_pack/_update/pom.xml".
@mickaelistria
Contributor

@howlger thanks!
@waynebeaton We need your input in terms of IP validation here: what would be the best way to proceed with the files that are coming from VSCode? For language servers, we usually submit a subset of the VSCode source tree to IP review. Should we just do the same here, but with a different subset?

@sebthom
Member

sebthom commented May 2, 2022

Could you add a test case that verifies that all syntax files are actually parseable by tm4e? I sometimes have the problem that third-party syntax files cannot be loaded, for example because they use unsupported regular expressions (e.g. tamasfe/taplo#245). If we want to provide a default language package, we should ensure that all contained files actually work.

@waynebeaton
Member

> I don't know whether it is allowed and legal to take files from the Visual Studio Code project in this way, and if so, what information has to be provided and where.

It's really entirely about the license. If the license permits us to grab the content and use it in the manner that we intend to use it, then it is allowed.

> @waynebeaton We need your input in terms of IP validation here: what would be the best way to proceed with the files that are coming from VSCode? For language servers, we usually submit a subset of the VSCode source tree to IP review. Should we just do the same here, but with a different subset?

Follow the normal process. For the time being, this means create a project code contribution CQ and the IP Team will review the content for you. When you do create the CQ, please let the IP Team know that I've already had a look and have some insight to share.

The short version is that I ran the entire extensions directory from the vscode repository through the scanner, and it appears that almost everything is distributed under the MIT license. Some other licenses are represented, but AFAICT, everything is permissive.

The about.html file needs to include the actual license information about the content. I wrote a quick script to extract license information from the scan results of the VSCode extensions directory that I think gives us what we need (see preliminary results below; I'm pretty sure that there are at least a few extra things in the list). Before you do anything with this information, though, create the CQ and I'll point the IP team to the right place to grab this information and help them sort through it. Once that's done, we'll likely ask you to update the repository to include license information (in the easiest way possible).

bat : MIT
clojure : MIT
coffeescript : MIT
cpp : MIT and LicenseRef-scancode-boost-original
csharp : MIT
css : MIT and LicenseRef-scancode-unknown-license-reference and LicenseRef-scancode-boost-original
dart : BSD-3-Clause
diff : LicenseRef-scancode-boost-original
docker : Apache-2.0
emmet : MIT
fsharp : MIT
git-base : MIT
go : MIT
groovy : LicenseRef-scancode-boost-original
handlebars : MIT
hlsl : MIT
html : LicenseRef-scancode-boost-original
html-language-features : MIT and FTL and LicenseRef-scancode-warranty-disclaimer and LicenseRef-scancode-srgb and Apache-2.0 and MIT
ini : LicenseRef-scancode-boost-original
java : MIT
javascript : MIT and LicenseRef-scancode-boost-original
json : MIT
julia : MIT
latex : MIT and LicenseRef-scancode-boost-original and LicenseRef-scancode-unknown
less : BSD-2-Clause and MIT
log : MIT
lua : LicenseRef-scancode-boost-original
make : LicenseRef-scancode-boost-original
markdown-basics : LicenseRef-scancode-boost-original and MIT
markdown-math : MIT
objective-c : MIT
perl : LicenseRef-scancode-boost-original
php : MIT
powershell : MIT
pug : MIT
python : MIT
r : MIT
razor : MIT
restructuredtext : MIT
ruby : LicenseRef-scancode-boost-original
rust : MIT
scss : MIT
shaderlab : MIT
shellscript : MIT
sql : MIT
swift : MIT
typescript-basics : MIT
typescript-language-features : MIT and LicenseRef-scancode-unicode and W3C-20150513 and LicenseRef-scancode-unknown-license-reference and Apache-2.0
vb : LicenseRef-scancode-boost-original
xml : MIT
yaml : MIT
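
The list above follows the pattern `extension : license and license`. As a minimal sketch, assuming that pattern holds for every line, the entries that carry a `LicenseRef-scancode-unknown...` identifier could be picked out for manual review as follows. This is not the script mentioned above; the class `LicenseListCheck` and its method names are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of tm4e or of the Eclipse IP tooling.
final class LicenseListCheck {

    /** Parses lines of the form "extension : LICENSE and LICENSE" into an ordered map. */
    static Map<String, List<String>> parse(String text) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            int colon = line.indexOf(" : ");
            if (colon < 0) {
                continue; // not a list entry
            }
            String extension = line.substring(0, colon).trim();
            List<String> licenses = new ArrayList<>();
            for (String id : line.substring(colon + 3).split("\\s+and\\s+")) {
                String trimmed = id.trim();
                // Skip duplicates such as the repeated MIT in "html-language-features".
                if (!trimmed.isEmpty() && !licenses.contains(trimmed)) {
                    licenses.add(trimmed);
                }
            }
            result.put(extension, licenses);
        }
        return result;
    }

    /** Returns the extensions that have at least one identifier starting with the prefix. */
    static List<String> withLicensePrefix(Map<String, List<String>> parsed, String prefix) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : parsed.entrySet()) {
            for (String id : entry.getValue()) {
                if (id.startsWith(prefix)) {
                    hits.add(entry.getKey());
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Three lines copied from the list above.
        String sample = String.join("\n",
                "bat : MIT",
                "css : MIT and LicenseRef-scancode-unknown-license-reference and LicenseRef-scancode-boost-original",
                "latex : MIT and LicenseRef-scancode-boost-original and LicenseRef-scancode-unknown");
        Map<String, List<String>> parsed = parse(sample);
        // Prints [css, latex]
        System.out.println(withLicensePrefix(parsed, "LicenseRef-scancode-unknown"));
    }
}
```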

@mickaelistria
Contributor

> Follow the normal process

OK, we have https://gitlab.eclipse.org/eclipsefdn/emo-team/iplab/-/issues/2399 opened for VSCode (all of it!) already. So should we just wait for completion of those CQs?

@howlger
Contributor Author

howlger commented May 3, 2022

> Could you add a test case that verifies that all syntax files are actually parseable by tm4e? I sometimes have the problem that third-party syntax files cannot be loaded, for example because they use unsupported regular expressions (e.g. tamasfe/taplo#245). If we want to provide a default language package, we should ensure that all contained files actually work.

This sounds like a good idea, and I hope someone else is able to do it. 😉

I tested all languages manually with an example file I created myself. The languages that didn't work for me are commented out in the plugin.xml file (the update script gets the information whether there are issues or not from the file _update/info.xml), but the grammar and language configuration files are included, hoping that the files will pass the IP check and the issues will get fixed.

I cannot reproduce the issue from the given example. To my understanding, this should not be an issue with Java 8 or higher (see Baeldung: "Up until Java 8, we might run into the limitation that unbound quantifiers, like + and *, are not allowed within a lookbehind assertion."). But \p here (I don't know what it means in this context; I'm only aware of \p{...}) causes a PatternSyntaxException.
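
A check of the kind requested in the quoted comment could be sketched roughly as below. This is only an illustration under stated assumptions: it uses `java.util.regex.Pattern` because a `PatternSyntaxException` is mentioned above, it does not call any tm4e API, and it leaves out how the regular expressions would be extracted from the grammar files. The class `GrammarRegexCheck` and the sample patterns are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper, not part of tm4e.
final class GrammarRegexCheck {

    /** Returns one message per regular expression that java.util.regex cannot compile. */
    static List<String> findUncompilable(List<String> regexes) {
        List<String> failures = new ArrayList<>();
        for (String regex : regexes) {
            try {
                Pattern.compile(regex);
            } catch (PatternSyntaxException e) {
                // Keep the offending pattern together with the reason reported by the JDK.
                failures.add(regex + " -> " + e.getDescription());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Made-up sample patterns; a real test would collect them from the grammar files.
        List<String> samples = Arrays.asList(
                "\\b(true|false)\\b",   // compiles
                "\\p{L}+",              // compiles: known Unicode property
                "\\p{NoSuchProperty}",  // rejected: unknown property name
                "(unclosed");           // rejected: unclosed group
        for (String failure : findUncompilable(samples)) {
            System.out.println(failure);
        }
    }
}
```

A test built on this idea would fail whenever the returned list is not empty, which also shows which pattern in which grammar needs attention.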

@howlger
Contributor Author

howlger commented May 3, 2022

@waynebeaton Thanks for the reply. Your explanation and the ScanCode license categories make it a bit clearer to me.

Should the about.html be placed in the plug-in, in the feature, or in both? Could you please link to an about.html that can be copied and adapted?

I guess the feature copyright note also has to be changed for the files not under the EPL, right? Could you please provide a wording for this or point me to an example?

What about LicenseRef-scancode-unknown-license-reference and LicenseRef-scancode-unknown? Do these need to be manually checked by the IP team?

@mickaelistria
Contributor

IP ticket approved. We'll proceed with the merge soon.

@waynebeaton
Member

> Should the about.html be placed in the plug-in, in the feature, or in both? Could you please link to an about.html that can be copied and adapted?

Every plug-in should have an about.html file. There's help in the handbook, including a pointer to some examples (I'll fix the double indirection).

> I guess the feature copyright note also has to be changed for the files not under the EPL, right? Could you please provide a wording for this or point me to an example?

Leave the feature copyright and license statement as is. I'm thinking that the easiest way to account for the licenses is to drop a list/table based on the list I generated above into the about.html file.

> What about LicenseRef-scancode-unknown-license-reference and LicenseRef-scancode-unknown? Do these need to be manually checked by the IP team?

Yes, they need to be checked. Basically, it's scancode's way of saying something to the effect of "I found a license, but I don't know what it is". We've since tuned our tools that interpret the scancode results to ignore LicenseRef-scancode-unknown-license-reference because it indicates a reference to a license, not an actual statement of license (that is, it says something like "subject to the terms of the license found in the root of this repository").

AFAICT, the LicenseRef-scancode-unknown was resolved as a false positive, so just drop that from the list.
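
As a rough sketch of the list/table idea, assuming the license information is already available as a map from extension name to license identifiers, the rows for about.html could be generated as shown below while dropping the identifiers that were resolved as noise. The class `AboutLicenseTable` is hypothetical, and the actual structure required for about.html is the one described in the handbook, which this sketch does not reproduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of tm4e or of the Eclipse IP tooling.
final class AboutLicenseTable {

    /** Renders one table row per extension, leaving out the ignored license identifiers. */
    static String toHtmlTable(Map<String, List<String>> licensesByExtension, Set<String> ignored) {
        StringBuilder html = new StringBuilder();
        html.append("<table>\n<tr><th>Extension</th><th>License</th></tr>\n");
        for (Map.Entry<String, List<String>> entry : licensesByExtension.entrySet()) {
            List<String> kept = new ArrayList<>();
            for (String id : entry.getValue()) {
                if (!ignored.contains(id)) {
                    kept.add(id);
                }
            }
            html.append("<tr><td>").append(escape(entry.getKey())).append("</td><td>")
                .append(escape(String.join(", ", kept))).append("</td></tr>\n");
        }
        return html.append("</table>\n").toString();
    }

    /** Escapes the characters that are significant in HTML text content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        Map<String, List<String>> licenses = new LinkedHashMap<>();
        licenses.put("dart", Arrays.asList("BSD-3-Clause"));
        licenses.put("latex", Arrays.asList(
                "MIT", "LicenseRef-scancode-boost-original", "LicenseRef-scancode-unknown"));
        // Drop the identifier that was resolved as a false positive.
        Set<String> ignored = Collections.singleton("LicenseRef-scancode-unknown");
        System.out.print(toHtmlTable(licenses, ignored));
    }
}
```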

@waynebeaton
Member

> IP ticket approved. We'll proceed with the merge soon.

The "IP ticket" that you're referring to is for the review of third party content. This appears to be a significant contribution of project code, much of which is forked from third party sources.

What I told you to do was this:

> Follow the normal process. For the time being, this means create a project code contribution CQ and the IP Team will review the content for you. When you do create the CQ, please let the IP Team know that I've already had a look and have some insight to share.

@howlger
Contributor Author

howlger commented May 20, 2022

Thanks @mickaelistria and thanks @waynebeaton for the helpful explanation.

@mickaelistria, I will change the about.html file as @waynebeaton told me so you can then create the project code contribution CQ. I intend to fix the about.html this weekend, but may need another week.

@mickaelistria
Contributor

IP approved!
