-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding .yy and .yyp (GMS2 Project / metadata files) as JSON, and GML heuristics #4130
Adding .yy and .yyp (GMS2 Project / metadata files) as JSON, and GML heuristics #4130
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a stab at this!
I could use some help with running my branch of linguist inside of an example project. (I get an error message
Could not locate Gemfile or .bundle/ directory
)
Do you have a proper Ruby environment configured?
Maybe I should also add some heuristic rules for .yy and .yyp files? They are pretty easily distinguishable from Yacc files.
👍
I've left a few comments to address below.
lib/linguist/languages.yml
Outdated
@@ -2079,6 +2079,8 @@ JSON: | |||
- ".topojson" | |||
- ".webapp" | |||
- ".webmanifest" | |||
- ".yyp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The file extensions (except the first) need to be sorted.
lib/linguist/heuristics.rb
Outdated
@@ -541,5 +541,13 @@ def call(data) | |||
end | |||
end | |||
|
|||
disambiguate ".yy" do |data| | |||
Language["JSON"] if /\"modelName\"\:\s?\"/.match(data) | |||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll need to add a case for Yacc here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I thought it would fall through. Would a simple "else" be sufficient in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, my need above might be a bit strong. It's better if we have a case for Yacc 😄
Is the presence of the modelName
key sufficient to identify GML JSON files? (i.e., is it mandatory in those files?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty much. All the .yy files I've seen have a modelName. As does the .yyp file. But like you said - the .yyp file isn't documented in the languages.rb file so I've removed it in the latest commit. I've extended this testcase to first go for XML, then Graph Markup Language, and then a GameMaker file. The latter is harder because its a generic programming language and has no "required" keywords in the file. XML does, and it looks like Graph Markup Language does too (http://www.fim.uni-passau.de/fileadmin/files/lehrstuhl/brandenburg/projekte/gml/gml-technical-report.pdf) - the graph [
and potentially the node [
.
lib/linguist/heuristics.rb
Outdated
|
||
disambiguate ".yyp" do |data| | ||
Language["JSON"] if /\"modelName\"\:\s?\"GMProject\"/.match(data) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this required? Did you find other languages using .yyp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, don't think so. Will look this up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious: is there a guarantee that only one or zero whitespace characters follow modelName:
? If not, you should replace \s?
with \s*
so it'll match 2+ blanks instead of 1 or none. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good one, the * would be better. They are automatically generated files though, so I'm sure its usually 1 whitespace, but the * is better. Committing as we speak!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've extended it since I've checked pretty much all .yy files, and they all contain a modelName element with a value that starts with GM
.
"modelName": "GM
a match on that one is what I'm doing for this rule.
disambiguate ".gml" do |data|
Language["Game Maker Language"] if /^\s*(var|sprite_|vk_|image_|draw_|visible|sprite_)/.match(data)
end
EDIT: I've added checks for XML and Graph first, then make it fall through to Game Maker Language. Its probably easier to do this way and make sure. I'm still pretty unfamiliar with Ruby so I'm not even sure if this code runs. I've been testing my regexes with this online tool; http://rubular.com/ but I wouldn't feel calm getting this merged without it being properly tested. I'm sorry this is a bit slow & sluggish - I'm not familiar with Ruby. @pchaigno is there a IRC / slack / discord channel where I can ask for some guidance or this is place OK? |
So the test is failing because;
I'm not sure where I can pull an example file from - I can find some example files online but I'm not sure about the licenses on that one
I suppose this is to check if Yacc files don't get detected as JSON, while GM files do? So thats 2 birds one stone
I'm not sure what to do with this one :( |
I've also taken this opportunity to add .gmx files. They are from GameMaker:Studio 1 (EOL). Haven't found any other .gmx files in the languages.yml so this should do just fine. GameMakerStudio 2 uses the .gml, .yy and .yyp files. |
Hey @pchaigno, are the changes I made valid like this? |
Ping @pchaigno :x sorry for my impatience :P |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For next time, it would be easier to review if the independent changes were in separate pull requests. For example, everything regarding the .yy
and .yyp
extensions looks ok and could be merged as is.
However, I couldn't find the license for the 3 other samples, including the two .gmx
files. Did I miss something? If not, would it be possible to use other sample files, under a permissive license?
samples/Yacc/words2.yy
Outdated
@@ -0,0 +1,53 @@ | |||
/* File taken from http://home.adelphi.edu/~sbloch/class/271/examples/lexyacc/words2.y . */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't find the license for this. Is it indicated anywhere on the website?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me neither, I find it very hard to find an example file for the .y files with a license - Since it was published like this online on an .edu site I was assuming. As we all know, assumptions are the mother of all f*ckups, so I'll have to find something else for this.
The GMX files are my own, I'll add a MIT license to the repository there.
@pchaigno Sorry for the inconvenience. I've included a Yacc file from an MIT repository (https://github.com/zcbenz/yacci/blob/master/examples/funccal/calc.y) and I've added the MIT license to my own repository for the 2 GMX files. |
…gamemaker-yy-and-yyp-files
I take it these failures are unrelated to this PR?
|
The test failures appear to be caused by recent changes to licensee which has resulted in a slightly modified hash for the four licenses in question. I'll address this in #4152 |
Merging master into your branch should fix the failing tests. |
…gamemaker-yy-and-yyp-files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes @RobQuistNL!
I've got two concerns below, but we're getting closer to something ready to be merged.
@@ -531,5 +541,13 @@ def call(data) | |||
end | |||
end | |||
|
|||
disambiguate ".yy" do |data| | |||
if /\"modelName\"\:\s*\"GM/.match(data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these files generated? If not, shouldn't we match on modelName: "GM
too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are generated, but I've only seen them with the double quotes. I put the \s
in there just in case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks for the precision. We should probably add a case in generated.rb
for those files then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty much all the .yy and .yyp files will be generated. The GML files (which the .yy files will point to) are 1:1 contents from the IDE. .yy contains options and a lot of meta-data for the IDE.
I've put this in place to disambuguate Yacc from GameMaker's JSON files - do you think we need the same check in generated.rb? Pretty much, if its a .yy and it has that "modelName": "GM
format in there its a GM generated file for sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you think we need the same check in generated.rb?
Yes, I think it's worth it. Is that the only indication that the file is generated? We often have clear comments such as Generated by XXXX
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately GM doesn't do that. Its plain JSON, but at least it has a consistent format. I'll add a case to generated.rb.
lib/linguist/heuristics.rb
Outdated
Language["Graph Modeling Language"] | ||
else | ||
Language["Game Maker Language"] | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can be sure we've eliminated the two other candidate languages here. Are we even sure /^\s*(graph\s?\[|node\s?\[)/i
will match all Graph Modeling Language files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I could tell the Graph modeling Languages always have a graph [
or node [
element.
I haven't found a lot of docs, but I'm quite sure that it then will not be Game Maker Language. (text [
is invalid syntax), unless its in a comment ofcourse. I'll be searching a bit more for the Graph Modeling Language file formats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here there is some more documentation. All files I've seen so far have that graph [
and node [
stuff.
I've edited the Graph Modeling Language regex - it will now match most of the files found here; https://github.com/search?p=9&q=extension%3Agml+graph&ref=searchresults&type=Code&utf8=%E2%9C%93 I can't find any hard evidence that |
Hmm, might be an issue, since |
We could anchor the |
Here goes 🤞 Thanks a ton for your quick responses and your patience @pchaigno! I really appreciate it. |
disambiguate ".gml" do |data| | ||
if /^\s*(\<\?xml|xmlns)/i.match(data) | ||
Language["XML"] | ||
elsif /^\s*(graph|node)\s+\[$/i.match(data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still want to match on:
graph
[
foobar
right?
Any reason you changed the [\s\n\r]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought \s would include \n and \r, hence I removed it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\s matches any whitespace character (equal to [\r\n\t\f\v ])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This format is in sample.gml and now also in sample3.gml. I think the succeeding tests should prove that the regex works, right? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right! my bad!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad for initially writing \s\r\n ;)
test/test_heuristics.rb
Outdated
def test_gml_by_heuristics | ||
assert_heuristics({ | ||
"Game Maker Language" => all_fixtures("Game Maker Language", "*.gml"), | ||
"Graph Modeling Language" => all_fixtures("Graph Modeling Language", "*.gml") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a case for XML too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the testcases for .yy files as well.
@pchaigno Ping! Congratulations on winning the finals BTW :) Hope your hangover is not too bad ;) |
…gamemaker-yy-and-yyp-files
@pchaigno Sorry :x ping again |
Should we add a check in |
lib/linguist/generated.rb
Outdated
# Return true or false | ||
def generated_gamemakerstudio? | ||
return false unless extname == '.yy' || extname == '.yyp' | ||
return false unless lines.count > 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why require min. 6 lines if we only look at the third?
lib/linguist/generated.rb
Outdated
def generated_gamemakerstudio? | ||
return false unless extname == '.yy' || extname == '.yyp' | ||
return false unless lines.count > 5 | ||
return /\"modelName\"\:\s*\"GM/.match(lines[3]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the comment you mention the third line, but here you match on the fourth. Typo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, yeah, was already on it ^_^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(can't run the tests locally so abusing Travis for it :x )
@pchaigno The generated test should work now ^_^ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Thanks for your patience @RobQuistNL!
Oh, if anything I should thank you for your patience @pchaigno ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for the PR and welcome to Linguist.
Awesome! Thanks :) |
Fixes #4129
This change makes sure the .yy and .yyp files are correctly detected as JSON files within a Game Maker project.
I'm having difficulties running this tool in an example project - I could use some help with running my branch of linguist inside of an example project. (I get an error message
Could not locate Gemfile or .bundle/ directory
)Description
Not sure how the Yacc language responds to this. Maybe I should also add some heuristic rules for .yy and .yyp files? They are pretty easily distinguishable from Yacc files.
Checklist: