-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sure a GMS2 generated file is classified as such #4297
Make sure a GMS2 generated file is classified as such #4297
Conversation
lib/linguist/generated.rb
Outdated
@@ -588,7 +588,8 @@ def generated_graphql_relay? | |||
def generated_gamemakerstudio? | |||
return false unless ['.yy', '.yyp'].include? extname | |||
return false unless lines.count > 3 | |||
return lines[2].match(/\"modelName\"\:\s*\"GM/) | |||
return lines[2].match(/\"modelName\"\:\s*\"GM/) || | |||
lines[0].start_with?("1.0.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would there be a way for me to include the weird character back in there? Its this one: ←(U+2190
- 0xE2 0x86 0x90
in utf-8).
It might make the checking a lot stricter, and then could possible replace starts with 1.0.0
with \d\.\d\.\d←
which would then support future versions :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PS: Just throwing the character in there breaks the testsuite;
1) Error:
TestBlob#test_language:
Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8
/home/travis/build/github/linguist/lib/linguist/generated.rb:592:in `start_with?'
/home/travis/build/github/linguist/lib/linguist/generated.rb:592:in `generated_gamemakerstudio?'
/home/travis/build/github/linguist/lib/linguist/generated.rb:93:in `generated?'
/home/travis/build/github/linguist/lib/linguist/generated.rb:12:in `generated?'
/home/travis/build/github/linguist/lib/linguist/blob_helper.rb:362:in `generated?'
/home/travis/build/github/linguist/test/test_blob.rb:281:in `block (2 levels) in test_language'
/home/travis/build/github/linguist/test/test_blob.rb:271:in `each'
/home/travis/build/github/linguist/test/test_blob.rb:271:in `block in test_language'
/home/travis/build/github/linguist/test/test_blob.rb:265:in `each'
/home/travis/build/github/linguist/test/test_blob.rb:265:in `test_language'
2) Error:
TestFileBlob#test_language:
Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8
/home/travis/build/github/linguist/lib/linguist/generated.rb:592:in `start_with?'
/home/travis/build/github/linguist/lib/linguist/generated.rb:592:in `generated_gamemakerstudio?'
/home/travis/build/github/linguist/lib/linguist/generated.rb:93:in `generated?'
/home/travis/build/github/linguist/lib/linguist/generated.rb:12:in `generated?'
/home/travis/build/github/linguist/lib/linguist/blob_helper.rb:362:in `generated?'
/home/travis/build/github/linguist/test/test_file_blob.rb:676:in `block (2 levels) in test_language'
/home/travis/build/github/linguist/test/test_file_blob.rb:666:in `each'
/home/travis/build/github/linguist/test/test_file_blob.rb:666:in `block in test_language'
/home/travis/build/github/linguist/test/test_file_blob.rb:660:in `each'
/home/travis/build/github/linguist/test/test_file_blob.rb:660:in `test_language'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RobQuistNL Is LANG
set in your environment? On OpenBSD, every test belched spurious encoding errors until I executed tests using LANG=en_AU.UTF-8 bundle exec rake
(aliased in my shell as simply brake
).
It fixed errors which looked extremely similar to the ones you've posted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the reports from the tests ran on the CI: https://travis-ci.org/github/linguist/jobs/442092134
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it was fixed by removing the ←
character from the matching line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not an expert on how encodings work in Ruby (especially in CI), but you can express Unicode characters as 7-bit ASCII using:
/\u2190/ # RegExp escape
"\u2190" # Analogous string escape
"\u2190" =~ /\u2190/ # => Returns 0, the index of the first match
"foobar" =~ /\u2190/ # => nil No match
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The =~
operator is shameless ripped off from Perl Ruby's way of performing a match between a regular expression and a value.
In PHP, the equivalent syntax is pcre_match("/\\u2190/", "input");
(if memory serves, it's been years since I bothered working with PHP. #Node4Life
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah cool, will try :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've looked into this today and attempting to match the ←
is definitely a bit of a struggle. I can get it working locally on macOS by using that char directly, but it doesn't play nice on Travis.
An alternate solution may be to more closely match the format of the line ignoring the content, for example:
lines[0] =~ /^\d\.\d\.\d.+\|\{/
... matches the 1.0.0
at the beginning of the line and the |{
at the end, ignoring everything in between and does the trick locally and on Travis here. We could probably get more fancy with the regex, but I'm not sure it's worth the effort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thanks, I didn't see your comment there for a while. Sorry! Yeah, I think we're going to have to walk this way since the actual char is not going to work.
Hmm, that explodes again..
|
@lildude Could you check this out, please? You're likely more familiar with this pipeline (and Ruby) than I am. |
@Alhadis can you confirm that the tests run on your own machine? I can't since I don't have the project running :x |
Yeah, no, I'm getting the same errors as CI is. This is with the I'll leave this for an expert to figure out. 😜 |
IMHO we could remove that one character - allthough it would be a pretty solid detection method :) |
I've not had the chance to look into this in-depth but this smells very similar to #4028 |
Is there any chance of it being related to
... and this doesn't look very good: # A mapping table of supported encoding names from EncodingDetector
# which point to the corresponding supported encoding name in Ruby.
# Like: {"UTF-8" => "UTF-8", "IBM420_rtl" => "ASCII-8BIT"}
#
# Note that encodings that can't be mapped between Charlock and Ruby will resolve
# to "ASCII-8BIT".
@encoding_table = {} Since |
I really wish I could help, but I'm a real big newbie when it comes to Ruby - I will at least try to setup the project on my Xenial machine soon. |
@Alhadis @lildude I have changed the regex to use lildude's Maybe its worth making an issue for the underlying problem - either UTF-8 characters ( For now this PR should be fine and fix the one wrongly detected file in GMS2 projects :) |
Seems like a decent compromise. 👍 Thanks for your patience! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution @RobQuistNL!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I'm making sure the generated .yy file is being matches as a generated GMS2 file, instead of Yacc.
Fixes #4293
Checklist: