Add Java Properties language for .properties extension #4098
Thanks for submitting the pull request @eager!
Because .properties would remain unambiguous, I didn’t add samples.
We still require a sample file for each extension.
LGTM otherwise!
Using @Alhadis' Harvester script, I collected 2913 sample URLs (from the 19M total) from the search @eager Based on this analysis, I think we'll need to keep the
@pchaigno thanks for the review and pointer to Harvester. I had only been doing spot-checking of about a dozen results, so I’ll grab some more samples from there for disambiguation. A few questions:
I noticed in my initial spot-checking that there was some usage of key = value
Not really. When we do find such languages we consider them for language addition though. That means we then follow the appropriate steps from the contribution guidelines. Of course, we only add them if they meet the in-the-wild criteria.
Then, we should first try to estimate their number on GitHub. Do you know keywords from that language we could use for the search? (From the documentation you linked to, it's not clear to me whether
Usually, we don't assign any language in the Heuristic strategy if we can't recognize a known pattern or keyword. Not assigning a language means we defer to the next strategy, the Classifier strategy. I expect there are many more INI files than Java Properties files, so I'd be tempted to default to INI here, if we identify such a @lildude @Alhadis What do you think (about defaulting to INI for
Yes, I think we should remove it from the Java group in any case. It feels a bit weird to have a data language being marked as a parent programming language (and I'm not sure how that would end up counting the files...).
As this is only for the .properties extension, I think we might be safe to do so. There are definitely a lot of uses of it as an INI-type file which clearly isn't related to Java, but I could be missing something.
Nudge.
Sorry for the delay @eager. Could you add a test for the new heuristic rules in EDIT: Actually, there's a failure we need to handle. I'll have a look.
lib/linguist/heuristics.rb
Language["INI"]
end
end
if /^[^#!][^:]*:/.match(data)
Should this be an elsif? It's either key=value or key:value, right? I think that will fix the test failure.
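To illustrate the point about elsif: in Ruby, a block or method returns the value of its last expression, so a second standalone `if` that does not match evaluates to `nil` and discards whatever the first `if` produced. That is one plausible reason for the test failure, not a confirmed diagnosis. The sketch below is standalone and does not use Linguist itself; the method names and the key=value pattern are assumptions made for illustration, and only the key:value pattern is taken from the diff above.

```ruby
# Illustrative only: plain strings stand in for Language["..."] lookups.
KEY_EQUALS = /^[^#!;][^=]*=/ # assumed pattern for key=value lines
KEY_COLON  = /^[^#!][^:]*:/  # pattern quoted in the diff

# Two independent ifs: the value of the first is thrown away,
# and the method returns the value of the second.
def separate_ifs(data)
  if KEY_EQUALS.match(data)
    "INI"
  end
  if KEY_COLON.match(data)
    "Java Properties"
  end
end

# One if/elsif chain: exactly one branch decides the return value.
def chained_elsif(data)
  if KEY_EQUALS.match(data)
    "INI"
  elsif KEY_COLON.match(data)
    "Java Properties"
  end
end

sample = "name = value\n"
p separate_ifs(sample)  # nil, because the second if did not match
p chained_elsif(sample) # "INI"
```

With the chained form, a key=value file keeps its INI result instead of falling through to `nil`.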
@eager Did you get a chance to work again on this lately?
24c8807 to 29868e3
@pchaigno apologies for disappearing on this for a while. Updated, with the test failure fixed.
This pull request adds a Java Properties language for the .properties extension.
Description
I noticed that some of our project’s .properties files had incorrect syntax highlighting after an escaped #.
Because INI and Java Properties are very similar (CodeMirror’s properties mode appears to be a superset of both), the disambiguation heuristic is somewhat complex. It replicates an outer if block ([key]=[value]) that will match INI, unless the file also matches Java Properties’ comment markers (^[#!]).
Checklist:
I am adding a new language.
I am associating a language with a new file extension.
I am fixing a misclassified language
I am changing the source of a syntax highlighting grammar
I am adding new or changing current functionality
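The disambiguation described above can be sketched in standalone Ruby roughly as follows. This is not the code from the pull request: the method name, the key=value pattern, and the INI marker pattern (a leading `;` or `[`) are assumptions for illustration, and plain strings stand in for Linguist's Language objects. Only the comment-marker pattern `^[#!]` and the key:value pattern come from the conversation.

```ruby
# Hypothetical helper; returns a language name or nil (no decision).
def guess_properties_language(data)
  key_equals  = /^[^#!;][^=]*=/ # assumed: some line contains key=value
  ini_markers = /^[;\[]/        # assumed: ';' comment or [section] header
  prop_marks  = /^[#!]/         # Java Properties comment markers
  key_colon   = /^[^#!][^:]*:/  # key:value pattern from the diff

  if key_equals.match(data)
    if ini_markers.match(data)
      "INI"
    elsif prop_marks.match(data)
      "Java Properties"
    else
      "INI" # default suggested in the discussion when nothing else matches
    end
  elsif key_colon.match(data)
    "Java Properties"
  end
end

puts guess_properties_language("[section]\nname = value\n")
puts guess_properties_language("# comment\nname = value\n")
puts guess_properties_language("name: value\n")
```

Returning `nil` when neither form matches leaves the decision to whatever strategy runs next, which matches the deferral behaviour described earlier in the thread.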