Some .pm categorised as Perl6, some as Perl #2149
Comments
|
I've also just noticed it here: Which is also Perl 5 |
|
I'm afraid there is not much that can be done with Linguist here. A few solutions to fix your case:
|
|
@pchaigno I can certainly see how difficult this is. Just to clarify, if we use the |
That's correct. Unfortunately, the highlighting and the search results are currently not affected by the overrides :/ |
|
Dang, that's a real shame. Do you know if there are any plans to link the overrides in at some point? |
|
I think it is considered. @arfon will know for sure. |
Yeah, there's an issue for this here. We'd definitely like to add this functionality - it's just a bunch of work on our end (for GitHub/infrastructure reasons). |
|
... for now, the Emacs and Vim overrides are probably your best bet: https://github.com/github/linguist#using-emacs-and-vim-modelines |
|
Cool thanks @arfon We've actually opted for the |
|
👍 ok thanks @mintsoft |
|
If Linguist can't distinguish between Perl 5 and Perl 6, why does it default to Perl 6 here? The odds of it being Perl 5 are much much much higher. I think that's the real bug. |
Because the 'likelihood' is determined by the Bayesian classifier, not by custom rules that favour a language purely because it is more popular. As @pchaigno discussed, getting Linguist to do the right thing here is hard, so I'd encourage you to take a look at the overrides to address this issue. |
|
Based on the linked code, there seems to only be certain heuristics, but no way to provide a default 'else'? ...
else
Language["Perl"]
end |
|
It is possible to provide a default 'else' but we try to avoid writing heuristics that way. Especially as we don't actually know the file extension we're working with when we're in the heuristic. |
|
What about adding popularity of a language, however vaguely it is defined, as part of the heuristics? Regardless of Linguist's implementation, Perl5 has 150,717 modules (not all of them on GitHub though), while Perl6 has only 330. It's incredibly more likely that an ambiguous file is Perl5 code, not Perl6. |
It's possible we could do something with popularity somewhere in the language classification, but the heuristics aren't the right place. |
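To make the popularity idea concrete, here is a minimal sketch of how a prior could tilt a generic naive Bayes score. It is not Linguist's classifier: the method names, token probabilities, and priors below are all invented for illustration.

```ruby
# Generic naive Bayes scoring, NOT Linguist's implementation.
# Score = log(prior) + sum of log(P(token | language)).
def log_score(tokens, token_probs, prior)
  tokens.reduce(Math.log(prior)) do |sum, tok|
    # Unseen tokens get a small floor probability (an arbitrary assumption).
    sum + Math.log(token_probs.fetch(tok, 1e-6))
  end
end

# Invented per-language token probabilities, for illustration only.
TOKEN_PROBS = {
  "Perl"  => { "use" => 0.05, "sub" => 0.04, "my" => 0.06 },
  "Perl6" => { "use" => 0.05, "sub" => 0.05, "my" => 0.06 },
}.freeze

# Two hypothetical priors: one ignores popularity, one reflects it.
UNIFORM_PRIOR    = { "Perl" => 0.5,  "Perl6" => 0.5 }.freeze
POPULARITY_PRIOR = { "Perl" => 0.99, "Perl6" => 0.01 }.freeze

# Picks the language with the highest log score under the given priors.
def classify(tokens, priors)
  TOKEN_PROBS.keys.max_by do |lang|
    log_score(tokens, TOKEN_PROBS[lang], priors[lang])
  end
end
```

With these made-up numbers, an ambiguous token stream such as `%w[use sub my]` goes to Perl6 under the uniform prior and to Perl under the popularity prior, which is the effect being asked for here.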
|
I could even suggest that, in the interim, the lack of a |
|
Perl 6's design docs have some thought in the matter: see point 8 in this bullet point list. In practice, this means that it should be safe to assume a .pm file is Perl by default, and only treat it as Perl 6 if the first line which is not empty and not a comment starts with one of: PS: |
I agree with this, it does seem odd that the weighting appears to be in favour of Perl6
This seems sensible to me |
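The default-to-Perl rule suggested above can be sketched in Ruby as follows. This is a standalone illustration, not Linguist's heuristics API: `first_significant_line` and `guess_pm_language` are hypothetical helpers, and `PERL6_MARKERS` is an assumed placeholder list, since the actual list of prefixes is not reproduced in this thread.

```ruby
# Hypothetical marker list; these entries are illustrative assumptions only.
PERL6_MARKERS = ["use v6", "class ", "module "].freeze

# Returns the first line that is neither blank nor a '#' comment (stripped),
# or nil if the source has no such line.
def first_significant_line(source)
  source.each_line do |line|
    stripped = line.strip
    next if stripped.empty? || stripped.start_with?("#")
    return stripped
  end
  nil
end

# Defaults to "Perl"; answers "Perl6" only when the first significant line
# starts with one of the markers.
def guess_pm_language(source, markers = PERL6_MARKERS)
  line = first_significant_line(source)
  return "Perl" if line.nil?
  markers.any? { |m| line.start_with?(m) } ? "Perl6" : "Perl"
end
```

Note that the sketch treats a shebang line as a comment and does not skip POD blocks, so a real heuristic would need more care. It also only chooses between Perl and Perl 6.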
Just asked about this on the #perl6 IRC channel (irc://irc.freenode.net/#perl6) and FROGGS said that it likely won't work in this case, because we're also trying to decide whether the code is Prolog. Here is the full discussion: http://pastebin.com/5LMub5X0 Also, to add to the discussion, nwc10 had this to suggest:
|
Surely that's relatively straightforward, if there are |
|
Would there be any possibility of re-opening this issue so someone with more intimate knowledge of Ruby/Linguist could perhaps see if any of the suggestions above are viable and could be implemented, to make Linguist's guesses more accurate? |
|
@zoffixznet I've reopened it for now, I'll leave it open unless the github staff close it |
|
excellent, thanks! |
|
This is excellent, thank you, but just wondering, does this also help with .t files? I don't see it referenced in the commits, and these seem to be the most often marked as Perl6 mistakenly. |
Not currently. We could write a heuristic that is only for What do you think @pchaigno? |
|
@arfon in my experience, it's mostly the Perl 5's |
Port of mojolicious/mojo@19cdf772 Before this clarification the project listed as 7% Perl6 code >.< The explicit listing is needed, as there apparently won't be a fix within GitHub itself any time soon: github-linguist/linguist#2149 (comment) github-linguist/linguist#2781 (comment) github-linguist/linguist#2074 (comment) Language names sourced from: https://github.com/github/linguist/blob/master/lib/linguist/languages.yml
Hey,
We have a collection of .pm files, all of which are Perl 5; however about 10% are being classed as Perl 6 incorrectly:
I've had a quick look and I can't see what the particular reason is. A specific example is:
Erroneously Perl 6:
Correctly Perl 5:
cc @jagtalon @MrChrisW