Mercury noconflict #1049

Merged
merged 10 commits into from Apr 21, 2014
Conversation

sebgod
Contributor

@sebgod sebgod commented Apr 6, 2014

lib/linguist/languages.yml:
Add the declaration for the Mercury language.

samples/Mercury:
Add samples for the classifier as Mercury shares its filename extension
with several other languages.

PaulBone and others added 6 commits October 29, 2013 15:01
lib/linguist/languages.yml:
    Add the declaration for the language.

samples/Mercury:
    Add samples for the classifier as Mercury shares its filename extension
    with several other languages.
Detect Inno Setup installer scripts (http://www.jrsoftware.org/isinfo.php)
extend the vendor/ exclusion to handle vendors/

Some projects use this folder to store external libraries (e.g. https://github.com/Elgg/Elgg)
lib/linguist/languages.yml:
    Add the declaration for the language.

samples/Mercury:
    Add samples for the classifier as Mercury shares its filename extension
    with several other languages.
@arfon arfon merged commit 2ef1305 into github-linguist:master Apr 21, 2014
@arfon
Contributor

arfon commented Apr 21, 2014

Thanks @sebgod. @PaulBone sorry it's taken so long :-\

@sebgod sebgod deleted the mercury-noconflict branch April 22, 2014 06:19
@PaulBone
Contributor

@arfon Thanks for merging this. While this does add Mercury to Linguist's
known languages, there are problems with setting the primary extension to
.mercury. When someone creates a Mercury gist and then checks it out on the
command line, they will have a .mercury file, which the compiler won't
accept. I'm sure this will create other problems too. The primary
extension should be .m.

Thanks.

@pchaigno
Contributor

@PaulBone I believe they are working on a fix for the problem with the primary extension (it will probably be removed). It takes time to do real tests on a system as large as GitHub. In the meantime, the .mercury file extension is better than nothing :)

@nox
Contributor

nox commented Apr 24, 2014

@pchaigno You believe? What is your source? Faith? There is absolutely nothing more to test to verify that my patch works.

@pchaigno
Contributor

> @pchaigno You believe? What is your source? Faith? There is absolutely nothing more to test to verify that my patch works.

So aggressive...
The source: #1098.

@arfon
Contributor

arfon commented Apr 24, 2014

> @arfon Thanks for merging this. While this does add Mercury to Linguist's known languages, there are problems with setting the primary extension to .mercury. When someone creates a Mercury gist and then checks it out on the command line, they will have a .mercury file, which the compiler won't accept. I'm sure this will create other problems too. The primary extension should be .m.

There's actually nothing in the Gist workflow that 'makes' people pick .mercury - check out this screencast I made below to show what the behaviour is if you pick .m

[Screencast: mercury]

@nox
Contributor

nox commented Apr 24, 2014

@arfon He was probably confused about the language picker being gray when a filename is entered.

@arfon
Contributor

arfon commented Apr 24, 2014

@nox - good point - I agree that's a little weird.

@arfon
Contributor

arfon commented Apr 24, 2014

@PaulBone check out those language stats: https://github.com/PaulBone/pbone_thesis

@PaulBone
Contributor

@arfon, sorry I did misunderstand. I was under the impression that you
couldn't choose the filename at all. I should have checked first rather
than making this assumption.

I created this gist without specifying a
filename. https://gist.github.com/PaulBone/537a67e6e155c468554d

It gets given the filename gistfile1.mercury, which is incorrect, but that's
okay. The user will already have to rename it so that the filename is the
same as the module name, in this case hello.m. So while this may be
confusing, no matter what we do the interaction here will never be perfect.
It's just going to be a good idea for the user to enter a filename no matter
what is implemented in Linguist.

Thanks again, and YAY for language statistics.

@arfon
Contributor

arfon commented Apr 28, 2014

Hi @PaulBone - yes if you omit a filename completely then this is the behaviour unfortunately. I wonder how many people do this?

With the recent update to Gist though if someone specifies filename.m then at least Mercury is in the 'suggested' languages 😄

@PaulBone
Contributor

PaulBone commented May 2, 2014

Yep, this is good.

I've noticed that the Mercury detection isn't perfect. Some Mercury things
are being detected as M and other languages. I'll try to learn more about
Linguist, in particular how to program it so that files containing
things that are definitely Mercury-ish and not likely to appear in other
languages have a higher chance of being detected as Mercury. For example
":- module ." ":- interface." and ":- include_module list."
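
To make the idea concrete, here is a minimal sketch in Python (not Linguist's own Ruby code, and not its actual heuristics API) of a check that counts declarations like the three above. The function names, the regular expression, and the threshold of 2 are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: matches lines such as ":- module hello.",
# ":- interface." and ":- include_module list." at the start of a line.
MERCURY_DECLARATION = re.compile(
    r"^[ \t]*:-[ \t]*(?:module[ \t]+\w+|interface|include_module[ \t]+\w+)[ \t]*\.",
    re.MULTILINE,
)


def count_mercury_declarations(source: str) -> int:
    """Count lines that look like the Mercury declarations named above."""
    return len(MERCURY_DECLARATION.findall(source))


def looks_like_mercury(source: str, threshold: int = 2) -> bool:
    """Guess Mercury when enough declarations are present.

    The threshold is an arbitrary assumption, not a tuned value.
    """
    return count_mercury_declarations(source) >= threshold
```

Requiring whitespace and a bare name after `module` is a deliberate choice: it means a Prolog-style `:- module(name, [...]).` directive does not count as a match.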

@arfon
Contributor

arfon commented May 2, 2014

> I'll try to learn more about Linguist, in particular how to program it so that files containing things that are definitely Mercury-ish and not likely to appear in other languages have a higher chance of being detected as Mercury. For example ":- module ." ":- interface." and ":- include_module list."

I'm going to be discussing some of the heuristics stuff with @bkeepers and others tomorrow so if there are any tell-tale signatures for Mercury that you don't think exist in other language syntaxes (such as M) then add them in here if you like 😄

@whitten
Contributor

whitten commented May 2, 2014

As I'm more interested in telltale syntaxes that distinguish M from other languages, can anyone tell me whether the Bayesian matcher distinguishes limited sequences, such as single letters both preceded and followed by a space, from multiple letters? This pattern is very common in M code, and I expect it is rare in other programming languages. If not, I guess I need to put that in heuristic code.
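
A heuristic of that kind might be sketched as follows, in Python for illustration only. It says nothing about how the Bayesian matcher tokenizes input. The function names are hypothetical, and any cutoff applied to the ratio would need tuning against real samples.

```python
import re

# A single ASCII letter with a literal space on both sides.
SINGLE_LETTER = re.compile(r"(?<= )[A-Za-z](?= )")


def count_single_letter_tokens(source: str) -> int:
    """Count single letters that are both preceded and followed by a space."""
    return len(SINGLE_LETTER.findall(source))


def single_letter_ratio(source: str) -> float:
    """Fraction of whitespace-separated tokens that are such single letters."""
    tokens = source.split()
    if not tokens:
        return 0.0
    return count_single_letter_tokens(source) / len(tokens)
```

Because the pattern uses lookarounds, adjacent matches such as the letters in ` I X D ` can share the space between them and are each counted.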


9 participants