Add the Solidity Language #2634

Closed
wants to merge 3 commits into from

@pchaigno
Contributor

There are many data files under the .sol file extension. We should probably recognize them if we don't want to end up with many misclassifications...

@whitj00
Author

whitj00 commented Sep 19, 2015

How would you recommend excluding those?

@pchaigno
Contributor

pchaigno commented Oct 7, 2015

Sorry for the delay. Busy weeks.

We can't just exclude them, we have to recognize their language.

I downloaded 913 of these files. 660 of them match the same pattern: a file containing only numbers. So this looks like some kind of data file, but we still need to identify it. Once that's done, we can add it to Linguist, possibly with a heuristic rule to distinguish the two languages.

Is the contract keyword mandatory in Solidity files?
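
A minimal sketch of how the "only numbers" pattern could be checked. The function name and the exact definition used here (every whitespace-separated token is a plain decimal number) are assumptions for illustration; the thread does not say how the 660 files were actually matched.

```python
import re

# Assumed definition of "a number": optional sign, digits, optional fraction/exponent.
_NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def looks_numeric_only(text: str) -> bool:
    """Return True if the file has at least one token and every token is a number."""
    tokens = text.split()
    if not tokens:
        return False  # an empty file is not counted as a numeric data file
    return all(_NUMBER.fullmatch(token) for token in tokens)
```

A check like this only flags candidate data files; it does not say which format they belong to, which is the open question in this comment.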

@whitj00
Author

whitj00 commented Oct 7, 2015

yes


@arfon
Contributor

arfon commented Dec 15, 2015

@pchaigno - are you still planning on looking into this one?

https://github.com/search?utf8=%E2%9C%93&q=extension%3Asol+NOT+contract&type=Code&ref=searchresults

Edit: looks like at least some of them are related to circuit board soldering diagrams (see http://www.elekterv.hu/eaglers274x/index.html), and all of those end with M02*.

@u2

u2 commented Jan 18, 2016

We are keen for it. 👍

@arfon
Contributor

arfon commented Mar 9, 2016

🎏 flagging this as stale 🎏

@pchaigno
Contributor

pchaigno commented Mar 9, 2016

It looks like I was wrong: few of these files match the language identified by @arfon. If we exclude Solidity files and files related to the circuit board soldering diagrams, we still have many unidentified files.
I don't think we can add the Solidity language without a not-a-language option. Otherwise, we will have many false positives :/

@arfon
Contributor

arfon commented Mar 10, 2016

I don't think we can add the Solidity language without a not-a-language option. Otherwise, we will have many false positives :/

😞 I think you're right @pchaigno. Thanks for the followup. @whitj00 - I'm afraid this is a limitation with Linguist right now - if one (and only one) language defines a file extension (.sol in this case) then any/all files with that extension will be automatically classified as the language that is listing the .sol extension.

As @pchaigno mentioned, the long-term solution is for us to have an option to run our heuristics and return no language when it doesn't match. Right now though we don't have support for such a behaviour I'm afraid.

The only solution that we can consider in the short term is to add Solidity but without defining the .sol extension. This will allow you to use the Linguist overrides to manually set the language of your repositories in the .gitattributes data. Would this be of any value to you?
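
The limitation described above can be illustrated with a toy model. This is a sketch only, not Linguist's actual implementation (Linguist is a Ruby project); the registry, function names, and the `checks` mapping are all hypothetical.

```python
import os
from typing import Callable, Dict, List, Optional

# Hypothetical registry: which languages claim which extension.
REGISTRY: Dict[str, List[str]] = {".sol": ["Solidity"]}


def classify_by_extension(filename: str, registry: Dict[str, List[str]]) -> Optional[str]:
    """Toy model of the limitation: a sole claimant wins without looking at content."""
    extension = os.path.splitext(filename)[1]
    candidates = registry.get(extension, [])
    if len(candidates) == 1:
        return candidates[0]
    return None  # unknown or ambiguous extension; not modeled here


def classify_with_rejection(
    filename: str,
    content: str,
    registry: Dict[str, List[str]],
    checks: Dict[str, Callable[[str], bool]],
) -> Optional[str]:
    """Same lookup, but a per-language content check may reject the match."""
    language = classify_by_extension(filename, registry)
    if language is None:
        return None
    check = checks.get(language)
    if check is not None and not check(content):
        return None  # the "not-a-language" outcome discussed in the thread
    return language
```

With `checks = {"Solidity": lambda text: "contract" in text}`, the first function labels a numbers-only `mask.sol` as Solidity, while the second returns `None` for it and still returns `"Solidity"` for a file containing `contract Token {}`.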

@arfon
Contributor

arfon commented Mar 17, 2016

The only solution that we can consider in the short term is to add Solidity but without defining the .sol extension. This will allow you to use the Linguist overrides to manually set the language of your repositories in the .gitattributes data. Would this be of any value to you?

Ping @whitj00. Any thoughts on my last comment ☝️ ?

@arfon arfon closed this Apr 1, 2016
@graup

graup commented Jun 5, 2016

@arfon, I think the last thing you proposed would be quite helpful, and it would also prepare for the day when you do add support for no language.

@LogvinovLeon

Coming from a competitive programming background, I can say that *.sol files are often used to store solutions to tasks. Solutions in general can have any format, so it's not possible to identify all of them.

Also, based on the official Solidity grammar, any Solidity file will contain the contract or library keyword, so implementing a heuristic would be easy once we have not-a-language.
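
A minimal sketch of the keyword heuristic suggested here, assuming the claim in this comment holds (every Solidity file declares a `contract` or a `library`). The regular expression and function name are illustrative assumptions, not taken from the Solidity grammar or from Linguist.

```python
import re

# Assumed signature: a line that starts with "contract Name" or "library Name".
_SOLIDITY_DECLARATION = re.compile(r"^\s*(?:contract|library)\s+\w+", re.MULTILINE)


def looks_like_solidity(text: str) -> bool:
    """Return True if the text contains a contract or library declaration."""
    return _SOLIDITY_DECLARATION.search(text) is not None
```

The pattern is anchored to the start of a line so that a stray "contract" in the middle of prose is less likely to match; a declaration inside a comment would still match, so this is a rough filter rather than a parser.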

frangio added a commit to frangio/openzeppelin-contracts that referenced this pull request Apr 2, 2017
Solidity:
  type: programming
  extensions:
  - .sol

According to the discussion in #2634 the extension would have to go until the not-a-language feature is implemented.

@frangio

frangio commented Apr 2, 2017

@pchaigno @arfon Can we get this PR merged without the .sol extension? This would allow us to use the override as @arfon suggested, which is still not currently possible.

@arfon
Contributor

arfon commented Apr 2, 2017

@frangio - I no longer maintain this project but you can now use .gitattributes overrides to change pretty much anything about GitHub's default behaviour.

@frangio

frangio commented Apr 2, 2017

@arfon, as I understand it, forcing language X in an override doesn't work if language X isn't declared as such in the linguist library (i.e. in the languages.yml file).

I created a repo to test the override with Solidity, but it doesn't seem to work.

@arfon
Contributor

arfon commented Apr 2, 2017

Ah yes. My mistake. I didn't realise that Solidity wasn't defined.

@maraoz

maraoz commented Apr 4, 2017

+1 to @frangio's request

veox added a commit to veox/linguist that referenced this pull request Apr 10, 2017
…rmasks.

Includes sample files from storborg/regerberate:

https://github.com/storborg/regerberate/tree/ed85bb545109950946af90118b058abf9b0bcd3b/samples/eagle

Which is MIT-licensed:

https://github.com/storborg/regerberate/blob/master/LICENSE

From the looks of it, *.sol files on github mostly fit one of the following categories:

1. Solidity - Ethereum Virtual Machine programming;
2. Eagle (electronics CAD) solder masks;
3. UVa Online Judge problem solutions.

This commit attempts to differentiate between the first two, using
a simple "signature" of (2) ending in `M02*`, as mentioned by @arfon
here:

github-linguist#2634 (comment)
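
The "signature" check described in this commit message can be sketched as follows. This is an illustration in Python under stated assumptions, not the code from the commit (Linguist is written in Ruby); the function name is hypothetical, and trailing whitespace is assumed to be insignificant.

```python
def ends_with_m02_marker(text: str) -> bool:
    """Return True if the file content, ignoring trailing whitespace, ends in 'M02*'."""
    return text.rstrip().endswith("M02*")
```

A file for which this returns `True` would be treated as category (2); anything else with the .sol extension would still need another check, since category (3) has no fixed format.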
@veox veox mentioned this pull request Apr 10, 2017
6 tasks
veox added a commit to veox/linguist that referenced this pull request Apr 27, 2017
…rmasks.

@Vishesh-Gupta

Vishesh-Gupta commented Dec 28, 2017

I recently added a Solidity file and it was not recognized among the repository's languages, which is how I got to know about Linguist. Even though many other files share the same extension, I think most of them are Solidity files, so we could add it as a language to languages.yml, and the other files could use a .gitignore? Classifying a language helps out a lot.

@pchaigno pchaigno mentioned this pull request Mar 26, 2018
15 tasks
@github-linguist github-linguist locked as resolved and limited conversation to collaborators Jun 17, 2024

9 participants