Add new tests (and possibly licenses and rules) from https://github.com/retrography/OS-Licenses/ #54

Closed
pombredanne opened this issue Aug 10, 2015 · 9 comments


@pombredanne
Member

Some licenses in https://github.com/retrography/OS-Licenses/ by @retrography may not be detected as full exact licenses. They should be.

@retrography

Hi @pombredanne! Thanks for the mention, but keep in mind that my list is by no means complete. I just gathered the licenses that were most prevalent within the Ruby community. There are alternative and more comprehensive collections in the following repos:

https://github.com/idleberg/Creative-Commons-Markdown by @idleberg
https://github.com/okfn/licenses by @okfn
https://github.com/ufal/lindat-license-selector by @ufal
https://github.com/oss-collections/licenses by @oss-collections

And the following websites:

https://www.blackducksoftware.com/resources/data/top-20-open-source-licenses
https://tldrlegal.com
http://www.gnu.org/licenses/license-list.html#Introduction
http://opensource.org/licenses/alphabetical

Given how prolific the business of open licensing has been lately, one has to prioritize what to include first!

One very useful step for any further project would be to gather all of these in one place as vanilla text, free from formatting. That is what I had in mind when I launched my repo, and I will gradually move towards it. It would be highly appreciated if you added more licenses to the collection through pull requests. That would benefit everyone, including scancode.

@pombredanne
Member Author

@retrography Thanks! Your pointers are valuable.

The licenses in scancode (all the 1000+ of them) are in plain vanilla text too in https://github.com/nexB/scancode-toolkit/tree/master/src/licensedcode/data/licenses
The dataset is CC-0 public domain too like yours.

I would be glad to submit a PR if you want, though the value of your collection may be diluted a bit if you get ~1000 new licenses?

On the scancode side, and about your dataset: a quick scan showed that there is room for refinement, either by making the license detection rules more comprehensive or by adding some less common licenses that are in your list and not yet in scancode, such as the hilarious https://github.com/adversary-org/wtfnmf/blob/master/COPYING.WTFNMFPL
or the more serious but less common CeCILL-2.1.

Generally speaking I am always on the lookout for new bits to enhance the license detection dataset.
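
As a rough illustration of the kind of quick check described above, here is a minimal sketch that lists which texts in one collection have no exact match in another after whitespace, case, and punctuation are normalized. It is not how scancode detects licenses. The function names, the file names, and the sample texts are all hypothetical placeholders.

```python
import hashlib
import re


def normalize(text):
    """Lowercase and collapse every run of non-alphanumerics to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def fingerprint(text):
    """Stable digest of the normalized text."""
    return hashlib.sha1(normalize(text).encode("utf-8")).hexdigest()


def uncovered(candidates, known):
    """Return sorted names in `candidates` with no exact normalized match in `known`.

    Both arguments map a license name to its full text.
    """
    known_prints = {fingerprint(text) for text in known.values()}
    return sorted(
        name for name, text in candidates.items()
        if fingerprint(text) not in known_prints
    )


# Placeholder texts, not real license bodies.
known = {"mit": "Permission is hereby granted, free of charge, to any person..."}
candidates = {
    "MIT.txt": "Permission is hereby granted,\n  FREE of charge, to any person ...",
    "other.txt": "Placeholder text for a license missing from the known set.",
}
print(uncovered(candidates, known))
```

Anything this reports is only a candidate for a closer look: a text can differ by a single word and still be the same license, which is exactly the case that detection rules are meant to cover.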

@retrography

@pombredanne Oh, I hadn't noticed the content of this directory. That is a great collection, I actually won't need to put together my own collection anymore... So, don't worry about my collection, it can stay the way it is.

But it'd be great if you can separate the license collection from the main repository and put it in a dedicated repository, and then link the two repos. That way your license collection will gain visibility on its own, and it can be used as a reference for other possible projects (such as text analysis of the OS licenses). People will also tend to contribute new licenses to your collection if it becomes the reference on the matter.

By the way, this is a great piece of software. I am looking forward to using it for the next version of my paper. The README file of the project needs a "How to cite" section, so that the users get to know how they can cite your work.

@pombredanne
Member Author

@retrography

> But it'd be great if you can separate the license collection from the main repository and put it in a dedicated repository, and then link the two repos.

A great idea indeed. I hate using git submodules, though that can be easily fixed with a quick script, and to your point it makes the collection actually visible and not buried under the code. It is carefully and continuously curated, with several additions every week!

> By the way, this is a great piece of software. I am looking forward to using it for the next version of my paper. The README file of the project needs a "How to cite" section, so that the users get to know how they can cite your work.

Thanks for the kudos!
About "how to cite": the generated scans contain a small disclaimer, which should be all you need (see https://github.com/nexB/scancode-toolkit/blob/master/NOTICE#L20 ). I am less familiar with citations, though. What would be your take on how to present a real citation?

@retrography

@pombredanne

We can't include that kind of disclaimer in an academic paper. An academic citation follows a very specific format. Here is an example from the statnet package, which I use regularly for my analysis:

Handcock M, Hunter D, Butts C, Goodreau S, Krivitsky P, Bender-deMoll S and Morris M (2015). statnet: Software Tools for the Statistical Analysis of Network Data. The Statnet Project.

You can also provide the bibliographic entry, so that the users can format the citation according to the outlet they publish in:

@Misc{,
  author = {Mark S. Handcock and David R. Hunter and Carter T. Butts and Steven M. Goodreau and Pavel N. Krivitsky and Skye Bender-deMoll and Martina Morris},
  title = {statnet: Software Tools for the Statistical Analysis of Network Data},
  organization = {The Statnet Project (\url{http://www.statnet.org})},
  year = {2015},
  note = {R package version 2015.6.2},
  url = {CRAN.R-project.org/package=statnet},
}

Have a look here: https://en.wikipedia.org/wiki/BibTeX

@pombredanne
Member Author

Thanks mucho... @retrography should there be person names, or is an org name enough (or not OK)?

@pombredanne
Member Author

stupid question... sorry: your example is clear

@retrography

@pombredanne No, it is not silly.

The most important field is the author field. That is the field that necessarily appears and is emphasized in every citation format. The organization field may be ignored or suppressed in some. So, make sure you set the author field to what you want to be known as the actual author of the software. This may be a person or a collective, but actual individuals are preferred. Putting the name of an organization as the author does not mean that the organization holds the intellectual property rights to that work. Rather, it means that the work is truly a collective work without one or several main authors.

When you put your organization as author, it will appear in the citation as follows:

Centers for Disease Control and Prevention. (2009). CDC recommendations for the amount of time persons with influenza-like illness should be away from others. Retrieved from http://www.cdc.gov/h1n1flu/guidance/exclusion.htm
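
For the BibTeX side of an organization-as-author citation, one detail is worth a sketch: BibTeX splits the author field on " and " and parses each part as a personal name, so an organization name is usually wrapped in an extra pair of braces to keep it as one literal author. The helper below is a hypothetical illustration, not part of scancode or of any BibTeX tool, and the entry key `cdc2009` is made up for the example.

```python
def bibtex_misc(key, author, title, year, corporate=False, **extra):
    """Render a BibTeX @misc entry as a string.

    With corporate=True the author is wrapped in an extra pair of braces so
    that BibTeX treats it as one literal name instead of splitting it on
    " and " or into first and last names.
    """
    author_value = "{" + author + "}" if corporate else author
    fields = {"author": author_value, "title": title, "year": str(year)}
    fields.update(extra)
    lines = ["@misc{" + key + ","]
    lines += ["  {} = {{{}}},".format(name, value) for name, value in fields.items()]
    lines.append("}")
    return "\n".join(lines)


entry = bibtex_misc(
    "cdc2009",  # hypothetical citation key
    "Centers for Disease Control and Prevention",
    "CDC recommendations for the amount of time persons with "
    "influenza-like illness should be away from others",
    2009,
    corporate=True,
    url="http://www.cdc.gov/h1n1flu/guidance/exclusion.htm",
)
print(entry)
```

Without the extra braces, this particular name would be read as two authors, "Centers for Disease Control" and "Prevention".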

@pombredanne
Member Author

Fixed in develop
