MPL 2.0 License and necessary codebase changes #12
Labels: type:milestone, type:refactoring
@rnkazman and I agree to go with MPL 2.0 instead of GPL, for the reasons stated here.
As of this version, however, my guess is that the code either can't "legally" be redistributed because of potential conflicts between the licenses themselves, or is implicitly GPL. This issue collects links, evidence, and sanity checks on the best way to go about this, and indexes the changes needed to adopt the said license. This whole issue is of course IANAL.
1. R is GPL, does that make anything R GPL?
Maybe.
1.1 Not because of the R language per se, according to this Twitter discussion between Hadley and another person, which in turn cites this page from the GPL license:
However, maybe yes, because all functions from base R are themselves licensed as GPL, a point raised in the discussion:
There is a public statement from the R Foundation that seems to discourage this interpretation, but it's murky. The same argument is also raised here concerning the use of base R.
More of the confusion is summarized in this Reddit thread, in which someone even suggests that the NAMESPACE forces GPL too:
In short: it is not clear whether using base R, or simply using the R package mechanism, implies GPL. I believe data.table, an R package that recently switched from GPL to MPL 2.0, gets away with this because it does not use base R itself.
Apparently, we don't use it either for now, but it's hard to say whether we will in the future.
2. Dynamically Linking with GPL packages.
2.1 library(aGPLlibrary)
The second and more prominent problem comes from this. Specifically, using library(aGPLlibrary) seems to fall under the GPL clause that requires the entire codebase to become GPL. According to the GNU website:
The interpretation above leaves a margin of doubt on library(aGPLlibrary) being viral, and is why data.table turned MPL 2.0.
In short, for our purposes, this means we have to let go of 4 of the packages currently in use to avert trouble:
I am not sure yet if this is possible for stringr (the other option would be base R), and for jsonlite (not sure on an option yet).
The lubridate alternative may also be just base R.
igraph can possibly be circumvented, save for the projection operation used to generate co-change (and thus file-file network projections weighted by number of modifications). Then again, DV8 provides HDSMs, which could be parsed to construct a file-file network weighted by number of modifications.
The second change is that the parse_igraph functions would be modified to not output igraph objects (as I can't load the library). Rather, they would output edgelists with type and weight, a common public format widely adopted to represent graphs, much like adjacency matrices, and not particular to igraph. This in turn could be used by a GPL igraph notebook in a separate repo to showcase the work.
The good news is that the visualization library I use is itself MIT licensed: https://github.com/datastorm-open/visNetwork/blob/master/DESCRIPTION so maybe even the visualizations can still be made available without relying on igraph.
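As a rough sketch of the edgelist idea above, the co-change projection can be computed without igraph from a table of (commit, file) modifications. The function name project_cochange, the column names, and the sample data are hypothetical and not Kaiaulu's actual API. Base R is used here only to keep the sketch self-contained; the same join-and-count logic could be written with data.table.

```r
# Hypothetical input: one row per (commit, file) modification.
changes <- data.frame(
  commit = c("c1", "c1", "c2", "c2", "c2", "c3"),
  file   = c("a.R", "b.R", "a.R", "b.R", "c.R", "c.R"),
  stringsAsFactors = FALSE
)

# Project the bipartite commit-file table onto a file-file edgelist.
# The weight is the number of commits in which both files were modified.
project_cochange <- function(changes) {
  empty <- data.frame(from = character(0), to = character(0),
                      type = character(0), weight = integer(0),
                      stringsAsFactors = FALSE)
  changes <- data.frame(commit = as.character(changes$commit),
                        file = as.character(changes$file),
                        stringsAsFactors = FALSE)
  changes <- unique(changes)
  if (nrow(changes) == 0) return(empty)

  # Self-join on commit: every pair of files touched by the same commit.
  pairs <- merge(changes, changes, by = "commit", suffixes = c("_from", "_to"))
  # Keep each unordered pair once and drop self-pairs.
  pairs <- pairs[pairs$file_from < pairs$file_to, , drop = FALSE]
  if (nrow(pairs) == 0) return(empty)

  # Count how many commits each file pair shares.
  counts <- aggregate(
    data.frame(weight = rep(1L, nrow(pairs))),
    by = list(from = pairs$file_from, to = pairs$file_to),
    FUN = sum
  )
  edgelist <- data.frame(
    from = as.character(counts$from),
    to = as.character(counts$to),
    type = "cochange",
    weight = as.integer(counts$weight),
    stringsAsFactors = FALSE
  )
  edgelist <- edgelist[order(edgelist$from, edgelist$to), ]
  rownames(edgelist) <- NULL
  edgelist
}

edgelist <- project_cochange(changes)
print(edgelist)
```

The result is a plain data frame with from, to, type, and weight columns, so it can be written to CSV or handed to whichever graph or visualization library a downstream notebook chooses.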
2.2 Can we refer to any GPL code at all?
This was the next pertinent question: currently Kaiaulu relies on data output by a GPL program, namely Perceval. Perceval parses the git log for us, so we don't need to reinvent the wheel. Is that a problem? Well, again the answer depends, but this time it seems favorable to us.
Specifically, Kaiaulu interacts with Perceval (and with many more tools in the future) via the command line. For example:
kaiaulu/R/parsers.R
Lines 27 to 31 in 77f0d66
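The referenced lines are not reproduced in this issue. As a minimal sketch of the kind of command-line interaction described, assuming the external tool is just a binary that writes text to standard output, something like the following would do. The helper name run_external_tool is hypothetical, and no Perceval-specific arguments are shown because they are not given here; the demonstration calls the Rscript binary that ships with R as a stand-in.

```r
# Run an external command-line tool and return its standard output as lines.
# `binary_path` and `args` come from the caller; nothing here is specific
# to Perceval or to any other tool.
run_external_tool <- function(binary_path, args = character(0)) {
  output <- system2(binary_path, args = args, stdout = TRUE, stderr = FALSE)
  # system2() attaches a "status" attribute when the exit code is non-zero.
  status <- attr(output, "status")
  if (!is.null(status) && status != 0) {
    stop("external tool exited with status ", status)
  }
  as.character(output)
}

# Stand-in for an external tool: the Rscript binary bundled with R.
rscript <- file.path(R.home("bin"), "Rscript")
lines <- run_external_tool(rscript, c("-e", shQuote("cat(1 + 1)")))
print(lines)
```

The only things crossing the boundary are command-line arguments going in and text coming out; the calling code never loads or links the tool.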
What does the GPL say about this? The GPL says maybe, again. However, the general intuition I draw from it is that it is ok as long as your code does not look like a wrapper around the program itself. I am emphasizing the items by separating them into bullets below, but the original text contains none:
4.1 But if the semantics of the communication are intimate enough, exchanging complex internal data structures, that too could be a basis to consider the two parts as combined into a larger program.
In essence, we fall under item 4, and I believe items 1 to 3 are the reason why Section 2.2 of this issue is so confusing.
Does our use of Perceval fall under 4.1? I doubt it. We just have it parse data that it downloads from elsewhere. Moreover, I think my reasoning for going forward with OO-R6, which is also a package under the MIT license, seems to make sense in light of the following discussion:
This means #11 is ideal. Having abstract classes defining how other interfaces can integrate clearly goes along with the description above:
Etc.
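A minimal sketch of what such an abstract class could look like with R6, assuming the R6 package is installed. The class names ExternalToolParser and EchoParser, their fields, and the parse() method are all hypothetical illustrations, not the design proposed in #11.

```r
library(R6)

# Hypothetical abstract interface: every adapter for an external tool
# must provide its own parse() method.
ExternalToolParser <- R6Class("ExternalToolParser",
  public = list(
    tool_path = NULL,
    initialize = function(tool_path) {
      self$tool_path <- tool_path
    },
    parse = function(input_path) {
      stop("parse() must be implemented by a subclass")
    }
  )
)

# Hypothetical concrete adapter. A real one would call the external tool;
# this one only records which tool and input it was given.
EchoParser <- R6Class("EchoParser",
  inherit = ExternalToolParser,
  public = list(
    parse = function(input_path) {
      data.frame(tool = self$tool_path, input = input_path,
                 stringsAsFactors = FALSE)
    }
  )
)

parser <- EchoParser$new("/path/to/tool")
result <- parser$parse("repo/.git")
print(result)
```

Code elsewhere in the package would depend only on the abstract class, so an adapter for a GPL tool could live outside the MPL 2.0 codebase and still plug in.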
3. Another use case and a few more examples: Xgboost
This library was also subject to the same concerns. The discussion covers a lot of points not covered here. But since it is a prominent library, we can piggyback on some of its practices. Namely:
- An alternative to stringr is stringi, which is not GPL; this is what xgboost did.
- GPL packages are ok to be added under Suggested packages, and even code that uses them to plot can be made available, provided an alternative interface is available. See here for an example where igraph code is available as an alternative.
- The DESCRIPTION file (https://github.com/dmlc/xgboost/blob/646def51e02d4017ac85065a10ca763e8941d62a/R-package/DESCRIPTION) has a good overall guideline on where GPL packages and packages under other, permissive licenses should go.
4. Remaining Challenges
Finding a JSON library that is not GPL based, and any package that helps with replacing lubridate, should be it. For lubridate, maybe ?IDateTime will do, which is from data.table, albeit experimental.
Edit: Disregard the jsonlite concern. It is MIT licensed. I don't know where I got the idea it wasn't.
Edit 2: See also: https://stat.ethz.ch/pipermail/r-help/2008-July/169332.html
Edit 3: See this to handle dates without lubridate: http://biostat.mc.vanderbilt.edu/wiki/pub/Main/ColeBeck/datestimes.pdf
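As a small sketch of handling dates without lubridate, timestamps can be parsed with strptime() and as.POSIXct() from base R. The ISO 8601-style input format below is an assumption for illustration; real git log or tool output may use a different layout, and parse_timestamp is a hypothetical helper name.

```r
# Parse timestamps that carry a numeric UTC offset, using base R only.
# The default format string is an assumed layout, not one taken from Kaiaulu.
parse_timestamp <- function(x, format = "%Y-%m-%dT%H:%M:%S%z") {
  as.POSIXct(strptime(x, format = format, tz = "UTC"))
}

stamps <- parse_timestamp(c("2020-01-01T00:00:00+0000",
                            "2020-01-01T03:30:00+0100"))

# Differences and formatting work on the resulting POSIXct values.
elapsed <- difftime(stamps[2], stamps[1], units = "mins")
print(elapsed)
print(format(stamps, "%Y-%m", tz = "UTC"))
```

Both values are normalized to UTC on input, so offsets in the raw strings do not affect later comparisons or differences.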