Switch back to the Apache-v2 license #1310
Comments
My last comment on the R-data.table project members list 10 days ago was :
This remains unanswered. Can you confirm? This came up recently here : #1232 Note that I explicitly left the door open to Apache in Rdatatable/data.table#2456. But my problem with you is that you've been sneaky. You tricked the data.table contributors by rewriting data.table and licensing fread differently to the rest of it. You ignored all those discussions. You ignored the agreement. You're saying that Rdatatable/data.table#2456 didn't matter, that it was a waste of time, and you don't need to discuss with the past contributors to data.table in R. Otherwise, what was the point of Rdatatable/data.table#2456? |
@mattdowle I did not answer your comment, because the comment was rude. It is rude because you're talking to me, while at the same time referring to me in 3rd person. Also, it would be easier to answer if instead of accusations, you actually asked a question to be answered. Do I believe that "all the rest I did myself"? No, of course not. I never said or implied as much. Many people contributed to the project, and it is thus a collective effort. Some parts were borrowed from other projects -- all properly attributed, according to their respective licenses. You also correctly noticed that |
This is now like the twilight zone. You did say you wrote everything other than fread yourself. You said you could change the license without asking data.table contributors. I asked you that question and you replied yes. |
Could be I misunderstood your question, or you misunderstood my answer, or both. Glad that we cleared that up. |
What you propose, Pasha, sounds reasonable to me. By the way, are there any disadvantages that we should anticipate switching back to Apache license? |
@oleksiyskononenko The Apache license will allow the code to be reused more easily. To me, and to many other people, this is an advantage: it will allow the code to survive longer and be useful to more people. But to some other people this is actually a drawback, as the code can be incorporated into proprietary software. Attractiveness to developers. Stemming from the same ideological difference, the license will have an effect on who wants to contribute to the project. There are developers who would never want to contribute to a GPL/MPL project. Conversely, there are developers who will not contribute to an Apache/MIT/BSD project. (And of course, there are those who don't care either way.) By switching to Apache, we will become more attractive to the former group and less attractive to the latter. It is my understanding that within the Python community the first group (Apache advocates) is much larger than the second, and therefore the benefit outweighs the cost. |
The MPL is very liberal in terms of reuse. I was clear in Rdatatable/data.table#2456 that R-data.table can be used in proprietary software: that is the express wish of R-data.table contributors (to pick one example of a group of people). Please will you address the issue of datatablePRO being created, which I explained clearly in Rdatatable/data.table#2456 too. If datatable is Apache then anyone can create a closed-source improvement of the library itself: datatablePRO. What you've written above appears to misunderstand this concern. datatablePRO would not be reuse but competition with the original work, standing on its shoulders and taking advantage of the contributors who contributed on the basis of the library remaining open-source. Let's say a company creates datatablePRO. Would you continue to contribute your evenings and weekends to datatable, only for that effort to be ingested by that company into datatablePRO for free, making it better for free while that company makes all the money from it? The only restriction of the MPL pertains to the library itself, to prevent datatablePRO being created. What is wrong with that? |
@mattdowle There is a difference between being able to reuse the software in binary form, and in source code form. For example, you wrote the sorting function in data.table. Since it provided superior performance, it was later included in base R. This was possible only because data.table and R had compatible licenses. Similarly, because data.table is licensed as MPL, and the majority of R packages are GPL, there is no problem for anyone to incorporate data.table code into their projects. Thus, data.table is "nice" to other R developers. The situation is different with Python. Python itself, as well as the majority of Python modules, is licensed under Apache-like licenses. This means that no other Python module can incorporate or otherwise benefit from datatable's code. As such, datatable is "not nice" to other Python developers. Which is a shame, considering that we include code from several other Apache/MIT/BSD projects. Thus, the MPL license prevents not only the creation of a hypothetical "datatablePRO" but also other quite legitimate forms of reuse.
First, I don't believe this to be a very realistic scenario. I do not know of any closed-source proprietary module in the Python ecosystem. But even if, by a bizarre twist of fate, such a product does appear -- wouldn't it be great? It means they'll be developing the product, and I'll be happy to work on other things. Or not happy. But regardless of how I personally feel, the user community will be the ultimate winners -- they'll have an even better tool than before. I think this is a good commitment to make: to work for the benefit of the users, and not for my own. And certainly, I do not strive to prevent others from making a profit where I could not. But even more importantly, the possibility for datatablePRO (or datatable2) to appear is actually important for the project's survival. No matter what happens to me or other project developers, with the Apache license someone else would be able to create their own clone and continue the development. @mattdowle I appreciate your concern for my evenings and weekends; and your efforts to prevent |
I spoke with all |
Another point of reference: Google's policies on the use of external packages (https://opensource.google.com/docs/thirdparty/licenses/) state the following with respect to MPL license:
This shows that MPL is not particularly corporate-friendly. Other companies may have similar, or even more stringent restrictions (unfortunately, not many will publish their policies openly). Previously I said that the choice of license only impacts developers, not users. Now I stand corrected: the users are affected, if they are constrained by the policies of the companies they work at. |
The context of that Google document is important.
The penultimate paragraph (their bolding) :
So for instance, AGPL is the only license Google asked me not to use for data.table. Because then they couldn't use it in their web services (AGPL considers a web service to be distributing; stricter than GPL). Your comment :
Another twilight zone moment. Most corporations do not ship software, and even for those that do, MPL is pretty amazingly friendly even allowing it to be used in closed-source software for goodness sake. It's even less restrictive than the LGPL. MPL FAQ 6 :
How on earth you can label that "not particularly corporate-friendly" beats me. |
@mattdowle This feels like a "glass-half-empty / half-full" kind of argument. Surely, it's only the external reuse which is restricted, while internally the package can be used freely. But why have the glass half-full, when it can be 100%-full with an open license?
This is actually quite easy to answer. Imagine yourself at the helm of a small (or large) company. Would you want to build your stack with software that restricts you? Today you may be running a small pizza shop and it doesn't matter; but tomorrow you'll want to sell your innovative pizza-making software to all pizzerias around the world -- and it suddenly does. Today you're happily writing data-transformation pipelines at Google -- but tomorrow it's suddenly not Google but Alphabet, and your pipeline suddenly connects several legally distinct companies. Surely, MPL is more friendly than GPL or AGPL; but it is definitely less friendly than Apache or MIT. And if a corporation has any choice, then the license might become one of the crucial factors in their decision-making. |
Ok. I'm imagining. I think I would be perfectly happy to use MPL software. Because I would know that I did not want to take advantage of the contributors of that library by competing with them and trying to kill their library. I would understand that I don't want to create datatablePRO. On the contrary, I want to create closed-source pizza software that uses the datatable library, and I'd appreciate that's encouraged. Further, if I had a contribution to make to the datatable library, I would be more likely to contribute to the library because I would feel protected that nobody else would try and create datatablePRO and make a ton of money thanks to my free and stupidly trusting significant contribution. I would also be suspicious if the copyright holder of the library was a small company who might i) go bust or ii) change the license on me later. That would be a risk for my pizza software business. Finally, I would look at the history of the library and check that past contributors were respected because that would be a sign the project will flourish. |
A small business owner doesn't care about whether anyone makes money off datatable or not. If anything, having 2 competing versions of the library is better for him: competition spurs faster innovation and makes it more likely that at least one of them will survive longer. A smart business owner does care about whether or not he can make private modifications to the library. These can be small, such as changes necessary to accommodate his build environment; or large, such as his own pizza-related functionality built into the internal C++ code.
Are you talking about the H2O.ai company here? It is very unlikely to "go bust". Far less likely than a project supported by volunteers only, who may not be able to make ends meet next week.
Practical difficulties with "looking back at the history" aside, this is a great maxim! Surely, past contributors ought to be respected. Now, what happens if there is a disagreement? Well, we don't have any kind of formal procedure yet -- but presumably, some kind of vote has to occur. It also seems desirable to give the voting power to contributors according to a measure of their contributions. Doesn't have to be a linear function, but it should be at least increasing. Say, a person who once fixed a typo somewhere in the documentation shouldn't have as much say as a person who spent many years working on the project. Also, it could be a good idea to weigh more recent contributions more compared to older contributions -- this would encourage new people to join the project, and existing members to keep contributing. |
After re-reading the text of the MPL-2 license, I've come to the conclusion that we have started the "datatable PRO" argument based on an invalid premise. In fact, the MPL license does not preclude a third party from creating and distributing a closed-source solution which is based on So in summary, MPL does not offer any real protection against creation of "datatable PRO", and neither does Apache/MIT. |
Comparing the MIT vs the MPL-2 licenses, summary of the arguments presented so far is as follows:
Given all the above, I believe the Apache/MIT license to be a greatly superior choice for the |
But the MPL-2 license contains this paragraph :
Why do you write that a LICENSE file is not sufficient when the license itself says it is? |
The text of the license says "...You may include the notice in a location (such as a LICENSE So, in order to properly license a binary file, the following mechanism is suggested:
Given your arguments here, I don't see any reason not to support an Apache-v2 license. |
@arnocandel
Given this analysis, I'd say Apache-v2 handily defeats MIT. |
The absolute majority of Python packages are using Apache, MIT, BSD, or similar open licenses. It would be courteous to the broader Python community, and invite broader collaboration/contribution, if we did as well.
Historically, this project has been Apache from the very first commit. However, sometime before the public release, we switched to the MPL-2 license. The idea was to have the same license as the R data.table project (which at that time switched from GPL to MPL too). Unfortunately, we failed to grasp the primary difference between the R and Python communities at that point: the majority of R packages are licensed as GPL, and within such an environment, an MPL-licensed project can be integrated freely and will be seen as more open compared to others. By contrast, within the Python community, an MPL license is more restrictive and will be eyed with suspicion. In fact, the MPL license creates a perfectly tangible barrier: ASF includes this license in its Category B list of software that can only be integrated in binary, but not in source code form.
Please, share your thoughts/comments.