Allow external dependencies #7339
Comments
Hello. Maybe a first step would be to see which external dependencies could be useful? |
Same requirement for soft external dependencies. Not sure about the issue with git bisect. I can't believe we're the first to face the problem, and it's been around for long enough so a good solution should have been developed. On useful external dependencies: |
@toolforger -- one solution for |
I don't think that the |
At the same time I'd like to see |
So you should modify the requirements draft so that multidispatch makes it. Sent from my mobile phone.
|
You didn't list what to me is the main argument against dependencies, which is bitrot. Based on your last bullet point for dependencies, I think we believe in opposite things. I think that moving unused code away is the best way for it to die. At least code that is in SymPy has tests run against it, and has visibility. I also want to avoid fragmentation of the community. Also, you missed a pretty important soft dependency: gmpy. |
I would put an extra stipulation that hard dependencies have a liberal license (BSD-style). |
I disagree that you won't be able to build a community around multipledispatch. If you really believe that, then to me that's a strong argument that it shouldn't be a dependency. If Matthew Rocklin gets hit by a bus and multipledispatch ceases to exist, then we have a mess on our hands. But if Matthew Rocklin gets hit by a bus and multipledispatch lives on, the mess is less so. |
Oh, just realized that you want people to just edit the top post. I'll do that. |
So I've edited the top post with those issues. |
I think we are maybe conflating a few issues here into one, which will make them harder to resolve. I think the following are not equal:
That also doesn't take into account how core they would be to the functionality of SymPy or how "optional" the dependencies would be (for instance, gmpy is a lot more optional than numpy: everything that can be done with gmpy installed can be done without it, only slower). Really, each case has to be looked at individually. |
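The "optional, only slower" pattern described for gmpy can be illustrated with a small sketch. This is not SymPy's actual code: the names `make_int`, `BACKEND`, and `factorial` are hypothetical, and the only library call assumed is `gmpy2.mpz`, the arbitrary-precision integer type from the gmpy2 package.

```python
# Soft-dependency sketch: use gmpy2 if it is installed, otherwise fall back
# to Python's built-in int. All names here are illustrative assumptions.
try:
    import gmpy2

    make_int = gmpy2.mpz  # fast arbitrary-precision integers
    BACKEND = "gmpy2"
except ImportError:
    make_int = int  # pure-Python fallback: same results, only slower
    BACKEND = "python"


def factorial(n):
    """Compute n! using whichever integer backend is available."""
    result = make_int(1)
    for k in range(2, n + 1):
        result *= k
    return result


print(BACKEND, factorial(20))
```

Either backend gives the same numeric answer; only the speed differs, which is what makes a dependency of this kind easy to keep optional.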
I strongly agree with everything that Aaron said. Very good points about multipledispatch. I think that separating a project from sympy, let's say "polys" gets less exposure, not more. (commenting to the last point in the issue description). |
A project being outside of the codebase doesn't stop us from being able to test against it.
We are fragmented anyway; this isn't necessarily a bad thing. It's impossible to keep a group of this size entirely cohesive. I don't think that I've ever touched the physics code. I think that the PyDy model has been super-successful. They develop at their own (generally faster) pace but keep in touch with and contribute back to the core.
I'm going to claim that multipledispatch is simple enough, of small enough scope, and has sufficient documentation, so that a moderately experienced developer can pick it up without much trouble in the case of my unexpected demise. Small projects with limited scope are more robust to large automobiles.
I think that |
I guess it boils down to how to best manage our work. Things are different for dependencies like Python and gmpy, which are managed by different people and communities. But things like mpmath, pydy, multipledispatch are managed by people from the same community really (i.e. they contribute to sympy as well as other projects). Github makes maintaining and contributing to multiple projects very simple (and if anything, this will only become simpler in the future). In fact, I think it's a good workflow to develop an idea outside of sympy, get contributors to it, get it up and running and then start talking about how to best use it with sympy, as you have done with multipledispatch, or as I am doing with CSymPy, or as the PyDy guys are doing. For multipledispatch, if I were doing it, I would maintain it as a separate project, try to get as many other projects as possible to use it, and simply copy it into sympy to be used within it. Two options can happen:
Anyway, that's just what I would do, you might want to do things differently. (Note: mpmath is the same thing --- it's developed outside, but copied in sympy for now. Last time I checked sympy was the only project depending on it, so it belongs to the second option. If lots of other projects in Python start depending on it, it will become the first option.) |
The general issue of dependencies is a large part of why I lost interest in contributing to SymPy. FWIW, I agree with everything @mrocklin mentioned. I'll just add that in practice SymPy is very far from being dependency-free. The minimal setup to do useful things with SymPy includes IPython, numpy and possibly matplotlib. The Debian package pulls in even more stuff: the last time I sudo apt-get installed python-sympy on a new system, it installed about 500 MiB of random stuff. Anyway, I hope that |
Why keep forking every external project that sympy depends on? AFAIR, the only sane reason from the mpmath-related discussions was the mythical "hard to install". It's not hard, really.
According to Debian's popcon, there are ~300 mpmath installations vs ~1300 of python-sympy (which still uses a bundled copy of mpmath). |
@rlamy, @skirpichev --- just so you know, I am not "dead-set" against dependencies (for example in CSymPy we use dependencies instead of reimplementing things on our own). But I do think there are pros and cons, as written up above in the issue description, so it is not a black and white decision. Why don't you help us clearly spell out the conditions in "Requirements for hard (required) external dependency"? Because clearly, as it is written now, neither mpmath nor multipledispatch satisfies it (e.g. mpmath doesn't pass the first point "widely used library": as @skirpichev pointed out, it's used within sympy, other usage is minor, and there are no packages in Debian that depend on it, though sympy would if we split it). So from your comments and our past discussion, I think you must not agree with this point --- either you don't agree that the requirement "widely used" should be here, or you don't agree that "mpmath does not pass this requirement". Would you agree? And so I think we need to spell this point out more clearly and clarify whether or not the listed packages pass this point. I am really open about this --- let's have a frank discussion about these requirements. I offered above what I think are good requirements, let's call them A. Since you disagree with them, please write up requirements which you think are better for sympy, let's call them B. And let's discuss req. A and req. B. |
On Thu, Mar 27, 2014 at 06:13:57AM -0700, Ondřej Čertík wrote:
I don't think that my data can prove this but not reverse. The
First. I think it's a very minor issue (though, I'm not sure that mpmath In my view, requirements for hard external dependency should
|
I've edited the hard requirements section to show off what I think of as the virtues we're looking for, e.g. "no license issues", rather than specific hard requirements, e.g. "BSD style license." Please review and edit. I like @skirpichev's note that the project should be able to stand on its own. This would be specifically important for projects that we wanted to pull out of SymPy. |
@skirpichev --- check out the issue description, @mrocklin updated it, does it reflect the way you see it? For reference, this is what's in there now:
|
I think so. Probably, my first item in the list is self evident, so it's ok to omit it. |
Related to this discussion is this document: http://web.ornl.gov/~8vt/TribitsLifecycleModel_eScience_2012.pdf, see the chapter "V. SELF SUSTAINING OPEN SOURCE SOFTWARE: DEFINED", which talks about dependencies too, e.g.: |
@certik If you read the rest of the chapter, you'll see that the authors don't support your position, cf. "For example, a given downstream customer may only fundamentally need a few classes but if the software has entangling dependencies within itself, the customer may be forced to port hundreds of thousands of lines of code just to get functionality that should be contained in a few thousand lines of code." or "the goal is not to have zero dependencies". |
@rlamy I agree. |
I like the discussion that has happened so far, we've raised and discussed a number of issues. I think that this discussion would be better focused with a particular action that we could consider, alter, and decide on. To that end I propose the following:
I'm happy to do this work and submit a pull request. Disclaimer, this is a bit self serving, as I'm also trying to gain a bit of exposure for I don't think that discussion is over, I just want to up the ante a little by proposing something concrete. |
Here is a use case of multiple dispatch to clean up set simplification From my perspective this PR is ready to go if we accept dependencies. |
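To make the idea concrete, here is a minimal sketch of dispatching on the types of both arguments, applied to a toy set intersection. It is not the multipledispatch library and not the code from the PR: `Dispatcher`, `Interval`, `FiniteSet`, and `intersect` are hypothetical stand-ins, and the resolution rule is a simple first-match-wins scan rather than a most-specific-signature search.

```python
# Hand-rolled multiple dispatch sketch (illustrative only, stdlib-only).
class Dispatcher:
    def __init__(self, name):
        self.name = name
        self._implementations = []  # list of (type signature, function)

    def register(self, *types):
        def decorator(func):
            self._implementations.append((types, func))
            return func
        return decorator

    def __call__(self, *args):
        # First registered signature that matches every argument wins.
        for types, func in self._implementations:
            if len(types) == len(args) and all(
                isinstance(arg, typ) for arg, typ in zip(args, types)
            ):
                return func(*args)
        names = ", ".join(type(arg).__name__ for arg in args)
        raise NotImplementedError(f"{self.name}({names}) is not defined")


class Interval:  # toy closed interval [start, end]
    def __init__(self, start, end):
        self.start, self.end = start, end


class FiniteSet:  # toy finite set of numbers
    def __init__(self, *elements):
        self.elements = frozenset(elements)


intersect = Dispatcher("intersect")


@intersect.register(Interval, Interval)
def intersect_intervals(a, b):
    # Assumes the intervals overlap; the empty case is ignored in this sketch.
    return Interval(max(a.start, b.start), min(a.end, b.end))


@intersect.register(FiniteSet, Interval)
def intersect_finite_interval(s, i):
    return FiniteSet(*(x for x in s.elements if i.start <= x <= i.end))
```

Each pairwise rule lives in its own small function instead of one long chain of `isinstance` checks, which is the kind of cleanup the set simplification use case is after.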
One thing that I am worried about the new |
Yup, that's a valid concern. On the flip side once multipledispatch has users it's much harder for me to experiment and change things. However, in this particular case there are a couple of things we can do.
Finally, multipledispatch has a very small scope. I consider it to be fairly complete now. I mostly expect only performance tweaks and some Python 3 sugar in the future. |
Very cool. If we do decide to use it as a dependency, I at least want to spend a few days playing with it (I haven't had a chance yet, besides sending some trivial PRs). This is a huge change, so I want to make sure we don't screw up. |
Right, I should also add the disclaimer that |
I am fine with external deps in general, but we should consider each of them separately. |
We talked about it a bit on G+ with 8 GSoC mentors. I am ok with external Sent from my mobile phone.
|
@rlamy so if we change our policy on this will you start contributing again? |
I'd like to keep the BSD-style restriction, at least for hard dependencies. I don't want to prevent SymPy from being usable by people who cannot use (L)GPL code. Plus unlike @skirpichev I absolutely hate the GPL :) |
Also, more practically, having an (L)GPL dependency would mean having this isolated codebase of code that cannot ever be imported into SymPy, even minor chunks of copy-paste (this is one reason I hate the GPL btw). |
I also share Ondrej's concern about API stability. I think in general, a requirement for a hard dependency should be that the project adhere to some of the same standards that SymPy does (or at least tries to :). For example:
At the end of the day, it should make little difference to me if I am contributing to core SymPy or to a dependency. So far, the best way to do this has been to keep code in SymPy itself. I think it's not impossible to have it otherwise, though, as numerous other projects have shown. I think it goes back to my comment above about different kinds of dependencies. @certik @mrocklin @rlamy @skirpichev etc., what are your opinions on each of the six bullet points from that comment? |
You can't copy&paste GPL'd code, but you can inspect it and write it "in your own words". |
(Actually, most people believe you can't inspect it, as that would also be considered "derivative work". The only way is that one person inspects and writes a specification, and another person implements this specification without looking at the GPL code.) |
Those people are mistaken. Just by looking at the code you do MOST DEFINITELY NOT create a derivative work. The "clean-room approach" exists for another reason: To make it 100% provable that no copying ever could have happened, not even subconsciously. |
About the bullet points: Is there a way to have different versions of a library coexist in SymPy? I'm a bit uneasy about using a different policy for test. In general, I do not think that external dependencies are a problem. Enough rambling :-) |
Here is a good discussion about using mpmath as an external dependency, starting with this comment: |
It's essentially another potential problem: Is upstream cooperative enough with bug fixing that it's worth it using their code? |
@certik, @mrocklin, @smichr, @asmeurer In my view, if a third-party dependency is mature, robust, approved, API-stable, and pure Python, it would be nice to install it as an external dependency.
What is your opinion? |
@certik, @mrocklin, @smichr, @asmeurer, Currently, the Python language natively provides a mechanism for dependency management. And |
@stevenleeS0ht SymPy now uses many optional dependencies (NumPy, SciPy, LLVM, ...). So I think this issue is now fixed. |
Currently SymPy depends only on the standard library; it does not allow dependencies on external code. Should we stop this tradition and allow external dependencies?
Reasons against dependencies
git bisect stops working, because each commit in general depends on a different external library version --- so one would need to quickly install proper external dependencies for each commit (that is hard to do currently)
Reasons for dependencies
Possible requirements for hard (required) external dependency
This is a list of possible requirements, we should select some of these.
Some of the virtues above are subjective.
Here are some other nice things that tend to imply these virtues
Requirements for soft (optional) external dependency
Hard Dependencies Already Found Bundled In SymPy
Soft Dependencies Used in SymPy
Useful External Dependencies
SymPy Modules that could gain more exposure if separated