Voxelwise stabilisation #675

Closed
wants to merge 37 commits into master from samuelstjean:voxelwise_stabilisation

Conversation

@samuelstjean
Contributor

samuelstjean commented Jul 9, 2015

So, a small PR: this adds a function to convert any Rician/noncentral chi noisy signal into a noisy Gaussian signal. It is mainly a Cython file with the core computation for working on one voxel, as I have a nice function wrapping everything else in a multiprocessed way. I'll post that one after this code here, or it would be too big to review properly at once. Oh, and we can also fix the hardcoded array in dipy.denoise.estimate_sigma while we're at it.
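
A conceptual sketch of the idea, for readers who want the gist before digging into the Cython file (this is only an illustration, not the PR's fixed-point implementation, and it uses scipy.special.hyp1f1 purely for readability; which 1F1 implementation is actually trustworthy is the whole discussion below):

import numpy as np
from scipy.special import hyp1f1, gammaln
from scipy.optimize import brentq

def noncentral_chi_mean(eta, sigma, N=1):
    # E[m] = sigma * sqrt(2) * Gamma(N + 1/2) / Gamma(N) * 1F1(-1/2, N, -eta**2 / (2 * sigma**2));
    # N = 1 gives the familiar Rician mean sigma * sqrt(pi/2) * 1F1(-1/2, 1, ...).
    beta_N = np.sqrt(2) * np.exp(gammaln(N + 0.5) - gammaln(N))
    return sigma * beta_N * hyp1f1(-0.5, N, -eta**2 / (2 * sigma**2))

def stabilise_voxel(m_hat, sigma, N=1):
    # Recover the underlying signal eta whose expected magnitude matches the
    # observed value m_hat. The PR does this with a dedicated fixed-point
    # scheme; a bracketing root finder is just the simplest way to show the idea.
    f = lambda eta: noncentral_chi_mean(eta, sigma, N) - m_hat
    if f(0.0) >= 0:                      # observation at or below the pure-noise floor
        return 0.0
    return brentq(f, 0.0, m_hat + 10 * sigma)

The PR wraps this kind of single-voxel computation in Cython, and the follow-up PR mentioned above adds the multiprocessed wrapper for whole datasets.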

@samuelstjean

Contributor

samuelstjean commented Jul 9, 2015

Apparently pip gets its install files from a scikit-image server instead of PyPI and can't install new dependencies because of that; is that indeed the case?

@arokem

Member

arokem commented Jul 9, 2015

What new dependency do you need and why?

@samuelstjean

Contributor

samuelstjean commented Jul 10, 2015

Any reason why that isn't the pip default, instead of a custom repo? I'd need either Cython bindings to the GSL library (recommended, for speed) or mpmath as a fallback for the confluent hypergeometric function. It turns out the mpmath version relies on an asymptotic expansion of the series as soon as z > 2^7, which does not work so well for the floating-point version (in our case z = -eta^2/sigma^2, with eta the value of one voxel and sigma^2 the variance of the noise). Hence you can use the numeric version for the simple cases and fall back to the symbolic version in mpmath proper when it does not converge fast enough (in practice it never actually converges there).

Having a look at the other scientific packages: Matlab, Maple and Mathematica all use the symbolic version, while R and Octave use the GSL version. So in this PR the choice of either the GSL or mpmath is offered, with a speed advantage for the GSL version, but you absolutely need one or the other, since the floating-point version does not always work.
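
To make the fallback concrete, it amounts to something like the following sketch (not the PR's code): try the fast double-precision mpmath backend first and only fall back to the slow arbitrary-precision evaluation when the fast result is unusable. Note that the finite check is not a complete guard, since the float backend can also silently return a wrong value in some ranges:

import math
import mpmath

def hyp1f1_with_fallback(a, b, z):
    val = mpmath.fp.hyp1f1(a, b, z)          # fast float64 backend
    if isinstance(val, complex) or not math.isfinite(val):
        val = mpmath.hyp1f1(a, b, z)         # slow arbitrary-precision fallback
    return float(val)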

@samuelstjean

Contributor

samuelstjean commented Jul 12, 2015

Wow, it really does not like the option I set. I guess the precompiled numpy/scipy wheels were there to save build time, but is there an easy way to specify another source in addition to the official Python Package Index? According to https://pip.pypa.io/en/latest/user_guide.html#installing-from-wheels it should still install from elsewhere if the wheel is not present; maybe I mistook the original error for something else, but I am at a loss.

@samuelstjean

Contributor

samuelstjean commented Jul 13, 2015

Turns out the problem is twofold: you cannot install something that is not present on the Travis scikit-image wheel builder thingy, and since Cython compiles stuff, it complains when it cannot find a header, which is only optionally used in our case but apparently still needed at compile time.

@matthew-brett

Member

matthew-brett commented Jul 13, 2015

If you want to install via the 'DEPENDS' environment variable, then one of
these things has to be true:

Otherwise, you can install via another call to pip in the travis file.

@samuelstjean

Contributor

samuelstjean commented Jul 13, 2015

I'll just add another pip call then; it takes about 5 seconds to install anyway, and it will be easier than adding another package-maintenance burden.

@samuelstjean

Contributor

samuelstjean commented Jul 14, 2015

OK, I really suck at this. Apparently Cython won't compile a cimport of a pxd file that does not exist, which defeats the choice between two different implementations. Any idea how to circumvent this? Maybe a glue file in Python that imports the real function if available and a dummy function if not could work, but I am no computer scientist, so maybe it would kill the performance.
Any suggestions?
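
One low-tech way around this is a small pure-Python glue module, so that nothing ever cimports a pxd that may not exist. This is only a sketch with made-up module names, not what the PR ended up doing:

# dipy/denoise/_hyp1f1.py (hypothetical glue module)
try:
    # Cython extension built only when the GSL was found at build time
    from dipy.denoise._stabilizer_gsl import hyp1f1
except ImportError:
    # pure-Python fallback through mpmath's float backend
    from mpmath import fp as _fp

    def hyp1f1(a, b, z):
        return float(_fp.hyp1f1(a, b, z))

The price is a Python-level call per voxel instead of a C-level one, which is exactly the performance worry raised here; keeping the switch at compile time (as discussed further down) avoids that overhead.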

@arokem

Member

arokem commented Jul 16, 2015

I don't think that we can use cythongsl anyway. GPL license...

Please edit all of that out of here, and let's go with an mpmath only version of this. That should simplify your Travis installation pains as well.

@samuelstjean

Contributor

samuelstjean commented Jul 16, 2015

I would strongly advise against that, for various reasons.

  1. Cython-gsl is much faster than both the numerical mpmath version and the symbolic mpmath version, and the symbolic one is what is actually needed. Just for comparison, on a small test the cython-gsl version takes 2 minutes, while the fully symbolic version takes about 6 hours, which is ridiculously slower.
  2. Scipy is numerically unstable; it would be the de facto one to use if it were fixed.
  3. Between having something partly working or fully working, it is better to use the fully working one in my opinion. People who have a gripe against the licensing can still use the other one. Besides, shore is totally unusable without cvxopt (which is also GPL-licensed), whereas this at least gives you the choice.
@samuelstjean

Contributor

samuelstjean commented Jul 21, 2015

Yay, it apparently builds now!

@samuelstjean

Contributor

samuelstjean commented Aug 2, 2015

Feel free to review and comment now that it's working (especially since some conflicts have apparently appeared). I plan on doing another, smaller PR after this one to provide ease-of-use functions that apply it to more than one voxel at a time, but that would be too big for this one.

@arokem

Member

arokem commented Aug 3, 2015

I'm still with a firm -1 on adding the GPL dependency.

If there is a workable (if slow) alternative that doesn't drag us down that path, I think we should stick to that alternative.

@arokem

Member

arokem commented Aug 3, 2015

One option would be to develop this as a separate dipy "add-on", with a different license (like the cythongsl thing...). In the long run, I think that's what we should do with SHORE as well. Remove it from the main repo and have it be a separate installation, with a separate license.

@samuelstjean

Contributor

samuelstjean commented Aug 3, 2015

Well, my first version had only the GSL dependency, so I made a second one with the mpmath dependency, and people complained it was too slow. So apparently, from an end-user point of view, if it's easy to install and it works, people just install it and don't bother with licensing and other issues. Those who do care can still install mpmath and run it.

Since it's not a hard dependency, it's not that bad. And now that a new comment popped up: wouldn't having a separate repo interfacing with this one bring the same licensing problem? That's what is currently happening with shore and this function. Scipy did something like that with scikit-sparse (https://github.com/njsmith/scikits-sparse), and while it is very useful (phantomas uses it, for example), the thing is hard to install and does not seem very maintained anymore.

@grlee77

Contributor

grlee77 commented Aug 3, 2015

I am not familiar with the confluent hypergeometric function, but it seems that is the only thing being used from GSL? If so, might the version in cephes (a version is included in scipy) be a possible alternative?
https://github.com/scipy/scipy/blob/master/scipy/special/cephes/hyperg.c

@samuelstjean

Contributor

samuelstjean commented Aug 3, 2015

I already tried it and it's unstable. Scipy's own version is actually wrapping something else, I think, which is also unstable. The whole point is to get a working version of said confluent hypergeometric function, but sadly none of the ones I have tested works in all cases, except the GSL one and the mpmath symbolic version.

@samuelstjean

Contributor

samuelstjean commented Aug 5, 2015

Oh, and also: because it's a pyx file, the code that calls the GSL library is only written if the library is present, meaning that if the user does not install it there is absolutely no reference to it in the code (it would not compile otherwise, so I had to dig into conditional compilation to make it work), which means that technically it dodges the GPL bullet when it is not used.
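
For reference, the compile-time switch being described relies on Cython's IF/DEF compile-time constants together with the compile_time_env argument of cythonize. The following is only a rough sketch of the mechanism, not the PR's actual file layout or function names:

# setup.py side (sketch): pass the flag when cythonizing
#   from Cython.Build import cythonize
#   extensions = cythonize("dipy/denoise/stabilizer.pyx",
#                          compile_time_env={'HAVE_GSL': have_gsl})

# stabilizer.pyx side (sketch): the GSL declaration and call are only emitted
# into the generated C file when HAVE_GSL is true, so a build without the GSL
# never references the library at all.
IF HAVE_GSL:
    cdef extern from "gsl/gsl_sf_hyperg.h":
        double gsl_sf_hyperg_1F1(double a, double b, double x)

    cpdef double _hyp1f1(double a, double b, double z):
        return gsl_sf_hyperg_1F1(a, b, z)
ELSE:
    from scipy.special import hyp1f1 as _scipy_hyp1f1

    cpdef double _hyp1f1(double a, double b, double z):
        return _scipy_hyp1f1(a, b, z)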

@samuelstjean

Contributor

samuelstjean commented Aug 5, 2015

I also ran the bash time command on a 55x55x55x65 synthetic dataset, with multiprocessing and all the required functions, since this PR only contains the innermost single-voxel function. The difference in time thus comes only from the hyp1f1 implementation used.

For the mpmath version
real 3m40.836s
user 23m53.466s
sys 0m2.124s

For the GSL version
real 1m8.101s
user 6m15.715s
sys 0m1.672s

@samuelstjean

Contributor

samuelstjean commented Aug 6, 2015

Well, while we argue about licensing and whatnot, the scipy version got fixed, yay! I'll test it to see if it suits the purpose now, and we'll go from there.
Also, I just tried the direct mpmath floating-point version, and there still seem to be some stability problems on some datasets, so I hope the scipy fix holds.

@arokem

Member

arokem commented Aug 6, 2015

👍

@samuelstjean

Contributor

samuelstjean commented Aug 6, 2015

I just ran it on a full-brain dataset, and the max error with respect to the GSL version is 10**-5, so based on this single test it seems good. Now, two questions arise:

  1. It's slower, but whatever... this is only a 2x2x2 mm, 65-DWI dataset, but as soon as you go to abusive datasets (HCP, HCP++ acquisitions at 0.8x0.8x0.8 mm and 200+ DWIs), the time constraint will become important, since it scales with the number of voxels. So we could suggest the GSL, with scipy 0.17 as the slower alternative, to not cripple it too much.
gsl version
real    7m13.706s
user    43m52.905s
sys 0m14.445s

fixed scipy version
real    11m50.516s
user    78m43.619s
sys 0m17.549s
  2. It's only going to be in scipy 0.17, so 5+ years until it hits a stable Linux repo such as Red Hat or Debian. The problem is that the fix is in Fortran files (specifically scipy/special/specfun.f, which becomes the scipy.special module through magical and probably complicated wrappers). Apparently @matthew-brett might have some idea how we can use this without installing a brand-new scipy.
@samuelstjean

Contributor

samuelstjean commented Aug 9, 2015

OK, after all I think I'll just drop the mpmath version; I tried it on another plain old regular dataset and it trashed the whole thing for some reason (tons of brain voxels ended up at 0 with mpmath.fp.hyp1f1).

So, verdict: the GSL version is recommended for speed; if it is not available, use the scipy 0.17dev version, since it seems to work on the aforementioned dataset. Any suggestion on how to handle the Fortran files coming from scipy 0.17dev until it gets a proper stable release?

@samuelstjean

Contributor

samuelstjean commented Aug 18, 2015

Should be a bit simpler now: it can use either the GSL for speed or a recent scipy version, since the latter is now supposedly fixed. So far it's slower, but at least it seems to work with what I have thrown at it.

@samuelstjean

Contributor

samuelstjean commented Jan 15, 2016

Well, as this thing has sadly been sitting idle for some time, does anyone know where I could find working Laguerre polynomials? The ones in scipy are not working; they might be using the same thing under the hood after all.

from scipy.special import gamma, eval_genlaguerre, hyp1f1

# Uses the identity 1F1(a, b, z) = Gamma(1-a) * Gamma(b) / Gamma(b-a) * L_{-a}^{(b-1)}(z)
def h1f1(a, b, z):
    return (gamma(1 - a) * gamma(b) / gamma(b - a)) * eval_genlaguerre(-a, b - 1, z)

a=0.5

In [10]: hyp1f1(a,12,-400)
Out[10]: 0.1655629069122436

In [11]: h1f1(a,12,-400)
Out[11]: 0.16556290691224354

In [12]: h1f1(a,12,-1400)
Out[12]: nan

In [13]: hyp1f1(a,12,-1400)
Out[13]: nan

In [14]: import mpmath
In [16]: mpmath.hyp1f1(a,12,-1400)
Out[16]: mpf('0.089318702604585311')

In [18]: eval_genlaguerre(-a, 12-1, -1400)
Out[18]: nan

Well at least it works in theory

@arokem

Member

arokem commented Jan 16, 2016

Have you reported this on the scipy mailing list/github repo?

@samuelstjean

Contributor

samuelstjean commented Jan 16, 2016

The fact that hyp1f1 doesn't work? It's been flagged for a couple of years (before they moved to GitHub): lots of people complaining in issues, and some PRs to partially fix it were never merged. It's coded in Fortran, and they must use an unstable expansion or something; this family of functions is just a nightmare numerically. I doubt a robust numerical implementation is even possible, as the function is ill-defined (or even nonexistent) for some values.

@samuelstjean

Contributor

samuelstjean commented Jan 21, 2016

Seems like people have decided to rewrite the whole thing; let's hope they can make it more robust than the old one.

https://mail.scipy.org/pipermail/scipy-dev/2016-January/021164.html

@samuelstjean samuelstjean force-pushed the samuelstjean:voxelwise_stabilisation branch from 47e57ae to 9ea53bb Feb 29, 2016

samuelstjean added some commits Feb 29, 2016

@samuelstjean

Contributor

samuelstjean commented Feb 29, 2016

That's weird, rebasing always deletes what I do and I have to redo it each time...

samuelstjean added some commits Feb 29, 2016

@samuelstjean

Contributor

samuelstjean commented Feb 29, 2016

And now it works ;)

@samuelstjean

Contributor

samuelstjean commented Mar 8, 2016

So, might @arokem or @Garyfallidis have a look at this fine piece of code? I know Elef told me some time ago that math is hard to review, but it's just a single fixed-point formula (and its downstream code), so I tried to keep it simple (and rather short) for now.

@arokem

Member

arokem commented Mar 8, 2016

Math is not a problem (not that it's easy). The GSL dependency still seems unnecessary to me. Besides the license issues brought up above, I can't even install it on my computer -- a Mac laptop.

Or can I? How do I get GSL going on this platform?

@samuelstjean

Contributor

samuelstjean commented Mar 8, 2016

Well, easy: that's why I tinkered with the setup.py.

You can:

1a. get scipy 0.17
1b. get the GSL

If you don't have one or the other, it will complain. If it finds both the GSL and scipy, it will use the former, as it is about twice as fast (roughly, a 64-direction 128x128x60 dataset takes 7.5 min vs 15 min to process, and that is with multiprocessing).

As for the GSL, it's in every Linux distro repository. On Mac, don't you guys use Homebrew or some other custom repo? JC and Maxime would know about that, but apparently you can also install it with Anaconda; some people have packaged it, for example https://anaconda.org/asmeurer/gsl

On Windows, better go with scipy 0.17, and it's going to land in the next Ubuntu LTS anyway, so no big deal. Besides, it should help the non-linear fitting for the free water elimination stuff. Oh, and the whole point is to offer both scipy and the GSL as valid alternatives, so people can freely choose between speed and ease of installation, kind of like scipy and the FFTW wrapper I'd say. And anyone else not using it can of course just stay on old versions; having everything optional is doable.
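
For what it's worth, the build-time check described above can be as simple as asking gsl-config whether the GSL is installed and falling back to a scipy version check otherwise. This is only a sketch of the idea, not the PR's actual setup.py:

import subprocess

def have_gsl():
    # gsl-config ships with the GSL itself (distro packages, Homebrew, Anaconda)
    try:
        subprocess.check_output(['gsl-config', '--version'])
        return True
    except (OSError, subprocess.CalledProcessError):
        return False

def have_recent_scipy():
    import scipy
    from distutils.version import LooseVersion
    return LooseVersion(scipy.__version__) >= LooseVersion('0.17')

if not (have_gsl() or have_recent_scipy()):
    raise SystemExit("Need either the GSL (fast path) or scipy >= 0.17 "
                     "for a usable hyp1f1; please install one of the two.")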

@fmorency

fmorency commented Mar 14, 2016

I'm not a dMRI expert, but I'm willing to perform a code review if that would help. This piece of code fixes an issue on real-world datasets.

Licensing is not an issue for us since we don't sell software and the license permits us to sell service using GPL software. That being said, we would very much like to have the fastest version available (GSL) since processing time is critical for us.

@samuelstjean

Contributor

samuelstjean commented Mar 14, 2016

And also, the scipy 0.17 fallback is always available, and hopefully works. By the way, this is kind of a software version of the MGH paper from last summer on obtaining real-valued noisy datasets using the phase, but without needing the phase in the first place.


@arokem

Member

arokem commented Mar 14, 2016

The choice of license is important to the project because we don't want to restrict downstream users and developers. See also: http://nipy.sourceforge.net/nipy/devel/faq/johns_bsd_pitch.html


@samuelstjean

Contributor

samuelstjean commented Mar 14, 2016

Well, that is why, since Python loads things dynamically, people can freely choose between the two. Kind of like how the positivity constraint with shore is available if I wish it, and I can perfectly well use it with my own stuff. If I ever decide to distribute something based on it, I just don't use the positivity constraint and the license stays BSD.

Kind of like scipy does with FFTW, as I mentioned before, I'm sure: it offers speed vs. the BSD license for those who need it, see https://scipy.github.io/devdocs/tutorial/fftpack.html


@Garyfallidis

Member

Garyfallidis commented Mar 14, 2016

Sam, sorry if we didn't make it clear, but I think it should have been obvious by now that if you want this PR merged you have to forget about the GSL dependency. Any example such as cvxopt or fftw is irrelevant, as you now have a working scipy version. A slightly slower runtime is not as important as the license. If you want this merged you have to remove the GSL dependency; otherwise please make another project and put there whatever dependencies or licenses you wish, but NOT in DIPY. I am giving you one week to make the change and remove the dependency. If you haven't done that in one week from now, I will close this PR.

Felix, thank you for the suggestion, but here it's more of a license issue that we are struggling with, and even more than that, a lack of understanding of the philosophy and priorities of DIPY. Sam has been really stubborn about this for more than a year, but he is definitely wrong. A bit of speedup is not as important as the license. I hope this is understandable now. It would be really helpful if you could review his code; the more eyes the better.

@samuelstjean

Contributor

samuelstjean commented Mar 14, 2016

Well, actually, that is why you should definitely go read the setup.py. It checks whether you have either scipy or the GSL and only compiles the needed parts. You always say that whatever is in scilpy should go here down the road, but I feel like you did not even check what is contained in the PR in the first place. Well, if only scipy would work, I would use it; it might be fixed in 0.17 now, which still has some unresolved numerical issues, but maybe they don't trigger here. The whole point of using the version from the GSL is that it works, unlike old versions of scipy, and is actually usable in a reasonable amount of time, unlike mpmath.

In any case, why not make a submodule with those dependencies if you want a clearer cut? It could be built only if explicitly requested (and yes, I do know how to use setup.py magic for cases like these), and it would also be possible to put shore there, which would be cleaner than printing a warning.

@Garyfallidis

Member

Garyfallidis commented Mar 15, 2016

I read your setup. Nothing changes.

Sorry Sam, you are again continuing the same repeating discussion that we have now been having for years. This definitely shows me that you don't understand the project's priorities and that you do not respect other people's time. From now on I would suggest that you focus only on patches and do not make new PRs for new features or new modules, as every single time we struggle to explain basic concepts to you. Please also stop giving advice to newcomers, as you are far from the philosophy of this project.

However, the work you have done has value. So, what seems to me to be the best solution for you is to make your own project on GitHub and add there the code, licenses or libraries that you want. Be happy to depend on Dipy and do not forget to cite us. Good luck with this, and sorry that we cannot satisfy your needs.

@samuelstjean

Contributor

samuelstjean commented Mar 15, 2016

So, I don't get it: why is it OK for shore, while whatever I do is always bad? That does not seem coherent. And I also thought you always encouraged people to put stuff here instead of in scilpy; why the change of heart?


@samuelstjean

Contributor

samuelstjean commented Mar 16, 2016

In the end, since people are asking why the GSL is used here, I think there is a lot of confusion about the reasoning that led to this and the core issue it tackles, which is numerical stability. This PR requires the 1F1 confluent hypergeometric function [1]. While many implementations are available, they all suffer from numerical inaccuracies in certain value ranges. Here are my findings:

  1. Direct sum -> blows up numerically pretty fast. It is also fast to evaluate, but if you ever see some code doing that, don't trust it.
  2. mpmath (and Matlab, Maple, Mathematica) -> pretty much works every time, because it is symbolic math; mpmath uses the formula linked in [1].
  3. scipy -> blows up for our range of values (large negative third argument). Partially fixed in 0.17, but the core recurrence they use is known to have issues, so contrary to what was suggested previously, rewriting it in Cython is guaranteed not to fix the problem, as it lies in the formula itself. It still has some issues as of today, which would need a rewrite.
  4. GSL (Octave, R, this PR) -> they implement tons of workarounds and different formulas from Abramowitz and Stegun. It works in all my tests, but according to mpmath it has problems in other ranges of values that I did not encounter.
  5. Cephes -> it chooses on the fly between the asymptotic and the power series to compute the sum. Pretty much the same as scipy < 0.17, but it is nice enough to estimate the error it commits, so you can discard nonsense values in those cases, which sadly is pretty much always in our case.

So this is the one issue on which I absolutely cannot compromise, because giving in would be the same as shipping a broken algorithm for the sake of dependencies. This pretty much leaves three contenders: mpmath, scipy >= 0.17 and the GSL. Which brings us to issue no. 2, speed.

  1. mpmath is symbolic math, so it is always going to be slow but super accurate, by design. It also cannot be rewritten in float64, as we tested ourselves, because of precision issues. About 100x to 1000x slower than scipy.
  2. scipy: I'd really like this one to work, but it did not in the past, and I have not thoroughly tested it in its new form, because the formula itself is known to have problems. It is currently being rewritten by three people on the mailing list using another algorithm (Miller's, for reference). Also about 2x slower than the GSL.
  3. GSL: the one that has been used internally by everyone for the past year and a half. Pretty much the fastest, and it does not have numerical problems in our cases.

So, before we even get to the licensing issue: if we stop there, I can put back the mpmath version, because I cannot guarantee that scipy works 100% of the time, and I'd hate for people to come here in 2-3 months with weird problems which simply cannot be fixed. It is still useful in this form for single-voxel experiments, like Marco's phase stuff (he has been using it since he came to Sherbrooke) or the shore people who want to deal with the nature of the noise.

So, my offer is to use mpmath, because I can put my stamp of approval on it working (by design). The third option I went with here is trading licensing for speed, which is exactly what shore does for speed reasons by using cvxopt instead of scipy, as written in the comments here: https://github.com/nipy/dipy/blob/master/dipy/reconst/shore.py#L231

Take-home message: I trust mpmath, and it is still useful for single-voxel experiments in this case. For speed, allow the choice of implementation (as I have done here, since it seems OK for other modules), or I guess people can just come see me for full datasets if you really don't want to.

[1] https://en.wikipedia.org/wiki/Confluent_hypergeometric_function#Asymptotic_behavior
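
For anyone who wants to check the failure region themselves, a quick comparison along these lines reproduces the problem (this harness is only illustrative and not part of the PR; it uses the Rician-case parameters a = -1/2, b = N = 1 as an example and takes the arbitrary-precision mpmath value as the reference):

import mpmath
from scipy.special import hyp1f1

sigma = 1.0
for eta in (1.0, 5.0, 10.0, 30.0, 50.0):        # signal values, so z becomes very negative
    z = -eta**2 / (2 * sigma**2)
    ref = float(mpmath.hyp1f1(-0.5, 1, z))      # arbitrary-precision reference
    val = hyp1f1(-0.5, 1, z)                    # scipy (wrapping the Fortran specfun code)
    rel_err = abs(val - ref) / abs(ref)
    print("z = %9.1f  scipy = %.8e  mpmath = %.8e  rel. err = %.1e"
          % (z, val, ref, rel_err))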

@Garyfallidis

Member

Garyfallidis commented Apr 4, 2016

Hi @samuelstjean. Thank you for your answer. However, it is not a good answer, because you don't tell the full story:

a) You found the functions that do the actual computations correctly, but you did not want to put in the work to port them or understand the issue deeply. Neither did you want to help scipy get a faster and more robust version. You were looking for a quick fix, completely heartless about the fact that your small feature may delay upcoming releases by adding extra dependencies, and forgetting that all the rest of us struggle with dependencies too and have repeatedly recoded things from scratch. Many times!!

b) You keep bringing up the argument that because shore and mapmri use cvxopt you should be able to do the same, when actually it was you who made it a big deal and brought up the discussion on licenses in the first place. Do you remember? You were so sure that changing the project to GPL was a good idea and that everyone should comply, when our project and many others like scikit-learn, scikit-image, nibabel etc. have a very different opinion from yours, and they have developers with much longer experience.

With this behaviour you redirected everyone's time from trying to solve problems to talking about licences, which is the last thing that we want to do. That was very counterproductive and it cost the project valuable time. And you repeated the same behaviour for years, mostly in personal interactions with other developers, on the list, or on GitHub.

Unfortunately, you still don't understand, after years of discussions, what this project is about. So, just to be sure that we are clear here: there will be no new dependency added, nor will we accept slow versions of any sort in this PR. The only way for this to work is to do the actual job within the current policies, as all of us do, independently of how hard it is. And if that method doesn't work, then you need to invent something else or not do it.

So, how many times have we had to reimplement things from scratch because there was nothing available with the right licenses? Hundreds!!! We even rewrote SyN. And that is what you should do, rather than trying to change the philosophy or licensing of the project because of laziness or a small or large difficulty. Nobody said that high-quality development is easy, or that everyone should or can do it.

Because of this repeated behaviour over the last years you have disrupted the development of the project, always focusing our attention on your personal coding micro-issues. This PR is going to close now, and that will be your last feature in the project.

From now on, focus on patches and small bug fixes but not new features. Consider this a warning. If you continue the same behaviour I will have to give you a red card, and believe me I don't want to, but you have pushed us beyond our capacity. Finally, if DIPY is not for you, you can make your own fork and do whatever you want there. That is something great about open source: it allows different views to create new projects. Or you can also put your features in Scilpy or ExploreDTI etc. Many opportunities, but no more new features here. Good luck.

@samuelstjean

Contributor

samuelstjean commented Apr 4, 2016

I think you 1) don't really understand the issue and 2) give me really little credit for the time I spent on this.

A. Scipy has nothing to do with this; I pretty much tried every version and they all have issues. Because of the license, you simply cannot translate the GSL version: it would be a derivative work and would have to stay under the GPL, so there is nothing to gain there license-wise. Besides, they implement over 25 corner-case fixes vs. 4 for scipy, so for such a small function this also falls under the 'no, rewriting an optimizer would be too much trouble' category, like we told you about shore some time ago. People are still working on it, but you seem to underestimate the task, from a lack of understanding of numerical computation I think. This is something that requires a specialized algorithm for each region of the domain (or arbitrary-precision math), which is probably why no version is perfect. I even think the function 1F1(a, b, z) is undefined for some large negative values of z, and of course our case involves z = -eta^2/(2*sigma^2), with eta the signal value, so we always end up near singularities, which makes the function blow up for about 90% of a b0 image in scipy. I did not want to bore people with the details, but if anyone wishes to enquire more about the function and the various implementations, I do remember trying a lot of them.

B. Well, of course I brought up the issue: you cannot call yourself an open source project if you don't play by the rules. It seems like you prefer downstream project support over direct user features (which is fine by itself; each project needs guidelines and goals), but you should definitely write that down somewhere more explicitly (I suggest here, for example: http://nipy.org/dipy/mission.html, or contributing.md). At best, it is sneaky to include GPL code in an otherwise mostly BSD codebase without explicitly telling people / giving them the choice. And yes, the whole reason behind this was speed: you can use the scipy version, but it will be so slow that it would cripple the function for no reason. Better to be upfront about it than have people complain in a few years. Other projects can do what they want, but mixing both without telling people, or making special exceptions (although I do agree shore is much better with cvxopt as it is), is dubious in my opinion.

And well, I'm pretty sure the relevant part of sklearn could also be rewritten from scratch (actually, Emmanuelle did it for her machine learning class about a month and a half ago), but as you mentioned, that takes time, and it also ends up slower and less maintained than a real project that would use the library directly. The whole point of libraries is to share code for better maintenance and to avoid doing everything twice. Some are easier than others, while some took many years to develop and would bring little benefit to redo if a good one already exists.

You also mention rewriting everything, but for this one I also feel like you never really wanted to work with me. Sending me half-assed direct-sum implementations from random people on the Matlab File Exchange which would clearly not work, telling me 'did you even try it?' when a single glance tells me it definitely won't work, and calling me lazy because I can say so without even trying, is just plain rude and insulting.

So, anyway, like I said: mpmath, or wait and see if those guys from the scipy mailing list come up with a working solution. In the meantime, people can use the fast, complete and multithreaded version from scilpy, as in the past year and a half. Don't come complaining that nobody puts stuff back in dipy, with emails or by making us feel bad IRL about not contributing back, if you keep saying that what we do is useless and that we should just slave away at rewriting everything every time. Be happy instead that people are even willing to help others. I seriously think you underestimate the amount of time everyone puts into this (not only me, but pretty much everyone from Sherbrooke and friends), and sometimes redoing everything is too much effort for too little gain when a good alternative exists.

P.S. Before I forget, why is slow not a good idea? Between GPL, slow, not working, or nothing, it seems like a good compromise for people wanting to use it. Oh, and this might sound rude, but it would be difficult to come up with another solution not using 1F1, because it arises naturally in the moment-generating function (i.e. the function which uniquely defines every parametric distribution) of the (non)central chi (squared) distribution: http://mathworld.wolfram.com/NoncentralChi-SquaredDistribution.html But if you do find a viable alternative, feel free to tell me.

@Garyfallidis

Member

Garyfallidis commented Apr 4, 2016

Sam, you can sing as much as you want and keep blaming me for all your problems; no problem with that. But let's also hear what other people from Sherbrooke, who know the story from the inside out, have to say. I believe that what you say above is only your personal view.

It is indeed a shame, because I know that you have put a lot of your time into helping with Dipy, but that doesn't change the fact that you have been continuously harassing me and other developers on the team. In Dipy, the character and team spirit of a developer are more important than the amount of time put into the project or the level of their skill. As you know, and as others have told you, it is extremely difficult to work with you in open source, as you become so territorial with your beliefs. It shows a large amount of insecurity, which is something that you really need to improve. I gave you plenty of time to improve, but it hasn't been enough.

I hope you will learn from this and grow to be a wiser person. Unfortunately, I think the best thing for Dipy is for you to move out of Dipy, as we have visions for the project that go in completely opposite directions. This has been obvious over the last years and would end up being destructive for the future of this project. But look at the positive side: you have learned a lot and improved in many areas.

When I first arrived in Sherbrooke you barely knew how to code, so see how much you have improved now and how much wider your knowledge is. So, I think if you are smart you will learn from this, move ahead, and improve your working relationships in your new projects. I also think it would help you massively to try leading a project yourself. That way you will see how easy it is to judge, and how much more complicated things get when you actually have to make the decisions.

Finally, you said something about me being rude, but if we look at the history of your posts in the different PRs, in the e-mails and on GitHub, we can see that you have been rude to most people while making your own work the centre of attention, although your actual contribution of features is very small (you mostly contribute small fixes and typo fixes); at the same time you give advice as if you were on the core team and had become an expert. For example, I definitely didn't like the way you commented to Rafael ('fishy'), Stephan ('unrealistic/blurring'), Ariel (several bogus comments on fitting) and to the new GSoC students.

Nor did I like that at the last ISMRM you were hiding your work on denoising rather than explaining it; you didn't have a single piece of information about how your method works in your poster. This is definitely against the philosophy of Dipy, which is all about increasing openness in open source. And let me not forget how extremely rude you were at ISMRM to Prof. Roland Henry (Bago's supervisor), trying to explain to him how you knew it all about diffusion MRI and he was not understanding anything. You have indeed shown a consistent lack of modesty. I can give many other examples where you acted against the mission of Dipy.

I will let the others say what they think about this and then block your access to the project and the relevant applications. As I said above, you have improved compared to before. Be wiser, and try to adapt and evolve to be a better scientist and team member in your next endeavours. Unfortunately, it is too late for that to be helpful to this specific project.

@Garyfallidis

Member

Garyfallidis commented Apr 6, 2016

Finally, I would like to tell you that on a personal level, although it is true that you made it really hard to work together and managed to upset most of the people on the team, there is no hatred of any kind, as we do understand that you struggle with your own personal battles.

At the same time, we did not see enough progress from your side during the last years for us to move beyond your past and current attitude, and the only thing we can do to improve things is to show you that we have been affected by your actions, to the point that it makes it impossible to consider working on the same project now or in the future.

What you need to do from now on is understand that you need to control yourself and ask for advice on how to improve your social and working skills. That is the only way to win back the trust of colleagues and friends. Otherwise, you will keep putting yourself and your surroundings through unnecessary pain. I wish you the best of luck, and I am confident that at some point you will understand that this action was necessary so that you can grow to become a wiser individual, more useful to the community.
