add procrustes? #3786
Comments
I think this would be nice to add to scipy. I haven't contributed to any scipy enhancements before, but I'd like to try this one or help out with it if the decision is made to add this feature. |
If there's interest to add this to scipy, we would be willing to contribute the implementation in cc @gregcaporaso, @jairideout, @ebolyen, @antgonza, @rob-knight, @wdwvt1 |
Sounds good to me, if there is nothing there to do that. 👍 |
Looks good to me. I see Could you bring it up on the mailing list? That's where decisions on including new functionality are made. |
I think it would be better to return the orthogonal transformation matrix omega. For a solver of the orthogonal Procrustes problem in scipy.linalg, it should be the only result that is returned. |
Procrustes analysis is regarded as a spatial statistics problem. But solving the orthogonal Procrustes problem is a linear algebra problem. I think a low-level solver for the orthogonal Procrustes problem would fit best in scipy.linalg, but a high-level Procrustes analysis (with full statistical tests) could go in scipy.spatial. |
Also, don't use np.transpose(a) in the code. Using a.T is preferable, and np.dot is smart enough to handle C and Fortran arrays correctly with transpose flags in BLAS. In SciPy, code that uses scipy.linalg.svd should preferably use Fortran arrays consistently to avoid f2py overhead. So the procrustes solver would typically start with calling np.asfortranarray on its inputs. This also allows the input to be "array-like". |
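As a rough illustration of the input handling suggested above, here is a minimal sketch. The helper name `as_fortran_inputs` is hypothetical and not part of SciPy; it only shows `np.asfortranarray` accepting array-like input and `a.T` being passed straight to `np.dot`.

```python
import numpy as np


def as_fortran_inputs(a, b):
    """Coerce two array-likes to Fortran-ordered float arrays.

    Hypothetical helper for illustration only; not a SciPy function.
    """
    a = np.asfortranarray(a, dtype=float)
    b = np.asfortranarray(b, dtype=float)
    return a, b


# Plain nested lists are accepted because asfortranarray handles array-likes.
a, b = as_fortran_inputs([[1, 2], [3, 4], [5, 6]], [[0, 1], [1, 0], [1, 1]])

# a.T is a view on the same memory, so no copy is made before the product.
m = np.dot(a.T, b)
```

Whether the Fortran layout actually saves time depends on the LAPACK wrapper in use, so treat this as a convention to follow rather than a measured optimization.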
@sturlamolden Maybe a u, s, vh = np.linalg.svd(np.dot(np.transpose(mtx1), mtx2))
q = np.dot(np.transpose(vh), np.transpose(u)) in the scikit-bio helper function? |
scipy.linalg.procrustes would be a shorter name. I would also suggest that we use scipy.linalg.svd instead of numpy.linalg.svd in SciPy. The main thing would be that a function in scipy.linalg.* should solve the linear algebra problem, and nothing else. Cf. scipy.linalg.lstsq, which solves the OLS problem but does not actually do linear regression analysis. scipy.spatial.procrustes could do full Procrustes analysis and call scipy.linalg.procrustes as a utility function. |
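To make the "linear algebra only" idea concrete, here is a minimal sketch of such a solver. The function name `orthogonal_procrustes_sketch` is hypothetical, and it uses numpy.linalg.svd only to keep the example dependency-free; inside SciPy the suggestion above is to call scipy.linalg.svd instead. It returns only the orthogonal matrix, with no translation, scaling, or statistics.

```python
import numpy as np


def orthogonal_procrustes_sketch(a, b):
    """Return the orthogonal matrix r that minimizes ||a @ r - b||_F.

    Hypothetical name for illustration; only the matrix is returned.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim != 2 or a.shape != b.shape:
        raise ValueError("a and b must be 2-D arrays of the same shape")
    # The minimizer is u @ vh, where u, vh come from the SVD of a.T @ b.
    u, _, vh = np.linalg.svd(a.T @ b)
    return u @ vh


# Example: b is a mirrored copy of a (swapping the two axes is a reflection).
a = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 7.0]])
r_true = np.array([[0.0, 1.0], [1.0, 0.0]])
b = a @ r_true
r = orthogonal_procrustes_sketch(a, b)
```

Because the solution ranges over all orthogonal matrices, a reflection is recovered directly in two dimensions here, without padding the data with an extra dimension.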
Added a PR #3809 for the orthogonal part that is purely linear algebra, without translation or scaling. This follows the API suggested by @sturlamolden, but with the longer function name. |
As I was adding that PR, I noticed small inefficiencies in the scikit-bio procrustes code which I thought I'd write down while I still remember. No extra dimension is needed for reflection. If you remove the extra-dimension-related code you will find that the 2d example in the docstring, which requires reflection, will still work. Also, code like |
@sturlamolden, @argriffing should I wait for #3809 to be merged and then work on porting over the procrustes code with the improvements that @argriffing has pointed out thus far? Otherwise what would be the best way to do this? |
I tried setting up scikit-bio for development but its tests keep failing and I'm not patient enough to finish fixing it, so I probably won't make a scikit-bio PR for those improvements soon. Currently I think I have an incompatible version of "mpl_toolkits." |
@argriffing, what issues were you having with scikit-bio install? pip install works for us on linux and mac (though we don't currently support windows, so we expect issues there). |
When I run
I assume this is because I screwed up the installation or in trying to set up a development environment or because I have an obsolete version of some package. This is on Ubuntu with Python 2.7. If I find myself motivated to try to fix it then if I have any questions I'll be sure to ask on a scikit-bio forum! |
Two skbio PRs are submitted! |
Thanks! On (Jul-22-14|18:15), argriffing wrote:
Just to give an update for this issue: the two skbio PRs are now merged, I think the scipy linalg procrustes PR should be in a reviewable state, and the hypothetical scipy spatial procrustes PR has not yet been created. |
@argriffing, I just added one minor documentation comment to #3809! Thanks, I'll try to get a PR ready in the next few days. |
Linalg procrustes PR merged in e346aa5, can this be closed? |
@ev-br this isn't quite finished, because the merged |
Ok, let's keep this open then. |
Add scipy.spatial.procrustes which includes a main function, `procrustes`, that translates, rotates and scales two sets of points to optimally superimpose them. This code is being ported from scikit-bio in response to scipy#3786: https://github.com/biocore/scikit-bio Fixes scipy#3786
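The translate, scale, and rotate steps described in that commit message can be sketched roughly as follows. This is an assumption-laden illustration, not the ported code: the name `procrustes_sketch`, the choice to normalize both inputs to unit Frobenius norm, and the returned tuple are all hypothetical choices made for this example.

```python
import numpy as np


def procrustes_sketch(data1, data2):
    """Superimpose data2 onto data1 by translation, scaling and rotation/reflection.

    Hypothetical sketch. Returns the standardized data1, the transformed
    data2, and the sum of squared differences between them.
    """
    mtx1 = np.array(data1, dtype=float)  # copies, so inputs are not modified
    mtx2 = np.array(data2, dtype=float)
    if mtx1.ndim != 2 or mtx1.shape != mtx2.shape:
        raise ValueError("inputs must be 2-D arrays of the same shape")

    # Translation: move both centroids to the origin.
    mtx1 -= mtx1.mean(axis=0)
    mtx2 -= mtx2.mean(axis=0)

    # Scaling: give both point sets unit Frobenius norm.
    norm1 = np.linalg.norm(mtx1)
    norm2 = np.linalg.norm(mtx2)
    if norm1 == 0 or norm2 == 0:
        raise ValueError("each input must contain at least two distinct points")
    mtx1 /= norm1
    mtx2 /= norm2

    # Rotation/reflection: orthogonal r minimizing ||mtx2 @ r - mtx1||_F,
    # followed by the optimal residual scale, the sum of singular values.
    u, s, vh = np.linalg.svd(mtx2.T @ mtx1)
    mtx2 = (mtx2 @ (u @ vh)) * s.sum()

    disparity = np.sum((mtx1 - mtx2) ** 2)
    return mtx1, mtx2, disparity


# b is a shifted, scaled and mirrored copy of a, so the fit should be near-exact.
a = [[1, 3], [1, 2], [1, 1], [2, 1]]
b = [[4, -2], [4, -4], [4, -6], [2, -6]]
m1, m2, disparity = procrustes_sketch(a, b)
```

Keeping the orthogonal step as a separate SVD call mirrors the split discussed in the thread, where the spatial routine would delegate that part to the linalg solver.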
The problem is to find a transformation of n p-dimensional points X_i, maybe allowing translation/rotation/scaling, so that the transformed points closely match up with n corresponding target points Y_i, with mismatch penalized by maybe sum of squares of errors. I've written code for this at one time, downstream projects e.g. skbio.math.stats.spatial.procrustes also have implementations, matlab has it, web forums have questions about how to do this in numpy/scipy, and it would seem to be in scope to be added to scipy. http://en.wikipedia.org/wiki/Orthogonal_Procrustes_problem