Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add procrustes? #3786

Closed
argriffing opened this issue Jul 11, 2014 · 23 comments
Closed

add procrustes? #3786

argriffing opened this issue Jul 11, 2014 · 23 comments
Labels
enhancement A new feature or improvement scipy.linalg
Milestone

Comments

@argriffing
Copy link
Contributor

The problem is to find a transformation of n p-dimensional points X_i, maybe allowing translation/rotation/scaling, so that the transformed points closely match up with n corresponding target points Y_i with mismatch penalized by maybe sum of squares of errors. I've written code for this at one time, downstream projects e.g. skbio.math.stats.spatial.procrustes also have implementations, matlab has it, web forums have questions about how to do this in numpy/scipy, and it would seem to be in scope to be added to scipy.
http://en.wikipedia.org/wiki/Orthogonal_Procrustes_problem

@jacobcvt12
Copy link
Contributor

I think this would be nice to add in to scipy. I haven't contributed to any scipy enhancements before, but I'd like to try this one/help out with it, if the decision is made to add in this feature.

@ElDeveloper
Copy link
Contributor

If there's interest to add this to scipy, we would be willing to contribute the implementation in skbio.math.stats.spatia.procrustes.

cc @gregcaporaso, @jairideout, @ebolyen, @antgonza, @rob-knight, @wdwvt1

@antgonza
Copy link

Sounds good to me, if there is nothing there to do that. 👍

@rgommers
Copy link
Member

Looks good to me. I see skbio puts it in spatial, but it looks like in scipy it would belong in linalg right?

Could you bring it up on the mailing list? That's where decisions on including new functionality are made.

@ElDeveloper
Copy link
Contributor

Sounds good I am waiting for the e-mail confirmation from the scipy-dev list, once I get that I'll reference this issue there.

@sturlamolden
Copy link
Contributor

I think it would be better to return the orthogonal transformation matrix omega. In case of a solver for the orthogonal Procrustes problem in scipy.linalg it should be the only result that is returned.

@sturlamolden
Copy link
Contributor

Procrustes analysis is regarded a spatial statistics problem. But solving the orthogonal Procrustes problem is a linear algebra problem. I think a low level solver for the orthogonal Procrustes problem would fit best in scipy.linalg, but a high level Procrustes analysis (with full statistical tests) could go in scipy.spatial.

@sturlamolden
Copy link
Contributor

Also, don't use np.transpose(a) in the code. Using a.T is preferable, and np.dot is smart enough to handle C and Fortran arrays correctly with transpose flags in BLAS.

In SciPy, code that uses scipy.linalg.svd should preferably use Fortran arrays consistently to avoid f2py overhead. So the procrustes solver would typically start with calling np.asfortranarray on its inputs. This also allows the input to be "array-like".

@argriffing
Copy link
Contributor Author

@sturlamolden Maybe a scipy.linalg.orthogonal_procrustes(A, B) function which uses the Wikipedia notation A, B, and R, and which is implemented similarly to the two lines

u, s, vh = np.linalg.svd(np.dot(np.transpose(mtx1), mtx2))
q = np.dot(np.transpose(vh), np.transpose(u))

in the scikit-bio helper function?

@sturlamolden
Copy link
Contributor

scipy.linalg.procrustes would be a shorter name.

I would also suggest thet we use scipy.linalg.svd instead of numpy.linalg.svd in SciPy.

The main thing would be that a function in scipy.linalg.* should solve the linear algebra problem, and nothing else. Cf. that scipy.linalg.lstsq solves the OLS problem, but does not actually do linear regression analysis.

scipy.spatial.procrustes could do full Procrustes analysis and call scipy.linalg.procrustes as a utility function.

@argriffing
Copy link
Contributor Author

Added a PR #3809 for the orthogonal part that is purely linear algebra, without translation or scaling. This follows the API suggested by @sturlamolden, but with the longer function name.

@argriffing
Copy link
Contributor Author

As I was adding that PR, I noticed small inefficiencies in the scikit-bio procrustes code which I thought I'd write down while I still remember. No extra dimension is needed for reflection. If you remove the extra-dimension-related code you will find that the 2d example in the docstring, which requires reflection, will still work. Also, code like np.sqrt(np.trace(np.dot(result, np.transpose(result)))) can be replaced by norm(result).

@ElDeveloper
Copy link
Contributor

@sturlamolden, @argriffing should I wait for #3809 to be merged and then work on porting over the procrustes code with the improvements that @argriffing has pointed out thus far? Otherwise what would be the best way to do this?

@argriffing
Copy link
Contributor Author

I tried setting up scikit-bio for development but its tests keep failing and I'm not patient enough to finish fixing it, so I probably won't make a scikit-bio PR for those improvements soon. Currently I think I have an incompatible version of "mpl_toolkits."

@gregcaporaso
Copy link
Contributor

@argriffing, what issues were you having with scikit-bio install? pip install works for us on linux and mac (though we don't currently support windows, so we expect issues there).

@argriffing
Copy link
Contributor Author

what issues were you having with scikit-bio install?

When I run nosetests --with-doctest skbio from my home directory as suggested on http://scikit-bio.org/ I get the error

[...]/site-packages/mpl_toolkits/mplot3d/axes3d.py", line 1040, in cla
        self.zaxis._set_scale('linear')
    AttributeError: 'ZAxis' object has no attribute '_set_scale'

I assume this is because I screwed up the installation or in trying to set up a development environment or because I have an obsolete version of some package. This is on Ubuntu with Python 2.7. If I find myself motivated to try to fix it then if I have any questions I'll be sure to ask on a scikit-bio forum!

@argriffing
Copy link
Contributor Author

Two skbio PRs are submitted!

@ElDeveloper
Copy link
Contributor

Thanks!

On (Jul-22-14|18:15), argriffing wrote:

Two skbio PRs are submitted!


Reply to this email directly or view it on GitHub:
#3786 (comment)

@argriffing
Copy link
Contributor Author

Just to give an update for this issue: the two skbio PRs are now merged, I think the scipy linalg procrustes PR should be in a reviewable state, and the hypothetical scipy spatial procrustes PR has not yet been created.

@ElDeveloper
Copy link
Contributor

@argriffing, I just added one minor documentation comment to #3809! Thanks, I'll try to get a PR ready in the next few days.

@ev-br
Copy link
Member

ev-br commented Aug 18, 2014

Linalg procrustes PR merged in e346aa5, can this be closed?

@argriffing
Copy link
Contributor Author

@ev-br this isn't quite finished, because the merged orthogonal_procrustes just implements the core linear algebra common to all procrustes-related applications. It is not really useful by itself. I think there is interest in adding a spatial data analysis layer on top of it, so that if you pass two matched sets of points to a function, then the function will return enough information to let the caller easily transform a new point from the first space towards the second space, subject to some constraint. For example you could allow only transformations consisting of compositions of translation, rotation, reflection, and scaling, but not shear. On the other hand if it's bad form to have multiple github PRs associated with a single github issue then a new issue could be opened and this one could be closed.

@ev-br
Copy link
Member

ev-br commented Aug 18, 2014

Ok, let's keep this open then.

ElDeveloper added a commit to ElDeveloper/scipy that referenced this issue Feb 20, 2015
Add scipy.spatial.procrustes which includes a main function,
`procrustes`, that translates, rotates and scales two sets of points to
optimally superimpose them.

This code is being ported from scikit-bio in response to scipy#3786:
https://github.com/biocore/scikit-bio

Fixes scipy#3786
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement scipy.linalg
Projects
None yet
Development

No branches or pull requests

8 participants