Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SO(N) rotation matrix generator #5622

Merged
merged 1 commit into from
Jan 19, 2016
Merged

Conversation

philipdeboer
Copy link
Contributor

I am looking for feedback on this code. It generates random rotations, ie elements of SO(N). I plan to add O(N) as well, but I'd like to be sure of the format first as the two algorithms are very similar.

I have only implemented rvs. The N is specified by dim. You can ask for multiple samples using size, and I return those in a list. Is that acceptable? Mathematically the natural way to embed n results would be as diagonal matrices in an n*N space. Would that be preferable to a list? One argument against it is that it is easier to construct the block-diagonal matrix than to decompose it.

Is it important to have a frozen version as well?

I have not submitted any tests. What are some good ones? A good test that this is generating the samples from the Haar distribution is to
a. generate n samples from SO(N)
b. compute the eigenvalues (N*n of them) and describe their distribution
c. then rotate all n samples by another element of SO(N)
d. update the eigenvalue distribution
e. show that the two eigenvalue distributions are similar
You could repeat step c, but I would find a single sample convincing.

Would we fix n and N? N = 5, n = 1000? Are there timing benchmarks to keep beneath?

Additional tests:
2. Show determinant == 1 for a single sample in each N in range(10)
3. Show x x.T == eye(N) for each sample in test 2

"""
size = int(size)
if size > 1:
return np.array([ self.rvs(dim, size=1, random_state=random_state) for i in range(size) ])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use 4-space indents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I noticed that after submitting it. Easily enough fixed.

On 23 December 2015 at 18:35, Robert Kern notifications@github.com wrote:

In scipy/stats/_multivariate.py
#5622 (comment):

  •        Random size N-dimensional matrices, dimension (size, dim, dim)
    
  •    Notes
    

  •    Return a random rotation matrix, drawn from the Haar distribution
    
  •    (the only uniform distribution on SO(n)).
    
  •    The algorithm is described in the paper
    
  •    Stewart, G.W., "The efficient generation of random orthogonal
    
  •    matrices with an application to condition estimators", SIAM Journal
    
  •    on Numerical Analysis, 17(3), pp. 403-409, 1980.
    
  •    For more information see
    
  •    http://en.wikipedia.org/wiki/Orthogonal_matrix#Randomization
    
  •    """
    
  •    size = int(size)
    
  •    if size > 1:
    
  •      return np.array([ self.rvs(dim, size=1, random_state=random_state) for i in range(size) ])
    

We use 4-space indents.


Reply to this email directly or view it on GitHub
https://github.com/scipy/scipy/pull/5622/files#r48387247.

@argriffing
Copy link
Contributor

linking to the related mailing list thread: https://mail.scipy.org/pipermail/scipy-dev/2015-October/021021.html

This is not such a constructive thought, but I think it would be cool if scipy bundled something like MatrixDepot as its source of test matrices. MatrixDepot has several test matrix tags, one of which is 'random' which looks like it already includes random correlation matrices:
"random" => ["rosser", "rando", "randcorr", "randsvd", "rohess", "wathen", "oscillate", "golub"],

@@ -2642,3 +2644,104 @@ def entropy(self):
method_frozen.__doc__ = doccer.docformat(
method.__doc__, wishart_docdict_noparams)
method.__doc__ = doccer.docformat(method.__doc__, wishart_docdict_params)

class rotation_gen(multi_rv_generic):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rotation is a much too general name, can you propose something more specific?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about special_ortho? And ortho for O(N).
Sent from my BlackBerry device on the Rogers Wireless Network

-----Original Message-----
From: Ralf Gommers notifications@github.com
Date: Wed, 23 Dec 2015 23:21:15
To: scipy/scipyscipy@noreply.github.com
Reply-To: scipy/scipy reply@reply.github.com
Cc: philipdeboerphilip.deboer@gmail.com
Subject: Re: [scipy] Add SO(N) rotation matrix generator (#5622)

@@ -2642,3 +2644,104 @@ def entropy(self):
method_frozen.doc = doccer.docformat(
method.doc, wishart_docdict_noparams)
method.doc = doccer.docformat(method.doc, wishart_docdict_params)
+
+class rotation_gen(multi_rv_generic):

rotation is a much too general name, can you propose something more specific?


Reply to this email directly or view it on GitHub:
https://github.com/scipy/scipy/pull/5622/files#r48399833

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

special_ortho sounds fine to me. ortho is fairly generic as well, but I don't have a better suggestion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bikeshed alert: special_ortho_group and ortho_group? Or if brevity is at all a virtue: so_n_group and o_n_group?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I like those, especially the long versions

@rgommers rgommers added scipy.stats enhancement A new feature or improvement labels Dec 24, 2015
@rgommers
Copy link
Member

Would we fix n and N? N = 5, n = 1000? Are there timing benchmarks to keep beneath?

Aim for decent test coverage with tests that in well under one second. If really needed you can also add a slower test (say a few seconds), that is marked with @dec.slow (from numpy.testing). This will then only run for stats.test('full') and not for stats.test().

@rgommers
Copy link
Member

Your proposed tests make sense - they demonstrate the functionality. In addition I would suggest to test against one hardcoded value, so the reproducibility of the random numbers for a given seed is checked.

@philipdeboer philipdeboer force-pushed the rancorr_pr branch 2 times, most recently from 5fdd446 to 2052654 Compare January 6, 2016 12:54
dim : scalar
Dimension of matrices

TODO: frozen object not implemented yet.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover?
BTW, I'm not sure if a frozen version is that useful. If you've implemented it already, fine.

@ev-br
Copy link
Member

ev-br commented Jan 6, 2016

As an idea for an additional test. The plane rotations, SO(2), should be essentially univariate. So one can do a KS test for the rotation angle (cf https://github.com/scipy/scipy/blob/master/scipy/stats/tests/test_continuous_basic.py#L352 ).

Re remaining test failures: pep8 bot is annoying but easy to appease. Refguide-check failure asks you to add an entry to the list at https://github.com/scipy/scipy/blob/master/scipy/stats/__init__.py#L116 so that this is included into the html and pdf docs. Finally, there are syntax errors on python 2.6. We are likely to drop support for python 2.6 in a foreseeable future. Still, it'd be nice to fix the syntax if that's not too taxing.

@philipdeboer
Copy link
Contributor Author

If all tests pass, I will move on to ortho_group. Should that be a separate PR, or shall I tack it on here?

@ev-br
Copy link
Member

ev-br commented Jan 8, 2016

A separate one would be better IMO.
Re tests: do you know that you can run the tests locally via $ python runtests.py -s stats and $ python runtestes.py -s stats --refguide-check (the latter checks the documentation for common reST errors and runs modified doctests.)

When all tests pass, it'd be good to squash the commits and reformat the commit message to start with ENH prefix. If you're not sure how to git rebase -i, we'll do it when merging.

@codecov-io
Copy link

@@            master   #5622   diff @@
======================================
  Files          234     234       
  Stmts        43140   43176    +36
  Branches      8156    8159     +3
  Methods          0       0       
======================================
+ Hit          33456   33496    +40
+ Partial       2604    2602     -2
+ Missed        7080    7078     -2

Review entire Coverage Diff as of 01bd996

Powered by Codecov. Updated on successful CI builds.

@philipdeboer
Copy link
Contributor Author

I have been using the in-place build and just running
python tests/test_multivariate.py

Is there a way to run these other tests like that?

@ev-br
Copy link
Member

ev-br commented Jan 8, 2016

python runtests.py -t <path-to-pyfile> or python runtests.py <path-to-pyfile>:test_function or :TestClass.test_method.

Which essentially does an in-place build and runs tests in the specified files.

@philipdeboer
Copy link
Contributor Author

I had done
python setup.py build_ext -i
a while ago, and since then I have been sitting in the stats directory just running the multivariate test suite.

I will try the full in-place build, but it will take a few hours I think. That turns into days since I only run my laptop for 40min twice a day on the train.

@philipdeboer
Copy link
Contributor Author

I stand corrected; it is done already.

@argriffing
Copy link
Contributor

I use runtests.py with ccache, and it's not too slow.

@philipdeboer
Copy link
Contributor Author

  1. Should the O(N) branch be based on scipy/master? That will cause a merge conflict with this PR (the ordering of SO(N) and O(N) code will need to be determined at merge time). I am assuming this is the usual situation, but can easily base that PR on this one if there is a good way to do so.
  2. I see there is now a conflict on this PR. Should I rebase on the latest master? Do I need to keep it up to date?

"""

if dim is None or not np.isscalar(dim) or dim <= 1:
raise ValueError("Dimension of rotation must be specified, and must be a scalar greater than 1.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please split this line.

@ev-br
Copy link
Member

ev-br commented Jan 18, 2016

Yes, it'd be good to rebase on master (the conflict is most likely just in THANKS.txt). While you're at it, would you please split a those too long lines so that they are under 80 characters. This could also use a line in the release notes for scipy 0.18.

If it were my PR, I'd branch the O(N) PR off this one with an intent to rebase the second PR on master once the first one is merged. This way, the rebase is likely simpler. But it's your PR, so it's up to you :-).

@ev-br
Copy link
Member

ev-br commented Jan 19, 2016

I'm pretty sure the rvs algorithm can be vectorized to avoid a python loop. OTOH, this can be attempted separately if needed, and there clearly is value in just reusing the tried-and-tested code. I could quibble about style issues like whitespace some more, but it's fine as is, so I'll go ahead and merge it. Thanks @philipdeboer, all.

ev-br added a commit that referenced this pull request Jan 19, 2016
Add SO(N) rotation matrix generator
@ev-br ev-br merged commit 1129971 into scipy:master Jan 19, 2016
@ev-br ev-br added this to the 0.18.0 milestone Jan 19, 2016
@ev-br
Copy link
Member

ev-br commented Jan 19, 2016

I think it would be cool if scipy bundled something like MatrixDepot as its source of test matrices. MatrixDepot has several test matrix tags, one of which is 'random' which looks like it already includes random correlation matrices:
"random" => ["rosser", "rando", "randcorr", "randsvd", "rohess", "wathen", "oscillate", "golub"],

Agreed it'd be useful. Sounds like a cool project (GSoC even maybe?).
Probably better to move the comment from a PR comment to somewhere where it won't get buried.

@akhmerov
Copy link
Contributor

Hi @philipdeboer, nice work—I just noticed your work in the release notes.

A quick question: QR decomposition uses Householder reflections, and I believe you could use QR out of the box, did you try?

Check out this implementation I've played around with some time ago. cf also tests.

@philipdeboer
Copy link
Contributor Author

I did not try it myself, but Mezzadri mentions it in his paper. The problem
with the QR decomposition is that it is not uniquely defined. To implement
the distribution that we want, we need to impose additional numerical
stability conditions that force a particular choice of the QR
decomposition. So it is best to just implement it explicitly.

It would be fun to reproduce his results and see that this implementation
is good from this perspective.

On 26 July 2016 at 13:50, Anton Akhmerov notifications@github.com wrote:

Hi @philipdeboer https://github.com/philipdeboer, nice work—I just
noticed your work in the release notes.

A quick question: QR decomposition uses Householder reflections, and I
believe you could use QR out of the box, did you try?

Check out this
https://gitlab.kwant-project.org/kwant/kwant/blob/master/kwant/rmt.py#L155
implementation I've played around with some time ago. cf also tests
https://gitlab.kwant-project.org/kwant/kwant/blob/master/kwant/tests/test_rmt.py#L50
.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#5622 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AJ_rHpKKAau-CCPppUljpyWOrLFrQAmHks5qZkjLgaJpZM4G61Kf
.

@akhmerov
Copy link
Contributor

Sure it's not uniquely defined, but you can deal with that by post-processing the result (essentially dividing out by some phases).

@philipdeboer
Copy link
Contributor Author

Generally when a numerical algorithm loses resolution, it cannot be
restored by post-processing. I suspect that this will be true here too. But
if you can show that this approach would work, I would definitely be
interested in understanding it.

On 27 July 2016 at 02:27, Anton Akhmerov notifications@github.com wrote:

Sure it's not uniquely defined, but you can deal with that by
post-processing the result (essentially dividing out by some phases).


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#5622 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AJ_rHmJjNrrJF43ev9Xi6BS7BXraQoLLks5qZvpogaJpZM4G61Kf
.

@philipdeboer
Copy link
Contributor Author

Actually Mezzadri does exactly this for U(N). I will have to take another
look at it.

from scipy import *
def haar_measure(n):
’’’’’’A Random matrix distributed with Haar measure’’’’’’
z = (randn(n,n) + 1j*randn(n,n))/sqrt(2.0)
q,r = linalg.qr(z)
d = diagonal(r)
ph = d/absolute(d)
q = multiply(q,ph,q)
return q

On 27 July 2016 at 02:27, Anton Akhmerov notifications@github.com wrote:

Sure it's not uniquely defined, but you can deal with that by
post-processing the result (essentially dividing out by some phases).


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#5622 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AJ_rHmJjNrrJF43ev9Xi6BS7BXraQoLLks5qZvpogaJpZM4G61Kf
.

@akhmerov
Copy link
Contributor

Feel free to adapt my version if you like it (it also has other symmetric spaces).

@philipdeboer
Copy link
Contributor Author

philipdeboer commented Jan 19, 2017 via email

@akhmerov
Copy link
Contributor

One could imagine generating a random triangular matrix and symmetrizing it well, but since the whole algorithm scales as N^3 regardless of implementation details, I always considered that part to be unimportant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement scipy.stats
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants