
Decorrelation #137

Merged: 19 commits merged into master from decorrelation on Jul 2, 2018

Conversation

acliu
Contributor

@acliu acliu commented Jun 23, 2018

This PR restores decorrelation capability. This includes the M = H^{-1} option to give delta function windows, as well as the M \sim V^{-1/2} option that decorrelates the error covariances.

Note that while this PR allows power spectra to be computed using these options for the M matrix, I did not include extensions to uvpspec to store the bandpower covariances. This is because @anguta has a PR open that contains some of this capability, and I don't want to reinvent the wheel. So once that other PR is done, we can recycle some of that stuff.
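For readers unfamiliar with the two options, here is a minimal numpy sketch of the idea (the function `get_MW_sketch` and its interface are hypothetical stand-ins, not the actual hera_pspec API): with unnormalized bandpowers q, the normalized estimate is p = M q and the window functions are W = M H, so M = H^{-1} gives delta-function windows while M \sim V^{-1/2} whitens the bandpower covariance.

```python
import numpy as np

def get_MW_sketch(H, mode="I", band_covar=None):
    """Hypothetical sketch (not the hera_pspec API) of the normalization
    choices in this PR.  p = M q, and W = M H are the window functions."""
    if mode == "I":
        # Identity-like choice: scale each row of H so its window sums to one
        M = np.diag(1.0 / H.sum(axis=1))
    elif mode == "H^-1":
        # W = H^{-1} H = I, i.e. delta-function window functions
        M = np.linalg.inv(H)
    elif mode == "V^-1/2":
        # Decorrelate the errors: if Cov(q) = V, then Cov(Mq) = M V M^T = I
        eigvals, eigvects = np.linalg.eigh(band_covar)
        M = eigvects @ np.diag(eigvals ** -0.5) @ eigvects.T
    else:
        raise ValueError("unrecognized mode {}".format(mode))
    return M, M @ H
```

The real code of course carries more bookkeeping (spectral windows, weighting, conventions); this is only the linear algebra skeleton.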

@ghost ghost assigned acliu Jun 23, 2018
@ghost ghost added the in progress label Jun 23, 2018
@acliu acliu requested review from philbull and nkern June 23, 2018 03:28
@philbull
Collaborator

@acliu This branch is failing to build. It looks like the cause is a time-out, but there are also some 'divide by zero' warnings showing up in the log that we never had before.

@ghost ghost assigned philbull Jun 27, 2018
@philbull
Collaborator

OK, I tracked down what was causing the slowdown. scalar_delay_adjustment() was calling get_G and get_H every time it was called, instead of using the copies already computed earlier in pspec(). I've changed it to take the precomputed copies by default now, which speeds things up a lot.
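The fix follows a common pattern: make the expensive products optional arguments and only fall back to recomputing them when the caller doesn't pass them in. A toy sketch (class and attribute names here are illustrative stand-ins, not the actual hera_pspec code):

```python
import numpy as np

class Estimator:
    """Toy stand-in demonstrating the precomputed-argument pattern."""
    def __init__(self, n):
        self.n = n
        self.calls = 0  # count of expensive matrix builds

    def get_G(self):
        self.calls += 1  # expensive in the real code
        return np.eye(self.n)

    def get_H(self):
        self.calls += 1
        return 0.5 * np.eye(self.n)

    def scalar_delay_adjustment(self, G=None, H=None):
        # Only recompute when the caller didn't supply precomputed copies
        if G is None:
            G = self.get_G()
        if H is None:
            H = self.get_H()
        return np.trace(G) / np.trace(H)
```

Called with the copies already built in `pspec()`, the method does no redundant work; called bare, it still behaves as before.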

@acliu
Contributor Author

acliu commented Jun 28, 2018

Thanks @philbull for looking into this! This is very interesting. I guess it's giving us a quick glimpse into what it'll be like for a fully time-dependent set of matrices.

I think the divide by zero errors are ok. I will check this, but if I recall from running the tests on my computer, those come from making sure that singular matrices are properly dealt with. It doesn't actually cause any of the tests to fail.

On my computer I also got a failed test for pspec_run, and I didn't know if that was my fault or not. Has pspec_run been doing fine on all other branches? I didn't touch it at all, but maybe there was some indirect change due to the other new methods.

@philbull
Collaborator

Yes, I think it would be useful to do a bit of optimisation work to pave the way for time-dependent matrices. We haven't really tried to make things faster yet, but I feel like there could be some relatively simple things we could do to help. Once we hit a limit, though, we might need to offload some of the calculations to C code...

It'd be good to get rid of the div/0 errors if we can, as it will worry users (and potentially mask other problems).

I think the pspec_run issues are caused by problems elsewhere. Lots of tests are broken right now due to changes in pyuvdata.

@coveralls

coveralls commented Jul 1, 2018

Coverage Status

Coverage increased (+0.03%) to 96.997% when pulling c5d7ff3 on decorrelation into 472be38 on master.

Collaborator

@philbull philbull left a comment

A few minor comments about default options and function names, plus the usual few comments about docstrings.

subsequent indices specify the baseline index, in _key2inds format.

model : string, optional
Type of covariance model to calculate, if not cached. options=['empirical']
Collaborator

Do we want to set empirical as a default? We already know that it's problematic in various ways, so perhaps it isn't very safe as a default?

Contributor Author

We don't have anything better currently, so we'll leave this for now.

Returns
-------
cross_covar : array-like
Cross covariance model for the specified key.
Collaborator

Does this have dimensions Nfreq x Nfreq?

Contributor Author

Yep! Added to the docstring.

if model == 'empirical':
    covar = utils.cov(self.x(key1), self.w(key1),
                      self.x(key2), self.w(key2),
                      conj_1=conj_1, conj_2=conj_2)
Collaborator

This needs an else statement to raise an error if model != empirical

Contributor Author

Actually, this is already done a few lines above: assert model in ['empirical'], "didn't recognize model {}".format(model)

@@ -894,30 +937,120 @@ def get_H(self, key1, key2, sampling=False):

return H / 2.

def get_V_gaussian(self, key1, key2):
def get_unnormed_E(self, key1, key2):
Collaborator

I feel like we have too many matrix variable names floating around now. Would it be OK to call this get_unnormed_RQR instead? It's a bit easier to keep track of.

Contributor Author

I'd like to keep this name, because in principle E is more general than RQR, so in the future we might not have the two equal.

R1 = self.R(key1)
R2 = self.R(key2)
for dly_idx in range(self.spw_Ndlys):
    E_matrices[dly_idx] = np.einsum('ij,jk,kl', R1, self.get_Q_alt(dly_idx), R2)
Collaborator

This einsum might be slower than just doing 2 dot products. IIRC, R has shape (Nfreq, Nfreq)? So we shouldn't have any issues fitting this into memory. (It's only if we have a dimension of size Nblpairts that we consume lots of memory.)

(N.B. When I was looking into einsum, it turns out that operations including 2 matrices have lots of optimisations, but operations with 3+ matrices, like this one, are probably brute-forced, and so are slower.)

Contributor Author

Ok, converted to two applications of np.dot
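The equivalence is easy to check directly. As the review notes, a three-operand `np.einsum` without `optimize=True` may be contracted naively, while the pairwise `np.dot` form is explicitly two matrix products (the matrix size here is just a stand-in for Nfreq):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 64  # stand-in for Nfreq
R1, Q, R2 = (rng.normal(size=(n, n)) for _ in range(3))

# Three-operand einsum: without optimize=True the contraction order
# may be chosen naively, which is slower than pairwise products
E_einsum = np.einsum('ij,jk,kl', R1, Q, R2)

# Equivalent pairwise contraction with two dot products
E_dot = np.dot(np.dot(R1, Q), R2)

assert np.allclose(E_einsum, E_dot)
```

Passing `optimize=True` to `np.einsum` would also let it pick a good contraction path, but the two-`np.dot` form makes the pairwise order explicit.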

Parameters
----------
key1, key2 : tuples or lists of tuples
Tuples containing indices of dataset and baselines for the two
input datavectors. If a list of tuples is provided, the baselines
in the list will be combined with inverse noise weights.

model : str
How the covariances of the input data should be estimated.
Collaborator

Default: 'empirical'

Contributor Author

Added!

@@ -955,6 +1088,11 @@ def get_MW(self, G, H, mode='I'):
Definition to use for M. Must be one of the options listed above.
Default: 'I'.

band_covar : array_like, optional
Covariance matrix of the unnormalized bandpowers (i.e., q). Used only
if requesting the V^-1/2 normalization.
Collaborator

Can you note which function to use to compute the covariance matrix for this please?

Contributor Author

Done!

raise ValueError("Covariance not supplied for V^-1/2 normalization")
# First find the eigenvectors and eigenvalues of the unnormalized covariance
# Then use them to compute V^-1/2
eigvals, eigvects = np.linalg.eigh(band_covar)
Collaborator

Do the eigenvectors/vals change each time this function is called, or should we cache them somewhere?

Contributor Author

The philosophy here is the same as for why V is provided as an argument to this function: the idea is to give the user the flexibility to supply different matrices, so there isn't an obvious single right way to compute them. Caching them seems counter to this.
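For reference, the step the quoted snippet is building up to can be written as a small standalone helper (the name `inv_sqrt_psd` is hypothetical, not part of the codebase): since V is symmetric, V = U diag(s) U^T and V^{-1/2} = U diag(s^{-1/2}) U^T.

```python
import numpy as np

def inv_sqrt_psd(band_covar):
    """Sketch (hypothetical helper): V^{-1/2} for a symmetric positive-definite
    covariance, via V = U diag(s) U^T  =>  V^{-1/2} = U diag(s^{-1/2}) U^T."""
    eigvals, eigvects = np.linalg.eigh(band_covar)
    if np.any(eigvals <= 0.):
        raise ValueError("covariance must be positive definite")
    return eigvects @ np.diag(eigvals ** -0.5) @ eigvects.T
```

By construction the result M satisfies M V M = I, which is exactly the decorrelation property the V^-1/2 normalization relies on.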

@@ -1233,7 +1393,7 @@ def delays(self):
return utils.get_delays(self.freqs[self.spw_range[0]:self.spw_range[1]],
                        n_dlys=self.spw_Ndlys) * 1e9  # convert to ns

def scalar(self, pol, little_h=True, num_steps=2000, beam=None):
def scalar(self, pol, little_h=True, num_steps=2000, beam=None, taper_override='no_override'):
Collaborator

taper_override = None would be a more idiomatic default.

Contributor Author

As discussed, we'll keep this in service of being more explicit.

@@ -1710,13 +1896,22 @@ def pspec(self, bls1, bls2, dsets, pols, n_dlys=None, input_data_weight='identit

# Normalize power spectrum estimate
if verbose: print(" Normalizing power spectrum...")
Mv, Wv = self.get_MW(Gv, Hv, mode=norm)
if norm == 'V^-1/2':
    V_mat = self.get_unnormed_V(key1, key2)
Collaborator

This always uses the default model argument, which is model='empirical'. So, the user can't change that option. Is this intentional?

Contributor Author

For now, I think it's probably a good idea to keep it like this. This way it's symmetric w.r.t. the iC weighting of the data, which does not allow the user to set a non-empirical estimate. If we want to extend our covariance estimation capabilities in the future, we can address both of these examples of hard coding then.

@philbull philbull added this to the Ready for IDR 2.1 analysis milestone Jul 1, 2018
Collaborator

@philbull philbull left a comment

Looks good, ready to merge!

@acliu acliu merged commit f5a8a7d into master Jul 2, 2018
@ghost ghost removed the in progress label Jul 2, 2018
@acliu acliu deleted the decorrelation branch July 2, 2018 23:01
@nkern nkern mentioned this pull request Jul 5, 2018