Improvements to gwdetchar-lasso-correlation and gwpy-0.12.0 compatibility #134

Merged
merged 6 commits into from
Sep 27, 2018


duncanmmacleod
Member

This PR makes some general improvements to gwdetchar-lasso-correlation, prompted by updating the plotting interface to be compatible with gwpy-0.12.0.

The changes are almost backwards compatible: they very nearly reproduce the old results, but differ subtly because of a change in the channel ordering (an OrderedDict is now used instead of a dict). The results match exactly if the upstream code is patched to use OrderedDict (with no further changes).

This is the companion PR to #118, so should be merged immediately after that one.

cc @macedo22

@coveralls

coveralls commented Sep 18, 2018

Pull Request Test Coverage Report for Build 306

  • 0 of 42 (0.0%) changed or added relevant lines in 1 file are covered.
  • 8 unchanged lines in 4 files lost coverage.
  • Overall coverage decreased (-1.5%) to 38.909%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| gwdetchar/lasso.py | 0 | 42 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| gwdetchar/tests/test_cds.py | 2 | 100.0% |
| gwdetchar/tests/test_const.py | 2 | 100.0% |
| gwdetchar/tests/test_daq.py | 2 | 100.0% |
| gwdetchar/tests/test_cli.py | 2 | 81.13% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 305 | -1.5% |
| Covered Lines | 428 |
| Relevant Lines | 1100 |

💛 - Coveralls

@duncanmmacleod
Member Author

The reduction in coverage is purely down to a new module being added that doesn't have unit tests (yet). We (me and the CSUF team) should also work to migrate more code out of gwdetchar-lasso-correlation into gwdetchar.lasso to increase modularity.

Contributor

@macedo22 macedo22 left a comment

@duncanmmacleod I'm a big fan of all of this abstraction into gwdetchar.lasso. From a gwdetchar-lasso-correlation perspective, this all looks great to me. From the gwpy side of things, I'm not as informed. Can you explain in more detail what you are referring to when you say that the results are almost the same, since I'm unable to generate a sample page? And what upstream source are you referring to with the OrderedDicts?

@duncanmmacleod
Member Author

@macedo22, sorry for the delay in replying. The change to the results has nothing to do with gwpy; it is due to a subtle change in the ordering of the channels.

The posted example was generated using the following command (run from any machine on the LIGO-Livingston LDAS system):

gwdetchar-lasso-correlation 1172363755 1172375019 --ifo L1 --nproc 4 -o 1172363755-1172375019-orig -f /home/alexandra.macedo/public_html/detchar/LASSO/test/O2/L1/20170301/1172363755-1172375019/L1-CHANNELS-1172363755-11264.txt -P SenseMonitor_hoft_L1_M -O 2.5

The results from the unmodified code are here.

If you apply only the following patch:

diff --git a/bin/gwdetchar-lasso-correlation b/bin/gwdetchar-lasso-correlation
index 679999f..207f484 100755
--- a/bin/gwdetchar-lasso-correlation
+++ b/bin/gwdetchar-lasso-correlation
@@ -27,6 +27,7 @@ from subprocess import call
 import tempfile
 import atexit
 import shutil
+from collections import OrderedDict

 from math import (isnan, isinf, log, log10)
 import numpy
@@ -264,8 +265,8 @@ auxdata = TimeSeriesDict.get(
     observatory=args.ifo[0], pad=0, **io_kw)

 # -- removes flat data to be re-introdused later
-flatdata = dict()
-gooddata = dict()
+flatdata = OrderedDict()
+gooddata = OrderedDict()

 for k, ts in auxdata.items():
     flat = ts.value.min() == ts.value.max()

the results are slightly different; see here. The use of OrderedDict preserves the relative order of the auxiliary channel data when they are squeezed into an ndarray for use in the Lasso model; I presume that this subtle change in the input data is responsible for the change in the fit.

The results from after applying all of the changes from this PR are here, and match the modified results above.
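As a rough illustration of the ordering point above, here is a minimal sketch of how an insertion-ordered mapping fixes which column each channel occupies once the data are stacked into a matrix. It is not the gwdetchar code: plain Python lists stand in for the TimeSeries objects and the ndarray, and the channel names, sample values, and the helpers `split_flat` and `to_matrix` are all hypothetical.

```python
from collections import OrderedDict


def split_flat(auxdata):
    """Separate constant-valued channels from varying ones, keeping input order."""
    flatdata = OrderedDict()
    gooddata = OrderedDict()
    for name, values in auxdata.items():
        # same flatness test as in the patch: min == max means no variation
        if min(values) == max(values):
            flatdata[name] = values
        else:
            gooddata[name] = values
    return flatdata, gooddata


def to_matrix(data):
    """Return (names, rows) where column i of every row belongs to names[i]."""
    names = list(data)
    rows = [list(row) for row in zip(*data.values())]
    return names, rows


# hypothetical channel names and samples, for illustration only
auxdata = OrderedDict([
    ("L1:AUX-B", [0.1, 0.4, 0.2]),
    ("L1:AUX-FLAT", [5.0, 5.0, 5.0]),
    ("L1:AUX-A", [1.0, 3.0, 2.0]),
])

flatdata, gooddata = split_flat(auxdata)
names, rows = to_matrix(gooddata)
print(names)  # column order follows insertion order, not alphabetical order
print(rows)
```

With an OrderedDict the column-to-channel mapping is the same on every run and every Python version. With a plain dict that is only guaranteed from Python 3.7 onwards, so on older interpreters the columns handed to the model could appear in a different order.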

Contributor

@macedo22 macedo22 left a comment

@duncanmmacleod That's so interesting, but I think you're right. The fit probably just changes because it iterates over the auxiliary channels in a different order when updating the model. This all seems good to me.
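To make that intuition concrete, here is a small sketch assuming a cyclic coordinate-descent Lasso solver, which updates one coefficient at a time in column order. It is a deliberately contrived case with two identical channels, not a diagnosis of the difference seen in this PR and not the solver used by gwdetchar-lasso-correlation; the functions `lasso_cd` and `fit` and the channel names are hypothetical.

```python
from collections import OrderedDict


def soft_threshold(value, alpha):
    """Shrink value towards zero by alpha (the Lasso proximal step)."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0


def lasso_cd(columns, target, alpha, sweeps=50):
    """Cyclic coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1."""
    n = len(target)
    weights = [0.0] * len(columns)
    residual = list(target)  # y - Xw, with w starting at zero
    for _ in range(sweeps):
        for j, col in enumerate(columns):  # columns are visited in input order
            norm = sum(c * c for c in col) / n
            if norm == 0.0:
                continue
            # correlation of column j with the residual, excluding its own term
            rho = sum(c * (r + c * weights[j]) for c, r in zip(col, residual)) / n
            new = soft_threshold(rho, alpha) / norm
            residual = [r - c * (new - weights[j]) for c, r in zip(col, residual)]
            weights[j] = new
    return weights


def fit(channels, target, alpha):
    """Fit the ordered channels and return a name -> coefficient mapping."""
    weights = lasso_cd(list(channels.values()), target, alpha)
    return dict(zip(channels, weights))


signal = [1.0, -1.0, 1.0, -1.0]
# two hypothetical channels carrying exactly the same data
a_first = OrderedDict([("L1:AUX-A", signal), ("L1:AUX-B", signal)])
b_first = OrderedDict([("L1:AUX-B", signal), ("L1:AUX-A", signal)])

fit_a_first = fit(a_first, signal, alpha=0.5)
fit_b_first = fit(b_first, signal, alpha=0.5)
print(fit_a_first)
print(fit_b_first)
```

Because the two columns are identical, the Lasso solution is not unique, and the solver gives all of the weight to whichever channel it visits first. Real auxiliary channels are only partially correlated, so any effect of ordering would be far subtler, in line with the small differences described above.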

@duncanmmacleod duncanmmacleod self-assigned this Sep 27, 2018
@duncanmmacleod duncanmmacleod merged commit 0a98c07 into gwdetchar:master Sep 27, 2018
@duncanmmacleod duncanmmacleod deleted the lasso-improvements branch September 27, 2018 09:41