Add regression tests replicating current phat_small example run #140

Merged
karllark merged 26 commits from karllark:ci_regression into BEAST-Fitting:master on Jan 18, 2018

Conversation

karllark
Member

@karllark karllark commented Nov 23, 2017

This starts getting regression tests into the Travis CI runs. Addresses #97.
The full phat_small model is run in stages and the output of each stage is compared to cached versions of the files.
A number of changes were needed to get the tests to run in the automated system, where the code is run after being installed (not just cloned). Hence this work has also advanced the effort to get the BEAST installable and working.
Changes include:

  • adding the ability to pass the location of various library files directly, as the standard location in the source code directory does not work for the installed BEAST
  • removed the keep column: addresses Use of SED grid 'keep' column? #73
  • removed the C code from fit_metrics, as something was keeping the test code from importing anything from that directory. This is likely related to something in the install process. I could not fix it, so I simplified the fit_metrics directory to only the needed python code. All the old fit_metrics code is still around; it just moved to a subdirectory of the beast/old directory: addresses Remove C code in fit_metrics? #153.
  • updated the travis setup
  • discovered via the test coverage report that eztables is not used in the current phat_small BEAST runs. Will investigate removing this code/package in a separate pull request.

In the process, I found that there are small differences between the Linux and Mac versions of the BEAST computations. It is not clear why. I spent time trying to debug before, during, and after the Nov 2017 BEAST HackDay; it looks like it may be due to a small difference in the numerical computation between machine types.

@coveralls

Coverage Status

Coverage increased (+6.6%) to 18.739% when pulling 51994e5 on karllark:ci_regression into 03207d2 on BEAST-Fitting:master.

Member

@mfouesneau mfouesneau left a comment

It seems ok to me.

Imports in the tests should go back to relative imports.

I do not know Travis and astropy packaging well enough to tell if anything is wrong with those parts.

For the stellib interpolation, I have a faster and simplified version:
http://mfouesneau.github.io/docs/pystellibs/
We could think about adapting it for the BEAST. It also has more libraries and handles different interpolators (mostly the units need to be adapted, as they are not astropy units).

@@ -146,9 +146,9 @@
# MISTWeb() -- `rotation` param (choices: vvcrit0.0=default, vvcrit0.4)
#
# Default: PARSEC+CALIBRI
Member

it's COLIBRI (like the bird)

# Alternative: PARSEC1.2S -- old grid parameters
oiso = isochrone.PadovaWeb(modeltype='parsec12s', filterPMS=True)
#oiso = isochrone.PadovaWeb(modeltype='parsec12s', filterPMS=True)
Member

I don't think the default works properly with the BEAST. There are changes in the outputs that need some mapping (defining aliases) and some testing to work.

Member Author

I do not understand your comment. Line 151 does work in the BEAST, in that it produces reasonable output. Can you be more specific?

Member

parsec12s and the latest version parsec12sr16 do not have the same output format and column names. This may generate bugs downstream.

Member Author

I have not seen any bugs or issues between the two versions. I assume this is handled in the parsec.py code. @lcjohnso, any comments on this topic?

@@ -566,6 +570,9 @@ def add_spectral_properties(specgrid, filternames=None, filters=None,
naming format to adopt for filternames and filters
default value is '{0:s}_0' where the value will be the filter name

filterLib: str
full filename to the filter library hd5 file
Member

Is it a filename or filtername?

Member Author

This is the full filename of the filter library file. It is needed for the Travis testing because the library files are not in the source directory, since the testing is done on an installed version of the BEAST. This allows the testing code to pass the location of the filter library file directly instead of relying on the default location.

This raises the issue of how to handle the library files for the installed version of the BEAST (we should make this work for those just using the BEAST). I've opened issue #141 for this.
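The override pattern described here can be sketched generically. The helper below is hypothetical (not the actual BEAST API; the names and default paths are made up for illustration), showing how an explicitly passed library filename takes precedence over the source-tree default:

```python
import os

# Hypothetical helper (not the actual BEAST code) illustrating the pattern:
# an explicitly passed filter-library filename wins; otherwise fall back to
# the default location next to the source code, which does not exist for an
# installed BEAST.
def resolve_filter_library(filterLib=None,
                           default_dir="beast/physicsmodel/libs",
                           default_name="filters.hd5"):
    """Return the full filename of the filter library hd5 file."""
    if filterLib is not None:
        # e.g. the test suite passes the downloaded library directly
        return filterLib
    return os.path.join(default_dir, default_name)
```

The test suite would then call the grid-building code with an explicit `filterLib` path, while interactive users keep the default behavior.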

@@ -4,7 +4,7 @@
import numpy as np
import pytest

from . import extinction
from beast.physicsmodel.dust import extinction
Member

Why an absolute import? A relative import is better, so the code can be used without installing it into the system.

from .. import extinction

from astropy.tests.helper import remote_data
from astropy.table import Table

from beast.physicsmodel.stars import isochrone
Member

Same as before. Imports should be relative as much as possible.

@karllark
Member Author

The Mac versus Linux issue is that the no-dust photometry is slightly different between the Mac and Linux versions. The relevant error message from the Mac run is below (as well as the URL to the full output). As the file being tested against is from the Linux run, this shows that about 0.5% of the computed values differ between Mac and Linux. I can dig deeper to see if this is some kind of edge case; I'm asking here in case there is a "quick" fix or a known issue. We use the scipy.integrate.trapz function for the integration. I should run the comparison of the two files outside of the testing system to compare the full file, as the testing system stops when it finds the first issue. But I don't have access to a Mac, so I need to get someone else to run the testing code on their Mac. Probably something to do on Monday during the BEAST@ST hack day.

https://travis-ci.org/BEAST-Fitting/beast/jobs/306428490

    # go through the file and check that it is exactly the same
    for sname in hdf_cache.keys():
        if isinstance(hdf_cache[sname], h5py.Dataset):
            cvalue = hdf_cache[sname]
            cvalue_new = hdf_new[sname]
            if cvalue.dtype.isbuiltin:
                # simple dtype: compare the full dataset at once
                np.testing.assert_equal(cvalue.value, cvalue_new.value,
                                        'testing %s' % (sname))
            else:
                # compound dtype: compare field by field
                for ckey in cvalue.dtype.fields.keys():
                    np.testing.assert_equal(cvalue.value[ckey],
                                            cvalue_new.value[ckey],
                                            'testing %s/%s' % (sname, ckey))

E AssertionError:
E Arrays are not equal
E testing grid/logHST_ACS_WFC_F475W_nd
E (mismatch 0.4805491990846633%)
E x: array([-17.706708, -17.705989, -17.70227 , ..., -17.403936, -17.196232,
E -30.209619])
E y: array([-17.706708, -17.705989, -17.70227 , ..., -17.403936, -17.196232,
E -30.209619])

Issue seems to be that the radius is *slightly* (1e-14) different for
a small number of models (<10 in test)
@karllark
Member Author

Differences in the radius column may be due to differences in the results from np.power or np.sqrt. The differences are small (1e-14) and affect very few elements (< 10). There is no clear reason why; they are not clearly correlated with the logT or logL values. Putting the Mac test with remote-data in the allowed failures.

https://stackoverflow.com/questions/44765611/slightly-different-result-from-exp-function-on-mac-and-linux
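Given that the platform differences are at the 1e-14 relative level, one option (a sketch of an alternative, not what this PR does) is to compare with a small relative tolerance instead of exact equality:

```python
import numpy as np

# Values mimicking the cached Linux photometry and a Mac run that agrees
# only to ~1e-14 relative precision.
linux_vals = np.array([-17.706708, -17.705989, -17.70227, -17.403936])
mac_vals = linux_vals * (1.0 + 1e-14)

# Exact comparison (what np.testing.assert_equal does) reports a mismatch...
exactly_equal = bool(np.array_equal(linux_vals, mac_vals))

# ...while a relative tolerance well above machine epsilon but far below
# any scientifically meaningful difference treats the platforms as equal.
np.testing.assert_allclose(mac_vals, linux_vals, rtol=1e-12)
print(exactly_equal)  # False
```

The trade-off is that a tolerance can mask a real regression of the same magnitude, which is why understanding the source of the difference first is worthwhile.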

@coveralls

Coverage Status

Coverage increased (+6.6%) to 18.739% when pulling 2a67f7e on karllark:ci_regression into 03207d2 on BEAST-Fitting:master.

@karllark
Member Author

karllark commented Nov 28, 2017

More info on the differences. The differences are not the same for different columns, which makes me think it might be due to how the numbers are being saved in the hdf5 file. It is not clear how different columns could differ: for example, the radius differences should affect all the columns that depend on flux, but this does not seem to be the case.

[kgordon@grimy comptests]$ python test_two_spec.py


***** logHST_ACS_WFC_F475W_nd *****


min/max diff [%]: -2.12102363115e-14 2.00556172964e-14
values: [-17.9000348 -17.90809893 -19.45450982 -18.18661445 -18.64675207
-17.34072645 -18.63916794 -17.82567481 -21.21896864 -17.7143073
-17.24287029 -18.59103382 -18.69728022 -18.78303055 -19.13966618
-16.74999574 -17.75385644 -18.56444092 -17.59229913 -17.71455725]
diffs [%]: [ -1.98475239e-14 -1.98385864e-14 1.82616458e-14 1.95347721e-14
1.90527212e-14 -2.04876865e-14 -1.90604736e-14 -1.99303180e-14
-1.67431025e-14 2.00556173e-14 -2.06039576e-14 -1.91098231e-14
1.90012325e-14 -1.89144860e-14 -1.85620462e-14 -2.12102363e-14
-2.00109407e-14 1.91371973e-14 -2.01947094e-14 -2.00553343e-14]
logL: [ 2.552 2.767 0.829 1.892 1.36 3.248 1.677 2.981 -0.344 2.644
3.145 1.719 1.66 1.358 0.86 4.285 2.745 1.681 2.854 3.147]
logT: [ 4.1849 4.2822 3.696 3.7944 3.9684 3.6161 3.684 3.5817 3.5445
4.1496 3.6591 3.6822 3.6665 3.7593 3.8952 3.5548 3.6342 4.0934
3.6419 3.5732]


***** radius *****


min/max diff [%]: -5.89805831405e-14 1.83949685264e-14
values: [ 1.84757113 2.25944185 17.55905615 112.90187844 93.36866897
4.82837695 1.88235342 2.19887817 88.04428434]
diffs [%]: [ -4.80727591e-14 -3.93096384e-14 -2.02329422e-14 -2.51738145e-14
-1.52201535e-14 1.83949685e-14 -5.89805831e-14 -4.03923433e-14
-1.61405761e-14]
logL: [ 1.36 2.076 3.159 3.099 3.19 1.358 1.201 1.336 3.139]
logT: [ 3.9684 4.1037 3.9292 3.5101 3.5741 3.7593 3.9246 3.9246 3.5741]


***** logHST_ACS_WFC_F814W_nd *****


min/max diff [%]: -2.01701987515e-14 2.03811248134e-14
values: [-18.34342942 -17.6136771 -17.4313916 -18.49406467 -18.35498535
-20.31699145 -18.42723283 -19.3992071 -18.54534658 -21.66570469
-18.34487901]
diffs [%]: [ -1.93677725e-14 -2.01701988e-14 2.03811248e-14 1.92100208e-14
-1.93555789e-14 -1.74864162e-14 1.92796917e-14 1.83137056e-14
-1.91569010e-14 -1.63978681e-14 -1.93662421e-14]
logL: [ 3.142 4.172 4.434 2.606 2.012 0.073 2.705 1.299 2.588 -1.296
2.023]
logT: [ 4.2906 4.4183 4.4615 4.1512 3.6528 3.6171 4.1583 3.9783 4.1632
3.5858 3.7208]


***** logHST_WFC3_F336W_nd *****


min/max diff [%]: -2.09138723297e-14 2.0003631678e-14
values: [-17.7603434 -20.73491907 -16.98735472 -18.8233656 -19.24527613
-19.85497323 -19.13784628 -17.02266876 -18.19092006 -24.23210154
-19.26460376 -22.44189561 -18.53689665]
diffs [%]: [ 2.00036317e-14 1.71339645e-14 -2.09138723e-14 1.88739557e-14
-1.84601855e-14 1.78933189e-14 -1.85638113e-14 -2.08704859e-14
-1.95301484e-14 1.46611868e-14 1.84416649e-14 1.58307201e-14
-1.91656335e-14]
logL: [ 2.568 0.419 3.175 3.765 2.37 0.84 1.358 3.228 3.537 -2.72
1.702 -1.134 1.681]
logT: [ 4.2155 3.6588 4.0842 3.563 3.6259 3.7203 3.7593 3.9692 3.6233
3.4683 3.688 3.6283 4.0934]


***** logHST_WFC3_F275W_nd *****


min/max diff [%]: -2.14910554968e-14 1.94470571415e-14
values: [-16.531127 -17.58087307 -18.3233043 -24.33471439 -19.23846426
-17.13818814 -20.15395223 -25.15072835 -20.58416648 -18.26864421
-19.2465264 -20.32651345 -19.79023364 -20.97112735 -23.10661902
-19.63074666 -19.41112975 -25.09666515 -17.39911034 -19.17312964
-24.80866684 -19.4178416 -19.7039245 ]
diffs [%]: [ -2.14910555e-14 -2.02078342e-14 1.93890448e-14 -1.45993646e-14
-1.84667218e-14 -2.07298091e-14 1.76278759e-14 1.41256890e-14
-1.72594488e-14 1.94470571e-14 -1.84589863e-14 -1.74782246e-14
-1.79518531e-14 -1.69409761e-14 -1.53753073e-14 -1.80977002e-14
1.83024570e-14 -1.41561186e-14 -2.04189387e-14 1.85296493e-14
-1.43204538e-14 -1.82961307e-14 -1.80304877e-14]
logL: [ 3.714 2.528 2.163 -1.134 1.067 2.894 3.411 -2.098 0.096 3.033
3.247 0.267 2.12 2.685 -0.037 1.358 1.507 -1.888 3.398 1.668
-1.695 1.668 2.114]
logT: [ 4.3962 4.2033 3.878 3.4931 3.9137 4.1257 3.5986 3.4823 3.7805
3.725 3.6403 3.7961 3.6653 3.5681 3.6098 3.7593 3.7694 3.5203
3.8309 3.7866 3.5628 3.7239 3.6691]


***** logHST_WFC3_F110W_nd *****


min/max diff [%]: -2.07851307728e-14 2.03829590486e-14
values: [-31.73744785 -19.54151472 -19.83521631 -18.33141659 -18.24172346
-21.23477622 -17.42982297 -18.33777493 -20.80867991 -19.33387339
-20.44828891 -17.09257314 -17.402233 ]
diffs [%]: [ -1.11940749e-14 1.81803393e-14 -1.79111416e-14 1.93804645e-14
-1.94757567e-14 -1.67306387e-14 2.03829590e-14 -1.93737446e-14
1.70732295e-14 -1.83755919e-14 1.73741368e-14 -2.07851308e-14
-2.04152747e-14]
logL: [-9.999 2.235 1.206 2.197 2.281 -0.766 3.126 2.229 -0.353 1.358
0.226 3.337 3.129]
logT: [ 4.191 4.2021 3.9394 3.5975 3.5908 3.5554 3.6329 3.6472 3.5491
3.7593 3.7476 3.5266 3.5752]
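The per-column report above comes from a comparison script; a self-contained sketch of that kind of report (hypothetical, not the actual test_two_spec.py) looks like:

```python
import numpy as np

def report_column_diffs(name, vals_cache, vals_new):
    """Print min/max percent differences and the offending cached values,
    mirroring the per-column report format above (sketch, not the real
    script)."""
    diffs = 100.0 * (vals_new - vals_cache) / vals_cache
    bad = diffs != 0.0
    print('***** %s *****' % name)
    print('min/max diff [%%]: %g %g' % (diffs.min(), diffs.max()))
    print('values:', vals_cache[bad])
    print('diffs [%]:', diffs[bad])
    return diffs

# Cached values plus a copy perturbed by one ulp in a single element,
# reproducing the ~1e-14 % scale of the differences reported above.
cache = np.array([-17.9000348, -17.90809893, -19.45450982])
new = cache.copy()
new[0] = np.nextafter(cache[0], 0.0)  # one-ulp change toward zero
diffs = report_column_diffs('logHST_ACS_WFC_F475W_nd', cache, new)
```

A one-ulp change in a double near 17.9 is about 2e-16 relative, i.e. ~2e-14 %, matching the magnitudes in the report.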

@karllark karllark closed this Nov 28, 2017
@karllark karllark reopened this Nov 28, 2017
@karllark karllark changed the title Isochrone download and spectral grid regression tests Add regression tests replicating current phat_small example run Nov 28, 2017
@mfouesneau
Member

Smells like a compiler issue: macOS uses LLVM and Clang while Linux uses GCC or g++.
You have 1e-14 % variations?

@coveralls

Coverage Status

Coverage increased (+6.6%) to 18.741% when pulling 2a67f7e on karllark:ci_regression into 03207d2 on BEAST-Fitting:master.

@karllark
Member Author

Yes, 1e-14 variations. Very small. I would like to understand them to make sure they are not an indication of a bigger issue.
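As a quick check of the compiler/libm hypothesis, two algebraically equivalent routes to the same number can be compared. This is a minimal sketch with illustrative values, not the actual BEAST radius calculation:

```python
import numpy as np

# sqrt(y) and y**0.5 are mathematically identical, but np.power goes
# through libm's pow(), which can round differently from sqrt() depending
# on the compiler and math library a platform uses.
y = 10.0 ** np.linspace(-0.5, 4.3, 1000)  # e.g. L/Lsun-like values
a = np.sqrt(y)
b = np.power(y, 0.5)

# Any disagreement sits at the few-ulp level (~1e-16 relative), the same
# order as the Mac/Linux differences seen in these tests.
rel_diff = np.abs(a - b) / a
print(rel_diff.max())

# A tolerance a little above machine epsilon absorbs such noise.
np.testing.assert_allclose(a, b, rtol=1e-13)
```

Whether the two routes agree exactly depends on the platform's pow() implementation, which is consistent with differences appearing only on some machines.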

@coveralls

Coverage Status

Coverage increased (+7.6%) to 19.731% when pulling 7ecbb87 on karllark:ci_regression into 03207d2 on BEAST-Fitting:master.

@coveralls

Coverage Status

Coverage increased (+7.5%) to 19.673% when pulling de0f160 on karllark:ci_regression into 03207d2 on BEAST-Fitting:master.

@coveralls

Coverage Status

Coverage increased (+9.7%) to 21.792% when pulling 41c6e5f on karllark:ci_regression into 03207d2 on BEAST-Fitting:master.

@coveralls

Coverage Status

Coverage increased (+9.7%) to 21.784% when pulling ef57548 on karllark:ci_regression into 03207d2 on BEAST-Fitting:master.

@coveralls

Coverage Status

Coverage increased (+11.4%) to 23.524% when pulling 3296dfb on karllark:ci_regression into 03207d2 on BEAST-Fitting:master.

@coveralls

Coverage Status

Coverage increased (+12.0%) to 24.085% when pulling 06297fa on karllark:ci_regression into 03207d2 on BEAST-Fitting:master.

Included moving most of the code in fit_metrics to fit_metrics_old.
The fit_metrics code would not import in the built version of the beast
and I could not figure out why. As a test, I created a simple version of
the fit_metrics code that only included what was needed. This worked.
So there is something complicated about the inclusion of the C and python
versions and the switching between them that I could not fix. We have
only been using the python versions for a few years, and tests I did back
in the day found them to have the same speed. The old code is saved, so
we can always go back.
This means it will be ignored for the testing coverage.
@coveralls

Coverage Status

Coverage increased (+17.6%) to 29.692% when pulling 78c93b5 on karllark:ci_regression into 03207d2 on BEAST-Fitting:master.

@karllark
Member Author

Ready for merging!

@karllark
Member Author

Bump.

# Alternative: PARSEC1.2S -- old grid parameters
oiso = isochrone.PadovaWeb(modeltype='parsec12s', filterPMS=True)
#oiso = isochrone.PadovaWeb(modeltype='parsec12s', filterPMS=True)
Member

Again, be very careful: some parameters do not produce correct files on the PARSEC website when you use this model version.

Member Author

Can you be more specific?
I've run the BEAST with the parsec12s option and it works. Maybe the BEAST is not sensitive to the differences?

from astropy.tests.helper import remote_data
from astropy import units

from beast.observationmodel.observations import Observations
Member

Assuming beast is installed, correct?

Member Author

When running tests (with 'python setup.py test' or via the travis service), the beast is first installed and then the tests are run. So this is as it should be.

@karllark
Member Author

One review is enough for merging.

@karllark karllark merged commit da0b573 into BEAST-Fitting:master Jan 18, 2018
@karllark karllark deleted the ci_regression branch January 22, 2018 22:56
galaxyumi pushed a commit to galaxyumi/beast that referenced this pull request Jun 7, 2020
Add regression tests replicating current phat_small example run