Add tests for TS map image computation #475

Merged
merged 8 commits into gammapy:master from OlgaVorokh:unit_test on Mar 17, 2016

OlgaVorokh commented Feb 26, 2016

No description provided.

OlgaVorokh commented Feb 28, 2016

Hi @adonath,

could you please review?

@cdeil cdeil added the tests label Feb 29, 2016

@cdeil cdeil added this to the 0.4 milestone Feb 29, 2016

@cdeil cdeil self-assigned this Feb 29, 2016

cdeil commented Feb 29, 2016

@OlgaVorokh - Thanks!

I haven't looked in detail at what this test does, but here's some high-level feedback.

The main goal of the tests of the command line scripts is to exercise the command line script to make sure it's working.

We have found that everything untested breaks within weeks or months, because Gammapy is under rapid development and we refactor quite often or rename arguments to make things better and more consistent. This will get better / more stable after the Gammapy 1.0 release this summer, but for now we take every opportunity to improve things.

As you can see here, there are two blocks of code that aren't tested for gammapy-image-ts:
https://coveralls.io/builds/5224587/source?filename=gammapy%2Fscripts%2Fimage_ts.py#L75
https://coveralls.io/builds/5224587/source?filename=gammapy%2Fscripts%2Fimage_ts.py#L91

Is it possible to exercise these options without too much work / extra test code?
Sometimes @pytest.mark.parametrize is a good way to exercise more options without writing much more or much more complex test code.
In this case the list of options passed to parametrize could be a few different sets of args to call the command line tool with.
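
For illustration, a minimal sketch of that pattern; the option names and values below are hypothetical, not the actual gammapy-image-ts interface, and only the parametrize mechanism is the point:

import subprocess
import pytest

@pytest.mark.parametrize('extra_args', [
    ['--scales', '0'],
    ['--scales', '0.1,0.2'],
    ['--downsample', '2'],
])
def test_image_ts_command_line(extra_args, tmp_path):
    # The test body runs once per argument list defined above.
    out_file = tmp_path / 'output.fits'
    cmd = ['gammapy-image-ts', 'input_all.fits', str(out_file)] + extra_args
    subprocess.check_call(cmd)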

The other point is that there is already a similar test in
https://github.com/gammapy/gammapy/blob/master/gammapy/detect/tests/test_test_statistics.py#L15

Personally I like high-level tests, and I don't care if we test the library code by calling it directly or via the command line interface. The main goals are that most important things are tested, and ideally that there are no duplicate tests.
In this case it's not obvious to me to what degree the command line test duplicates the library test ... is it OK or can this be improved?

I think in your test here lines 21 to 46 set up an example dataset, right?
Can you move that out into a separate function for better test readability?
Maybe it's even possible to share the test dataset with test_compute_ts_map?

The test you add here takes 1.66 seconds:

$ time python setup.py test -V -a '--durations=100' --coverage
...
1.66s call     lib.macosx-10.11-x86_64-3.5/gammapy/scripts/tests/test_image_ts.py::test_command_line_gammapy_image_ts

That's OK.

But if you can test the same things in a faster way (e.g. by downsampling a bit more), please do.

adonath commented Feb 29, 2016

@OlgaVorokh Thanks for this PR!

The code itself looks good, but as @cdeil already mentioned it would be better to restructure it a bit. So as a summary:

  • Move the set-up of the dataset into a separate function that writes input_all.fits (a sketch follows at the end of this comment)
  • Run the command line tool image_ts for multiple scales on the input file, read the result files and check the values in them
  • Once the command line test works, the library test can be removed

The library test was old code and partly incomplete, e.g. the up-sampling code is now in compute_ts_map_multiscale and is no longer needed in the test.
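
Here is a sketch of such a set-up helper, assuming the input file simply bundles counts, background and exposure images; the extension names and contents are illustrative, not necessarily the exact format gammapy-image-ts expects:

import numpy as np
from astropy.io import fits

def write_input_all(filename='input_all.fits', shape=(200, 200)):
    """Write a small test dataset (illustrative extension names and contents)."""
    rng = np.random.RandomState(0)
    background = np.ones(shape)
    exposure = 1e11 * np.ones(shape)
    counts = rng.poisson(background)  # Poisson noise on top of the flat background
    hdus = fits.HDUList([
        fits.PrimaryHDU(),
        fits.ImageHDU(data=counts, name='counts'),
        fits.ImageHDU(data=background, name='background'),
        fits.ImageHDU(data=exposure, name='exposure'),
    ])
    hdus.writeto(filename, overwrite=True)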

OlgaVorokh commented Mar 7, 2016

Oh, it's been a busy week :)

@adonath Sorry for the delay, but I am still not sure how we can check the correctness of the values for multiple scales. When I asked this in the previous ticket you suggested looking at https://github.com/gammapy/gammapy/blob/master/gammapy/detect/tests/test_test_statistics.py, but it only has information for one scale -- several asserts:
assert_allclose(1705.840212274973, result.ts[99, 99], rtol=1e-3)
assert_allclose([[99], [99]], np.where(result.ts == result.ts.max()))
assert_allclose(6, result.niter[99, 99])
assert_allclose(1.0227934338735763e-09, result.amplitude[99, 99], rtol=1e-3)

Moreover, I found a silly bug in my unit test that made it always pass.
I can't get the same results from the library call of compute_ts_map and from the command line call, because I can't pass the same kernel:

  1. the library call (from the outdated test test_test_statistics) uses the kernel astropy.convolution.Gaussian2DKernel(2.5)
  2. the command line call goes through compute_ts_multiscale, which constructs multi_gauss_psf_kernel()
    I am not sure how to convert a multi_gauss_psf_kernel to a Gaussian2DKernel so that the command line tool gives the same results and the asserts above can be reused, so we probably need another way of validating these values?

adonath commented Mar 7, 2016

@OlgaVorokh

No worries, we're not in a hurry :-) You're right, I forgot that the command line tool uses
a different kernel format. But you should be able to reproduce the numbers from the test when using only a single Gaussian kernel. Here's a corresponding JSON file that can be read by the command line tool:

{
    "psf1": {
        "ampl": 1, 
        "fwhm": 5.8870501125773735
    }, 
    "psf2": {
        "ampl": 0, 
        "fwhm": 1E-5
    }, 
    "psf3": {
        "ampl": 0, 
        "fwhm": 1E-5
    }
}

The numbers for the other scales don't exist yet and have to be added. So run the tool and check the values in the output files. You can look the values up using e.g. astropy:

from astropy.io import fits
hdulist = fits.open('result.fits')
print(hdulist['ts'].data[99, 99])

Does that make sense?
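
As a side note, a quick cross-check sketch showing that the psf1 fwhm above is just the sigma=2.5 of the library test's Gaussian2DKernel converted to a FWHM:

from astropy.stats import gaussian_sigma_to_fwhm

sigma = 2.5  # the Gaussian2DKernel width used in the library test
print(sigma * gaussian_sigma_to_fwhm)  # ~5.8870501125773735, the "psf1" fwhm above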

OlgaVorokh commented Mar 7, 2016

@adonath

I am having several issues:

In compute_ts_map_multiscale, when we create the kernel we pass the parameter mode='oversample', and we always use it there (for example, we pass it to the Gaussian2DKernel constructor in this line). We can do the same in test_image_ts.py, but then the asserts

assert_allclose(1705.840212274973, result.ts[99, 99], rtol=1e-3)
assert_allclose([[99], [99]], np.where(result.ts == result.ts.max()))
assert_allclose(6, result.niter[99, 99])
assert_allclose(1.0227934338735763e-09, result.amplitude[99, 99], rtol=1e-3)

will fail.
What do you think? Perhaps we shouldn't merge the library and image_ts tests, so that the asserts above can stay in test_compute_ts_map and we can add regression tests for different scales in test_image_ts.

Another argument for not merging: in psf_core.py, when we create a new kernel we normalize it before returning the result. We can observe this in compute_ts_map_multiscale as well (if scale > 0, for example).
We could also normalise the kernel to compensate for this in test_image_ts.py, but that would break the asserts as well.

By the way, is it normal that we sometimes convolve with normalized kernels and sometimes with unnormalised ones?

adonath commented Mar 7, 2016

@OlgaVorokh Thanks for looking into this in that much detail! I will try to address your comments one by one:

  • You're right about the mode='oversample' argument; it will give slightly different numbers than mode='center' (the default). As long as the numbers don't differ much from the ones in the existing test (let's say < 1%), I'm totally fine with changing the numbers (to the ones obtained with mode='oversample') to let the test pass.
  • Concerning the normalization of the kernel, that shouldn't make a large difference either. By default the Gaussian2DKernel is normalized, i.e. it sums to unity. But because the Gaussian has infinitely extended tails, it doesn't sum exactly to one (the difference is stored in the truncation attribute of the kernel). If kernel.normalize() is called, the kernel is re-normalized explicitly, i.e. kernel /= sum(kernel), which guarantees that the kernel sums exactly to one (a small sketch follows below). So again, if the difference isn't larger than ~1%, I'm fine with just changing the numbers in the test.

I think the only argument against the library test is that it takes rather long (a few seconds) to run. So my suggestion would be to just change the numbers, as long as they don't differ by more than ~1%, and remove the library test. Otherwise keep the library test and add a "TODO" comment that we should check why the numbers changed that much.
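
A minimal sketch of the normalization behaviour described above (plain astropy, not code from this PR):

import numpy as np
from astropy.convolution import Gaussian2DKernel

kernel = Gaussian2DKernel(2.5)      # normalized on construction, but the tails are truncated
print(kernel.array.sum())           # slightly below 1
print(kernel.truncation)            # deviation of the kernel sum from one
kernel.normalize()                  # explicit re-normalization: kernel /= kernel.array.sum()
print(np.isclose(kernel.array.sum(), 1.0))  # True: now sums to one (to float precision)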

OlgaVorokh commented Mar 10, 2016

@adonath @cdeil
Quick question: when testing with multiple scales, should I save the expected output to gammapy-extra, or add several asserts checking specific parts of the image in the code?

adonath commented Mar 10, 2016

@OlgaVorokh I think testing single, selected values in the image might be sufficient. But I actually like the idea of having a full set of reference maps in gammapy-extra. This would make the testing more complete (e.g. when only values at the boundaries of the image change) and the reference maps could be used in the docs as well. So I'd prefer if you save the expected output in gammapy-extra/test_datasets/unbundled/poisson_stats_image/ and retrieve the data from there.
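
For illustration, a sketch of how the test could compare a freshly produced map against such a reference map; the file names here are placeholders, while the 'ts', 'amplitude' and 'niter' extensions are the quantities discussed above:

from astropy.io import fits
from numpy.testing import assert_allclose

actual = fits.open('ts_map_0.000.fits')                # output of the command line tool (placeholder name)
reference = fits.open('reference_ts_map_0.000.fits')   # map stored in gammapy-extra (placeholder name)

for name in ['ts', 'amplitude', 'niter']:
    assert_allclose(actual[name].data, reference[name].data, rtol=1e-3)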

OlgaVorokh commented Mar 11, 2016

@adonath

I have a suggestion: maybe it's better to create a correct reference file for scale=0?
Then we would be able to remove the first half of the test. What do you think?

adonath commented Mar 11, 2016

@OlgaVorokh Yes, that's what I actually meant in one of my previous posts. I think it's only necessary to check that the numbers don't change too much compared to the first version of the library test. Once the numbers are verified, the first part of the test can go...

There's one thing that I noticed: in the test there's still code that does upsampling, but this shouldn't be necessary. The upsampling is done in compute_ts_map_multiscale (here). All in all I think it's sufficient to run the image_ts_main function only once and then compare the output to the reference output (which only contains the upsampled images).

Thanks for your work!

OlgaVorokh commented Mar 12, 2016

@adonath

Another quick question: if I delete the downsample and upsample steps from the test code as you suggested, the test fails. I believe this happens because of the different "order" arguments in the upsampling/downsampling.

Compare:

  • downsampling: this in test_image_ts.py vs. this in compute_ts_map_multiscale
  • upsampling: this in test_image_ts.py vs. this in compute_ts_map_multiscale

Should I keep the downsampling/upsampling in the test, or regenerate the expected files using the downsampling from compute_ts_multiscale (this will break the asserts from the library test)?

adonath commented Mar 13, 2016

@OlgaVorokh What you said in your last sentence is what I meant. Please just regenerate the output files using compute_ts_multiscale or the command line tool (using scales 0 and e.g. 0.05, 0.1, 0.2). Then check by hand that the numbers for scale 0 correspond to the numbers used in the library test. If they agree well enough (~1%), the library test can go; if not, keep the test as it is and add a comment that the numbers are different so we can take a look later at why that is the case.

OlgaVorokh commented Mar 13, 2016

@adonath

Output files before and after regeneration were significantly different because of different downsampling & upsampling parameters.

adonath commented Mar 14, 2016

@OlgaVorokh Sorry, that was my fault. I missed the downsampling factor when giving you the PSF parameters. So it should actually be 2 * 5.887...:

{
    "psf1": {
        "ampl": 1, 
        "fwhm": 11.7741002252
    }, 
    "psf2": {
        "ampl": 0, 
        "fwhm": 1E-5
    }, 
    "psf3": {
        "ampl": 0, 
        "fwhm": 1E-5
    }
}

Otherwise this looks very good! Could you re-run the TS computation once again with the corrected PSF parameters and update both PRs (the one in gammapy-extra as well)? I think then we're ready to merge!

OlgaVorokh commented Mar 14, 2016

@adonath

Also renamed and regenerated the expected files in the gammapy-extra PR.

adonath commented Mar 15, 2016

@OlgaVorokh

I just merged gammapy/gammapy-extra#31. In this PR there's still a failure on Travis-CI: https://travis-ci.org/gammapy/gammapy/jobs/115938139. Could you look into that? And you might have to rebase... Otherwise this PR is in good shape!

@OlgaVorokh OlgaVorokh force-pushed the OlgaVorokh:unit_test branch from 349d932 to eceea71 Mar 17, 2016

OlgaVorokh commented Mar 17, 2016

@adonath I was able to reproduce the problem on another machine; it looks like the tests pass now.

adonath added a commit that referenced this pull request Mar 17, 2016

Merge pull request #475 from OlgaVorokh/unit_test
Add unit test for the compute_ts_map

@adonath adonath merged commit b6d3671 into gammapy:master Mar 17, 2016

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed

adonath commented Mar 17, 2016

@OlgaVorokh Just merged! Congratulations on your first gammapy contribution and thanks for the good work!

cdeil commented Mar 17, 2016

@OlgaVorokh - Thank you!

Make sure to mention #456 and this PR in your GSoC application, and send a first draft soon so that we can iterate on it next week.

@cdeil cdeil changed the title from Add unit test for the compute_ts_map to Add tests for TS map image computation Apr 20, 2016

@OlgaVorokh OlgaVorokh deleted the OlgaVorokh:unit_test branch May 25, 2016

@OlgaVorokh OlgaVorokh restored the OlgaVorokh:unit_test branch May 25, 2016
