-
-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement cross-correlation redshift/rv-fiding #544
Merged
Merged
Changes from all commits
Commits
Show all changes
27 commits
Select commit
Hold shift + click to select a range
d793c23
Adding a basic x-correlation function.
bafc51b
Computes basic correlation between two spectra. Uses numpy.correlate.
ibusko e10707f
Basic correlation is in place.
ibusko 30682b6
Check position of correlation peak
ibusko 88e0c27
Lags expressed in km/s. Return object is a tuple and no longer a Spec…
70a3298
Remove blank lines at end of file.
ibusko e8f4712
Add new line at end of file.
ibusko a389d06
Fix doc string.
ibusko eeb2fb7
Add doc entry.
b528e4e
Minor change in doc.
b67d4d2
Fixes in documentation.
ibusko 9ad767b
Fix lag computation.
3872887
Make this test more realistic and sensitive.
ibusko 3297b8c
Add autocorrelation test.
ibusko a0f82c6
Add test for zero padding
ibusko 28369c5
Stretch a bit the zero-padding test so as to make the reference line …
ibusko f34a3ac
Add test with non-correlated features in observed and template spectra.
ibusko 43aef0b
Doc standards compliance
ibusko 71d0831
Attempt to fix code block in documentation
ibusko e2bf798
Final fix (tests pass and notebooks work)
90afc3c
Fix build errors
34562de
Modify as per code review.
ibusko 103a2c7
Fix correlation import in doctest
nmearl 0748390
Fix api references in doc building
nmearl cabc982
comment/docs cleanup
eteq 4ea8092
change apodization to window keyword
eteq c033053
refactor to have template_logwl_resample
eteq File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,246 @@ | ||
import astropy.units as u | ||
import numpy as np | ||
from astropy import constants as const | ||
from astropy.units import Quantity | ||
from scipy.signal.windows import tukey | ||
|
||
from ..manipulation import LinearInterpolatedResampler | ||
from .. import Spectrum1D | ||
|
||
__all__ = ['template_correlate', 'template_logwl_resample'] | ||
|
||
_KMS = u.Unit('km/s') # for use below without having to create a composite unit | ||
|
||
|
||
def template_correlate(observed_spectrum, template_spectrum, lag_units=_KMS, | ||
apodization_window=0.5, resample=True): | ||
""" | ||
Compute cross-correlation of the observed and template spectra. | ||
|
||
|
||
After re-sampling into log-wavelength, both observed and template | ||
spectra are apodized by a Tukey window in order to minimize edge | ||
and consequent non-periodicity effects and thus decrease | ||
high-frequency power in the correlation function. To turn off the | ||
apodization, use alpha=0. | ||
|
||
Parameters | ||
---------- | ||
observed_spectrum : :class:`~specutils.Spectrum1D` | ||
The observed spectrum. | ||
template_spectrum : :class:`~specutils.Spectrum1D` | ||
The template spectrum, which will be correlated with | ||
the observed spectrum. | ||
lag_units: `~astropy.units.Unit` | ||
Must be a unit with velocity physical type for lags in velocity. To | ||
output the lags in redshift, use ``u.dimensionless_unscaled``. | ||
apodization_window: float, callable, or None | ||
If a callable, will be treated as a window function for apodization of | ||
the cross-correlation (should behave like a `~scipy.signal.windows` | ||
window function, with ``sym=True``). If a float, will be treated as the | ||
``alpha`` parameter for a Tukey window (`~scipy.signal.windows.tukey`), | ||
in units of pixels. If None, no apodization will be performed | ||
resample: bool or dict | ||
If True or a dictionary, resamples the spectrum and template following | ||
the process in `template_logwl_resample`. If a dictionary, it will be | ||
used as the keywords for `template_logwl_resample`. For example, | ||
``resample=dict(delta_log_wavelength=.1)`` would be the same as calling | ||
``template_logwl_resample(spectrum, template, delta_log_wavelength=.1)``. | ||
If False, *no* resampling is performed (and the user is responsible for | ||
a sensible resampling). | ||
|
||
Returns | ||
------- | ||
(`~astropy.units.Quantity`, `~astropy.units.Quantity`) | ||
Arrays with correlation values and lags in km/s | ||
""" | ||
|
||
# resample if the user requested to log wavelength | ||
if resample: | ||
if resample is True: | ||
resample_kwargs = dict() # use defaults | ||
else: | ||
resample_kwargs = resample | ||
log_spectrum, log_template = template_logwl_resample(observed_spectrum, | ||
template_spectrum, | ||
**resample_kwargs) | ||
else: | ||
log_spectrum = observed_spectrum | ||
log_template = template_spectrum | ||
|
||
# apodize (might be a no-op if apodization_window is None) | ||
observed_log_spectrum, template_log_spectrum = _apodize(log_spectrum, | ||
log_template, | ||
apodization_window) | ||
# Normalize template | ||
normalization = _normalize(observed_log_spectrum, template_log_spectrum) | ||
|
||
# Not sure if we need to actually normalize the template. Depending | ||
# on the specific data uncertainty, the normalization factor | ||
# may turn out negative. That causes a flip of the correlation function, | ||
# in which the maximum (correlation peak) is no longer meaningful. | ||
if normalization < 0.: | ||
normalization = 1. | ||
|
||
# Correlate | ||
corr = np.correlate(observed_log_spectrum.flux.value, | ||
(template_log_spectrum.flux.value * normalization), | ||
mode='full') | ||
|
||
# Compute lag | ||
# wave_l is the wavelength array equally spaced in log space. | ||
wave_l = observed_log_spectrum.spectral_axis.value | ||
delta_log_wave = np.log10(wave_l[1]) - np.log10(wave_l[0]) | ||
deltas = (np.array(range(len(corr))) - len(corr)/2 + 0.5) * delta_log_wave | ||
lags = np.power(10., deltas) - 1. | ||
|
||
if u.dimensionless_unscaled.is_equivalent(lag_units): | ||
lags = Quantity(lags, u.dimensionless_unscaled) | ||
elif _KMS.is_equivalent(lag_units): | ||
lags = lags * const.c.to(lag_units) | ||
else: | ||
raise u.UnitsError('lag_units must be either velocity or dimensionless') | ||
|
||
return corr * u.dimensionless_unscaled, lags | ||
|
||
|
||
def _apodize(spectrum, template, apodization_window): | ||
# Apodization. Must be performed after resampling. | ||
if apodization_window is None: | ||
clean_spectrum = spectrum | ||
clean_template = template | ||
else: | ||
if callable(apodization_window): | ||
window = apodization_window | ||
else: | ||
def window(wlen): | ||
return tukey(wlen, alpha=apodization_window) | ||
clean_spectrum = spectrum * window(len(spectrum.wavelength)) | ||
clean_template = template * window(len(template.wavelength)) | ||
|
||
return clean_spectrum, clean_template | ||
|
||
|
||
def template_logwl_resample(spectrum, template, wblue=None, wred=None, | ||
delta_log_wavelength=None, | ||
resampler=LinearInterpolatedResampler()): | ||
""" | ||
Resample a spectrum and template onto a common log-spaced spectral grid. | ||
|
||
If wavelength limits are not provided, the function will use | ||
the limits of the merged (observed+template) wavelength scale | ||
for building the log-wavelength scale. | ||
|
||
For the wavelength step, the function uses either the smallest wavelength | ||
interval found in the *observed* spectrum, or takes it from the | ||
``delta_log_wavelength`` parameter. | ||
|
||
Parameters | ||
---------- | ||
observed_spectrum : :class:`~specutils.Spectrum1D` | ||
The observed spectrum. | ||
template_spectrum : :class:`~specutils.Spectrum1D` | ||
The template spectrum. | ||
wblue, wred: float | ||
Wavelength limits to include in the correlation. | ||
delta_log_wavelength: float | ||
Log-wavelength step to use to build the log-wavelength | ||
scale. If None, use limits defined as explained above. | ||
resampler | ||
A specutils resampler to use to actually do the resampling. Defaults to | ||
using a `~specutils.manipulation.LinearInterpolatedResampler`. | ||
|
||
Returns | ||
------- | ||
resampled_observed : :class:`~specutils.Spectrum1D` | ||
The observed spectrum resampled to a common spectral_axis. | ||
resampled_template: :class:`~specutils.Spectrum1D` | ||
The template spectrum resampled to a common spectral_axis. | ||
""" | ||
|
||
# Build an equally-spaced log-wavelength array based on | ||
# the input and template spectrum's limit wavelengths and | ||
# smallest sampling interval. Consider only the observed spectrum's | ||
# sampling, since it's the one that counts for the final accuracy | ||
# of the correlation. Alternatively, use the wred and wblue limits, | ||
# and delta log wave provided by the user. | ||
|
||
if wblue: | ||
w0 = np.log10(wblue) | ||
else: | ||
ws0 = np.log10(spectrum.spectral_axis[0].value) | ||
wt0 = np.log10(template.spectral_axis[0].value) | ||
w0 = min(ws0, wt0) | ||
|
||
if wred: | ||
w1 = np.log10(wred) | ||
else: | ||
ws1 = np.log10(spectrum.spectral_axis[-1].value) | ||
wt1 = np.log10(template.spectral_axis[-1].value) | ||
w1 = max(ws1, wt1) | ||
|
||
if delta_log_wavelength is None: | ||
ds = np.log10(spectrum.spectral_axis.value[1:]) - np.log10(spectrum.spectral_axis.value[:-1]) | ||
dw = ds[np.argmin(ds)] | ||
else: | ||
dw = delta_log_wavelength | ||
|
||
nsamples = int((w1 - w0) / dw) | ||
|
||
log_wave_array = np.ones(nsamples) * w0 | ||
for i in range(nsamples): | ||
log_wave_array[i] += dw * i | ||
|
||
# Build the corresponding wavelength array | ||
wave_array = np.power(10., log_wave_array) * spectrum.spectral_axis.unit | ||
|
||
# Resample spectrum and template into wavelength array so built | ||
resampled_spectrum = resampler(spectrum, wave_array) | ||
resampled_template = resampler(template, wave_array) | ||
|
||
# Resampler leaves Nans on flux bins that aren't touched by it. | ||
# We replace with zeros. This has the net effect of zero-padding | ||
# the spectrum and/or template so they exactly match each other, | ||
# wavelengthwise. | ||
clean_spectrum_flux = np.nan_to_num(resampled_spectrum.flux.value) * resampled_spectrum.flux.unit | ||
clean_template_flux = np.nan_to_num(resampled_template.flux.value) * resampled_template.flux.unit | ||
|
||
clean_spectrum = Spectrum1D(spectral_axis=resampled_spectrum.spectral_axis, | ||
flux=clean_spectrum_flux, | ||
uncertainty=resampled_spectrum.uncertainty, | ||
velocity_convention='optical', | ||
rest_value=spectrum.rest_value) | ||
clean_template = Spectrum1D(spectral_axis=resampled_template.spectral_axis, | ||
flux=clean_template_flux, | ||
uncertainty=resampled_template.uncertainty, | ||
velocity_convention='optical', | ||
rest_value=template.rest_value) | ||
|
||
return clean_spectrum, clean_template | ||
|
||
|
||
def _normalize(observed_spectrum, template_spectrum): | ||
""" | ||
Calculate a scale factor to be applied to the template spectrum so the | ||
total flux in both spectra will be the same. | ||
|
||
Parameters | ||
---------- | ||
observed_spectrum : :class:`~specutils.Spectrum1D` | ||
The observed spectrum. | ||
template_spectrum : :class:`~specutils.Spectrum1D` | ||
The template spectrum, which needs to be normalized in order to be | ||
compared with the observed spectrum. | ||
|
||
Returns | ||
------- | ||
`float` | ||
A float which will normalize the template spectrum's flux so that it | ||
can be compared to the observed spectrum. | ||
""" | ||
num = np.nansum((observed_spectrum.flux*template_spectrum.flux)/ | ||
(observed_spectrum.uncertainty.array**2)) | ||
denom = np.nansum((template_spectrum.flux/ | ||
observed_spectrum.uncertainty.array)**2) | ||
|
||
return num/denom |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably a question for @eteq or @camipacifici: would be want to demonstrate loading in a simple ASCII file with e.g. wavelength/flux values to use as the template? (That is, for someone perusing the docs, would that be more of a helpful example?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think the more useful one is something that actually uses real templates that are loaded as
Spectrum1D
objects. That is, a fits file with an actual spectrum.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E.g. take a galaxy template from synphot and convert it to ascii and load in into a Spectrum1D object to use as the template.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some templates in the Synphot library (Bruzual-Charlot galaxy templates for example) https://pysynphot.readthedocs.io/en/latest/ $PYSYN_CDBS/grid/bc95/. I'd suggest bc95_g_10E9.fits. I don't think there is a URL available for this single file, but @pllim might know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In developing a test case which uses a real template spectrum (wave and flux from bc95_g_10E9.fits converted to ASCII format), I realized that we need also an observed spectrum to correlate against this template. Any idea on how to get one of those without resorting to too many manipulations? The goal is to have this in the docs, so I believe it has to be straightforward to get the observed spectrum, as it is to get the template.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ibusko - For an observed spectrum, I think the easiest thing is to use the SDSS spectrum that's already an example from the specutils docs - see https://specutils.readthedocs.io/en/stable/ (or the
index.rst
here) for the exact URL. That way we don't have to keep track of multiple examples.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although @camipacifici pointed out to me out-of-band that she has some actual observed spectra and matched templates for them that is part of the notebook she's working on. Maybe we should use those as the template/observed combo?
Unfortunately I'm not sure it's going to be easy to access these publicly until she gets a chance to put them in a reasonable data location. So it might be best to follow the above just to get this PR in, but with a plan to move to @camipacifici's data once we can get it (and use it also in the chi2-template-fitting section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The data is served by RedCat, independent of Synphotses code base. Pick your poison at https://ssb.stsci.edu/cdbs/grid/bc95/templates/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put in place a unit test function in a branch in my fork for people to review and continue the discussion. The goal is to have something that can be put in the docs as an example on how to use the correlation function with real data. I found that it is perhaps trickier and lengthier than it should be, since it involves other aspects of the process besides the correlation itself. The code is in function
test_correlation_real_data
in ibusko:redshift-measurement-sandbox.Temporarily I have both the observation and template spectra files added to the
tests
directory, because my dev environment (in PyCharm) can't run the test (it skips it) if I use the remote file loading code. It runs from the command line tough, so it's not a real problem. More important, remote loading works for the observed SDSS spectrum, but not for the synphot template. This file has to be manually changed into ECSV format to be usable. I think there is a way to use the file as is, but I didn't invest time pursuing it since that's not the goal of this exercise. These are topics to be solved later on. For now, we have to decide how to translate the code in the test function into a doc-usable code snippet.Note that the test function is not doing all possible steps that are eventually needed to pre-process spectra before they can be cross-correlated, such as continuum subtraction or rectification. Should they be included in the example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I implemented API for
synphot
to read from and write toSpectrum1D
. Maybe it'll make things easier for you. See spacetelescope/synphot_refactor#243