Fix bug in error propagation in exposure differences that results in inflated uncertainties #54

gbrammer · 2024-02-22T10:28:38Z

There was a bug in the way that make_diff_image propagated the variances when combining the "positive" and "negative" sets of exposures.

msaexp/msaexp/slit_combine.py

Lines 872 to 882 in ef48256

    
    pos  = np.nansum(self.data[ipos,:], axis=0) / np.nansum(self.mask[ipos,:],
                     axis=0)
    vpos = np.nansum(self.var[ipos,:], axis=0) / np.nansum(self.mask[ipos,:],
                     axis=0)
    if self.diffs:
        ineg = ~self.unp[exp]
        neg  = (np.nansum(self.data[ineg,:], axis=0) /
                np.nansum(self.bkg_mask[ineg,:], axis=0))
        vneg = (np.nansum(self.var[ineg,:], axis=0) /
                np.nansum(self.bkg_mask[ineg,:], axis=0))

The first line, pos = ..., is an average across the science arrays, and vpos = ... is supposed to be the propagated variance of that average. The denominator in these lines, the sum over the mask arrays, is the number N of valid pixels across the exposures that are combined together. However, the denominator in vpos = ... and vneg = ... should be N**2, not N as written above!
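For reference, the propagation behind the N**2 denominator can be written out as below. This is a sketch that assumes the per-exposure errors are independent, which the description does not state explicitly; the symbols x_i and sigma_i are introduced here only for illustration.

```latex
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\mathrm{Var}(\bar{x}) = \frac{1}{N^{2}}\sum_{i=1}^{N}\sigma_i^{2}
% Dividing the summed variance by N instead gives the mean of the
% per-exposure variances, which is N times larger than Var(\bar{x}):
\frac{1}{N}\sum_{i=1}^{N}\sigma_i^{2} = N\,\mathrm{Var}(\bar{x})
```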

With this bug, the computed variance was then just the "average variance", not the "variance of the average" reduced by a factor N from the combination. The result is that the final reported uncertainties of the output spectra are too large by a factor Nexp**(1/4), where Nexp is the number of exposures in a group that were combined together (not necessarily the total number of exposures in a particular grating across all groups from, e.g., multiple masks). The extra factor of 2 in the exponent of this correction relative to sqrt(Nexp) comes from the fact that one power of N was already included; if the vpos = ... variances had just been summed without the denominator, the uncertainties would be Nexp**(1/2) too large.

For a standard 3-shutter nod pattern with Nexp=3, the uncertainties under this bug will be of order 3**(1/4) ~ 1.3 too high, but the factor can be larger for larger groups of combined exposures in the deeper programs. For the UNCOVER program, for example, Nexp = 18 in most cases for the groups of exposures for a particular mask, so the uncertainties are too large by a factor 18**(1/4) ~ 2.
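The two factors quoted above can be checked with a couple of lines of Python. This is only an arithmetic sketch of the Nexp**(1/4) scaling as stated in the description; `quoted_inflation_factor` is a hypothetical helper name, not something in msaexp.

```python
def quoted_inflation_factor(nexp):
    """Nexp**(1/4), the scaling quoted in the description (hypothetical helper)."""
    return nexp ** (1 / 4)


for nexp in (3, 18):
    print(f"Nexp = {nexp:2d}: Nexp**(1/4) = {quoted_inflation_factor(nexp):.2f}")

# Note on notation: in Python, ** binds tighter than /, so the parentheses
# around the exponent matter. Without them the expression is (3**1)/4.
without_parentheses = 3**1/4
```

The parentheses in the exponent are needed if the expression is pasted into an interpreter, since `3**1/4` evaluates to 0.75 rather than the fourth root of 3.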

This bug affects all of the public v2 MSAEXP extractions before the date of this fix (22-Feb-2024), though to some extent, it doesn't directly affect the parameters derived from the spectra since the uncertainties are rescaled anyway in the redshift and line fits.

gbrammer · 2024-02-22T10:33:51Z

There are some other minor updates to docstrings in the commit above. The relevant lines addressing the bug are

msaexp/msaexp/slit_combine.py

Lines 883 to 894 in 3e012ec

    
    pos  = (np.nansum(self.data[ipos,:], axis=0) /
            np.nansum(self.mask[ipos,:], axis=0))
    vpos = (np.nansum(self.var[ipos,:], axis=0) /
            np.nansum(self.mask[ipos,:], axis=0)**2)
    if self.diffs:
        ineg = ~self.unp[exp]
        neg  = (np.nansum(self.data[ineg,:], axis=0) /
                np.nansum(self.bkg_mask[ineg,:], axis=0))
        vneg = (np.nansum(self.var[ineg,:], axis=0) /
                np.nansum(self.bkg_mask[ineg,:], axis=0)**2)
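To see the effect of the changed denominator in isolation, here is a minimal standalone sketch on toy arrays. It is not the msaexp implementation: `combine_exposures` and the toy inputs are hypothetical, and it assumes independent errors with invalid pixels stored as NaN wherever the mask is False.

```python
import numpy as np


def combine_exposures(data, var, mask):
    """Average exposures along axis 0 and propagate the variance.

    Hypothetical helper for illustration only. Returns the average, the
    variance of the average (denominator N**2, as in the fix) and the
    average variance (denominator N, as before the fix).
    """
    n = np.nansum(mask, axis=0)                  # N valid pixels per column
    avg = np.nansum(data, axis=0) / n
    var_of_avg = np.nansum(var, axis=0) / n**2   # corrected propagation
    avg_var = np.nansum(var, axis=0) / n         # pre-fix expression
    return avg, var_of_avg, avg_var


# Toy input: 3 exposures x 4 pixels, unit variance, one invalid pixel
data = np.array([[1.0, 2.0, 3.0, 4.0],
                 [1.0, 2.0, 3.0, np.nan],
                 [1.0, 2.0, 3.0, 4.0]])
mask = np.isfinite(data)
var = np.where(mask, 1.0, np.nan)

avg, var_of_avg, avg_var = combine_exposures(data, var, mask)

# The pre-fix variance exceeds the corrected one by N at each pixel
ratio = avg_var / var_of_avg
print(ratio)
```

In this toy case the ratio is the per-pixel count of valid exposures, so it drops from 3 to 2 in the column with the masked pixel.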

fix bug in error propagation in exposure differences

4d0cc4a

gbrammer merged commit 3e012ec into main Feb 22, 2024
5 checks passed

gbrammer changed the title ~~Fix bug in error propagation in exposure differences that result in inflated uncertainties~~ Fix bug in error propagation in exposure differences that results in inflated uncertainties Feb 22, 2024

gbrammer deleted the variance-bug branch April 25, 2024 10:05
