
Introducing Horne extraction #84

Merged: 12 commits into astropy:main, Mar 29, 2022
Conversation

@ojustino (Contributor)

This pull request introduces Horne extraction classes, a Jupyter notebook adapted from spacetelescope/dat_pyinthesky#163 explaining how the classes were developed, and another Jupyter notebook comparing the results.

(I'm manually tagging @ibusko since I can't currently request him as a reviewer.)

extract.py

After some productive troubleshooting with @PatrickOgle, we have the HorneExtract and OptimalExtract classes. I made the latter as an alias since it seems many prefer that name.

There are a few NOTE comments in the code. Here's some other food for thought:

  1. Should we still expect to support CCDData image objects? Most of our examples use numpy arrays.
  2. How much control should users have over the model used to fit fluxes in each column? For now, it's assumed to be Gaussian with a hard-coded spatial standard deviation. We could take that standard deviation as an argument.
    • Taking the whole model as an argument (as with bkgrd_prof) allows usage of other models (Moffat, etc.) but presents difficulties: we need to use the trace to fix the mean, and not every astropy.modeling model has a "mean" parameter. (A sketch of one way to handle this follows the list.)
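To make that trade-off concrete, here is a minimal sketch of accepting a user-supplied profile model and pinning its center to the trace. The function name and the handling of non-Gaussian center parameters are hypothetical, not code from this PR:

from astropy.modeling import models, fitting
import numpy as np

def fit_column_profile(col_data, trace_center, profile=None):
    # default to the PR's current assumption: Gaussian with a hard-coded width
    prof = profile if profile is not None else models.Gaussian1D(
        amplitude=col_data.max(), stddev=2)
    # the center parameter's name varies by model: Gaussian1D uses .mean,
    # while Moffat1D and Lorentz1D use .x_0
    if hasattr(prof, 'mean'):
        prof.mean = trace_center
        prof.mean.fixed = True
    elif hasattr(prof, 'x_0'):
        prof.x_0 = trace_center
        prof.x_0.fixed = True
    fitter = fitting.LevMarLSQFitter()
    return fitter(prof, np.arange(col_data.size), col_data)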

optimal_extract_VLT.ipynb

I recommend at least reading the introduction for a step-by-step description of which steps of the Horne extraction process I've covered in the classes and which are left to the user.

compare_extractions.ipynb

We can take or leave this notebook, which generates a fake image and then compares the results from the Horne and boxcar algorithms. It was helpful in diagnosing some issues with the Horne work, and the final result is a good visualization of the improvement in S/N gained through Horne extraction.

@PatrickOgle

Optimal extract works as expected in compare_extractions.ipynb. The normalization matches BoxcarExtract and the SNR is greatly improved for a wide extraction aperture.

An essential addition would be an option for "weights" analogous to those used in BoxcarExtract, to enable delineating an extraction region and suppressing bad regions. These weights should be multiplied into the optimal weights; a rough sketch follows.
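A rough sketch of that suggestion (the function and argument names are hypothetical, not part of this PR):

import numpy as np

def combined_weights(profile_img, variance_img, user_weights=None):
    # Horne-style weights scale as profile / variance...
    weights = profile_img / variance_img
    # ...and a user-supplied weight image (e.g., 1 inside the extraction
    # region, 0 over bad pixels) can simply be multiplied in
    if user_weights is not None:
        weights = weights * user_weights
    return weights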

@PatrickOgle

One future enhancement (not for this PR) would be to allow the user to compute the kernel from a higher-SNR secondary source, such as a 2D spectrum of a star observed with the same instrumental setup. The 2D star spectrum would be input, the stellar trace and kernel computed, then shifted to align with the object trace. This would allow optimal extraction of extremely noisy spectra where the instrumental profile is difficult to measure.

[Two resolved review threads on specreduce/extract.py]
Comment on lines 226 to 227:

if np.ma.is_masked(g_x):
    continue
@kecnry (Member):

when the continue is triggered, won't the length of pixels and extracted be different when you create the Spectrum1D?

Contributor:

there has to be a more efficient way to vectorize this bit, but i'd have to give it a deeper think on how to do it...

@ojustino (Contributor, Author):

@kecnry you're right. I was switching between approaches of appending to an empty list and filling in a pre-made array of size pixels. I'm still thinking about how both fare with vectorization.

@tepickering (Contributor):

i think using np.fromfunction is the approach we ultimately want to use here to leverage numpy broadcasting as much as possible. it'll require some refactoring/rearranging of the logic here so that might be something to leave for a future optimization. np.fromiter is another option that's more efficient than using lists.
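For context, a tiny illustration of the two constructors mentioned here (toy computations, unrelated to the extraction math):

import numpy as np

# np.fromfunction passes whole index arrays to the callable, so the body
# is evaluated once with numpy broadcasting:
grid = np.fromfunction(lambda i, j: i * 10 + j, (3, 4))

# np.fromiter consumes any iterable without building a Python list first,
# but it can only produce a 1D array:
squares = np.fromiter((k * k for k in range(5)), dtype=float)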

Contributor:

that all said, it's also fine to have the initial implementation be straightforward rather than performance-optimal. it provides a baseline for future improvements to compare against.

@ojustino (Contributor, Author), Mar 9, 2022:

I got rid of the continue in the latest commit. Since the arrays are masked, any NaNs can just be carried on to the final, extracted 1D spectrum. (Replied in the wrong thread; see above.)

@ojustino (Contributor, Author):

@tepickering, I didn't find a sensible way to build kernel_vals or norms with np.fromiter or np.fromfunction, so I still create them with a loop. Continuing to append to a list was a small percentage faster than filling arrays.

kernel_vals becomes a 2D array that needs the fit object's mean attribute in each column to be modified before any fit results are stored in it. If that pre-modification wasn't necessary, I could see how np.fromfunction would be useful, plus the array of index locations it passes to the inner function would eliminate the need for xd_pixels. np.fromiter can only return 1D arrays, so it seems like a no-go here.

Each entry in norms must be calculated immediately after the corresponding fit in kernel_vals. This means we'd need to get norms from the same call to either numpy function as kernel_vals, and I don't think either is set up to give two separate arrays as output.

An alternate approach I tried for norms was to turn fit_ext_kernel into an array of separate fit objects for each column. I could then call np.fromiter twice on an inner function that used getattr() -- one call for amplitude and the other for standard deviation. However, even creating the array of fit objects was much slower than using the original loop.
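To summarize why the loop survives, here is a condensed, self-contained sketch. The setup values are hypothetical stand-ins for objects in the real method, and the Gaussian-area formula for norms is my reading of the amplitude/standard-deviation description above:

import numpy as np
from astropy.modeling import models

# hypothetical stand-ins for objects in the real method
xd_pixels = np.arange(100)             # cross-dispersion pixel indices
trace_centers = np.full(500, 50.0)     # one trace position per column
fit_ext_kernel = models.Gaussian1D(amplitude=1, mean=50, stddev=2)

kernel_vals = []
norms = []
for col in range(trace_centers.size):
    # the model's mean must be mutated before each column is evaluated,
    # which is what blocks a single np.fromfunction call
    fit_ext_kernel.mean = trace_centers[col]
    kernel_vals.append(fit_ext_kernel(xd_pixels))
    # each norm comes from the same just-updated fit object
    # (area under a Gaussian = amplitude * stddev * sqrt(2*pi))
    norms.append(fit_ext_kernel.amplitude.value
                 * fit_ext_kernel.stddev.value * np.sqrt(2 * np.pi))

kernel_vals = np.vstack(kernel_vals)   # 2D, one row per column's profile
norms = np.array(norms)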

image : `~astropy.nddata.CCDData` or array-like, required
    The input 2D spectrum from which to extract a source.

variance : `~astropy.nddata.CCDData` or array-like, required
@tepickering (Contributor):

if we require inputs to be in CCDData format, then the image and variance data can be packaged into the same object (as well as masks). that's one of the main points of using CCDData and why it is a core component of astropy. so i strongly encourage its use here and elsewhere in specreduce.

@tepickering (Contributor), Mar 2, 2022:

also note that in DRAGONS, if no variance is available, a uniform variance is calculated based on pixel-to-pixel variations in the data. a similar fall-back could be implemented here. to wit:

# imports needed by this snippet
import numpy as np
from astropy.modeling import models
from astropy.stats import sigma_clipped_stats

if var is None:
    # sigma_clipped_stats returns (mean, median, stddev); stddev**2 is variance
    var_model = models.Const1D(sigma_clipped_stats(data, mask=mask)[2] ** 2)
    var = np.full_like(data[ix1:ix2], var_model.amplitude)

@ojustino (Contributor, Author):

Understood on the benefits of CCDData. An option could be to include variance and mask keyword arguments that default to None for users who prefer to work with separate numpy arrays for whatever reason. Solely handling CCDData objects could raise the barrier to entry, depending on how familiar users are with that class.

I am hesitant on calculating a variance for the user. My impression is that the calculation varies based on the instrument and what other data products you have available (weights, etc.). If that's the case, it could be hard to do that in a general manner.

Contributor:

i guess an easy-ish way to do it would be to add a check if image is not CCDData, and if that's the case, make CCDData instance from that plus mask and variance. and we'd need a way to input a unit if image is not a Quantity. still, that's extra code and logic for us to maintain that wouldn't be required if we just consistently use CCDData internally. users that have normal numpy arrays can always just pack them into CCDData at call-time. e.g.:

horne_extract(CCDData(image, uncertainty=VarianceUncertainty(variance), mask=mask, unit=unit), trace)

users of ground-based data that do initial processing using ccdproc will already have CCDData instances ready to go.

hear you on making variance assumptions. requiring it to be specified is a valid approach.

Comment:

If there is no variance data, I think setting it to unity (or any other constant) everywhere will work, without having to compute the variance from the science data.

Comment:

While it would be good to support CCDData, I think we also need to keep the option to support less structured inputs, such as bare numpy arrays. Not everyone in the community uses the CCDData format. @eteq What do you think?


# fit source profile, using Gaussian model as a template
# NOTE: could add argument for users to provide their own model
gauss_prof = models.Gaussian1D(amplitude=coadd.max(),
Contributor:

gaussian profiles are a very poor assumption for any kind of extended source. so i think it'd be worth coding in some generality to at least support a list of a few models that have similar-ish inputs. a model_pars input dict could be used to pass configuration for an input model.

@ojustino (Contributor, Author):

Are there any that jump out at you as "must-include"? I mentioned Moffat earlier. Sérsic and Lorentz seem reasonable to me. I'm not sure which are seen as standard in this field.

Contributor:

moffat for sure since it's widely applicable for ground-based data. lorentz and voigt would be appropriate as well. sérsic is tricky because it's one-sided. definitely want to stick with symmetric profiles for now...

Comment:

Optimal extraction doesn't work in principle for extended sources. You can't sensibly weight one part of an extended galaxy more than another. Optimal extraction is intended for point sources.

Contributor:

i was thinking more along the lines of spatially resolved, but compact, galaxies where the spectrum doesn't change significantly as a function of radius. that said, it's a pretty niche case and gaussian + moffat probably cover the vast majority of needs.

Comment:

For point sources, getting the kernel shape exactly right is not too important and will only have a small effect on the S/N of the extracted spectrum. For extended sources, a point source (or any other centrally peaked) kernel will down-weight the extended emission and up-weight the core.

HorneExtract can take an image as a CCDData object or a numpy array.
The variance argument now applies only in the latter case, along with
the new mask and unit arguments. HorneExtract now takes better advantage
of numpy broadcasting. OptimalExtract now recycles HorneExtract's
docstring.

ojustino commented Mar 9, 2022

The latest commit...

  • Adds new mask and unit arguments to HorneExtract.
  • Doesn't change HorneExtract's assumption of a Gaussian1D model of the source profile.
  • Takes better advantage of numpy broadcasting in HorneExtract's calculation of the final 1D spectrum. These edits cut about a third off the runtime of the example in compare_extractions.ipynb. There may be room for more optimization; I've left more details on my attempts in a comment above.
  • Tries to strike a balance between preferring CCDData image objects and still accepting images as numpy arrays. As Tim warned, handling numpy arrays requires supporting a web of scenarios, and I left some out for the sake of time (e.g., if image is a masked array, then how do we handle the mask argument?). I'm interested to see comments on my implementation.
    • To address @PatrickOgle's request for weights, one can now either assign a mask to the image's CCDData object or provide a numpy array with the mask argument. Both of these would be binary masks, so they're not completely analogous to a weight image.
  • Recycles HorneExtract's docstring in OptimalExtract to prevent future syncing issues.

The notebooks will need to be updated once we finalize HorneExtract's arguments.


eteq commented Mar 10, 2022

Just a quick comment here w.r.t. CCDData : yes, we absolutely should accept CCDData objects, that's a critical use case because, for example, that's what ccdproc natively uses.

That said, two complexities:

  1. If possible it would be better to use NDData as the underlying assumed interface. CCDData is a subclass of NDData, so that still allows CCDData, but I'm not clear that we need any of the specialization that CCDData adds. Regardless, though, this could be treated as duck-typing - that is, assume it's an NDData and only error if it's missing some attributes.
  2. There are probably some users who just want to give an array and not use any of the extra features in CCDData. I'd suggest we allow this by just trying to create a CCDData object out of the input if it's an ndarray subclass.

Combining those two, the working assumption would be "the data is an NDData or subclass" (or maybe CCDData if there's some feature CCDData has that NDData does not that we actually need), but "if you give an array, a utility function that we provide as part of specreduce will translate that into CCDData". A sketch of such a utility is below.
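A minimal sketch of that utility, with a hypothetical name and a dimensionless-unit fallback as assumptions:

import numpy as np
import astropy.units as u
from astropy.nddata import CCDData, NDData

def _to_ccddata(image, unit=None):
    if isinstance(image, NDData):  # CCDData is an NDData subclass
        return image
    if isinstance(image, np.ndarray):
        # CCDData requires a unit, so fall back to dimensionless
        return CCDData(image, unit=unit if unit is not None
                       else u.dimensionless_unscaled)
    raise TypeError('image must be NDData-like or a numpy array')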

@tepickering (Contributor):

i agree with @eteq that NDData may be a better, more general requirement. it provides the interface we need for specifying mask and variance arrays as well as units, so that we're not reinventing any wheels to handle them. the main difference at this stage is that CCDData requires unit to be specified while NDData does not. i don't think units are strictly required.

@kecnry (Member) left a comment:

Added a few minor comments. Otherwise I think the code itself is looking pretty clean and ready to go (although I'll leave the science review of the actual results to others).

[Resolved review threads on notebook_sandbox/horne_extract/requirements.txt and specreduce/extract.py]
@PatrickOgle left a comment:

The Horne extraction works and gives the expected results for numpy array inputs. Please consider the following changes:

  1. Make mask and unit optional inputs with a default 'zero' mask and unitless units. These inputs are optional for Boxcar and should be for Horne as well. However, it does make sense for variance to be required, since it is required by the Horne algorithm.

  2. Update the compare_extractions notebook to follow HorneExtract's required inputs, e.g.:

     from astropy import units as u
     mask = img - img
     hrn = HorneExtract()
     hrn_result1d_whole = hrn(img, trace, variance=variance, mask=mask, unit=u.Jy)

  3. Also give an example in compare_extractions.ipynb of using a CCDData object as input. I did not try this mode of input. (One hypothetical version of such a call is sketched below.)
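For reference, one guess at what that CCDData-based call could look like, reusing the names from item 2 above (untested; the exact uncertainty wrapping is an assumption):

from astropy import units as u
from astropy.nddata import CCDData, VarianceUncertainty

img_ccd = CCDData(img, uncertainty=VarianceUncertainty(variance),
                  mask=mask, unit=u.Jy)
hrn_result1d_whole = hrn(img_ccd, trace)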


# will also need to clone specreduce
# git clone git@github.com:astropy/specreduce.git OR git clone https://github.com/astropy/specreduce
# find and delete all occurrences of ".data" in kosmos/apextract.py
@tepickering (Contributor):

note that if NDData or CCDData is used as an input, as @eteq and i have requested, this wouldn't be required. the docstring in kosmos says "2d numpy array, or CCDData object", but the use of the .data attribute means it actually needs something NDData-like.

@ojustino (Contributor, Author), Mar 25, 2022:

I tried this, but it doesn't work because the KOSMOS trace also tries to call img.shape on the object, which isn't an NDData attribute. The best solution would be to update this notebook to use KosmosTrace once #85 is finished.

@ojustino (Contributor, Author)

I was asked to collect outstanding issues in a single comment. Please share opinions on them and guidance on which of these must be decided before a merge. For a summary of the most recent changes, look here.

  1. Arguments. The required Trace object and the optional specification of dispersion and cross-dispersion axes seem non-controversial. On the other hand, we have the required image (currently CCDData or array type) and the variance/mask/unit arguments, which are required if the image is a numpy array.
    • Erik proposed switching from CCDData to NDData, which I can do.
    • Patrick proposed using default values for the mask (image-shaped array of zeros) and unit (unitless) arguments instead of forcing the user to specify them when using an array-based image. I think this could be confusing from the user perspective – one can pass in a CCDData image with its own mask and units but also see potentially different default values for those arguments in the docstring. What do others think?
    • I had asked whether/how to handle uncertainties with the InverseVariance type for CCDData/NDData objects.
    • Is the current approach to validating HorneExtract's arguments acceptable?
  2. Optimization. Most of the actual extraction process now makes better use of broadcasting, but part of it is still looped for reasons explained in greater detail here. Is the current level of optimization in HorneExtract OK?
  3. Documentation. Does the changed approach to the OptimalExtract docstring seem sustainable enough now?

I wrote earlier that I will wait to update the notebooks until we've decided on the final set of arguments.

@tepickering (Contributor)

> I was asked to collect outstanding issues in a single comment. Please share opinions on them and guidance on which of these must be decided before a merge. For a summary of the most recent changes, look here.
>
> 1. Arguments. The required Trace object and the optional specification of dispersion and cross-dispersion axes seem non-controversial. On the other hand, we have the required image (currently CCDData or array type) and the variance/mask/unit arguments, which are required if the image is a numpy array.
>   • Erik proposed switching from CCDData to NDData, which I can do.
>   • Patrick proposed using default values for the mask (image-shaped array of zeros) and unit (unitless) arguments instead of forcing the user to specify them when using an array-based image. I think this could be confusing from the user perspective – one can pass in a CCDData image with its own mask and units but also see potentially different default values for those arguments in the docstring. What do others think?

part of the point of using NDData-like structures is to use their handling/validation of mask/variance inputs rather than rolling our own, possibly conflicting ones. as @eteq suggested, if a raw ndarray is input, then pack that into an NDData with uncertainty set to None and mask set to either None or zeros_like the input array.

>   • I had asked whether/how to handle uncertainties with the InverseVariance type for CCDData/NDData objects.

i think it'd be enough to add a check for that type and set variance = 1 / image.uncertainty.array in case of a match. unless i'm missing something?

>   • Is the current approach to validating HorneExtract's arguments acceptable?

i would only have disp_axis as an argument and not both that and crossdisp_axis. we're only dealing with 2D data here so the latter is defined by the former. disp_axis should really be attached to the Trace object. i'll set up an issue to that effect.

i would much prefer to remove the variance, mask, and unit arguments and have them be included with image as part of an NDData object if the user wishes to specify them.

> 2. Optimization. Most of the actual extraction process now makes better use of broadcasting, but part of it is still looped for reasons explained in greater detail here. Is the current level of optimization in HorneExtract OK?

i think it's good enough for now. significant gains would probably require significant work so best to see if we really need it first.

> 3. Documentation. Does the changed approach to the OptimalExtract docstring seem sustainable enough now?

i'd say it's fine for now.

> I wrote earlier that I will wait to update the notebooks until we've decided on the final set of arguments.

@PatrickOgle left a comment:

Tried HorneExtract on the fake dataset in compare_extractions. Tested numpy array, NDData, and CCDData inputs, and all three worked. Horne extraction gives lower noise in a wide aperture than BoxcarExtract does, as expected.

@ojustino (Contributor, Author)

@tepickering:

> part of the point of using NDData-like structures is to use their handling/validation of mask/variance inputs rather than rolling our own, possibly conflicting ones.

The alternative validation may be necessary because NDData and CCDData are not consistent in how they handle uncertainties. I wrote a comment on this in the class as well, but uncertainties given as bare arrays to CCDData are turned into StdDevUncertainty objects. The default uncertainty type for NDData if one is given as an array is UnknownUncertainty.

We can't control whether users remember to wrap the variance array with a VarianceUncertainty() while creating their *DData object, so I thought we would need to be able to proactively anticipate each scenario.
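A quick demonstration of that inconsistency:

import numpy as np
from astropy.nddata import CCDData, NDData

arr = np.ones((2, 2))
# CCDData assumes a bare uncertainty array is a standard deviation:
type(CCDData(arr, uncertainty=arr, unit='adu').uncertainty)  # StdDevUncertainty
# NDData makes no assumption about the same input:
type(NDData(arr, uncertainty=arr).uncertainty)               # UnknownUncertainty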

> as @eteq suggested, if a raw ndarray is input, then pack that into an NDData with uncertainty set to None and mask set to either None or zeros_like the input array.

A default mask of zeros seems to have broad support, so I'll implement that. I don't believe setting uncertainty to None if it's absent works -- we need one to do the extraction.

Internally turning the NDData-typed object into a masked array for the actual extraction seems easier because you don't need to take extra steps to make sure the mask is respected. For example, NDData([3,4,5], mask=[False,True,False]).data.sum() is 12 instead of 8 because the presence of a valid number at the masked index overrides the mask.
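That example, spelled out:

import numpy as np
from astropy.nddata import NDData

ndd = NDData([3, 4, 5], mask=[False, True, False])
ndd.data.sum()                                # 12: .data ignores the mask
np.ma.masked_array(ndd.data, ndd.mask).sum()  # 8: the mask is respected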

> i think it'd be enough to add a check for that type and set variance = 1 / image.uncertainty.array in case of a match. unless i'm missing something?

I'm not sure that's what it actually means and was hoping someone else would know. The docs page doesn't go into the math, but from the Wikipedia page it seems like it's not that simple.

> i would only have disp_axis as an argument and not both that and crossdisp_axis. we're only dealing with 2D data here so the latter is defined by the former. disp_axis should really be attached to the Trace object. i'll set up an issue to that effect.

Would you prefer I remove them as arguments like I did here with KosmosTrace? We could remember to remove the assumed values once Trace has those attributes.

> i would much prefer to remove the variance, mask, and unit arguments and have them be included with image as part of an NDData object if the user wishes to specify them.

I'm trying to thread a needle between your preference and other stakeholders' desires to work with arrays. No one has relented, so the compromise is to support both.

@tepickering (Contributor)

> @tepickering:
>
>> part of the point of using NDData-like structures is to use their handling/validation of mask/variance inputs rather than rolling our own, possibly conflicting ones.
>
> The alternative validation may be necessary because NDData and CCDData are not consistent in how they handle uncertainties. I wrote a comment on this in the class as well, but uncertainties given as bare arrays to CCDData are turned into StdDevUncertainty objects. The default uncertainty type for NDData if one is given as an array is UnknownUncertainty.
>
> We can't control whether users remember to wrap the variance array with a VarianceUncertainty() while creating their *DData object, so I thought we would need to be able to proactively anticipate each scenario.
>
>> as @eteq suggested, if a raw ndarray is input, then pack that into an NDData with uncertainty set to None and mask set to either None or zeros_like the input array.
>
> A default mask of zeros seems to have broad support, so I'll implement that. I don't believe setting uncertainty to None if it's absent works -- we need one to do the extraction.

i think a reasonable approach would be to raise an exception if it's None, since it is needed. we could be nice and assume UnknownUncertainty is actually variance and make the conversion with a warning (which i think is generated, anyway). we'd have to make that assumption with a raw array, anyway.

> Internally turning the NDData-typed object into a masked array for the actual extraction seems easier because you don't need to take extra steps to make sure the mask is respected. For example, NDData([3,4,5], mask=[False,True,False]).data.sum() is 12 instead of 8 because the presence of a valid number at the masked index overrides the mask.
>
>> i think it'd be enough to add a check for that type and set variance = 1 / image.uncertainty.array in case of a match. unless i'm missing something?
>
> I'm not sure that's what it actually means and was hoping someone else would know. The docs page doesn't go into the math, but from the Wikipedia page it seems like it's not that simple.

the docs do give a few simple examples of the math, but they definitely could be clearer. the examples in the docs for StdDevUncertainty, VarianceUncertainty, and InverseVariance use the same uncertainties expressed in each of the three ways. so you can see that variance is stddev**2 and inverse variance is 1/variance.

the 1/variance trick works when the uncertainties are uncorrelated. full support for correlated uncertainties is a deeper, stickier problem we don't want to get into now, i think.
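The same uncertainties expressed all three ways, following the pattern in those docs:

import numpy as np
from astropy.nddata import (StdDevUncertainty, VarianceUncertainty,
                            InverseVariance)

std = np.array([0.1, 0.2])
u_std = StdDevUncertainty(std)
u_var = VarianceUncertainty(std ** 2)    # variance = stddev**2
u_ivar = InverseVariance(1 / std ** 2)   # inverse variance = 1 / variance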

>> i would only have disp_axis as an argument and not both that and crossdisp_axis. we're only dealing with 2D data here so the latter is defined by the former. disp_axis should really be attached to the Trace object. i'll set up an issue to that effect.
>
> Would you prefer I remove them as arguments like I did here with KosmosTrace? We could remember to remove the assumed values once Trace has those attributes.

as i suggested in #86, we can keep them for now and clean them up later when the work is done to add those attributes to Trace classes.

>> i would much prefer to remove the variance, mask, and unit arguments and have them be included with image as part of an NDData object if the user wishes to specify them.
>
> I'm trying to thread a needle between your preference and other stakeholders' desires to work with arrays. No one has relented, so the compromise is to support both.

we had a discussion about this during the astropy coordination meeting this week and how this relates to the future of NDData in general. NDData itself is threading a needle between the desire to use low-level ndarrays while other members of the larger community are advocating for even higher-level, more sophisticated data structures. @eteq would be the person most cognizant of the various sides to this and the histories involved.

@tepickering (Contributor)

i just noticed that in the notebook for this PR, the VLT weights are actually inverse variance. they're then converted to variance in exactly the way i suggest converting variance back to inverse variance.

@tepickering (Contributor)

i tried running through notebook_sandbox/horne_extract/optimal_extract_VLT.ipynb and i get stuck on this error in KTrace:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/astropy/specreduce/notebook_sandbox/horne_extract/optimal_extract_VLT.ipynb Cell 27 in <cell line: 5>()
      2 sys.path.append(kosmos_path)
      3 from kosmos.apextract import trace as KTrace
----> 5 vlt_trace1 = KTrace(extraction_region)

File ~/astropy/kosmos/kosmos/apextract.py:159, in trace(img, nbins, guess, window, Saxis, Waxis, display)
    156 for i in range(0, len(xbins)-1):
    157     # fit gaussian within each window
    158     if Waxis == 1:
--> 159         zi = np.nansum(img.data[ilum2, xbins[i]:xbins[i+1]], axis=Waxis)
    160     if Waxis == 0:
    161         zi = np.nansum(img.data[xbins[i]:xbins[i+1], ilum2], axis=Waxis)

TypeError: memoryview: invalid slice key

everything ran fine up to that point and looks right.

@ojustino (Contributor, Author)

Did you follow the instructions in the requirements file for removing instances of .data in KOSMOS' apextract.py?

@tepickering (Contributor)

d'oh! fixed that and now it works. however, the KTrace() output needs to be wrapped into an ArrayTrace for the next cell to work.

@ojustino (Contributor, Author)

It's a raw cell, so it's not supposed to execute. It's just there for demonstration. The real vlt_trace1 is defined earlier as a FlatTrace object.

@tepickering (Contributor)

gotcha. the wording is confusing, though. it implies that if you change the cell to a code cell and run it, it should work (modulo fixing kosmos). as a test, i added ArrayTrace to the imports and then changed that line to:

vlt_trace1 = ArrayTrace(extraction_region, trace=KTrace(extraction_region))

and then everything after that was happy either way.

it's up to you if you want to tweak that. i'm content to merge as-is since the runnable cells all seem to work as intended.

@ojustino (Contributor, Author)

I'm also OK with leaving it for now and remembering to change it once we've merged #85. That will also allow us to remove the gymnastics of modifying the KOSMOS source code.

@tepickering (Contributor)

ok, i'm going to go merge #90 and then merge this. last call for any changes before i do so...

@ojustino (Contributor, Author)

I'm about to push a change for accepting InverseVariance uncertainties since it's a simple tweak.

@tepickering merged commit 38ff816 into astropy:main on Mar 29, 2022
Comment on lines +239 to +251
if any(arg is None for arg in (variance, mask, unit)):
    raise ValueError('if image is a numpy array, the variance, '
                     'mask, and unit arguments must be specified. '
                     'consider wrapping that information into one '
                     'object by instead passing an NDData image.')
if image.shape != variance.shape:
    raise ValueError('image and variance shapes must match')
if image.shape != mask.shape:
    raise ValueError('image and mask shapes must match')

# fill in non-required arguments if empty
if mask is None:
    mask = np.ma.masked_invalid(image)
@kecnry (Member), Apr 28, 2022:

@ojustino - this came up today when trying to use Horne extraction. The first if statement raises an error if mask is not passed, but later there is a check to fall back if mask is None. Can both mask and unit (defaulting to unitless, as the docstring says) be optional when passing image as an array instead of an NDData object? (If so, maybe this should be a follow-up issue since this PR has since been merged; I just wanted to attach it to the lines of code somewhere to have a breadcrumb.)

Edit: the if mask is None check will also need to happen before comparing image.shape to mask.shape.

@ojustino (Contributor, Author):

I believe we settled on mask and unit being optional for array-type images, so the if any() should be updated to reflect that. I can open a pull request for it if you'd like to copy over some of what you've written into a new issue.

I agree that the mask check should be moved earlier; a sketch of the fixed ordering is below.
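A sketch of that reordering, mirroring the excerpt above (not the merged code; the defaults are the ones proposed earlier in this thread):

import numpy as np
import astropy.units as u

if variance is None:
    raise ValueError('variance must be specified when image is a numpy array')
if mask is None:
    mask = np.zeros_like(image)       # optional: default to an all-zero mask
if unit is None:
    unit = u.dimensionless_unscaled   # optional: default to unitless
if image.shape != variance.shape:
    raise ValueError('image and variance shapes must match')
if image.shape != mask.shape:
    raise ValueError('image and mask shapes must match')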
