SD pipeline revamp #281

Merged

e-koch merged 21 commits into PhangsTeam:master from e-koch:sd_pipeline_revamp

Feb 9, 2026

Collaborator

e-koch commented Jan 6, 2026

Updating the pipeline to use the default ALMA SD pipeline with custom selection for baseline fitting.

Collaborator Author

e-koch commented Jan 14, 2026

Local tests working up to revamped SD imaging stage.

Passing hsd_baseline the line frequency ranges per target is working as expected.

e-koch requested a review from thomaswilliamsastro

January 14, 2026 13:21

Collaborator Author

e-koch commented Jan 14, 2026 (edited)

@thomaswilliamsastro this is ready for a first pass review.

I'm changing the behaviour from our usual loop to process all products per target and (under the hood) this changes the ordering of operations a bit in SingleDishHandler.

Some parts are kludgy but work. Most of the clean-up reflects metadata handling improvements in the ALMA pipeline or the CASA tasks (e.g., tsdimaging handles the units and solves for the beam).

Collaborator Author

e-koch commented Jan 14, 2026 (edited by EOakes)

To-dos:

Ensure new imaging loop over all products works as intended
Add optional image-plane baseline subtraction (seems to be needed for TP data with a fixed-position OFF where the ON and OFF have differing elevations; M33 and the LV LP data need this)
Consider how to handle combining TP from multiple projects. (My opinion is this is SD postprocessing and we combine SD cubes in the image plane).

e-koch mentioned this pull request

SingleDish pipeline can now handle multiple MS's #280

Merged

Collaborator Author

e-koch commented Jan 18, 2026 (edited)

On a full test of ngc7793_c from 2025.1.00576.L, the custom baseline fitting inputs with the ALMA pipeline tasks appear to be working as expected.

Here is SPW 19 with 13CO and C18O masked after baseline subtraction (incl. the default +/-200 km/s padding beyond the target range that is probably overkill):

and the pipeline default (no lines detected):

Collaborator Author

e-koch commented Jan 18, 2026

And the same for SPW 21 with 12CO:

W/ target freq range from the phangs pipeline and a linear fit:

Default ALMA pipeline:

e-koch added 12 commits

January 20, 2026 09:47


          Move existing SD pipeline routines to 'legacy' file

442d08d


          Add outline for wrapping the ALMA pipeline within the phangs framework

5792f79


          casaviewer import optional as its removed on Mac

7cf216d


          Add passing target/product info to pipeline wrapper; build freq range…

ba65eb9

… mask for baseline masking


          Major updates to SingleDishHandler and updated TP wrapper function

a1b240f


          Small clean-ups before end to end testing

a9dee95


          only loop through allowed SD products as set in the SingleDishHandler

62cdbb8


          Fix restfreq units; default to cleaning up after cube trim step

5a094c2


          Fix imaging names

98b283b


          Fix clean-up step

1db5515


          Checks for bl sub versions and warnings/errors for missing MSs

4e3317e


          Corrected copy names on export

b435919

e-koch force-pushed the sd_pipeline_revamp branch from a0d932f to b435919 Compare

January 20, 2026 14:50


          fix if else for legacy pipeline

9453e05

e-koch marked this pull request as ready for review

January 20, 2026 15:06

Collaborator Author

e-koch commented Jan 20, 2026

I have another full test run going locally after merging in #286 and #280. Otherwise this is ready for review.

There are some follow-up steps that can be addressed later after more discussion:

exporting calibrated spectra to take advantage of other SD baseline/gridding tools
implementation of image baseline subtraction as needed

low-sky approved these changes

Collaborator

low-sky left a comment

Looks great; I don't see any showstoppers, but added some low utility comments.

phangsPipeline/casaSingleDishALMAWrapper.py

+                      width=str(chan_dv_kms)+'km/s',
+                      start=str(start_vel)+'km/s',
+                      veltype ="radio",
+                      outframe='LSRK',

Collaborator

low-sky Jan 20, 2026

These will be nice to override in the future.
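One way the hardwired spectral-axis settings in the diff above could be made overridable is sketched below. The keyword names (`width`, `start`, `veltype`, `outframe`) and their default values are taken from the diff; the function name `build_velocity_axis_kwargs` and the `overrides` argument are hypothetical and not part of the PR.

```python
def build_velocity_axis_kwargs(chan_dv_kms, start_vel, overrides=None):
    """Return spectral-axis keyword arguments for the imaging call.

    Defaults mirror the values hardwired in the diff; `overrides` is a
    hypothetical dict that lets a caller replace any of them.
    """
    kwargs = {
        "width": f"{chan_dv_kms}km/s",
        "start": f"{start_vel}km/s",
        "veltype": "radio",
        "outframe": "LSRK",
    }
    if overrides:
        # Reject keys that are not spectral-axis settings, so typos fail loudly.
        unknown = set(overrides) - set(kwargs)
        if unknown:
            raise KeyError(f"Unsupported override(s): {sorted(unknown)}")
        kwargs.update(overrides)
    return kwargs


# Illustrative values only.
default_kwargs = build_velocity_axis_kwargs(2.5, -100.0)
custom_kwargs = build_velocity_axis_kwargs(2.5, -100.0, {"outframe": "BARY"})
```

The returned dict could then be unpacked into the imaging call alongside the other arguments, which keeps the current behaviour as the default.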

phangsPipeline/casaSingleDishALMAWrapper.py Outdated


		logger.info(f"Using these ASDM files: {EBsnames}")

		if len(EBsnames) == 0:

Collaborator

low-sky Jan 20, 2026

Shouldn't this come before the renaming step?

Collaborator Author

e-koch Jan 21, 2026

good point. moved it up

phangsPipeline/casaSingleDishALMAWrapper.py Outdated

+                          vel_line_mask = product_dict[this_product]['vel_line_mask']
+                          # Convert velocity range to frequency range
+                          freq_line_mask = (vel_line_mask * u.km / u.s).to(u.Hz, u.doppler_optical(freq_rest * u.MHz)).value

Collaborator

low-sky Jan 20, 2026

u.doppler_radio for consistency with the hardwired convention choice?

Collaborator Author

e-koch Jan 21, 2026

I figured this would be minor, but agreed we should stick to the same velocity convention throughout.
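For reference, the size of the difference between the two conventions can be checked with a few lines of plain Python. This is a standalone sketch using the textbook definitions, not the astropy equivalency used in the diff; the function names and the 0 to 400 km/s mask are illustrative assumptions, and 230538.0 MHz is the 12CO(2-1) rest frequency.

```python
C_KMS = 299792.458  # speed of light in km/s


def radio_velocity_to_frequency(vel_kms, rest_freq):
    """Radio convention: v = c * (f0 - f) / f0, so f = f0 * (1 - v / c)."""
    return rest_freq * (1.0 - vel_kms / C_KMS)


def optical_velocity_to_frequency(vel_kms, rest_freq):
    """Optical convention: v = c * (f0 - f) / f, so f = f0 / (1 + v / c)."""
    return rest_freq / (1.0 + vel_kms / C_KMS)


def velocity_mask_to_frequency_mask(vel_mask_kms, rest_freq,
                                    convert=radio_velocity_to_frequency):
    """Convert a (v_min, v_max) mask to an ordered (f_min, f_max) mask.

    Frequency decreases as velocity increases, so the edges are re-sorted.
    """
    edges = [convert(v, rest_freq) for v in vel_mask_kms]
    return (min(edges), max(edges))


rest_freq_mhz = 230538.0  # 12CO(2-1) rest frequency in MHz
radio_mask = velocity_mask_to_frequency_mask((0.0, 400.0), rest_freq_mhz)
optical_mask = velocity_mask_to_frequency_mask(
    (0.0, 400.0), rest_freq_mhz, convert=optical_velocity_to_frequency)
```

The two conventions differ at second order in v/c, which is small compared with the mask padding for nearby galaxies, but using the radio convention keeps the mask consistent with `veltype="radio"` in the imaging step.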

e-koch added 2 commits

January 20, 2026 20:43


          Addressing @low-sky's comments

75c83f0


          Fix restfreq arg name

df16974

Collaborator Author

e-koch commented Jan 21, 2026

Added most of the suggestions; thanks @low-sky!

Collaborator Author

e-koch commented Jan 21, 2026

@thomaswilliamsastro I have one more complete test run on-going; the storage is slow so it's taking longer than it should.

I would like it to finish to make sure I caught all the renaming of the SD products.

After that and your review, I'm ready to merge this.

Collaborator

thomaswilliamsastro commented Jan 21, 2026

@e-koch I should have some time to run this end of the week. Just let me know when your tests are done and I'll check it does what I'd expect on my end

Collaborator Author

e-koch commented Jan 21, 2026

@thomaswilliamsastro alright my test made it through. Ready for your testing!

Collaborator Author

e-koch commented Jan 22, 2026

Confirmed that there are minimal changes between the legacy TP pipeline output cube and this version.

This is the average 12CO(2-1) spectrum for ngc7793_3:

Blue is this version. Green is the legacy TP version.

Collaborator

thomaswilliamsastro commented Jan 23, 2026 (edited)

My testing is still ongoing, but here are the first couple of test cases:

NGC0300_1 using the monolithic CASA pipeline. Runs fine; the cube is essentially identical. There are more channels in the revamped pipeline's cube than in the old one.

[Screenshot 2026-01-23 at 08 37 10]

NGC5236_1 using pip-installed CASA pipeline. The pipeline runs but the cube has been trimmed weirdly; half of it seems to have been cut off. For the part that is not cut off, the spectral profile is pretty much identical. UPDATE: With the new commits, this now also runs fine.

[Screenshot 2026-01-25 at 15 59 14]

NGC7793_2 using pip-installed CASA pipeline. Works as expected.
NGC7793_1 using pip-installed CASA pipeline. Still issues here, but that might be me not reflecting changes.

Collaborator Author

e-koch commented Jan 23, 2026

I'll check on the larger spectral range. I'm probably passing the padded range for the baseline fitting
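The suspected mix-up is easier to see with the two ranges computed side by side. This is a minimal sketch under stated assumptions: the function and argument names are hypothetical, the 200 km/s padding is the default mentioned earlier in this thread, and the systemic velocity and width are illustrative values rather than those of any tested target.

```python
def line_velocity_ranges(vsys_kms, vwidth_kms, pad_kms=200.0):
    """Return (imaging_range, baseline_mask_range) in km/s.

    The imaging range is the unpadded target range; the baseline mask range
    is the same range extended by `pad_kms` on each side, so line wings are
    excluded from the baseline fit.
    """
    half_width = 0.5 * vwidth_kms
    imaging_range = (vsys_kms - half_width, vsys_kms + half_width)
    baseline_mask_range = (imaging_range[0] - pad_kms,
                           imaging_range[1] + pad_kms)
    return imaging_range, baseline_mask_range


# Illustrative values only.
imaging_range, baseline_mask_range = line_velocity_ranges(230.0, 300.0)
```

Passing `baseline_mask_range` to the imaging step instead of `imaging_range` would widen the cube by twice the padding, which would show up as extra channels in the output.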

e-koch added 3 commits

January 23, 2026 14:51


          Fix velocity range for baseline masking vs imaging

be83966


          Keep consistent "weight" naming for output weight cube

a37dacd


          Pass phase center from product_dict

816c2d4

Collaborator

thomaswilliamsastro commented Jan 25, 2026

@e-koch made some updates above. Only one problem left on my end!


          Restore field selection from PhangsTeam#277

Collaborator Author

e-koch commented Jan 28, 2026

@thomaswilliamsastro - alright I think I've fixed things re: #277 . I have a test run going for ngc7793_1 and _2.

The pipeline is now going to re-run the whole calibration separately for each part, despite being observed together. I'm 99% sure the legacy pipeline was doing the same thing.

It's not the most efficient approach but this is likely to remain a corner-case for most nearby galaxy ALMA obs having multiple small mosaics in a single EB. If that changes, we can figure out how to reconcile this via the fname_dict checking for identical paths in the ms_file_key.txt
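A rough sketch of the identical-path check mentioned above is given below. It assumes the relevant entries have already been reduced to a mapping from part name to MS path; that mapping, the function name, and the example paths are hypothetical and do not reflect the actual layout of `fname_dict` or ms_file_key.txt.

```python
import os
from collections import defaultdict


def group_parts_by_path(part_to_path):
    """Group target parts that point at the same MS path.

    Returns a dict mapping each normalized path shared by more than one
    part to the sorted list of parts using it.
    """
    groups = defaultdict(list)
    for part, path in part_to_path.items():
        # Normalize so that 'a/b/' and 'a//b' count as the same path.
        groups[os.path.normpath(path)].append(part)
    return {path: sorted(parts)
            for path, parts in groups.items() if len(parts) > 1}


# Hypothetical example: two parts observed in the same EB.
shared = group_parts_by_path({
    "ngc7793_1": "data/eb_a/",
    "ngc7793_2": "data/eb_a",
    "ngc0300_1": "data/eb_b",
})
```

Parts that share a path could then be calibrated once and only split at the imaging stage.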

Collaborator

thomaswilliamsastro commented Jan 28, 2026

The pipeline is now going to re-run the whole calibration separately for each part, despite being observed together. I'm 99% sure the legacy pipeline was doing the same thing.

Exactly, I did the dumb thing. Running this case now

This was linked to issues Jan 28, 2026

Handling multiple lines with the single dish pipeline #243

Closed

SD calibration doesn't import QA2 flags #267

Closed

e-koch added 2 commits

January 28, 2026 13:50


          Assume single source name to pass

0a12446


          Force concat to keep original source/field names

5e2b099

Collaborator Author

e-koch commented Jan 29, 2026

A few more notes from corner cases:

2025.1.00576.L specified fields with offsets from a common target location. The TP sources reflect this central field, not the offset. For TP MSs with multiple sources, the concat step requires respectname=True because the TP fields have the same coordinates in the source/field table.
Right now, we concat EBs in TOPO, letting tsdimaging handle the coordinate transforms. This path has no issues. For many EBs, concat (correctly) does not combine SPWs, leading to many SPWs in the concat MS. Deep concat MSs could hit a limit on the number of SPWs. I doubt we'll hit this, but I'm noting it in case someone does in the future.

Collaborator Author

e-koch commented Jan 29, 2026

The line wing padding for the baseline fitting is currently set to 200 km/s beyond the source velocity range. This can likely be smaller and may hit some issues where the masked region covers a whole SPW.

We haven't hit an issue with this yet so I suggest not changing it in this PR for consistency with the legacy TP pipeline
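A simple guard against the whole-SPW case could look like the sketch below. It is an assumption-laden illustration: the function names and the 20% threshold are hypothetical, and the ranges can be in any common unit (velocity or frequency) as long as the SPW and the mask use the same one.

```python
def masked_fraction(spw_range, mask_range):
    """Fraction of the SPW covered by the line mask (0.0 to 1.0)."""
    spw_lo, spw_hi = sorted(spw_range)
    mask_lo, mask_hi = sorted(mask_range)
    overlap = max(0.0, min(spw_hi, mask_hi) - max(spw_lo, mask_lo))
    return overlap / (spw_hi - spw_lo)


def has_line_free_channels(spw_range, mask_range, min_free_fraction=0.2):
    """True if enough of the SPW is left unmasked to fit a baseline.

    `min_free_fraction` is an arbitrary illustrative threshold.
    """
    return (1.0 - masked_fraction(spw_range, mask_range)) >= min_free_fraction


# Illustrative values in km/s: the padded mask swallows a narrow SPW.
narrow_spw_ok = has_line_free_channels((-100.0, 500.0), (-120.0, 580.0))
wide_spw_ok = has_line_free_channels((-1000.0, 1500.0), (-120.0, 580.0))
```

A check like this could emit a warning, or fall back to a smaller padding, before the mask is handed to the baseline fit.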

Collaborator

low-sky commented Feb 4, 2026

First, I'm not 100% on the data model being used here so this might be way off.

Having faffed around with this for a while, I'm more convinced that we want to separate the SD calibration step from the gridding step. The pipeline is capable of doing all the calibration for all the SPWs at once, which can put a per-EB calibrated stack of spectra into a directory tree. This can serve as the analogue of the interferometric MS, and we can then pick a stack of MSs with products to do the gridding into a single cube. Then, I think we will want to be throwing all the spectra bundles into single gridding operations for the whole cube on a per-line basis. This also makes it simpler to incorporate archival SD data, provided it can stumble through the pipeline. It does blow up the bookkeeping a bit (potentially arguing for another keyfile analogous to ms_file_key) but gives good flexibility and preserves our data philosophy.

Collaborator Author

e-koch commented Feb 5, 2026

This is a good point and we should do this.

@low-sky do you have an opinion on merging the current PR now as "updated and working with current data model" or including the data model changes together here? I think either could work as it's already a significant change but isn't breaking the API.

Collaborator

low-sky commented Feb 7, 2026 (edited)

I think probably pull the trigger on this merge and then do a refactor. This is clearly an important increment. I'm not sure that the best practice is stable enough across multiple MS imaging that next steps will be fast.

e-koch merged commit 720c128 into PhangsTeam:master

e-koch deleted the sd_pipeline_revamp branch

February 9, 2026 13:55
