# On-The-Fly (OTF) Data Reduction

This notebook shows how to use `dysh` to calibrate and grid an *On-The-Fly* (OTF) observation. See Mangum et al. (2007), https://ui.adsabs.harvard.edu/abs/2007A%26A...474..679M, for background on this method.

An OTF observation is more complex and is not reduced with one simple command in dysh. It usually takes two steps:

1. The first step is calibration, where the calibrated spectra are written to an SDFITS file. This can be done in dysh, usually in a loop, one scan at a time.

2. The second step takes these spectra and writes them to a FITS cube. Several third-party tools are available for this; there is nothing in dysh yet to cover it.

In [None]:
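from pathlib import Path

import astropy.units as u
import matplotlib.pyplot as plt   # used below for savefig and image display
import numpy as np                # used below for quick statistics
from dysh.fits.gbtfitsload import GBTFITSLoad
from dysh.spectra import ScanBlock
from dysh.util.download import from_url
from dysh.util.files import dysh_data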

## Data Retrieval

Download the example SDFITS data if necessary, or rely on `$DYSH_DATA` to locate a local copy.

In [None]:
filename = dysh_data(example='mapping-L/data/TGBT17A_506_11.raw.vegas')    
print(filename)

## Data Loading

Next, we use `GBTFITSLoad` to load the data, and then its `summary` method to inspect its contents. This takes a few seconds.

In [None]:
sdfits = GBTFITSLoad(filename)

In [None]:
sdfits.summary()

In this particular observation the telescope slewed over the galaxy NGC6946 in scans 14-26, followed by an Off position in scan 27. Each **SCAN**, in this case, corresponds to one row of the map, with 61 integrations taken as the telescope slews slowly across the sky.
Instead of the classical OnOff reduction with `getps`, we can use `getsigref` with scan 27 as the common Off for all of these scans.
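
The idea of calibrating every ON integration against one common OFF can be pictured with the standard position-switching relation, T_A = T_sys (sig - ref) / ref. The sketch below applies it to synthetic NumPy arrays. The helper name `calibrate_sigref`, the array sizes and the `tsys` value are illustrative assumptions; this is not dysh's internal implementation.

```python
import numpy as np

def calibrate_sigref(sig, ref, tsys):
    """Return tsys * (sig - ref) / ref for every integration in sig.

    sig  : (nint, nchan) raw ON spectra
    ref  : (nchan,) reference (OFF) spectrum, e.g. a time-averaged OFF scan
    tsys : system temperature in K (a scalar here, for simplicity)
    """
    sig = np.asarray(sig, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return tsys * (sig - ref) / ref

# Synthetic example: 61 integrations and 8 channels (hypothetical numbers).
nint, nchan = 61, 8
ref = np.full(nchan, 100.0)        # flat OFF spectrum in raw counts
sig = np.tile(ref, (nint, 1))      # ON spectra start out identical to the OFF
sig[:, 3] += 5.0                   # add a 5% "line" in channel 3
ta_demo = calibrate_sigref(sig, ref, tsys=20.0)
print(ta_demo.shape)    # (61, 8)
print(ta_demo[0, 3])    # 1.0, i.e. 20 K * 5 / 100
```

Every row of `sig` shares the same `ref`, which mirrors using a single OFF scan for all ON scans.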


## Data Trimming

Below you will see a loop over scans that takes about two minutes. Trimming the input SDFITS file to a smaller file with just the scans we need can bring this loop down to about 10 seconds. Hopefully the processing time on the full data can be brought down in the near future as well. For now, suffer a bit, or do the following:
```
     sdfits.write(....)

```

@todo finish this example from the spyder code

## Data Reduction



In [None]:
ifnum=0        # needs to be 0, where the 21cm signal is (there are 5 IFs in the full data)
plnum=[0,1]    # pick 0 and 1, they will be averaged, or just pick one to speed up the code
fdnum=0        # only one feed in this dataset
nint = 61      # number of integrations per scan. we will use all of them, and get the OFF from another scan
ref = 27                      # the reference ("OFF") scan
scans = list(range(14,27))    # the source ("ON") scans

In [None]:
# test : each pol takes about 1 min
plnum=[0]

In [None]:
# test cell for 3 plots  = ISSUE 616
sb = ScanBlock()

for s in [20]:
    for p in plnum:
        print(f"Working on scan {s} pol {p}")
        sb1 =  sdfits.getsigref(scan=s, ref=ref, fdnum=fdnum, ifnum=ifnum, plnum=p)[0]
        #  @todo baseline subtraction
        sb.append(sb1)
print(f"Accumulated {len(sb)} scanblocks, each should contain 61 integrations")   
sb.timeaverage().plot(xaxis_unit='chan', title='one')


ta = sb.timeaverage()
ta.plot(xaxis_unit='chan', title='two')

# if you split the cell here, the "two" plot just shown will be incorrect and be baseline subtracted already

ta.baseline(1, [(1500,2500)], remove=True, model='polynomial')
sb.subtract_baseline(ta.baseline_model)
ta2 = sb.timeaverage()
ta2.plot(xaxis_unit='chan', title='three')

In [None]:
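plnum=[0,1]    # full reduction: both polarizations
plnum=[0]      # quick test: one polarization only (comment this line out for the full reduction)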

In [None]:
%%time 

sb = ScanBlock()

for s in scans:
    for p in plnum:
        print(f"Working on scan {s} pol {p}")
        sb1 =  sdfits.getsigref(scan=s, ref=ref, fdnum=fdnum, ifnum=ifnum, plnum=p)
        if len(sb1) != 1:
            print(f"Warning: expected 1 scan for scan {s} pol {p}, got {len(sb1)}")
        #  @todo baseline subtraction here?
        sb.append(sb1[0])
print(f"Accumulated {len(sb)} scanblocks, each should contain 61 integrations")   

This calibration step takes about 2.5 minutes when both polarizations are used, since 1586 (13\*61\*2) spectra are calibrated. This is clearly a candidate for a performance review.
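
As a quick check of the numbers quoted above, the spectrum count and the average cost per spectrum can be worked out directly. The 150 s figure is simply the 2.5 minutes from the text; actual timings will differ between machines.

```python
n_scans = len(range(14, 27))   # ON scans 14..26
n_int = 61                     # integrations per scan
n_pol = 2                      # both polarizations
n_spectra = n_scans * n_int * n_pol
print(n_scans, n_spectra)      # 13 1586

elapsed_s = 2.5 * 60           # quoted run time in seconds
ms_per_spectrum = 1000 * elapsed_s / n_spectra
print(f"{ms_per_spectrum:.0f} ms per spectrum")   # 95 ms per spectrum
```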

## Baseline subtraction

These data have quite a strong non-zero offset, especially on the source itself, so baseline subtraction is needed. You can skip the next cells if you want to see how the spectra look without baseline subtraction.

@todo well, it doesn't seem to work yet
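
To make the operation concrete, here is a minimal NumPy sketch of a first-order baseline fit that ignores a channel range containing the line and then subtracts the fitted model. The synthetic spectrum, the 4096-channel length and the helper name `fit_baseline` are assumptions for illustration only; dysh's own `baseline` method may differ in its details.

```python
import numpy as np

def fit_baseline(spectrum, exclude, degree=1):
    """Fit a polynomial to the channels outside the `exclude` ranges.

    exclude : list of (first, last) channel pairs, inclusive, left out of the fit
    Returns the baseline model evaluated at every channel.
    """
    chan = np.arange(spectrum.size)
    mask = np.ones(spectrum.size, dtype=bool)
    for first, last in exclude:
        mask[first:last + 1] = False
    coeffs = np.polyfit(chan[mask], spectrum[mask], degree)
    return np.polyval(coeffs, chan)

# Synthetic spectrum: sloped offset plus a Gaussian line near channel 2000.
nchan = 4096
chan = np.arange(nchan)
offset = 0.5 + 1e-4 * chan
line = 2.0 * np.exp(-0.5 * ((chan - 2000) / 100.0) ** 2)
spectrum = offset + line

model = fit_baseline(spectrum, exclude=[(1500, 2500)], degree=1)
residual = spectrum - model
print(abs(residual[100]) < 1e-3)   # line-free channels end up near zero
print(residual[2000] > 1.9)        # the line itself is preserved
```

Excluding the channels with emission keeps the line from biasing the fit, which is why the choice of channel range matters.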

In [None]:
sb[7].calibrated(30).plot()

print(ta.baseline_model)
print(sb[7]._calibrated.mean())

In [None]:

sb[7].subtract_baseline(ta.baseline_model, force=True)

print(sb[7]._calibrated.mean())
# ta.baseline_model


In [None]:
sb[14].calibrated(30).plot()


In [None]:
ta = sb.timeaverage()
ta.plot(xaxis_unit='channel', ylim=(-1,1))
plt.savefig('otf1_baseline.png')

if False:
    # bit awkward to use, especially if km/s are used
    # but the edges are bad and should not be used (~150 channels on both sides)
    ta.baseline(1, include=[(500, 1500), (2500, 3550)], remove=True)
    sb.subtract_baseline(ta.baseline_model, tol=1000)  # force=True)
else:
    ta.baseline(1, [(1500,2500)], remove=True)
    print(np.mean( sb[14]._calibrated[0]))
    sb.subtract_baseline(ta.baseline_model)
    print(np.mean( sb[14]._calibrated[0]))

In [None]:
len(sb)

In [None]:
sb.write("otf1_calibrated.fits", overwrite=True)    #  300 ms

In [None]:
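# reload the calibrated spectra that were just written
sdf0 = GBTFITSLoad('otf1_calibrated.fits')
#sdf0.summary()

# there are up to 1586 spectra (both polarizations); pick one to inspect
n = 700
sdf0.getspec(n).plot()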


## Gridding

The most commonly used task for this is `gbtgridder`, which is not part of dysh.  
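
To show what gridding means in practice, here is a deliberately simple NumPy sketch that places irregularly sampled spectra onto a regular cube with a Gaussian-weighted average. The function name `grid_spectra`, the kernel choice and all numbers are assumptions for illustration; `gbtgridder` has its own kernels and handles the sky coordinates properly, so this is not a substitute for it.

```python
import numpy as np

def grid_spectra(x, y, spectra, nx, ny, cell, fwhm):
    """Grid spectra observed at offsets (x, y) onto an (nchan, ny, nx) cube.

    x, y    : (nspec,) offsets from the map center, same units as `cell`
    spectra : (nspec, nchan) calibrated spectra
    Returns (cube, weight); weight has shape (ny, nx).
    """
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    gx = (np.arange(nx) - (nx - 1) / 2.0) * cell
    gy = (np.arange(ny) - (ny - 1) / 2.0) * cell
    # squared distance from every grid cell to every spectrum: (ny, nx, nspec)
    dx2 = (gx[None, :, None] - x[None, None, :]) ** 2
    dy2 = (gy[:, None, None] - y[None, None, :]) ** 2
    w = np.exp(-0.5 * (dx2 + dy2) / sigma**2)
    weight = w.sum(axis=-1)
    cube = np.einsum("yxs,sc->cyx", w, spectra)
    with np.errstate(invalid="ignore", divide="ignore"):
        cube = np.where(weight > 0, cube / weight, np.nan)
    return cube, weight

# Synthetic raster: 13 rows of 61 points, 4 channels with constant values.
xx, yy = np.meshgrid(np.linspace(-30, 30, 61), np.linspace(-30, 30, 13))
x, y = xx.ravel(), yy.ravel()
spectra = np.ones((x.size, 4)) * np.array([1.0, 2.0, 3.0, 4.0])
cube_demo, weight_demo = grid_spectra(x, y, spectra, nx=32, ny=32, cell=2.0, fwhm=9.0)
print(cube_demo.shape, weight_demo.shape)   # (4, 32, 32) (32, 32)
```

The weight map records how much data contributed to each pixel, which is the role of the weight file that a gridder writes next to the cube.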

Here is a very short summary of how to get it. Note that it is important to use the correct release branch. This was the situation in spring 2025 and it may change, so be sure to contact the developers for recommendations; hopefully this summary will be updated as well.

```
      git clone -b release_3.0  https://github.com/GreenBankObservatory/gbtgridder
      pip install -e gbtgridder
```

After installation, either from the shell, or from the notebook, one can grid as follows:

```
      gbtgridder --size 32 32  --channels 500:3500 -o otf1 --clobber --auto otf1_calibrated.fits
```

This process will create `otf1_cube.fits` and `otf1_weight.fits`.

In [None]:
!gbtgridder --size 32 32  --channels 500:3500 -o otf1 --clobber --auto otf1_calibrated.fits

In [None]:
ls -l otf1*

## Viewing the gridded cube

The cube should be 32 x 32 x 3001 x 1 

@todo Should check if the gridder can handle NAXIS4=2, but dysh may not handle this. Future feature?
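
The expected dimensions follow from the gridder options used above. This small sketch assumes the `500:3500` channel range is inclusive (which is what the 3001 channels quoted here imply) and shows how the FITS axis order maps onto the reversed NumPy array shape used when indexing the data.

```python
def n_channels(first, last):
    """Number of channels in an inclusive first:last range."""
    return last - first + 1

nx, ny = 32, 32                        # from --size 32 32
nchan = n_channels(500, 3500)          # from --channels 500:3500
naxis = (nx, ny, nchan, 1)             # FITS order: NAXIS1 .. NAXIS4
numpy_shape = tuple(reversed(naxis))   # axis order is reversed in NumPy arrays
print(naxis)         # (32, 32, 3001, 1)
print(numpy_shape)   # (1, 3001, 32, 32)
```

This is why the data are indexed as `data[0, channel]` for a channel map and `data[0, :, y, x]` for a spectrum.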

In [None]:
from astropy.io import fits


In [None]:
cube = 'otf1_cube.fits'

hdu = fits.open(cube)
header = hdu[0].header
print(header)



In [None]:
import matplotlib.pyplot as plt

data = hdu[0].data
channel_map = data[0,1500]    # avoid the name "slice", which shadows a Python builtin
#  no WCS, but north is up this way and east to the left
plt.imshow(channel_map, origin='lower')

In [None]:
spec = data[0,:,16,16]

plt.plot(spec)
plt.xlabel("Channel")
plt.ylabel(header["BUNIT"])
plt.title(header["OBJECT"]);
plt.savefig('otf-spectrum.png')

## SpectralCube

Another teachable moment?

With SpectralCube the examples become a lot more natural, so we should consider giving that example here.   @todo
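
Working in velocity units is one of the things that makes such examples more natural. As a reminder of what a frequency-to-velocity conversion involves, here is a small plain-Python sketch using the radio convention, v = c (f0 - f) / f0, with the HI rest frequency. The radio convention and the test frequency are assumptions made for illustration; the convention actually applied depends on the cube header and the options passed to the library.

```python
C_KMS = 299792.458             # speed of light in km/s
HI_REST_HZ = 1420405751.768    # rest frequency of the 21cm HI line in Hz

def radio_velocity(freq_hz, rest_hz=HI_REST_HZ):
    """Radio-convention velocity in km/s for an observed frequency in Hz."""
    return C_KMS * (rest_hz - freq_hz) / rest_hz

print(radio_velocity(HI_REST_HZ))      # 0.0 at the rest frequency
v_demo = radio_velocity(1.4202e9)      # hypothetical observed frequency
print(f"{v_demo:.1f} km/s")
```

Frequencies below the rest frequency map to positive (receding) velocities in this convention.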


In [None]:
try:
    from spectral_cube import SpectralCube

    my_cube = SpectralCube.read(cube).with_spectral_unit(u.km/u.s)

except ImportError:
    # only catch a missing package, so that other errors are not hidden
    print("alas, there is no SpectralCube in your python. You could try:   pip install spectral-cube")

## Viewing interactively

Here we would be leaving the notebook and using your shell environment. Use at your own risk; the commands have been commented out so as not to hang the automated notebook checkers.

A note of caution if you use **CARTA**: the cell needs to remain running while you view the cube. Stop it afterwards with "interrupt the kernel".

In [None]:
# or go the command line way:

#!ds9 otf1_cube.fits

In [None]:
# careful: carta keeps the cell running for as long as you view the image. After that: interrupt the kernel.

#!carta otf1_cube.fits