## Montage Mosaic

<b> NOTE: </b> This Montage Mosaic script downloads and extracts about 400 MB of image files. Make sure you have enough free disk space on your machine, VM, or server before you run it.

Parsl can only execute tasks in parallel when multiple CPU cores are available, so this notebook gains little on a single-core machine. The cell below reports how many cores your system has.

In [None]:
import multiprocessing
print('Cores available: {}'.format(multiprocessing.cpu_count()))

<b> COMMAND LINE INFORMATION: </b> Install Montage from the <a href='http://montage.ipac.caltech.edu/docs/download2.html'>Montage download page</a> or with Homebrew using the cell below. Also make sure that the 'wget' and 'tar' utilities are installed on your system.

In [None]:
'''
Run this cell if you have Homebrew installed on your system.
'''
!brew install montage wget

<b> PYTHON PACKAGE INFORMATION: </b> Run the following command to install the required Python packages.

In [None]:
'''
Run this script to install relevant python packages
'''
!pip install montage_wrapper pandas parsl

This Python script is inspired by the [Montage Wrapper Documentation](https://montage-wrapper.readthedocs.io/en/v0.9.5) and the Montage [tutorial](http://montage.ipac.caltech.edu/docs/first_mosaic_tutorial.html) for building a mosaic, which is a collection of images processed and combined into a single image.

In [None]:
import pandas as pd
import parsl
import os
import montage_wrapper as montage
from parsl.data_provider.files import File
cwd = os.getcwd()

from parsl.app.app import python_app, bash_app
from parsl.providers import LocalProvider
from parsl.channels import LocalChannel

from parsl.config import Config
from parsl.executors import HighThroughputExecutor

config = Config(
    executors=[
        HighThroughputExecutor(
            label="htex_local",
            cores_per_worker=1,
            provider=LocalProvider(
                channel=LocalChannel(),
                init_blocks=1,
                max_blocks=1,
            ),
        )
    ],
)

parsl.load(config)

### First Part

In [None]:
from IPython.utils import io

with io.capture_output() as captured: 
    '''
    Packaging all the non-parallel commands inside a captured output to prevent printing any outputs here.
    NOTE: This script may take 0.5-1 minute depending on internet speed to fully execute.
    '''
    !wget -c http://montage.ipac.caltech.edu/docs/Kimages.tar
    !tar xvf Kimages.tar
    montage.mImgtbl(os.path.join(cwd,'Kimages/'),  File(os.path.join(cwd,'Kimages.tbl')))
    montage.mMakeHdr(File(os.path.join(cwd,'Kimages.tbl')), File(os.path.join(cwd,'Ktemplate.hdr')))
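    # exist_ok=True lets this cell be re-run without failing on an existing directory
    os.makedirs(os.path.join(cwd,'Kprojdir/'), exist_ok=True)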

Implementation of mProjExec in Parsl

In [None]:
@python_app
def mProject_parsl(inputs=[], outputs=[]):
    '''
    Parsl app that runs mProject on one input image and writes the
    reprojected FITS file to the Kprojdir directory.
    inputs: [input image, template header]; outputs: [projected image].
    '''
    import montage_wrapper as montage
    return montage.mProject(inputs[0], outputs[0], inputs[1])

In [None]:
list_of_images = os.listdir(os.path.join(cwd,'Kimages/'))

output = []

for image in list_of_images:
    '''
    For each image, we capture the input image and output image.
    We also feed the template header for each image.
    The inputs and outputs are then fed into the Parsl function
    '''
    input_image = File(os.path.join(cwd, 'Kimages/' + image))
    output_image = File(os.path.join(cwd, 'Kprojdir/hdu0_' + image))
    template = File(os.path.join(cwd,'Ktemplate.hdr'))

    output.append(mProject_parsl(inputs=[input_image, template],
                                 outputs = [output_image]))
    
output = [i.result() for i in output]
    
'''
If the function wasn't run in parallel, it would have looked like this:

montage.mProjExec(File(os.path.join(cwd,'Kimages.tbl')),
                  File(os.path.join(cwd,'Ktemplate.hdr')),
                  os.path.join(cwd,'Kprojdir/'),
                  File(os.path.join(cwd,'stats.tbl')))
'''
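
The loop above follows a submit-then-gather pattern: launch one app call per image, keep the returned futures, then block on `.result()` once everything has been submitted. The sketch below reproduces that pattern with the standard library's `concurrent.futures` in place of Parsl, so it can be tried without Montage or Parsl installed. The function `fake_project` and the file names are hypothetical placeholders, not part of Montage or of this notebook's data.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def fake_project(in_path, out_path, template):
    """Hypothetical stand-in for montage.mProject: only returns the output path."""
    return out_path


images = ['a.fits', 'b.fits', 'c.fits']  # hypothetical file names

with ThreadPoolExecutor(max_workers=2) as pool:
    # Submit every task first so they can run concurrently.
    futures = [
        pool.submit(fake_project,
                    os.path.join('Kimages', name),
                    os.path.join('Kprojdir', 'hdu0_' + name),
                    'Ktemplate.hdr')
        for name in images
    ]
    # Gather afterwards; results come back in submission order.
    results = [f.result() for f in futures]

print(results)
```

Calling `.result()` inside the submission loop would instead force each task to finish before the next one starts, which removes the parallelism.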

Final non-parallel section of the first part of the Montage Mosaic

In [None]:
with io.capture_output() as captured2:
    '''
    Packaging all the non-parallel commands inside a captured output to prevent printing any outputs here.
    '''
    montage.mImgtbl(os.path.join(cwd,'Kprojdir/'), File(os.path.join(cwd,'images.tbl')))
    montage.mAdd( File(os.path.join(cwd,'images.tbl')), 
                  File(os.path.join(cwd,'Ktemplate.hdr')), 
                  File(os.path.join(cwd,'m17_uncorrected.fits')))
    !mViewer -ct 1 -gray m17_uncorrected.fits -1s max gaussian-log -out m17_uncorrected.png

'''
The markdown image below pulls the uncorrected image file:  m17_uncorrected.png
'''

![](./images/m17_uncorrected.png)

### Second Part

Initial non-parallel section

In [None]:
with io.capture_output() as captured:
    '''
    Packaging all the non-parallel commands inside a captured output to prevent printing any outputs here.
    '''
    montage.mOverlaps(File(os.path.join(cwd,'images.tbl')), File(os.path.join(cwd,'diffs.tbl')))
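    # exist_ok=True lets this cell be re-run without failing on an existing directory
    os.makedirs(os.path.join(cwd,'diffdir/'), exist_ok=True)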

Implementation of mDiffExec in Parsl

In [None]:
@python_app
def mDiff_parsl(inputs=[], outputs=[]):
    '''
    Parsl app that runs mDiff on one pair of overlapping images.
    Calling it once per pair replaces the mDiffExec function.
    inputs: [first image, second image, template header]; outputs: [difference image].
    '''

    import montage_wrapper as montage
    return montage.mDiff(inputs[0], inputs[1], outputs[0], inputs[2])

In [None]:
'''
This cell involves essential data processing that is required to 
feed individual images into the Parsl function for mDiff.

We extract the two overlapping images of each pair listed in diffs.tbl.
We also extract the file name of the difference image to write for each pair.
'''

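# The header row of diffs.tbl is delimited by '|' characters, so pandas assigns
# shifted column names and the columns read below are not named after their contents.
# sep=r'\s+' is equivalent to the deprecated delim_whitespace=True.
df = pd.read_csv('diffs.tbl', comment='#', sep=r'\s+').drop(0)
images1 = list(df['|.1'])    # first image of each overlapping pair
images2 = list(df['cntr2'])  # second image of each overlapping pair
outputs = list(df['|.2'])    # file name of the difference image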
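
The odd column names used above come from letting pandas split a `|`-delimited header on whitespace. As an alternative, the header can be split on `|` so that each value is reachable by its real column name. The sketch below is a minimal standard-library parser; the sample table is a hypothetical illustration of the layout (a name row, a type row, then whitespace-separated values), not actual mOverlaps output, and it assumes no value contains spaces.

```python
SAMPLE = """\
| cntr1 | cntr2 |     plus     |    minus     |          diff           |
|  int  |  int  |     char     |     char     |          char           |
      0       1   hdu0_a.fits    hdu0_b.fits    diff.000000.000001.fits
      0       2   hdu0_a.fits    hdu0_c.fits    diff.000000.000002.fits
"""


def read_montage_table(text):
    """Parse a table with a '|'-delimited header into a list of dicts of strings."""
    names, rows = None, []
    for line in text.splitlines():
        if not line.strip() or line.startswith('\\'):
            continue  # skip blank lines and '\keyword = value' lines
        if line.lstrip().startswith('|'):
            if names is None:  # the first '|' line holds the column names
                names = [n.strip() for n in line.strip().strip('|').split('|')]
            continue  # later '|' lines (such as data types) are ignored
        rows.append(dict(zip(names, line.split())))
    return rows


rows = read_montage_table(SAMPLE)
pairs = [(r['plus'], r['minus'], r['diff']) for r in rows]
print(pairs)
```

With this approach the three lists needed by the loop would be read from columns named `plus`, `minus` and `diff`, provided the real table uses those names.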

In [None]:
outputs_2 = []

for i in range(len(images1)):
    '''
    In the for loop, we extract individual input images along with output_file directory.
    The inputs along with the template header are fed into the mDiff_parsl function.
    '''
    
    image1 = File(os.path.join(cwd,'Kprojdir/' + images1[i]))
    image2 = File(os.path.join(cwd,'Kprojdir/' + images2[i]))
    output_file = File(os.path.join(cwd,'diffdir/' + outputs[i]))
    template = File(os.path.join(cwd,'Ktemplate.hdr'))
    
    outputs_2.append(mDiff_parsl(inputs=[image1, image2, template],
                                 outputs = [output_file]))
    
outputs_2 = [i.result() for i in outputs_2]

'''
If the function wasn't run in parallel, it would have looked like this:

montage.mDiffExec(File(os.path.join(cwd,'diffs.tbl')), 
                  File(os.path.join(cwd,'Ktemplate.hdr')), 
                  os.path.join(cwd,'diffdir/'),
                  proj_dir=os.path.join(cwd,'Kprojdir/'))
'''

Non-parallel components after mDiffExec

In [None]:
with io.capture_output() as captured:
    '''
    Packaging all the non-parallel commands inside a captured output to prevent printing any outputs here.
    '''
    montage.mFitExec(File(os.path.join(cwd,'diffs.tbl')), File(os.path.join(cwd,'fits.tbl')), 
                     os.path.join(cwd,'diffdir/'))
    montage.mBgModel(File(os.path.join(cwd,'images.tbl')), File(os.path.join(cwd,'fits.tbl')), 
                 File(os.path.join(cwd,'corrections.tbl')))
    
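    # exist_ok=True lets this cell be re-run without failing on an existing directory
    os.makedirs(os.path.join(cwd,'corrdir'), exist_ok=True)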

Implementation of mBgExec in Parsl

In [None]:
'''
This cell involves essential data processing that is required to 
feed individual images into the Parsl function for mBackground.

We extract the correction values for each image, along with the image id used to match them to that image.
We also read the image table to get the path of each image.
'''


# header=None stops pandas from consuming the first data row as column names,
# so no row has to be re-added afterwards under a hard-coded index.
corrections = pd.read_csv('corrections.tbl', comment='|', sep=r'\s+',
                          header=None, names=['id', 'a', 'b', 'c'])
corrections['id'] = corrections['id'].astype(int)

# sep=r'\s+' is equivalent to the deprecated delim_whitespace=True
images_table = pd.read_csv('images.tbl', comment='|', sep=r'\s+')
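
Matching each image to its correction values is a keyed lookup, so one option is to build a dictionary from image id to the three values once, instead of filtering a table for every image. The sketch below shows the idea with the standard library only. The sample text and its numbers are hypothetical, and the four-column layout (`id`, `a`, `b`, `c`) is assumed from the column names used in this notebook.

```python
SAMPLE_CORRECTIONS = """\
|  id  |      a       |      b       |      c       |
    0   1.0e-04  -2.0e-04   0.5
    1  -3.0e-04   4.0e-04  -1.5
"""


def corrections_by_id(text):
    """Return {image id: (a, b, c)} from a corrections-style table."""
    lookup = {}
    for line in text.splitlines():
        # Skip blank lines, '|' header lines and '\keyword' lines.
        if not line.strip() or line.lstrip().startswith(('|', '\\')):
            continue
        image_id, a, b, c = line.split()
        lookup[int(image_id)] = (float(a), float(b), float(c))
    return lookup


planes = corrections_by_id(SAMPLE_CORRECTIONS)
print(planes[1])
```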

In [None]:
@python_app
def mBackground_parsl(inputs=[], outputs=[]):
    '''
    Parsl app that runs mBackground on one projected image to apply its background correction.
    Calling it once per image replaces the mBgExec function.
    inputs: [input image, a, b, c]; outputs: [corrected image].
    '''
    import montage_wrapper as montage
    return montage.mBackground( inputs[0], 
                                outputs[0], 
                                inputs[1],
                                inputs[2],
                                inputs[3])

In [None]:
outputs_mb = []

for i in range(len(images_table)):
    '''
    In the for loop, we extract individual input images along with output_image directory.
    The inputs along with the correction values are fed into the mBackground_parsl function.
    '''
    
    input_image = list(images_table['fitshdr'])[i]
    file_name = (list(images_table['fitshdr'])[i]).replace(cwd + '/Kprojdir/', '')
    output_image = os.path.join(cwd + '/corrdir',file_name)
    correction_values = list(corrections.loc[ corrections['id'] == i ].values[0])
    outputs_mb.append(mBackground_parsl(inputs = [File(input_image), correction_values[1], correction_values[2], correction_values[3]],
                        outputs = [File(output_image)]))
    
outputs_mb = [i.result() for i in outputs_mb]

'''
If the function wasn't run in parallel, it would have looked like this:

montage.mBgExec( File(os.path.join(cwd,'images.tbl')), 
                 File(os.path.join(cwd,'corrections.tbl')), 
                 os.path.join(cwd,'corrdir'), 
                 proj_dir=os.path.join(cwd,'Kprojdir'))
'''
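
The loop above derives each output path by removing the projection directory prefix from the input path as a string. A sketch of an alternative, assuming every projected image sits directly inside the projection directory, is to take the file name with `os.path.basename` and join it to the correction directory. The helper name `corrected_path` and the example paths are hypothetical.

```python
import os


def corrected_path(projected_path, corr_dir):
    """Place the file name of a projected image inside the correction directory."""
    return os.path.join(corr_dir, os.path.basename(projected_path))


example = corrected_path(os.path.join('work', 'Kprojdir', 'hdu0_a.fits'),
                         os.path.join('work', 'corrdir'))
print(example)
```

This avoids depending on the exact prefix string, including the `/` separator, matching the stored path.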

Final non-parallel component of the Montage Mosaic

In [None]:
with io.capture_output() as captured:
    '''
    Packaging all the non-parallel commands inside a captured output to prevent printing any outputs here.
    '''
    montage.mAdd(File(os.path.join(cwd,'images.tbl')), 
             File(os.path.join(cwd,'Ktemplate.hdr')), 
             File(os.path.join(cwd,'m17.fits')))
    !mViewer -ct 1 -gray m17.fits -1s max gaussian-log -out m17.png

![](./images/m17.png)