# Source Detection with `edetect_chain` -- Part 1
<hr style="border: 2px solid #fadbac" />

- **Description:** Using `edetect_chain` to automatically detect sources.
- **Level:** Intermediate
- **Data:** XMM observation of the Lockman Hole (obsid=0123700101)
- **Requirements:** Must be run using pySAS version 2.2.2 or higher.
- **Credit:** Ryan Tanner (March 2025)
- **Support:** <a href="https://heasarc.gsfc.nasa.gov/docs/xmm/xmm_helpdesk.html">XMM Newton GOF Helpdesk</a>
- **Last verified to run:** 17 October 2025, for SAS v22.1 and pySAS v2.2.2

<hr style="border: 2px solid #fadbac" />

## 1. Introduction
This tutorial is a variation on the introductory notebooks on preparing an observation for analysis, image creation, filtering, and source extraction ([Part 1](./analysis-xmm-ABC-guide-EPIC-image-filtering.ipynb "EPIC Image Filtering") and [Part 2](./analysis-xmm-ABC-guide-EPIC-source-spectrum.ipynb "EPIC Source Extraction")). This notebook assumes you are at least minimally familiar with pySAS (see the [Long pySAS Introduction](./analysis-xmm-long-intro.ipynb "Long pySAS Intro")) and that you have previously worked through the two introductory notebooks.

In the two introductory notebooks a single source and a background region were selected by hand, with the coordinates determined beforehand. Now we will use the SAS task `edetect_chain` to automatically detect sources in an image. We will also demonstrate a few of the potential problems you might run into using `edetect_chain`.

In this notebook we will use the raw data files (ODFs), which have to be calibrated and processed first. This may take more than 20 minutes to run, so be prepared to wait.

#### SAS Tasks to be Used

- `edetect_chain` [(Documentation for edetect_chain)](https://xmm-tools.cosmos.esa.int/external/sas/current/doc/edetect_chain/index.html)

#### Useful Links

- [`pysas` Documentation](https://xmm-tools.cosmos.esa.int/external/sas/current/doc/pysas/index.html "pysas Documentation")
- [`pysas` on GitHub](https://github.com/XMMGOF/pysas)
- [Common SAS Threads](https://www.cosmos.esa.int/web/xmm-newton/sas-threads/ "SAS Threads")
- [Users' Guide to the XMM-Newton Science Analysis System (SAS)](https://xmm-tools.cosmos.esa.int/external/xmm_user_support/documentation/sas_usg/USG/SASUSG.html "Users' Guide")
- [The XMM-Newton ABC Guide](https://heasarc.gsfc.nasa.gov/docs/xmm/abc/ "ABC Guide")
- [XMM Newton GOF Helpdesk](https://heasarc.gsfc.nasa.gov/docs/xmm/xmm_helpdesk.html "Helpdesk") - Link to form to contact the GOF Helpdesk.

<div class="alert alert-block alert-warning">
    <b>Warning:</b> By default this notebook will place observation data files in your default <tt>data_dir</tt> directory. Make sure pySAS has been configured properly.
</div>

In [None]:
# pySAS imports
import pysas
from pysas import MyTask

# Useful imports
import os

# Matplotlib
import matplotlib.pyplot as plt

# Astropy import
from astropy.visualization import astropy_mpl_style
from astropy.io import fits
from astropy.wcs import WCS
import astropy.units as u
from astropy.coordinates import SkyCoord
from regions import CircleSkyRegion

# To handle certain warnings
import warnings
warnings.filterwarnings("ignore")

# Environment variables that need to be set to avoid a problem with edetect_chain on Fornax
os.environ['HEADASNOQUERY'] = ''
os.environ['HEADASPROMPT']  = '/dev/null'

In [None]:
obsid = '0123700101'

my_obs = pysas.ObsID(obsid)

my_obs.basic_setup(overwrite=False,rerun=False,
                   run_epproc=False,run_rgsproc=False,
                   emproc_args={'options':'-V 2'})
# my_obs.download_PPS_data(repo='heasarc',overwrite=False)
os.chdir(my_obs.work_dir)

In [None]:
# File names for this notebook. The User can change these file names.
unfiltered_event_list = my_obs.files['M1evt_list'][0]
first_filter_event_list = 'first_filter_event_list.fits'
light_curve_file ='mos1_ltcrv.fits'
gti_rate_file = 'gti_rate.fits'

filtered_event_list = 'filtered_event_list.fits'
filtered_image_file = 'filtered_image.fits'

attitude_file = 'attitude.fits'

large_filtered_image = 'large_filtered_image.fits'

eml_list_file = 'emllist.fits'

## 2. Filter the Observation

The following filtering follows exactly the filtering done in the ABC Guide Chapter 7, [Part 1](./analysis-xmm-ABC-guide-EPIC-image-filtering.ipynb "EPIC Image Filtering") and [Part 2](./analysis-xmm-ABC-guide-EPIC-source-spectrum.ipynb "EPIC Source Extraction").

In [None]:
# "Standard" Filter
inargs = ['table={0}'.format(unfiltered_event_list), 
          'withfilteredset=yes', 
          "expression='(PATTERN <= 12)&&(PI in [200:10000])&&#XMMEA_EM'", 
          'filteredset={0}'.format(first_filter_event_list), 
          'filtertype=expression', 
          'keepfilteroutput=yes', 
          'updateexposure=yes', 
          'filterexposure=yes']

MyTask('evselect', inargs).run()

# Make Light Curve File
inargs = ['table={0}'.format(first_filter_event_list), 
          'withrateset=yes', 
          'rateset={0}'.format(light_curve_file), 
          'maketimecolumn=yes', 
          'timecolumn=TIME', 
          'timebinsize=100', 
          'makeratecolumn=yes']

MyTask('evselect', inargs).run()

# Make Secondary GTI File
inargs = ['table={0}'.format(light_curve_file), 
          'gtiset={0}'.format(gti_rate_file),
          'timecolumn=TIME', 
          "expression='(RATE <= 6)'"]

MyTask('tabgtigen', inargs).run()

# Filter Using Secondary GTI File
inargs = ['table={0}'.format(first_filter_event_list),
          'withfilteredset=yes', 
          "expression='GTI({0},TIME)'".format(gti_rate_file), 
          'filteredset={0}'.format(filtered_event_list),
          'filtertype=expression', 
          'keepfilteroutput=yes',
          'updateexposure=yes', 
          'filterexposure=yes']

MyTask('evselect', inargs).run()

# Make attitude file
inargs = ['atthkset={0}'.format(attitude_file),
          'timestep=1']

MyTask('atthkgen', inargs).run()

In [None]:
my_obs.quick_eplot(filtered_event_list,image_file=filtered_image_file,ximagesize=1200,yimagesize=1200,vmin=0.1,vmax=1.0)

## 3. Make a Large Image for Analysis

We used the function `quick_eplot` to generate a FITS image for display purposes. In that function the size of the FITS image is fixed (`imagebinning=imageSize`; 1200x1200 pixels in the call above) and events in the event list are binned accordingly. While an image of that size is fine for quick looks at the data, the resolution is too low for good analysis. Below we define another function, `make_large_image`, to make a FITS image, but here we set the bin size instead (`imagebinning=binSize`, 20x20 in units of the `X` and `Y` columns; EPIC sky coordinates are in units of 0.05 arcseconds, so each image pixel is 1x1 arcsecond). Events will be binned accordingly. This creates a much higher resolution image suitable for data analysis.

For now we will make a high resolution image using the default settings, but later on we will see what happens when we change these defaults. We will also make a short function to run `edetect_chain` for convenience.

`edetect_chain` will create a list of sources and write the list to file in FITS format. We will define a function (`make_regions`) that will take the source list and generate regions for each of the sources and return the regions in a list. Then we will use a function `plot_regions` that will overlay the regions on the image we made previously with `quick_eplot`.

In [None]:
# High resolution image function
def make_large_image(event_list_file,image_file,xbinsize=20,ybinsize=20,pimin=300,pimax=2000):

    expression = '(FLAG == 0)&&(PI in [{pimin}:{pimax}])'.format(pimin=pimin,pimax=pimax)
    
    inargs = {'table'         : event_list_file, 
              'withimageset'  : 'yes',
              'imageset'      : image_file, 
              'xcolumn'       : 'X', 
              'ycolumn'       : 'Y', 
              'imagebinning'  : 'binSize', 
              'ximagebinsize' : xbinsize, 
              'yimagebinsize' : ybinsize,
              'filtertype'    : 'expression',
              'expression'    : expression}

    MyTask('evselect', inargs).run()

# Function to run edetect_chain
def run_edetect_chain(large_filtered_image,filtered_event_list,attitude_file,eml_list,
                      pimin=300,pimax=2000,
                      likemin=10,eml_ecut=15):
    inargs = {'imagesets'   : large_filtered_image, 
              'eventsets'   : filtered_event_list,
              'attitudeset' : attitude_file, 
              'pimin'       : pimin, 
              'pimax'       : pimax,
              'witheexpmap' : 'yes',
              'eml_ecut'    : eml_ecut,
              'likemin'     : likemin,
              'eml_list'    : eml_list}
    
    MyTask('edetect_chain', inargs).run()

# Function to make regions
def make_regions(source_list):
    obs_regions = []
    with fits.open(source_list) as hdu:
        data = hdu[1].data[hdu[1].data['ID_BAND'] == 1]
    for i in range(len(data)):
        RA     = data['RA'][i] * u.deg
        Dec    = data['DEC'][i] * u.deg
        radius = 30.0 * u.arcsec
        obs_regions.append({'ra':RA, 'dec':Dec, 'radius':radius})
    return obs_regions

def plot_regions(image_file, source_list):

    # Open file
    hdu = fits.open(image_file)[0]
    wcs = WCS(hdu.header)

    # Plot
    ax = plt.subplot(projection=wcs)
    plt.imshow(hdu.data, origin='lower', norm='log', vmin=0.1, vmax=1.0)
    ax.set_facecolor("black")

    # Add regions
    for source in source_list:
        # Define region
        region = CircleSkyRegion(SkyCoord(source['ra'], source['dec']), source['radius'])
        pixel_region = region.to_pixel(wcs)
        # Convert region to artist object
        artist = pixel_region.as_artist(color='lime')
        ax.add_artist(artist)

    plt.grid(color='blue', ls='solid')
    plt.xlabel('RA')
    plt.ylabel('Dec')
    plt.colorbar()
    plt.show()

<div class="alert alert-block alert-warning">
<b>Warning:</b> Running <tt>edetect_chain</tt> on a high resolution image will take several minutes.
</div>

When we run the cells below, `edetect_chain` will generate a source list and we will mark a region around each source. Notice that the algorithm in `edetect_chain` will occasionally report two separate sources, with a slight offset, for a single source. This shows up as two overlapping region circles around a single source. There are also a few spots where our eye can pick out what might be a source, but `edetect_chain` rejected them because of the default source detection cutoff values. `edetect_chain` has a large number of possible inputs that you can use to modify the detection assumptions made by the algorithm.

In [None]:
make_large_image(filtered_event_list, large_filtered_image)
run_edetect_chain(large_filtered_image,filtered_event_list,attitude_file,eml_list_file)

In [None]:
my_regions = make_regions(eml_list_file)
plot_regions(filtered_image_file,my_regions)

In [None]:
print('Number of regions: {}'.format(len(my_regions)))

<div class="alert alert-block alert-info">
<b>Note:</b> Below we will demonstrate changing a few input parameters from their defaults. This is not a comprehensive demonstration of the possible inputs for <tt>edetect_chain</tt>, just a few selected examples.</div>

Now let us change some basic values and see how that changes the results of `edetect_chain`. First we will change the resolution of the image used for source detection. We will change the bin size from 20x20 to 80x80 (in the 0.05 arcsecond units of the `X` and `Y` columns, so from 1 arcsecond to 4 arcsecond pixels).

In [None]:
xbinsize=80
ybinsize=80

eml_list_lores = 'emllist_lores.fits'
large_filtered_image_lores = 'large_filtered_image_lores.fits'

make_large_image(filtered_event_list, large_filtered_image_lores,xbinsize=xbinsize,ybinsize=ybinsize)
run_edetect_chain(large_filtered_image_lores,filtered_event_list,attitude_file,eml_list_lores)

In [None]:
my_regions = make_regions(eml_list_lores)
plot_regions(filtered_image_file,my_regions)

In [None]:
print('Number of regions: {}'.format(len(my_regions)))

Because the image has lower resolution, `edetect_chain` runs faster, but the number of sources detected has also gone down from 63 to 56 (though the number of duplicates has gone down as well).

By default we restricted source detection to the energy range 0.3-2.0 keV. Now let's see what happens if we expand the energy range to 0.3-8.0 keV, but return to the original bin size of 20x20.
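
The `pimin` and `pimax` arguments are PI channel values, which correspond to energies in eV (300 for 0.3 keV, 8000 for 8.0 keV). As a small illustration of that conversion, here is a sketch with a hypothetical helper, `kev_to_pi`, that is not part of pySAS:

```python
def kev_to_pi(energy_kev):
    """Convert an energy in keV to a PI channel, assuming 1 PI channel = 1 eV."""
    return int(round(energy_kev * 1000.0))


# Energy range used in the next cell
demo_range = (kev_to_pi(0.3), kev_to_pi(8.0))
print('PI in [{0}:{1}]'.format(*demo_range))
```

Note that the same range must be passed both to the image (through the `PI in [...]` expression) and to `edetect_chain`, which is why both functions take `pimin` and `pimax`.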

In [None]:
pimin=300
pimax=8000

eml_list_hipimax = 'emllist_hipimax.fits'
large_filtered_image_hipimax = 'large_filtered_image_hipimax.fits'

make_large_image(filtered_event_list, large_filtered_image_hipimax,pimin=pimin,pimax=pimax)
run_edetect_chain(large_filtered_image_hipimax,filtered_event_list,attitude_file,eml_list_hipimax,pimin=pimin,pimax=pimax)

In [None]:
my_regions = make_regions(eml_list_hipimax)
plot_regions(filtered_image_file,my_regions)

In [None]:
print('Number of regions: {}'.format(len(my_regions)))

We see that the number of detected sources has increased to 69. This shows that the algorithm for `edetect_chain` is sensitive to the energy range given for detecting sources. Generally a narrower energy range will be better for source detection.

Next let us try two other parameters. The first is `likemin` which is the detection likelihood threshold. The default is 10. Let's set it to something higher and see what we get.

In [None]:
likemin=20

eml_list_hilikemin = 'emllist_hilikemin.fits'
large_filtered_image_hilikemin = 'large_filtered_image_hilikemin.fits'

make_large_image(filtered_event_list, large_filtered_image_hilikemin)
run_edetect_chain(large_filtered_image_hilikemin,filtered_event_list,attitude_file,eml_list_hilikemin,likemin=likemin)

In [None]:
my_regions = make_regions(eml_list_hilikemin)
plot_regions(filtered_image_file,my_regions)

In [None]:
print('Number of regions: {}'.format(len(my_regions)))

With a higher detection likelihood threshold we only get 33 sources, instead of 63 using the default value.

Now let's try another parameter, `eml_ecut`, which is the event cut-out radius measured in pixels. Our `run_edetect_chain` function uses 15 by default; here we set it to 30.

In [None]:
eml_ecut=30

eml_list_hieml_ecut = 'emllist_hieml_ecut.fits'
large_filtered_image_hieml_ecut = 'large_filtered_image_hieml_ecut.fits'

make_large_image(filtered_event_list, large_filtered_image_hieml_ecut)
run_edetect_chain(large_filtered_image_hieml_ecut,filtered_event_list,attitude_file,eml_list_hieml_ecut,eml_ecut=eml_ecut)

In [None]:
my_regions = make_regions(eml_list_hieml_ecut)
plot_regions(filtered_image_file,my_regions)

In [None]:
print('Number of regions: {}'.format(len(my_regions)))

With this parameter change we get 64 source detections, compared with 63 using the default value.

## 4. Automatic Spectra Extraction

Below we provide an example function that can be used to automatically extract spectra from all sources, along with background regions. As inputs it takes a filtered event list and the source list generated by `edetect_chain`. The outputs will be a corresponding source event list, background event list, source spectrum, background spectrum, RMF, ARF, and binned spectrum file for each source. The files for each source will start with 'MMMsXXX' where MMM is the instrument and XXX is the source number.

<div class="alert alert-block alert-info">
<b>Note:</b> This will also generate spectra for duplicate sources. The source extraction region uses a fixed radius (10 arcseconds), and the background is taken from an annulus between 10 and 20 arcseconds around the same position. These and a few other default assumptions may or may not be appropriate depending on the individual sources.
</div>
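
The region selection relies on `CIRCLE` and `ANNULUS` expressions in sky coordinates, with the radii given in degrees. As an illustration of how those strings are put together, here is a sketch using a hypothetical helper, `region_expressions`, with made-up coordinates. It returns the bare expressions; any extra quoting is left to the caller.

```python
def region_expressions(ra_deg, dec_deg, r_in_arcsec=10.0, r_out_arcsec=20.0):
    """Build source (circle) and background (annulus) selection expressions."""
    # Radii are converted from arcseconds to degrees
    r_in_deg = r_in_arcsec / 3600.0
    r_out_deg = r_out_arcsec / 3600.0
    circle = 'CIRCLE({0},{1},{2})'.format(ra_deg, dec_deg, r_in_deg)
    annulus = 'ANNULUS({0},{1},{2},{3})'.format(ra_deg, dec_deg, r_in_deg, r_out_deg)
    source_expr = '((RA,DEC) in {0})'.format(circle)
    bkg_expr = '((RA,DEC) in {0})'.format(annulus)
    return source_expr, bkg_expr


# Made-up position, for illustration only
demo_source_expr, demo_bkg_expr = region_expressions(163.0, 57.5)
print(demo_source_expr)
print(demo_bkg_expr)
```

Keeping the radii as arguments makes it easier to adapt the regions to crowded fields or bright sources, where the fixed 10 and 20 arcsecond values may not be suitable.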

```python
def extract_spectra_from_source(filtered_event_list,eml_list_file,instrument):
    with fits.open(eml_list_file) as hdu:
        data = hdu[1].data[hdu[1].data['ID_BAND'] == 1]
    for i in range(len(data)):
        # File names
        source_event_list = instrument+'s{:03}_event_list.fits'.format(i)
        bkg_event_list    = instrument+'s{:03}_bkg_event_list.fits'.format(i)
        source_spectra    = instrument+'s{:03}_spectra.fits'.format(i)
        bkg_spectra       = instrument+'s{:03}_bkg_spectra.fits'.format(i)
        rmf_file          = instrument+'s{:03}_rmf.fits'.format(i)
        arf_file          = instrument+'s{:03}_arf.fits'.format(i)
        grouped_spectra   = instrument+'s{:03}_spectra_grouped.fits'.format(i)

        # Add source region and background annulus
        RA      = data['RA'][i] * u.deg
        Dec     = data['DEC'][i] * u.deg
        radiusi = 10.0 * u.arcsec
        radiuso = 20.0 * u.arcsec
        circle  = "CIRCLE({0},{1},{2})".format(RA.value,Dec.value,radiusi.to(u.deg).value)
        annulus = "ANNULUS({0},{1},{2},{3})".format(RA.value,Dec.value,radiusi.to(u.deg).value,radiuso.to(u.deg).value)

        # Extract spectrum from source
        inargs = {'table'           : filtered_event_list,
                  'energycolumn'    : 'PI',
                  'withfilteredset' : 'yes',
                  'filteredset'     : source_event_list,
                  'keepfilteroutput': 'yes',
                  'filtertype'      : 'expression',
                  'expression'      : "'((RA,DEC) in {0})'".format(circle),
                  'withspectrumset' : 'yes',
                  'spectrumset'     : source_spectra,
                  'spectralbinsize' : '5',
                  'withspecranges'  : 'yes',
                  'specchannelmin'  : '0',
                  'specchannelmax'  : '11999'}
        
        MyTask('evselect', inargs).run()

        # Extract spectrum from background
        inargs = {'table'           : filtered_event_list,
                  'energycolumn'    : 'PI',
                  'withfilteredset' : 'yes',
                  'filteredset'     : bkg_event_list,
                  'keepfilteroutput': 'yes',
                  'filtertype'      : 'expression',
                  'expression'      : "'((RA,DEC) in {0})'".format(annulus),
                  'withspectrumset' : 'yes',
                  'spectrumset'     : bkg_spectra,
                  'spectralbinsize' : '5',
                  'withspecranges'  : 'yes',
                  'specchannelmin'  : '0',
                  'specchannelmax'  : '11999'}
        
        MyTask('evselect', inargs).run()

        # Generate rmf for source
        inargs = {'rmfset'      : rmf_file,
                  'spectrumset' : source_spectra}
        
        MyTask('rmfgen', inargs).run()

        # Generate arf for source
        inargs = {'arfset'         : arf_file,
                  'spectrumset'    : source_spectra,
                  'withrmfset'     : 'yes',
                  'rmfset'         : rmf_file,
                  'withbadpixcorr' : 'yes',
                  'badpixlocation' : filtered_event_list,
                  'setbackscale'   : 'yes'}
        
        MyTask('arfgen', inargs).run()

        # Bin events in spectrum and link arf and rmf
        inargs = {'spectrumset' : source_spectra,
                  'groupedset'  : grouped_spectra,
                  'arfset'      : arf_file,
                  'rmfset'      : rmf_file,
                  'backgndset'  : bkg_spectra,
                  'mincounts'   : '30'}
        
        MyTask('specgroup', inargs).run()
```