# Acquiring and processing LoVoCCS-relevant X-ray data

Here we use the Python module DAXA (Democratising Archival X-ray Astronomy) to identify X-ray observations relevant to the LoVoCCS sample, download them, and process them into a usable state **(PROCESSING IS CURRENTLY ONLY PERFORMED ON XMM DATA, THE INITIAL FOCUS FOR LOVOCCS)**.

DAXA is capable of searching the archives of a number of X-ray missions, and support for more will be added in the future. In this instance we search the following:
* XMM-Newton pointed observations
* Chandra observations
* ROSAT All-Sky Survey observations
* ROSAT pointed observations 
* NuSTAR pointed observations
* eROSITA calibration and performance verification observations

**Though XMM is the current main focus for data analysis, we acquire data for the other missions, including pre-processed images and exposure maps where possible** - in the future we will process and use the data from the other missions.

## Import Statements

In [42]:
import pandas as pd
from astropy.units import Quantity
from astropy.cosmology import LambdaCDM
import numpy as np
import os

import daxa
daxa.NUM_CORES = 100
daxa.OUTPUT = "/mnt/gs21/scratch/turne540/lovoccs/X-LoVoCCS-Data/data/"
daxa.config.OUTPUT = "/mnt/gs21/scratch/turne540/lovoccs/X-LoVoCCS-Data/data/"
daxa.mission.base.OUTPUT = "/mnt/gs21/scratch/turne540/lovoccs/X-LoVoCCS-Data/data/"
from daxa.mission import XMMPointed, Chandra, ROSATPointed, ROSATAllSky, NuSTARPointed, eROSITACalPV
from daxa.archive import Archive
from daxa.process.simple import full_process_xmm
from xga.sourcetools.misc import rad_to_ang

## Setting up cosmology

We have chosen to use a concordance LambdaCDM model:

In [2]:
cosmo = LambdaCDM(70, 0.3, 0.7)

## Reading in the sample

We will be searching for data around the central positions of the LoVoCCS galaxy clusters, as defined by the MCXC catalogue. As such we need to read in our basic initial information about the clusters:

In [3]:
samp = pd.read_csv("../sample_files/lovoccs_southnorth.csv")
samp

Unnamed: 0,name,MCXC,LoVoCCSID,ra,dec,redshift,L500,M500,R500,alt_name,other_names,Notes
0,MCXCJ1558.3+2713,J1558.3+2713,0,239.585833,27.226944,0.0894,10.676087,8.1491,1.3803,RXCJ1558.3+2713,A2142,L
1,MCXCJ1510.9+0543,J1510.9+0543,1,227.729167,5.720000,0.0766,8.726709,7.2708,1.3344,A2029,A2029,
2,MCXCJ0258.9+1334,J0258.9+1334,2,44.739583,13.579444,0.0739,6.088643,5.8488,1.2421,RXCJ0258.9+1334,A401,L
3,MCXCJ1348.8+2635,J1348.8+2635,3,207.220833,26.595556,0.0622,5.478067,5.5280,1.2236,RXCJ1348.8+2635,A1795,
4,MCXCJ0041.8-0918,J0041.8-0918,4,10.458750,-9.301944,0.0555,5.100085,5.3163,1.2103,RXCJ0041.8-0918,A85,"L,losStr"
...,...,...,...,...,...,...,...,...,...,...,...,...
139,MCXCJ0448.2-2028,J0448.2-2028,139,72.050833,-20.469722,0.0720,1.004022,1.9513,0.8620,RXCJ0448.2-2028,A514,losStr
140,MCXCJ2323.8+1648,J2323.8+1648,140,350.972917,16.808889,0.0416,1.002026,1.9896,0.8760,A2589,A2589,
141,MCXCJ1416.8-1158,J1416.8-1158,141,214.214583,-11.976111,0.0982,1.001648,1.9133,0.8491,RXCJ1416.8-1158,,X
142,MCXCJ1459.0-0843,J1459.0-0843,142,224.764583,-8.725000,0.1043,1.001337,1.9047,0.8461,RXCJ1459.0-0843,,


## Searching for X-ray data

For many of these missions we wish to search for observations within some physically motivated radius, and in this case we decide to use a radius of 2$R_{500}$, where the $R_{500}$ in question is that measured by MCXC. As such we convert the Mpc value of 2$R_{500}$ to an angular search distance:

In [4]:
search_distances = rad_to_ang(Quantity(samp['R500'].values*2, 'Mpc'), samp['redshift'].values, cosmo)
search_distances

<Quantity [0.45944419, 0.51074833, 0.49124172, 0.56713079, 0.62374544,
           0.30203501, 0.34859365, 0.35509501, 0.55996885, 0.39401058,
           0.60273926, 0.82604849, 0.43938027, 0.45123064, 0.38462553,
           0.55425603, 0.29399626, 0.38571156, 0.28929987, 0.33253335,
           0.65027903, 0.35406128, 0.38892216, 0.33448454, 0.27214206,
           0.32022652, 0.51926564, 0.40373002, 0.29100237, 0.56451273,
           0.36536191, 0.42285862, 0.29170232, 0.28965969, 0.31988706,
           0.2644956 , 0.31859985, 0.30719297, 0.29512386, 0.40648695,
           0.34646189, 0.38516036, 0.3517492 , 0.25235852, 0.25448364,
           0.31956403, 0.93161725, 0.59551541, 0.24609498, 0.36897886,
           0.30375816, 0.85159468, 0.36367784, 0.34484616, 0.24546266,
           0.49762719, 0.25334387, 0.3187183 , 0.35403892, 0.31487174,
           0.26672089, 0.26545546, 0.24312291, 0.34081082, 0.59849044,
           0.29935698, 0.57501545, 0.28963545, 0.31338601, 0.24306012,
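
For reference, the conversion performed by `rad_to_ang` can be reproduced independently. The block below is a minimal sketch rather than XGA's implementation: it assumes the small-angle approximation and a flat universe with no radiation term, and the function names (`angular_diameter_distance_mpc`, `radius_to_degrees`) are our own. It integrates 1/E(z) with Simpson's rule to obtain the angular diameter distance, then converts 2$R_{500}$ for the first cluster in the sample ($R_{500}$ = 1.3803 Mpc, z = 0.0894).

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def angular_diameter_distance_mpc(z, h0=70.0, omega_m=0.3, omega_l=0.7, steps=1000):
    """Angular diameter distance in Mpc for a flat LambdaCDM model (no radiation)."""
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    # Composite Simpson's rule over [0, z]; 'steps' must be even
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    comoving = (C_KM_S / h0) * total * h / 3.0
    return comoving / (1.0 + z)


def radius_to_degrees(radius_mpc, z):
    """Angle in degrees subtended by a proper radius at redshift z (small-angle approximation)."""
    return math.degrees(radius_mpc / angular_diameter_distance_mpc(z))


# First cluster in the sample: 2 * R500 at z = 0.0894
theta = radius_to_degrees(2 * 1.3803, 0.0894)
print(f"{theta:.4f} deg")
```

The result should agree with the first entry of the array above (about 0.459 degrees) to within the accuracy of the approximations made here.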
      

Here we detail the meanings of the DAXA target category taxonomy codes, for reference when viewing the observation target category columns. Note that this taxonomy may evolve with newer versions of DAXA:

In [31]:
XMMPointed.show_allowed_target_types()

╒═══════════════╤═══════════════════════════════════════════════╕
│ Target Type   │ Description                                   │
╞═══════════════╪═══════════════════════════════════════════════╡
│ AGN           │ Active Galaxies and Quasars                   │
├───────────────┼───────────────────────────────────────────────┤
│ BLZ           │ Blazars                                       │
├───────────────┼───────────────────────────────────────────────┤
│ CV            │ Cataclysmic Variables                         │
├───────────────┼───────────────────────────────────────────────┤
│ CAL           │ Calibration Observation (possibly of objects) │
├───────────────┼───────────────────────────────────────────────┤
│ EGS           │ Extragalactic Surveys                         │
├───────────────┼───────────────────────────────────────────────┤
│ GCL           │ Galaxy Clusters                               │
├───────────────┼───────────────────────────────────────────────┤
│ GS      

### XMM-Newton

Observations of LoVoCCS clusters by XMM-Newton are the initial focus of the X-LoVoCCS project. XMM is well suited to the study of large, local galaxy clusters due to its medium-sized field of view and excellent sensitivity. We select all X-ray instruments on XMM: the three EPIC detectors (PN, MOS1, and MOS2), as well as the grating spectrometer instruments (RGS1 and RGS2).

We initially need to declare an XMMPointed instance, which fetches the latest database of XMM observations, then we can filter all those observations to select only the ones we're interested in.

In [5]:
xm = XMMPointed(['PN', 'MOS1', 'MOS2', 'RGS1', 'RGS2'])
xm.all_obs_info

  warn("Some instrument names were converted to alternative forms expected by this module, the instrument "
  self._fetch_obs_info()


Unnamed: 0,ra,dec,ObsID,start,science_usable,duration,proprietary_end_date,revolution,proprietary_usable,end
0,64.925415,55.999440,0000110101,2001-08-19 07:05:23,True,0 days 09:08:33,2002-09-29,310,True,2001-08-19 16:13:56
1,263.674950,-32.581670,0001730201,2001-03-09 12:44:21,True,0 days 04:44:43,2002-05-25,229,True,2001-03-09 17:29:04
2,263.674950,-32.581670,0001730301,2001-03-09 17:30:16,True,0 days 02:36:02,2002-05-25,229,True,2001-03-09 20:06:18
3,263.674950,-32.581670,0001730401,2001-03-09 09:41:25,True,0 days 03:00:59,2002-05-25,229,True,2001-03-09 12:42:24
4,99.349995,6.135278,0001730501,2002-09-17 18:35:28,True,0 days 06:05:39,2004-12-31,508,True,2002-09-18 00:41:07
...,...,...,...,...,...,...,...,...,...,...
17014,81.325250,-46.005917,0900411301,2023-08-25 18:06:35,True,0 days 07:30:00,NaT,4342,False,2023-08-26 01:36:35
17015,81.325250,-46.005917,0900413101,2023-08-24 11:51:12,True,0 days 04:05:00,NaT,4342,False,2023-08-24 15:56:12
17016,85.026250,-40.842222,0920010201,2023-08-25 15:19:41,True,0 days 02:13:20,NaT,4342,False,2023-08-25 17:33:01
17017,72.487500,-31.964361,0921860501,2023-08-25 04:42:22,True,0 days 10:03:20,NaT,4342,False,2023-08-25 14:45:42


This is where we filter the observations, using the search distances we decided on earlier. This searches for observations with a central coordinate within a circle that has a radius equal to the search distance for the particular cluster.

We then show a preview of the filtered observation table:

In [6]:
xm.filter_on_positions(samp[['ra', 'dec']].values, search_distances)
xm.filtered_obs_info

Unnamed: 0,ra,dec,ObsID,start,science_usable,duration,proprietary_end_date,revolution,proprietary_usable,end
59,247.156500,39.552610,0008030201,2002-07-04 15:42:00,True,0 days 06:22:56,2003-09-12 00:00:00,470,True,2002-07-04 22:04:56
60,247.156500,39.552610,0008030301,2002-07-06 15:35:25,True,0 days 06:21:17,2003-09-12 00:00:00,471,True,2002-07-06 21:56:42
61,247.156500,39.552610,0008030601,2002-08-15 12:52:43,True,0 days 06:03:56,2003-09-12 00:00:00,491,True,2002-08-15 18:56:39
88,194.366700,-17.346280,0010420201,2001-01-08 15:58:43,True,0 days 06:20:06,2002-10-08 00:00:00,199,True,2001-01-08 22:18:49
89,194.366700,-17.346280,0010420701,2001-01-08 12:48:31,True,0 days 03:01:48,2002-10-08 00:00:00,199,True,2001-01-08 15:50:19
...,...,...,...,...,...,...,...,...,...,...
16868,68.005125,-61.292611,0921881301,2023-06-25 15:57:09,True,0 days 03:45:40,NaT,4312,False,2023-06-25 19:42:49
16875,68.133417,-61.506250,0921880601,2023-06-27 19:34:45,True,0 days 22:13:20,NaT,4313,False,2023-06-28 17:48:05
16876,68.133417,-61.506250,0921881401,2023-06-27 17:15:45,True,0 days 02:19:00,NaT,4313,False,2023-06-27 19:34:45
16950,232.406792,-83.700244,0763940535,2016-03-20 08:28:58,True,0 days 00:36:31,2017-04-28 22:00:00,2981,True,2016-03-20 09:05:29
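
The positional filter amounts to a great-circle separation test between each observation's pointing coordinate and each cluster centre. The sketch below is not DAXA's implementation; it is a pure-Python illustration (the function names are hypothetical) using the haversine formula. It takes the A2029 centre and its search distance from the outputs above, and tests one illustrative pointing placed 0.3 degrees north of the centre alongside the pointing of the first observation in the full XMM table.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))


def within_search_radius(pointing, centre, radius_deg):
    """True if the pointing coordinate lies inside the search circle around the centre."""
    return angular_separation_deg(*pointing, *centre) <= radius_deg


a2029_centre = (227.729167, 5.720000)   # RA, Dec in degrees, from the sample table
a2029_radius = 0.51074833               # 2*R500 in degrees, second entry of search_distances

nearby = (227.729167, 6.020000)         # illustrative pointing, 0.3 degrees north of the centre
distant = (64.925415, 55.999440)        # pointing of XMM ObsID 0000110101

print(within_search_radius(nearby, a2029_centre, a2029_radius))   # True
print(within_search_radius(distant, a2029_centre, a2029_radius))  # False
```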


We note that some of the observations selected are still proprietary (as of the time of writing), so we will not yet be able to use them - they will be automatically filtered out by DAXA when it comes to the data-downloading stage, but will be useful in the future when this DAXA archive is updated:

In [7]:
xm.filtered_obs_info[xm.filtered_obs_info['proprietary_usable'] == False]

Unnamed: 0,ra,dec,ObsID,start,science_usable,duration,proprietary_end_date,revolution,proprietary_usable,end
16252,230.5625,7.988889,904610201,2022-08-20 16:04:30,True,0 days 13:08:20,2023-09-12,4157,False,2022-08-21 05:12:50
16253,230.5625,7.988889,904610901,2022-08-20 12:27:50,True,0 days 03:36:40,2023-09-12,4157,False,2022-08-20 16:04:30
16557,230.341667,7.431111,904610401,2023-01-29 04:34:54,True,0 days 11:06:40,2024-02-16,4238,False,2023-01-29 15:41:34
16558,230.341667,7.431111,904611101,2023-01-29 01:45:54,True,0 days 02:49:00,2024-02-16,4238,False,2023-01-29 04:34:54
16612,230.754167,7.607778,904610301,2023-02-24 09:54:35,True,0 days 10:31:40,2024-03-21,4251,False,2023-02-24 20:26:15
16833,67.5085,-61.37375,921880301,2023-06-13 17:40:47,True,1 days 12:53:20,NaT,4306,False,2023-06-15 06:34:07
16834,67.5085,-61.37375,921881001,2023-06-13 16:44:27,True,0 days 00:56:20,NaT,4306,False,2023-06-13 17:40:47
16835,67.718583,-61.565167,921880401,2023-06-15 17:32:48,True,1 days 12:55:00,2024-07-26,4307,False,2023-06-17 06:27:48
16836,67.718583,-61.565167,921881101,2023-06-15 16:36:58,True,0 days 00:55:50,2024-07-26,4307,False,2023-06-15 17:32:48
16853,67.718583,-61.565167,921880801,2023-06-29 19:27:04,True,0 days 21:21:40,2024-07-26,4314,False,2023-06-30 16:48:44
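
The `proprietary_usable` flag reflects whether an observation's proprietary period has ended. A minimal sketch of that logic follows, under the assumption (ours, not confirmed against the DAXA source) that an observation with no recorded end date (`NaT` in the table) is treated as not yet usable. The helper name and the reference date are hypothetical.

```python
from datetime import date


def is_publicly_usable(proprietary_end, today):
    """Return True once the proprietary period has ended; a missing end date counts as not usable."""
    if proprietary_end is None:
        return False
    return proprietary_end <= today


# End dates taken from the tables above; the reference day is purely illustrative
reference_day = date(2023, 9, 1)
print(is_publicly_usable(date(2003, 9, 12), reference_day))  # True (already public)
print(is_publicly_usable(date(2023, 9, 12), reference_day))  # False (still proprietary)
print(is_publicly_usable(None, reference_day))               # False (no end date recorded)
```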


### Chandra

Chandra observations of LoVoCCS clusters will allow us to explore the spatially resolved properties of the ICM with more confidence, thanks to its excellent spatial resolution. However, the smaller field-of-view and lower sensitivity (when compared to XMM) mean that it will be a focus in a future phase of the project.

We still wish to assess the number of observations that may be available with Chandra, however, and we can use DAXA to download pre-processed data, including images that may be useful at this stage:

In [8]:
ch = Chandra()
ch.all_obs_info

Unnamed: 0,ra,dec,ObsID,science_usable,proprietary_usable,start,end,duration,proprietary_end_date,target_category,instrument,grating,data_mode
0,274.43140,-33.01883,6616,True,True,2006-02-24 04:33:41.000003,2007-09-21 00:26:11.000003,573 days 19:52:30,2007-02-28,GS,ACIS-S,HETG,CC_000A8
1,83.63292,22.01447,7587,True,True,2007-02-03 09:58:57.000000,2008-07-21 21:53:57.000000,534 days 11:55:00,2008-02-06,SNR,ACIS-S,HETG,TE_0077C
2,202.50000,47.20000,13814,True,True,2012-09-20 07:21:41.999999,2012-09-22 12:47:41.999999,2 days 05:26:00,2013-10-11,NGS,ACIS-S,NONE,TE_00958
3,266.41667,-29.00781,13842,True,True,2012-07-21 11:52:41.000002,2012-07-23 17:08:41.000002,2 days 05:16:00,2012-07-25,NGS,ACIS-S,HETG,TE_008D0
4,316.72458,38.74942,13651,True,True,2012-02-13 20:18:26.999997,2012-02-16 00:51:26.999997,2 days 04:33:00,2013-02-21,MISC,HRC-S,LETG,DEFAULT
...,...,...,...,...,...,...,...,...,...,...,...,...,...
22378,84.91458,-69.74361,1203,True,True,1999-08-31 03:28:53.000003,1999-08-31 03:28:53.000003,0 days 00:00:00,2003-06-16,GS,HRC-I,NONE,
22379,84.91458,-69.74361,62437,True,True,1999-09-05 00:50:10.999997,1999-09-05 00:50:10.999997,0 days 00:00:00,2002-03-27,GS,HRC-S,NONE,SCENTER
22380,54.19708,0.58997,1153,True,True,1999-09-05 04:29:00.999998,1999-09-05 04:29:00.999998,0 days 00:00:00,2002-10-04,CAL,HRC-S,NONE,SCENTER
22381,332.17010,45.74230,1336,True,True,1999-10-03 21:48:53.000004,1999-10-03 21:48:53.000004,0 days 00:00:00,1999-12-14,CAL,HRC-I,NONE,


This is the same procedure as with the search for XMM data, and we find that there are a significant number of observations available:

In [9]:
ch.filter_on_positions(samp[['ra', 'dec']].values, search_distances)
ch.filtered_obs_info



Unnamed: 0,ra,dec,ObsID,science_usable,proprietary_usable,start,end,duration,proprietary_end_date,target_category,instrument,grating,data_mode
13,15.35027,-21.81757,13442,True,True,2011-08-23 07:02:35.999998,2011-08-25 08:46:15.999998,2 days 01:43:40,2011-08-30,GCL,ACIS-I,NONE,TE_004D8
66,193.22708,-29.45500,4198,True,True,2003-03-20 10:23:35.000002,2003-03-22 08:23:35.000002,1 days 22:00:00,2003-03-26,GCL,ACIS-I,NONE,TE_0048A
135,15.99304,-21.71045,13448,True,True,2011-09-13 09:56:39.000002,2011-09-15 03:04:09.000002,1 days 17:07:30,2011-09-16,GCL,ACIS-I,NONE,TE_004D8
147,332.74768,-12.42057,23368,True,True,2021-07-22 02:38:25.999996,2021-07-23 18:40:55.999996,1 days 16:02:30,2022-07-26,MISC,HRC-I,NONE,OBS20743
149,15.32166,-22.08948,13452,True,True,2011-09-24 12:29:40.999998,2011-09-26 04:28:50.999998,1 days 15:59:10,2011-09-27,GCL,ACIS-I,NONE,TE_004D8
...,...,...,...,...,...,...,...,...,...,...,...,...,...
18830,247.25542,40.13322,4140,True,True,2003-07-12 08:26:28.000003,2003-07-12 09:23:38.000003,0 days 00:57:10,2004-07-16,AGN,ACIS-S,NONE,TE_003C4
18923,176.15333,67.40592,5675,True,True,2005-03-25 21:37:35.999999,2005-03-25 22:30:35.999999,0 days 00:53:00,2006-03-28,AGN,ACIS-S,NONE,TE_006A0
19204,256.04479,78.63100,1521,True,True,2000-02-27 01:49:23.000005,2000-02-27 02:39:03.000005,0 days 00:49:40,2002-02-14,GCL,ACIS-S,NONE,TE_002DE
20186,227.92208,5.30264,4047,True,True,2003-05-18 11:19:00.999995,2003-05-18 11:52:40.999995,0 days 00:33:40,2004-06-04,AGN,ACIS-S,NONE,TE_002A2


### ROSAT All-Sky

All of our clusters will be covered by the ROSAT All-Sky Survey, and data from this mission will be helpful in defining the background emission when analysing our clusters with XMM. It is also useful to have a data source that provides uniform(-ish) coverage of our sample, as it provides a base for our work to build on:

In [10]:
ra = ROSATAllSky()
ra.all_obs_info

Unnamed: 0,ra,dec,ObsID,science_usable,start,end,duration,target_category
0,263.57088,67.500,RS930521N00,True,1990-07-11,1991-08-13,0 days 11:19:06,ASK
1,276.42533,67.500,RS930522N00,True,1990-07-11,1991-08-13,0 days 11:18:11,ASK
2,96.42533,-67.500,RS932908N00,True,1990-07-11,1991-08-13,0 days 05:47:48,ASK
3,83.57088,-67.500,RS932907N00,True,1990-07-11,1991-08-13,0 days 05:47:39,ASK
4,267.27100,61.875,RS930625N00,True,1990-07-11,1991-08-13,0 days 03:44:35,ASK
...,...,...,...,...,...,...,...,...
1373,289.08783,-61.875,RS932827N00,True,1990-09-10,1990-10-08,0 days 00:02:44,ASK
1374,350.17942,-33.750,RS932354N00,True,1990-11-09,1990-12-02,0 days 00:02:42,ASK
1375,278.17942,-61.875,RS932826N00,True,1990-09-04,1990-09-30,0 days 00:02:42,ASK
1376,273.75000,-45.000,RS932537N00,True,1990-09-07,1990-09-23,0 days 00:02:42,ASK


The RASS data are divided into roughly 6x6 degree chunks of sky, and as such our previously defined search distances are not a good choice here - it is entirely possible that the centre of the relevant RASS chunk will be further away than the search distance, even though the cluster does lie in that chunk. As such we use the default search distance (which is based upon the known size of the RASS chunks).
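
To see why the cluster-specific search distances fail for RASS, consider the largest possible offset between a cluster and the centre of the chunk that contains it. The short calculation below assumes idealised square chunks, 6 degrees on a side as quoted above, and ignores projection effects. It compares that offset with the largest 2$R_{500}$ search distance visible in the earlier output (about 0.93 degrees).

```python
import math

chunk_side_deg = 6.0                 # chunk size quoted in the text (idealised as a flat square)
largest_visible_search = 0.93161725  # largest 2*R500 search distance shown in the earlier output

# A cluster in the corner of a chunk sits half a diagonal away from the chunk centre
max_offset_deg = math.hypot(chunk_side_deg / 2, chunk_side_deg / 2)

print(f"Maximum centre offset: {max_offset_deg:.2f} deg")
print(max_offset_deg > largest_visible_search)
```

Under these assumptions the offset can exceed four degrees, several times larger than any of the search distances shown, which is why a chunk-sized default is the safer choice.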

We find that 144 RASS 'observations' are selected, which may correspond to one per cluster:

In [11]:
ra.filter_on_positions(samp[['ra', 'dec']].values)
ra.filtered_obs_info

Unnamed: 0,ra,dec,ObsID,science_usable,start,end,duration,target_category
9,256.36258,61.875,RS930624N00,True,1990-07-11,1991-08-13,0 days 01:17:00,ASK
21,258.75000,78.750,RS930312N00,True,1990-07-30,1991-08-13,0 days 00:45:33,ASK
25,70.90838,-61.875,RS932807N00,True,1990-07-11,1991-08-13,0 days 00:41:47,ASK
41,244.61279,56.250,RS930727N00,True,1990-07-11,1991-08-13,0 days 00:30:27,ASK
74,60.00000,-56.250,RS932707N00,True,1990-07-11,1991-08-13,0 days 00:20:52,ASK
...,...,...,...,...,...,...,...,...
1316,234.00000,-84.375,RS933207N00,True,1990-08-15,1991-02-18,0 days 00:05:11,ASK
1317,334.68750,-5.625,RS931860N00,True,1990-11-08,1990-11-28,0 days 00:05:10,ASK
1350,349.61279,-39.375,RS932451N00,True,1990-11-04,1990-11-29,0 days 00:03:50,ASK
1365,345.24583,-22.500,RS932159N00,True,1990-11-11,1990-12-02,0 days 00:03:10,ASK


### ROSAT Pointed

We also wish to search for pointed ROSAT observations of our clusters, as they may be useful. Pointed PSPC observations could be spectrally analysed for instance, though if the observations were off axis the PSF effects can become quite extreme. Pointed HRI observations have very good spatial resolution, so could be used for the assessment of ICM structure, but very poor spectral resolution, so would not be useful for spectral fitting:

In [12]:
rp = ROSATPointed()
rp.all_obs_info

Unnamed: 0,ra,dec,ObsID,science_usable,start,end,duration,instrument,with_filter,target_category,target_name,proc_rev,fits_type
0,163.1800,57.4800,RH701867A01,True,1995-04-15 23:24:16.000001,1995-05-11 14:24:47.000001,2 days 13:29:32,HRI,N,AGN,LOCKMAN HOLE,2,RFITS V4.
1,203.6500,37.9100,RH900717N00,True,1997-06-04 16:13:00.999998,1997-07-13 22:26:45.000004,2 days 07:58:33,HRI,N,MISC,DEEP SURVEY,2,RDF 4_0
2,163.1800,57.4800,RH701867A04,True,1997-04-15 21:51:16.999998,1997-04-28 16:37:45.999998,2 days 06:33:41,HRI,N,AGN,LOCKMAN HOLE,2,RFITS V3.
3,350.8700,58.8100,RH500444N00,True,1995-12-23 22:18:36.999999,1996-02-01 10:17:53.999998,2 days 02:07:27,HRI,N,SNR,CAS A,2,RDF 3_4
4,163.1800,57.4800,RH701867A02,True,1996-05-01 02:09:33.000002,1996-05-29 15:31:24.999997,2 days 01:05:24,HRI,N,AGN,LOCKMAN HOLE,2,RFITS V4.
...,...,...,...,...,...,...,...,...,...,...,...,...,...
11426,84.2900,-80.4700,RP999998A02,False,1991-10-20 04:07:51.999998,1991-11-01 22:43:43.000003,0 days 00:00:00,PSPCB,N,EGE,Idle Point,2,RDF 3_4
11427,93.1800,-81.8300,RH150094N00,False,1990-07-29 22:13:42.999997,1990-07-29 22:56:58.999998,0 days 00:00:00,HRI,N,STR,Calibration Source,2,RFITS V4.
11428,218.1540,-44.2031,RH001034N00,False,1997-08-28 00:29:42.000000,1997-08-28 01:09:58.999997,0 days 00:00:00,HRI,N,MISC,,2,RDF 4_2
11429,258.1383,-23.3850,RH800067M01,False,1991-03-19 20:33:48.999997,1991-03-19 20:35:59.999997,0 days 00:00:00,HRI,N,GCL,OPHIUCHUS CLUSTER,2,RFITS V3.


We again use the default search distances - this is due to HRI and PSPC having _very_ different fields of view, so using a single search distance is not really valid. We also exclude one selected observation due to an incompatibility with DAXA that only appears when we try to download the data:

In [13]:
rp.filter_on_positions(samp[['ra', 'dec']].values)

# Due to a different processing version to the rest of the data (I assume), RP700068 cannot currently be downloaded
#  by DAXA, so we're manually excluding it
rp.filter_array[np.where(rp.all_obs_info['ObsID'] == 'RP700068')[0]] = False

rp.filtered_obs_info

Unnamed: 0,ra,dec,ObsID,science_usable,start,end,duration,instrument,with_filter,target_category,target_name,proc_rev,fits_type
178,348.4900,-42.7300,RH800448N00,True,1994-11-05 12:42:18.000000,1994-11-27 04:14:48.999998,0 days 13:23:40,HRI,N,GCL,SERSIC 159-03,2,RDF 3_8
245,255.8400,78.6400,RH800676N00,True,1995-03-02 12:38:52.999998,1995-03-06 03:04:16.000003,0 days 12:06:07,HRI,N,GCL,A2256,2,RDF 3_4
283,247.1600,39.5500,RP800644N00,True,1993-07-25 12:07:22.999999,1993-07-28 21:49:24.000004,0 days 11:23:19,PSPCB,N,GCL,A2199,2,RDF 3_4
335,137.2100,-9.6500,RH800768A01,True,1996-04-26 07:31:21.000000,1996-04-28 15:47:16.000002,0 days 10:22:26,HRI,N,GCL,A754,2,RDF 3_4
340,67.7900,-61.4800,RH800688N00,True,1995-09-30 16:47:31.000004,1995-10-08 21:20:24.000003,0 days 10:19:07,HRI,N,GCL,A 3266,2,RDF 3_4
...,...,...,...,...,...,...,...,...,...,...,...,...,...
10019,229.1800,7.0200,RH800223A01,True,1994-07-31 03:09:02.999998,1994-07-31 21:01:26.000003,0 days 00:13:36,HRI,N,GCL,A2052,2,RDF 3_6
10112,174.1300,67.6200,RP900521A01,True,1994-05-04 09:12:06.000002,1994-05-04 09:26:54.999997,0 days 00:12:33,PSPCB,N,EGE,HS 1133+6753,2,RDF 3_4
10509,230.7700,8.6099,RP800184A00,True,1992-02-01 10:04:44.000000,1992-02-01 10:08:44.000002,0 days 00:06:31,PSPCB,N,GCL,A 2063,2,RDF 3_4
10545,255.9000,78.7200,RP100145N00,True,1990-06-20 00:28:57.000003,1990-06-21 22:48:14.999999,0 days 00:05:52,PSPCC,N,STR,XRT/PSPC SPEC/FLUX B,2,RDF 3_3
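
The manual exclusion works by switching off entries of a boolean mask aligned with the full observation table. The pure-Python sketch below illustrates the idea with plain lists; it is not DAXA code. Because the identifiers in the preview carry a suffix (for example `RP800644N00`), the sketch matches on a prefix, which is an assumption about how the excluded observation is stored in the table. The third identifier is hypothetical.

```python
# ObsIDs in table order; the first two appear in the preview above, the third is hypothetical
obs_ids = ["RH800448N00", "RP800644N00", "RP700068N00"]
selected = [True, True, True]   # stand-in for the mission's boolean filter mask

excluded_prefix = "RP700068"

# Switch off every selected observation whose identifier starts with the excluded prefix
selected = [keep and not obs_id.startswith(excluded_prefix)
            for obs_id, keep in zip(obs_ids, selected)]

remaining = [obs_id for obs_id, keep in zip(obs_ids, selected) if keep]
print(remaining)  # ['RH800448N00', 'RP800644N00']
```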


We can examine the number of pointed observations for each instrument:

In [28]:
rp.filtered_obs_info['instrument'].value_counts()

PSPCB    122
HRI       91
PSPCC      5
Name: instrument, dtype: int64

### NuSTAR Pointed

NuSTAR is not a typical choice for the study of galaxy clusters, but it has been used for this purpose in the past. It is a very capable telescope which may provide extra coverage of our galaxy clusters, and it is also unique in the current selection of missions in that it can observe at very high energies (relative to the typical X-ray range) - whether or not this proves useful in future LoVoCCS work, it will be interesting to explore.

This mission does not have a large field of view, and has a limited number of requested pointings at galaxy clusters, so we do not expect very many observations to be selected:

In [14]:
nu = NuSTARPointed()
nu.all_obs_info

Unnamed: 0,ra,dec,ObsID,science_usable,proprietary_usable,start,end,duration,proprietary_end_date,target_category,exposure_a,exposure_b,ontime_a,ontime_b,nupsdout,issue_flag
0,83.8281,-69.2465,40001014013,True,True,2013-06-29 01:16:07.183998,2013-07-05 08:51:07.184004,6 days 07:35:00.000006,2015-09-17 00:00:00,SNR,5 days 11:20:42,5 days 11:09:58,5 days 20:32:48,5 days 20:33:46,2119,1
1,83.7759,-69.2677,40001014016,True,True,2014-04-22 21:06:07.183999,2014-04-29 09:31:07.183998,6 days 12:24:59.999999,2015-09-17 00:00:00,SNR,4 days 23:56:47,4 days 23:35:17,5 days 08:14:46,5 days 08:14:09,0,1
2,83.8965,-69.2477,40001014023,True,True,2014-08-01 23:46:07.183998,2014-08-08 02:41:07.184000,6 days 02:55:00.000002,2015-09-17 00:00:00,SNR,4 days 22:46:56,4 days 22:35:32,5 days 07:01:07,5 days 07:02:55,0,0
3,340.6530,29.6985,60401031004,True,True,2018-11-28 22:21:09.184000,2018-12-08 17:16:09.183997,9 days 18:54:59.999997,2019-12-25 00:00:00,AGN,4 days 17:35:58,4 days 17:01:43,5 days 02:48:25,5 days 02:49:43,0,0
4,6.3841,64.1044,40020001002,True,True,2014-04-12 00:56:07.183997,2014-04-17 22:06:07.184001,5 days 21:10:00.000004,2015-03-31 00:00:00,SNR,3 days 22:05:24,3 days 21:55:12,4 days 05:18:29,4 days 05:18:40,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5208,0.0000,0.0000,90410210001,True,True,2018-09-28 20:23:08.184002,2018-09-28 20:25:37.183999,0 days 00:02:28.999997,2018-10-04 00:00:00,TOO,0 days 00:00:00,0 days 00:00:00,0 days 00:00:00,0 days 00:00:00,0,1
5209,0.0000,0.0000,90410211001,True,True,2018-09-28 20:25:37.183999,2018-09-28 20:28:04.184000,0 days 00:02:27.000001,2018-10-04 00:00:00,TOO,0 days 00:00:00,0 days 00:00:00,0 days 00:00:00,0 days 00:00:00,0,1
5210,0.0000,0.0000,90410212001,True,True,2018-09-28 20:28:04.184000,2018-09-28 20:30:31.184001,0 days 00:02:27.000001,2018-10-04 00:00:00,TOO,0 days 00:00:00,0 days 00:00:00,0 days 00:00:00,0 days 00:00:00,0,1
5211,0.0000,0.0000,90410213001,True,True,2018-09-28 20:30:31.184001,2018-09-28 20:32:58.184002,0 days 00:02:27.000001,2018-10-04 00:00:00,TOO,0 days 00:00:00,0 days 00:00:00,0 days 00:00:00,0 days 00:00:00,0,1


We again use the search distances defined earlier, because they are likely to be larger than the telescope's field of view:

In [15]:
nu.filter_on_positions(samp[['ra', 'dec']].values, search_distances)
nu.filtered_obs_info

Unnamed: 0,ra,dec,ObsID,science_usable,proprietary_usable,start,end,duration,proprietary_end_date,target_category,exposure_a,exposure_b,ontime_a,ontime_b,nupsdout,issue_flag
43,349.0706,-42.6138,50501001002,True,True,2019-10-10 16:16:09.184002,2019-10-14 18:26:09.184004,4 days 02:10:00.000002,2020-10-30 00:00:00,NGS,2 days 05:02:44,2 days 04:42:39,2 days 08:46:54,2 days 08:47:33,0,0
155,137.424,-9.6895,70201001002,True,True,2017-01-01 05:56:09.183998,2017-01-03 23:21:09.184003,2 days 17:25:00.000005,2018-01-10 00:00:00,GCL,1 days 10:23:26,1 days 10:18:25,1 days 12:49:04,1 days 12:49:10,0,0
276,255.9702,78.6158,70001053002,True,True,2013-03-09 20:21:07.183999,2013-03-11 18:41:07.184003,1 days 22:20:00.000004,2014-02-06 00:00:00,GCL,1 days 04:24:27,1 days 04:23:09,1 days 06:25:03,1 days 06:25:49,0,0
287,227.7529,5.7465,70660001002,True,True,2021-02-17 20:51:09.184000,2021-02-20 09:36:09.184000,2 days 12:45:00,2022-03-01 00:00:00,GCL,1 days 04:15:49,1 days 03:59:56,1 days 06:27:12,1 days 06:27:39,0,0
304,207.1966,26.5877,70660003002,True,True,2020-07-15 14:26:09.184001,2020-07-17 22:46:09.184000,2 days 08:19:59.999999,2021-07-27 00:00:00,GCL,1 days 04:00:30,1 days 03:46:18,1 days 06:03:31,1 days 06:04:06,0,0
352,67.808,-61.4552,70801001002,True,True,2022-11-17 00:31:09.184002,2022-11-19 01:56:09.184004,2 days 01:25:00.000002,2023-06-05 00:00:00,GCL,1 days 02:28:26,1 days 02:13:14,1 days 04:18:25,1 days 04:18:34,0,0
422,247.1378,39.5406,70660004002,True,True,2020-09-18 07:01:09.184000,2020-09-20 06:56:09.184001,1 days 23:55:00.000001,2021-09-28 00:00:00,GCL,0 days 23:46:43,0 days 23:33:57,1 days 01:31:43,1 days 01:31:55,0,0
515,255.4169,78.6901,70001060002,True,True,2013-05-03 05:26:07.183997,2013-05-04 16:46:07.184003,1 days 11:20:00.000006,2014-02-06 00:00:00,GCL,0 days 21:22:11,0 days 21:20:50,0 days 22:49:26,0 days 22:50:08,0,0
583,67.8588,-61.3672,70801002002,True,False,2023-04-18 17:21:09.184003,2023-04-20 01:51:09.183997,1 days 08:29:59.999994,2023-11-08 00:00:00,GCL,0 days 19:33:42,0 days 19:21:28,0 days 20:56:39,0 days 20:56:57,0,0
588,227.9639,5.3188,60401024002,True,True,2019-01-08 13:56:09.184004,2019-01-10 03:36:09.184000,1 days 13:39:59.999996,2020-01-24 00:00:00,AGN,0 days 19:26:54,0 days 19:20:25,0 days 20:49:17,0 days 20:49:37,0,0


We also check to see how many of the observations are still proprietary, and find that there are several which have been requested to point at galaxy clusters (see the target category):

In [16]:
nu.filtered_obs_info[nu.filtered_obs_info['proprietary_usable'] == False]

Unnamed: 0,ra,dec,ObsID,science_usable,proprietary_usable,start,end,duration,proprietary_end_date,target_category,exposure_a,exposure_b,ontime_a,ontime_b,nupsdout,issue_flag
583,67.8588,-61.3672,70801002002,True,False,2023-04-18 17:21:09.184003,2023-04-20 01:51:09.183997,1 days 08:29:59.999994,2023-11-08 00:00:00,GCL,0 days 19:33:42,0 days 19:21:28,0 days 20:56:39,0 days 20:56:57,0,0
1454,67.8621,-61.3675,70801002004,True,False,2023-04-29 10:36:09.184003,2023-04-30 08:26:09.184001,0 days 21:49:59.999998,2023-11-08 00:00:00,GCL,0 days 11:39:10,0 days 11:32:17,0 days 12:29:44,0 days 12:30:03,0,0
2754,49.4718,-44.1908,70860008002,True,False,2023-06-03 21:31:09.184002,2023-06-04 08:51:09.184000,0 days 11:19:59.999998,2023-12-12 00:00:00,GCL,0 days 06:09:43,0 days 06:05:29,0 days 06:34:03,0 days 06:34:02,0,0
2758,55.6502,-53.5903,70860007002,True,False,2023-05-16 07:36:09.184003,2023-05-16 20:06:09.184000,0 days 12:29:59.999997,2023-11-22 00:00:00,GCL,0 days 06:09:24,0 days 06:05:56,0 days 06:37:57,0 days 06:38:15,0,1
3075,201.9855,-31.5431,70860003002,True,False,2023-05-08 21:46:09.183997,2023-05-09 08:36:09.183997,0 days 10:50:00,2023-11-15 00:00:00,GCL,0 days 05:46:16,0 days 05:42:45,0 days 06:09:55,0 days 06:09:55,0,1
3298,194.7977,-4.2149,70860004002,True,False,2023-06-10 00:31:09.184002,2023-06-10 11:11:09.183998,0 days 10:39:59.999996,2023-12-19 00:00:00,GCL,0 days 05:25:37,0 days 05:22:34,0 days 05:52:18,0 days 05:52:23,0,0
3404,194.6245,-1.7801,70860006002,True,False,2023-06-02 15:46:09.183997,2023-06-03 02:41:09.184004,0 days 10:55:00.000007,2023-12-12 00:00:00,GCL,0 days 05:07:56,0 days 05:05:05,0 days 05:31:36,0 days 05:31:50,0,0


### eROSITA Calibration and Performance Verification 

The eROSITA telescope is an excellent choice for the exploration of the ICM of bright, local, clusters at large scales. As of the time of writing there have been no data releases for the eROSITA All-Sky Survey (though they will hopefully be supported by DAXA soon after release), so we can only search the performance verification data, which is a mixture of pointed and slew observations.

In [17]:
er = eROSITACalPV()
er.all_obs_info

Unnamed: 0,ra,dec,ObsID,science_usable,start,end,duration,Field_Name,Field_Type
0,129.550000,1.500000,300007,True,2019-11-03 02:42:50,2019-11-04 03:36:37,89627.0,EFEDS,SURVEY
1,133.860000,1.500000,300008,True,2019-11-04 03:49:16,2019-11-05 05:16:39,91643.0,EFEDS,SURVEY
2,138.140000,1.500000,300009,True,2019-11-05 05:29:18,2019-11-06 06:40:06,90648.0,EFEDS,SURVEY
3,142.450000,1.500000,300010,True,2019-11-06 07:24:46,2019-11-07 08:20:08,89722.0,EFEDS,SURVEY
4,130.331300,-78.963400,300004,True,2019-11-16 23:14:40,2019-11-18 18:17:12,154952.0,ETA_CHA,SURVEY
...,...,...,...,...,...,...,...,...,...
165,284.146250,-37.909167,700008,True,2019-10-24 11:11:19,2019-10-25 08:55:52,78273.0,1RXS_J185635_375433,EXTRAGALACTIC_FIELDS
166,281.540771,79.873726,900060,True,2019-09-24 15:27:06,2019-09-24 21:30:32,21806.0,3C390_3,EXTRAGALACTIC_FIELDS
167,281.500275,79.885376,900068,True,2019-09-28 15:49:51,2019-09-28 21:30:32,20441.0,3C390_3,EXTRAGALACTIC_FIELDS
168,281.489410,79.888214,900069,True,2019-09-29 15:23:24,2019-09-29 21:30:37,22033.0,3C390_3,EXTRAGALACTIC_FIELDS


We search the available eROSITA data and find that two of our LoVoCCS clusters have been observed during the commissioning phase - these were pointed observations, with a good exposure time. They have already been well analysed but will no doubt be useful to us in combination with the LoVoCCS weak-lensing data:

In [18]:
er.filter_on_positions(samp[['ra', 'dec']].values, search_distances)
er.filtered_obs_info



Unnamed: 0,ra,dec,ObsID,science_usable,start,end,duration,Field_Name,Field_Type
148,55.72822,-53.63013,700177,True,2019-11-21 10:22:18,2019-11-22 08:36:11,80033.0,A3158,EXTRAGALACTIC_FIELDS
149,67.84096,-61.44033,700154,True,2019-11-11 02:09:28,2019-11-11 22:46:56,74248.0,A3266,EXTRAGALACTIC_FIELDS


## Downloading the data

Ordinarily the DAXA missions we have declared would be used to declare a DAXA archive, which would then automatically download the data, but as we wish to use some non-standard settings when downloading the data (i.e. we wish to download pre-processed data and images for non-XMM missions), we trigger the mission download method manually for each of them:

In [19]:
xm.download(num_cores=80)



In the case of Chandra, `download_standard=True` means it downloads the standard Chandra data distribution, with pre-cleaned event lists and images, as well as the raw data:

In [20]:
ch.download(num_cores=80, download_standard=True)

  warn("The raw data for this mission have already been downloaded.")


These download commands for the ROSAT All-Sky and ROSAT Pointed missions will download the processed event lists (completely unprocessed ones are not readily available), as well as pre-generated images and exposure maps (though pre-generated exposure maps are not available for HRI data):

In [21]:
ra.download(num_cores=80, download_products=True, download_processed=True)

  warn("The raw data for this mission have already been downloaded.")


In [22]:
rp.download(num_cores=80, download_products=True, download_processed=True)

  warn("The raw data for this mission have already been downloaded.")


The cleaned events and images are downloaded alongside the raw NuSTAR data here:

In [23]:
nu.download(num_cores=80, download_processed=True)

  warn("The raw data for this mission have already been downloaded.")


The only products available for eROSITA Calibration and Performance Verification observations are cleaned event lists, so no pre-generated images or exposure maps are acquired:

In [24]:
er.download(num_cores=80)

  warn("The raw data for this mission have already been downloaded.")
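
The individual download calls above could equally be driven from one loop, which keeps the per-mission settings in a single place. This is only a sketch: `download_all` is a hypothetical helper (not part of DAXA), and it assumes nothing about the mission objects beyond the `download` method and keyword arguments already used in this notebook.

```python
def download_all(mission_settings, num_cores=80):
    """Call .download() on each mission, passing its own extra keyword arguments.

    mission_settings is a list of (mission, extra_kwargs) pairs.
    """
    for mission, extra_kwargs in mission_settings:
        mission.download(num_cores=num_cores, **extra_kwargs)


# With the mission objects declared earlier in this notebook, the calls above
# would correspond to:
# download_all([
#     (xm, {}),
#     (ch, {"download_standard": True}),
#     (ra, {"download_products": True, "download_processed": True}),
#     (rp, {"download_products": True, "download_processed": True}),
#     (nu, {"download_processed": True}),
#     (er, {}),
# ])
```

Keeping the settings in a list also makes it easier to see at a glance which missions are fetching pre-processed products.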


## Setting up a DAXA Archive

We then set up a DAXA archive containing the missions we have already declared and filtered. This Archive is responsible for versioning the data, and will allow us to make updates in the future while recording all changes made:

In [25]:
arch = Archive([xm, ch, ra, rp, nu, er], 'LoVoCCS', clobber=True)



We can use the archive's `info()` method to give us an overall summary of the amount of data available for our LoVoCCS clusters - though of course we have not yet included every applicable X-ray telescope in DAXA, so further pertinent data may yet be available.

In [26]:
arch.info()


-----------------------------------------------------
Number of missions - 6
Total number of observations - 1333
Beginning of earliest observation - 1990-06-17 17:44:17.000005
End of latest observation - 2023-08-16 05:40:56.999998

-- XMM-Newton Pointed --
   Internal DAXA name - xmm_pointed
   Chosen instruments - PN, M1, M2, R1, R2
   Number of observations - 368
   Fully Processed - False

-- Chandra --
   Internal DAXA name - chandra
   Chosen instruments - ACIS-I, ACIS-S, HRC-I, HRC-S
   Number of observations - 575
   Fully Processed - False

-- RASS --
   Internal DAXA name - rosat_all_sky
   Chosen instruments - PSPC
   Number of observations - 144
   Fully Processed - False

-- ROSAT Pointed --
   Internal DAXA name - rosat_pointed
   Chosen instruments - PSPCB, PSPCC, HRI
   Number of observations - 218
   Fully Processed - False

-- NuSTAR Pointed --
   Internal DAXA name - nustar_pointed
   Chosen instruments - FPMA, FPMB
   Number of observations - 26
   Fully Processed -

## Processing the X-ray data

Here we run DAXA processing for the X-ray data we have acquired. **Currently this only processes XMM data, as that is the mission we are initially interested in**.

_**The total storage required for our processed data, and the raw/pre-processed data downloaded for our X-ray missions is currently ~420GB.**_
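
The storage figure can be re-measured as the archive grows. A minimal sketch, assuming all of the data sit under a single root directory; `directory_size_gb` is a hypothetical helper (not part of DAXA), and it reports decimal gigabytes while skipping symbolic links:

```python
import os


def directory_size_gb(root):
    """Total size of all regular files under root, in decimal gigabytes (1e9 bytes)."""
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so that linked files are not counted
            if not os.path.islink(path):
                total_bytes += os.path.getsize(path)
    return total_bytes / 1e9


# Example usage, with whichever output directory was configured for DAXA:
# print(f"{directory_size_gb('/path/to/daxa/output'):.1f} GB")
```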

### XMM-Newton

We use the standard settings for DAXA processing of XMM data - this creates cleaned, combined, event lists for PN, MOS1, and MOS2, along with images and exposure maps. The backend processing makes use of standard SAS and eSAS tools. It also assembles and cleans RGS data, without assembling any further products as that requires knowledge of the region of interest:

In [27]:
full_process_xmm(arch)

XMM-Newton Pointed - Generating calibration files: 100%|██████████| 368/368 [2:42:26<00:00, 26.49s/it]   
XMM-Newton Pointed - Generating ODF summary files: 100%|██████████| 368/368 [15:52<00:00,  2.59s/it]
XMM-Newton Pointed - Assembling RGS event lists: 100%|██████████| 718/718 [01:14<00:00,  9.65it/s]
XMM-Newton Pointed - Correcting RGS for aspect drift: 100%|██████████| 718/718 [00:31<00:00, 22.63it/s]
XMM-Newton Pointed - Cleaning RGS event lists: 100%|██████████| 718/718 [00:14<00:00, 51.08it/s] 
XMM-Newton Pointed - Assembling PN and PN-OOT event lists: 100%|██████████| 347/347 [2:17:12<00:00, 23.72s/it]    
XMM-Newton Pointed - Assembling MOS event lists: 100%|██████████| 728/728 [15:02<00:00,  1.24s/it] 
XMM-Newton Pointed - Finding PN/MOS soft-proton flares: 100%|██████████| 1072/1072 [04:43<00:00,  3.78it/s]
XMM-Newton Pointed - Generating cleaned PN/MOS event lists: 100%|██████████| 1043/1043 [00:55<00:00, 18.74it/s] 
XMM-Newton Pointed - Generating final PN/MOS event lists

## Adding source regions to the Archive

Source regions are a key part of any analysis, but DAXA does not currently have the capability to apply source detection to observations that it processes. As such we shall use source regions generated by the XMM Cluster Survey's (XCS) XAPA source finder. Passing them into the DAXA archive will store them in the DAXA archive directory structure:

In [40]:
reg_path = "../sample_files/xcs_xmm_regions/{o_id}_sky_regions.reg"

In [60]:
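# ObsIDs that were fully processed AND have an XAPA region file available
success_oi = [oi for oi, success in arch.final_process_success['xmm_pointed'].items() if success
              and os.path.exists(reg_path.format(o_id=oi))]

# Nested dictionary of mission name -> ObsID -> region file path, which is handed to the archive
reg_dec = {'xmm_pointed': {o: {'regions': reg_path.format(o_id=o)} for o in success_oi}}
arch.source_regions = reg_dec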

We find that there are some missing region files for ObsIDs that were successfully processed - we will need to address this:

In [62]:
for g_oi in [oi for oi, success in arch.final_process_success['xmm_pointed'].items() if success]:
    if g_oi not in success_oi:
        print(g_oi)

0720250401
0761550101
0800380101
0843890101
0844050601
0843890201
0881210101
0903510101
0901830601
0904610101
0763940535
0763940536
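
The same check can be written as a small helper that returns both groups of ObsIDs, rather than printing the missing ones. This is a sketch only: `split_by_region_file` is a hypothetical function, and it assumes a dictionary of ObsID to success flag shaped like `arch.final_process_success['xmm_pointed']`, plus a path template with an `{o_id}` placeholder like `reg_path`.

```python
import os


def split_by_region_file(process_success, reg_path_template):
    """Split successfully processed ObsIDs by whether their region file exists.

    Returns (with_regions, without_regions); failed ObsIDs appear in neither list.
    """
    with_regions, without_regions = [], []
    for obs_id, success in process_success.items():
        if not success:
            continue
        if os.path.exists(reg_path_template.format(o_id=obs_id)):
            with_regions.append(obs_id)
        else:
            without_regions.append(obs_id)
    return with_regions, without_regions


# Example usage with the objects defined in this notebook:
# have_reg, need_reg = split_by_region_file(arch.final_process_success['xmm_pointed'], reg_path)
```

Returning the list of ObsIDs without regions makes it straightforward to save them to a file, or pass them to whatever process will generate the missing regions.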
