# Find specific HEASARC catalogs using Python

## Learning Goals

This notebook will teach you:
- How to use Astroquery to search HEASARC's holdings for specific catalogs.

## Introduction

This bite-sized tutorial will demonstrate how you can use Python to search for a
catalog relevant to your use case from HEASARC's holdings.

Our catalog archive currently contains over 1000 entries and is always growing, so just
finding (let alone using) the right catalog can be challenging.

### Runtime

As of 4th February 2026, this notebook takes ~30 s to run to completion on Fornax using the 'Default Astrophysics' image and the 'small' server with 8 GB RAM / 2 cores.

## Imports

In [1]:
from astroquery.heasarc import Heasarc

***

## 1. Retrieve the name and description of every HEASARC catalog

We have imported the `Heasarc` object from the `astroquery.heasarc` module and can
use it to retrieve a list of **all** catalogs in our archive:

In [2]:
all_hea_cat = Heasarc.list_catalogs()

The output of `Heasarc.list_catalogs()` (assigned to the `all_hea_cat`
variable) is an Astropy `Table` object. We can tell this from
the `list_catalogs` docstring, accessed using Python's built-in `help` function:

In [3]:
help(Heasarc.list_catalogs)

Help on method list_catalogs in module astroquery.heasarc.core:

list_catalogs(*, master=False, keywords=None) method of astroquery.heasarc.core.HeasarcClass instance
    Return a table of all available catalogs with two columns
    (name, description)

    Parameters
    ----------
    master : bool
        Select only master catalogs. Default is False
    keywords : str or list
        a str or a list of str of keywords used as search
        terms for catalogs. Words with a str separated by a space
        are AND'ed, while words in a list are OR'ed

    Returns
    -------
    `~astropy.table.Table` with columns: name, description


Alternatively, we could use `type()` to directly check the type of the returned object:

In [4]:
type(all_hea_cat)

astropy.table.table.Table

## 2. Examining the table of catalogs

In a Python notebook (like this one) we can put a variable name on the last line
of a cell to display its contents; for an Astropy `Table` object it will render a
nice visualization of the contents:

In [5]:
all_hea_cat

name,description
str27,str64
"""first""",Faint Images of the Radio Sky at Twenty cm (FIRST)
a1,HEAO 1 A1 X-Ray Source Catalog
a1point,HEAO 1 A1 Lightcurves
a2lcpoint,HEAO 1 A2 Pointed Lightcurves
a2lcscan,HEAO 1 A2 Scanned Lightcurves
a2led,HEAO 1 A2 LED Catalog
a2pic,HEAO 1 A2 Piccinotti Catalog
a2point,HEAO 1 A2 Pointing Catalog
a2rtraw,HEAO 1 A2 Raw Rates
...,...


If you're more familiar with Pandas DataFrames than you are with Astropy tables, we can
use the `to_pandas()` method of the Astropy `Table` object to convert it to a
Pandas DataFrame. We can visualize the resulting dataframe in much the same way as
an Astropy `Table` object, though in this case we limit the number of rows to six:

In [6]:
pd_all_hea_cat = all_hea_cat.to_pandas()
pd_all_hea_cat.head(6)

Unnamed: 0,name,description
0,"""first""",Faint Images of the Radio Sky at Twenty cm (FI...
1,a1,HEAO 1 A1 X-Ray Source Catalog
2,a1point,HEAO 1 A1 Lightcurves
3,a2lcpoint,HEAO 1 A2 Pointed Lightcurves
4,a2lcscan,HEAO 1 A2 Scanned Lightcurves
5,a2led,HEAO 1 A2 LED Catalog
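
The longer descriptions in the DataFrame output above are cut off with `...` because Pandas limits how wide a displayed column can be. As a minimal sketch, the limit can be lifted temporarily with `pd.option_context`. The `demo_cat` DataFrame here is a small hand-built stand-in for `pd_all_hea_cat` (an assumption, so that the example runs without querying HEASARC):

```python
import pandas as pd

# Hand-built stand-in for `pd_all_hea_cat`, using two rows shown earlier
demo_cat = pd.DataFrame(
    {
        "name": ['"first"', "a1"],
        "description": [
            "Faint Images of the Radio Sky at Twenty cm (FIRST)",
            "HEAO 1 A1 X-Ray Source Catalog",
        ],
    }
)

# Remove the column-width limit only inside this block; the previous
#  setting is restored automatically when the block exits
with pd.option_context("display.max_colwidth", None):
    full_text = repr(demo_cat)

print(full_text)
```

In a notebook, placing the DataFrame name on the last line inside the `with` block would not display it, so here we capture the text representation and print it instead.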


## 3. Filter the table of catalogs

```{important}
We generally recommend using direct keyword searches through
the `Heasarc.list_catalogs()` method (see section 4), rather than filtering
the table of catalogs in the way we demonstrate here.

On the other hand, this method is useful if you need more flexibility than
is provided by the keywords method.
```

As we have a table (or dataframe) of catalog names and descriptions, we can perform
all the usual boolean filtering operations on it to narrow down the list and find a
catalog we might be interested in.

Using the Pandas dataframe version of the all-catalogs-table (stored in the
`pd_all_hea_cat` variable), we can easily filter the table on the contents of
the 'description' column.

For instance, we can build a boolean array marking which catalog descriptions contain
the word 'NuSTAR' (the match is case-sensitive by default), and use it as a mask for
the original table:

In [7]:
# Use a Pandas string method to look for sub-string matches to 'NuSTAR' in
#  all entries of the 'description' column
nustar_mask = pd_all_hea_cat["description"].str.contains("NuSTAR")

# Apply the boolean mask to filter the table
pd_all_hea_cat[nustar_mask]

Unnamed: 0,name,description
608,nuaftl,NuSTAR As-Flown Timeline
609,nucosmosfc,NuSTAR COSMOS Field X-Ray Source Catalog
610,nuecdfscat,NuSTAR Survey of Extended Chandra Deep Field S...
611,nugalcen,NuSTAR Hard X-Ray Survey of the Galactic Center
612,numaster,NuSTAR Master Catalog
613,nustarssc,NuSTAR Serendipitous Survey 40-Month Primary S...
614,nustarssc2,NuSTAR Serendipitous Survey 40-Month Secondary...
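
If you are unsure how a mission name is capitalized in the descriptions, the same Pandas string method accepts a few optional arguments that make the match more forgiving. The sketch below uses a small hand-built stand-in DataFrame (`demo_cat`, an assumption so that it runs offline); the `nodesc` row with a missing description is hypothetical and only there to show the effect of `na=False`:

```python
import pandas as pd

# Hand-built stand-in for `pd_all_hea_cat`; the 'nodesc' row is hypothetical
demo_cat = pd.DataFrame(
    {
        "name": ["numaster", "nugalcen", "a1", "nodesc"],
        "description": [
            "NuSTAR Master Catalog",
            "NuSTAR Hard X-Ray Survey of the Galactic Center",
            "HEAO 1 A1 X-Ray Source Catalog",
            None,
        ],
    }
)

# case=False  -> ignore capitalization
# regex=False -> treat the search term as literal text, not a regular expression
# na=False    -> count a missing description as a non-match
nustar_mask = demo_cat["description"].str.contains(
    "nustar", case=False, regex=False, na=False
)

nustar_names = demo_cat.loc[nustar_mask, "name"].tolist()
print(nustar_names)
```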


More complex filtering operations can be performed using the same approach; for
instance, if you wanted to find all catalogs whose description mentions
XMM and Chandra, but **not** ROSAT:

In [8]:
desc_str = pd_all_hea_cat["description"].str

filt_mask = (
    desc_str.contains("XMM")
    & desc_str.contains("Chandra")
    & ~desc_str.contains("ROSAT")
)
ch_xmm_no_ros_search = pd_all_hea_cat[filt_mask]
ch_xmm_no_ros_search

Unnamed: 0,name,description
313,gmrt1hxcsf,Giant Metrewave Radio Telescope 1h XMM/Chandra...
379,ic10xmmcxo,IC 10 XMM-Newton and Chandra X-Ray Point Sourc...
985,xmmomcdfs,XMM-Newton Optical Monitor Chandra Deep Field-...


Note that the `~` operator in the mask above inverts the result of the last `contains`
operation, so that only catalogs that mention XMM and Chandra **and** do not mention ROSAT
are selected.

If we hadn't included the final expression, we would have gotten the following:

In [9]:
desc_str = pd_all_hea_cat["description"].str

filt_mask = desc_str.contains("XMM") & desc_str.contains("Chandra")
pd_all_hea_cat[filt_mask]

Unnamed: 0,name,description
313,gmrt1hxcsf,Giant Metrewave Radio Telescope 1h XMM/Chandra...
379,ic10xmmcxo,IC 10 XMM-Newton and Chandra X-Ray Point Sourc...
724,ros13hrcxo,ROSAT/XMM-Newton 13-hour Field Chandra X-Ray S...
985,xmmomcdfs,XMM-Newton Optical Monitor Chandra Deep Field-...
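
If you find yourself writing many masks like these, the pattern can be wrapped in a small function. The following is only a sketch: `filter_by_terms` is a hypothetical helper (not part of Astroquery or Pandas), and `demo_cat` is a hand-built stand-in whose descriptions are copied as they appear, truncated, in the outputs above:

```python
import pandas as pd


def filter_by_terms(cat_df, include=(), exclude=()):
    """Hypothetical helper: keep rows whose description contains every
    `include` term and none of the `exclude` terms.
    Matching is literal and case-insensitive."""
    desc_str = cat_df["description"].str
    # Start with every row selected, then narrow the selection term by term
    mask = pd.Series(True, index=cat_df.index)
    for term in include:
        mask &= desc_str.contains(term, case=False, regex=False, na=False)
    for term in exclude:
        mask &= ~desc_str.contains(term, case=False, regex=False, na=False)
    return cat_df[mask]


# Stand-in table (descriptions are truncated exactly as displayed above)
demo_cat = pd.DataFrame(
    {
        "name": ["gmrt1hxcsf", "ic10xmmcxo", "ros13hrcxo", "xmmomcdfs", "numaster"],
        "description": [
            "Giant Metrewave Radio Telescope 1h XMM/Chandra...",
            "IC 10 XMM-Newton and Chandra X-Ray Point Sourc...",
            "ROSAT/XMM-Newton 13-hour Field Chandra X-Ray S...",
            "XMM-Newton Optical Monitor Chandra Deep Field-...",
            "NuSTAR Master Catalog",
        ],
    }
)

demo_result = filter_by_terms(demo_cat, include=["XMM", "Chandra"], exclude=["ROSAT"])
demo_names = demo_result["name"].tolist()
print(demo_names)
```

Unlike the masks in the cells above, this helper ignores capitalization, which is a design choice rather than a requirement.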


## 4. Search for catalogs using keywords [**recommended**]

Here we demonstrate the recommended method to search for specific catalogs - passing
values to the `keywords` argument of the `Heasarc.list_catalogs()` method.

The simplest case is searching for catalogs using a single keyword. For instance, if
we thought we needed a catalog based on Chandra observations:

In [10]:
Heasarc.list_catalogs(keywords="chandra")

name,description
str11,str64
acceptcat,Archive of Chandra Cluster Entropy Profile Tables (ACCEPT) Catal
aegisx,AEGIS-X Chandra Extended Groth Strip X-Ray Point Source Catalog
aegisxdcxo,AEGIS-X Deep Survey Chandra X-Ray Point Source Catalog
aknepdfcxo,Akari North Ecliptic Pole Deep Field Chandra X-Ray Point Source
arcquincxo,Arches and Quintuplet Clusters Chandra X-Ray Point Source Catalo
atcdfsss82,Australia Telescope Chandra Deep Field-South and SDSS Stripe 82
bmwchancat,Brera Multi-scale Wavelet Chandra Source Catalog
candelscxo,CANDELS H-Band Selected Chandra Source Catalog
cargm31cxo,Carina Nebula Gum 31 Chandra X-Ray Point Source Catalog
...,...


That keyword search has returned a lot of catalogs, so maybe we want to
narrow it down a bit. For example, we might want to find Chandra-based catalogs that
are related to galaxy clusters; that involves identifying catalogs that have both
'chandra' and 'cluster' keywords.

In other words, we want to use an **AND** boolean operation between all the keywords
we have decided are relevant. That is achieved by passing a *string* of space-separated
words to the `keywords` argument of the `Heasarc.list_catalogs()` method:

In [11]:
Heasarc.list_catalogs(keywords="chandra cluster")

name,description
str10,str64
acceptcat,Archive of Chandra Cluster Entropy Profile Tables (ACCEPT) Catal
arcquincxo,Arches and Quintuplet Clusters Chandra X-Ray Point Source Catalo
gc47tuccx2,47 Tuc Globular Cluster Chandra X-Ray Point Source Catalog (2017
gc47tuccxo,47 Tuc Globular Cluster Chandra X-Ray Point Source Catalog (2005
gcptsrccxo,Chandra Point Sources in 18 Distant Galaxy Clusters
ic1396acxo,IC 1396A & Trumpler 37 Cluster Chandra X-Ray Point Source Catalo
omegcencx2,Omega Centauri Globular Cluster Chandra Deep Survey X-Ray Point
omegcencxo,Omega Centauri Globular Cluster Chandra X-Ray Point Source Catal
onccxoopt,Orion Nebula Cluster Chandra HRC Optical Sample
onccxoxray,Orion Nebula Cluster Chandra HRC X-Ray Point Source Catalog


Finally, if you want to search for catalogs that match **any** of a passed set
of keywords (i.e., an **OR** boolean operation), you can pass a list of strings to
the `keywords` argument.

In this case, we've decided that we want to find two prominent X-ray galaxy
cluster catalogs that we already know the names of:

In [12]:
acc_xcs_search = Heasarc.list_catalogs(keywords=["accept", "xcs"])
acc_xcs_search

name,description
str10,str64
acceptcat,Archive of Chandra Cluster Entropy Profile Tables (ACCEPT) Catal
gmrt1hxcsf,Giant Metrewave Radio Telescope 1h XMM/Chandra Survey Fld 610-MH
swxcscat,Swift X-Ray Telescope Cluster Survey Catalog
swxcsoxid,Swift X-Ray Telescope Cluster Survey Cross-Correlation Catalog
xcs,"XMM-Newton Cluster Survey Catalog, DR1 Version"
xmmao,XMM-Newton Accepted Targets
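
To make the AND/OR rules from the `list_catalogs` docstring concrete, here is a local mimic of them in plain Python. This is **not** how Astroquery or HEASARC implement the search; `matches_keywords` is a hypothetical function, and the case-insensitive sub-string match against both name and description is our assumption (suggested by `xmmao`, "Accepted Targets", appearing in the results for 'accept'):

```python
def matches_keywords(name, description, keywords):
    """Local mimic of the documented keyword rules (not astroquery's code)."""
    # A single string is handled like a one-element list
    groups = [keywords] if isinstance(keywords, str) else list(keywords)
    # Assumption: case-insensitive sub-string match on name plus description
    text = f"{name} {description}".lower()
    # Words inside one string are AND'ed; separate list entries are OR'ed
    return any(
        all(word in text for word in group.lower().split()) for group in groups
    )


# A few (name, description) pairs copied from the outputs above
demo_rows = [
    ("acceptcat", "Archive of Chandra Cluster Entropy Profile Tables (ACCEPT) Catal"),
    ("xcs", "XMM-Newton Cluster Survey Catalog, DR1 Version"),
    ("xmmao", "XMM-Newton Accepted Targets"),
    ("onccxoxray", "Orion Nebula Cluster Chandra HRC X-Ray Point Source Catalog"),
    ("numaster", "NuSTAR Master Catalog"),
]

# String of space-separated words: every word must match (AND)
and_hits = [n for n, d in demo_rows if matches_keywords(n, d, "chandra cluster")]

# List of strings: any entry may match (OR)
or_hits = [n for n, d in demo_rows if matches_keywords(n, d, ["accept", "xcs"])]

print(and_hits)
print(or_hits)
```

Note that `gmrt1hxcsf` also appears in the real search results above, even though neither keyword is visible in its (truncated) description; the real search may therefore consider more text than this sketch does.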


## 5. What next?

At this point, you should have a good idea of how to search for specific catalogs in
HEASARC's holdings. Once you have a list of catalogs, you can pick out the
names either by looking at the table or by retrieving the name column programmatically.

For an Astropy `Table`, this snippet outputs the catalog names as a NumPy array:

In [13]:
acc_xcs_name_arr = acc_xcs_search["name"].value
acc_xcs_name_arr

array(['acceptcat', 'gmrt1hxcsf', 'swxcscat', 'swxcsoxid', 'xcs', 'xmmao'],
      dtype='<U10')

For a Pandas DataFrame, this snippet outputs the catalog names as a NumPy array (with the `object` data type):

In [14]:
ch_xmm_no_ros_name_arr = ch_xmm_no_ros_search["name"].values
ch_xmm_no_ros_name_arr

array(['gmrt1hxcsf', 'ic10xmmcxo', 'xmmomcdfs'], dtype=object)
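
If you would rather work with built-in Python strings than NumPy arrays (for instance, to loop over the names or pass them to other functions), either array can be converted with `tolist()`. A minimal sketch, with the array re-created by hand from the output above so that it runs without querying HEASARC; the choice of names kept in `wanted_names` is just for illustration:

```python
import numpy as np

# Catalog names copied from the Astropy Table example output above
acc_xcs_name_arr = np.array(
    ["acceptcat", "gmrt1hxcsf", "swxcscat", "swxcsoxid", "xcs", "xmmao"]
)

# tolist() converts the NumPy string elements into built-in `str` objects
acc_xcs_names = acc_xcs_name_arr.tolist()

# Keep only the two catalogs we were actually looking for
wanted_names = [name for name in acc_xcs_names if name in {"acceptcat", "xcs"}]
print(wanted_names)
```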

## About this notebook

Author: David Turner, HEASARC Staff Scientist

Updated On: 2026-02-04

### Additional Resources

Support: [HEASARC Helpdesk](https://heasarc.gsfc.nasa.gov/cgi-bin/Feedback?selected=heasarc)

### Acknowledgements

### References

[Ginsburg, Sipőcz, Brasseur et al. (2019)](https://ui.adsabs.harvard.edu/abs/2019AJ....157...98G/abstract) - _astroquery: An Astronomical Web-querying Package in Python_

[Cavagnolo K. W., Donahue M., Voit G. M., Sun M. (2009)](https://ui.adsabs.harvard.edu/abs/2009ApJS..182...12C/abstract) - _Intracluster Medium Entropy Profiles for a Chandra Archival Sample of Galaxy Clusters_

[Mehrtens N., Romer A. K., Hilton M. et al. (2012)](https://ui.adsabs.harvard.edu/abs/2012MNRAS.423.1024M/abstract) - _The XMM Cluster Survey: optical analysis methodology and the first data release_