ErikVini/specz_compilation

The Southern-Hemisphere Spectroscopic Redshift Compilation

DOI - 10.5281/zenodo.11641314

Description

This page is reserved for releases of a compilation of spectroscopic redshifts for the Southern Hemisphere (declination below +10 degrees), focused on galaxies.

The goal of this compilation is to provide a training sample for a machine-learning photometric redshift model (de Lima et al. 2022, arXiv link) to be used in the Southern Photometric Local Universe Survey (S-PLUS, Mendes de Oliveira et al. 2019, arXiv link).

This compilation draws on 5000+ catalogues of spectroscopic redshifts from services such as VizieR, HEASARC, SDSS, and others. After removing duplicates, the number of catalogues in the final compilation is 1852 and the total number of objects is 8437460, including galaxies, stars, QSOs, and other object types. The catalogue names in the TAP services, work titles, numbers of objects, and authors are listed in the files reference_catalogues_all.csv, external_catalogues_used.csv, and reference_catalogues_used.csv for all downloaded tables and the ones used in the final compilation (after removing duplicates) respectively.

The catalogue can be downloaded via Google Drive in the "Releases" section.

Compilation numbers:

  • STAR: 4812282
  • GALAXY: 2548065
  • QSO: 507894
  • AGN: 35192
  • GLOBCLUSTER: 2245
  • SUPERNOVAE: 1948
  • UNCLEAR (described below): 529602

The columns available are:

  • RA: right ascension (degrees),
  • DEC: declination (degrees),
  • z: spectroscopic redshift,
  • e_z: error in the spectroscopic redshift,
  • f_z: flag for the spectroscopic redshift quality,
  • class_spec: spectroscopic classification of the object,
  • original_class_spec: original spectroscopic classification of the object (before grouping),
  • source: TAP service and catalogue from which the information was obtained.

Not all catalogues used contain information about the redshift error, quality flags, and/or classes. In these situations the value is left empty.
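As a rough illustration of working with these columns, the sketch below reads a few rows and keeps galaxies whose redshift flag is not REMOVE. It assumes the catalogue has been exported to CSV with the column names listed above; the sample rows, the catalogue names in `source`, and the helper names `read_rows` and `is_reliable_galaxy` are hypothetical and not part of the release.

```python
import csv
import io


def read_rows(handle):
    """Yield one dict per row, turning empty cells into None."""
    for row in csv.DictReader(handle):
        yield {key: (value if value != "" else None) for key, value in row.items()}


def is_reliable_galaxy(row):
    """True for GALAXY rows (any sub-class) whose flag is not REMOVE."""
    class_spec = row["class_spec"] or ""
    flag = row["f_z"] or ""
    return class_spec.startswith("GALAXY") and not flag.startswith("REMOVE")


# Hypothetical sample rows, for illustration only (not real catalogue data).
SAMPLE = """RA,DEC,z,e_z,f_z,class_spec,original_class_spec,source
10.5,-30.1,0.05,0.0001,KEEP(4),GALAXY,G,hypothetical_catalogue_a
11.0,-31.0,0.10,,,GALAXY(SF),SF,hypothetical_catalogue_b
12.0,-32.0,0.00,,REMOVE(1),STAR,S,hypothetical_catalogue_c
"""

rows = list(read_rows(io.StringIO(SAMPLE)))
galaxies = [row for row in rows if is_reliable_galaxy(row)]
```

Rows without a flag are kept here because an empty `f_z` only means the source catalogue gave no quality information; a stricter selection could require an explicit KEEP.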

How it was done

A script written in Python is used to download all catalogues from the VizieR and HEASARC table access protocol (TAP) services using a series of queries to obtain the table names, respective column names for coordinates, redshift, redshift error, and class, and their descriptions. This information is used to generate another table which contains all information needed to query for the objects (RA, DEC, z, e_z, f_z, class_spec).
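The kind of query involved could look like the sketch below, which only builds ADQL strings and does not contact any service. It is not the script used for this compilation: the function names are hypothetical, and the UCD prefix `src.redshift` is an assumption about which columns are searched. `TAP_SCHEMA.columns` and its fields are part of the TAP standard.

```python
def build_column_discovery_query(ucd_prefix="src.redshift"):
    """ADQL listing every column whose UCD starts with `ucd_prefix`.

    The default prefix is an assumption, not taken from the original script.
    """
    return (
        "SELECT table_name, column_name, description, unit, ucd "
        "FROM TAP_SCHEMA.columns "
        f"WHERE ucd LIKE '{ucd_prefix}%'"
    )


def build_object_query(table_name, columns):
    """ADQL selecting the given columns from one table.

    Identifiers are double-quoted because catalogue table names often
    contain characters such as '/' or '+'.
    """
    quoted = ", ".join(f'"{name}"' for name in columns)
    return f'SELECT {quoted} FROM "{table_name}"'
```

The resulting strings would then be sent to the VizieR or HEASARC TAP endpoint with any TAP client.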

Before downloading all possible tables, and since there are duplicates, a pre-processing step removes tables in which:

  • the z column does not correspond to spectroscopic redshifts (for example, metallicities or distance above the plane of the galaxy),
  • the e_z, f_z, or class columns do not correspond to the information needed. For example, some tables have flags on magnitudes (e.g. in the z-band) that share the same unified content descriptor (UCD) as the spectroscopic redshift, so the table could be downloaded with the wrong information.

This process is done for VizieR and HEASARC. In both cases any tables with zero objects are removed.

For VizieR, a correction for J1950 coordinates is applied. Also, any tables that represent distances as cz also have the redshift and error values corrected.
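For the cz correction, a minimal sketch is shown below. It assumes cz and its error are given in km/s and that the conversion is the plain division by the speed of light; the function name `cz_to_z` is hypothetical and the actual script may differ.

```python
SPEED_OF_LIGHT_KM_S = 299792.458  # exact, by definition of the metre


def cz_to_z(cz_km_s, e_cz_km_s=None):
    """Convert cz (and optionally its error) from km/s to redshift.

    Both values scale by the same constant, so the error is divided by c too.
    """
    z = cz_km_s / SPEED_OF_LIGHT_KM_S
    e_z = None if e_cz_km_s is None else e_cz_km_s / SPEED_OF_LIGHT_KM_S
    return z, e_z
```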

Classes

For tables that have this information, a manual procedure was applied to group classes into STAR, GALAXY, QSO, AGN, GLOBCLUSTER, SUPERNOVAE, or UNCLEAR. When available, sub-classes are included. Some examples are:

  • GALAXY
    • GALAXY(SF): Star-forming galaxies
    • GALAXY(GROUP): Galaxy group
    • GALAXY(RADIO): Radio galaxy
    • GALAXY(CLUSTER): Cluster of galaxies
    • GALAXY(PAIR): Pair of galaxies
    • GALAXY(COMP): Composite galaxy
    • GALAXY(UNCLEAR): Unclear galaxy (an extended object)
    • GALAXY(TRIPLET): Galaxy triplet
    • GALAXY(BCG): Brightest Cluster Galaxy
    • GALAXY(LINER): LINER galaxy
    • GALAXY(STRIPPING): Stripping galaxy
    • GALAXY(LSB): Low surface brightness galaxy
    • GALAXY(JELLYFISH): Jellyfish galaxy
  • STAR
    • STAR(NEB): Nebulae
    • STAR(WD): White-dwarf star
    • STAR(HII): HII region
  • SUPERNOVAE
  • QSO
    • QSO(BLLAC): BL-Lac object
    • QSO(BLAZAR): Blazar
  • AGN
    • AGN(Sy2): Seyfert 2 AGN
    • AGN(Sy1): Seyfert 1 AGN
    • AGN(Sy): AGN with unspecified Seyfert type
  • GLOBCLUSTER
  • UNCLEAR: This class represents classifications that were unclear, according to the catalogue authors, or that may not be correct
    • UNCLEAR(XRAY)
    • UNCLEAR(IR)
    • UNCLEAR(POINTLIKE)
    • UNCLEAR(HI)
    • UNCLEAR(EXTENDED)
    • UNCLEAR(RADIO)
    • UNCLEAR(BROADLINE)
    • UNCLEAR(ASTEROID)
    • UNCLEAR(HII)
    • UNCLEAR(EmLS)
    • UNCLEAR(Sy2)

The UNCLEAR class is reserved for objects where the classification was not clear enough to be included in the other six groups. Some other details on the classification are:

  • Classes ending with (SIMBAD): these objects did not contain spectroscopic class information until the 'reordering by missing information' step. They were crossmatched with the SIMBAD database, and the SIMBAD class is adopted if there is a match.
  • Classes ending with (FULL): these objects received a classification from the entire catalogue (the source). For example, if a catalogue is named "Spectroscopic redshifts for galaxies" and there is no spectroscopic class information, its objects are classified as GALAXY (FULL).

Be aware that this classification may change, and an update to class names to align them with the SIMBAD object types is underway.
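To work with the grouped classes programmatically, a small parser along the following lines may help. It is only a sketch: it assumes a `class_spec` value is a main class optionally followed by one parenthesised qualifier, and it treats SIMBAD and FULL as provenance tags rather than sub-classes. How a sub-class and a provenance tag combine in a single value is not described here, so that case is not handled. The name `parse_class` is hypothetical.

```python
import re

# One main class, optionally followed by a single "(...)" qualifier.
_CLASS_PATTERN = re.compile(r"^\s*([A-Za-z]+)\s*(?:\(([^()]*)\))?\s*$")
PROVENANCE_TAGS = {"SIMBAD", "FULL"}


def parse_class(class_spec):
    """Split a class string into (main_class, sub_class, provenance)."""
    if class_spec is None or class_spec.strip() == "":
        return None, None, None
    match = _CLASS_PATTERN.match(class_spec)
    if match is None:
        # Unexpected layout: return the text unchanged as the main class.
        return class_spec.strip(), None, None
    main_class, qualifier = match.group(1), match.group(2)
    if qualifier in PROVENANCE_TAGS:
        return main_class, None, qualifier
    return main_class, qualifier, None
```

For example, `parse_class("GALAXY(SF)")` separates the star-forming sub-class from the main class, while `parse_class("GALAXY (FULL)")` reports FULL as the provenance.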

Flags

For tables with this information, a manual verification was made in order to classify flags as KEEP or REMOVE. The KEEP flag indicates that the spectroscopic redshift is reliable according to the authors of the catalogue, and the REMOVE flag indicates that the given measurement is not reliable. The original flag is kept within parentheses after the KEEP or REMOVE word.
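A flag value in this layout can be split into its decision and original flag as sketched below. The exact spacing between the word and the parentheses is an assumption, so the pattern tolerates optional whitespace; `parse_flag` is a hypothetical helper name.

```python
import re

# KEEP or REMOVE, optionally followed by the original flag in parentheses.
_FLAG_PATTERN = re.compile(r"^\s*(KEEP|REMOVE)\s*(?:\((.*)\))?\s*$")


def parse_flag(f_z):
    """Return (decision, original_flag); both are None for an empty value."""
    if f_z is None or f_z.strip() == "":
        return None, None
    match = _FLAG_PATTERN.match(f_z)
    if match is None:
        raise ValueError(f"unrecognised flag value: {f_z!r}")
    return match.group(1), match.group(2)
```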

Merging catalogues

After all VizieR and HEASARC catalogues were downloaded and processed, they were concatenated with the SDSS DR18, PRIMUS, NED, HECATE, 2dFLenS, GLADE+, DESI, and the Maddox et al. spectroscopic redshifts for the Fornax Cluster (Maddox, N. et al. 2019) catalogues.

Duplicate removal procedure

After concatenating all catalogues, the resulting catalogue will inevitably have duplicate objects. The Python package SciPy was used to try to remove them.

Before removing duplicates, the table was sorted in order to keep the objects with the most information at the top. The catalogue is then composed of blocks:

  • Objects with e_z, f_z, and class_spec,
  • objects with e_z and class_spec,
  • objects with e_z and f_z,
  • objects with f_z and class_spec,
  • objects with e_z,
  • objects with class_spec,
  • objects with f_z, and
  • objects without e_z, f_z or class_spec.

Inside each block, priority is given to the object types in the order GALAXY, AGN, SUPERNOVAE, QSO, STAR, GLOBCLUSTER, UNCLEAR, and the remaining objects, and to the spectroscopic redshift flag (KEEP, REMOVE). This way the duplicate removal procedure tries to keep as many galaxies as possible and gives priority to objects with quality flag information.
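The ordering described above can be expressed as a sort key, as in the sketch below. It is not the original implementation: rows are plain dicts, `priority_key` is a hypothetical name, and it assumes class priority is applied before flag priority inside a block, which the description leaves open.

```python
# Blocks in the order listed above, from most to least information.
BLOCK_ORDER = [
    ("e_z", "f_z", "class_spec"),
    ("e_z", "class_spec"),
    ("e_z", "f_z"),
    ("f_z", "class_spec"),
    ("e_z",),
    ("class_spec",),
    ("f_z",),
    (),
]
BLOCK_RANK = {frozenset(cols): rank for rank, cols in enumerate(BLOCK_ORDER)}
CLASS_ORDER = ["GALAXY", "AGN", "SUPERNOVAE", "QSO", "STAR", "GLOBCLUSTER", "UNCLEAR"]
FLAG_ORDER = ["KEEP", "REMOVE"]


def _rank(value, order):
    """Position of the first prefix in `order` matching `value`, else last."""
    for position, prefix in enumerate(order):
        if value is not None and value.startswith(prefix):
            return position
    return len(order)


def priority_key(row):
    """Sort key: information block, then class priority, then flag priority."""
    present = frozenset(
        col for col in ("e_z", "f_z", "class_spec") if row.get(col) not in (None, "")
    )
    return (
        BLOCK_RANK[present],
        _rank(row.get("class_spec"), CLASS_ORDER),
        _rank(row.get("f_z"), FLAG_ORDER),
    )
```

Calling `sorted(rows, key=priority_key)` then places the rows that should survive duplicate removal first.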

An internal match is done using the sky coordinates with a 1 arcsecond maximum separation, keeping only the first occurrence (hence the importance of the sorting procedure above).
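One way to do such an internal match with SciPy is sketched below. It is an illustration under stated assumptions rather than the code used for the compilation: positions are converted to unit vectors, a k-d tree finds all pairs closer than the chord length equivalent to 1 arcsecond, and a row is dropped when it matches an earlier row that is itself kept. The function names are hypothetical, and for millions of rows the pair search would likely need to be done in chunks.

```python
import numpy as np
from scipy.spatial import cKDTree


def radec_to_unit_vectors(ra_deg, dec_deg):
    """Convert RA/DEC in degrees to Cartesian unit vectors."""
    ra = np.radians(ra_deg)
    dec = np.radians(dec_deg)
    return np.column_stack(
        [np.cos(dec) * np.cos(ra), np.cos(dec) * np.sin(ra), np.sin(dec)]
    )


def keep_first_occurrence(ra_deg, dec_deg, max_sep_arcsec=1.0):
    """Boolean mask that is False for rows matching an earlier kept row."""
    xyz = radec_to_unit_vectors(
        np.asarray(ra_deg, dtype=float), np.asarray(dec_deg, dtype=float)
    )
    # Chord length on the unit sphere for the given angular separation.
    chord = 2.0 * np.sin(np.radians(max_sep_arcsec / 3600.0) / 2.0)
    keep = np.ones(len(xyz), dtype=bool)
    # query_pairs returns (i, j) with i < j; sorting visits earlier rows first.
    for i, j in sorted(cKDTree(xyz).query_pairs(r=chord)):
        if keep[i]:
            keep[j] = False
    return keep
```

Because the mask favours the lowest row index, the input must already be in priority order for the best measurement of each object to survive.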

Known issues

Some redshifts are duplicated even after the duplicate removal procedure above. This is more common for extended objects (such as large galaxies in nearby clusters). It happens because, although the measurements lie inside a 2 arcsecond radius, they differ by more than 0.002 in z, so they are not detected as duplicates and are not removed.

The HEASARC table descriptions are incomplete. There seems to be a limit to how many characters are returned, so the text is cut.

The final catalogue

Distribution of objects in the sky. Image also available in "Images" folder.

Number of objects per class. Image also available in "Images" folder.

Number of objects per flag. Image also available in "Images" folder.

Distribution of redshifts per class. Image also available in "Images" folder.

How to cite

If you use this compilation in your work, please use the following citation:

BibTeX:

@dataset{delima_specz_compilation,
  author       = {Erik Vinicius Rodrigues de Lima},
  title        = {{ErikVini/specz\_compilation: Southern Hemisphere 
                   Spectrocopic Redshift Compilation}},
  month        = jul,
  year         = 2024,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.12728524},
  url          = {https://doi.org/10.5281/zenodo.12728524}
}