This page is reserved for releases of a compilation of spectroscopic redshifts for the Southern Hemisphere (declination below +10 degrees), focused on galaxies.
The goal of this compilation is to provide a training sample for a machine-learning photometric redshift model (de Lima et al. 2022, arXiv link) to be used in the Southern-Photometric Local Universe Survey (SPLUS, Mendes de Oliveira et al. 2019, arXiv link).
This compilation contains 5000+ catalogues of spectroscopic redshifts from services such as VizieR, HEASARC, SDSS, and others. After removing duplicates, the number of catalogues in the final compilation is 1852 and the total number of objects is 8437460, including galaxies, stars, QSOs, and other object types. The catalogue name in the TAP services, work titles, number of objects, and authors are present in the files reference_catalogues_all.csv, external_catalogues_used.csv, and reference_catalogues_used.csv for all downloaded tables and the ones used in the final compilation (after removing duplicates) respectively.
The catalogue can be downloaded via Google Drive in the "Releases" section.
Compilation numbers:
- STAR: 4812282
- GALAXY: 2548065
- QSO: 507894
- AGN: 35192
- GLOBCLUSTER: 2245
- SUPERNOVAE: 1948
- UNCLEAR (described below): 529602
The columns available are:
- RA: right ascension (degrees),
- DEC: declination (degrees),
- z: spectroscopic redshift,
- e_z: error in the spectroscopic redshift,
- f_z: flag for the spectroscopic redshift quality,
- class_spec: spectroscopic classification of the object,
- original_class_spec: original spectroscopic classification of the object (before grouping),
- source: TAP service and catalogue from which the information was obtained.
Not all catalogues used contain information about the redshift error, quality flags, and/or classes. In these situations the value is left empty.
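As a minimal sketch of how the empty values could be handled when reading the compilation, the snippet below parses a small inline CSV sample that uses the column names listed above. The sample rows, the source strings, and the load_rows helper are illustrative assumptions and are not taken from the release; the real file name and format should be checked in the "Releases" section.

```python
import csv
import io
from collections import Counter

# Illustrative rows only; these are not objects from the real compilation.
SAMPLE = """RA,DEC,z,e_z,f_z,class_spec,original_class_spec,source
10.5,-30.2,0.0452,0.0001,KEEP(4),GALAXY,G,VizieR:example/table1
150.1,2.3,1.2030,,,QSO,QSO,HEASARC:example_table
200.7,-45.0,0.0001,,,,,VizieR:example/table2
"""

NUMERIC = ("RA", "DEC", "z", "e_z")


def load_rows(handle):
    """Read rows, mapping empty fields to None and numeric columns to float."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {key: (value if value != "" else None) for key, value in raw.items()}
        for key in NUMERIC:
            if row[key] is not None:
                row[key] = float(row[key])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE))
# Objects per class; rows without a class are counted under the key None.
class_counts = Counter(row["class_spec"] for row in rows)
# Number of objects without a redshift error.
missing_error = sum(row["e_z"] is None for row in rows)
```

Keeping missing entries as None (rather than 0 or an empty string) makes it harder to mistake an absent error or flag for a real measurement.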
A script written in Python downloads all catalogues from the VizieR and HEASARC Table Access Protocol (TAP) services. A series of queries first obtains the table names, the respective column names for coordinates, redshift, redshift error, and class, and their descriptions. This information is used to generate another table that contains everything needed to query for the objects (RA, DEC, z, e_z, f_z, class_spec).
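The two query stages described above could look roughly like the sketch below, which only builds ADQL strings. It assumes the standard TAP_SCHEMA.columns metadata table and the UCD word src.redshift; the helper names, the example table, and its column names are hypothetical, and the actual script may select columns differently. The resulting strings could then be submitted with any TAP client.

```python
def columns_query(ucd_word):
    """ADQL listing every column whose UCD contains the given word."""
    return (
        "SELECT table_name, column_name, description, ucd "
        "FROM TAP_SCHEMA.columns "
        f"WHERE ucd LIKE '%{ucd_word}%'"
    )


def objects_query(table, columns, max_dec=10.0):
    """ADQL selecting the unified columns from one table, below max_dec.

    `columns` maps the unified name (e.g. "RA") to the table's own column name.
    Identifiers are double-quoted because names such as DEC are reserved words.
    """
    select = ", ".join(
        f'"{original}" AS "{alias}"' for alias, original in columns.items()
    )
    return (
        f'SELECT {select} FROM "{table}" '
        f'WHERE "{columns["DEC"]}" < {max_dec}'
    )


# Stage 1: find candidate redshift columns (UCD word is an assumption).
redshift_columns = columns_query("src.redshift")

# Stage 2: query one hypothetical table using the column names found in stage 1.
example = objects_query(
    "J/example/table1",
    {"RA": "RAJ2000", "DEC": "DEJ2000", "z": "z"},
)
```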
Before downloading all possible tables, and since there are duplicates, a pre-processing step removes tables in which:
- the z column does not correspond to spectroscopic redshifts (for example, metallicities or distance above the plane of the galaxy);
- the e_z, f_z, or class columns do not correspond to the information needed.

Example: some tables have flags for magnitudes (e.g. in the z-band) that share the same unified content descriptor (UCD) as the spectroscopic redshift, so the table may be downloaded with the wrong information.
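How this check is implemented is not described here. One possible approach, sketched below under that caveat, is a keyword screen on the column descriptions returned by the TAP services. The keyword list, the helper name, and the example entries are all hypothetical, and a screen like this would still need manual review.

```python
# Hypothetical keywords suggesting that a "z" column is not a redshift.
NOT_REDSHIFT = ("metallicity", "height", "distance", "magnitude", "z-band", "z band")


def looks_like_redshift(description):
    """Return True when a column description hits no exclusion keyword."""
    text = description.lower()
    return not any(word in text for word in NOT_REDSHIFT)


# Illustrative metadata entries; table names and descriptions are made up.
candidates = [
    {"table": "example/a", "column": "z", "description": "Spectroscopic redshift"},
    {"table": "example/b", "column": "Z", "description": "Metallicity of the cluster"},
    {"table": "example/c", "column": "z", "description": "Distance above the Galactic plane"},
    {"table": "example/d", "column": "f_z", "description": "Flag on z-band magnitude"},
]

kept = [c["table"] for c in candidates if looks_like_redshift(c["description"])]
```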
This process is done for VizieR and HEASARC. In both cases any tables with zero objects are removed.
For VizieR, a correction for J1950 coordinates is applied. Also, any tables that represent distances as cz have their redshift and error values corrected.
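The cz correction amounts to dividing by the speed of light. The sketch below assumes that cz and its error are given in km/s, which may not hold for every table; the function name is illustrative. The J1950 coordinate correction is not covered here.

```python
C_KM_S = 299792.458  # speed of light in km/s


def cz_to_z(cz_km_s, e_cz_km_s=None):
    """Convert cz (and optionally its error) from km/s to a dimensionless redshift."""
    z = cz_km_s / C_KM_S
    e_z = None if e_cz_km_s is None else e_cz_km_s / C_KM_S
    return z, e_z


# Example: cz = 1500 km/s with a 30 km/s uncertainty.
z, e_z = cz_to_z(1500.0, 30.0)
```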
For tables that contain spectroscopic class information, a manual procedure was applied to group classes into STAR, GALAXY, QSO, AGN, GLOBCLUSTER, SUPERNOVAE, or UNCLEAR. When available, sub-classes are included. Some examples are:
- GALAXY
  - GALAXY(SF): Star-forming galaxies
  - GALAXY(GROUP): Galaxy group
  - GALAXY(RADIO): Radio galaxy
  - GALAXY(CLUSTER): Cluster of galaxies
  - GALAXY(PAIR): Pair of galaxies
  - GALAXY(COMP): Composite galaxy
  - GALAXY(UNCLEAR): Unclear galaxy (it is an extended object)
  - GALAXY(TRIPLET): Galaxy triplet
  - GALAXY(BCG): Brightest Cluster Galaxy
  - GALAXY(LINER): LINER galaxy
  - GALAXY(STRIPPING): Stripping galaxy
  - GALAXY(LSB): Low surface brightness galaxy
  - GALAXY(JELLYFISH): Jellyfish galaxy
- STAR
  - STAR(NEB): Nebulae
  - STAR(WD): White-dwarf star
  - STAR(HII): HII region
- SUPERNOVAE
- QSO
  - QSO(BLLAC): BL-Lac object
  - QSO(BLAZAR): Blazar
- AGN
  - AGN(Sy2): Seyfert 2 AGN
  - AGN(Sy1): Seyfert 1 AGN
  - AGN(Sy): AGN with unspecified Seyfert type
- GLOBCLUSTER
- UNCLEAR: This class represents classifications that were unclear, according to the catalogue authors, or that may not be correct
  - UNCLEAR(XRAY)
  - UNCLEAR(IR)
  - UNCLEAR(POINTLIKE)
  - UNCLEAR(HI)
  - UNCLEAR(EXTENDED)
  - UNCLEAR(RADIO)
  - UNCLEAR(BROADLINE)
  - UNCLEAR(ASTEROID)
  - UNCLEAR(HII)
  - UNCLEAR(EmLS)
  - UNCLEAR(Sy2)
The UNCLEAR class is reserved for objects whose classification was not clear enough to be included in the other six groups. Some other details on the classification are:
- Classes ending with (SIMBAD): these objects did not contain spectroscopic class information until the 'reordering by missing information' step. They were crossmatched with the SIMBAD database, and the SIMBAD class is adopted if there is a match.
- Classes ending with (FULL): these objects received a classification from the entire catalogue (the source). For example, if a catalogue is named "Spectroscopic redshifts for galaxies" and there is no spectroscopic class information, its objects are classified as GALAXY (FULL).
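When working with the class_spec column, it can help to split a label into its main group, sub-class, and the (SIMBAD) or (FULL) marker. The parser below is a sketch that assumes labels follow the patterns shown above, with qualifiers in parentheses and optional whitespace; the exact label format in the released files has not been verified here, and the function name is illustrative.

```python
import re

# Markers that describe where a class came from rather than a sub-class.
PROVENANCE = {"SIMBAD", "FULL"}


def parse_class(label):
    """Split a class label into (group, sub_class, provenance)."""
    group = re.match(r"\s*([A-Za-z]+)", label)
    if group is None:
        return None, None, None
    sub_class, provenance = None, None
    # Every parenthesised qualifier is either a provenance marker or a sub-class.
    for qualifier in re.findall(r"\(([^)]*)\)", label):
        cleaned = qualifier.strip()
        if cleaned.upper() in PROVENANCE:
            provenance = cleaned.upper()
        else:
            sub_class = cleaned
    return group.group(1).upper(), sub_class, provenance


examples = {
    label: parse_class(label)
    for label in ("GALAXY(SF)", "GALAXY (FULL)", "AGN(Sy2)", "UNCLEAR")
}
```

Selecting on the parsed group rather than on the full label makes it straightforward to keep, for instance, all GALAXY objects regardless of sub-class.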
Be aware that this classification may change, and an update to class names to align them with the SIMBAD object types is underway.
For tables with quality flag information, a manual verification was made to classify flags as KEEP or REMOVE. The KEEP flag indicates that the spectroscopic redshift is reliable according to the authors of the catalogue, and the REMOVE flag indicates that the given measurement is not reliable. The original flag is kept within parentheses after the word KEEP or REMOVE.
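A small sketch of how the f_z values could be split back into the decision and the original flag is given below. It assumes the form KEEP(original) or REMOVE(original), with optional whitespace before the parenthesis; the exact formatting in the released files and the example flag values are assumptions.

```python
import re

FLAG_PATTERN = re.compile(r"^\s*(KEEP|REMOVE)\s*(?:\((.*)\))?\s*$")


def parse_flag(f_z):
    """Return (decision, original_flag); both are None when the flag is empty."""
    if not f_z:
        return None, None
    match = FLAG_PATTERN.match(f_z)
    if match is None:
        # Unexpected format: keep the raw text so it can be inspected later.
        return None, f_z
    return match.group(1), match.group(2)


def is_reliable(f_z):
    """True only for redshifts explicitly marked as KEEP."""
    return parse_flag(f_z)[0] == "KEEP"
```

Since many catalogues provide no flag at all, filtering with is_reliable would also discard unflagged objects; whether that is desirable depends on the application.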
After all VizieR and HEASARC catalogues were downloaded and processed, they were concatenated with the SDSS DR18, PRIMUS, NED, HECATE, 2dFLenS, GLADE+, DESI, and the Maddox et al. spectroscopic redshifts for the Fornax Cluster (Maddox, N. et al. 2019) catalogues.
After concatenating all catalogues, the resulting catalogue inevitably has duplicate objects. To try to remove them, the Python package SciPy was used.
Before removing duplicates, the table was sorted in order to keep the objects with the most information at the top. The catalogue is then composed of blocks:
- objects with e_z, f_z, and class_spec,
- objects with e_z and class_spec,
- objects with e_z and f_z,
- objects with f_z and class_spec,
- objects with e_z,
- objects with class_spec,
- objects with f_z, and
- objects without e_z, f_z, or class_spec.
Inside each block, priority is given to the object types in the order GALAXY, AGN, SUPERNOVAE, QSO, STAR, GLOBCLUSTER, UNCLEAR, and the remaining objects, and to the spectroscopic redshift flag (KEEP, REMOVE). This way the duplicate removal procedure tries to keep the most galaxies and gives priority to objects with quality flag information.
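The ordering described above can be expressed as a sort key. The sketch below is one interpretation: it assumes that, within a block, the class priority is applied before the flag priority, which the text does not state explicitly. The rows and the name field are illustrative.

```python
# Block position for each (has e_z, has f_z, has class_spec) combination.
BLOCK_RANK = {
    (True, True, True): 0,
    (True, False, True): 1,
    (True, True, False): 2,
    (False, True, True): 3,
    (True, False, False): 4,
    (False, False, True): 5,
    (False, True, False): 6,
    (False, False, False): 7,
}
CLASS_ORDER = ["GALAXY", "AGN", "SUPERNOVAE", "QSO", "STAR", "GLOBCLUSTER", "UNCLEAR"]
FLAG_ORDER = ["KEEP", "REMOVE"]


def _rank(value, order):
    """Index of the first entry in `order` that prefixes `value`, else len(order)."""
    if value:
        for position, name in enumerate(order):
            if value.startswith(name):
                return position
    return len(order)


def sort_key(row):
    """Sort by block, then class priority, then flag priority (assumed order)."""
    present = tuple(
        row.get(column) not in (None, "") for column in ("e_z", "f_z", "class_spec")
    )
    return (
        BLOCK_RANK[present],
        _rank(row.get("class_spec"), CLASS_ORDER),
        _rank(row.get("f_z"), FLAG_ORDER),
    )


# Illustrative rows only.
rows = [
    {"name": "a", "e_z": None, "f_z": None, "class_spec": "GALAXY"},
    {"name": "b", "e_z": 0.001, "f_z": "REMOVE(1)", "class_spec": "STAR"},
    {"name": "c", "e_z": 0.001, "f_z": "KEEP(4)", "class_spec": "GALAXY(SF)"},
    {"name": "d", "e_z": 0.002, "f_z": None, "class_spec": "QSO"},
]
ordered = [row["name"] for row in sorted(rows, key=sort_key)]
```

Because Python's sort is stable, rows with identical keys keep their original relative order.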
An internal match is done using the sky coordinates with a 1 arcsecond maximum separation, keeping only the first occurrence (thus the importance of the sorting procedure above).
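The idea of keeping the first occurrence can be illustrated with the pure-Python sketch below. It is a stand-in that uses only the standard library and compares every pair of objects, so it is suitable for small examples only; for millions of objects a spatial index (for example a KD-tree such as scipy.spatial.cKDTree) would be needed. Which SciPy routine the compilation actually uses is not stated here, no redshift criterion is applied in this sketch, and the function names and example coordinates are illustrative.

```python
import math


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arcseconds (haversine formula, inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600


def drop_duplicates(rows, max_sep_arcsec=1.0):
    """Keep a row only if no previously kept row lies within max_sep_arcsec.

    `rows` must already be sorted so that the preferred entry comes first.
    """
    kept = []
    for row in rows:
        if all(
            separation_arcsec(row["RA"], row["DEC"], other["RA"], other["DEC"])
            > max_sep_arcsec
            for other in kept
        ):
            kept.append(row)
    return kept


# Illustrative positions: the second row is 0.5 arcsec from the first,
# the third is 5 arcsec away.
rows = [
    {"id": 1, "RA": 10.0, "DEC": -30.0},
    {"id": 2, "RA": 10.0, "DEC": -30.0 + 0.5 / 3600},
    {"id": 3, "RA": 10.0, "DEC": -30.0 + 5.0 / 3600},
]
unique_ids = [row["id"] for row in drop_duplicates(rows)]
```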
Some redshifts are duplicated even after the previous duplicate removal procedure. This is more common for extended objects (such as big galaxies in nearby clusters). This happens because, although the measurements lie inside a 2 arcsecond radius, they differ by more than 0.002 in z, thus they are not detected as duplicated and are not removed.
The HEASARC table descriptions are incomplete. There seems to be a limit to how many characters are returned, so the text is cut.
If you use this compilation in your work, please use the following citation:
BibTeX:
@dataset{delima_specz_compilation,
author = {Erik Vinicius Rodrigues de Lima},
title = {{ErikVini/specz\_compilation: Southern Hemisphere
Spectrocopic Redshift Compilation}},
month = jul,
year = 2024,
publisher = {Zenodo},
doi = {10.5281/zenodo.12728524},
url = {https://doi.org/10.5281/zenodo.12728524}
}