In [16]:
__author__ = 'Stephanie Juneau <stephanie.juneau@noirlab.edu>, Felix Pat <felixpat10@email.arizona.edu>'
__version__ = '20240606' #yyyymmdd
__datasets__ = ['gogreen_dr1']
__keywords__ = ['gemini llp','file service','spectra','catalogues']

# Listing files in the GOGREEN DR1 Dataset
*Authors: Stephanie Juneau (NOIRLab), Felix Pat (Univ. of Arizona), and the Astro Data Lab Team*

### Table of contents
* [Goal](#LFgoal)
* [Summary](#LFsummary)
* [Disclaimer & attribution](#attribution)
* [Imports](#LFimport)
* [Reading in the Clusters table from the Data Lab database](#LF0)
* [Pathway to 1D, 2D, and image directories](#LF1)
* [1D spectra files](#LF2)
* [2D spectra files](#LF3)
* [Image files](#LF4)
* [References](#LF5)

<a class="anchor" id="LFgoal"></a>
# Goal
This notebook uses the file service to list the files available in the [GOGREEN DR1 dataset](https://datalab.noirlab.edu/gogreendr1/), which includes data from [GOGREEN](https://ui.adsabs.harvard.edu/abs/2017MNRAS.470.4168B/abstract) and [GCLASS](https://ui.adsabs.harvard.edu/abs/2012ApJ...746..188M/abstract).

<a class="anchor" id="LFsummary"></a>
# Summary
This notebook lists the files available in Data Lab for the [GOGREEN DR1 dataset](https://ui.adsabs.harvard.edu/abs/2021MNRAS.500..358B/abstract) in the 1D spectra, 2D spectra, and image directories. After listing each cluster name and whether its corresponding file is available, one can use the lists to retrieve data for clusters and/or galaxies of interest. This notebook gives a tour of the data files as a starting point, while other GOGREEN notebooks demonstrate various capabilities for data access and analysis.

<a class="anchor" id="attribution"></a>
# Disclaimer & attribution

Disclaimers
-----------
Note that using the Astro Data Lab constitutes your agreement with our minimal [Disclaimers](https://datalab.noirlab.edu/disclaimers.php).

Acknowledgments
---------------
If you use **Astro Data Lab** in your published research, please include the following text in your paper's Acknowledgments section:

_This research uses services or data provided by the Astro Data Lab, which is part of the Community Science and Data Center (CSDC) Program of NSF NOIRLab. NOIRLab is operated by the Association of Universities for Research in Astronomy (AURA), Inc. under a cooperative agreement with the U.S. National Science Foundation._

If you use **SPARCL jointly with the Astro Data Lab platform** (via JupyterLab, command-line, or web interface) in your published research, please include this text below in your paper's Acknowledgments section:

_This research uses services or data provided by the SPectra Analysis and Retrievable Catalog Lab (SPARCL) and the Astro Data Lab, which are both part of the Community Science and Data Center (CSDC) Program of NSF NOIRLab. NOIRLab is operated by the Association of Universities for Research in Astronomy (AURA), Inc. under a cooperative agreement with the U.S. National Science Foundation._

In either case **please cite the following papers**:

* Data Lab concept paper: Fitzpatrick et al., "The NOAO Data Laboratory: a conceptual overview", SPIE, 9149, 2014, https://doi.org/10.1117/12.2057445

* Astro Data Lab overview: Nikutta et al., "Data Lab - A Community Science Platform", Astronomy and Computing, 33, 2020, https://doi.org/10.1016/j.ascom.2020.100411

If you are referring to the Data Lab JupyterLab / Jupyter Notebooks, cite:

* Juneau et al., "Jupyter-Enabled Astrophysical Analysis Using Data-Proximate Computing Platforms", CiSE, 23, 15, 2021, https://doi.org/10.1109/MCSE.2021.3057097

If publishing in an AAS journal, also add the keyword: `\facility{Astro Data Lab}`

And if you are using SPARCL, please also add `\software{SPARCL}` and cite:

* Juneau et al., "SPARCL: SPectra Analysis and Retrievable Catalog Lab", Conference Proceedings for ADASS XXXIII, 2024
https://doi.org/10.48550/arXiv.2401.05576

The NOIRLab Library maintains [lists of proper acknowledgments](https://noirlab.edu/science/about/scientific-acknowledgments) to use when publishing papers using the Lab's facilities, data, or services.

<a class="anchor" id="LFimport"></a>
# Imports

In [2]:
# std lib
import textwrap
wrapper = textwrap.TextWrapper(width=200)

# Data Lab
from dl import queryClient as qc, storeClient as sc

<a class="anchor" id="LF0"></a>
# Read in Clusters table from the [gogreen_dr1 database](https://datalab.noirlab.edu/query.php?name=gogreen_dr1.clusters)

In [3]:
clusters = qc.query('select * from gogreen_dr1.clusters', fmt='pandas')

In [4]:
clusters.columns

Index(['cluster', 'fullname', 'cluster_id', 'ra_best', 'dec_best', 'ra_gmos',
       'dec_gmos', 'pa_deg', 'redshift', 'vdisp', 'vdisp_err', 'gogreen_m1',
       'gogreen_m2', 'gogreen_m3', 'gogreen_m4', 'gogreen_m5', 'gogreen_m6',
       'gclass_m1', 'gclass_m2', 'gclass_m3', 'gclass_m4', 'gclass_m5',
       'kphot_cat', 'photoz_cat', 'stelmass_cat', 'image_u', 'image_b',
       'image_g', 'image_v', 'image_r', 'image_i', 'image_z', 'image_j',
       'image_j1', 'image_y', 'image_k', 'image_irac1', 'preimage',
       'random_id'],
      dtype='object')

In [5]:
# List the cluster names, and the first mask from GOGREEN and from GCLASS to check
# which clusters don't have any GOGREEN data (gogreen_m1 = NaN)
clusters[['cluster','gogreen_m1','gclass_m1']]

| | cluster | gogreen_m1 | gclass_m1 |
|---|---|---|---|
| 0 | COSMOS-125 | GS2015ALP001-02 | |
| 1 | COSMOS-221 | GS2014BLP001-05 | |
| 2 | COSMOS-28 | GN2015BLP004-03 | |
| 3 | COSMOS-63 | GN2015BLP004-02 | |
| 4 | SPT0205 | GS2014BLP001-06 | |
| 5 | SPT0546 | GS2014BLP001-09 | |
| 6 | SPT2106 | GS2018ALP001-01 | |
| 7 | SXDF49 | GN2015BLP004-01 | |
| 8 | SXDF64 | GS2014BLP001-08 | |
| 9 | SXDF76 | GS2014BLP001-02 | |


In [6]:
print('Total number of galaxy clusters: ', len(clusters))
cluster = clusters.cluster

cluster

Total number of galaxy clusters:  26


0     COSMOS-125
1     COSMOS-221
2      COSMOS-28
3      COSMOS-63
4        SPT0205
5        SPT0546
6        SPT2106
7         SXDF49
8         SXDF64
9         SXDF76
10       SXDF76b
11        SXDF87
12    SpARCS0035
13    SpARCS0219
14    SpARCS0335
15    SpARCS1033
16    SpARCS1034
17    SpARCS1051
18    SpARCS1616
19    SpARCS1634
20    SpARCS1638
21    SpARCS0034
22    SpARCS0036
23    SpARCS0215
24    SpARCS1047
25    SpARCS1613
Name: cluster, dtype: object

<a class="anchor" id="LF1"></a>
# Location of files in the file service

In [7]:
oneddir = 'gogreen_dr1://SPECTROSCOPY/OneD/'  # 1-d spectra
twoddir = 'gogreen_dr1://SPECTROSCOPY/TwoD/'  # 2-d spectra
imdir = 'gogreen_dr1://PHOTOMETRY/IMAGES/'    # photometry and images

# build the full file path for each cluster (element-wise on the pandas Series of cluster names)
onedfiles = oneddir + cluster + '_final.fits'
twodfiles = twoddir + cluster + '_twod.fits.gz'

<a class="anchor" id="LF2"></a>
# One-D spectra
The `storeClient` service (imported as `sc`) is called here to list the file names. For more uses and information, refer to the [How-to-use-the-StoreClient](https://github.com/astro-datalab/notebooks-latest/blob/master/04_HowTos/StoreClient/How_to_use_the_Data_Lab_StoreClient.ipynb) notebook.

In [8]:
print(sc.ls(oneddir,format='long'))

-rw-rw-r-x  gogreen_dr1  2793600  13 Aug 2020 17:54  COSMOS-125_final.fits
-rw-rw-r-x  gogreen_dr1  3438720  13 Aug 2020 17:54  COSMOS-221_final.fits
-rw-rw-r-x  gogreen_dr1  2833920  13 Aug 2020 17:54  COSMOS-28_final.fits
-rw-rw-r-x  gogreen_dr1  1379520  13 Aug 2020 17:54  COSMOS-63_final.fits
-rw-rw-r-x  gogreen_dr1     174  13 Aug 2020 17:54  README
-rw-rw-r-x  gogreen_dr1  3720960  13 Aug 2020 17:54  SPT0205_final.fits
-rw-rw-r-x  gogreen_dr1  4728960  13 Aug 2020 17:54  SPT0546_final.fits
-rw-rw-r-x  gogreen_dr1  3358080  13 Aug 2020 17:54  SPT2106_final.fits
-rw-rw-r-x  gogreen_dr1  5011200  13 Aug 2020 17:54  SXDF49_final.fits
-rw-rw-r-x  gogreen_dr1  1137600  13 Aug 2020 17:54  SXDF64_final.fits
-rw-rw-r-x  gogreen_dr1  4245120  13 Aug 2020 17:54  SXDF76_final.fits
-rw-rw-r-x  gogreen_dr1  4783680  13 Aug 2020 17:54  SpARCS0034_final.fits
-rw-rw-r-x  gogreen_dr1  5451840  13 Aug 2020 17:54  SpARCS0035_final.fits
-rw-rw-r-x  gogreen_dr1  4262400  13 Aug 2020 17:54  SpARCS0036_
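
The long-format listing above is a single string with one entry per line, so it can be turned into a list of names and sizes with plain string handling. The sketch below is illustrative only: `parse_long_listing` is a hypothetical helper (not part of the Data Lab client), and it assumes the eight whitespace-separated columns seen above (permissions, owner, size, day, month, year, time, name) and file names without spaces.

```python
def parse_long_listing(listing):
    """Parse a long-format listing string into (name, size_in_bytes, is_dir) tuples.

    Hypothetical helper: assumes the column layout shown in the listing above.
    """
    entries = []
    for line in listing.splitlines():
        fields = line.split()
        # Expected fields: permissions, owner, size, day, month, year, time, name
        if len(fields) != 8 or not fields[2].isdigit():
            continue  # skip blank, truncated, or unexpected lines
        entries.append((fields[7], int(fields[2]), fields[0].startswith('d')))
    return entries


# Sample lines copied from the listing above. In the notebook, one would pass
# the string returned by sc.ls(oneddir, format='long') instead.
sample = """\
-rw-rw-r-x  gogreen_dr1  2793600  13 Aug 2020 17:54  COSMOS-125_final.fits
-rw-rw-r-x  gogreen_dr1     174  13 Aug 2020 17:54  README
-rw-rw-r-x  gogreen_dr1  3720960  13 Aug 2020 17:54  SPT0205_final.fits
"""

# Keep only the FITS files and add up their sizes in MB
fits_files = [(name, size) for name, size, is_dir in parse_long_listing(sample)
              if name.endswith('.fits')]
total_mb = sum(size for _, size in fits_files) / 1e6
print(fits_files)
print(total_mb)
```

Lines that do not match the expected layout are skipped rather than guessed at, which also protects against partially printed entries.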

In [9]:
print(wrapper.fill(text=sc.get(oneddir+'README')))

This directory contains final 1D spectra for every system in GOGREEN and GCLASS.  These have been absolute flux calibrated where possible, using Lyndsay's v1.1 calibrations.


### Note:
From the list above, there are 24 FITS files for 26 clusters. This is expected because two pairs of clusters are in the same field and therefore share a file (see footnote to Table 1 in the GOGREEN DR1 paper):
- `SXDF49` and `SXDF87` share a single GMOS field. The spectra for both are included in the `SXDF49` FITS file.
- `SXDF76` and `SXDF76b` share a single GMOS field. The spectra for both are included in the `SXDF76` FITS file.

Below, we will verify this by printing the names of the 1D spectra files.

In [10]:
for name, path in zip(cluster, onedfiles):
    # sc.stat() returns an empty dict when the file does not exist
    print("%-10s " % name, end='')
    print(sc.stat(path) != {})

COSMOS-125 True
COSMOS-221 True
COSMOS-28  True
COSMOS-63  True
SPT0205    True
SPT0546    True
SPT2106    True
SXDF49     True
SXDF64     True
SXDF76     True
SXDF76b    False
SXDF87     False
SpARCS0035 True
SpARCS0219 True
SpARCS0335 True
SpARCS1033 True
SpARCS1034 True
SpARCS1051 True
SpARCS1616 True
SpARCS1634 True
SpARCS1638 True
SpARCS0034 True
SpARCS0036 True
SpARCS0215 True
SpARCS1047 True
SpARCS1613 True


### Note:
As expected, we find that `SXDF76b` and `SXDF87` are not listed with separate one-D spectra files.
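
To retrieve the 1D spectra for one of these two clusters, the cluster name first has to be mapped to the file that actually holds its spectra. The sketch below shows one way to do that. The names `SHARED_FIELD` and `oned_file_for` are hypothetical and introduced only for illustration; the pairing itself comes from the note on shared GMOS fields above.

```python
ONED_DIR = 'gogreen_dr1://SPECTROSCOPY/OneD/'

# Clusters whose spectra are stored in the file of another cluster
# in the same GMOS field (hypothetical lookup table for illustration)
SHARED_FIELD = {'SXDF87': 'SXDF49', 'SXDF76b': 'SXDF76'}


def oned_file_for(cluster_name):
    """Return the 1D spectra file expected to hold spectra for this cluster."""
    host = SHARED_FIELD.get(cluster_name, cluster_name)
    return ONED_DIR + host + '_final.fits'


for name in ['SXDF49', 'SXDF87', 'SXDF76b', 'SPT0205']:
    print("%-10s %s" % (name, oned_file_for(name)))
```

The returned path is only a candidate: checking it with `sc.stat` before downloading remains a sensible precaution.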

<a class="anchor" id="LF3"></a>
# Two-D spectra

In [11]:
print(sc.ls(twoddir,format='long'))

-rw-rw-r-x  gogreen_dr1  8714082  13 Aug 2020 17:54  COSMOS-125_twod.fits.gz
-rw-rw-r-x  gogreen_dr1  10620431  13 Aug 2020 17:54  COSMOS-221_twod.fits.gz
-rw-rw-r-x  gogreen_dr1  14009121  13 Aug 2020 17:54  COSMOS-28_twod.fits.gz
-rw-rw-r-x  gogreen_dr1  4838575  13 Aug 2020 17:54  COSMOS-63_twod.fits.gz
-rw-rw-r-x  gogreen_dr1     340  13 Aug 2020 17:54  README
-rw-rw-r-x  gogreen_dr1  11848796  13 Aug 2020 17:54  SPT0205_twod.fits.gz
-rw-rw-r-x  gogreen_dr1  15034157  13 Aug 2020 17:54  SPT0546_twod.fits.gz
-rw-rw-r-x  gogreen_dr1  10916915  13 Aug 2020 17:54  SPT2106_twod.fits.gz
-rw-rw-r-x  gogreen_dr1  27405946  13 Aug 2020 17:54  SXDF49_twod.fits.gz
-rw-rw-r-x  gogreen_dr1  3710772  13 Aug 2020 17:54  SXDF64_twod.fits.gz
-rw-rw-r-x  gogreen_dr1  13944087  13 Aug 2020 17:54  SXDF76_twod.fits.gz
-rw-rw-r-x  gogreen_dr1  11352764  13 Aug 2020 17:54  SpARCS0035_twod.fits.gz
-rw-rw-r-x  gogreen_dr1  10195223  13 Aug 2020 17:54  SpARCS0219_twod.fits.gz
-rw-rw-r-x  gogreen_dr1  108790

In [12]:
print(wrapper.fill(text=sc.get(twoddir+'README')))

Each file contains 2D spectra for GOGREEN.  The spatial axis is in pixels, which are 0.16".  No relative or absolute flux calibration has been applied to these spectra; the pixel units are in detector
counts.  Note the dimension of these MEF files can be different from the corresponding 1D files because the latter include GCLASS spectra.


In [13]:
for name, path in zip(cluster, twodfiles):
    # sc.stat() returns an empty dict when the file does not exist
    print("%-10s " % name, end='')
    print(sc.stat(path) != {})

COSMOS-125 True
COSMOS-221 True
COSMOS-28  True
COSMOS-63  True
SPT0205    True
SPT0546    True
SPT2106    True
SXDF49     True
SXDF64     True
SXDF76     True
SXDF76b    False
SXDF87     False
SpARCS0035 True
SpARCS0219 True
SpARCS0335 True
SpARCS1033 True
SpARCS1034 True
SpARCS1051 True
SpARCS1616 True
SpARCS1634 True
SpARCS1638 True
SpARCS0034 False
SpARCS0036 False
SpARCS0215 False
SpARCS1047 False
SpARCS1613 False


### Note:
The last five SpARCS clusters lack a GOGREEN mask, and therefore lack a 2D spectra file (as expected). In addition, the two pairs of clusters sharing a GMOS field are grouped together, as we saw for the 1D spectra above (`SXDF76b` together with `SXDF76`; `SXDF87` together with `SXDF49`).
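
The two reasons for a `False` entry can be written down as a small lookup, which may help when deciding where to look for a given cluster. This is a sketch under stated assumptions: `twod_status`, `NO_GOGREEN_MASK`, and `SHARED_FIELD` are hypothetical names, and the set of clusters without a GOGREEN mask is typed in by hand from the output above. In the notebook it could presumably be derived from the rows of `clusters` where `gogreen_m1` is null.

```python
# Clusters reported above as lacking a GOGREEN mask (typed in by hand)
NO_GOGREEN_MASK = {'SpARCS0034', 'SpARCS0036', 'SpARCS0215',
                   'SpARCS1047', 'SpARCS1613'}

# Clusters whose spectra are stored with another cluster in the same GMOS field
SHARED_FIELD = {'SXDF87': 'SXDF49', 'SXDF76b': 'SXDF76'}


def twod_status(cluster_name):
    """Describe where the 2D spectra of a cluster are expected to be, if anywhere."""
    if cluster_name in NO_GOGREEN_MASK:
        return 'no 2D file (no GOGREEN mask)'
    host = SHARED_FIELD.get(cluster_name, cluster_name)
    return 'in ' + host + '_twod.fits.gz'


for name in ['SXDF64', 'SXDF87', 'SpARCS1613']:
    print("%-10s %s" % (name, twod_status(name)))
```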

<a class="anchor" id="LF4"></a>
# Images

We now list the folders in the image directory.

In [14]:
print(sc.ls(imdir, format='long'))

drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  HST/
-rw-rw-r-x  gogreen_dr1    5058  13 Aug 2020 17:54  MAGZPs_cal.list
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  MANMASKS/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  Preimages/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPTCL-0205/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPTCL-0546/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPTCL-2106/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SpARCS-0034/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SpARCS-0035/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SpARCS-0036/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SpARCS-0215/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SpARCS-0219/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SpARCS-0335/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SpARCS-1034/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SpARCS-1047/
drwxrwxr-x  gogreen_dr1       0  13 Au

### Note:
Above, we see that the nomenclature is different from the cluster naming from the table. Namely:
- `SPTxxxx` are named `SPTCL-xxxx`
- `SpARCSxxxx` are named `SpARCS-xxxx`

There are 16 out of 26 clusters with an imaging folder. In the case of cluster `SpARCS1033`, the K-band imaging was not available at the time of the first release. It will, however, become available in the future as part of GOGREEN DR2.
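
The two renaming rules can be expressed as a short function that converts a name from the clusters table into a candidate image folder name. This is a minimal sketch: `image_folder_name` is a hypothetical helper, it covers only the `SPT` and `SpARCS` patterns described in the note, and it returns `None` for any other name. A returned name is not a guarantee that the folder exists.

```python
import re


def image_folder_name(cluster_name):
    """Map a clusters-table name to a candidate folder name in the image directory.

    Hypothetical helper covering only the SPTxxxx and SpARCSxxxx patterns.
    """
    match = re.fullmatch(r'(SPT|SpARCS)(\d{4})', cluster_name)
    if match is None:
        return None  # naming rule not described for this cluster
    prefix, digits = match.groups()
    if prefix == 'SPT':
        prefix = 'SPTCL'
    return prefix + '-' + digits + '/'


for name in ['SPT0205', 'SpARCS0034', 'COSMOS-125']:
    print("%-10s %s" % (name, image_folder_name(name)))
```

In the notebook, the result could be appended to `imdir` and passed to `sc.ls` to check whether the folder is actually present.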

In addition, there is an `HST/` folder for Hubble Space Telescope imaging. Let's examine its content next.

In [15]:
print(sc.ls(imdir+'HST/', format='long'))

-rw-rw-r-x  gogreen_dr1    3325  13 Aug 2020 17:54  README
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  RGB_images/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPARCS0034/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPARCS0035/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPARCS0036/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPARCS0215/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPARCS0219/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPARCS0335/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPARCS1033/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPARCS1034/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPARCS1047/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPARCS1051/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPARCS1613/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPARCS1616/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 17:54  SPARCS1634/
drwxrwxr-x  gogreen_dr1       0  13 Aug 2020 

### Note:
Above, we see that the nomenclature is different from the previous folder, and also different from the cluster naming from the table in one case. Namely:
- `SPTxxxx` are named the same as in the clusters table;
- `SpARCSxxxx` are named `SPARCSxxxx` (all upper case letters)

There are 17 out of 26 clusters with an `HST/` imaging folder.
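
For the `HST/` folder, both rules in the note reduce to upper-casing the name from the clusters table. The sketch below builds a candidate path on that basis. `hst_folder_path` is a hypothetical helper, it handles only the `SPT` and `SpARCS` families mentioned in the note, and the path it returns should still be checked for existence.

```python
IMDIR = 'gogreen_dr1://PHOTOMETRY/IMAGES/'


def hst_folder_path(cluster_name):
    """Return a candidate HST folder path for an SPT or SpARCS cluster, else None."""
    if not cluster_name.startswith(('SPT', 'SpARCS')):
        return None  # naming rule not described for this cluster
    # SPTxxxx is unchanged by upper(); SpARCSxxxx becomes SPARCSxxxx
    return IMDIR + 'HST/' + cluster_name.upper() + '/'


for name in ['SpARCS1051', 'SPT2106', 'SXDF64']:
    print("%-10s %s" % (name, hst_folder_path(name)))
```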

<a class="anchor" id="LF5"></a>
# References

#### GOGREEN Notebooks at the Data Lab
- [GOGREEN Data Release 1 data access at Astro Data Lab](https://github.com/astro-datalab/notebooks-latest/blob/master/03_ScienceExamples/GOGREEN_GalaxiesInRichEnvironments/1_GOGREENDr1DataAccessAtDataLab.ipynb)
- [GOGREEN DR1 at Data Lab - Simple Image Access (SIA)](https://github.com/astro-datalab/notebooks-latest/blob/master/03_ScienceExamples/GOGREEN_GalaxiesInRichEnvironments/2_GOGREENDr1SIA.ipynb)

#### GOGREEN & GCLASS DR1 Paper
- [Balogh et al. 2021, MNRAS, 500, 358](https://ui.adsabs.harvard.edu/abs/2021MNRAS.500..358B/abstract)