# The Sarah demo

This notebook walks through the Sarah demo. It demonstrates the use of the Data Lab as a science exploration framework, employing its remote storage and query functionality. Two modes of use are presented: the datalab command-line tool and the Python interface.

The use case for this demo is that Sarah, an astronomer, is interested in late-type galaxies (LTGs) and is wondering whether DECam-reduced images in the Y1 DES release have sufficient resolution and depth. She decides to perform an exploratory analysis on a sample of known LTGs in the Stripe 82 field. This will inform her whether it is worth carrying out a more detailed study using the entire DES release.

The total amount of data that she will work with is about 200 GB of images. This is at the limit of what is tractable with current networks: downloading the data alone would take about 15 hours. With the Data Lab, the whole exercise can be completed in about 15 minutes.
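As a rough check of the figures above, the implied network rate and the resulting speed-up can be worked out directly. This is a back-of-the-envelope sketch that assumes decimal units (1 GB = 10^9 bytes) and a steady transfer rate; neither assumption is stated in the demo itself.

```python
# Back-of-the-envelope check of the download estimate quoted in the text.
data_volume_gb = 200      # from the text: about 200 GB of images
download_hours = 15       # from the text: about 15 hours to download
remote_minutes = 15       # from the text: about 15 minutes with the Data Lab

# Assumption: decimal units (1 GB = 1e9 bytes) and a steady transfer rate.
bytes_total = data_volume_gb * 1e9
seconds = download_hours * 3600
rate_mbit_s = bytes_total * 8 / seconds / 1e6

# Ratio of the download time to the time needed when working remotely.
speedup = (download_hours * 60) / remote_minutes

print(f"implied sustained rate: {rate_mbit_s:.1f} Mbit/s")
print(f"speed-up from working remotely: {speedup:.0f}x")
```

Under these assumptions the 15-hour figure corresponds to a sustained rate of roughly 30 Mbit/s.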

## Using the datalab command

### Log in to the Data Lab

First, Sarah logs in to the Data Lab and mounts her virtual storage on her local machine.

In [None]:
!datalab login --user=dldemo --password=dldemo --mount=/tmp/vospace

She creates working directories in her virtual storage space: one as a workspace (/ltg), one to hold tables (/dbs) and one to hold images (/img). She also enables capabilities on two of the directories: one (tableingester) to automatically ingest any tabular file placed in the directory into her personal Data Lab database (referred to as MyDB); and one (downloader) to automatically retrieve any images referred to in any VOTable placed in the directory (via any accessURL columns). Her Data Lab environment is now ready for her to start her exploratory work.

In [None]:
# %cd (rather than !cd) so that the directory change persists for the commands below
%cd /tmp/vospace
!mkdir ltg
!mkdir dbs
!mkdir img
!datalab addcapability --dir=dbs --cap=tableingester --fmt=votable,fits,csv
!datalab addcapability --dir=img --cap=downloader 

### Query VO for relevant galaxy sample and save to remote storage

Sarah remembers a paper that contains a list of LTGs. She retrieves its catalog from VizieR and saves it to the /dbs directory, where it is parsed and a database table with its contents is created in her MyDB.

In [None]:
!datalab query --uri=ivo://CDS.VizieR/J/MNRAS/406/382/catalog \
    --out=vos://dbs/sample.vot --addArgs="0.0 0.0 180."

### Extract LTGs from galaxy sample in remote storage

She then extracts the positions of the LTGs in this table with a simple query and saves them in a CSV file in the workspace directory (/ltg).

In [None]:
!datalab query --ofmt=csv --out=vos://ltg/ltg.csv \
--adql="select SDSS,_raj2000,_dej2000 from mydb://sample_vot where ETG=6"  

### Query DataLab SIA for reduced DECam images and save to remote storage

Next, she uses the CSV file as input to the Data Lab image access (SIA) service and saves the results in the /img directory. The output from the service is a VOTable with an accessURL column linking to the Data Lab image cutout service. As the table is placed in the /img directory with an active downloader capability, thumbnail cutouts are automatically retrieved for each row. 

In [None]:
!datalab siaquery --input=/tmp/vospace/ltg/ltg.csv \
  --out=vos://img/img.vot --search=0.5
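The commands above refer to the same file in two ways: as a `vos://` URI (`vos://ltg/ltg.csv`) and as a path under the local mount (`/tmp/vospace/ltg/ltg.csv`). The following sketch makes that correspondence explicit. `vos_to_local` is a hypothetical helper, not part of the Data Lab client, and it assumes the mount mirrors the `vos://` namespace one-to-one, which the commands in this demo suggest but do not state.

```python
from pathlib import PurePosixPath

MOUNT_POINT = "/tmp/vospace"   # mount point passed to `datalab login` in this demo


def vos_to_local(uri, mount_point=MOUNT_POINT):
    """Map a vos:// URI to the matching path under the local mount.

    Hypothetical helper; assumes a one-to-one mapping between the
    vos:// namespace and the mounted directory tree.
    """
    prefix = "vos://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a vos:// URI: {uri!r}")
    # Strip the scheme and any extra leading slash so the join stays relative.
    relative = uri[len(prefix):].lstrip("/")
    return str(PurePosixPath(mount_point) / relative)


print(vos_to_local("vos://ltg/ltg.csv"))
print(vos_to_local("vos://img/img.vot"))
```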

### Check cutouts of galaxies from DECam images saved in remote storage

She can check the contents of the /img directory.

In [None]:
!ls /tmp/vospace/img

### Run legacy analysis code on desktop with data in remote storage

Finally, Sarah can examine the DES image thumbnails of the LTGs with her favorite legacy code.

In [None]:
...

## Using Python

### Log in to the Data Lab

First, Sarah logs in to the Data Lab.

In [1]:
from dl import auth, queryMgr, storeMgr
token = 'dldemo.1.1.fish'
#token = auth.login('dldemo', 'dldemo')

She creates working directories in her virtual storage space: one as a workspace (/ltg), one to hold tables (/dbs) and one to hold images (/img). She also enables capabilities on two of the directories: one (tableingester) to automatically ingest any tabular file placed in the directory into her personal Data Lab database (referred to as MyDB); and one (downloader) to automatically retrieve any images referred to in any VOTable placed in the directory (via any accessURL columns). Her Data Lab environment is now ready for her to start her exploratory work.

In [8]:
storeMgr.mkdir(token, 'vos://ltg')
storeMgr.mkdir(token, 'vos://dbs')
storeMgr.mkdir(token, 'vos://img')
storeMgr.put(token, 'vos://dbs', 'caps/tableingester_cap.conf')
storeMgr.put(token, 'vos://img', 'caps/downloader_cap.conf')

### Query VO for relevant galaxy sample and save to remote storage

Sarah remembers a paper that contains a list of LTGs. She retrieves its catalog from VizieR and saves it to the /dbs directory, where it is parsed and a database table with its contents is created in her MyDB.

In [17]:
queryMgr.query(token, uri='ivo://CDS.VizieR/J/MNRAS/406/382/catalog', \
    out='vos://dbs/sample.vot', addArgs="0.0 0.0 180.")

<?xml version="1.0" encoding="UTF-8"?>
<VOTABLE version="1.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.ivoa.net/xml/VOTable/v1.2" xmlns="http://www.ivoa.net/xml/VOTable/v1.2">
	<RESOURCE type="results">
		<INFO name="QUERY_STATUS" value="OK" />
		<INFO name="PROVIDER" value="TAPVizieR">VizieR TAP service.</INFO>
		<INFO name="QUERY"><![CDATA[select * from "J/MNRAS/406/382/catalog"]]></INFO>
		<TABLE>
			<FIELD ID="umag0" name="umag0" datatype="double" arraysize="1" ucd="phot.mag;em.opt.U" unit="mag">
<DESCRIPTION>[14.7/19.6] SDSS-DR7 u de-reddened magnitude</DESCRIPTION>
</FIELD>
			<FIELD ID="DEJ2000" name="DEJ2000" datatype="double" arraysize="1" ucd="pos.eq.dec;meta.main" unit="deg">
<DESCRIPTION>[-1.25/1.25] SDSS declination (J2000)</DESCRIPTION>
</FIELD>
			<FIELD ID="SDSS" name="SDSS" datatype="char" arraysize="*" ucd="meta.id;meta.main">
<DESCRIPTION>IAU SDSS-DR7 name</DESCRIPTION>
</FIELD>
			<FIELD ID="objID" name="objID" datatype="l

### Extract LTGs from galaxy sample in remote storage

She then extracts the positions of the LTGs in this table with a simple query and saves them in a CSV file in the workspace directory (/ltg).

In [None]:
query = 'select SDSS,_raj2000,_dej2000 from mydb://sample_vot where ETG=6'
queryMgr.query(token, adql = query, ofmt = 'csv', out = 'vos://ltg/ltg.csv')
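Once the query result has been written, the positions can be read back with the standard `csv` module. The sketch below assumes the CSV has a header row carrying the column names from the query (`SDSS`, `_raj2000`, `_dej2000`). `read_positions` is a hypothetical helper, and the sample rows are placeholders rather than real catalog values.

```python
import csv
import io


def read_positions(csv_file):
    """Return (name, ra_deg, dec_deg) tuples from a CSV with the query's columns."""
    reader = csv.DictReader(csv_file)
    return [(row["SDSS"], float(row["_raj2000"]), float(row["_dej2000"]))
            for row in reader]


# Placeholder content standing in for ltg.csv; names and coordinates are made up.
sample = io.StringIO(
    "SDSS,_raj2000,_dej2000\n"
    "placeholder-1,10.5,-0.25\n"
    "placeholder-2,350.0,1.0\n"
)
positions = read_positions(sample)
print(positions)
```

With the virtual storage mounted, an open file handle such as `open("/tmp/vospace/ltg/ltg.csv", newline="")` could be passed in place of the sample.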

### Query DataLab SIA for reduced DECam images and save to remote storage

Next, she uses the CSV file as input to the Data Lab image access (SIA) service and saves the results in the /img directory. The output from the service is a VOTable with an accessURL column linking to the Data Lab image cutout service. As the table is placed in the /img directory with an active downloader capability, thumbnail cutouts are automatically retrieved for each row. 

In [2]:
result = queryMgr.siaquery(token, input = "vos://ltg/ltg.csv",
                 search = 0.5, out = "vos://img/img.vot")

UnboundLocalError: local variable 'out' referenced before assignment (Werkzeug debugger HTML error page returned by the service; output truncated)

ValueError: substring not found
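To illustrate what the downloader capability has to work with, the sketch below pulls the values of an accessURL column out of a VOTable using only the standard library. It is an illustration of the idea, not the Data Lab implementation. `access_urls` is a hypothetical helper, and the sample table and its URLs are placeholders.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def access_urls(votable_xml, column="accessURL"):
    """Collect the values of the named column from every TABLE in a VOTable."""
    root = ET.fromstring(votable_xml)
    urls = []
    for table in (el for el in root.iter() if local(el.tag) == "TABLE"):
        # FIELD elements are direct children of TABLE, in column order.
        names = [el.get("name", "").lower() for el in table
                 if local(el.tag) == "FIELD"]
        if column.lower() not in names:
            continue
        idx = names.index(column.lower())
        for tr in (el for el in table.iter() if local(el.tag) == "TR"):
            cells = [td.text or "" for td in tr if local(td.tag) == "TD"]
            if idx < len(cells):
                urls.append(cells[idx].strip())
    return urls


# Placeholder VOTable; the identifiers and URLs are made up.
sample = """<VOTABLE version="1.2" xmlns="http://www.ivoa.net/xml/VOTable/v1.2">
  <RESOURCE type="results">
    <TABLE>
      <FIELD name="SDSS" datatype="char" arraysize="*"/>
      <FIELD name="accessURL" datatype="char" arraysize="*"/>
      <DATA><TABLEDATA>
        <TR><TD>placeholder-1</TD><TD>http://example.invalid/cutout?id=1</TD></TR>
        <TR><TD>placeholder-2</TD><TD>http://example.invalid/cutout?id=2</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

urls = access_urls(sample)
print(urls)
```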

### Check cutouts of galaxies from DECam images saved in remote storage

She can check the contents of the directory.

In [None]:
import glob
# Match the files inside the directory, not the directory itself
for img in glob.glob("/tmp/vospace/img/*"):
    print(img)

### Run legacy analysis code on desktop with data in remote storage

Finally, Sarah can examine the DES image thumbnails of the LTGs with her favorite legacy code.

...