# Variable Stars in the Orion Star Cluster #
These notebooks show how to access different services to obtain data for variable stars in the Orion star cluster. We will use several VO standards (for a complete list of links, see the end of this notebook). 

The steps we are going to take are the following:

1. Query SIMBAD to obtain a list of member stars of the Orion star cluster and their Gaia DR3 id
2. X-match with the Gaia catalog
3. Use the VO Datalink standard to access the lightcurves of some of the objects
4. Use UCDs to access lightcurves
5. Use astropy to find external software via SAMP
6. Query an SSA service to access spectra for the objects

The first notebook will present and explain short snippets to help understand each of the steps above and how easily they can be implemented in Python. The second notebook will bring together these snippets and add some user interface to shift control (and responsibility) to the user side. 


## Requirements ##

Before you run these notebooks, please make sure you have the required libraries and software installed on your machine. 


### Python libraries ### 
We use some libraries which are not part of the Python standard library. You will have to install these either from your OS package manager or via a package manager like pip or conda. These libraries are:

astropy: https://www.astropy.org/

pyvo: https://pyvo.readthedocs.io/en/latest/

ipywidgets: https://ipywidgets.readthedocs.io/en/stable/

ipyaladin: https://github.com/cds-astro/ipyaladin

### Other software ###

Aladin: https://aladin.u-strasbg.fr/AladinDesktop/

Cassis: http://cassis.irap.omp.eu/

Topcat: https://www.g-vo.org/topcat/topcat/





## Query SIMBAD with TAP/ADQL ##
In the first step we will query the SIMBAD TAP/ADQL service to obtain a list of variable stars in the Orion cluster. Note that in our query we JOIN three tables on the SIMBAD service to combine the data and make the most of the otype tree as well as the membership relations. If you feel uncomfortable with this query, you may want to have a look at the ADQL course: http://docs.g-vo.org/adql/html/


In [None]:
import pyvo

"""Get information of stars in the Orion Cluster from SIMBAD
   We are selecting variable stars and the gaia id"""

# Make the TAP service object
service = pyvo.dal.TAPService ("http://simbad.u-strasbg.fr:80/simbad/sim-tap")

# Query the TAP service with a simple ADQL query.
stars = service.search ("""
SELECT
    DISTINCT 
    TOP 20 main_id,ra, dec, cats.id
    FROM ident AS orion,basic
    
    JOIN ident AS cats 
    ON basic.oid=cats.oidref 
    
    JOIN h_link ON (oid=child 
    AND parent=orion.oidref)
    
    WHERE orion.id = 'Orion Cluster' 
    AND membership >=90
    AND cats.id LIKE 'Gaia DR3 %' 
    AND basic.otype = 'V*..' 

  """)

print (stars)

## X-match with Gaia DR3 ##

To perform the x-match we will use the ESA Gaia archive service. We use the table upload feature of TAP to JOIN our local table with the remote one. Note that since the x-match is performed on the Gaia id, we do not need our positions. 
Also note that we added

WHERE has_epoch_photometry ='t'

because we are only interested in stars which have photometric time series available. Thus, not all of our stars from SIMBAD will be included in the result. 

Also see that we added the restriction TOP 3 here. This is to keep the presentation sane and fast. 

In [None]:
"""X-matching with ESA Gaia Service and limit search to objects which
   also have lightcurves published."""

esa_service = pyvo.dal.TAPService ("https://gea.esac.esa.int/tap-server/tap")

gaia_xmatch = esa_service.search ("""
    SELECT TOP 3
    
    simbad.main_id, source_id, designation, 
    phot_variable_flag,phot_g_mean_mag, 
    phot_bp_mean_mag, phot_rp_mean_mag, 
    gaia.ra, gaia.dec,parallax, pm, 
    pmra, pmdec, radial_velocity 
    
    FROM gaiadr3.gaia_source AS gaia
    
    JOIN TAP_UPLOAD.simbad AS simbad
    ON simbad.id=designation
    
    WHERE has_epoch_photometry ='t'
    """, uploads={'simbad':stars})

print (gaia_xmatch)

## Datalink ##
Within the VOTable we obtained from the ESA Gaia archive we also have a little treasure that will help us to access further information. In this step we will have a look at the Datalink standard and how to use it in PyVO to access the Gaia lightcurves. 

We start out with the example of how to find the lightcurves for a single source in our table. 

In [None]:
"""Get the datalink and the lightcurve for the first row in our table"""

# Get first row
first_row=gaia_xmatch[0]

# Make the datalink object
datalink=first_row.getdatalink()

# Go through the datalink parameters
for link in datalink: 
    print (link['description'])
    print (link['access_url'])


The output shows the datalink capabilities of the first source. These capabilities may differ depending on the actual object and thus can be different for other sources in the same table (due to our selection in the ADQL query, the capability "Epoch photometry" should be available for all sources in our table). 

Let it sink in what happens here: the ESA service is providing us with information about additional data for each of the objects in our table gaia_xmatch. But where is this metadata stored?

The secret lies in the Datalink standard, whose service description comes with the VOTable. Within the VOTable one can find a described resource looking similar to this: 

$  
<RESOURCE type="meta" utype="adhoc:service" name="ancillary">
  <DESCRIPTION>Retrieve DataLink file containing ancillary data for source</DESCRIPTION>
  <PARAM name="accessURL" datatype="char" arraysize="*" 
         value="https://gea.esac.esa.int/data-server/datalink/links"/>
  <PARAM name="standardID" datatype="char" arraysize="*" 
         value="ivo://ivoa.net/std/DataLink#links-1.0"/>
  <GROUP name="inputParams">
    <PARAM arraysize="*" datatype="char" name="ID" ref="DESIGNATION" value="">
    </PARAM>
  </GROUP>
</RESOURCE>
$

Within the parameter tags of the resource, the parameter "accessURL" defines the URL of a service that offers more information about additional data linked to a specific row of the VOTable. In our case this means we can complete the URL to find out more about each object in our table. The parameters in the group tag explain which column in our table contains the identifier we need to complete the datalink; here it is the column "designation" (that is why we included it in the output above). The IVOA Datalink standard defines that the complete link is the accessURL, followed by "?ID=" and the identifier, e.g. one of the Gaia identifiers above. So an example of this link could look like this:

https://gea.esac.esa.int/data-server/datalink/links?ID=Gaia DR3 3017264007761349504

Now, behind this link one does not find the lightcurves, but instead a VOTable describing the additional capabilities of the service. In the use case here this is a list of links to spectra or lightcurves, but other datalinks might include descriptions of image or cube cutout services (which are covered by the IVOA SODA standard). 
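
To make the mechanism concrete, here is a minimal sketch that reads the service descriptor and assembles the links URL by hand, using only the Python standard library. It assumes a simplified, namespace-free copy of the RESOURCE element shown above (a real VOTable declares an XML namespace, which this sketch does not handle); the function names `describe_service` and `build_links_url` are made up for this illustration and are not part of PyVO.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Simplified copy of the service descriptor shown above (assumption: no XML namespace).
RESOURCE_XML = """<RESOURCE type="meta" utype="adhoc:service" name="ancillary">
  <PARAM name="accessURL" datatype="char" arraysize="*"
         value="https://gea.esac.esa.int/data-server/datalink/links"/>
  <PARAM name="standardID" datatype="char" arraysize="*"
         value="ivo://ivoa.net/std/DataLink#links-1.0"/>
  <GROUP name="inputParams">
    <PARAM arraysize="*" datatype="char" name="ID" ref="DESIGNATION" value=""/>
  </GROUP>
</RESOURCE>"""


def describe_service(resource_xml):
    """Return (access URL, reference to the column holding the identifier)."""
    root = ET.fromstring(resource_xml)
    access_url = root.find("PARAM[@name='accessURL']").get("value")
    id_param = root.find("GROUP[@name='inputParams']/PARAM[@name='ID']")
    return access_url, id_param.get("ref")


def build_links_url(access_url, identifier):
    """Append the ID parameter; urlencode writes spaces as '+'."""
    return access_url + "?" + urlencode({"ID": identifier})


access_url, id_column = describe_service(RESOURCE_XML)
links_url = build_links_url(access_url, "Gaia DR3 3017264007761349504")
print(id_column)
print(links_url)
```

The only difference from the example link above is that the spaces in the identifier are URL-encoded, which is what an HTTP client would send anyway.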

Luckily, since all of this is standardized, we do not have to write individual code to build the URLs to the datalink and eventually the lightcurves, because the table object in Python provides us with the information. In our gaia_xmatch table we can access this information as follows:

In [None]:
print (gaia_xmatch.votable.resources[1].description)

print (gaia_xmatch.votable.resources[1].params[1])

print (gaia_xmatch.votable.resources[1].groups[0].entries[0])

This shows us how to retrieve the description of the additional data; what we still need to do is access the data itself. Conveniently, the row objects of the table provide this possibility, and this is where it makes most sense: part of the datalink information is individual to each object. In our example the lightcurves are of course individual, but across the whole Gaia catalog some data records might not offer lightcurves at all, whereas others might offer additional information. 

To actually access all lightcurves (or other additional resources), we can iterate over the rows of our table and use the row object methods to access the datalink. Note that we do not have to read the resources part of the VOTable ourselves: PyVO already did this for us and provides the methods to access the data. Here is what the iteration looks like and how we can access the lightcurve table. 

Note: we use the description to select only the lightcurve capability in the datalink, and ignore the others. As convenient as PyVO is, this still requires a bit of reading of the ESA Gaia archive documentation. From there we know that the lightcurves also come as VOTables, hence we can easily read them into this notebook. 

In [None]:
""" Use the datalink resource of the gaia Table to find the 
    links to the lightcurves."""
from astropy.io.votable import parse

# Going through the rows of our VOTable 
for i in gaia_xmatch:
    # Get the possible datalinks for each data record.
    # The possible resources behind the datalink depend on 
    # the actual objects. 
    
    dl=i.getdatalink()

    # Since there may be more than one datalink, we need 
    # to pick the one with the epoch photometry (lightcurves) 
    # and therefore select on the description. 
    
    for ii in dl:
        if ii['description'].find('Epoch photometry')!=-1:
            # Use the access_url attribute
            accurl=ii['access_url']
 
            # Parse the VOTable from the link -- we have the lightcurves
            lc_vot=parse(str(accurl)).get_first_table()
            # Print the source ID of the first row of each table to stdout            
            print(lc_vot.to_table()[0]['source_id'])


In [None]:
"""Print the last received lightcurve table to stdout"""
print (lc_vot)

When the SAMP hub is queried for subscribed clients (step 5 in the list above), the output will show you the list of clients accepting VOTables and the list of clients accepting spectra. If you started TOPCAT, Aladin and CASSIS, all three should be in the list of clients accepting VOTables, and only CASSIS should be shown to accept spectra.
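
A listing like the one described here could be obtained roughly as follows. This is a sketch, not the notebook's original cell: it assumes a SAMP hub is already running (for example one started by TOPCAT or Aladin), and the helper names `clients_accepting` and `list_samp_clients` are invented for this illustration. The two MTypes used are the standard SAMP message types for loading VOTables and spectra.

```python
def clients_accepting(client, mtype):
    """Return the sorted names of SAMP clients subscribed to the given MType.

    `client` is expected to behave like a connected
    astropy.samp.SAMPIntegratedClient.
    """
    names = []
    for client_id in client.get_subscribed_clients(mtype):
        metadata = client.get_metadata(client_id)
        # Fall back to the hub-assigned id if a client declares no name.
        names.append(metadata.get("samp.name", client_id))
    return sorted(names)


def list_samp_clients():
    """Connect to a running hub and print who accepts tables and spectra."""
    # Imported here so the helper above can be used without astropy.
    from astropy.samp import SAMPIntegratedClient

    client = SAMPIntegratedClient(name="notebook")
    client.connect()  # fails if no SAMP hub is running
    try:
        for mtype in ("table.load.votable", "spectrum.load.ssa-generic"):
            print(mtype, clients_accepting(client, mtype))
    finally:
        client.disconnect()
```

Call `list_samp_clients()` once the desktop applications are running; the names printed are whatever each application declares in its own SAMP metadata.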

## SSA: finding and accessing spectra ##
The Simple Spectral Access (SSA) protocol is the VO standard to find and access services providing spectra. From the user side, all the "simple" protocols work similarly: by defining a position and a diameter around it, a service is queried for existing data. If data exists, the service replies with a list of the results matching the query, and access URLs to download the results. 
Note that the results are not automatically downloaded. This enables the software to first "analyse" the result, e.g. by printing it to stdout and letting the user decide if they want to download the spectrum. A short example of how this works is the following code:

In [None]:
"""Finding spectra for a specific position on the gavo service"""
import astropy.units as units
from astropy.units import Quantity
from astropy.coordinates import SkyCoord

coord=SkyCoord(83.817*units.deg, -5.385*units.deg, frame='icrs')


def search_spectrum(coord):
    # Gaia DR3 MC sampled XP spectra SSA
    ssa_service = pyvo.dal.SSAService("http://dc.zah.uni-heidelberg.de/gaia/s3/ssa/ssap.xml?")
    ssa_results = ssa_service.search(pos = coord, diameter=Quantity(3, unit="arcsec"))
    return ssa_results

ssa_results=search_spectrum(coord)

print (ssa_results[0]['ssa_dstitle'])
print (ssa_results[0]['accref'])
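
For readers curious about what travels over the wire: an SSA search is a plain HTTP GET request with the position given as POS (RA and Dec in decimal degrees) and the search diameter as SIZE (in degrees). The following sketch builds such a URL with the standard library only. The function name `build_ssa_query_url` is made up for this illustration, and PyVO may send additional parameters, so treat this as an approximation of the request rather than an exact reproduction.

```python
from urllib.parse import urlencode


def build_ssa_query_url(base_url, ra_deg, dec_deg, diameter_arcsec):
    """Assemble an SSA queryData URL for a position and a search diameter."""
    params = {
        "REQUEST": "queryData",
        "POS": f"{ra_deg},{dec_deg}",       # decimal degrees
        "SIZE": diameter_arcsec / 3600.0,   # SSA expects the diameter in degrees
    }
    # The base URL used in this notebook already ends with '?'.
    separator = "" if base_url.endswith(("?", "&")) else "?"
    # Keep the comma in POS readable instead of percent-encoding it.
    return base_url + separator + urlencode(params, safe=",")


url = build_ssa_query_url(
    "http://dc.zah.uni-heidelberg.de/gaia/s3/ssa/ssap.xml?", 83.817, -5.385, 3
)
print(url)
```

Opening the printed URL in a browser should return the same kind of VOTable that PyVO parses into the result object, with one row per matching spectrum.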
                      

## VO Standards ##

TAP: https://www.ivoa.net/documents/TAP/20190927/

ADQL: https://www.ivoa.net/documents/latest/ADQL.html
           
VOTable: https://www.ivoa.net/documents/VOTable/20191021/

Datalink: https://www.ivoa.net/documents/DataLink/20150617/index.html

UCDs: https://www.ivoa.net/documents/latest/UCD.html

SSA: https://www.ivoa.net/documents/SSA/20120210/index.html


## Services Used ##
We use a few VO-compliant data services to retrieve data. These are:
    
SIMBAD: https://simbad.unistra.fr/simbad/

ESA-Gaia Archive: https://gea.esac.esa.int/archive/

GAVO SSA: https://dc.zah.uni-heidelberg.de/gaia/s3/web/form