# ESO Programmatic Authentication & Authorisation  
## How to access private data and metadata 


This Jupyter notebook complements, with some Python examples, what is described in the <a href="/cms/eso-data/programmatic-access/ESO-Programmatic-Authentication-and-Authorisation.html">ESO Programmatic Authentication &amp; Authorisation</a> documentation page.

It walks you through the process of:

1. Authenticating to receive a token
2. Performing authorised archive searches on raw data via TAP (using your token to exercise your permissions)
3. Downloading science raw data with authorisation
4. Finding the associated calibration reference files (via DataLink and calSelector)
5. Downloading the calibration reference files and the association tree


This notebook relies on a small utility module, <code>&nbsp;eso_programmatic.py&nbsp;</code> (<a href="eso_programmatic.py">downloadable here</a>), which contains, among others, the method to get a token (<strong title='def getToken(username, password):&#10;    """Token based authentication to ESO: provide username and password to receive back a JSON Web Token."""&#10;    if username==None or password==None:&#10;        return None&#10;    token_url = "https://www.eso.org/sso/oidc/token"&#10;    token = None&#10;    try:&#10;        response = requests.get(token_url,&#10;                            params={"response_type": "id_token token",&#10;                                    "grant_type":    "password",&#10;                                    "client_id":        "clientid",&#10;                                    "username":      username,&#10;                                    "password":      password})&#10;        token_response = json.loads(response.content)&#10;        token = token_response["id_token"]+"=="&#10;    except NameError as e:&#10;        print(e)&#10;    except:&#10;        print("*** AUTHENTICATION ERROR: Invalid credentials provided for username %s" %(username))&#10;&#10;        return token&#10;'>getToken</strong>). 

<hr>


##### Initialisations

In [1]:
# Use https: the requests sent to this service will carry your token
TAP_URL = "https://archive.eso.org/tap_obs"

# Importing useful packages
import os 
import sys
import requests
#import cgi
import json
import time

import pyvo
from pyvo.dal import tap
from pyvo.auth.authsession import AuthSession
    
# Verify the version of pyvo 
from packaging.version import parse as parse_version  # pkg_resources is deprecated
print('\npyvo version {version} \n'.format(version=pyvo.__version__))
if parse_version(pyvo.__version__) < parse_version('1.1'):
    raise ImportError('pyvo version must be 1.1 or higher')
    
        
import eso_programmatic as eso


pyvo version 1.1 



<span id='getToken'></span>
## 1 Authenticating
#### Get an ESO token using your ESO credentials
With your ESO username and password you can get an authorization token (the *id_token*) using the *getToken()* method (<a href="/cms/eso-data/programmatic-access/ESO-Programmatic-Authentication-and-Authorisation.html#getToken">see it here</a>), part of the *eso_programmatic.py* module.

In [2]:
# Prompt for user's credentials and get a token
import getpass

username = input("Type your ESO username: ")
password = getpass.getpass(prompt="%s's password: " % (username), stream=None)

token = eso.getToken(username, password)
if token is not None:
    print('token: ' + token)
else:
    # no token received: nothing else in this notebook can work, so stop here
    sys.exit(-1)

token: eyJhbGciOiJSUzI1NiIsImtpZCI6InNzbyJ9.eyJqdGkiOiJlMmM4Mzk3Zi00NzE0LTRmYTctOGRmMC03MTY5NTU0MjU2MDEiLCJpc3MiOiJodHRwczovL3d3dy5lc28ub3JnL3Nzby9vaWRjIiwiYXVkIjoiY2xpZW50aWQiLCJleHAiOjE2MzQxMjI4MzQsImlhdCI6MTYzNDA5NDAzNCwibmJmIjoxNjM0MDkzNzM0LCJzdWIiOiJhbmRyZXdqb2xseSIsImNsaWVudF9pZCI6ImNsaWVudGlkIiwic3RhdGUiOiIiLCJub25jZSI6IiIsImF0X2hhc2giOiJzUXdwc1BKWVNPMnlhVkd1TURtbEh3IiwiYWNjb3VudF9pZCI6OTM2MDMsImVtYWlsIjoiYW5kcmV3Lmouam9sbHlAdW5zdy5lZHUuYXUiLCJmYW1pbHlfbmFtZSI6IkpvbGx5IiwiZ2l2ZW5fbmFtZSI6IkFuZHJldyIsImlzX2ltcGVyc29uYXRpbmciOmZhbHNlLCJuYW1lIjoiQW5kcmV3IEpvbGx5Iiwicm9sZXMiOiJ1c2VyIiwic2V4IjoiIiwidGl0bGUiOiIiLCJwcmVmZXJyZWRfdXNlcm5hbWUiOiJhbmRyZXdqb2xseSJ9.WBnaYLIJSMihG9a64w9ZMo9JN34J8um4Me2rt8w_HpOjnuuulPL7040M0FTXVM2f2ppYxtsKj21KsKzx9oTdz3WdoxNcRE2hXF4zaUH40Bz5rFBYa88Z5F9U1S5SKIvMB-481YzdvNAbmZYheqZReoSVRjzxt1Ef42ZFQQBqfrW6GQyXK__Z8s9cOHDIl77cv5fYskeALo-5bOi1whRbEa1Cp5vpDZTqxV_a43GtlsYcLdJXDr8EohtrJz597Vqfx1VCY9SilifKRfy76jg9zZJ3LEO4CTYoQpPTFvj35-fs63GlQHoM-YKCLAJXQEYEIdsogaM_T5R
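
Since the token is a JSON Web Token, its middle part is a base64url-encoded JSON document that you can read locally, for example to check when the token expires before starting a long session. The following is a minimal sketch under that assumption; the helper names `decode_jwt_claims` and `seconds_until_expiry` are hypothetical (they are not part of `eso_programmatic.py`), the signature is not verified, and the demo uses a synthetic token with an arbitrary lifetime rather than a real ESO one.

```python
import base64
import json
import time


def decode_jwt_claims(token):
    """Return the claims (payload) of a JWT as a dict, WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    # base64url strings in a JWT come without padding: restore it before decoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def seconds_until_expiry(token, now=None):
    """Seconds left before the 'exp' claim is reached (negative if already expired)."""
    now = time.time() if now is None else now
    return decode_jwt_claims(token)["exp"] - now


# Demo with a synthetic, unsigned token (NOT a real ESO token; lifetime chosen arbitrarily)
demo_claims = {"sub": "demo_user", "iat": 1000, "exp": 1000 + 8 * 3600}
demo_payload = base64.urlsafe_b64encode(json.dumps(demo_claims).encode()).decode().rstrip("=")
demo_token = "header." + demo_payload + ".signature"

print(decode_jwt_claims(demo_token)["sub"])
print(seconds_until_expiry(demo_token, now=1000))
```

With a real token you would call `seconds_until_expiry(token)` and request a new token when the value gets close to zero. Remember that the claims include personal details, so avoid sharing a notebook whose output shows a full token.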

<span id='authorised_archive_searches'></span>
## 2 Authorised archive searches

Before performing authorised archive searches, remember what is written in the documentation page at <em><a href="/cms/eso-data/programmatic-access/ESO-Programmatic-Authentication-and-Authorisation.html#which_users">§1.2.1 Which users should (not) perform authorised data searches?</a></em>:
 - *authorised* archive searches are useful only to users with special permissions 
 - a PI of a regular observing programme normally does *not* possess special permissions
 - authorised queries are slower than anonymous queries, so use them only if you really need them!

### 2.1 Set up a Python requests session with an Authorization header
Create a python requests session and add your token to its header. You will pass this session to an ESO service when you want to ensure that your own permissions are taken into consideration.

In [3]:
session = requests.Session()
session.headers['Authorization'] = "Bearer " + token

# Initialise a tap service for authorised queries
# passing the created "tokenised" session
# Remember: passing a non-tokenised session, or no session at all,
# will result in tap performing anonymous queries:
# none of your permissions will be used, hence the queries will run faster,
# but you will not be able to find any file with protected metadata.

tap = pyvo.dal.TAPService(TAP_URL, session=session)

# for comparison, use: 
# tap = pyvo.dal.TAPService(TAP_URL) 
# to execute your queries anonymously
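
If you switch often between authorised and anonymous access, it can help to build the header in one place. This is only a sketch: `authorization_headers` is a hypothetical helper name, and the only thing taken from this notebook is the `"Bearer " + token` form of the `Authorization` header.

```python
def authorization_headers(token=None):
    """Return the HTTP headers for an authorised request, or no headers for anonymous access."""
    if not token:
        # no token: the service will treat the request as anonymous
        return {}
    return {"Authorization": "Bearer " + token}


# Demo with a placeholder string (not a real token)
print(authorization_headers("my-token"))
print(authorization_headers())
```

With a requests session you could then write `session.headers.update(authorization_headers(token))`, which leaves the session anonymous when `token` is `None`.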

### 2.2 Execute authorised queries 
Any query you send to a TAP service initialised this way will be "authorised", in the sense that your permissions will be taken into consideration. 

To achieve this, the TAP software modifies your query on the fly; the resulting SQL query ensures that you retrieve all the records you have been granted access to, including the public ones, and only those. The modified query (which you do not see) is more complex than the one you typed, and cannot be as fast.

For this reason we suggest running authorised queries asynchronously: this gives the query more execution time and, since you do not keep a connection open while waiting for the results, it avoids HTTP or application timeouts and possible transient failures.

How? Using a TAP job.

In [4]:
# define the query you want to run, e.g.:
query = "select top 2 * from dbo.raw where dp_cat='SCIENCE' and prog_id = 'your-protected-observing-run' "

# well, in this example we use a non-protected run, 
# but please pretend it is actually a protected one given the purpose of this notebook!

# let's consider only 2 of its science frames:
query = "select top 2 * from dbo.raw where dp_cat='SCIENCE' and prog_id = '098.C-0739(C)' "


results = None

# define a job that will run the query asynchronously 
job = tap.submit_job(query)

# extending the maximum duration of the job to 300s (default 60 seconds)
job.execution_duration = 300 # max allowed: 3600s

# job initially is in phase PENDING; you need to run it and wait for completion: 
job.run()

try:
    job.wait(phases=["COMPLETED", "ERROR", "ABORTED"], timeout=600.)
except pyvo.DALServiceError:
    print('Exception on JOB {id}: {status}'.format(id=job.job_id, status=job.phase))

print("Job: %s %s" %(job.job_id, job.phase))

if job.phase == 'COMPLETED':
    # When the job has completed, the results can be fetched:
    results = job.fetch_result()

# the job can be deleted (always a good practice to release the disk space on the ESO servers)
job.delete()

# Let's print the results to examine the content:
# check out the access_url and the datalink_url
if results:
    print("query results:")
    eso.printTableTransposedByTheRecord(results.to_table()) 
else:
    print("!" * 42)
    print("!                                        !")
    print("!       No results could be found.       !")
    print("!       ? Perhaps no permissions ?       !")
    print("!       Aborting here.                   !")
    print("!                                        !")
    print("!" * 42)
    quit()

Job: e1a13839-484c-408a-9924-73c09216a8c3 COMPLETED
query results:
    access_url     = https://dataportal.eso.org/dataPortal/file/SPHER.2016-09-26T04:15:07.252
    datalink_url   = http://archive.eso.org/datalink/links?ID=ivo://eso.org/ID?SPHER.2016-09-26T04:15:07.252
    date_obs       = 2016-09-26T04:15:07.2524
    dec            = 1.6143
    dec_pnt        = 1.6143
    det_chip1id    = ESO-Hawaii2RG
    det_chop_ncycles = --
    det_dit        = --
    det_expid      = --
    det_ndit       = 2
    dp_cat         = SCIENCE
    dp_id          = SPHER.2016-09-26T04:15:07.252
    dp_tech        = IFU
    dp_type        = SKY
    ecl_lat        = 3.960941
    ecl_lon        = 185.089795
    exp_start      = 2016-09-26T04:15:07.253Z
    exposure       = 64.0
    filter_path    = 
    gal_lat        = -55.874158
    gal_lon        = 87.265553
    grat_path      = 
    gris_path      = 
    ins_mode       = 
    instrument     = SPHERE
    last_mod_date  = 2016-09-26T04:25:47.533Z
    mjd
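
To make the asynchronous pattern more concrete, here is a minimal sketch of what "waiting for a job" amounts to: polling the job phase until it reaches one of the terminal phases used in the cell above, or giving up after a timeout. This is not pyvo's implementation of `job.wait()`; `wait_for_phase` and its parameters are hypothetical names, and the demo uses a fake job instead of contacting the ESO TAP service.

```python
import time

# Phases after which the job will not change any more (same list as in job.wait above)
TERMINAL_PHASES = {"COMPLETED", "ERROR", "ABORTED"}


def wait_for_phase(get_phase, timeout=600.0, poll_interval=1.0):
    """Call get_phase() repeatedly until it returns a terminal phase.

    Raises TimeoutError if no terminal phase is reached within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        phase = get_phase()
        if phase in TERMINAL_PHASES:
            return phase
        if time.monotonic() >= deadline:
            raise TimeoutError("job still in phase %s after %.0f s" % (phase, timeout))
        time.sleep(poll_interval)


# Demo with a fake job that goes through a few phases before completing
fake_phases = iter(["QUEUED", "EXECUTING", "EXECUTING", "COMPLETED"])
final_phase = wait_for_phase(lambda: next(fake_phases), poll_interval=0.0)
print(final_phase)
```

The point of the pattern is that each poll is a short, independent request: a transient network failure affects one poll, not the whole query.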

<span id='downloadURL'></span>
## 3 Downloading the selected science files using their access_url

In [8]:
# The access_url field of the dbo.raw table
# provides the link that can be used to download the file

# Here we pass that link together with your session
# to the downloadURL method of the eso_programmatic.py module
# (similarly to the authorised queries, if no session is passed, 
#  downloadURL will attempt to download the file anonymously)

print("Start downloading...")
for raw in results:
    access_url = raw['access_url'] # the access_url is the link to the raw file
    status, filepath = eso.downloadURL(access_url, session=session, dirname="/tmp")
    if status==200:
        print("      RAW: %s downloaded  "  % (filepath))
    else:
        print("ERROR RAW: %s NOT DOWNLOADED (http status:%d)"  % (filepath, status))


Start downloading...
      RAW: /tmp/SPHER.2016-09-26T04:15:07.252.fits.Z downloaded  
      RAW: /tmp/SPHER.2016-09-26T04:12:45.603.fits.Z downloaded  
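
Note that the saved file names end in `.fits.Z` while the access_url does not. A downloader typically takes the file name from the `Content-Disposition` header of the HTTP response. The sketch below shows that technique with the standard library only; it is an assumption about how such a helper could work, not the actual code of `eso.downloadURL`, and both the function name `filename_from_response` and the header value in the demo are hypothetical.

```python
import os
from email.message import Message


def filename_from_response(content_disposition, url):
    """Choose a local file name: prefer the Content-Disposition header, else the last URL segment."""
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            # keep only the base name, so that a header cannot point outside the target directory
            return os.path.basename(name)
    return url.rstrip("/").rsplit("/", 1)[-1]


url = "https://dataportal.eso.org/dataPortal/file/SPHER.2016-09-26T04:15:07.252"
header = 'attachment; filename="SPHER.2016-09-26T04:15:07.252.fits.Z"'  # hypothetical header value

print(filename_from_response(header, url))
print(filename_from_response(None, url))
```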


## 4 Finding and downloading the associated calibration reference files

The DataLink service (implementing the VO <a href="https://www.ivoa.net/documents/DataLink/20150617">DataLink</a> protocol) helps you find files related to an input science file (whether raw or product; in this case, a raw file). Let's call the science file at hand THIS. In particular, DataLink can give you back two lists of calibration reference files that can be used to process THIS:
 - the list of raw calibration reference files (mode: raw2raw)
 - the list of processed calibration reference files (mode: raw2master)
 
As a side note, Datalink can also offer access to other related files, e.g.:
 - products generated out of THIS, 
 - provenance files, i.e., the science files that were used to generate THIS
 - preview file, a quick look of THIS (for products only)
 - ancillary files of THIS (e.g. a weightmap of an imaging product) (for products only)
 - data documentation describing the science aim and the processing applied to THIS (for products only)
 - night log (for raws only)

### 4.1 Find the link to the associated calibration reference files (using DataLink)
The <code>datalink_url</code> field of the dbo.raw table
provides you the link that can be used to find files associated
to the selected science frame.


In [5]:
# A python datalink object is created running
# the pyvo DataLinkResults.from_result_url() method onto the datalink_url.

# When dealing with files whose metadata are protected, we need to be authorised:
# for that we need to pass to the from_result_url() also the above-created python requests session.

# For the sake of this example, let's just consider the first science raw frame:
first_record = results[0]
datalink_url = first_record['datalink_url']

datalink = pyvo.dal.adhoc.DatalinkResults.from_result_url(datalink_url, session=session)

# The resulting datalink object contains the table of files associated
# to the selected science frame (here SPHER.2016-09-26T04:15:07.252).
# Note: Were this input file a metadata protected file (it is not, but suppose...),
# and had you not passed your session, or had you no permission to see this file,
# DataLink would have given you back only a laconic table with the message 
# that you do not have access permissions or that the file does not exist.

# let's print the resulting datalink table:
eso.printTableTransposedByTheRecord(datalink.to_table())


    ID             = ivo://eso.org/ID?SPHER.2016-09-26T04:15:07.252
    access_url     = https://dataportal.eso.org/dataPortal/file/SPHER.2016-09-26T04:15:07.252
    service_def    = 
    error_message  = 
    semantics      = #this
    description    = Requested file
    content_type   = application/fits
    content_length = 27535989
    eso_origfile   = 2016-09-25/SPHER.2016-09-26T04:15:07.252.fits.Z
    eso_category   = 
    eso_datalink   = https://archive.eso.org/datalink/links?ID=ivo://eso.org/ID?SPHER.2016-09-26T04:15:07.252
    ---------------------------------------------------------------------------------------------------------
    ID             = ivo://eso.org/ID?SPHER.2016-09-26T04:15:07.252
    access_url     = https://archive.eso.org/calselector/v1/associations?dp_id=SPHER.2016-09-26T04:15:07.252&mode=Raw2Raw&responseformat=votable
    service_def    = 
    error_message  = 
    semantics      = http://archive.eso.org/rdf/datalink/eso#calSelector_raw2raw
    descriptio

As shown above, the Datalink result is a table; each of its records provides a pointer (access_url) to an associated file, or to a service that returns associated files (like the calibration reference files); to distinguish among the records, the <code>semantics</code> column can be used. 

In this case there are 4 records:
 - semantics = <code>#this</code> :<br>
    -  the first record in any datalink response always describes the input file (THIS) <p><br>
    
 - semantics = <code>http://archive.eso.org/rdf/datalink/eso#calSelector_raw2raw</code> :<br>
    -  provides a link (access_url) to the associated raw calibration files <p><br>
    
 - semantics = <code>http://archive.eso.org/rdf/datalink/eso#calSelector_raw2master</code> :<br>
    -  provides a link (access_url) to the associated processed calibration files <p><br>
    
 - semantics = <code>http://archive.eso.org/rdf/datalink/eso#night_log</code> :<br>
    -  provides a link (access_url) to the associated Night Log report <p><br>
    
<table>
<tr><td style="background-color: lightgrey; text-align: left;"><strong>To know more:</strong><br>
  For the two different flavours of calibration files (raw and processed), please refer to the  <a href="http://archive.eso.org/cms/application_support/calselectorInfo.html">documentation page of the calSelector service</a>.
</td></tr>
<tr><td style="background-color: lightgrey; text-align: left;">
   For the description of all possible semantics values, please refer to:
   <ul>
   <li> <a href="http://archive.eso.org/programmatic/rdf/datalink/eso/">the ESO semantics</a>
   <li> <a href="http://www.ivoa.net/rdf/datalink/core">the DataLink VO standard semantics</a>
   </ul>
</td></tr>
</table>


Here we want to get the processed calibration files, hence:

In [6]:
# Let's get the link to the processed calibration files (raw2master)

semantics = 'http://archive.eso.org/rdf/datalink/eso#calSelector_raw2master'

raw2master_url = next(datalink.bysemantics( semantics )).access_url

# which returns the calSelector (see next box) link:
# https://archive.eso.org/calselector/v1/associations?dp_id=\
#SPHER.2016-09-26T04:15:07.252&mode=Raw2Master&responseformat=votable
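
Before calling a calSelector link, you may want to check which file and which mode it refers to. The sketch below parses the query string of such a link with the standard library; the URL is the raw2raw link shown in the DataLink output above (the raw2master link has the same shape), and `calselector_params` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, parse_qs


def calselector_params(url):
    """Return the query parameters of a calSelector link as a plain dict (first value of each)."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


url = ("https://archive.eso.org/calselector/v1/associations"
       "?dp_id=SPHER.2016-09-26T04:15:07.252&mode=Raw2Raw&responseformat=votable")

params = calselector_params(url)
print(params["dp_id"])
print(params["mode"])
print(params["responseformat"])
```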

### 4.2 Getting the list of processed calibration reference files (using calSelector and DataLink)

The automatic selection of calibration files (raw or processed) is performed by the above-mentioned calSelector service, exposed also programmatically.

One of the calSelector interfaces (the one used when the _responseformat=votable_ parameter is present) is fully compatible with the DataLink VO protocol. This means that the same pyvo DatalinkResults.from_result_url() method can be used, e.g., to get the list of associated raw2master files.


In [7]:
# Don't forget to pass your session in case the science file has protected metadata!

associated_calib_files = pyvo.dal.adhoc.DatalinkResults.from_result_url(raw2master_url, session=session)

eso.printTableTransposedByTheRecord(associated_calib_files.to_table())

# create and use a mask to get only the #calibration entries,
# given that other entries, like #this or ...#sibling_raw, could be present:
calibrator_mask = associated_calib_files['semantics'] == '#calibration'
calib_urls = associated_calib_files.to_table()[calibrator_mask]['access_url','eso_category']

#eso.printTableTransposedByTheRecord(calib_urls)

    ID             = ivo://eso.org/ID?SPHER.2016-09-26T04:15:07.252
    access_url     = https://archive.eso.org/datalink/links?ID=ivo://eso.org/ID?SPHER.2016-09-26T04:15:07.252&eso_download=file
    service_def    = 
    error_message  = 
    semantics      = #this
    description    = category="IFS_SCI_SKY" certified="true" complete="true" mode="Raw2Master" type="main" messages=""
    content_type   = application/fits
    content_length = 27535989
    eso_category   = IFS_SCIENCE_DR_RAW
    eso_datalink   = https://archive.eso.org/datalink/links?ID=ivo://eso.org/ID?SPHER.2016-09-26T04:15:07.252
    ---------------------------------------------------------------------------------------------------------
    ID             = ivo://eso.org/ID?SPHER.2016-09-26T04:15:07.252
    access_url     = https://archive.eso.org/datalink/links?ID=ivo://eso.org/ID?M.SPHERE.2016-09-27T08:14:42.823&eso_download=file
    service_def    = 
    error_message  = 
    semantics      = #calibration
    descr

#### 4.2.1 Check calibration cascade qualities

Check whether the calibration cascade is complete, whether it is certified, and whether it was actually generated for processed calibration files.

Beware: when you request processed calibrations, you might get back the raw calibrations instead!   

This happens when no processed calibrations exist for the given raw frame: in that case the service, so as not to leave you empty-handed, gives back the raw calibrations instead.
You can check this by reading the calibration cascade description, as shown below. 


In [8]:
# Given the above list of "associated_calib_files"
# and knowing that we requested...
mode_requested = "raw2master"

# ... let's print out some important info and warnings on the received calibration cascade: 
# - is the cascade complete? 
# - is the cascade certified?
# - has the cascade been generated for the mode you requested (processed calibrations) or not?

# That info is embedded in the description field of the #this record.
# We use the printCalselectorInfo of the eso_programmatic.py to parse/make sense of it.

this_description=next(associated_calib_files.bysemantics('#this')).description

alert, mode_warning, certified_warning = eso.printCalselectorInfo(this_description, mode_requested)

if alert!="":
    print("%s" % (alert))
if mode_warning!="":
    print("%s" % (mode_warning))
if certified_warning!="":
    print("%s" % (certified_warning))
    
question = None
answer = None
if len(calib_urls):
    print()
    if alert or mode_warning or certified_warning:    
        question = "Given the above warning(s), do you still want to download these %d calib files [y/n]? " %(len(calib_urls))
    else:
        question = "No warnings reported, do you want to download these %d calib files [y/n]? " %(len(calib_urls))

    # ask only when there is something to download
    while answer != 'y' and answer != 'n':
        answer = input(question)
else:
    print("No calibration files found: nothing to download.")
    

    calibration info:
    ------------------------------------
    science category=IFS_SCI_SKY
    cascade complete=true
    cascade messages=
    cascade certified=true
    cascade executed mode=raw2master
    full description: category="IFS_SCI_SKY" certified="true" complete="true" mode="Raw2Master" type="main" messages=""
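
The description string shown above is a sequence of `key="value"` pairs, so it is easy to parse on your own if you need the individual values in a script. The following is a minimal sketch and not the implementation of `eso.printCalselectorInfo`; the function names `parse_cascade_description` and `cascade_warnings` are hypothetical, and the input is the description printed above.

```python
import re


def parse_cascade_description(description):
    """Turn a string of key="value" pairs into a dict."""
    return dict(re.findall(r'(\w+)="([^"]*)"', description))


def cascade_warnings(info, mode_requested):
    """Return a list of human-readable warnings about the calibration cascade."""
    warnings = []
    if info.get("complete") != "true":
        warnings.append("the calibration cascade is incomplete")
    if info.get("certified") != "true":
        warnings.append("the calibration cascade is not certified")
    if info.get("mode", "").lower() != mode_requested.lower():
        warnings.append("requested mode %s but the cascade was executed in mode %s"
                        % (mode_requested, info.get("mode")))
    return warnings


description = ('category="IFS_SCI_SKY" certified="true" complete="true" '
               'mode="Raw2Master" type="main" messages=""')

info = parse_cascade_description(description)
print(info["category"], info["mode"])
print(cascade_warnings(info, "raw2master"))
```

Comparing the executed mode with the requested one is what reveals the fallback described above, where raw calibrations are returned in place of processed ones.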



### 4.3 Downloading the calibration reference files

To download the calibration files we use again the <code>downloadURL</code> method of the <code>eso_programmatic.py</code> module.

All ESO calibration files are open to the public, hence there is no need to pass your token/session.

In [9]:
if answer == 'y':
    print("Downloading the calibration reference files...")

    i_calib=0
    for url,category in calib_urls:
        i_calib+=1
        status, filename = eso.downloadURL(url)
        if status==200:
            print("    CALIB: %4d/%d dp_id: %s (%s) downloaded"  % (i_calib, len(calib_urls), filename, category))
        else:
            print("    CALIB: %4d/%d dp_id: %s (%s) NOT DOWNLOADED (http status:%d)"  % (i_calib, len(calib_urls), filename, category, status))


Downloading the calibration reference files...


### 4.4 Getting the Association Tree describing the relations among the science frame and calibration files

You might have spotted above that the <code>associated_calib_files</code> table, generated by invoking the raw2master_url, provides not only the <code>#calibration</code> entries, but also an entry for the association tree.

<code>Association Tree :== file describing the relations among the input raw frame(s)
                           and the calibration files (in custom XML format)</code>

You can use its semantics to find its access_url, as shown here below.

In [None]:
association_tree_semantics = 'http://archive.eso.org/rdf/datalink/eso#calSelector_raw2master'

# Notice that the datalink service and the calselector service use the same semantics
# to indicate two different things:
# - in datalink: it points to the distinct list of calibration reference files (responseformat=votable);
#                its eso_category is not defined
# - in calselector: it points to the calibration cascade description (format still XML but not votable);
#                its eso_category is set to "ASSOCIATION_TREE"

association_tree_mask = associated_calib_files['semantics'] == association_tree_semantics
association_tree = associated_calib_files.to_table()[association_tree_mask]['access_url','eso_category']

for url, category in association_tree:
    # the url points to the calselector service, which, for metadata protected files, needs a tokenised-session
    status, filename = eso.downloadURL(url, session=session)
    print(url)
    if status == 200:
        print("  Association tree: %s (%s) downloaded"  % (filename, category))
    else:
        print("  Association tree: %s (%s) NOT DOWNLOADED (http status:%d)"  % (filename, category, status))
