# Registry-powered Searches


If NAVO develops astroquery.vo, calls like the following would become possible. This is a summary; each item is covered in more detail below.

RegTAP:  

    query_results=astroquery.vo.Registry.query( ... lots of options, this already exists in our github ...)
    heasarc_image_services=astroquery.vo.Registry.list_image_services(source='heasarc') 

TAP(?):

    tap_services_2mass=Registry.query(keyword='2mass',service_type='table')
    tap_results=Tap.query(
        source=tap_services_2mass[32],
        logic_string="CONTAINS(POINT('J2000',ra,dec),CIRCLE('J2000',9.90704,8.96507,0.001))"
        )

In [2]:
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline  
import requests, io, astropy
from IPython.display import Image, display

## For handling ordinary astropy Tables
from astropy.table import Table, vstack

## For reading FITS files
import astropy.io.fits as apfits

## There are a number of relatively unimportant warnings that 
## show up, so for now, suppress them:
import warnings
warnings.filterwarnings("ignore")

## our stuff
import sys
# Use the NASA_NAVO/astroquery
from navo_utils.cone import Cone
from navo_utils.registry import Registry

Registry query methods live in an astroquery.vo.Registry() class and range from simple to powerful. For example, if you already know you want to search NED, you can get the related service URLs as follows. Note that you may get *more* results than you expect; a human reader should be able to tell them apart easily.

In [9]:
results = Registry.query(source='ned', service_type='cone', debug=True)
print('Found {} results:'.format(len(results)))
print(results[:]['access_url'])
print(results[1]['ivoid'])
print(results.columns)

Registry:  sending query ADQL = 
          select res.waveband,res.short_name,cap.ivoid,res.res_description,
          intf.access_url, res.reference_url
          from rr.capability as cap
          natural join rr.resource as res
          natural join rr.interface as intf
           where cap.cap_type='conesearch' and cap.ivoid like '%ned%'

Queried: http://vao.stsci.edu/RegTAP/TapService.aspx/sync

Found 2 results:
                                      access_url                                      
--------------------------------------------------------------------------------------
http://ned.ipac.caltech.edu/cgi-bin/NEDobjsearch?search_type=Near+Position+Search&amp;
                                https://irsa.ipac.caltech.edu/SCS?table=shelacomb&amp;
ivo://irsa.ipac/spitzer/catalog/shela/shela_combined
<TableColumns names=('waveband','short_name','ivoid','res_description','access_url','reference_url')>


The Registry.query() method takes arguments that further filter the results (they are passed to the internal function _build_adql):  

    service_type : "image", "cone", or "spectr"
    keyword      : any keyword contained in ivoid, title, or description
    waveband     : waveband string. Multiple options may be comma-delimited, e.g. "optical, infrared"
    source       : any substring in ivoid
    publisher    : the name of any publishing organization
    order_by     : the field to order the results by; it must be one of the column names, currently
                    ("waveband","short_name","ivoid","res_description","access_url","reference_url","role_name")
    logic_string : any other string you want to add to the ADQL where clause, should start with " and "

Registry.query() returns its results as an astropy table, converted from the VOTable response by the internal function _astropy_table_from_votable_response(). 

The Registry.query_counts() method shows which values of a keyword field are common. This helps us choose a value that narrows down a search, or spot one that would return too MANY results. Its arguments are passed to the internal function _build_counts_adql:

    field      : keyword field for which to see popular values: "waveband", "publisher" currently supported.
    minimum    : A minimum count of occurrences for the keyword value to use as a cutoff (optional, defaults to 1)
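
As a rough illustration of what a counts query looks like, the sketch below builds the ADQL for the two supported fields. The function name build_counts_adql is hypothetical and this is not the real _build_counts_adql. The "publisher" branch follows the query logged in the output further down; the "waveband" branch is an assumption about how the same pattern might be applied to rr.resource.

```python
def build_counts_adql(field, minimum=1):
    """Return ADQL that counts how often each value of `field` occurs."""
    if field == "publisher":
        # Mirrors the query shown in the debug output of this notebook.
        inner = ("select role_name, count(role_name) as count_field "
                 "from rr.res_role where base_role = 'publisher' "
                 "group by role_name")
    elif field == "waveband":
        # Assumption: the same pattern applied to the waveband column.
        inner = ("select waveband, count(waveband) as count_field "
                 "from rr.resource group by waveband")
    else:
        raise ValueError("unsupported field: {}".format(field))

    # Wrap the grouped counts in a subquery so the cutoff can be applied.
    return ("select * from ({}) as count_table "
            "where count_field >= {:d} "
            "order by count_field desc".format(inner, int(minimum)))


print(build_counts_adql("publisher", 15))
```

The outer select exists only so that the cutoff can filter on the aggregated count_field column.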

In [4]:
results = Registry.query_counts('publisher', 15, debug=True)
print(results)

Registry:  sending query ADQL = select * from (select role_name, count(role_name) as count_field from rr.res_role where base_role = 'publisher'  group by role_name) as count_table where count_field >= 15 order by count_field desc

Queried: http://vao.stsci.edu/RegTAP/TapService.aspx/sync

                         role_name                           count_field
------------------------------------------------------------ -----------
                                                         CDS       17115
                                           NASA/GSFC HEASARC        1037
                          NASA/IPAC Infrared Science Archive         520
                                            The GAVO DC team         159
                   Space Telescope Science Institute Archive         101
      WFAU, Institute for Astronomy, University of Edinburgh          99
                                                         IA2          35
                                                     

With a publisher name to work from, we can narrow the query:

In [7]:
results = Registry.query(source='ned', publisher='Extragalactic Database', waveband='optical,radio', service_type='cone', debug=True)
print('Found {} results:'.format(len(results)))
print(results[:]['access_url'])

Registry:  sending query ADQL = 
          select res.waveband,res.short_name,cap.ivoid,res.res_description,
          intf.access_url, res.reference_url
          from rr.capability as cap
          natural join rr.resource as res
          natural join rr.interface as intf
          
             natural join rr.res_role as role
              where cap.cap_type='conesearch' and cap.ivoid like '%ned%' and ( res.waveband like '%optical%' or res.waveband like '%radio%') and role.base_role = 'publisher' and role.role_name like '%Extragalactic Database%'

Queried: http://vao.stsci.edu/RegTAP/TapService.aspx/sync

Found 1 results:
                                      access_url                                      
--------------------------------------------------------------------------------------
http://ned.ipac.caltech.edu/cgi-bin/NEDobjsearch?search_type=Near+Position+Search&amp;



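Note that the access_url values in our results contain XML character entities (the trailing ampersand is entity-encoded), as the registry resource standard expects them to be encoded. We need to unescape them before use. This is HTML/XML unescaping, not URL percent-decoding.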

In [4]:
from html import unescape

for result in results:
    print(unescape(result['access_url']))

http://ned.ipac.caltech.edu/cgi-bin/NEDobjsearch?search_type=Near+Position+Search&
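
Once unescaped, the access URL can serve as the base of a cone search request. The sketch below shows one way to do that with the standard library only. The helper name cone_search_url is hypothetical; RA, DEC, and SR (all in degrees) are the parameter names defined by the Simple Cone Search standard. No request is sent here, the function only builds the URL.

```python
from html import unescape
from urllib.parse import urlencode


def cone_search_url(access_url, ra, dec, radius):
    """Build a cone search URL from a registry access_url (illustrative only)."""
    # Registry values are entity-encoded, e.g. a trailing "&amp;".
    base = unescape(access_url)
    # Make sure the base ends with a separator before adding parameters.
    if not base.endswith(("?", "&")):
        base += "&" if "?" in base else "?"
    return base + urlencode({"RA": ra, "DEC": dec, "SR": radius})


print(cone_search_url("https://irsa.ipac.caltech.edu/SCS?table=shelacomb&amp;",
                      9.90704, 8.96507, 0.001))
```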


# 11. TAP

This currently doesn't work but should be perfectly doable:

    tap_services_2mass=Registry.query(keyword='2mass',service_type='table')
    
Look through the results, find the one you want, then assuming you know how to construct ADQL logic and you know the names of the columns in the catalog you're searching:

    tap_results=Tap.query(
        source=tap_services_2mass[32],
        logic_string="CONTAINS(POINT('J2000',ra,dec),CIRCLE('J2000',9.90704,8.96507,0.001))"
        )

is the equivalent of a cone search, but any ADQL condition would work. If you didn't know which TAP service you wanted, you probably couldn't do this (unlike the image case above, where you can get image information from all services in the registry). The reason is that the TAP query depends on the column names, and these are not standardized across catalogs.
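
To show what the cone-search-equivalent condition amounts to, here is a small sketch that builds it as a string and wraps it in the parameters of a synchronous TAP request. The helper names cone_where_clause and tap_sync_params, the table name catalog_table, and the default column names ra and dec are all hypothetical; real catalogs use their own names. REQUEST, LANG, FORMAT, and QUERY are the parameter names from the TAP standard. Nothing is sent over the network.

```python
def cone_where_clause(ra, dec, radius, ra_col="ra", dec_col="dec", frame="J2000"):
    """Return an ADQL condition selecting rows within `radius` degrees of (ra, dec)."""
    return ("CONTAINS(POINT('{f}',{rc},{dc}),"
            "CIRCLE('{f}',{ra},{dec},{r}))=1").format(
                f=frame, rc=ra_col, dc=dec_col, ra=ra, dec=dec, r=radius)


def tap_sync_params(table, where, columns="*"):
    """Return the form parameters for a synchronous TAP query (not sent here)."""
    adql = "SELECT {} FROM {} WHERE {}".format(columns, table, where)
    return {"REQUEST": "doQuery", "LANG": "ADQL",
            "FORMAT": "votable", "QUERY": adql}


# 'catalog_table' is a placeholder; the column names depend on the catalog.
where = cone_where_clause(9.90704, 8.96507, 0.001)
params = tap_sync_params("catalog_table", where)
print(params["QUERY"])
```

The column names are arguments because they differ from catalog to catalog, which is exactly the difficulty described above.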

On the other hand, since users must already know ADQL and the column names of the catalog they're interested in, it's not clear that a wrapper would add much value.