In the following, the code cells are the prototypes of the code we need. What the user would actually have to type is shown in the text cells. 

Ideally, go so far as to prototype it, writing functions that hardwire the notebook examples so you can see how it would work. Think about how it could work not as below with astroquery.heasarc.list_sia_services(), for example, but just as astroquery.list_sia_services(). The whole point is that the user should not have to care which site the data come from.

If NAVO develops astroquery.vo, we could use things like:

TAP:  

    astroquery.vo.Registry.query( ... lots of options ...)
    astroquery.vo.Registry.list_image_services(source='heasarc') 

TAP:  

    astroquery.vo.Registry.surveys_like("Redshift", source='heasarc')

SIA:  
    
    astroquery.vo.Image.get_image() # skyview vs xamin? 


Notes on Vandana’s collected workshop notebook:


In [2]:
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline  
import requests, io, astropy
from IPython.display import Image, display

## For handling ordinary astropy Tables
from astropy.table import Table

## For reading FITS files
import astropy.io.fits as apfits

## There are a number of relatively unimportant warnings that 
## show up, so for now, suppress them:
import warnings
warnings.filterwarnings("ignore")

## our stuff
import sys
# Use the NASA_NAVO/astroquery
sys.path.insert(0,'../../astroquery/')
import astroquery
from astroquery.vo import Registry

Registry queries are already coded by TomD and TJ in the astroquery.vo.Registry() class. So if you already know you want to search NED, for example, you can get its URL as follows. Unfortunately, *with the current implementation, you get two results, where the second isn't NED but has "ned" in the ivoid ("shela_combined"). Not sure what to do about that.* We could hard-wire things like "ned", "heasarc", etc., but that's not ideal. 

In [3]:
results = Registry.query(source='ned', service_type='cone',debug=True)
print('Found {} results:'.format(len(results)))
print(results[:]['access_url'])
print(results[1]['ivoid'])
print(results.columns)

Registry:  sending query ADQL = 
          select res.waveband,res.short_name,cap.ivoid,res.res_description,
          int.access_url, res.reference_url
           from rr.capability cap
           natural join rr.resource res
           natural join rr.interface int
           where cap.cap_type='conesearch' and cap.ivoid like '%ned%'

Queried: https://vao.stsci.edu/RegTAP/TapService.aspx/sync

Found 2 results:
                                                 access_url                                                
-----------------------------------------------------------------------------------------------------------
http://ned.ipac.caltech.edu/cgi-bin/NEDobjsearch?search_type=Near+Position+Search&amp;of=xml_main&amp;&amp;
                                                     https://irsa.ipac.caltech.edu/SCS?table=shelacomb&amp;
ivo://irsa.ipac/spitzer/catalog/shela/shela_combined
<TableColumns names=('waveband','short_name','ivoid','res_description','access_url','reference_url'
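
One possible way around the substring ambiguity, sketched here on the assumption that the useful part of an ivoid is its authority (the piece between `ivo://` and the next `/`): match `source` against the dot-separated tokens of the authority rather than against the whole string. The helper name `ivoid_matches_source` and the NED ivoid in the example are hypothetical; the IRSA ivoid is the one returned above.

```python
from urllib.parse import urlsplit

def ivoid_matches_source(ivoid, source):
    """Return True if `source` is one of the dot-separated tokens of the
    ivoid's authority, e.g. 'ned' matches 'ivo://ned.ipac/...'."""
    authority = urlsplit(ivoid).netloc.lower()
    return source.lower() in authority.split('.')

ivoids = [
    'ivo://ned.ipac/Basic_Data_Near_Position',  # assumed NED ivoid, for illustration only
    'ivo://irsa.ipac/spitzer/catalog/shela/shela_combined',
]
# Keep only the services whose authority really names the requested source
matches = [ivoid for ivoid in ivoids if ivoid_matches_source(ivoid, 'ned')]
print(matches)
```

This could be applied as a filter on the table that Registry.query() returns, so the ADQL itself would not need to change. It would not help for a `source` string that is meant to match the path part of the ivoid.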

The Registry.query() method takes arguments (passed to internal function _build_adql):  

    service_type   : "image", "cone", or "spectr"
    keyword        : any keyword contained in ivoid, title, or description
    waveband       : waveband string
    source         : any substring in ivoid
    order_by       : what field to order it by, but then you have to know the names, currently
                      ("waveband","short_name","ivoid","res_description", "access_url", "reference_url")
    logic_string   : any other string you want to add to the ADQL where clause, should start with " and "
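
To make the argument list concrete, here is a rough sketch of how such arguments could be turned into the ADQL shown in the debug output above. This is not the actual `_build_adql` code, which is not reproduced in these notes. The function name `build_adql`, the `cap_type` values for image and spectral services, and the use of `res.res_title` for the keyword search are all assumptions; only the cone search case is confirmed by the output above.

```python
# Assumed mapping from the short service_type names to RegTAP cap_type values;
# only 'cone' -> 'conesearch' is confirmed by the debug output above.
CAP_TYPES = {
    'cone': 'conesearch',
    'image': 'simpleimageaccess',
    'spectr': 'simplespectralaccess',
}

def build_adql(service_type='', keyword='', waveband='', source='',
               order_by='', logic_string=''):
    """Assemble a RegTAP query from the optional filters (hypothetical helper)."""
    conditions = []
    if service_type:
        conditions.append("cap.cap_type='{}'".format(CAP_TYPES[service_type]))
    if source:
        conditions.append("cap.ivoid like '%{}%'".format(source))
    if waveband:
        conditions.append("res.waveband like '%{}%'".format(waveband))
    if keyword:
        conditions.append(
            "(cap.ivoid like '%{0}%' or res.res_title like '%{0}%' "
            "or res.res_description like '%{0}%')".format(keyword))
    query = ("select res.waveband, res.short_name, cap.ivoid, "
             "res.res_description, int.access_url, res.reference_url "
             "from rr.capability cap "
             "natural join rr.resource res "
             "natural join rr.interface int")
    # logic_string is expected to start with " and "; drop that prefix
    # when there is no other condition for it to follow.
    where = " and ".join(conditions) + logic_string
    if where.startswith(" and "):
        where = where[len(" and "):]
    if where:
        query += " where " + where
    if order_by:
        query += " order by " + order_by
    return query

adql = build_adql(service_type='cone', source='ned')
print(adql)
```

The values are pasted into the query without any quoting, so a string containing a single quote would break it; a real implementation would need to escape those.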

The results are already in an astropy table from Tom's _astropy_table_from_votable_response(). 

**But note that the URLs are escaped and should not be by the time we get them back. How to fix?**

In [4]:
import html
print(html.unescape(results[0]['access_url']))

http://ned.ipac.caltech.edu/cgi-bin/NEDobjsearch?search_type=Near+Position+Search&of=xml_main&&
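
Until the escaping is fixed where the registry response is parsed, one workaround would be to clean the URL at the point where the query is built. The sketch below uses only the standard library; `build_service_url` is a hypothetical helper name, and stripping the trailing `&` characters is an assumption about what the services tolerate.

```python
import html
from urllib.parse import urlencode

def build_service_url(access_url, params):
    """Unescape a registry access_url and append query parameters to it."""
    # Undo the XML escaping and drop any dangling separators
    base = html.unescape(access_url).rstrip('&?')
    # Continue an existing query string, or start a new one
    separator = '&' if '?' in base else '?'
    return base + separator + urlencode(params)

escaped = ('http://ned.ipac.caltech.edu/cgi-bin/NEDobjsearch'
           '?search_type=Near+Position+Search&amp;of=xml_main&amp;&amp;')
url = build_service_url(escaped, {'RA': 185.47873, 'DEC': 4.47365, 'SR': 0.03})
print(url)
```

This also avoids the run of `&&&` that appears when the parameters are appended to the raw access URL.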


### 3. Workshop section on data discovery using NED's Cone search. 

Instead of searching NED ‘manually’, we want a generic cone search to which you can give a list of RAs, Decs, and radii (or just one of each) and optionally specify that you want ‘ned’ results or some other ivoid substring. If you ask for known things like ‘ned’, it has a hard-wired (?) base URL and query path. If not, it queries the RegTAP to find out what cone searches are available that match the given ivoid string. (Is there a way to get the NED URL dynamically from RegTAP without the above ambiguity? A special case if the ivoid requested is "ned"?) So the user would call:

    cone_results = astroquery.cone_search(ras, decs, radii, [source='some_ivoid_string_eg_ned']) 

where ras, decs, and radii can be floats, strings, or arrays of either. If a single source (i.e., ivoid) matches, you get back a table of objects; if several sources match (which will have different columns), you get back a list of tables, one for each matching source? Since every service returns different columns, we need to return some kind of metadata as well. As a separate object, or attached to each result column's metadata? 
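
The "floats, strings, or arrays of either" requirement could be handled by normalizing the inputs before any service is queried. A minimal sketch follows; the name `normalize_cone_inputs` is hypothetical, and strings are assumed to hold plain decimal degrees (sexagesimal strings are not handled).

```python
def normalize_cone_inputs(ras, decs, radii):
    """Return a list of (ra, dec, radius) float triples.

    Each argument may be a float, a numeric string, or a list of either.
    A single radius is reused for every position.
    """
    def as_float_list(value):
        # A bare number or string becomes a one-element list
        if isinstance(value, (str, int, float)):
            return [float(value)]
        return [float(v) for v in value]

    ras, decs, radii = (as_float_list(v) for v in (ras, decs, radii))
    if len(ras) != len(decs):
        raise ValueError("ras and decs must have the same length")
    if len(radii) == 1:
        radii = radii * len(ras)
    elif len(radii) != len(ras):
        raise ValueError("radii must be a single value or match ras in length")
    return list(zip(ras, decs, radii))

print(normalize_cone_inputs(185.47873, '4.47365', 0.03))
print(normalize_cone_inputs([185.47873, 35.323], [4.47365, 6.934], [0.03]))
```

A query method could then loop over the returned triples without caring what form the user supplied.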

So like the Registry, we need a Cone() that could work as follows:

In [9]:
from astroquery.query import BaseQuery
import html # to unescape, which shouldn't be necessary but currently is
class ConeClass(BaseQuery):
    def __init__(self):
        super(ConeClass, self).__init__()

    def query(self, inra, indec, inradius, **kwargs):
        # Get the list of URLs that provide matching cone searches 
        services=Registry.query(service_type='cone',**kwargs)
        results=[]
        # If there's more than one service URL found, then what? Loop over those?
        print("Found {} services to query.".format(len(services)))
        for service in services:
            print("    Querying service {}".format(html.unescape(service['access_url'])))
            #  TO BE FIXED: should work if inra is a single float *or* string:
            for i,ra in enumerate(inra):
                # Construct params ... For now, hard wire:
                dec=indec[i]
                if len(inradius) > 1: 
                    radius=inradius[i]
                else:
                    # A single radius applies to every position
                    radius=inradius[0]
                result=self._one_cone_search(ra,dec,radius,html.unescape(service['access_url']))
                # Need a test that we got something back. Shouldn't error if not, just be empty
                if len(result) > 0:
                    # Extend requires that all the columns be the same. 
                    # (The meta data for the result columns are lost because assumed to be the same.)
                    # The "cone_table_from_votable" should do that but for now, append to a list.
                    #from IPython.core.debugger import Tracer; Tracer()() 
                    print("    Got {} results for source number {}".format(len(result),i))
                    results.append(result)
                else:
                    print("    (Got no results for source number {})".format(i))
        return results

    def _one_cone_search(self, ra, dec, radius, service):
        params = {'RA': ra, 'DEC': dec, 'SR':radius}
        # For some reason, this has to be a GET not a POST?
        response=self._request('GET',service,params=params)
        return self._astropy_cone_table_from_votable_response(response)
    
    def _astropy_cone_table_from_votable_response(self,response):
        """Need one of these for each class to make standard tables using UCDs etc.
        
        For now, just simple conversion"""
        try:
            table= Table.read(io.BytesIO(response.content))
            #from IPython.core.debugger import Tracer; Tracer()() 
            table.meta['xml_raw']=response.content
            table.meta['url']=response.url
            return table
        except Exception:
            # Anything that can't be parsed as a VOTable yields an empty table
            return Table()

Cone=ConeClass()


In [11]:
#from astroquery.vo import Cone
#  Single arguments:  should take floats or strings, converts floats to string for the query.
#  For now, make them all arrays until we sort the above issue
ras=[185.47873,35.323]
decs=[4.47365,6.934]
radius=[0.03]
# List arguments:  should take list of floats or strings, converts floats to strings, loops over them. 
#  Length of ras and decs should be the same, radius should be either same length or single value.
ned_results=Cone.query(ras,decs,radius,source='ned')
print(ned_results[0].meta['url'])
print(ned_results[0]['main_col3'].meta['ucd'])

Found 2 services to query.
    Querying service http://ned.ipac.caltech.edu/cgi-bin/NEDobjsearch?search_type=Near+Position+Search&of=xml_main&&
    Got 494 results for source number 0
    Got 6 results for source number 1
    Querying service https://irsa.ipac.caltech.edu/SCS?table=shelacomb&
    (Got no results for source number 0)
    (Got no results for source number 1)
http://ned.ipac.caltech.edu/cgi-bin/NEDobjsearch?search_type=Near+Position+Search&of=xml_main&&&RA=185.47873&DEC=4.47365&SR=0.03
POS_EQ_RA_MAIN


Switching the cell below to use GET works, but POST doesn't. So the problem is the query method used in astroquery.query.BaseQuery.

In [None]:
# But why does that find nothing? Test 'manually':
# This URL doesn't work because it's escaped! 
#url='http://ned.ipac.caltech.edu/cgi-bin/NEDobjsearch?search_type=Near+Position+Search&amp;of=xml_main&amp;&amp;'
url='http://ned.ipac.caltech.edu/cgi-bin/NEDobjsearch?search_type=Near+Position+Search&of=xml_main&&'
response=Cone._request('GET',url,
#                       params={'RA':353.23,'DEC':6.934,'SR':0.05})
                        params={'RA':185.47873,'DEC':4.47365,'SR':0.03})
#                       params={'POS':'185.47873,4.47365','SIZE':3})
print(response.url)
print(response.content)
table=Table.read(io.BytesIO(response.content))
print(len(table))
# This doesn't even return xml. WTF? This isn't a VO service? But it's in the RegTAP! WTF? 

Or if you don't know what the source is but you want to do a cone search on all catalogs related to gamma rays:

In [None]:
radio_results=Cone.query([185.47873],[4.47365],[0.1],waveband='gamma')
print(len(radio_results))

In [None]:
print(radio_results[1982].meta['url'])

*Notes: the _astropy_table_from_votable_response() should then be generic, not just in Registry class*

In [None]:
print(radio_results)

But these come with different columns:

In [None]:
print(radio_results[0].columns)
print(radio_results[402].meta)


In [None]:
# Just testing how to get meta data back from queries. From an image search, there's a ucd in the column
#  meta data. 
# A Cone search:
url='https://heasarc.gsfc.nasa.gov/cgi-bin/vo/cone/coneGet.pl?table=osse&amp;'
response=Cone._request('POST',url,
                       params={'RA':185.47873,'DEC':4.47365,'SR':3})
print(response.content)
table= Table.read(io.BytesIO(response.content))
print(table['target'].meta['ucd'])


In [None]:

# A TAP query?
url='https://heasarc.gsfc.nasa.gov/xamin/vo/tap/sync'
params={
    "request":"doQuery",  # for requests, specify the request type
    "lang":"ADQL",        # the language
    "query":              # and the query expressed in that language
    """SELECT ra, dec, Radial_Velocity FROM zcat as cat where 
    contains(point('ICRS',cat.ra,cat.dec),circle('ICRS',{0},{1},{2}))=1 and
    cat.bmag < 14
    order by cat.radial_velocity_error
    """.format(185.47873,4.47365,3.0)
    }
r = Cone._request('GET','https://heasarc.gsfc.nasa.gov/xamin/vo/tap/sync', params=params)
r.content

And there's no ucd in the meta data for the fields. TomM says this isn't required but it should have been there and he'll make a note.

In [None]:
print(r.url)
print(r.request)

Combining the columns from the different services is again annoying. Do we need a function that combines them using the UCD? But UCDs aren't always there. E.g., in some places HEASARC uses the column name “sia_url”, and this has the UCD “VOX:Image_AccessReference”, so something inside would have to transparently translate from each archive’s column names to something standardized. (The UCD itself? Or something we define, e.g., “image_url”?)

We should just use UCDs or UTYPEs and even just **ignore** all column names that we get back and immediately rename everything with the UCD and use only that internally. Then to hand it back to the user, we define our own astroquery column names that are obvious, like “URL”, etc., though storing the history (the UCD and the original service-specific column name) in the column’s meta data.

We need to keep the ucd information around, however. It's in the VOTable XML that we should get back from any service. And it IS kept in the astropy table as, e.g., uvot_table['Ra'].meta['ucd'] or uvot_table['URL'].meta['ucd']. So we can use these.

So under the hood, we would take any table we get back and rename each column with its UCD. Then to give things to the user, define simple dictionaries of the common columns such as:


Started to define a function for the second part of this cell in the workshop notebook that got the pass bands from NED. This is very NED-specific. Any way to generalize?

    ned_info = astroquery.get_ned_info( ra, dec, radii )

calls the cone search and passes the ACREF for each match to NED again to get the info. But ACREF isn't a required value returned by a cone search. All that's required is the ID, RA, and DEC. So I don't think this can be generalized.


In [None]:
# Somehow have this overloaded for objects of type SIA, SSA, etc.? Or just one list if there are no ambiguities? 
def ucd2col(ucd):
    u2c={
        "VOX:Image_AccessReference":"URL",
        "meta.ref.url":"URL"
    }
    return u2c[ucd]
    
def col2ucd(col):
    # But these aren't unique. 
    c2u={
        "URL":["VOX:Image_AccessReference","meta.ref.url"]        
    }
    return c2u[col]
    
    
def standardize( list_of_tables):
    """Take a list of astropy tables that have all different columns and convert to one standard astropy table
    
    Use the list_of_tables[table_number][column].meta['ucd'] to identify the standard columns we want.
    
    """
    return astropy_table, metadata
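
As a rough illustration of what standardize() could do, here is a sketch that works on plain Python structures instead of astropy tables so that it runs on its own. Each input "table" is assumed to be a dict with `ucds` (column name to UCD) and `rows` (a list of dicts); the `UCD_TO_COLUMN` mapping and the example column names other than “sia_url” are hypothetical. UCDs are compared case-insensitively, and columns without a recognized UCD are dropped.

```python
# Hypothetical mapping from UCDs (lower-cased) to user-facing column names
UCD_TO_COLUMN = {
    'vox:image_accessreference': 'URL',
    'meta.ref.url': 'URL',
    'pos_eq_ra_main': 'RA',
    'pos_eq_dec_main': 'DEC',
}

def standardize(tables):
    """Merge tables with differing columns into one list of rows keyed by
    standard names, plus a per-table record of where each column came from."""
    merged, history = [], []
    for table in tables:
        # Original column name -> standard name, for columns with a known UCD
        rename = {col: UCD_TO_COLUMN[ucd.lower()]
                  for col, ucd in table['ucds'].items()
                  if ucd and ucd.lower() in UCD_TO_COLUMN}
        # Remember the original name and UCD behind each standard column
        history.append({std: (col, table['ucds'][col])
                        for col, std in rename.items()})
        for row in table['rows']:
            merged.append({rename[col]: value
                           for col, value in row.items() if col in rename})
    return merged, history

heasarc = {'ucds': {'sia_url': 'VOX:Image_AccessReference', 'exposure': None},
           'rows': [{'sia_url': 'http://example.org/a.fits', 'exposure': 10.0}]}
other = {'ucds': {'link': 'meta.ref.url'},
         'rows': [{'link': 'http://example.org/b.fits'}]}
rows, history = standardize([heasarc, other])
print(rows)
print(history)
```

With astropy tables, the `ucds` dict would come from `table[col].meta['ucd']` and the history could be stored back in each output column's meta. If one table carries two columns that map to the same standard name, the later one wins in this sketch, which is probably not what we want in the end.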

4 TAP:  

    sia_services = astroquery.list_sia_services( [source='ivoid_string_eg_heasarc'] , [name_like='allwise'], [description_like='whatever'] ) 

This one can easily be generalized so you can get images from any service (or a chosen one) that have a short name substring, or description substring. (Or perhaps just string_like=‘whatever’ and it searches both short name as well as description?) It returns a table of information, including the ‘access_url’ that you can then plug into another generic function




In [None]:
def list_image_services( source='', name_like=''):
    service=astroquery.regtap_get_service(service_type="image",source=source,name_like=name_like)
    return request2table(url=service)
def list_spectra_services( source='', name_like=''):
    service=astroquery.regtap_get_service(service_type="spectra",source=source,name_like=name_like)
    return request2table(url=service)
def list_cone_services( source='', name_like=''):
    service=astroquery.regtap_get_service(service_type="cone",source=source,name_like=name_like)
    return request2table(url=service)


5 SIA

Then pick one of the listed services (say number 20, after you looked at the descriptions) and query it to get an image.

    image_url = astroquery.get_image_url( access_url=sia_services[20]['access_url'], pos='185.47873,4.47365', size='0')

or perhaps you don't know which service is quite what you want, so get info for all of them:

    images_info = astroquery.get_images_info( access_urls=sia_services[:]['access_url'], pos='185.47873,4.47365', size='0', naxis='300,300')

to get a table list of all images in a list of services that contain that point. Then standardize them as above.



In [None]:
def get_image_url( access_url, pos, size, naxis='',format='fits'):
    """Return a single image url
    
    For some services, you have to specify the format, e.g., SkyView. Is this standard?"""
    # Send the query to get the URLs to any matching images
    result_table=astroquery.standardize(
        astroquery.request2table( url=access_url, params = {'pos': pos, 'size': size, 'naxis':naxis})
    )
    # Find the FITS image url. For SkyView, there will be 2 URLs, one FITS and one JPEG
    return result_table[the_right_one]['url']

def get_images_info( access_urls, pos, size, naxis=''):
    """Return the URLs etc. for images available from a list of services."""
    # Since the columns differ, first get a list of astropy tables.
    results=[] 
    for service_url in access_urls:
        results.append( astroquery.request2table( url=service_url, params={'pos': pos, 'size': size, 'naxis': naxis}) )
    return astroquery.standardize(results)
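
On the format question: in the SIA 1.0 protocol, FORMAT is an optional query parameter (with values such as `image/fits`), so asking for FITS explicitly should be safe even where it is not required. The sketch below shows one way the parameters could be assembled and the FITS row picked out of an already standardized result. Both function names are hypothetical, and the `URL` and `format` keys are assumed standardized column names, not something any service returns as-is.

```python
def sia_params(pos, size, naxis='', image_format='image/fits'):
    """Build SIA 1.0 query parameters; empty optional values are left out."""
    params = {'POS': pos, 'SIZE': size, 'FORMAT': image_format, 'NAXIS': naxis}
    return {key: value for key, value in params.items() if value != ''}

def pick_fits_url(rows, url_key='URL', format_key='format'):
    """Return the first access URL whose format is FITS, or None."""
    for row in rows:
        if row.get(format_key, '').lower() == 'image/fits':
            return row[url_key]
    return None

# Illustrative rows, as SkyView-like services return one FITS and one JPEG link
rows = [{'URL': 'http://example.org/img.jpg', 'format': 'image/jpeg'},
        {'URL': 'http://example.org/img.fits', 'format': 'image/fits'}]
print(sia_params('185.47873,4.47365', '0', naxis='300,300'))
print(pick_fits_url(rows))
```

Something like pick_fits_url() could stand in for the `the_right_one` placeholder in get_image_url() above.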

 

7.

You can look at the images_info and pick one to download:

    astroquery.download_image( images_info[6]['url'], filename='my_file.fits')
    
or get the image data to hand to the plotter:

    image=astroquery.get_image( images_info[6]['url'] )
    plt.imshow( image,  cmap='gray', origin='lower',vmax=0.02 )
    
    



In [None]:
                                   
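import urllib.request

def download_image( image_url, filename=''):
    """Download the image, optionally to a chosen filename, and return the local path."""
    # urlretrieve picks a temporary file when no filename is given
    local_path, _ = urllib.request.urlretrieve(image_url, filename or None)
    return local_path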

def get_image( image_url ):
    """Returns the data that can be handed to plt.imshow() from a URL"""
    astroquery.download_image( image_url, filename='tmp.fits')
    hdus=astropy.io.fits.open('tmp.fits')
    return hdus[0].data



8

This is complex but can use the above. 

    galex_image_services = astroquery.list_image_services( name_like="galex") 
    twomass_image_services = astroquery.list_image_services( name_like="2mass") 
    allwise_image_services = astroquery.list_image_services( name_like="allwise") 
    
    query_urls=list(galex_image_services['access_url'])
    query_urls.extend( twomass_image_services['access_url'])
    query_urls.extend( allwise_image_services['access_url'])
    
    for url in query_urls:
    



    