## ADQL = SQL for Astrophysicists
ADQL (Astronomical Data Query Language) is based on a subset of SQL, extended with functions for astronomical work such as geometric searches on the sky. Users who are familiar with SQL can therefore learn ADQL easily and apply their knowledge to query astronomical databases, such as those provided by MAST. At the same time, ADQL gives astronomers and astrophysicists a powerful tool for querying large and complex astronomical datasets.
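
To give a feel for the syntax, here is a minimal sketch of an ADQL cone search built as a Python string. The table name ivoa.obscore and the columns s_ra, s_dec, obs_collection and obs_id follow the IVOA ObsCore convention and are assumptions here, as are the illustrative coordinates; the cells in this notebook use the MAST JSON API instead and do not run this query.

```python
# Hypothetical example: table and column names follow the IVOA ObsCore
# convention and are assumptions, not something this notebook executes.
ra, dec, radius = 202.48417, 47.23056, 0.2  # all in degrees

adql_query = (
    "SELECT TOP 10 obs_collection, obs_id, s_ra, s_dec "
    "FROM ivoa.obscore "
    "WHERE CONTAINS(POINT('ICRS', s_ra, s_dec), "
    f"CIRCLE('ICRS', {ra}, {dec}, {radius})) = 1"
)

print(adql_query)
```

Apart from TOP and the geometry functions (CONTAINS, POINT, CIRCLE), the statement reads like ordinary SQL.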

In [15]:
# First, import the modules this notebook requires
import sys
import os
import time
import re
import json

import requests
from urllib.parse import quote as urlencode

from astropy.table import Table
import numpy as np

import pprint
pp = pprint.PrettyPrinter(indent=4)

The next cell defines a function called mast_query which takes a dictionary object as an input, representing a request to the MAST (Mikulski Archive for Space Telescopes) API. The function returns two values: the HTTP headers of the response and the content (data) of the response.

The function first sets the base URL for the API (https://mast.stsci.edu/api/v0/invoke) and determines the version of Python being used. It then creates a dictionary of HTTP headers to send with the request: the content type, the accept type, and a user agent string made of "python-requests/" followed by the Python version number.

The function then encodes the request dictionary as a JSON string and URL encodes it. This is necessary because the MAST API expects the request to be sent as a URL-encoded string.

The function then sends an HTTP POST request to the MAST API, including the URL-encoded request string and the HTTP headers. The response headers and content are then extracted from the response object and returned by the function.

Overall, this function provides a convenient way to send requests to the MAST API and retrieve the associated data. It handles encoding the request in the proper format, sending it to the API, and extracting the response data.
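
The encoding step can be tried on its own, without contacting the server. The sketch below builds the same kind of form body that mast_query sends and then decodes it again to confirm nothing is lost; the request dictionary is only an illustration.

```python
import json
from urllib.parse import quote, unquote

# Illustrative request dictionary (same shape as a name-lookup request)
request = {'service': 'Mast.Name.Lookup',
           'params': {'input': 'M51', 'format': 'json'}}

# Serialize to JSON, then percent-encode characters that are unsafe in a form body
encoded = quote(json.dumps(request))
body = "request=" + encoded
print(body)

# Reversing both steps recovers the original dictionary
decoded = json.loads(unquote(encoded))
print(decoded == request)
```

Braces, quotes, colons and spaces in the JSON text are all replaced by percent escapes, which is why the body is safe to send with the application/x-www-form-urlencoded content type.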

In [16]:
def mast_query(request):
    """Perform a MAST query.
    
    Parameters
    ----------
    request : dict
        The MAST request JSON object.

    Returns
    -------
    head : dict-like
        The HTTP headers of the response.
    content : str
        The returned data, decoded as UTF-8.
    """
    
    # Base API url
    request_url='https://mast.stsci.edu/api/v0/invoke'    
    
    # Grab Python Version 
    version = ".".join(map(str, sys.version_info[:3]))

    # Create Http Header Variables
    headers = {"Content-type": "application/x-www-form-urlencoded",
               "Accept": "text/plain",
               "User-agent":"python-requests/"+version}

    # Encoding the request as a json string
    req_string = json.dumps(request)
    req_string = urlencode(req_string)
    
    # Perform the HTTP request
    resp = requests.post(request_url, data="request="+req_string, headers=headers)
    
    # Pull out the headers and response content
    head = resp.headers
    content = resp.content.decode('utf-8')

    return head, content

M51 (NGC 5194), known as the Whirlpool Galaxy, is a grand design spiral galaxy located in the constellation Canes Venatici, about 23 million light-years from Earth. Like M101, it has a bright central bulge and prominent spiral arms that are rich in gas and dust. The Whirlpool Galaxy is also notable for its interaction with a smaller companion galaxy, NGC 5195, which appears as a small, distorted object at the end of one of its spiral arms. This interaction has triggered star formation and created a number of bright star clusters in the Whirlpool's spiral arms.

In [17]:
object_of_interest = 'M51'

resolver_request = {'service':'Mast.Name.Lookup',
                     'params':{'input':object_of_interest,
                               'format':'json'},
                     }

headers, resolved_object_string = mast_query(resolver_request)

resolved_object = json.loads(resolved_object_string)

pp.pprint(resolved_object)

{   'resolvedCoordinate': [   {   'cacheDate': 'Feb 21, 2023, 12:01:47 PM',
                                  'cached': True,
                                  'canonicalName': 'MESSIER 051',
                                  'decl': 47.23056,
                                  'objectType': 'GPair',
                                  'ra': 202.48417,
                                  'radius': 0.075,
                                  'resolver': 'NED',
                                  'resolverTime': 24,
                                  'searchRadius': -1.0,
                                  'searchString': 'm51'}],
    'status': ''}


In [18]:
obj_ra = resolved_object['resolvedCoordinate'][0]['ra']
obj_dec = resolved_object['resolvedCoordinate'][0]['decl']
print(f"{object_of_interest} is located at ra {obj_ra}, decl {obj_dec}")

M51 is located at ra 202.48417, decl 47.23056
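
The indexing above takes the first entry of resolvedCoordinate for granted. A minimal sketch of a more defensive version follows; the helper name first_coordinate is hypothetical, and it assumes that a name the resolver cannot match comes back with an empty or missing resolvedCoordinate list.

```python
def first_coordinate(resolved):
    """Return (ra, dec) of the first match, or raise if nothing was resolved.

    ASSUMPTION: an unresolved name yields an empty or missing
    'resolvedCoordinate' list in the service response.
    """
    matches = resolved.get('resolvedCoordinate') or []
    if not matches:
        raise ValueError("name lookup returned no coordinates")
    return matches[0]['ra'], matches[0]['decl']

# Trimmed-down copy of the response shown above
sample = {'resolvedCoordinate': [{'ra': 202.48417, 'decl': 47.23056}],
          'status': ''}

ra, dec = first_coordinate(sample)
print(ra, dec)
```

Failing early with a clear message is usually easier to debug than an IndexError raised a few cells later.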


In [19]:
mast_request = {'service':'Mast.Caom.Cone',
                'params':{'ra':obj_ra,
                          'dec':obj_dec,
                          'radius':0.2},
                'format':'json',
                'pagesize':2000,
                'page':1,
                'removenullcolumns':True,
                'removecache':True}

headers, mast_data_str = mast_query(mast_request)

mast_data = json.loads(mast_data_str)

print(mast_data.keys())
print("Query status:",mast_data['status'])

dict_keys(['status', 'msg', 'data', 'fields', 'paging'])
Query status: COMPLETE
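
The request above asks only for page 1 with a page size of 2000, so a search that matches more rows returns just the first page. The sketch below shows one way to walk through the remaining pages. It is hedged in two ways: the helper fetch_all_pages is hypothetical, and it assumes the 'paging' entry of the response carries a 'pagesFiltered' count. A stand-in query function replaces mast_query so the example runs without network access.

```python
import json

def fetch_all_pages(query_func, request, max_pages=10):
    """Collect rows from successive pages of a paged query.

    query_func must behave like mast_query: take a request dict and
    return (headers, content_string).
    """
    rows = []
    for page in range(1, max_pages + 1):
        # Copy the request so the caller's dict is not modified
        _, content = query_func(dict(request, page=page))
        result = json.loads(content)
        rows.extend(result.get('data', []))
        # ASSUMPTION: 'paging' reports the total page count as 'pagesFiltered'
        total_pages = result.get('paging', {}).get('pagesFiltered', 1)
        if page >= total_pages:
            break
    return rows

def fake_query(request):
    """Stand-in for mast_query: serves 3 pages of 2 made-up rows each."""
    page = request['page']
    payload = {'status': 'COMPLETE',
               'data': [{'obsid': 10 * page + i} for i in range(2)],
               'paging': {'page': page, 'pagesFiltered': 3}}
    return {}, json.dumps(payload)

all_rows = fetch_all_pages(fake_query, {'service': 'Mast.Caom.Cone'})
print(len(all_rows))
```

With the real service, mast_query would be passed in place of fake_query; the max_pages cap guards against looping forever if the paging information is not what this sketch expects.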


In [20]:
pp.pprint(mast_data['fields'][:5])

[   {'name': 'intentType', 'type': 'string'},
    {'name': 'obs_collection', 'type': 'string'},
    {'name': 'provenance_name', 'type': 'string'},
    {'name': 'instrument_name', 'type': 'string'},
    {'name': 'project', 'type': 'string'}]


In [21]:
pp.pprint(mast_data['data'][0])

{   '_selected_': None,
    'calib_level': 3,
    'dataRights': 'PUBLIC',
    'dataURL': None,
    'dataproduct_type': 'image',
    'distance': 0,
    'em_max': 1000,
    'em_min': 600,
    'filters': 'TESS',
    'instrument_name': 'Photometer',
    'intentType': 'science',
    'jpegURL': None,
    'mtFlag': False,
    'obs_collection': 'TESS',
    'obs_id': 'tess-s0016-4-3',
    'obs_title': None,
    'obsid': 27545566,
    'project': 'TESS',
    'proposal_id': 'N/A',
    'proposal_pi': 'Ricker, George',
    'proposal_type': None,
    'provenance_name': 'SPOC',
    's_dec': 50.01251405441704,
    's_ra': 200.6066940322731,
    's_region': 'POLYGON 188.96005900 47.31461100 195.80207500 58.03466100 '
                '213.76319300 51.45788000 203.63432700 41.78187500 '
                '188.96005900 47.31461100 ',
    'sequence_number': 16,
    'srcDen': None,
    't_exptime': 1425.599358,
    't_max': 58762.80885051,
    't_min': 58738.14190212,
    't_obs_release': 58782.3333334,
    ...

In [30]:
mast_data_table = Table()

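# Add one table column per field reported by MAST
for col,atype in [(x['name'],x['type']) for x in mast_data['fields']]:
    # Translate MAST type names into names numpy accepts as dtypes
    if atype=="string":
        atype="str"
    if atype=="boolean":
        atype="bool"
    # Rows that lack this field contribute None
    mast_data_table[col] = np.array([x.get(col,None) for x in mast_data['data']],dtype=atype)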
    
print(mast_data_table)
#print(mast_data_table.dtype)
mast_data_table.info()
num_rows = len(mast_data_table)
num_cols = len(mast_data_table.colnames)
print("The mast_data_table has", num_rows, "rows and", num_cols, "columns.")

intentType obs_collection provenance_name ...      distance      _selected_
---------- -------------- --------------- ... ------------------ ----------
   science           TESS            SPOC ...                0.0      False
   science           TESS            SPOC ...                0.0      False
   science           TESS            SPOC ...                0.0      False
   science           TESS            SPOC ...                0.0      False
   science           TESS            SPOC ...  45.44407539669823      False
   science          SWIFT            None ...                0.0      False
   science          SWIFT            None ...                0.0      False
   science          SWIFT            None ...                0.0      False
   science          SWIFT            None ...                0.0      False
   science          SWIFT            None ...                0.0      False
       ...            ...             ... ...                ...        ...

The result is 2000 rows of data drawn from several astronomical collections. This demonstrates one of the basic mechanisms that astronomers and astrophysicists use to gather archived data from space-based and ground-based observatories.
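
One quick way to see which collections contributed rows is to tally the obs_collection field. The sketch below does this on a small hand-made list; apart from the first obs_id, the rows are placeholders rather than real query results. In the notebook, the same tally could be taken over mast_data['data'].

```python
from collections import Counter

# Placeholder rows: only the first obs_id comes from the output above
sample_rows = [
    {'obs_collection': 'TESS', 'obs_id': 'tess-s0016-4-3'},
    {'obs_collection': 'TESS', 'obs_id': 'placeholder-2'},
    {'obs_collection': 'SWIFT', 'obs_id': 'placeholder-3'},
]

# Count how many rows each collection contributed
counts = Counter(row['obs_collection'] for row in sample_rows)

for collection, n in counts.most_common():
    print(collection, n)
```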

Astronomers and astrophysicists often rely on archived data from space-based and ground-based observatories to study and understand the universe. These archives consist of vast amounts of data collected over many years from a variety of telescopes and instruments. By querying these archives and retrieving the relevant data, researchers can analyze and interpret the observations to answer their scientific questions.

The code above queries the MAST archive to retrieve a large set of astronomical observations around a single target. It shows how a short program can automate the collection of archived data, which helps researchers work through far more observations than they could gather by hand.