PyVO API

Laurent MICHEL edited this page Apr 21, 2022 · 12 revisions

Proposals for a model-oriented API that does not break the existing one

In the following proposal, we consider only TAP responses, and therefore tabular data. Retrieving complete instances such as Provenance or Cube is out of scope for this section.

Regular PyVO API from readthedocs (For the record)

import pyvo as vo
service = vo.dal.TAPService("http://dc.g-vo.org/tap")
resultset = service.search("SELECT TOP 1 * FROM ivoa.obscore")
row = resultset[0]
column = resultset["dataproduct_type"]

for row in resultset:
    calib_level = row["calib_level"]

Proposals for an API with a DM flavour

Proposal 1: Featuring the resultset object with model capabilities

We assume a VODML-enabled Astropy package such as this one.

The service connection has no reason to be changed.

import pyvo as vo
service = vo.dal.TAPService("http://dc.g-vo.org/tap")

Data search remains similar, but the search methods must be overloaded so that the caller can request the model mapping.

  • We can assume that only TAP services will be able to provide annotated data. The simple services do not need annotation since the model mapping is defined by the standards.
  • The search endpoint returns an instance of a subclass of ResultSet (e.g. ResultSet_DM) that is able to generate data model views.
  • ResultSet_DM supports all ResultSet functionality plus some model-specific features.
  • Whether ResultSet_DM or ResultSet is used is transparent to users who are not model-aware.
  • ResultSet_DM could actually replace ResultSet.
resultset_dm = service.search("SELECT TOP 1 * FROM ivoa.obscore", dm_mapping=True)
row = resultset_dm[0]
column = resultset_dm["dataproduct_type"]
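The subclassing idea listed above can be sketched without touching PyVO itself. In the following minimal sketch, `ResultSet`, `ResultSet_DM` and `search` are stand-ins written for illustration only; they are not the real PyVO classes, and the `mapping_block` argument is an assumption about how the annotation might be carried.

```python
# Stand-ins for illustration only; these are NOT the real pyvo classes.
class ResultSet:
    """Minimal stand-in for a tabular result set."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __getitem__(self, key):
        # An integer selects a row, a string selects a column.
        if isinstance(key, int):
            return self._rows[key]
        return [row[key] for row in self._rows]

    def __len__(self):
        return len(self._rows)


class ResultSet_DM(ResultSet):
    """Hypothetical subclass that also carries the mapping block."""

    def __init__(self, rows, mapping_block=None):
        super().__init__(rows)
        self._mapping_block = mapping_block

    def get_block(self):
        # Model-specific addition; everything else is inherited.
        return self._mapping_block


def search(rows, mapping_block=None, dm_mapping=False):
    """Hypothetical stand-in for the overloaded search method."""
    if dm_mapping:
        return ResultSet_DM(rows, mapping_block)
    return ResultSet(rows)


rows = [{"dataproduct_type": "image", "calib_level": 2}]
plain = search(rows)
annotated = search(rows, mapping_block="<VODML/>", dm_mapping=True)

# Row and column access behave the same on both objects.
print(plain[0]["calib_level"], annotated[0]["calib_level"])
print(annotated["dataproduct_type"])
```

Because `ResultSet_DM` only adds methods, code written against `ResultSet` keeps working unchanged, which is what makes the choice transparent to users who are not model-aware.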

Here starts my proposal for the model-related functions.

  • The first step is to know which models are mapped
  • We can either get the whole mapping block as an etree instance
mapping_block = resultset_dm.get_block()
> <dm-mapping:VODML>
>    ...
> </dm-mapping:VODML>
  • or look for mapped models
mapped_models = resultset_dm.get_mapped_models()
> [ 'Measure', 'Coordinates', 'Mango', 'photDM']
  • or get the model classes on which data are mapped
mapped_classes = resultset_dm.get_mapped_dmtypes()
> [ 'meas:Position', 'mango:status', 'photdm:PhotometryCalib']
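As a rough illustration of how the two look-ups above could be implemented on top of an etree mapping block, here is a minimal sketch using only the standard library. The XML layout (`MODEL` elements with a `name` attribute, `INSTANCE` elements under `TEMPLATES`) and the namespace URI `urn:example:dm-mapping` are assumptions made for this example, not the actual annotation syntax.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, assumed for this sketch only.
NS = {"dm": "urn:example:dm-mapping"}

# Toy mapping block; the element layout is an assumption.
MAPPING_BLOCK = """
<dm-mapping:VODML xmlns:dm-mapping="urn:example:dm-mapping">
  <dm-mapping:MODEL name="meas"/>
  <dm-mapping:MODEL name="coords"/>
  <dm-mapping:TEMPLATES>
    <dm-mapping:INSTANCE dmtype="meas:Position"/>
    <dm-mapping:INSTANCE dmtype="mango:status"/>
  </dm-mapping:TEMPLATES>
</dm-mapping:VODML>
""".strip()


def get_mapped_models(block):
    """Return the names of the models declared in the mapping block."""
    return [model.get("name") for model in block.findall("dm:MODEL", NS)]


def get_mapped_dmtypes(block):
    """Return the dmtype of each instance mapped on the table rows."""
    instances = block.findall("dm:TEMPLATES/dm:INSTANCE", NS)
    return [instance.get("dmtype") for instance in instances]


block = ET.fromstring(MAPPING_BLOCK)
print(get_mapped_models(block))
print(get_mapped_dmtypes(block))
```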

From that point, we have everything we need to play with model instances.

  • It is important that ResultSet_DM provides a getter on the last row read; this makes it easier to apply model operations to that row.
  • The first thing we can ask for is the complete mapped row (XML or JSON)
row = resultset_dm.next()
xml_row_model_view = resultset_dm.get_row_model_view()
> <dm-mapping:INSTANCE dmtype= 'meas:Position' .../>
> <dm-mapping:INSTANCE dmtype= 'mango:status' .../>
> <dm-mapping:INSTANCE dmtype= 'photdm:PhotometryCalib' .../>
json_row_model_view = resultset_dm.get_row_json_model_view()
> {@dmtype= 'meas:Position' ...}
  • We can also ask for a specific dm-type.
xml_instance = resultset_dm.get_row_model_view_by_type('mango:status')
> <dm-mapping:INSTANCE dmtype= 'mango:status' .../>

Results are given as etree instances that can easily be parsed with XPath queries, either by user code or by model-specific classes.
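To give an idea of what such parsing looks like in user code, the sketch below runs two XPath queries on a toy instance with the standard-library `xml.etree.ElementTree`, which supports a limited XPath subset. The `ATTRIBUTE` elements, their `dmrole` values and the namespace URI are made up for the example.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, assumed for this sketch only.
NS = {"dm": "urn:example:dm-mapping"}

# Toy instance; the roles and values below are invented.
INSTANCE_XML = """
<dm-mapping:INSTANCE xmlns:dm-mapping="urn:example:dm-mapping" dmtype="mango:status">
  <dm-mapping:ATTRIBUTE dmrole="mango:status.value" value="2"/>
  <dm-mapping:ATTRIBUTE dmrole="mango:status.label" value="good"/>
</dm-mapping:INSTANCE>
""".strip()

instance = ET.fromstring(INSTANCE_XML)

# Select one attribute by its role with an attribute predicate.
node = instance.find("dm:ATTRIBUTE[@dmrole='mango:status.label']", NS)
label = node.get("value")

# List every role carried by the instance.
roles = [attr.get("dmrole") for attr in instance.findall("dm:ATTRIBUTE", NS)]

print(label)
print(roles)
```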

One can also ask for GLOBALS elements such as coordinate frames or photometric filters.

globals_models = resultset_dm.get_globals_dmtypes()
> [ 'coords:SpaceFrame', 'photdm:Filter']
globals_model = resultset_dm.get_globals_by_type('photdm:Filter')
> [<dm-mapping:INSTANCE dmtype='photdm:Filter' .../>, ...]

This operation does not tell e.g. which measure is linked with which filter. This is achieved by parsing the Measure instance which contains e.g. the filter reference.
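The reference resolution described above could look like the following sketch. It assumes that GLOBALS instances carry a `dmid` identifier and that a measure points to them through a `REFERENCE` element with a `dmref` attribute; these names, the helper `resolve_references` and the namespace URI are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, assumed for this sketch only.
NS = {"dm": "urn:example:dm-mapping"}

# Toy block: two filters in GLOBALS, one measure referencing the second.
BLOCK_XML = """
<dm-mapping:VODML xmlns:dm-mapping="urn:example:dm-mapping">
  <dm-mapping:GLOBALS>
    <dm-mapping:INSTANCE dmid="_filter_g" dmtype="photdm:Filter"/>
    <dm-mapping:INSTANCE dmid="_filter_r" dmtype="photdm:Filter"/>
  </dm-mapping:GLOBALS>
  <dm-mapping:TEMPLATES>
    <dm-mapping:INSTANCE dmtype="meas:GenericMeasure">
      <dm-mapping:REFERENCE dmref="_filter_r"/>
    </dm-mapping:INSTANCE>
  </dm-mapping:TEMPLATES>
</dm-mapping:VODML>
""".strip()


def resolve_references(block, instance):
    """Return the GLOBALS instances referenced from within `instance`."""
    globals_by_id = {
        g.get("dmid"): g
        for g in block.findall("dm:GLOBALS/dm:INSTANCE", NS)
    }
    resolved = []
    for ref in instance.findall(".//dm:REFERENCE", NS):
        target = globals_by_id.get(ref.get("dmref"))
        if target is not None:  # silently skip dangling references
            resolved.append(target)
    return resolved


block = ET.fromstring(BLOCK_XML)
measure = block.find("dm:TEMPLATES/dm:INSTANCE", NS)
linked = resolve_references(block, measure)
print([g.get("dmid") for g in linked])
```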

Models usually do not provide specific classes for each physical quantity; the latter (in Measure and Mango at least) are identified by UCDs. To deal with this, we have to provide an API based on UCDs.

mapped_quantities = resultset_dm.get_mapped_ucds()
> [ 'pos.eq;meta.main', 'pos.eq;meta.main', 'phot.mag']
xml_model_view = resultset_dm.get_row_model_view_by_ucd('phot.mag')
> <dm-mapping:INSTANCE dmtype='meas:GenericMeasure' .../>
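A UCD-based look-up could be sketched as below. The sketch assumes that each measure stores its UCD in an `ATTRIBUTE` child whose `dmrole` is `meas:Measure.ucd`; that layout, the helper names and the namespace URI are assumptions, and the filtering is done in plain Python because the standard-library XPath subset cannot express it in a single query.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI and UCD role, both assumed for this sketch.
URI = "urn:example:dm-mapping"
NS = {"dm": URI}
UCD_ROLE = "meas:Measure.ucd"

TEMPLATES_XML = """
<dm-mapping:TEMPLATES xmlns:dm-mapping="urn:example:dm-mapping">
  <dm-mapping:INSTANCE dmtype="meas:Position">
    <dm-mapping:ATTRIBUTE dmrole="meas:Measure.ucd" value="pos.eq;meta.main"/>
  </dm-mapping:INSTANCE>
  <dm-mapping:INSTANCE dmtype="meas:GenericMeasure">
    <dm-mapping:ATTRIBUTE dmrole="meas:Measure.ucd" value="phot.mag"/>
  </dm-mapping:INSTANCE>
</dm-mapping:TEMPLATES>
""".strip()


def get_mapped_ucds(templates):
    """Collect the UCD of every mapped measure, in document order."""
    return [
        attr.get("value")
        for attr in templates.iter(f"{{{URI}}}ATTRIBUTE")
        if attr.get("dmrole") == UCD_ROLE
    ]


def get_instances_by_ucd(templates, ucd):
    """Return the instances whose UCD attribute equals `ucd`."""
    matches = []
    for instance in templates.findall("dm:INSTANCE", NS):
        for attr in instance.findall("dm:ATTRIBUTE", NS):
            if attr.get("dmrole") == UCD_ROLE and attr.get("value") == ucd:
                matches.append(instance)
                break
    return matches


templates = ET.fromstring(TEMPLATES_XML)
ucds = get_mapped_ucds(templates)
magnitudes = get_instances_by_ucd(templates, "phot.mag")
print(ucds, [m.get("dmtype") for m in magnitudes])
```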

Requiring clients to parse model instances with XPath, though easy, would likely dissuade people from using models. For the most common cases, we may provide placeholder objects that facilitate working with model instances.

row = resultset_dm.next()
sky_position = resultset_dm.get_ap_sky_position()
print(f"{sky_position.ra} {sky_position.dec}")
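One possible shape for such a placeholder object is a small immutable class built from the etree instance, so that user code never sees XPath. In the sketch below, `SkyPosition`, its `from_instance` constructor, the simplified `ra`/`dec` roles and the namespace URI are all hypothetical; a real implementation might return an Astropy coordinate object instead.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

# Placeholder namespace URI, assumed for this sketch only.
NS = {"dm": "urn:example:dm-mapping"}

# Toy position instance; the roles "ra" and "dec" are simplified.
POSITION_XML = """
<dm-mapping:INSTANCE xmlns:dm-mapping="urn:example:dm-mapping" dmtype="meas:Position">
  <dm-mapping:ATTRIBUTE dmrole="ra" value="10.5"/>
  <dm-mapping:ATTRIBUTE dmrole="dec" value="-41.25"/>
</dm-mapping:INSTANCE>
""".strip()


@dataclass(frozen=True)
class SkyPosition:
    """Hypothetical placeholder hiding the XML behind plain attributes."""

    ra: float
    dec: float

    @classmethod
    def from_instance(cls, instance):
        def value_of(role):
            node = instance.find(f".//dm:ATTRIBUTE[@dmrole='{role}']", NS)
            if node is None:
                raise ValueError(f"no attribute with role {role!r}")
            return float(node.get("value"))

        return cls(ra=value_of("ra"), dec=value_of("dec"))


sky_position = SkyPosition.from_instance(ET.fromstring(POSITION_XML))
print(f"{sky_position.ra} {sky_position.dec}")
```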

This must be done for the most popular packages or models (Astropy, Meas, PhotDM, Mango).

Proposal 2: Adding a model viewer isolated from the resultset

All the model-related code is grouped in one package, namely vomas (TBC).

Let's start by using the regular API.

import pyvo as vo
service = vo.dal.TAPService("http://model.enable.org/tap")
resultset_dm = service.search("SELECT TOP 1 * FROM ivoa.obscore", dm_mapping=True)

Next, we can check that the VOTable contains annotations. Let's assume they are in the first resource.

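import sys

# ModelViewer is assumed to come from the proposed vomas package (TBC).
mviewer = None
for resource in resultset_dm.votable.resources:
    if resource.model_mapping is None:
        print("no mapping block")
        sys.exit(1)
    mviewer = ModelViewer(resource)
    # Only the first resource is inspected, as assumed above.
    break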

Let's look for coords:MJD instances in the first row:

row = None
while mviewer.get_next_row() is not None:
    row = mviewer.current_data_row
    for mjd in mviewer.get_model_component_by_type("coords:MJD"):
        XmlUtils.pretty_print(mjd)
    break

With this proposal, the mapping block processor acts as a wrapper around the regular VOTable processor.
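The wrapper idea can be illustrated with a toy viewer that iterates over plain rows and builds model components on demand. This is only a sketch of the pattern: the constructor arguments (a list of row dictionaries and a list of `(dmtype, {role: column})` templates) are stand-ins for the VOTable resource and its mapping block, and the class is not the actual `ModelViewer`.

```python
class ModelViewer:
    """Hypothetical wrapper; not the actual implementation."""

    def __init__(self, rows, templates):
        # rows: dictionaries standing in for the regular table reader.
        # templates: (dmtype, {role: column_name}) pairs standing in
        # for the mapping block.
        self._rows = iter(rows)
        self._templates = templates
        self.current_data_row = None

    def get_next_row(self):
        """Advance to the next data row; return None when exhausted."""
        self.current_data_row = next(self._rows, None)
        return self.current_data_row

    def get_model_component_by_type(self, dmtype):
        """Build the components of the given type from the current row."""
        if self.current_data_row is None:
            return []
        return [
            {role: self.current_data_row[column] for role, column in roles.items()}
            for template_type, roles in self._templates
            if template_type == dmtype
        ]


rows = [{"obs_time": 59000.5, "flux": 1.2}]
templates = [("coords:MJD", {"date": "obs_time"})]
viewer = ModelViewer(rows, templates)

while viewer.get_next_row() is not None:
    for mjd in viewer.get_model_component_by_type("coords:MJD"):
        print(mjd)
    break
```

The regular row access stays available through `current_data_row`, so the model view is an additional layer rather than a replacement for the tabular one.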