Use TAP with alma - initial version #1689

Merged
merged 1 commit into from Aug 6, 2020

andamian

No description provided.

@astropy-bot astropy-bot bot added the alma label Mar 31, 2020
'gal_longitude': 'Galactic longitude',
'gal_latitude': 'Galactic latitude',
'band_list': 'Band',
's_region': 'Spatial resolution',
Contributor

could use some whitespace cleanup

try:
    import pyvo
except ImportError:
    print('Please install pyvo. astropy.vo does not work without it.')
Contributor

as w/previous: shouldn't this just raise?
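
A minimal sketch of the "just raise" alternative suggested here, using only the standard library. `require_module` is a hypothetical helper, not part of astroquery or pyvo; the point is only that a missing dependency surfaces as an exception rather than as a printed message that callers can overlook.

```python
import importlib


def require_module(name, hint=None):
    """Import and return module `name`, or raise ImportError with a hint.

    Hypothetical helper for illustration; not an astroquery function.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        message = hint or f"Please install {name}."
        # Re-raise instead of printing, so the failure cannot be missed.
        raise ImportError(message) from exc


# A module that exists is imported and returned normally.
json_module = require_module("json")

# A missing module raises, carrying the supplied hint.
try:
    require_module("surely_not_installed_xyz", hint="This feature needs xyz.")
except ImportError as err:
    error_text = str(err)
```

Chaining with `from exc` keeps the original import failure visible in the traceback.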

}

ALMA_BANDS = {
    '3': (84*10**9*u.Hz, 116*10**9*u.Hz),
Contributor

you don't like GHz?

# galactic (Galactic longitude, Galactic latitude str) or
# source_name_resolver (object name) or combination.
# Add them all to the pos list
radius = payload.get('radius', 0.016666666666667*u.deg)
Contributor

maybe 1*u.arcmin?

@keflavich
Contributor

To give you a sense of the API as I've used it, here's a list of a bunch of different queries I have run:

Alma.query_region(coordinate=orionkl_coords, radius=4*u.arcmin, public=False, science=False)
Alma.query_object('M83', public=False, science=False)
Alma.query(payload={'pi_name':'Bally'}, public=False)
alma.query(payload=dict(project_code='2016.1.00165.S'), public=False, cache=False)
alma.query(payload=dict(project_code='2017.1.01355.L', source_name_alma='G008.67'),)
Alma.query_region(coordinates.SkyCoord('5:35:14.461 -5:21:54.41', frame='fk5', unit=(u.hour, u.deg)), radius=0.034*u.deg)
Alma.query_region(coordinates.SkyCoord('5:35:14.461 -5:21:54.41', frame='fk5', unit=(u.hour, u.deg)), radius=0.034*u.deg, payload={'energy.frequency-asu':'215 .. 220'})
Alma.query(payload=dict(project_code='2012.*', public_data=True))
Alma.query(payload={'frequency':'96 .. 96.5'}, cache=False)
rslt = Alma.query_object('M83', band_list=[4,5,8])
Alma.query(payload={'pi_name':'ginsburg', 'band_list':'6'})

but I'll note that my history of using the band & frequency queries is littered with dozens of failed attempts to get the syntax right. The API we'd like to support does not have to take the same syntax, and in fact it would be helpful if we could check the syntax (and specify it) user-side.

So for the basic query API, as long as we continue to support the basic region queries and some variants of the criteria query (on PI, project ID, frequency, band, polarization [tricky, I know], etc), it's OK. I think we'll have to live with syntax-breaking in the uploaded values (e.g., 96 .. 96.5 above) because we don't really know what the old syntax was anyway.
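
As an illustration of checking the syntax user-side, here is a rough sketch of a validator for range strings such as `96 .. 96.5`. `parse_range` is a hypothetical name and the accepted grammar is an assumption based only on the examples above; it is not the archive's actual syntax definition.

```python
import re

# Accepts "low .. high" with optional whitespace, e.g. "96 .. 96.5".
_NUMBER = r"[-+]?\d+(?:\.\d+)?"
_RANGE_RE = re.compile(rf"^\s*({_NUMBER})\s*\.\.\s*({_NUMBER})\s*$")


def parse_range(text):
    """Return (low, high) as floats from a 'low .. high' string.

    Hypothetical helper: raises ValueError so that bad syntax is caught
    on the client before any request is sent.
    """
    match = _RANGE_RE.match(text)
    if match is None:
        raise ValueError(f"expected 'low .. high', got {text!r}")
    low, high = float(match.group(1)), float(match.group(2))
    if low > high:
        raise ValueError(f"lower bound {low} exceeds upper bound {high}")
    return low, high
```

A failed parse then produces an immediate, local error message instead of an empty or confusing server response.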

@andamian
Author

andamian commented Apr 7, 2020

The majority of those examples should work with the changes in this PR (and probably a few others). Wild cards (project_code='2012.*') however do not work.
That being said I suggest the following:
Have one query interface for general users and one for advanced ALMA users. This, in fact, is the raison d'être of the Simple Image Access and TAP ObsCore protocols.
More concretely, I suggest exposing the SIA attributes in the query* methods. Issue deprecation warnings for payload arguments, but for now try to keep them working by "translating" them into SIA constraints whenever possible. Same deal for the results. This will constitute the general user ALMA API.

For users with more intimate knowledge about ALMA, we can create a method to query the TAP service. Users will be given tools to explore the schema of the TAP service (help_tap) and write their own queries. That's where they can use wildcards, sorting, grouping and all the other goodies.

This approach is not too complicated, keeps the library simple, tries to make the changes less disruptive and appeals to any type of audience, including power users.

In summary, I'm proposing:

  1. Add sia argument to query* as a dictionary with the following expected keys: pos, band, time, pol, field_of_view, spatial_resolution, spectral_resolving_power, exptime, timeres, publisher_did, facility, collection, instrument, data_type, calib_level, target_name, res_format, maxrec. Or should they be exposed directly as method arguments, the way I've proposed in the vo module?
  2. Add deprecation warnings when a payload argument is detected, but still attempt to perform the operation when possible.
  3. Add help_sia method. Also update the help method to explain that the interface is deprecated and which arguments are still supported in the current version.
  4. Add query_tap method to query ObsCore via TAP.
  5. Add help_tap to query the ALMA TAP schema and return information about the available ObsCore columns and their types.

Sounds acceptable?
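
To make item 2 concrete, here is a rough sketch of such a compatibility shim. The function name `translate_payload` is hypothetical, and the key mapping is an assumption inferred from the example queries in this thread; the real mapping was still under review at this point.

```python
import warnings

# Assumed mapping from old payload keys to ObsCore-style names,
# inferred from the examples in this thread (not the final mapping).
_PAYLOAD_TO_KEYWORD = {
    "pi_name": "obs_creator_name",
    "project_code": "proposal_id",
    "source_name_alma": "target_name",
}


def translate_payload(payload=None, **kwargs):
    """Merge a deprecated payload dict into explicit keyword constraints."""
    constraints = dict(kwargs)
    if payload:
        warnings.warn(
            "the 'payload' argument is deprecated; pass keyword arguments",
            DeprecationWarning,
            stacklevel=2,
        )
        for key, value in payload.items():
            if key not in _PAYLOAD_TO_KEYWORD:
                raise ValueError(f"unsupported payload key: {key!r}")
            # Explicit keywords win over translated payload entries.
            constraints.setdefault(_PAYLOAD_TO_KEYWORD[key], value)
    return constraints
```

Old calls keep working while the warning points users at the new arguments, and unsupported keys fail loudly instead of being silently dropped.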

@keflavich
Contributor

I like this plan.

Re: (1) - let's have these as options to query_*. I don't immediately understand how they would work as methods, but we want to continue supporting query* interfaces since the rest of astroquery has them.

Everything else is fine.

@andamian
Author

andamian commented Apr 7, 2020

> I like this plan.
>
> Re: (1) - let's have these as options to query_*. I don't immediately understand how they would work as methods, but we want to continue supporting query* interfaces since the rest of astroquery has them.
>
> Everything else is fine.

What I meant is that query* can either have all those arguments added with defaults, or use a single generic sia dictionary argument (similar to payload). The former is a bit better because it works with autocompletion and IDEs, while the latter is consistent with the original payload (generic dictionary argument + corresponding help method).

@keflavich
Contributor

yeah, I like the IDE-friendlier specific arguments approach.

@andamian
Author

andamian commented Apr 9, 2020

I'm adding to the thread the link to the Google spreadsheet that maps the old parameters to the new ones. It needs to be reviewed and commented on.
https://docs.google.com/spreadsheets/d/1X0rrHuTMtU4Tp_CnBR1CdylffvNUIliTgdNMhnLCCws/edit#gid=0

@andamian
Author

The query samples you've posted with the new API:

        alma = Alma()
        # alma.query_region(coordinate=orionkl_coords, radius=4 * u.arcmin,
        #                   public=False, science=False)
        result = alma.query_object('M83', public=False, science=False)
        assert len(result) > 0
        # TODO test public
        result = alma.query_tap(
            "select * from ivoa.ObsCore where obs_creator_name like '%Bally%'")
        assert result
        for row in result:
            assert 'Bally' in row['obs_creator_name']
        result = alma.query_tap(
            "select * from ivoa.Obscore where proposal_id='2016.1.00165.S'")
        assert result
        for row in result:
            assert '2016.1.00165.S' == row['proposal_id']
        result = alma.query_tap(
            "select * from ivoa.Obscore where proposal_id='2017.1.01355.L' "
            "and target_name like '%G008.67%' and data_rights='Public'")
        assert result
        for row in result:
            assert '2017.1.01355.L' == row['proposal_id']
            assert 'Public' == row['data_rights']
            assert 'G008.67' in row['target_name']

        result = alma.query_region(
            coordinates.SkyCoord('5:35:14.461 -5:21:54.41', frame='fk5',
                                 unit=(u.hour, u.deg)), radius=0.034 * u.deg)
        assert result

        # result = alma.query_region(
        #     coordinates.SkyCoord('5:35:14.461 -5:21:54.41', frame='fk5',
        #                          unit=(u.hour, u.deg)), radius=0.034 * u.deg,
        #     payload={'energy.frequency-asu': '215 .. 220'})
        #
        result = alma.query_tap("select * from ivoa.ObsCore where proposal_id "
                                "like '2012.%' and data_rights='Public'")
        assert result
        for row in result:
            assert '2012.' in row['proposal_id']
            assert 'Public' == row['data_rights']

        result = alma.query(payload={'frequency': '96 .. 96.5'})
        assert result
        for row in result:
            # TODO not sure how to test this
            pass

        result = alma.query_object('M83', band_list=[4, 5, 8])
        assert result
        for row in result:
            assert row['band_list'] in ['4', '5', '8']

        result = alma.query_tap(
            "select * from ivoa.ObsCore where lower(obs_creator_name) like "
            "'%ginsburg%' and band_list = '6'")
        assert result
        for row in result:
            assert '6' == row['band_list']
            assert 'ginsburg' in row['obs_creator_name'].lower()

Looks OK? I didn't know what energy.frequency-asu is or how to find it in ALMA ObsCore.

@keflavich
Contributor

Yes, looks good, though I definitely want a shortcut approach to those SQL queries!

I don't know what energy.frequency-asu is either. It looks like I was just trying to find a way to search within a frequency range.

You can put some of this stuff in as actual tests and I can give you some inline comments
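
One possible shape for the shortcut mentioned above is a small builder that turns keyword constraints into an ADQL string. This is only a sketch: `build_adql` is a hypothetical name, and the wildcard rule (`*` becomes a `like` pattern with `%`) is an assumption modelled on the hand-written queries earlier in this thread.

```python
def build_adql(table="ivoa.ObsCore", **constraints):
    """Build a simple ADQL query string from keyword constraints.

    Hypothetical sketch: values containing '*' become LIKE patterns
    using '%'; all other values are compared with '='. Column names
    are trusted as given and are not validated here.
    """
    clauses = []
    for column, value in constraints.items():
        text = str(value).replace("'", "''")  # escape single quotes
        if "*" in text:
            pattern = text.replace("*", "%")
            clauses.append(f"{column} like '{pattern}'")
        else:
            clauses.append(f"{column}='{text}'")
    query = f"select * from {table}"
    if clauses:
        query += " where " + " and ".join(clauses)
    return query


# Reproduces the hand-written "2012.%" query from the samples above.
query_2012 = build_adql(proposal_id="2012.*", data_rights="Public")
```

The resulting string could then be passed to query_tap, so that casual users never write SQL while power users still can.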

@pep8speaks

pep8speaks commented May 25, 2020

Hello @andamian! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-07-29 18:27:58 UTC

@andamian andamian changed the title Used SIAv2 with alma - initial version Used TAP with alma - initial version May 25, 2020
@andamian andamian changed the title Used TAP with alma - initial version Use TAP with alma - initial version May 25, 2020
@andamian andamian force-pushed the alma branch 3 times, most recently from 4aa6534 to 6982b5f Compare June 29, 2020 03:12
@codecov

codecov bot commented Jun 29, 2020

Codecov Report

Merging #1689 into master will increase coverage by 0.27%.
The diff coverage is 79.35%.

@@            Coverage Diff             @@
##           master    #1689      +/-   ##
==========================================
+ Coverage   63.63%   63.91%   +0.27%     
==========================================
  Files         199      200       +1     
  Lines       15543    15729     +186     
==========================================
+ Hits         9891    10053     +162     
- Misses       5652     5676      +24     
Impacted Files                Coverage Δ
astroquery/alma/utils.py      29.22% <11.11%> (+0.08%) ⬆️
astroquery/alma/core.py       34.11% <66.23%> (-1.32%) ⬇️
astroquery/alma/tapsql.py     94.28% <94.28%> (ø)
astroquery/alma/__init__.py   100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6c79ac4...bf3a7fb.

@keflavich
Contributor

We've been given a deadline of "this Wednesday", after which the present ALMA archive will be disabled.

There are problems.

All queries were failing with:

rslt = Alma.query_object('W51', pi_name='*Ginsburg*', public=False)
Traceback (most recent call last):
  File "<ipython-input-12-dbcc322179c1>", line 1, in <module>
    rslt = Alma.query_object('W51', pi_name='*Ginsburg*', public=False)
  File "/home/adam/repos/astroquery/astroquery/utils/class_or_instance.py", line 25, in f
    return self.fn(obj, *args, **kwds)
  File "/home/adam/repos/astroquery/astroquery/utils/process_asyncs.py", line 26, in newmethod
    response = getattr(self, async_method_name)(*args, **kwargs)
  File "/home/adam/repos/astroquery/astroquery/alma/core.py", line 245, in query_object_async
    payload=payload, **kwargs)
  File "/home/adam/repos/astroquery/astroquery/alma/core.py", line 343, in query_async
    result[_OBSCORE_TO_ALMARESULT[ii]] = result[ii]
  File "/home/adam/repos/astropy/astropy/table/table.py", line 1648, in __getitem__
    return self.columns[item]
  File "/home/adam/repos/astropy/astropy/table/table.py", line 239, in __getitem__
    return OrderedDict.__getitem__(self, item)
KeyError: 'frequency'

which resulted from a bad mapping.

for _ in result]
for ii in _OBSCORE_TO_ALMARESULT:
    if _OBSCORE_TO_ALMARESULT[ii] not in result.columns:
        # duplicate equivalent columns
Contributor

why are we doing this? For backward compatibility? I don't like it, I'd rather toss one or the other. I guess we can toss the old, stick with the new. We should add, though, a renaming utility using the _OBSCORE_TO_ALMARESULT mapping to allow users to easily recover the old naming scheme.
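
A rough sketch of what such a renaming utility might look like. To stay self-contained it operates on a plain dict of column name to values as a stand-in for the result table; `to_legacy_names` is a hypothetical name, and the mapping shown is only a few entries quoted elsewhere in this review.

```python
# A few entries from the mapping quoted in this review
# (ObsCore column name -> old result column name).
_OBSCORE_TO_ALMARESULT = {
    "gal_longitude": "Galactic longitude",
    "gal_latitude": "Galactic latitude",
    "t_exptime": "Integration",
    "obs_release_date": "Release date",
}


def to_legacy_names(columns, mapping=_OBSCORE_TO_ALMARESULT):
    """Return a copy of `columns` with ObsCore keys renamed to legacy names.

    `columns` is a plain dict standing in for a result table. Keys that
    are absent from the mapping are kept unchanged, and mapping entries
    absent from `columns` are ignored rather than raising KeyError.
    """
    return {mapping.get(name, name): values for name, values in columns.items()}


renamed = to_legacy_names(
    {"t_exptime": [30.0], "proposal_id": ["2016.1.00165.S"]}
)
```

Iterating over the columns that are actually present, instead of over the mapping, is what avoids the KeyError seen in the traceback above.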

row = result[0]
for item in _OBSCORE_TO_ALMARESULT.items():
    if item[0] == 't_min':
        assert Time(row[item[0]], format='mjd').strftime('%d-%m-%Y') ==\
Contributor

my changes cause a failure here, but I suspect that's because the data are out of date - alma_onerow contains data that aren't returned from alma at present.

cache : bool
Cache the query?
The object name. Will be resolved by astropy.coord.SkyCoord
cache : deprecated
Contributor

is it necessary to deprecate caching? Does pyvo not support it? The caching functionality is extremely useful, even though it is problematic in several ways (we have no way to test if the cache is out of date, for instance)

'is_mosaic': 'Mosaic',
't_exptime': 'Integration',
'obs_release_date': 'Release date',
'frequency': 'Frequency support',
Contributor

I changed this & coi_name, but that triggered some remote test failures. My changes may therefore be incorrect and need to be reverted - feel free to do so, but let's make sure the docs examples work.

@andamian andamian force-pushed the alma branch 3 times, most recently from 645cca3 to 6518c9d Compare July 29, 2020 04:36
@ceb8 ceb8 self-requested a review July 29, 2020 16:50
Member

@ceb8 ceb8 left a comment

Once the issue of extra changes leaking into the PR has been resolved, I approve this PR. Looks good.

@andamian andamian force-pushed the alma branch 2 times, most recently from 3bc127c to a1e5fcc Compare July 29, 2020 17:35
@bsipocz bsipocz added this to the v0.4.2 milestone Jul 31, 2020
@keflavich keflavich merged commit 2fe796d into astropy:master Aug 6, 2020
@bsipocz
Member

bsipocz commented Aug 6, 2020

Thanks @andamian!
