Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Datalink updates #2493

Merged
merged 23 commits into from
Aug 20, 2022
Merged

Datalink updates #2493

merged 23 commits into from
Aug 20, 2022

Conversation

at88mph
Copy link
Contributor

@at88mph at88mph commented Aug 15, 2022

Fixes #2489, replacement for #2438

@bsipocz
Copy link
Member

bsipocz commented Aug 15, 2022

Could you rebase, so we don't have the merges for branch update?

@codecov
Copy link

codecov bot commented Aug 15, 2022

Codecov Report

Merging #2493 (fefacc7) into main (3b37e03) will increase coverage by 0.07%.
The diff coverage is 72.91%.

@@            Coverage Diff             @@
##             main    #2493      +/-   ##
==========================================
+ Coverage   62.91%   62.99%   +0.07%     
==========================================
  Files         133      133              
  Lines       17313    17291      -22     
==========================================
- Hits        10893    10892       -1     
+ Misses       6420     6399      -21     
Impacted Files Coverage Δ
astroquery/alma/core.py 48.19% <72.91%> (+4.83%) ⬆️
astroquery/astrometry_net/core.py 47.64% <0.00%> (-0.83%) ⬇️
astroquery/simbad/core.py 90.64% <0.00%> (-0.21%) ⬇️
astroquery/esa/hubble/core.py 86.47% <0.00%> (+0.35%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@at88mph
Copy link
Contributor Author

at88mph commented Aug 16, 2022

Sure, @bsipocz . As we don't use rebase here, I'm a little unfamiliar with it. Are you looking to squash the Merge branch 'main' messages? git rebase datalink-updates main?

@bsipocz
Copy link
Member

bsipocz commented Aug 16, 2022

the rebase would remove those commits rather than squashing them. The trick is to rebase it on the astropy/main (or upstream/main branch depending on how you named the remote, rather than on main that is by default points to the one on your fork (and one that is most likely outdated compared to the upstream one, and thus the presence of those merge commits at the time of you try to pull it back to be up to date).

git rebase datalink-updates upstream/main. My muscle memory adds the -i for interactivity, that way you decide on the level of individual commits of what to do with them, but here a non-interactive rebase would do.

@bsipocz bsipocz added the alma label Aug 16, 2022
@bsipocz bsipocz added this to the v0.4.7 milestone Aug 16, 2022
@at88mph
Copy link
Contributor Author

at88mph commented Aug 16, 2022

Thanks @bsipocz , I think I figured it out. The history is much cleaner now.

Copy link
Member

@bsipocz bsipocz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actual line reviews are minor nitpicks, however there are 5 related test failures, as well as the xfailing test is still failing, though the mark should have been removed in this PR.

self._datalink_url = f"{self._get_dataarchive_url()}{DATALINK_SERVICE_PATH}"
except requests.exceptions.HTTPError as err:
log.debug(
f"ERROR getting the CADC registry: {str(err)}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and below, too

Suggested change
f"ERROR getting the CADC registry: {str(err)}")
f"ERROR getting the ALMA registry: {str(err)}")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though maybe the whole message needs an update as at the end, if I see it correctly, you don't use the registry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire message was inappropriate. It has been modified.

temp = res.to_table()
if ASTROPY_LT_4_1:
if commons.ASTROPY_LT_4_1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@bsipocz
Copy link
Member

bsipocz commented Aug 18, 2022

I still see the remote test failures, one of them is here as an example:

astroquery/alma/tests/test_alma.py ............................                                                  [ 51%]
astroquery/alma/tests/test_alma_remote.py .......s.xFFF...xs..Fs..                                                                           [ 96%]
astroquery/alma/tests/test_alma_utils.py .                                                                                                   [ 98%]
docs/alma/alma.rst F                                                                                                                         [100%]

===================================================================== FAILURES =====================================================================
___________________________________________________________ TestAlma.test_download_data ____________________________________________________________

self = {'ID': 'uid://A001/X12a3/Xe9'}, post = False

    def execute_votable(self, post=False):
        """
        Submit the query and return the results as an AstroPy votable instance.
        As this is the level where qualified error messages are available,
        they are raised here instead of in the underlying execute_stream.
    
        Returns
        -------
        astropy.io.votable.tree.Table
           an Astropy votable Table instance
    
        Raises
        ------
        DALServiceError
           for errors connecting to or communicating with the service
        DALFormatError
           for errors parsing the VOTable response
    
        See Also
        --------
        astropy.io.votable
        DALServiceError
        DALFormatError
        DALQueryError
        """
        try:
>           return votableparse(self.execute_stream(post=post).read)

/Users/bsipocz/munka/devel/pyvo/pyvo/dal/query.py:243: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (functools.partial(<bound method HTTPResponse.read of <urllib3.response.HTTPResponse object at 0x12e2cd930>>, decode_content=True),)
kwargs = {}, i = 0, msg = '"pedantic" was deprecated in version 5.0 and will be removed in a future version. '

    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        for i in range(n):
            msg = message[i] or (f'"{old_name[i]}" was deprecated in '
                                 f'version {since[i]} and will be removed '
                                 'in a future version. ')
            # The only way to have oldkeyword inside the function is
            # that it is passed as kwarg because the oldkeyword
            # parameter was renamed to newkeyword.
            if old_name[i] in kwargs:
                value = kwargs.pop(old_name[i])
                # Display the deprecation warning only when it's not
                # pending.
                if not pending[i]:
                    if not message[i]:
                        if new_name[i] is not None:
                            msg += f'Use argument "{new_name[i]}" instead.'
                        elif alternative:
                            msg += f'\n        Use {alternative} instead.'
                    warnings.warn(msg, warning_type, stacklevel=2)
    
                # Check if the newkeyword was given as well.
                newarg_in_args = (position[i] is not None and
                                  len(args) > position[i])
                newarg_in_kwargs = new_name[i] in kwargs
    
                if newarg_in_args or newarg_in_kwargs:
                    if not pending[i]:
                        # If both are given print a Warning if relax is
                        # True or raise an Exception is relax is False.
                        if relax[i]:
                            warnings.warn(
                                f'"{old_name[i]}" and "{new_name[i]}" '
                                'keywords were set. '
                                f'Using the value of "{new_name[i]}".',
                                AstropyUserWarning)
                        else:
                            raise TypeError(
                                f'cannot specify both "{old_name[i]}" and '
                                f'"{new_name[i]}".')
                else:
                    # Pass the value of the old argument with the
                    # name of the new argument to the function
                    if new_name[i] is not None:
                        kwargs[new_name[i]] = value
                    # If old argument has no replacement, cast it back.
                    # https://github.com/astropy/astropy/issues/9914
                    else:
                        kwargs[old_name[i]] = value
    
            # Deprecated keyword without replacement is given as
            # positional argument.
            elif (not pending[i] and not new_name[i] and position[i] and
                  len(args) > position[i]):
                if alternative and not message[i]:
                    msg += f'\n        Use {alternative} instead.'
                warnings.warn(msg, warning_type, stacklevel=2)
    
>       return function(*args, **kwargs)

/Users/bsipocz/.pyenv/versions/3.10.0/lib/python3.10/site-packages/astropy/utils/decorators.py:546: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

source = functools.partial(<bound method HTTPResponse.read of <urllib3.response.HTTPResponse object at 0x12e2cd930>>, decode_content=True)
columns = None, invalid = 'exception', verify = 'ignore', chunk_size = 256, table_number = None, table_id = None, filename = None
unit_format = None, datatype_mapping = {}, _debug_python_based_parser = False

    @deprecated_renamed_argument('pedantic', 'verify', since='5.0')
    def parse(source, columns=None, invalid='exception', verify=None,
              chunk_size=tree.DEFAULT_CHUNK_SIZE, table_number=None,
              table_id=None, filename=None, unit_format=None,
              datatype_mapping=None, _debug_python_based_parser=False):
        """
        Parses a VOTABLE_ xml file (or file-like object), and returns a
        `~astropy.io.votable.tree.VOTableFile` object.
    
        Parameters
        ----------
        source : path-like or file-like
            Path or file-like object containing a VOTABLE_ xml file.
            If file, must be readable.
    
        columns : sequence of str, optional
            List of field names to include in the output.  The default is
            to include all fields.
    
        invalid : str, optional
            One of the following values:
    
                - 'exception': throw an exception when an invalid value is
                  encountered (default)
    
                - 'mask': mask out invalid values
    
        verify : {'ignore', 'warn', 'exception'}, optional
            When ``'exception'``, raise an error when the file violates the spec,
            otherwise either issue a warning (``'warn'``) or silently continue
            (``'ignore'``). Warnings may be controlled using the standard Python
            mechanisms.  See the `warnings` module in the Python standard library
            for more information. When not provided, uses the configuration setting
            ``astropy.io.votable.verify``, which defaults to 'ignore'.
    
            .. versionchanged:: 4.0
               ``verify`` replaces the ``pedantic`` argument, which will be
               deprecated in future.
            .. versionchanged:: 5.0
                The ``pedantic`` argument is deprecated.
    
        chunk_size : int, optional
            The number of rows to read before converting to an array.
            Higher numbers are likely to be faster, but will consume more
            memory.
    
        table_number : int, optional
            The number of table in the file to read in.  If `None`, all
            tables will be read.  If a number, 0 refers to the first table
            in the file, and only that numbered table will be parsed and
            read in.  Should not be used with ``table_id``.
    
        table_id : str, optional
            The ID of the table in the file to read in.  Should not be
            used with ``table_number``.
    
        filename : str, optional
            A filename, URL or other identifier to use in error messages.
            If *filename* is None and *source* is a string (i.e. a path),
            then *source* will be used as a filename for error messages.
            Therefore, *filename* is only required when source is a
            file-like object.
    
        unit_format : str, astropy.units.format.Base instance or None, optional
            The unit format to use when parsing unit attributes.  If a
            string, must be the name of a unit formatter. The built-in
            formats include ``generic``, ``fits``, ``cds``, and
            ``vounit``.  A custom formatter may be provided by passing a
            `~astropy.units.UnitBase` instance.  If `None` (default),
            the unit format to use will be the one specified by the
            VOTable specification (which is ``cds`` up to version 1.3 of
            VOTable, and ``vounit`` in more recent versions of the spec).
    
        datatype_mapping : dict, optional
            A mapping of datatype names (`str`) to valid VOTable datatype names
            (str). For example, if the file being read contains the datatype
            "unsignedInt" (an invalid datatype in VOTable), include the mapping
            ``{"unsignedInt": "long"}``.
    
        Returns
        -------
        votable : `~astropy.io.votable.tree.VOTableFile` object
    
        See Also
        --------
        astropy.io.votable.exceptions : The exceptions this function may raise.
        """
        from . import conf
    
        invalid = invalid.lower()
        if invalid not in ('exception', 'mask'):
            raise ValueError("accepted values of ``invalid`` are: "
                             "``'exception'`` or ``'mask'``.")
    
        if verify is None:
    
            conf_verify_lowercase = conf.verify.lower()
    
            # We need to allow verify to be booleans as strings since the
            # configuration framework doesn't make it easy/possible to have mixed
            # types.
            if conf_verify_lowercase in ['false', 'true']:
                verify = conf_verify_lowercase == 'true'
            else:
                verify = conf_verify_lowercase
    
        if isinstance(verify, bool):
            verify = 'exception' if verify else 'warn'
        elif verify not in VERIFY_OPTIONS:
            raise ValueError(f"verify should be one of {'/'.join(VERIFY_OPTIONS)}")
    
        if datatype_mapping is None:
            datatype_mapping = {}
    
        config = {
            'columns': columns,
            'invalid': invalid,
            'verify': verify,
            'chunk_size': chunk_size,
            'table_number': table_number,
            'filename': filename,
            'unit_format': unit_format,
            'datatype_mapping': datatype_mapping
        }
    
        if filename is None and isinstance(source, str):
            config['filename'] = source
    
        with iterparser.get_xml_iterator(
                source,
                _debug_python_based_parser=_debug_python_based_parser) as iterator:
            return tree.VOTableFile(
>               config=config, pos=(1, 1)).parse(iterator, config)

/Users/bsipocz/.pyenv/versions/3.10.0/lib/python3.10/site-packages/astropy/io/votable/table.py:160: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <VOTABLE>... 0 tables ...</VOTABLE>, iterator = <astropy.utils.xml._iterparser.IterParser object at 0x12de5b060>
config = {'_current_table_number': 0, 'chunk_size': 256, 'columns': None, 'datatype_mapping': {}, ...}

    def parse(self, iterator, config):
        config['_current_table_number'] = 0
    
        for start, tag, data, pos in iterator:
            if start:
                if tag == 'xml':
                    pass
                elif tag == 'VOTABLE':
                    if 'version' not in data:
                        warn_or_raise(W20, W20, self.version, config, pos)
                        config['version'] = self.version
                    else:
                        config['version'] = self._version = data['version']
                        if config['version'].lower().startswith('v'):
                            warn_or_raise(
                                W29, W29, config['version'], config, pos)
                            self._version = config['version'] = \
                                            config['version'][1:]
                        if config['version'] not in self._version_namespace_map:
                            vo_warn(W21, config['version'], config, pos)
    
                    if 'xmlns' in data:
                        ns_info = self._version_namespace_map.get(config['version'], {})
                        correct_ns = ns_info.get('namespace_uri')
                        if data['xmlns'] != correct_ns:
                            vo_warn(W41, (correct_ns, data['xmlns']), config, pos)
                    else:
                        vo_warn(W42, (), config, pos)
    
                    break
                else:
>                   vo_raise(E19, (), config, pos)

/Users/bsipocz/.pyenv/versions/3.10.0/lib/python3.10/site-packages/astropy/io/votable/tree.py:3600: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

exception_class = <class 'astropy.io.votable.exceptions.E19'>, args = ()
config = {'_current_table_number': 0, 'chunk_size': 256, 'columns': None, 'datatype_mapping': {}, ...}, pos = (2, 0)

    def vo_raise(exception_class, args=(), config=None, pos=None):
        """
        Raise an exception, with proper position information if available.
        """
        if config is None:
            config = {}
>       raise exception_class(args, config, pos)
E       astropy.io.votable.exceptions.E19: None:2:0: E19: File does not appear to be a VOTABLE

/Users/bsipocz/.pyenv/versions/3.10.0/lib/python3.10/site-packages/astropy/io/votable/exceptions.py:112: E19

During handling of the above exception, another exception occurred:

self = <astroquery.alma.tests.test_alma_remote.TestAlma object at 0x12e2263e0>
temp_dir = '/var/folders/9s/070g0pd502q70k3gffpxv8km0000gq/T/tmpgem0thuy', alma = <astroquery.alma.core.AlmaClass object at 0x12e2ce3e0>

    def test_download_data(self, temp_dir, alma):
        # test only fits files from a program
        alma.cache_location = temp_dir
    
        uid = 'uid://A001/X12a3/Xe9'
>       data_info = alma.get_data_info(uid, expand_tarfiles=True)

astroquery/alma/tests/test_alma_remote.py:197: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
astroquery/alma/core.py:562: in get_data_info
    res = self.datalink.run_sync(uid)
/Users/bsipocz/munka/devel/pyvo/pyvo/dal/adhoc.py:302: in run_sync
    return self.create_query(id, responseformat, **keywords).execute()
/Users/bsipocz/munka/devel/pyvo/pyvo/dal/adhoc.py:443: in execute
    return DatalinkResults(self.execute_votable(post),
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = {'ID': 'uid://A001/X12a3/Xe9'}, post = False

    def execute_votable(self, post=False):
        """
        Submit the query and return the results as an AstroPy votable instance.
        As this is the level where qualified error messages are available,
        they are raised here instead of in the underlying execute_stream.
    
        Returns
        -------
        astropy.io.votable.tree.Table
           an Astropy votable Table instance
    
        Raises
        ------
        DALServiceError
           for errors connecting to or communicating with the service
        DALFormatError
           for errors parsing the VOTable response
    
        See Also
        --------
        astropy.io.votable
        DALServiceError
        DALFormatError
        DALQueryError
        """
        try:
            return votableparse(self.execute_stream(post=post).read)
        except Exception as e:
            self.raise_if_error()
>           raise DALFormatError(e, self.queryurl)
E           pyvo.dal.exceptions.DALFormatError: E19: None:2:0: E19: File does not appear to be a VOTABLE

/Users/bsipocz/munka/devel/pyvo/pyvo/dal/query.py:246: DALFormatError

@at88mph
Copy link
Contributor Author

at88mph commented Aug 18, 2022

Thanks @bsipocz . I left these alone as I think they're covered by #2475 when I had to fix the remote tests due to the DataLink service update. Did that one get merged? If so, then I'll revisit this.

@at88mph
Copy link
Contributor Author

at88mph commented Aug 18, 2022

Sorry @bsipocz, disregard. I can reproduce it properly. Fixing now.

@at88mph
Copy link
Contributor Author

at88mph commented Aug 18, 2022

A bit of an embarrassing regression there, but it's fixed now. I've also had to adjust the URL used for the doctests as the NRAO host has been fairly unreliable. Finally, while the tests do pass, there seems to be a requests Session in the test left open somewhere as I see this message:

Exception ignored in: <socket.socket fd=14, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('172.17.0.2', 42962), raddr=('134.171.46.218', 80)>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/sre_parse.py", line 499, in _parse
    subpatternappend = subpattern.append
ResourceWarning: unclosed <socket.socket fd=14, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('172.17.0.2', 42962), raddr=('134.171.46.218', 80)>

It doesn't hold anything up, but I hope it's not a smell of something deeper.

Comment on lines 233 to 238
base_url = self._get_dataarchive_url()

if base_url.endswith('/'):
self._datalink_url = f'{base_url}{DATALINK_SERVICE_PATH}'
else:
self._datalink_url = f'{base_url}/{DATALINK_SERVICE_PATH}'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
base_url = self._get_dataarchive_url()
if base_url.endswith('/'):
self._datalink_url = f'{base_url}{DATALINK_SERVICE_PATH}'
else:
self._datalink_url = f'{base_url}/{DATALINK_SERVICE_PATH}'
self._datalink_url = urljoin(self._get_dataarchive_url(), DATALINK_SERVICE_PATH)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion.

@bsipocz
Copy link
Member

bsipocz commented Aug 20, 2022

I've also had to adjust the URL used for the doctests as the NRAO host has been fairly unreliable.

perfect, thank you.

Copy link
Member

@bsipocz bsipocz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only have tiny nitpick comments that can be cleaned up next time.

Also, as I see the xfailing test def test_data_info is still failing, it would be great to fix that up, too.

However, none of the above are critical and as this alma issue has been brought up several times recently, I would rather go ahead and merge this PR now as is.

pytest.fail('Should not get here.')


# @patch('pyvo.dal.adhoc.DatalinkService', side_effect=_mocked_datalink_sync)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be cleaned up?

@@ -127,7 +127,9 @@ You can query by object name or by circular region:
.. doctest-remote-data::

>>> from astroquery.alma import Alma
>>> m83_data = Alma.query_object('M83')
>>> alma = Alma()
>>> alma.archive_url = 'https://almascience.eso.org' # optional to make doctest work
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we should hide this from the documentation e.g. setting it in the config that is being hidden from the rendering, so users are not copy pasting it?

cc @keflavich?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this seems unnecessary to include in the documentation since users should default to the default auto-redirect tool?

But otoh, there should be an example of how to change servers because too often one or more are down

from io import StringIO
import os

import pytest
import requests
import unittest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these unused imports came back accidentally.

@bsipocz bsipocz merged commit 0883e92 into astropy:main Aug 20, 2022
@bsipocz
Copy link
Member

bsipocz commented Aug 20, 2022

Thank you @at88mph!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ALMA data download broken
4 participants