3.2rc1 Table tests that call pandas failed #8682

Closed
pllim opened this issue May 9, 2019 · 8 comments

@pllim
Member

pllim commented May 9, 2019

Is pandas 0.23.4 too old? UPDATE: Same errors with pandas 0.24.2.

platform win32 -- Python 3.7.1, pytest-4.1.0, py-1.7.0, pluggy-0.8.0

Running tests with Astropy version 3.2rc1.
Running tests in Miniconda3\envs\py37\lib\site-packages\astropy.

Date: 2019-05-09T14:24:44

Platform: Windows-10-10.0.17763-SP0

Executable: C:\...\Miniconda3\envs\py37\python.exe

Full Python Version: 
3.7.1 (default, Oct 28 2018, 08:39:03) [MSC v.1912 64 bit (AMD64)]

encodings: sys: utf-8, locale: cp1252, filesystem: utf-8
byteorder: little
float info: dig: 15, mant_dig: 15

Numpy: 1.16.2
Scipy: 1.1.0
Matplotlib: 3.0.3
h5py: 2.8.0
Pandas: 0.23.4
Asdf: 2.3.3
Cython: 0.29
Scikit-image: not available
astropy_helpers: 3.2rc1
Using Astropy options: remote_data: any.

Matplotlib: 3.0.3
Freetype: 2.9.1
rootdir: C:\..., inifile:
plugins: xdist-1.24.0, remotedata-0.3.1, openfiles-0.3.0, mpl-0.10, forked-0.2, doctestplus-0.1.3, arraydiff-0.2
collected 12639 items
_________________________ test_read_write_format[csv] _________________________

fmt = 'csv'

    @pytest.mark.parametrize('fmt', WRITE_FMTS)
    def test_read_write_format(fmt):
        """
        Test round-trip through pandas write/read for supported formats.
    
        :param fmt: format name, e.g. csv, html, json
        :return:
        """
        # Skip the reading tests
        if fmt == 'html' and not HAS_HTML_DEPS:
            pytest.skip('Missing lxml or bs4 + html5lib for HTML read/write test')
    
        pandas_fmt = 'pandas.' + fmt
        t = Table([[1, 2, 3], [1.0, 2.5, 5.0], ['a', 'b', 'c']])
        buf = StringIO()
        t.write(buf, format=pandas_fmt)
    
        buf.seek(0)
        t2 = Table.read(buf, format=pandas_fmt)
    
        assert t.colnames == t2.colnames
>       assert np.all(t == t2)
E       assert False
E        +  where False = <function all at 0x000002315B0B4EA0>(<Table length...3     5.0    c == <Table length=...3     5.0    c
E        +    where <function all at 0x000002315B0B4EA0> = np.all
E           Use -v to get the full diff)

..\Miniconda3\envs\py37\lib\site-packages\astropy\io\misc\tests\test_pandas.py:46: AssertionError
________________________ test_read_write_format[html] _________________________

fmt = 'html'

    @pytest.mark.parametrize('fmt', WRITE_FMTS)
    def test_read_write_format(fmt):
        """
        Test round-trip through pandas write/read for supported formats.
    
        :param fmt: format name, e.g. csv, html, json
        :return:
        """
        # Skip the reading tests
        if fmt == 'html' and not HAS_HTML_DEPS:
            pytest.skip('Missing lxml or bs4 + html5lib for HTML read/write test')
    
        pandas_fmt = 'pandas.' + fmt
        t = Table([[1, 2, 3], [1.0, 2.5, 5.0], ['a', 'b', 'c']])
        buf = StringIO()
        t.write(buf, format=pandas_fmt)
    
        buf.seek(0)
        t2 = Table.read(buf, format=pandas_fmt)
    
        assert t.colnames == t2.colnames
>       assert np.all(t == t2)
E       assert False
E        +  where False = <function all at 0x000002315B0B4EA0>(<Table length...3     5.0    c == <Table length=...3     5.0    c
E        +    where <function all at 0x000002315B0B4EA0> = np.all
E           Use -v to get the full diff)

..\Miniconda3\envs\py37\lib\site-packages\astropy\io\misc\tests\test_pandas.py:46: AssertionError
________________________ test_read_write_format[json] _________________________

fmt = 'json'

    @pytest.mark.parametrize('fmt', WRITE_FMTS)
    def test_read_write_format(fmt):
        """
        Test round-trip through pandas write/read for supported formats.
    
        :param fmt: format name, e.g. csv, html, json
        :return:
        """
        # Skip the reading tests
        if fmt == 'html' and not HAS_HTML_DEPS:
            pytest.skip('Missing lxml or bs4 + html5lib for HTML read/write test')
    
        pandas_fmt = 'pandas.' + fmt
        t = Table([[1, 2, 3], [1.0, 2.5, 5.0], ['a', 'b', 'c']])
        buf = StringIO()
        t.write(buf, format=pandas_fmt)
    
        buf.seek(0)
        t2 = Table.read(buf, format=pandas_fmt)
    
        assert t.colnames == t2.colnames
>       assert np.all(t == t2)
E       assert False
E        +  where False = <function all at 0x000002315B0B4EA0>(<Table length...3     5.0    c == <Table length=...3     5.0    c
E        +    where <function all at 0x000002315B0B4EA0> = np.all
E           Use -v to get the full diff)

..\Miniconda3\envs\py37\lib\site-packages\astropy\io\misc\tests\test_pandas.py:46: AssertionError
________________________ test_read_fixed_width_format _________________________

    def test_read_fixed_width_format():
        """Test reading with pandas read_fwf()
    
        """
        tbl = """\
        a   b   c
        1  2.0  a
        2  3.0  b"""
        buf = StringIO()
        buf.write(tbl)
    
        t = Table.read(tbl, format='ascii', guess=False)
    
        buf.seek(0)
        t2 = Table.read(buf, format='pandas.fwf')
    
        assert t.colnames == t2.colnames
>       assert np.all(t == t2)
E       assert False
E        +  where False = <function all at 0x000002315B0B4EA0>(<Table length...2     3.0    b == <Table length=...2     3.0    b
E        +    where <function all at 0x000002315B0B4EA0> = np.all
E           Use -v to get the full diff)

..\Miniconda3\envs\py37\lib\site-packages\astropy\io\misc\tests\test_pandas.py:66: AssertionError
___________________________ test_write_with_mixins ____________________________

    def test_write_with_mixins():
        """Writing a table with mixins just drops them via to_pandas()
    
        This also tests passing a kwarg to pandas read and write.
        """
        sc = SkyCoord([1, 2], [3, 4], unit='deg')
        q = [5, 6] * u.m
        qt = QTable([[1, 2], q, sc], names=['i', 'q', 'sc'])
    
        buf = StringIO()
        qt.write(buf, format='pandas.csv', sep=' ')
        exp = ['i q sc.ra sc.dec',
               '1 5.0 1.0 3.0',
               '2 6.0 2.0 4.0']
        assert buf.getvalue().splitlines() == exp
    
        # Read it back
        buf.seek(0)
        qt2 = Table.read(buf, format='pandas.csv', sep=' ')
        exp_t = ascii.read(exp)
        assert qt2.colnames == exp_t.colnames
>       assert np.all(qt2 == exp_t)
E       assert False
E        +  where False = <function all at 0x000002315B0B4EA0>(<Table length...   2.0     4.0 == <Table length=...   2.0     4.0
E        +    where <function all at 0x000002315B0B4EA0> = np.all
E           Use -v to get the full diff)

..\Miniconda3\envs\py37\lib\site-packages\astropy\io\misc\tests\test_pandas.py:90: AssertionError
pllim added this to the v3.2 milestone May 9, 2019
pllim changed the title from "3.2rc1 test_read_write_format failed (CSV, HTML, JSON)" to "3.2rc1 Table tests that call pandas failed" May 9, 2019
pllim added the testing label May 9, 2019
@pllim
Member Author

pllim commented May 10, 2019

For the test_read_write_format[csv] case, I can reproduce the error locally as well, in addition to what is shown in the Appveyor failure in #8688.

from io import StringIO
from astropy.table import Table

fmt = 'csv'
pandas_fmt = 'pandas.' + fmt

t = Table([[1, 2, 3], [1.0, 2.5, 5.0], ['a', 'b', 'c']])
buf = StringIO()
t.write(buf, format=pandas_fmt)

buf.seek(0)
t2 = Table.read(buf, format=pandas_fmt)
>>> t
<Table length=3>
 col0   col1  col2
int32 float64 str1
----- ------- ----
    1     1.0    a
    2     2.5    b
    3     5.0    c

>>> t2
<Table length=3>
 col0   col1  col2
int64 float64 str1
----- ------- ----
    1     1.0    a
    2     2.5    b
    3     5.0    c

In this case, col0 was int32 before being written out but came back as int64 when read in on Windows. I suspect all the other table failures reported here are of a similar nature.

@taldcroft , what is the best way forward here? Do we mark these xfail for Windows? Do we relax the data type matching for Windows? Do we fix the underlying problem causing such a difference on Windows?
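
One reading of the "relax the data type matching" option above is to compare columns by value and by kind of dtype (integer, float, string) instead of by exact width. The sketch below is not what astropy does; `columns_match` is a hypothetical helper, and plain NumPy arrays stand in for table columns.

```python
import numpy as np


def columns_match(a, b):
    """Hypothetical helper: equal values and same dtype kind, ignoring item size."""
    a = np.asarray(a)
    b = np.asarray(b)
    # dtype.kind is 'i' for signed ints, 'f' for floats, 'U' for str, etc.
    if a.dtype.kind != b.dtype.kind or a.shape != b.shape:
        return False
    return bool(np.all(a == b))


written = np.array([1, 2, 3], dtype=np.int32)    # col0 as created on Windows
read_back = np.array([1, 2, 3], dtype=np.int64)  # col0 as read back via pandas
print(columns_match(written, read_back))
```

A check like this still rejects an integer column that comes back as float, so it loosens only the width comparison.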

xref #8381

@pllim
Member Author

pllim commented May 10, 2019

p.s. But but... it seems like pandas only has int64 regardless of OS (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dtypes.html)? Then, why is this not failing in Travis? 🤔

Hmm... Maybe https://stackoverflow.com/questions/29245848/what-are-all-the-dtypes-that-pandas-recognizes explains it?

@bsipocz
Member

bsipocz commented May 10, 2019

> Then, why is this not failing in Travis?

Because the default int is int64 there and not int32, so the issue stays hidden? (But then I suppose it should be failing on 32-bit CircleCI.)
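
For background, and as far as I know: NumPy releases of that era (before 2.0) took their default integer from the platform's C `long`, which is 32 bits on Windows and 64 bits on 64-bit Linux and macOS. The standard-library sketch below reports that size for the running interpreter; `c_long_bits` is only an illustrative name.

```python
import ctypes


def c_long_bits():
    """Width of the platform's C long, in bits."""
    return ctypes.sizeof(ctypes.c_long) * 8


# Typically 32 on Windows and 64 on 64-bit Linux/macOS.
print(c_long_bits())
```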

@taldcroft
Member

Yes, I don't understand the non-failure on CircleCI. AFAIK the default for int on all Windows platforms is int32, and this has been a continual source of pain in table test development.

My usual workaround is being explicit with int types, so e.g. use np.array([1, 2, 3], dtype=np.int64) instead of [1, 2, 3] in the Table() call.
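
A minimal sketch of that workaround, using only NumPy so it runs without astropy installed. The variable names are illustrative; `explicit` is the array that would be passed as the first column in the Table() call.

```python
import numpy as np

# Default integer dtype: depends on the platform and NumPy version.
implicit = np.array([1, 2, 3])

# Explicit dtype: the same on every platform.
explicit = np.array([1, 2, 3], dtype=np.int64)

print(implicit.dtype, explicit.dtype)
```

Pinning the dtype in the test input means the table written out already matches the int64 that comes back from the pandas reader.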

@bsipocz
Member

bsipocz commented May 10, 2019

> Yes, I don't understand the non-failure on circleCI.

I got it: we don't install pandas in that build, so this test never gets run.
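
The general pattern behind this, sketched with the standard library only: test modules typically probe for an optional dependency and skip when it is missing, so a CI build without that package passes without ever exercising the test. `has_module` is a hypothetical helper, and the `HAS_PANDAS` flag name is an assumption modeled on the `HAS_HTML_DEPS` flag visible in the traceback.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be imported (hypothetical helper)."""
    return importlib.util.find_spec(name) is not None


HAS_PANDAS = has_module("pandas")

# With pytest, a guard such as
#     @pytest.mark.skipif(not HAS_PANDAS, reason="requires pandas")
# makes the test report as skipped, not failed, when pandas is absent.
print("pandas available:", HAS_PANDAS)
```

Skipped tests show up in the pytest summary, which is one way to notice that a build is not covering what it appears to cover.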

@pllim
Member Author

pllim commented May 11, 2019

Aha! Adding pandas to 32-bit CircleCI reproduced the same failures: https://circleci.com/gh/astropy/astropy/31117

@pllim
Member Author

pllim commented May 11, 2019

@taldcroft , would #8688 sufficiently address this? If so, please approve that PR so we can move on, thanks!

@pllim
Member Author

pllim commented May 11, 2019

Fixed in #8688

pllim closed this as completed May 11, 2019