test_to_hdf_with_object_column_names and test_to_csv_defualt_encoding weird failures #19774

Closed
yarikoptic opened this issue Feb 19, 2018 · 2 comments
Labels
IO HDF5 read_hdf, HDFStore Linux Linux OS Testing pandas testing functions or related to the test suite Unreliable Test Unit tests that occasionally fail

Comments

@yarikoptic
Contributor

While trying to build the Debian package of pandas 0.22, I am running into two strange test failures:

1

self = <pandas.tests.io.test_pytables.TestHDFStore object at 0x7ff583fbe110>

    def test_to_hdf_with_object_column_names(self):
        # GH9057
        # Writing HDF5 table format should only work for string-like
        # column types
    
        types_should_fail = [tm.makeIntIndex, tm.makeFloatIndex,
                             tm.makeDateIndex, tm.makeTimedeltaIndex,
                             tm.makePeriodIndex]
        types_should_run = [tm.makeStringIndex, tm.makeCategoricalIndex]
    
        if compat.PY3:
            types_should_run.append(tm.makeUnicodeIndex)
        else:
            types_should_fail.append(tm.makeUnicodeIndex)
    
        for index in types_should_fail:
            df = DataFrame(np.random.randn(10, 2), columns=index(2))
            with ensure_clean_path(self.path) as path:
                with catch_warnings(record=True):
                    with pytest.raises(
                        ValueError, msg=("cannot have non-object label "
                                         "DataIndexableCol")):
                        df.to_hdf(path, 'df', format='table',
>                                 data_columns=True)

../debian/tmp/usr/lib/python2.7/dist-packages/pandas/tests/io/test_pytables.py:5035: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../debian/tmp/usr/lib/python2.7/dist-packages/pandas/core/generic.py:1471: in to_hdf
    return pytables.to_hdf(path_or_buf, key, self, **kwargs)
../debian/tmp/usr/lib/python2.7/dist-packages/pandas/io/pytables.py:281: in to_hdf
    f(store)
../debian/tmp/usr/lib/python2.7/dist-packages/pandas/io/pytables.py:275: in <lambda>
    f = lambda store: store.put(key, value, **kwargs)
../debian/tmp/usr/lib/python2.7/dist-packages/pandas/io/pytables.py:866: in put
    self._write_to_group(key, value, append=append, **kwargs)
../debian/tmp/usr/lib/python2.7/dist-packages/pandas/io/pytables.py:1341: in _write_to_group
    s.write(obj=value, append=append, complib=complib, **kwargs)
../debian/tmp/usr/lib/python2.7/dist-packages/pandas/io/pytables.py:3924: in write
    self._handle.create_table(self.group, **options)
/usr/lib/python2.7/dist-packages/tables/file.py:1055: in create_table
    chunkshape=chunkshape, byteorder=byteorder)
/usr/lib/python2.7/dist-packages/tables/table.py:833: in __init__
    byteorder, _log)
/usr/lib/python2.7/dist-packages/tables/leaf.py:272: in __init__
    super(Leaf, self).__init__(parentnode, name, _log)
/usr/lib/python2.7/dist-packages/tables/node.py:266: in __init__
    self._v_objectid = self._g_create()
/usr/lib/python2.7/dist-packages/tables/table.py:1020: in _g_create
    self._v_new_title, self.filters.complib or '', obversion)
tables/tableextension.pyx:181: in tables.tableextension.Table._create_table
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   KeyError: "Field named '\\u05d8\\u05e70\\u05e3\\u05d1\\u05dc\\u05dd\\u05e1\\u05dd7' not found."

FWIW, the field name in that KeyError decodes to:

(Pdb) print u'\u05d8\u05e70\u05e3\u05d1\u05dc\u05dd\u05e1\u05dd7'
טק0ףבלםסם7

This test and another one (pandas/tests/io/formats/test_to_csv.py::TestToCSV::test_to_csv_defualt_encoding) seem to pass if I run only them, without all the rest, so it smells like there could be some cross-test dependency or shared state (today seems to be my day for figuring out such tests, also in our datalad, heh):

2

../debian/tmp/usr/lib/python2.7/dist-packages/pandas/tests/io/formats/test_to_csv.py::TestToCSV::test_to_csv_defualt_encoding FAILED
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> traceback >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

self = <pandas.tests.io.formats.test_to_csv.TestToCSV object at 0x7ff58467a6d0>

    def test_to_csv_defualt_encoding(self):
        # GH17097
        df = DataFrame({'col': [u"AAAAA", u"ÄÄÄÄÄ", u"ßßßßß", u"聞聞聞聞聞"]})
    
        with tm.ensure_clean('test.csv') as path:
            # the default to_csv encoding in Python 2 is ascii, and that in
            # Python 3 is uft-8.
            if pd.compat.PY2:
                # the encoding argument parameter should be utf-8
                with tm.assert_raises_regex(UnicodeEncodeError, 'ascii'):
>                   df.to_csv(path)

../debian/tmp/usr/lib/python2.7/dist-packages/pandas/tests/io/formats/test_to_csv.py:21: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pandas.util.testing._AssertRaisesContextmanager object at 0x7ff58467a1d0>, exc_type = None, exc_value = None, trace_back = None

    def __exit__(self, exc_type, exc_value, trace_back):
        expected = self.exception
    
        if not exc_type:
            exp_name = getattr(expected, "__name__", str(expected))
>           raise AssertionError("{name} not raised.".format(name=exp_name))
E           AssertionError: UnicodeEncodeError not raised.

../debian/tmp/usr/lib/python2.7/dist-packages/pandas/util/testing.py:2500: AssertionError
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> /home/yoh/deb/gits/pkg-exppsy/pandas/debian/tmp/usr/lib/python2.7/dist-packages/pandas/util/testing.py(2500)__exit__()
-> raise AssertionError("{name} not raised.".format(name=exp_name))

OK, I have now triple-checked that they pass if I run just them. Here is my sample invocation:

export PYTHONPATH=`/bin/ls -d $PWD/debian/tmp/usr/lib/python$PY/*/`; \
export MPLCONFIGDIR=/home/yoh/deb/gits/pkg-exppsy/pandas/build HOME=/home/yoh/deb/gits/pkg-exppsy/pandas/build; \
python2.7 ci/print_versions.py; \
    cd build/; LOCALE_OVERRIDE=C xvfb-run -a -s "-screen 0 1280x1024x24 -noreset" \
      python2.7 -m pytest -s -v -m 'not single and not network and not disabled'   -m "not intel" --pdb $PYTHONPATH/pandas/tests/io/formats/test_to_csv.py::TestToCSV::test_to_csv_defualt_encoding $PYTHONPATH/pandas/tests/io/test_pytables.py::TestHDFStore::test_to_hdf_with_object_column_names;

and I pass $PYTHONPATH/pandas instead of the two test IDs when I want to run them all.

Has anyone observed something like this before, or does anyone have a clue as to what could cause such behavior?
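For anyone hitting the same "passes alone, fails in the full run" pattern, one way to narrow it down is to bisect the tests that run before the failing one. The sketch below is only an illustration, not something from the pandas test suite: `find_polluter` and `fails_after` are hypothetical names, and it assumes that a single earlier test is enough to trigger the failure. A real `fails_after` would run pytest in a fresh process on the given test IDs followed by the victim and report whether the victim failed.

```python
from typing import Callable, List, Optional, Sequence


def find_polluter(
    earlier_tests: Sequence[str],
    victim: str,
    fails_after: Callable[[List[str], str], bool],
) -> Optional[str]:
    """Return one earlier test whose presence makes `victim` fail, or None.

    Assumption: a single polluting test is sufficient to cause the failure.
    """
    candidates = list(earlier_tests)
    # If the victim passes even after every earlier test, there is nothing to find.
    if not fails_after(candidates, victim):
        return None
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # Keep whichever half still reproduces the failure.
        candidates = first if fails_after(first, victim) else second
    return candidates[0]


# Stand-in runner for demonstration only: pretends that "test_b" leaves
# behind state that breaks the victim.
def fake_fails_after(tests: List[str], victim: str) -> bool:
    return "test_b" in tests


culprit = find_polluter(
    ["test_a", "test_b", "test_c", "test_d"], "test_victim", fake_fails_after
)
```

Each round halves the candidate list, so even a large suite needs only a handful of pytest runs. If the failure needs a combination of earlier tests, this simple bisection can point at the wrong one.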

@gfyoung gfyoung added the Testing pandas testing functions or related to the test suite label Feb 21, 2018
@gfyoung
Member

gfyoung commented Feb 21, 2018

@yarikoptic: Thanks for reporting this! Unfortunately, I can't say that I have seen it before. My first guess is that it's a locale problem, but some of the other maintainers might have better insight on this.
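One way to test the locale guess is to record the process-wide encoding settings before and after the rest of the suite runs and compare them. The sketch below is an assumption-laden illustration: `encoding_state` and `changed_keys` are hypothetical helpers, and nothing in this thread confirms that any of these settings actually changed. The environment variables listed are a guess at what is relevant; `LOCALE_OVERRIDE` is included only because it appears in the invocation above.

```python
import locale
import os
import sys
from typing import Dict, Tuple


def encoding_state() -> Dict[str, str]:
    """Collect process-wide settings that influence text encoding."""
    state = {
        "sys.getdefaultencoding": sys.getdefaultencoding(),
        "sys.getfilesystemencoding": str(sys.getfilesystemencoding()),
        "locale.getpreferredencoding": locale.getpreferredencoding(False),
        # setlocale() without a second argument only queries the current value.
        "locale.LC_CTYPE": str(locale.setlocale(locale.LC_CTYPE)),
    }
    for var in ("LC_ALL", "LC_CTYPE", "LANG", "LOCALE_OVERRIDE"):
        state["env." + var] = os.environ.get(var, "<unset>")
    return state


def changed_keys(
    before: Dict[str, str], after: Dict[str, str]
) -> Dict[str, Tuple[str, object]]:
    """Map each setting that differs to its (before, after) pair."""
    return {
        key: (before[key], after.get(key))
        for key in before
        if before[key] != after.get(key)
    }


baseline = encoding_state()
# ... run the suspected tests here ...
drift = changed_keys(baseline, encoding_state())
```

Calling `encoding_state()` at session start and again right before the failing test (for example from a fixture) would show whether an earlier test altered the locale or the default encoding. A non-ASCII default encoding at that point would be consistent with `UnicodeEncodeError` not being raised, but that remains a hypothesis.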

raspbian-autopush pushed a commit to raspbian-packages/pandas that referenced this issue May 2, 2018
Origin: (Neuro)Debian
Bug: pandas-dev/pandas#19774
Last-Update: 2018-02-20


Gbp-Pq: Name deb_skip_difffailingtests
@jbrockmendel jbrockmendel added the IO HDF5 read_hdf, HDFStore label Jul 23, 2019
@jbrockmendel jbrockmendel added the Unreliable Test Unit tests that occasionally fail label Dec 20, 2019
@mroeschke mroeschke added the Linux Linux OS label Apr 19, 2020
@mroeschke
Member

This version of pandas is pretty old, and Python 2.7 isn't supported anymore. If anyone could reproduce this with a more modern version of pandas and Python 3, that would be helpful; otherwise, closing this issue for now.
