TST: test_encode in test_html #7927

Closed
jreback opened this Issue Aug 4, 2014 · 27 comments

Contributor

jreback commented Aug 4, 2014

@cpcloud not sure if this is something I did (or didn't do)

I was testing the index sub-class on 3.4 (may have appeared on travis too)

/mnt/home/jreback/venv/py3.4/index/pandas/io/tests/data/html_encoding/chinese_utf-16.html'

======================================================================
ERROR: test_encode (pandas.io.tests.test_html.TestReadHtmlEncoding)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/mnt/home/jreback/venv/py3.4/index/pandas/io/tests/test_html.py", line 627, in test_encode
    from_string = self.read_string(f, encoding).pop()
  File "/mnt/home/jreback/venv/py3.4/index/pandas/io/tests/test_html.py", line 622, in read_string
    return self.read_html(fobj.read(), encoding=encoding, index_col=0)
  File "/mnt/home/jreback/venv/py3.4/index/pandas/io/tests/test_html.py", line 610, in read_html
    return read_html(*args, **kwargs)
  File "/mnt/home/jreback/venv/py3.4/index/pandas/io/html.py", line 843, in read_html
    parse_dates, tupleize_cols, thousands, attrs, encoding)
  File "/mnt/home/jreback/venv/py3.4/index/pandas/io/html.py", line 709, in _parse
    raise_with_traceback(retained)
  File "/mnt/home/jreback/venv/py3.4/index/pandas/compat/__init__.py", line 705, in raise_with_traceback
    raise exc.with_traceback(traceback)
TypeError: Cannot read object of type 'bytes'

----------------------------------------------------------------------
Ran 64 tests in 61.014s

FAILED (errors=1)
(py3.4)jreback@sheep:~/venv/py3.4/index$ nosetests pandas//io/tests/test_html.py  --pdb --pdb-failure^C
(py3.4)jreback@sheep:~/venv/py3.4/index$ python ci/print_versions.py 

INSTALLED VERSIONS
------------------
commit: d1c4fbb0d170cfaf920a27907c014e8cc45752d1
python: 3.4.0.beta.3
python-bits: 64
OS: Linux
OS-release: 2.6.32-5-amd64
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: None
nose: 1.3.3
Cython: 0.20.2
numpy: 1.8.0
scipy: 0.13.3
statsmodels: None
IPython: None
sphinx: None
patsy: 0.3.0
scikits.timeseries: None
dateutil: 2.2
pytz: 2014.4
bottleneck: 0.8.0
tables: 3.1.0
numexpr: 2.4
matplotlib: 1.3.1
openpyxl: 2.0.4
xlrd: 0.9.3
xlwt: None
xlsxwriter: 0.5.6
lxml: 3.3.5
bs4: 4.3.2
html5lib: 0.999
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: 0.9.6
pymysql: 0.6.1.None
psycopg2: 2.5.2 (dt dec pq3 ext)

jreback added this to the 0.15.0 milestone Aug 4, 2014

cpcloud self-assigned this Aug 4, 2014

Member

cpcloud commented Aug 4, 2014

looks like those farm animals are byte-ing.

sorry i couldn't resist

Member

cpcloud commented Aug 4, 2014

jokes aside not sure what's going on here let me take a look

Member

cpcloud commented Aug 14, 2014

@jreback any reason why this doesn't currently fail?

i can repro locally, but this failure isn't showing up on travis

Contributor

jreback commented Aug 14, 2014

I think this is using a pretty recent bs4 (4.3.2), not testing on travis with that

Member

cpcloud commented Aug 14, 2014

but that's not actually where the bug is. it happens because in _read i only check for str types, not for both bytes and str ... this is way before any parser libraries get called

Member

cpcloud commented Aug 14, 2014

this line in the tests

        with open(f, 'rb') as fobj:
            return self.read_html(fobj.read(), encoding=encoding, index_col=0)

should pass in a bytes type in py3 and this line

    elif isinstance(obj, string_types):

should test False (because string_types == (str,) in py3 and obj is bytes) and thus the type error
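The type check described above can be reproduced in isolation. This is only a minimal sketch, not the actual pandas code: `read_text` is a hypothetical stand-in for the internal `_read` helper, `string_types` and `binary_type` are defined locally to mirror what the compat names amount to on py3, and the UTF-8 fallback is an assumption.

```python
# Hypothetical sketch of a reader that accepts both str and bytes on py3.
string_types = (str,)   # what string_types amounts to on Python 3
binary_type = bytes


def read_text(obj, encoding=None):
    """Return text from a str or bytes object, decoding bytes if needed."""
    if isinstance(obj, string_types):
        return obj
    if isinstance(obj, binary_type):
        # Assumption: fall back to UTF-8 when no encoding is given.
        return obj.decode(encoding or "utf-8")
    raise TypeError("Cannot read object of type %r" % type(obj).__name__)


html = "<table><tr><td>中文</td></tr></table>"
raw = html.encode("utf-16")   # same kind of object as fobj.read() in 'rb' mode

# bytes is not a str on py3, so a str-only check falls through to the TypeError
print(isinstance(raw, string_types))
print(read_text(raw, encoding="utf-16") == html)
```

With a str-only check the bytes object reaches the final `raise`, which is the failure in the traceback; adding the bytes branch is one possible way to handle it.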

Member

cpcloud commented Aug 14, 2014

@jreback don't worry about it i'll figure it out

Member

cpcloud commented Aug 14, 2014

actually those tests aren't even being run on travis

https://travis-ci.org/cpcloud/pandas/jobs/32559013

Contributor

jreback commented Aug 14, 2014

really, they don't appear to be skipped?

Member

cpcloud commented Aug 14, 2014

i know it's strange ... i think there's an installation issue tho not sure

Member

cpcloud commented Aug 14, 2014

i changed to python setup.py develop in the conda builds

Member

cpcloud commented Aug 14, 2014

i don't see why that should matter tho

Member

cpcloud commented Aug 14, 2014

ok it turns out that does matter. when i install using the sdist method everything passes and when i run it using make develop it doesn't. mind boggling

the test is run tho which is why this is strange

Member

cpcloud commented Aug 14, 2014

@jreback any reason to use sdist vs make develop on travis?

Contributor

jreback commented Aug 14, 2014

no idea

Member

cpcloud commented Aug 14, 2014

oh duh this is totally a data path issue,

self.files is [] so nothing is actually iterated over
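A small sketch of why an empty file list hides the failure: a loop over `[]` runs zero times, so the test "passes" without checking anything. The helper names, the directory layout and the `require_nonempty` guard are assumptions for illustration, not code from the test suite.

```python
import glob
import os
import tempfile


def find_encoding_files(data_dir):
    """Collect the HTML fixtures under data_dir/html_encoding (may be empty)."""
    return glob.glob(os.path.join(data_dir, "html_encoding", "*.html"))


def check_files(files, require_nonempty=False):
    """Run a trivial per-file check and return how many files were checked."""
    if require_nonempty:
        # Hypothetical guard: fail loudly instead of passing vacuously.
        assert files, "no test data files found; check the data path"
    checked = 0
    for f in files:
        assert f.endswith(".html")
        checked += 1
    return checked


# A directory without the fixtures stands in for an install missing the data.
empty_dir = tempfile.mkdtemp()
files = find_encoding_files(empty_dir)
print(files)                # []
print(check_files(files))   # 0
```

Calling `check_files(files, require_nonempty=True)` on the same list raises an `AssertionError`, which would have surfaced the data path problem on travis.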

Member

cpcloud commented Aug 14, 2014

so after this let's try to wrap up the conda stuff, maybe wait a little longer to see if anyone replies to the py32 dropping

Contributor

jreback commented Aug 14, 2014

what are you doing about the numpy master build? can you support that in conda?

if not, maybe we leave the 3.2/numpy_master as is?

Member

cpcloud commented Aug 14, 2014

can we use the pandas wheel server to build things? if so i can build a nightly conda package for numpy master ...

3.2 is another story and will take longer, because i have to build all the required packages for python 3.2 and i have to build python 3.2 itself

Member

cpcloud commented Aug 14, 2014

i've almost got the python 3.2 dist ready 2 go, then i can start building the deps on that

Contributor

jreback commented Aug 14, 2014

ok that's fine.

but why not have an install_conda.sh and an install_pydata.sh (and call one or the other based on an env variable)? seems simple enough

Member

cpcloud commented Aug 14, 2014

could do that too

Member

cpcloud commented Aug 14, 2014

i guess i was just trying to use one thing do it all, which turned out to be harder than i might've thought mostly bc of py32

cpcloud closed this Aug 14, 2014

cpcloud reopened this Aug 14, 2014

Contributor

jreback commented Aug 14, 2014

I mean you could be specific about the requirements files (to avoid confusion)

requirements_conda-2.7.txt, requirements_pydata-3.2.txt

and such

Contributor

jreback commented Aug 14, 2014

then can leave numpy_master/3.2 alone (as sort of static cases)

Member

cpcloud commented Aug 14, 2014

i see ok i'll do that

Contributor

jreback commented Aug 14, 2014

yeh just have install.sh have a big if-then (or have 2 separate scripts) and just call depending on the variable (maybe just add another one)

BUILD_TYPE=conda|pydata or something
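A rough sketch of the dispatch, written in Python to match the rest of the thread rather than as the actual CI shell script. The script names and the `BUILD_TYPE` values come from the comments above; the mapping, the `pydata` default and the `pick_install_script` name are assumptions.

```python
import os

# Script names are from the suggestion above; the mapping itself is hypothetical.
INSTALL_SCRIPTS = {
    "conda": "install_conda.sh",
    "pydata": "install_pydata.sh",
}


def pick_install_script(environ=os.environ):
    """Return the install script selected by the BUILD_TYPE variable."""
    # Assumption: default to the pydata path when BUILD_TYPE is unset.
    build_type = environ.get("BUILD_TYPE", "pydata")
    try:
        return INSTALL_SCRIPTS[build_type]
    except KeyError:
        raise ValueError("unknown BUILD_TYPE: %r" % build_type)


print(pick_install_script({"BUILD_TYPE": "conda"}))   # install_conda.sh
```

Rejecting unknown values keeps a typo in the build matrix from silently falling back to the wrong install path.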

cpcloud closed this in #8030 Aug 14, 2014

cpcloud was unassigned by wesm Oct 12, 2016
