TST: test_encode in test_html #7927

Closed
jreback opened this issue Aug 4, 2014 · 27 comments · Fixed by #8030
Labels
Bug; IO Data (IO issues that don't fit into a more specific label); IO HTML (read_html, to_html, Styler.apply, Styler.applymap); Testing (pandas testing functions or related to the test suite)
Comments

@jreback
Contributor

jreback commented Aug 4, 2014

@cpcloud not sure if this is something I did (or didn't do)

I was testing the index sub-class on 3.4 (may have appeared on travis too)

/mnt/home/jreback/venv/py3.4/index/pandas/io/tests/data/html_encoding/chinese_utf-16.html'

======================================================================
ERROR: test_encode (pandas.io.tests.test_html.TestReadHtmlEncoding)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/mnt/home/jreback/venv/py3.4/index/pandas/io/tests/test_html.py", line 627, in test_encode
    from_string = self.read_string(f, encoding).pop()
  File "/mnt/home/jreback/venv/py3.4/index/pandas/io/tests/test_html.py", line 622, in read_string
    return self.read_html(fobj.read(), encoding=encoding, index_col=0)
  File "/mnt/home/jreback/venv/py3.4/index/pandas/io/tests/test_html.py", line 610, in read_html
    return read_html(*args, **kwargs)
  File "/mnt/home/jreback/venv/py3.4/index/pandas/io/html.py", line 843, in read_html
    parse_dates, tupleize_cols, thousands, attrs, encoding)
  File "/mnt/home/jreback/venv/py3.4/index/pandas/io/html.py", line 709, in _parse
    raise_with_traceback(retained)
  File "/mnt/home/jreback/venv/py3.4/index/pandas/compat/__init__.py", line 705, in raise_with_traceback
    raise exc.with_traceback(traceback)
TypeError: Cannot read object of type 'bytes'

----------------------------------------------------------------------
Ran 64 tests in 61.014s

FAILED (errors=1)
(py3.4)jreback@sheep:~/venv/py3.4/index$ nosetests pandas//io/tests/test_html.py  --pdb --pdb-failure^C
(py3.4)jreback@sheep:~/venv/py3.4/index$ python ci/print_versions.py 

INSTALLED VERSIONS
------------------
commit: d1c4fbb0d170cfaf920a27907c014e8cc45752d1
python: 3.4.0.beta.3
python-bits: 64
OS: Linux
OS-release: 2.6.32-5-amd64
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: None
nose: 1.3.3
Cython: 0.20.2
numpy: 1.8.0
scipy: 0.13.3
statsmodels: None
IPython: None
sphinx: None
patsy: 0.3.0
scikits.timeseries: None
dateutil: 2.2
pytz: 2014.4
bottleneck: 0.8.0
tables: 3.1.0
numexpr: 2.4
matplotlib: 1.3.1
openpyxl: 2.0.4
xlrd: 0.9.3
xlwt: None
xlsxwriter: 0.5.6
lxml: 3.3.5
bs4: 4.3.2
html5lib: 0.999
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: 0.9.6
pymysql: 0.6.1.None
psycopg2: 2.5.2 (dt dec pq3 ext)
@jreback jreback added this to the 0.15.0 milestone Aug 4, 2014
@cpcloud cpcloud self-assigned this Aug 4, 2014
@cpcloud
Member

cpcloud commented Aug 4, 2014

looks like those farm animals are byte-ing.

sorry i couldn't resist

@cpcloud
Member

cpcloud commented Aug 4, 2014

jokes aside not sure what's going on here let me take a look

@cpcloud
Member

cpcloud commented Aug 14, 2014

@jreback any reason why this doesn't currently fail?

i can repro locally, but this failure isn't showing up on travis

@jreback
Contributor Author

jreback commented Aug 14, 2014

I think this is using a pretty recent bs4 (4.3.2), and we're not testing on travis with that version

@cpcloud
Member

cpcloud commented Aug 14, 2014

but that's not actually where the bug is. it happens because i don't check for both bytes and str types in _read; i only check for str types. this is way before any parser libraries get called

@cpcloud
Member

cpcloud commented Aug 14, 2014

this line in the tests

        with open(f, 'rb') as fobj:
            return self.read_html(fobj.read(), encoding=encoding, index_col=0)

should pass in a bytes type in py3 and this line

    elif isinstance(obj, string_types):

should test False (because string_types == (str,) in py3 and obj is bytes) and thus the type error
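
A minimal sketch of the check described above, assuming Python 3. The helper names `read_str_only` and `read_str_or_bytes` and the name `binary_type` are made up for illustration; this is not the actual `_read` code in pandas/io/html.py, only the shape of the `isinstance` chain and one way it could be widened to accept bytes.

```python
# Hypothetical stand-ins for the check discussed above; not the pandas source.
string_types = (str,)   # per the comment above, the Python 3 value
binary_type = bytes     # name chosen for this sketch


def read_str_only(obj):
    """Mimic the check as described: only str passes the isinstance test."""
    if hasattr(obj, "read"):
        return obj.read()
    elif isinstance(obj, string_types):
        return obj
    raise TypeError("Cannot read object of type %r" % type(obj).__name__)


def read_str_or_bytes(obj):
    """One possible fix: accept bytes as well as str."""
    if hasattr(obj, "read"):
        return obj.read()
    elif isinstance(obj, string_types + (binary_type,)):
        return obj
    raise TypeError("Cannot read object of type %r" % type(obj).__name__)


# Reading a file opened with 'rb' yields bytes, as in the test helper.
raw = "<table><tr><td>1</td></tr></table>".encode("utf-16")

error_message = None
try:
    read_str_only(raw)
except TypeError as exc:
    error_message = str(exc)

accepted = read_str_or_bytes(raw)
```

With the str-only check, the bytes payload falls through to the `TypeError` seen in the traceback; the widened check lets it through so the caller can decode it with the requested encoding.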

@cpcloud
Member

cpcloud commented Aug 14, 2014

@jreback don't worry about it i'll figure it out

@cpcloud
Member

cpcloud commented Aug 14, 2014

actually those tests aren't even being run on travis

https://travis-ci.org/cpcloud/pandas/jobs/32559013

@jreback
Contributor Author

jreback commented Aug 14, 2014

really, they don't appear to be skipped?

@cpcloud
Member

cpcloud commented Aug 14, 2014

i know it's strange ... i think there's an installation issue tho not sure

@cpcloud
Member

cpcloud commented Aug 14, 2014

i changed to python setup.py develop in the conda builds

@cpcloud
Member

cpcloud commented Aug 14, 2014

i don't see why that should matter tho

@cpcloud
Member

cpcloud commented Aug 14, 2014

ok it turns out that does matter. when i install using the sdist method everything passes and when i run it using make develop it doesn't. mind boggling

the test is run tho which is why this is strange

@cpcloud
Member

cpcloud commented Aug 14, 2014

@jreback any reason to use sdist vs make develop on travis?

@jreback
Contributor Author

jreback commented Aug 14, 2014

no idea

@cpcloud
Member

cpcloud commented Aug 14, 2014

oh duh this is totally a data path issue,

self.files is [] so nothing is actually iterated over
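
A rough illustration of why an empty file list makes the test pass without testing anything. Everything here is an assumption for the sake of the example: `collect_files`, `run_encode_checks`, and the guarded variant are hypothetical helpers, and the temporary directory stands in for an install that shipped without the html_encoding data files.

```python
import glob
import os
import tempfile


def collect_files(data_dir):
    """Gather test inputs with a glob (hypothetical helper)."""
    return sorted(glob.glob(os.path.join(data_dir, "html_encoding", "*.html")))


def run_encode_checks(files, check):
    """Run check on every file; with an empty list the body never executes."""
    ran = 0
    for f in files:
        check(f)
        ran += 1
    return ran


def run_encode_checks_guarded(files, check):
    """Same loop, but fail loudly when no data files were found."""
    assert files, "no test data found; check the data path"
    return run_encode_checks(files, check)


def failing_check(path):
    # Stands in for the real failure; it would raise on any file.
    raise TypeError("Cannot read object of type 'bytes'")


# An empty directory plays the role of an install without the data files.
with tempfile.TemporaryDirectory() as install_without_data:
    files = collect_files(install_without_data)
    ran = run_encode_checks(files, failing_check)  # no error: nothing to loop over
```

Asserting that the list is non-empty before looping, as in `run_encode_checks_guarded`, is one way to turn a silent pass into a visible failure.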

@cpcloud
Member

cpcloud commented Aug 14, 2014

so after this let's try to wrap up the conda stuff, maybe wait a little longer to see if anyone replies to the py32 dropping

@jreback
Contributor Author

jreback commented Aug 14, 2014

what are you doing about the numpy master build? can you support that in conda?

if not, maybe we leave the 3.2/numpy_master as is?

@cpcloud
Member

cpcloud commented Aug 14, 2014

can we use the pandas wheel server to build things? if so i can build a nightly conda package for numpy master ...

3.2 is another story and will take longer, because i have to build all the required packages for python 3.2 and i have to build python 3.2 itself

@cpcloud
Member

cpcloud commented Aug 14, 2014

i've almost got the python 3.2 dist ready 2 go, then i can start building the deps on that

@jreback
Contributor Author

jreback commented Aug 14, 2014

ok that's fine.

but why not have an install_conda.sh and an install_pydata.sh (and call one or the other based on an env variable)? seems simple enough

@cpcloud
Member

cpcloud commented Aug 14, 2014

could do that too

@cpcloud
Member

cpcloud commented Aug 14, 2014

i guess i was just trying to use one thing to do it all, which turned out to be harder than i might've thought, mostly bc of py32

@cpcloud cpcloud closed this as completed Aug 14, 2014
@cpcloud cpcloud reopened this Aug 14, 2014
@jreback
Contributor Author

jreback commented Aug 14, 2014

I mean you could be specific about the requirements files (to avoid confusion)

requirements_conda-2.7.txt, requirements_pydata-3.2.txt

and such

@jreback
Contributor Author

jreback commented Aug 14, 2014

then can leave numpy_master/3.2 alone (as sort of static cases)

@cpcloud
Member

cpcloud commented Aug 14, 2014

i see ok i'll do that

@jreback
Contributor Author

jreback commented Aug 14, 2014

yeh just have install.sh have a big if-then (or have 2 separate scripts) and just call depending on the variable (maybe just add another one)

BUILD_TYPE=conda|pydata or something
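
The dispatch idea could look roughly like the sketch below, written in Python only to show the logic (the thread proposes shell scripts). The script and requirements file names come from the comments above; the functions `pick_install_script` and `requirements_file` and the `pydata` default are assumptions, not anything in the pandas CI setup.

```python
import os

# Script names as suggested in the thread; the mapping itself is hypothetical.
INSTALL_SCRIPTS = {
    "conda": "install_conda.sh",
    "pydata": "install_pydata.sh",
}


def pick_install_script(environ=os.environ):
    """Choose an install script from BUILD_TYPE (default is an assumption)."""
    build_type = environ.get("BUILD_TYPE", "pydata")
    if build_type not in INSTALL_SCRIPTS:
        raise ValueError("unknown BUILD_TYPE: %r" % build_type)
    return INSTALL_SCRIPTS[build_type]


def requirements_file(build_type, python_version):
    """Build a name like requirements_conda-2.7.txt."""
    return "requirements_%s-%s.txt" % (build_type, python_version)


script = pick_install_script({"BUILD_TYPE": "conda"})
reqs = requirements_file("pydata", "3.2")
```

Keeping the build type in the requirements file name means the numpy_master and 3.2 builds can keep their own static files while the conda builds get theirs.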
