BUG/TST: read_html should follow pandas conventions when creating empty data #6447

cpcloud · 2014-02-22T21:33:01Z

closes #5129
closes #6445

This is a very specific bug fix. Only using lxml exposes this bug, whereas using
bs4 raises an exception. lxml drops data which allows it to parse the multiindex
header differently and succeed.

cpcloud · 2014-02-24T20:24:15Z

@jreback what do u think about this? this bug makes me think we should change the default flavor to bs4, because then picking lxml forces you to say "i'm ok with dropping data", whereas bs4 will keep the data and raise b/c there's empty data in the header rows

cpcloud · 2014-02-24T20:25:09Z

of course that is backwards incompatible .... but only in very few cases i think ... just not sure ... it's annoying that the two parsers will not parse the same given the same data

cpcloud · 2014-02-24T20:26:11Z

regardless ... the nan should be changed to the empty string bc that's how the text parser detects empty multiindexes

jreback · 2014-02-24T21:19:57Z

no idea as I really don't use this

go with consistency if u can

cpcloud · 2014-02-27T17:48:10Z

@jreback going to merge after travis passes ...

jreback · 2014-02-27T18:08:24Z

sure

cpcloud · 2014-02-27T18:09:53Z

Alright I'm going to reopen this, I screwed something up with git.

BUG/TST: read_html should follow pandas conventions when creating empty data

cpcloud self-assigned this Feb 23, 2014

cpcloud closed this Feb 27, 2014

cpcloud deleted the read-html-float-iterable-5129 branch February 27, 2014 17:59

cpcloud restored the read-html-float-iterable-5129 branch February 27, 2014 18:08

cpcloud deleted the read-html-float-iterable-5129 branch February 27, 2014 18:10

cpcloud restored the read-html-float-iterable-5129 branch February 27, 2014 18:10

cpcloud reopened this Feb 27, 2014

cpcloud added a commit that referenced this pull request Feb 27, 2014

Merge pull request #6447 from cpcloud/read-html-float-iterable-5129

fb6b803

BUG/TST: read_html should follow pandas conventions when creating empty data

cpcloud merged commit fb6b803 into pandas-dev:master Feb 27, 2014

cpcloud deleted the read-html-float-iterable-5129 branch February 27, 2014 19:02

BUG/TST: read_html should follow pandas conventions when creating empty data

1d573b4
