New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Python parser now accepts delim_whitespace=True #12958

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
2 participants
@gfyoung
Member

gfyoung commented Apr 22, 2016

Title is self-explanatory.

@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Apr 22, 2016

Contributor

can you confirm with #12686 that all options are covered?

Contributor

jreback commented Apr 22, 2016

can you confirm with #12686 that all options are covered?

@jreback jreback added the IO CSV label Apr 22, 2016

@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Apr 22, 2016

Contributor

I think doc-string and io.rst need small edits for this (to remove that this is only for c engine)

Contributor

jreback commented Apr 22, 2016

I think doc-string and io.rst need small edits for this (to remove that this is only for c engine)

@jreback jreback added this to the 0.18.1 milestone Apr 22, 2016

@gfyoung

This comment has been minimized.

Show comment
Hide comment
@gfyoung

gfyoung Apr 22, 2016

Member

I don't quite understand your question. Could you clarify?

Ah, yes, good point. Need to change what I just added since it's supported for Python now!

Member

gfyoung commented Apr 22, 2016

I don't quite understand your question. Could you clarify?

Ah, yes, good point. Need to change what I just added since it's supported for Python now!

@jreback

View changes

Show outdated Hide outdated pandas/io/tests/test_parsers.py Outdated
@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Apr 22, 2016

Contributor

I meant we are now listing all of the options in that issue so that we can check them off. And you have them listed in the variables to check for the options, just want a cross check.

bonus points if you can autogenerate a section in the doc-string which explicity lists the unsupported options (by using those variables). Or maybe don't need to auto-generate, maybe just list a section in doc-string & io.rst to at least list the un-described / only certain engine supported options.

Contributor

jreback commented Apr 22, 2016

I meant we are now listing all of the options in that issue so that we can check them off. And you have them listed in the variables to check for the options, just want a cross check.

bonus points if you can autogenerate a section in the doc-string which explicity lists the unsupported options (by using those variables). Or maybe don't need to auto-generate, maybe just list a section in doc-string & io.rst to at least list the un-described / only certain engine supported options.

@gfyoung

This comment has been minimized.

Show comment
Hide comment
@gfyoung

gfyoung Apr 22, 2016

Member

Well, they used to be a copy of _c_parser_defaults, which I would consider to be the "correct" version until demonstrated otherwise (Python will raise if any of these are passed with a python engine specified). So in fact the list for C-only features in #12686 should be the one I created in this PR, which it is not.

Member

gfyoung commented Apr 22, 2016

Well, they used to be a copy of _c_parser_defaults, which I would consider to be the "correct" version until demonstrated otherwise (Python will raise if any of these are passed with a python engine specified). So in fact the list for C-only features in #12686 should be the one I created in this PR, which it is not.

@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Apr 22, 2016

Contributor

@gfyoung if you can post a correct list I will update

Contributor

jreback commented Apr 22, 2016

@gfyoung if you can post a correct list I will update

@gfyoung

This comment has been minimized.

Show comment
Hide comment
@gfyoung

gfyoung Apr 22, 2016

Member

FYI, Travis has just given the green light for my changes, so if you don't have any other code-related complaints, I'll update the documentation with [ci skip].

Member

gfyoung commented Apr 22, 2016

FYI, Travis has just given the green light for my changes, so if you don't have any other code-related complaints, I'll update the documentation with [ci skip].

@gfyoung

This comment has been minimized.

Show comment
Hide comment
@gfyoung

gfyoung Apr 22, 2016

Member
as_recarray
na_filter
compact_ints
use_unsigned
low_memory
memory_map
buffer_lines
error_bad_lines
warn_bad_lines
dtype
decimal
float_precision

If any of these are non-default with a python engine, the parser will raise.

Member

gfyoung commented Apr 22, 2016

as_recarray
na_filter
compact_ints
use_unsigned
low_memory
memory_map
buffer_lines
error_bad_lines
warn_bad_lines
dtype
decimal
float_precision

If any of these are non-default with a python engine, the parser will raise.

@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Apr 22, 2016

Contributor

ok I edited the other issue, pls have a look

Contributor

jreback commented Apr 22, 2016

ok I edited the other issue, pls have a look

@gfyoung

This comment has been minimized.

Show comment
Hide comment
@gfyoung

gfyoung Apr 22, 2016

Member

Are those uncapitalised headers (e.g. "undocumented in C engine only") sub sections for "Features supported in the C engine only"? If not, all of the ones that I listed just now should also go under "Features supported in the C engine only"

Member

gfyoung commented Apr 22, 2016

Are those uncapitalised headers (e.g. "undocumented in C engine only") sub sections for "Features supported in the C engine only"? If not, all of the ones that I listed just now should also go under "Features supported in the C engine only"

@@ -101,8 +101,7 @@ delim_whitespace : boolean, default False
Specifies whether or not whitespace (e.g. ``' '`` or ``'\t'``)
will be used as the delimiter. Equivalent to setting ``sep='\+s'``.
If this option is set to True, nothing should be passed in for the
``delimiter`` parameter. This parameter is currently supported for
the C parser only.
``delimiter`` parameter.

This comment has been minimized.

@jreback

jreback Apr 22, 2016

Contributor

add that this was added in 0.18.1

@jreback

jreback Apr 22, 2016

Contributor

add that this was added in 0.18.1

This comment has been minimized.

@jreback

jreback Apr 22, 2016

Contributor

@gfyoung if you can update this, and lmlk when you push.

@jreback

jreback Apr 22, 2016

Contributor

@gfyoung if you can update this, and lmlk when you push.

This comment has been minimized.

@gfyoung

gfyoung Apr 22, 2016

Member

Done.

@gfyoung

gfyoung Apr 22, 2016

Member

Done.

@@ -57,7 +57,7 @@
Specifies whether or not whitespace (e.g. ``' '`` or ``'\t'``) will be
used as the sep. Equivalent to setting ``sep='\+s'``. If this option
is set to True, nothing should be passed in for the ``delimiter``
parameter. This parameter is currently supported for the C parser only.
parameter.

This comment has been minimized.

@jreback

jreback Apr 22, 2016

Contributor

same here

@jreback

jreback Apr 22, 2016

Contributor

same here

This comment has been minimized.

@gfyoung

gfyoung Apr 22, 2016

Member

Done.

@gfyoung

gfyoung Apr 22, 2016

Member

Done.

@@ -647,8 +660,13 @@ def _get_options_with_defaults(self, engine):
value = kwds[argname]
if engine != 'c' and value != default:
raise ValueError('The %r option is not supported with the'
' %r engine' % (argname, engine))
if ('python' in engine and

This comment has been minimized.

@jreback

jreback Apr 22, 2016

Contributor

engine == 'python'

@jreback

jreback Apr 22, 2016

Contributor

engine == 'python'

This comment has been minimized.

@jreback

jreback Apr 22, 2016

Contributor

nvm. we have python-fwf as well

@jreback

jreback Apr 22, 2016

Contributor

nvm. we have python-fwf as well

This comment has been minimized.

@gfyoung

gfyoung Apr 22, 2016

Member

Yep, that's why I wrote it like that.

@gfyoung

gfyoung Apr 22, 2016

Member

Yep, that's why I wrote it like that.

@@ -4652,6 +4655,9 @@ def test_invalid_c_parser_opts_with_not_c_parser(self):
engines = 'python', 'python-fwf'
for default in c_defaults:
for engine in engines:
if 'python' in engine and default not in py_unsupported:

This comment has been minimized.

@jreback

jreback Apr 22, 2016

Contributor

does this fully test the converse. e.g. python only options on c-parser?

@jreback

jreback Apr 22, 2016

Contributor

does this fully test the converse. e.g. python only options on c-parser?

This comment has been minimized.

@gfyoung

gfyoung Apr 22, 2016

Member

The test name indicates that the answer is no. However, I do believe that the invalid parameters are tested but in different parts of the test suite (again, this is where a refactor would be good in a separate PR).

@gfyoung

gfyoung Apr 22, 2016

Member

The test name indicates that the answer is no. However, I do believe that the invalid parameters are tested but in different parts of the test suite (again, this is where a refactor would be good in a separate PR).

This comment has been minimized.

@jreback

jreback Apr 22, 2016

Contributor

ok another PR is fine.

ideally we have some tests which will explicity test for the unsupported features (so that when one IS defined in say the python-parser that was previously only in the c-parser) it will force an explict change (in the PR).

@jreback

jreback Apr 22, 2016

Contributor

ok another PR is fine.

ideally we have some tests which will explicity test for the unsupported features (so that when one IS defined in say the python-parser that was previously only in the c-parser) it will force an explict change (in the PR).

This comment has been minimized.

@gfyoung

gfyoung Apr 22, 2016

Member

Yep, I certainly agree. Will make note for that PR.

@gfyoung

gfyoung Apr 22, 2016

Member

Yep, I certainly agree. Will make note for that PR.

@jreback jreback closed this in 8a9e643 Apr 22, 2016

@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Apr 22, 2016

Contributor

ty

Contributor

jreback commented Apr 22, 2016

ty

@gfyoung gfyoung deleted the gfyoung:delim-whitespace-python branch Apr 22, 2016

nps added a commit to nps/pandas that referenced this pull request May 17, 2016

ENH: Python parser now accepts delim_whitespace=True
Title is self-explanatory.

Author: gfyoung <gfyoung17@gmail.com>

Closes pandas-dev#12958 from gfyoung/delim-whitespace-python and squashes the following commits:

08da127 [gfyoung] ENH: Python parser now accepts delim_whitespace=True
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment