Fix unbound local with bad engine #16511

Merged
jreback merged 3 commits into pandas-dev:master from jtratner:fix-unbound-local-with-bad-engine on May 31, 2017

4 participants
@jtratner
Contributor

jtratner commented May 26, 2017

This was so small I figured simpler to put up a PR rather than issue then PR. :)

Previously, passing a bad engine to read_csv gave a less-than-informative UnboundLocalError:

Traceback (most recent call last):
  File "example_test.py", line 9, in <module>
    pd.read_csv(tfp.name, engine='pyt')
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 655, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 405, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 762, in __init__
    self._make_engine(self.engine)
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 972, in _make_engine
    self._engine = klass(self.f, **self.options)
UnboundLocalError: local variable 'klass' referenced before assignment

Now it gives a much nicer ValueError:

Traceback (most recent call last):
  File "example_test.py", line 9, in <module>
    pd.read_csv(fp, engine='pyt')
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 655, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 405, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 762, in __init__
    self._make_engine(self.engine)
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 974, in _make_engine
    ' or "python-fwf")' % engine)
ValueError: Unknown engine: 'pyt' (valid are "c", "python", or "python-fwf")
  • tests added / passed - added test that correct ValueError is generated
  • passes git diff upstream/master --name-only -- '*.py' | flake8 --diff
  • whatsnew entry

I was not sure where to stick the test or the whatsnew entry (or if a whatsnew entry is really necessary), so please tell me if I should move it elsewhere.

Cheers!
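
The failure mode described above can be reproduced without pandas. The following is a minimal sketch, not the pandas source: `make_engine_buggy`, `make_engine_fixed`, and the two placeholder parser classes are hypothetical names standing in for `_make_engine` and the real parser classes.

```python
class PythonParser:
    """Placeholder for a real parser class (hypothetical)."""


class FixedWidthFieldParser:
    """Placeholder for a real parser class (hypothetical)."""


def make_engine_buggy(engine):
    # No else branch: an unrecognized name leaves `klass` unassigned.
    if engine == "python":
        klass = PythonParser
    elif engine == "python-fwf":
        klass = FixedWidthFieldParser
    return klass()  # raises UnboundLocalError for any other engine name


def make_engine_fixed(engine):
    if engine == "python":
        klass = PythonParser
    elif engine == "python-fwf":
        klass = FixedWidthFieldParser
    else:
        # Fail early with a message that names the offending value.
        raise ValueError("Unknown engine: %r" % (engine,))
    return klass()
```

With the else branch in place, a typo such as `engine='pyt'` is reported as a bad argument value rather than as what looks like an internal bug.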

Contributor

jtratner commented May 26, 2017

(or if you want to do this a totally different way / different error I can make changes or close)

doc/source/whatsnew/v0.21.0.txt
@@ -83,6 +83,8 @@ Performance Improvements
Bug Fixes
~~~~~~~~~
+- Passing an invalid engine to `read_csv` now raises an informative ValueError rather than UnboundLocalError. (:issue:`16511`)

@jreback

jreback May 26, 2017

Contributor

:func:`read_csv` is better, and use double backticks on ValueError.

@jreback

jreback May 26, 2017

Contributor

You can put this in 0.20.2.

pandas/tests/io/parser/test_parsers.py
@@ -99,3 +102,14 @@ def read_table(self, *args, **kwds):
         kwds = kwds.copy()
         kwds['engine'] = self.engine
         return read_table(*args, **kwds)
+
+
+class TestParameterValidation(object):

@jreback

jreback May 26, 2017

Contributor

use tm.ensure_clean() as path

@gfyoung

gfyoung May 26, 2017

Member

Let's move this test into common.py (in same directory). This base test class should not be touched for organizational purposes.

@gfyoung

gfyoung May 26, 2017

Member

Also, why are we writing a round trip test? This can be much simpler:

data = "a\n1"
msg = "Unknown engine"
with tm.assert_raises_regex(ValueError, msg):
    # Don't use self.read_csv here: it overrides the engine parameter.
    read_csv(StringIO(data), engine='pyt')

Oh, and yes, use tm.assert_raises_regex instead of pytest.raises(...) (pandas regex error-message matching is a little more compact).
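
For readers without the pandas test helpers at hand, the same shape of test can be sketched with the standard library alone. This is only an illustration under stated assumptions: `read_csv_stub` is a hypothetical stand-in that validates the engine name and nothing else, and `assertRaisesRegex` from `unittest` plays the role of the regex-matching helper.

```python
import unittest


def read_csv_stub(data, engine="c"):
    # Hypothetical stand-in for read_csv: it only validates the engine name.
    if engine not in ("c", "python", "python-fwf"):
        raise ValueError("Unknown engine: {!r}".format(engine))
    return data


class TestUnknownEngine(unittest.TestCase):
    def test_unknown_engine(self):
        # No file round trip is needed: the error is raised before any parsing.
        with self.assertRaisesRegex(ValueError, "Unknown engine"):
            read_csv_stub("a\n1", engine="pyt")
```

The point is the one made in the review: an in-memory string is enough input, because the check happens before any data is read.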

@jtratner

jtratner May 26, 2017

Contributor

docstring for tm.assert_raises_regex says to use pytest.raises

@jtratner

jtratner May 26, 2017

Contributor

so okay that this will get run once for every engine, even though it's the same test?

@gfyoung

gfyoung May 26, 2017

Member

Yikes! You're right. We changed our minds about that. Mind fixing the documentation on that in a separate commit / PR?

@gfyoung

gfyoung May 26, 2017

Member

so okay that this will get run once for every engine, even though it's the same test?

Actually, better idea: move it to test_common.py (the directory above)
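
The concern in this thread can be sketched as follows. This is a simplified illustration, not the pandas test suite: `ParserTests`, `TestCParser`, `TestPythonParser`, and `fake_read_csv` are hypothetical names. A shared test class whose `read_csv` wrapper forces the subclass's engine means every inherited test runs once per engine, and any `engine=` argument passed by a test is silently replaced.

```python
def fake_read_csv(data, engine="c"):
    # Hypothetical reader that simply reports which engine it was given.
    return engine


class ParserTests:
    """Shared tests; each engine-specific subclass inherits all of them."""

    engine = None

    def read_csv(self, *args, **kwds):
        # Same shape as the wrapper in the diff above: the subclass engine wins.
        kwds = kwds.copy()
        kwds["engine"] = self.engine
        return fake_read_csv(*args, **kwds)

    def engine_actually_used(self):
        # The caller asks for 'pyt', but the wrapper overrides it.
        return self.read_csv("a\n1", engine="pyt")


class TestCParser(ParserTests):
    engine = "c"


class TestPythonParser(ParserTests):
    engine = "python"
```

Here `TestCParser().engine_actually_used()` reports `"c"`, never `"pyt"`, which is why a bad-engine test has to call the top-level `read_csv` directly and gains nothing from being repeated per engine.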

@jreback

lgtm. minor comments.

pandas/tests/io/parser/test_parsers.py
+    def test_unknown_engine(self):
+        with tempfile.NamedTemporaryFile() as fp:
+            df = tm.makeDataFrame()
+            df.to_csv(fp.name)

@jreback

jreback May 26, 2017

Contributor

@gfyoung good location for this type of test?

@gfyoung

gfyoung May 26, 2017

Member

No. There shouldn't be any tests in this file. I made a comment here about it.

Contributor

jtratner commented May 26, 2017

made all the requested changes - thanks for the review @jreback !

codecov bot commented May 26, 2017

Codecov Report

Merging #16511 into master will decrease coverage by 0.36%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #16511      +/-   ##
==========================================
- Coverage   90.79%   90.43%   -0.37%     
==========================================
  Files         161      161              
  Lines       51063    51046      -17     
==========================================
- Hits        46363    46162     -201     
- Misses       4700     4884     +184

Flag         Coverage Δ
#multiple    88.27% <100%> (-0.37%) ⬇️
#single      40.16% <0%> (ø) ⬆️

Impacted Files                  Coverage Δ
pandas/io/parsers.py            95.33% <100%> (-0.33%) ⬇️
pandas/io/formats/excel.py      74.24% <0%> (-22.41%) ⬇️
pandas/io/excel.py              62.31% <0%> (-18.33%) ⬇️
pandas/conftest.py              95.83% <0%> (-0.6%) ⬇️
pandas/util/testing.py          80.79% <0%> (-0.2%) ⬇️
pandas/core/series.py           94.71% <0%> (-0.19%) ⬇️
pandas/core/generic.py          92.16% <0%> (-0.1%) ⬇️
pandas/core/resample.py         96.08% <0%> (-0.02%) ⬇️
pandas/core/reshape/pivot.py    95.08% <0%> (ø) ⬆️
... and 4 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b0d9ee0...26a3300.

codecov bot commented May 26, 2017

Codecov Report

Merging #16511 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #16511      +/-   ##
==========================================
+ Coverage   90.79%   90.79%   +<.01%     
==========================================
  Files         161      161              
  Lines       51063    51064       +1     
==========================================
+ Hits        46363    46366       +3     
+ Misses       4700     4698       -2

Flag         Coverage Δ
#multiple    88.63% <100%> (ø) ⬆️
#single      40.15% <0%> (-0.01%) ⬇️

Impacted Files                      Coverage Δ
pandas/io/parsers.py                95.66% <100%> (ø) ⬆️
pandas/core/indexes/datetimes.py    95.33% <0%> (+0.09%) ⬆️
pandas/compat/__init__.py           62.22% <0%> (+0.44%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b0d9ee0...7c5f2c4.

BUG: Fix uninformative error on bad CSV engine name
Previously had an UnboundLocalError - no fun!
pandas/io/parsers.py
@@ -969,6 +969,9 @@ def _make_engine(self, engine='c'):
                 klass = PythonParser
             elif engine == 'python-fwf':
                 klass = FixedWidthFieldParser
+            else:
+                raise ValueError('Unknown engine: %r (valid are "c", "python",'

@gfyoung

gfyoung May 26, 2017

Member

How about valid options are... instead of valid are...

Member

gfyoung commented May 26, 2017

@jtratner : Thanks for this! Seems like we need some more fuzzy-testing for read_csv...

One minor comment about the actual error message, and two bigger ones regarding the actual test.

doc/source/whatsnew/v0.20.2.txt
@@ -39,6 +39,9 @@ Bug Fixes
- Bug in using ``pathlib.Path`` or ``py.path.local`` objects with io functions (:issue:`16291`)
- Bug in ``DataFrame.update()`` with ``overwrite=False`` and ``NaN values`` (:issue:`15593`)
+- Passing an invalid engine to :func:`read_csv` now raises an informative
+ ValueError rather than UnboundLocalError. (:issue:`16511`)

@jreback

jreback May 26, 2017

Contributor

Use double backticks on ValueError and UnboundLocalError.

pandas/io/parsers.py
@@ -969,6 +969,9 @@ def _make_engine(self, engine='c'):
                 klass = PythonParser
             elif engine == 'python-fwf':
                 klass = FixedWidthFieldParser
+            else:
+                raise ValueError('Unknown engine: %r (valid are "c", "python",'
+                                 ' or "python-fwf")' % engine)

@jreback

jreback May 26, 2017

Contributor

Can you use .format(...)?
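
A sketch of what this request amounts to, combined with the "valid options are" wording suggested earlier in the review. The two helper function names are hypothetical, and the exact message that was merged may differ from this.

```python
def unknown_engine_message_percent(engine):
    # Style from the diff under review. Wrapping the value in a 1-tuple
    # avoids surprises if `engine` is itself a tuple.
    return ('Unknown engine: %r (valid are "c", "python",'
            ' or "python-fwf")' % (engine,))


def unknown_engine_message_format(engine):
    # str.format equivalent; `!r` keeps the repr-style quoting of %r.
    return ('Unknown engine: {engine!r} (valid options are "c", "python",'
            ' or "python-fwf")'.format(engine=engine))
```

Both variants quote the bad value the same way. The named `{engine!r}` field mainly makes it easier to see which value lands where once a message has several placeholders.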

jtratner added some commits May 27, 2017

Contributor

jtratner commented May 27, 2017

okay, covered everybody's comments and moved tests again

Contributor

jtratner commented May 31, 2017

@gfyoung @jreback - if either of you have a moment to look - all tests are green and I've made your changes.

Member

gfyoung commented May 31, 2017

@jtratner : LGTM!

@jreback jreback added this to the 0.20.2 milestone May 31, 2017

@jreback jreback merged commit 9b0ea41 into pandas-dev:master May 31, 2017

5 checks passed

ci/circleci: Your tests passed on CircleCI!
codecov/patch: 100% of diff hit (target 50%)
codecov/project: 90.79% (target 82%)
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed

Contributor

jreback commented May 31, 2017

thanks!

@jtratner jtratner deleted the jtratner:fix-unbound-local-with-bad-engine branch May 31, 2017

TomAugspurger added a commit to TomAugspurger/pandas that referenced this pull request Jun 1, 2017

TomAugspurger added a commit that referenced this pull request Jun 4, 2017

Kiv added a commit to Kiv/pandas that referenced this pull request Jun 11, 2017

stangirala added a commit to stangirala/pandas that referenced this pull request Jun 11, 2017
