BUG: fix HDFStore.append with all empty strings error (GH12242) #23435

Merged
merged 3 commits into pandas-dev:master from the issue-12242 branch
Nov 1, 2018

josham (Contributor) commented Oct 31, 2018

@pep8speaks

Hello @josham! Thanks for submitting the PR.

@gfyoung added the Bug and IO HDF5 (read_hdf, HDFStore) labels Oct 31, 2018

@@ -1216,6 +1216,7 @@ Notice how we now instead output ``np.nan`` itself instead of a stringified form
- :func:`read_sas()` will correctly parse sas7bdat files with data page types having also bit 7 set (so page type is 128 + 256 = 384) (:issue:`16615`)
- Bug in :meth:`detect_client_encoding` where potential ``IOError`` goes unhandled when importing in a mod_wsgi process due to restricted access to stdout. (:issue:`21552`)
- Bug in :func:`to_string()` that broke column alignment when ``index=False`` and width of first column's values is greater than the width of first column's header (:issue:`16839`, :issue:`13032`)
- Bug in :meth:`HDFStore.append` when appending a :class:`DataFrame` with an empty string column and min_itemsize < 8 (:issue:`12242`)

gfyoung (Member) commented Oct 31, 2018

Put double backticks around ``min_itemsize``.

store.append('df', df1, min_itemsize={'x': 1})
store.append('df', df2, min_itemsize={'x': 1})
tm.assert_frame_equal(store.select('df'),
pd.concat([df1, df2]))

Member:

Instead of comparing to pd.concat, can you explicitly construct the DataFrame?

Contributor:

can you push this to a new test? the existing ones are really long already
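
Taken together, the two review suggestions above (a separate test, and an expected frame written out by hand) might look roughly like the sketch below. It is only an illustration, not the test that was merged: the function name, the file path argument, and the contents of `df1` and `df2` are assumptions, since the thread does not show how those frames were built. It also assumes the optional PyTables dependency is installed, which `pd.HDFStore` needs at runtime.

```python
import pandas as pd


def append_with_empty_strings(path):
    """Append a frame whose string column holds only empty strings (GH 12242).

    Returns (result, expected) so the caller can compare them.
    """
    # Hypothetical data: the real df1/df2 are not shown in this thread.
    df1 = pd.DataFrame({"x": ["a", "b", "c"]}, index=[0, 1, 2])
    df2 = pd.DataFrame({"x": ["", ""]}, index=[3, 4])

    # Expected frame spelled out explicitly instead of pd.concat([df1, df2]).
    expected = pd.DataFrame(
        {"x": ["a", "b", "c", "", ""]}, index=[0, 1, 2, 3, 4]
    )

    # min_itemsize below 8 is the condition named in the linked issue.
    with pd.HDFStore(path, mode="w") as store:
        store.append("df", df1, min_itemsize={"x": 1})
        store.append("df", df2, min_itemsize={"x": 1})
        result = store.select("df")

    return result, expected
```

In a test suite the returned pair would typically be compared with `tm.assert_frame_equal(result, expected)`, as in the snippet under review. Writing `expected` by hand keeps the assertion independent of `pd.concat`, so a bug in concatenation cannot mask a bug in the store round trip.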

codecov bot commented Oct 31, 2018

Codecov Report

Merging #23435 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #23435      +/-   ##
==========================================
- Coverage   92.21%   92.21%   -0.01%     
==========================================
  Files         161      161              
  Lines       51189    51187       -2     
==========================================
- Hits        47204    47202       -2     
  Misses       3985     3985

Flag        Coverage Δ
#multiple   90.64% <100%> (-0.01%) ⬇️
#single     42.22% <100%> (ø) ⬆️

Impacted Files                    Coverage Δ
pandas/io/pytables.py             92.43% <100%> (ø) ⬆️
pandas/core/reshape/melt.py       97.32% <0%> (-0.03%) ⬇️
pandas/core/reshape/reshape.py    99.54% <0%> (-0.01%) ⬇️
pandas/core/reshape/pivot.py      96.55% <0%> (ø) ⬆️
pandas/core/reshape/tile.py       94.73% <0%> (ø) ⬆️
pandas/core/reshape/merge.py      94.01% <0%> (ø) ⬆️
pandas/core/reshape/util.py       100% <0%> (ø) ⬆️
pandas/core/reshape/concat.py     97.6% <0%> (ø) ⬆️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a4f9a44...d27fe02.

@jreback jreback added this to the 0.24.0 milestone Oct 31, 2018
jreback (Contributor) commented Nov 1, 2018

can you rebase again, the whatsnew is conflicted. ping on green.

jreback (Contributor) commented Nov 1, 2018

lgtm. ping on green.

josham (Contributor, Author) commented Nov 1, 2018

@jreback green

@jreback jreback merged commit 2eef865 into pandas-dev:master Nov 1, 2018
jreback (Contributor) commented Nov 1, 2018

thanks @josham

@josham josham deleted the issue-12242 branch November 1, 2018 19:18
Development

Successfully merging this pull request may close these issues.

HDFStore.append fails when appending dataframe with empty string column for which min_itemsize < 8