combine_first behavior change #2525

Closed
craustin opened this issue Dec 13, 2012 · 3 comments
Labels
Testing pandas testing functions or related to the test suite

Comments

craustin commented Dec 13, 2012

from datetime import datetime
from pandas import DataFrame

df = DataFrame({'US': [1]}, index=[datetime(2012, 1, 1)])
df2 = DataFrame({}, columns=['EU'])  # empty frame: no rows, one column
df.combine_first(df2)

In 0.9.0, this returns:

            US
2012-01-01   1

In 0.10.0b1, this returns:

            EU  US
2012-01-01 NaN   1

ghost commented Dec 14, 2012

This changed in commit 9edd478, made for #2307.
It's mentioned in RELEASE.rst (a bit opaquely).

Note that as far back as 0.9.0 the docstring read:

Combine two DataFrame objects and default to non-null values in frame
calling the method. Result index will be the union of the two indexes

so this was a long-standing bug that has now been fixed.
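To illustrate the docstring quoted above, here is a minimal sketch of the behaviour as it is described in this thread. The frames `left` and `right` and their values are made up for illustration and do not come from the issue. The sketch avoids asserting column order, since that may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical frames for illustration only (not from the issue).
left = pd.DataFrame({"US": [1.0, np.nan]}, index=["a", "b"])
right = pd.DataFrame({"US": [10.0, 20.0], "EU": [5.0, 6.0]}, index=["b", "c"])

# Non-null values in `left` (the calling frame) win; gaps are filled from `right`.
result = left.combine_first(right)

# The result index is the union of both indexes.
print(list(result.index))
# The result columns are the union of both column sets.
print(sorted(result.columns))
# left's own value at "a" is kept; its NaN at "b" is filled from right.
print(result.loc["a", "US"], result.loc["b", "US"])
```

Rows or columns that exist only in `right` appear in the result, and cells that neither frame can fill stay NaN.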

ghost commented Dec 14, 2012

The column behaviour is not clearly documented, though.
@wesm, what's the right thing here?

wesm (Member) commented Dec 14, 2012

Yes, it should be the union of the columns. I added a unit test. Sorry for the disruption, but I think doing otherwise is a bug.
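A check along these lines could look like the sketch below. It is not the actual test added to pandas; the function name `check_combine_first_unions_columns` is hypothetical, and the frames are the ones from the original report. It only asserts that both columns are present, not their order.

```python
from datetime import datetime

import pandas as pd


def check_combine_first_unions_columns():
    """Hypothetical check: the result should carry the union of both column sets."""
    df = pd.DataFrame({"US": [1]}, index=[datetime(2012, 1, 1)])
    df2 = pd.DataFrame({}, columns=["EU"])  # no rows, one column

    result = df.combine_first(df2)

    # Both the caller's column and the (empty) other frame's column are present.
    assert "US" in result.columns
    assert "EU" in result.columns
    # The caller's value survives; the new column has nothing to fill it with.
    assert result["US"].iloc[0] == 1
    assert result["EU"].isna().all()
    return result


result = check_combine_first_unions_columns()
```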

wesm closed this as completed Dec 14, 2012
yarikoptic added a commit to neurodebian/pandas that referenced this issue Dec 20, 2012
* commit 'v0.10.0b1-51-gbbe2fc1': (518 commits)
  BLD: add patsy, numexpr to ci/print_versions.py
  BUG: fix DataFrame.icol with list of integers when columns are integers with duplicates. close pandas-dev#2259
  TST: unit test to assert behavior described in pandas-dev#2525
  BUG: compat OrderedDict import for python 2.6
  DOC: Emphazise that cython is needed when installing from the repo in install.rst
  BLD: document pytz as a hard dependency
  BUG: import OrderedDict from util.compat for 2.6
  BUG: df.from_dict should respect OrderedDict 2517
  start date -> 7/1/12
  ENH: vbench support for HDFStore; added benchmarchs to compare (100,000) rows: read/write store, read/write store mixed, read/write table, read/write table wide (200 columns), read/write table mixed, query wide/table
  TST: refactoring to speed up test suite
  BUG: more floating point error robustness in rolling mean. close pandas-dev#2527
  BUG: fix python 3 zip usage
  DOC: updated HDFStore docs for indexing support and better explanations on how to deal with strings in indexables/values
  ENH: allow index recreation by calling create_table_index with new parameters
  BUG: fixed versioning of the data, not reporting correct warnings
  BUG: fixed string appending when length of subsequent is longer/shorter that existing; removed meta data saving; disable memory tests (and put a try:except: around it)
  DOC: small doc change w.r.t. min_itemsize
  BUG: fixed string truncation in values by passing min_itemsize = { 'values' : 1024 }
  BUG: non-datetime indicies were not being handled correctly in searchings (via Terms); added support for integer, float, date
  ...