concat DataFrame along timeseries indexes #1702

Closed
mnbbrown opened this issue Jul 30, 2012 · 2 comments

Comments

@mnbbrown
commented Jul 30, 2012

Please see: http://stackoverflow.com/questions/11714768/concat-pandas-dataframe-along-timeseries-indexes

NumPy v1.6.3
Pandas v0.8.1

I have two largish pandas DataFrames (snippets provided) whose date indexes do not fully overlap, and I want to concat them into one:

           NAB.AX                                  CBA.AX
            Close    Volume                         Close    Volume
Date                                    Date
2009-06-05  36.51   4962900             2009-06-08  21.95         0
2009-06-04  36.79   5528800             2009-06-05  21.95   8917000
2009-06-03  36.80   5116500             2009-06-04  22.21  18723600
2009-06-02  36.33   5303700             2009-06-03  23.11  11643800
2009-06-01  36.16   5625500             2009-06-02  22.80  14249900
2009-05-29  35.14  13038600   --AND--   2009-06-01  22.52  11687200
2009-05-28  33.95   7917600             2009-05-29  22.02  22350700
2009-05-27  35.13   4701100             2009-05-28  21.63   9679800
2009-05-26  35.45   4572700             2009-05-27  21.74   9338200
2009-05-25  34.80   3652500             2009-05-26  21.64   8502900

The problem is that if I run this:

keys = ['CBA.AX','NAB.AX']
mv = pandas.concat([data['CBA.AX'][650:660],data['NAB.AX'][650:660]], axis=1, keys=keys)

the following DataFrame is produced:

                                 CBA.AX          NAB.AX        
                              Close  Volume   Close  Volume
Date                                                      
2200-08-16 04:24:21.460041     NaN     NaN     NaN     NaN
2203-05-13 04:24:21.460041     NaN     NaN     NaN     NaN
2206-02-06 04:24:21.460041     NaN     NaN     NaN     NaN
2208-11-02 04:24:21.460041     NaN     NaN     NaN     NaN
2211-07-30 04:24:21.460041     NaN     NaN     NaN     NaN
2219-10-16 04:24:21.460041     NaN     NaN     NaN     NaN
2222-07-12 04:24:21.460041     NaN     NaN     NaN     NaN
2225-04-07 04:24:21.460041     NaN     NaN     NaN     NaN
2228-01-02 04:24:21.460041     NaN     NaN     NaN     NaN
2230-09-28 04:24:21.460041     NaN     NaN     NaN     NaN
2238-12-15 04:24:21.460041     NaN     NaN     NaN     NaN

Does anybody have any idea why this might be the case?

On another point: are there any Python libraries around that pull data from Yahoo and normalise it?

For reference:

data = {
'CBA.AX': <class 'pandas.core.frame.DataFrame'>
    DatetimeIndex: 2313 entries, 2011-12-29 00:00:00 to 2003-01-01 00:00:00
    Data columns:
        Close     2313  non-null values
        Volume    2313  non-null values
    dtypes: float64(1), int64(1),

 'NAB.AX': <class 'pandas.core.frame.DataFrame'>
    DatetimeIndex: 2329 entries, 2011-12-29 00:00:00 to 2003-01-01 00:00:00
    Data columns:
        Close     2329  non-null values
        Volume    2329  non-null values
    dtypes: float64(1), int64(1)
}
@mnbbrown

Author

commented Aug 1, 2012

Pickled DataFrames: http://cl.ly/2B0h0c1p1W3D

@wesm

Member

commented Aug 10, 2012

This has been fixed, referenced in #1745
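
For anyone landing here later, a minimal sketch of the operation the report was aiming for, assuming a pandas release that includes the fix. The two small frames are built by hand from the first rows of the snippets above (the names `nab` and `cba` are illustrative, not from the original `data` dict), and sorting each index ascending before concatenating is a precaution on my part, not something the fix is known to require:

```python
import pandas as pd

# Illustrative frames built from the first rows quoted in the report;
# the real frames hold roughly 2300 rows each.
nab = pd.DataFrame(
    {"Close": [36.51, 36.79, 36.80], "Volume": [4962900, 5528800, 5116500]},
    index=pd.DatetimeIndex(["2009-06-05", "2009-06-04", "2009-06-03"], name="Date"),
)
cba = pd.DataFrame(
    {"Close": [21.95, 21.95, 22.21], "Volume": [0, 8917000, 18723600]},
    index=pd.DatetimeIndex(["2009-06-08", "2009-06-05", "2009-06-04"], name="Date"),
)

keys = ["CBA.AX", "NAB.AX"]

# The source data is in descending date order; sort ascending first
# (a precaution, assumed rather than required), then align on the
# union of dates with axis=1.
mv = pd.concat([cba.sort_index(), nab.sort_index()], axis=1, keys=keys)
mv = mv.sort_index()

print(mv)
```

Dates present in only one frame show up as NaN in the other ticker's columns, and the result has two-level columns (ticker, then Close/Volume).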

@wesm wesm closed this Aug 10, 2012
