New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Request] convert DT to xts efficiently #882

Closed
jangorecki opened this Issue Oct 13, 2014 · 13 comments

Comments

Projects
None yet
5 participants
@jangorecki
Member

jangorecki commented Oct 13, 2014

Proposed feature request:

As data.table is used in finance I wonder if we could convert to xts by reference?
Assumptions:

  • first column contain POSIXct or Date
  • because xts is a matrix all other columns needs to be numeric or integer (others would be dropped)
  • setDT should also accept xts object to ensure the following is TRUE: setXTS(setDT(xts_obj)) == xts_obj

This may enhance the integration of data.table with already well developed functionalities around xts.

@jangorecki jangorecki changed the title from convert to xts by reference: setXTS(dt) to [Request] convert to xts by reference: setXTS(dt) Nov 13, 2014

@arunsrinivasan

This comment has been minimized.

Show comment
Hide comment
@arunsrinivasan

arunsrinivasan Dec 4, 2014

Member

xts object is a matrix, isn't it? If so, can't convert a matrix to data.table by reference IIUC. The underlying data structure isn't the same. A data.frame is a vector of column pointers - rows of each column is contiguous in memory, but not the columns themselves. But a matrix is just a vector with attributes - entire vector is contiguous in memory.

Member

arunsrinivasan commented Dec 4, 2014

xts object is a matrix, isn't it? If so, can't convert a matrix to data.table by reference IIUC. The underlying data structure isn't the same. A data.frame is a vector of column pointers - rows of each column is contiguous in memory, but not the columns themselves. But a matrix is just a vector with attributes - entire vector is contiguous in memory.

@jangorecki

This comment has been minimized.

Show comment
Hide comment
@jangorecki

jangorecki Dec 4, 2014

Member

Thanks. You are right. The matrix is just a dimensioned vector. So by reference cannot be done. Closing as I don't see any alternative for this.

non by-reference wrappers for xts-data.table conversion available here: https://github.com/jangorecki/dwtools/blob/b8981b5658509333378f6325d5b1a0c71bf9da9a/R/xts.R

Member

jangorecki commented Dec 4, 2014

Thanks. You are right. The matrix is just a dimensioned vector. So by reference cannot be done. Closing as I don't see any alternative for this.

non by-reference wrappers for xts-data.table conversion available here: https://github.com/jangorecki/dwtools/blob/b8981b5658509333378f6325d5b1a0c71bf9da9a/R/xts.R

@mattdowle

This comment has been minimized.

Show comment
Hide comment
@mattdowle

mattdowle Dec 8, 2014

Member

Jan - very nice answer : http://stackoverflow.com/a/27356884/403310
Reopening and changed title to consider including this in data.table as per Dirk's comment?

Member

mattdowle commented Dec 8, 2014

Jan - very nice answer : http://stackoverflow.com/a/27356884/403310
Reopening and changed title to consider including this in data.table as per Dirk's comment?

@mattdowle mattdowle reopened this Dec 8, 2014

@mattdowle mattdowle changed the title from [Request] convert to xts by reference: setXTS(dt) to [Request] convert DT to xts efficiently Dec 8, 2014

@jangorecki

This comment has been minimized.

Show comment
Hide comment
@jangorecki

jangorecki Dec 8, 2014

Member

I can prepare PR to data.table.
Just some things to consider:

  1. roxygen did not produce expected NAMESPACE entries in my case, currently it produces:
S3method(as.data.table,xts)
export(as.xts.data.table)

I'm not sure but this might be related to lack of as.data.table.default method in data.table (related #969).
2. changing arg name from keep.rownames to keep.index?
3. handling case when the xts object already contain column named index, maybe the simple warning about duplicate colnames in resulting dt?
4. GSee comment from SO:

The as.data.table.xts function converts the index to character and the as.xts.data.table function does not allow for xts objects that are not numeric (e.g. an all character xts)

Didn't check this one yet.
5. prepare xts index by reference. I believe I could use setattr instead rownames on the DF right before as.xts, here.

Member

jangorecki commented Dec 8, 2014

I can prepare PR to data.table.
Just some things to consider:

  1. roxygen did not produce expected NAMESPACE entries in my case, currently it produces:
S3method(as.data.table,xts)
export(as.xts.data.table)

I'm not sure but this might be related to lack of as.data.table.default method in data.table (related #969).
2. changing arg name from keep.rownames to keep.index?
3. handling case when the xts object already contain column named index, maybe the simple warning about duplicate colnames in resulting dt?
4. GSee comment from SO:

The as.data.table.xts function converts the index to character and the as.xts.data.table function does not allow for xts objects that are not numeric (e.g. an all character xts)

Didn't check this one yet.
5. prepare xts index by reference. I believe I could use setattr instead rownames on the DF right before as.xts, here.

@gsee

This comment has been minimized.

Show comment
Hide comment
@gsee

gsee Dec 8, 2014

I suppose, it's really as.data.frame.xts that loses the class of the index when it converts it to rownames. Since there are some choices to be made, simple is probably better -- i.e. leave it up to the user to convert back to a timeBased class.

gsee commented Dec 8, 2014

I suppose, it's really as.data.frame.xts that loses the class of the index when it converts it to rownames. Since there are some choices to be made, simple is probably better -- i.e. leave it up to the user to convert back to a timeBased class.

@jangorecki

This comment has been minimized.

Show comment
Hide comment
@jangorecki

jangorecki Dec 13, 2014

Member

I've address 2-5.
Find recent dev version of code in:

It has now the advantage over data.frame because it retains the class of index after conversion.
When completed I will submit PR, to 1.9.7 hopefully :)
I still have problem with point 1, tried few combinations to keep xts in suggests and do not attach it on search path. After compiling in package it needs to by used as:

as.data.table(my_xts)
as.xts.data.table(my_dt)

Seems to not be related to lack of as.data.table.default but suggested package xts.
Info from: http://r-pkgs.had.co.nz/namespace.html

A method for a generic in a suggested package. Namespace directives must refer to available functions, so they can not reference suggested packages.

looks like ti will not be handled nice using xts in suggests.

Member

jangorecki commented Dec 13, 2014

I've address 2-5.
Find recent dev version of code in:

It has now the advantage over data.frame because it retains the class of index after conversion.
When completed I will submit PR, to 1.9.7 hopefully :)
I still have problem with point 1, tried few combinations to keep xts in suggests and do not attach it on search path. After compiling in package it needs to by used as:

as.data.table(my_xts)
as.xts.data.table(my_dt)

Seems to not be related to lack of as.data.table.default but suggested package xts.
Info from: http://r-pkgs.had.co.nz/namespace.html

A method for a generic in a suggested package. Namespace directives must refer to available functions, so they can not reference suggested packages.

looks like ti will not be handled nice using xts in suggests.

@jangorecki

This comment has been minimized.

Show comment
Hide comment
@jangorecki

jangorecki Jan 8, 2015

Member

I've pushed it as a new DT branch: https://github.com/jangorecki/data.table/tree/as.xts
Tests are ready (as testthat at the moment) but not yet merged into tests.Rraw.
Am I correct they should be transformed into test(num,x,y) tests?
I will PR it to 1.9.7.

Member

jangorecki commented Jan 8, 2015

I've pushed it as a new DT branch: https://github.com/jangorecki/data.table/tree/as.xts
Tests are ready (as testthat at the moment) but not yet merged into tests.Rraw.
Am I correct they should be transformed into test(num,x,y) tests?
I will PR it to 1.9.7.

@arunsrinivasan

This comment has been minimized.

Show comment
Hide comment
@arunsrinivasan

arunsrinivasan Jan 8, 2015

Member

@jangorecki looks great! Looks like you've .Rd files now. If they've to be merged to .Rraw, then yes, they have to be converted. I can help with that.

But if you'd like to keep the file as-is, then maybe @lianos can help? He has setup tests for S4-classes using testthat framework.

I'll push a as.data.table.default method. xts should be in suggests and should the examples be wrapped with \dontrun{}?

Member

arunsrinivasan commented Jan 8, 2015

@jangorecki looks great! Looks like you've .Rd files now. If they've to be merged to .Rraw, then yes, they have to be converted. I can help with that.

But if you'd like to keep the file as-is, then maybe @lianos can help? He has setup tests for S4-classes using testthat framework.

I'll push a as.data.table.default method. xts should be in suggests and should the examples be wrapped with \dontrun{}?

@lianos

This comment has been minimized.

Show comment
Hide comment
@lianos

lianos Jan 8, 2015

Contributor

Yes, can help w/ tests if you want to keep w/ testthat -- I'm not actually sure if these are being run anymore? We can migrate this to the new suggested layout for testthat tests if so ...

Contributor

lianos commented Jan 8, 2015

Yes, can help w/ tests if you want to keep w/ testthat -- I'm not actually sure if these are being run anymore? We can migrate this to the new suggested layout for testthat tests if so ...

@jangorecki

This comment has been minimized.

Show comment
Hide comment
@jangorecki

jangorecki Jan 8, 2015

Member

There is not need to use testthat if current test process is working fine. It should be straight to transfer expect_equal calls.
I will do the edits and merge into one commit before PR. Unfortunately I can't make cran check to work (dependency not available on 3.1.2).

Member

jangorecki commented Jan 8, 2015

There is not need to use testthat if current test process is working fine. It should be straight to transfer expect_equal calls.
I will do the edits and merge into one commit before PR. Unfortunately I can't make cran check to work (dependency not available on 3.1.2).

@arunsrinivasan

This comment has been minimized.

Show comment
Hide comment
@arunsrinivasan

arunsrinivasan Jan 8, 2015

Member

Great. I can help you with that.. You can go ahead and make the PR...

Member

arunsrinivasan commented Jan 8, 2015

Great. I can help you with that.. You can go ahead and make the PR...

@arunsrinivasan

This comment has been minimized.

Show comment
Hide comment
@arunsrinivasan

arunsrinivasan Jan 9, 2015

Member

@lianos missed your reply before, sorry. I'm not sure if they're being run. I assumed so. What's the new suggested layout you refer to?

Member

arunsrinivasan commented Jan 9, 2015

@lianos missed your reply before, sorry. I'm not sure if they're being run. I assumed so. What's the new suggested layout you refer to?

jangorecki added a commit to jangorecki/data.table that referenced this issue Jan 9, 2015

@jangorecki

This comment has been minimized.

Show comment
Hide comment
@jangorecki

jangorecki Jan 9, 2015

Member

I've pushed PR, dontrun and tests.Rraw. Let me know if everything ok so I can merge them as single commit (there are two commits at the moment).
It failed on test numbering, will try to fix it soon.

Member

jangorecki commented Jan 9, 2015

I've pushed PR, dontrun and tests.Rraw. Let me know if everything ok so I can merge them as single commit (there are two commits at the moment).
It failed on test numbering, will try to fix it soon.

arunsrinivasan added a commit that referenced this issue Jan 9, 2015

Merge pull request #998 from jangorecki/as.xts
Closes #882. xts <-> data.table conversions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment