Allow explicitly setting .datatable.aware <- FALSE to use base R data.frame syntax #5654

Closed
dvg-p4 opened this issue Jun 12, 2023 · 2 comments

Comments

@dvg-p4
Contributor

dvg-p4 commented Jun 12, 2023

Feature request: Allow setting .datatable.aware <- FALSE to explicitly use base R data.frame syntax in a datatable-aware package

Motivation

I'm writing a package that uses data.table features in some places but not in others. We import data.table in NAMESPACE to have access to := etc. in the parts of our code that use the full features of data.table. We would also like to write user-facing functions that accept either a base data.frame or a data.table as an argument and apply some simple selecting and filtering to it.

Problem

data.table indexing semantics are by design different from data.frame semantics. So in many simple cases, it's difficult or impossible to write a call that will index both a data.frame and a data.table the same way.
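
To make the difference concrete, here is a minimal sketch run from a script or the console, where data.table semantics are active. The tables `df` and `dt` and the result names are made up for illustration:

```r
library(data.table)

df <- data.frame(a = 1:3, b = c("x", "y", "z"))
dt <- as.data.table(df)

# A single index selects a COLUMN of a data.frame but a ROW of a data.table.
df_second <- df[2]   # data.frame with column b, 3 rows
dt_second <- dt[2]   # data.table with row 2, both columns

# Selecting one column by position drops to a vector only for the data.frame.
df_first <- df[, 1]  # plain integer vector
dt_first <- dt[, 1]  # one-column data.table
```

The same call therefore returns differently shaped results depending on the class of the input, which is what makes class-agnostic helper functions awkward to write.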

As this vignette explains, you can set .datatable.aware = TRUE in a package to use data.table indexing semantics without importing data.table in NAMESPACE. However, as far as I can tell, there's no way to do the reverse: disable data.table semantics and go back to base R data.frame indexing for a certain function call.

Partial workarounds

A couple imperfect strategies I've considered to get around this:

  • as.data.table() / as.data.frame(): Creates an unnecessary copy of the data, which is slow on big tables and antithetical to the core point of data.table
  • setDT() / setDF(): Does not behave consistently or stably when called on a function argument; see "setDT and setkeyv inside function" (#5618)
  • subset(): Lets me do programmatic column selection and row filtering, but it is verbose and not actually consistent in some edge cases

Suggestion:

Allow setting .datatable.aware <- FALSE to tell data.table that it should use base R data.frame indexing semantics in an otherwise data.table-aware package. This does not work currently, since data.table:::cedta() does a big OR of this variable with the other conditions. But it seems like adding an if (isFALSE(ns$.datatable.aware)) return(FALSE) check before the big OR might do the trick.
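
A rough sketch of the idea, using only base R. Note that `cedta_sketch()` and its `imports_data_table` argument are hypothetical stand-ins invented for illustration; the real data.table:::cedta() checks more conditions than the single flag shown here:

```r
# Hypothetical, simplified stand-in for data.table:::cedta(); NOT the real code.
cedta_sketch <- function(ns, imports_data_table) {
  # Proposed early exit: an explicit FALSE opts the namespace out entirely.
  if (isFALSE(ns$.datatable.aware)) return(FALSE)
  # Stand-in for the existing "big OR" of awareness conditions.
  isTRUE(ns$.datatable.aware) || imports_data_table
}

# Environments playing the role of package namespaces.
ns_optout <- new.env()
ns_optout$.datatable.aware <- FALSE  # explicit opt-out
ns_default <- new.env()              # variable not set at all

optout_result  <- cedta_sketch(ns_optout, imports_data_table = TRUE)
default_result <- cedta_sketch(ns_default, imports_data_table = TRUE)
```

Under these assumptions the opt-out namespace gets FALSE even though it imports data.table, while a namespace that never sets the variable behaves as before. isFALSE() is used rather than !ns$.datatable.aware so that an unset (NULL) variable does not trigger the early exit.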

MichaelChirico pushed a commit to dvg-p4/data.table that referenced this issue Feb 19, 2024
MichaelChirico added a commit that referenced this issue Feb 19, 2024
* Check for .datatable.aware being FALSE #5654

* Add tests

* Fix tests

* Simplify logic as suggested

* Band-aid on underlying selfrefok() problem for test

* Update news and add comment

* Fix transform slowness (#5493)

* Fix 5492 by limiting the costly deparse to `nlines=1`

* Implementing PR feedbacks

* Added  inside

* Fix typo in name

* Idiomatic use of  inside

* Separating the deparse line limit to a different PR

---------

Co-authored-by: Michael Chirico <chiricom@google.com>

* Improvements to the introductory vignette (#5836)

* Added my improvements to the intro vignette

* Removed two lines I added extra as a mistake earlier

* Requested changes

* switch to 3.2.0 R dep (#5905)

* frollmax1: frollmax, frollmax adaptive, left adaptive support (#5889)

* frollmax exact, buggy fast, no fast adaptive

* frollmax fast fixing bugs

* frollmax man to fix CRAN check

* frollmax fast adaptive non NA, dev

* froll docs, adaptive left

* no frollmax fast adaptive

* frollmax adaptive exact NAs handling

* PR summary in news

* copy-edit changes from reviews

Co-authored-by: Benjamin Schwendinger <52290390+ben-schwen@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Michael Chirico <michaelchirico4@gmail.com>
Co-authored-by: Benjamin Schwendinger <52290390+ben-schwen@users.noreply.github.com>

* comment requested by Michael

* update NEWS file

* Apply suggestions from code review

Co-authored-by: Michael Chirico <michaelchirico4@gmail.com>

* Apply suggestions from code review

Co-authored-by: Michael Chirico <michaelchirico4@gmail.com>

* add comment requested by Michael

* add comment about int iterator for loop over k-1 obs

* extra comments

* Revert "extra comments"

This reverts commit 03af0e3.

* add comments to frollmax and frollsum

* typo fix

---------

Co-authored-by: Michael Chirico <michaelchirico4@gmail.com>
Co-authored-by: Benjamin Schwendinger <52290390+ben-schwen@users.noreply.github.com>

* Run GHA jobs on 1-15-99 dev branch (#5909)

* Make declarations static for covr (#5910)

* botched rebase

* stray \

* smaller diff

* test #s

---------

Co-authored-by: Ofek <ofekshilon@gmail.com>
Co-authored-by: Michael Chirico <chiricom@google.com>
Co-authored-by: Ani <bloodraven166@gmail.com>
Co-authored-by: Jan Gorecki <J.Gorecki@wit.edu.pl>
Co-authored-by: Michael Chirico <michaelchirico4@gmail.com>
Co-authored-by: Benjamin Schwendinger <52290390+ben-schwen@users.noreply.github.com>
@dvg-p4
Contributor Author

dvg-p4 commented Feb 19, 2024

Implemented and merged into master (v1.15.99) by #5655

@dvg-p4
Contributor Author

dvg-p4 commented Apr 22, 2024

Another workaround that I found in the meantime:

class(df) <- "data.frame"

Used within a function, this appears to do a shallow copy without affecting the object outside the function, and thus can accomplish a similar effect.
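
A minimal sketch of how this workaround might look in practice. The function name `first_col` and the table `dt` are made up for illustration:

```r
library(data.table)

# Hypothetical helper: always index with base data.frame semantics.
first_col <- function(df) {
  class(df) <- "data.frame"  # modifies only the function's local binding
  df[, 1]                    # dispatches to `[.data.frame`, so drops to a vector
}

dt <- data.table(a = 1:3, b = 4:6)
res <- first_col(dt)
still_dt <- is.data.table(dt)  # the caller's object keeps its class
```

Here `res` is a plain vector, as it would be for a data.frame input, and `dt` is still a data.table afterwards. This relies on R's usual copy-on-modify behaviour for `class<-`, so it is worth re-checking if the object is also modified by reference elsewhere in the function.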
