Encourage Imports not Depends #3076

Open
mattdowle opened this issue Sep 26, 2018 · 20 comments
@mattdowle
Member

mattdowle commented Sep 26, 2018

CRAN + BioC: Depends Imports


I've only recently realized how bad Depends: is, thanks to Jan's importing vignette. I just made its discouragement stronger : 51590bc

We could disallow Depends. This would also be beneficial to cedta()'s awkward implementation on its last line where it needs to do the tryCatch just for packages which Depend; that line could be removed.

But before we disallow Depends, we'd need to ask 69 CRAN packages to change from Depends to Imports. (Most revdeps already Import.) The longer we leave it, the greater the potential for new packages using Depends to be added to CRAN and the harder it will be to change.

Check when changed to Imports and published on CRAN too :

List as code for easier maintenance of this list:
known_depends = c('Ac3net', 'AF', 'batchtools', 'bea.R', 'behavr', 'BuyseTest', 'cdparcoord', 'cffdrs', 'classifierplots', 'clickstream', 'corpustools', 'cvAUC', 'damr', 'dfmeta', 'drgee', 'easycsv', 'edgeRun', 'EurosarcBayes', 'gbp', 'GenomicTools', 'GenomicTools.fileHandler', 'greport', 'haploReconstruct', 'heatwaveR', 'heims', 'ie2misc', 'iemisc', 'JWileymisc', 'koRpus', 'LabourMarketAreas', 'lookupTable', 'LSPFP', 'miLineage', 'mrMLM', 'mrMLM.GUI', 'multicastR', 'musica', 'networkR', 'NNS', 'openSTARS', 'orgR', 'panelaggregation', 'partools', 'penaltyLearning', 'pkggraph', 'QuantTools', 'Rbitcoin', 'reinsureR', 'riskRegression', 'Rnets', 'robCompositions', 'RSauceLabs', 'RWildbook', 'sentometrics', 'simPop', 'simstudy', 'sitree', 'skm', 'slim', 'sparseFLMM', 'stranger', 'surveyplanning', 'tcpl', 'ttwa', 'twl', 'validateRS', 'vardpoor', 'VIM', 'word.alignment')

current_depends = devtools::revdep('data.table', 'Depends')

new_depends = setdiff(current_depends, known_depends)

fixed_depends = setdiff(known_depends, current_depends)

current_depends_w_bioc = devtools::revdep('data.table', 'Depends', bioconductor = TRUE)
bioc_depends = setdiff(current_depends_w_bioc, current_depends)
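
As a minimal sketch of how the `setdiff()` bookkeeping above behaves, here is the same logic on small hard-coded vectors, so it runs without `devtools` or network access. The package names come from the lists in this thread, but the split into "known" and "current" is illustrative only, and `still_depends` is an extra name added for this example.

```r
# Toy snapshots (illustrative only; not a real CRAN query).
known_depends   = c("batchtools", "networkR", "penaltyLearning", "tcpl")
current_depends = c("batchtools", "CoSMoS", "networkR")

# In the current snapshot but not yet on the tracked list.
new_depends = setdiff(current_depends, known_depends)

# On the tracked list but absent from the current snapshot.
fixed_depends = setdiff(known_depends, current_depends)

# Tracked and still present (name added for this sketch).
still_depends = intersect(known_depends, current_depends)

print(new_depends)
print(fixed_depends)
print(still_depends)
```

Note that a package can also drop out of `current_depends` because it was archived on CRAN, so an entry in `fixed_depends` is a prompt to check, not proof of a switch to Imports.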

UNCONTACTED AS OF 2019-08-13

  • CoSMoS
  • EBPRS
  • GenoScan
  • glmaag
  • lori
  • MGDrivE as of v1.1.0 2019-08-19
  • phenocamapi GH
  • PROSPER
  • rblt GH
  • shinyML
  • sitreeE
  • somspace
  • SVN
  • synthACS
  • TreeLS GH
  • Tushare
  • WGScan

Quick links to CRAN index of all packages:

cat(sprintf('[%s](https://cran.r-project.org/web/packages/%s/index.html)', current_depends, current_depends), sep = ' ')

Ac3net AF batchtools bea.R behavr BuyseTest cdparcoord classifierplots clickstream corpustools CoSMoS cvAUC damr dfmeta drgee easycsv EBPRS edgeRun EurosarcBayes gbp GenomicTools GenomicTools.fileHandler GenoScan glmaag greport haploReconstruct heims ie2misc iemisc JWileymisc LabourMarketAreas lookupTable lori LSPFP miLineage mrMLM mrMLM.GUI multicastR musica openSTARS orgR panelaggregation partools phenocamapi pkggraph PROSPER QuantTools Rbitcoin rblt reinsureR riskRegression Rnets robCompositions RSauceLabs RWildbook sentometrics shinyML simPop sitreeE skm slim somspace sparseFLMM SVN synthACS TreeLS ttwa Tushare twl validateRS VIM WGScan word.alignment

@mattdowle mattdowle added this to the 1.12.0 milestone Sep 26, 2018
@jangorecki
Member

jangorecki commented Sep 27, 2018

It shouldn't be disallowed because there are valid use cases for Depends, for example my package dtq, which is a logging mechanism for [.data.table. I recall another package too: dt.nuggets(?). We should definitely recommend Imports instead of Depends, but IMO it should not be disallowed, as there are valid use cases for Depends.

@mattdowle
Member Author

Fair enough. Why exactly is it that those packages need Depends and can't use Imports?

@jangorecki
Member

They are extensions of data.table; their existence without data.table is pointless. They do not just "use" data.table.

@mattdowle
Member Author

mattdowle commented Sep 28, 2018

I still don't follow. Importing is as strong as depending in that imports are required. No package importing data.table can install without it. What would fail if they changed to Imports? There is also Enhances:.
Does dtq mask [.data.table so that it needs to be attached in front of data.table on the search path?

@jangorecki
Member

jangorecki commented Sep 30, 2018

It is as strong, except that an imported namespace doesn't need to be attached. It doesn't mask, but it expects data.table to be attached. There are definitely very few use cases like this, so the general rule should be to import.
For the listed revdeps that depend on data.table, we could propose changing to Imports. The change might require updating examples and unit tests (and thus all dependent scripts too) to load data.table explicitly alongside the revdep package, which is potentially a significant breaking change, though still easy to fix.
Regarding changing Rbitcoin from Depends to Imports: I am not maintaining this package anymore. It might eventually become incompatible with later versions of data.table or exchange market APIs; I have no idea. The latest CRAN release is 4+ years old, and the dev version is 2+ years old. If someone is willing to take over maintenance, that would be best; otherwise it will probably get removed from CRAN once it starts breaking too many newly added rules.
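
The distinction between a loaded and an attached namespace can be sketched in a fresh R session. This example uses `splines` as a stand-in for data.table, on the assumption that it ships with R and is not attached by default; `attached()` is a small helper defined only for this illustration.

```r
# splines stands in for data.table here (assumption: a fresh session
# in which splines has not already been attached).
pkg = "splines"

# Helper for this sketch: is the package on the search path?
attached = function(p) paste0("package:", p) %in% search()

# Loading the namespace is what happens to a package listed in Imports:
# its functions are available to the importing package, but the user's
# search path is untouched.
invisible(loadNamespace(pkg))
loaded_only = isNamespaceLoaded(pkg) && !attached(pkg)

# Attaching is what happens to a package listed in Depends: it is put
# on the user's search path as well.
library(pkg, character.only = TRUE)
attached_after = attached(pkg)

print(loaded_only)
print(attached_after)
```

A package that expects data.table to be attached, as described above, relies on the second state; a package that only Imports gets the first.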

@mattdowle mattdowle modified the milestones: 1.12.0, 1.12.2 Jan 11, 2019
@mattdowle
Member Author

75 maintainers emailed today (up from the 69 in Sep 2018) :

deps = tools::package_dependencies("data.table", which="Depends", reverse=TRUE)[[1]]
paste(sapply(deps, maintainer),collapse=";")

Dear maintainer,

Hello! Thanks for using data.table. Your package is one of 75 CRAN packages that Depend on data.table. Another 469 Import data.table. Importing is much preferred.

The data.table importing vignette now contains this text :

" Besides the Imports: field, you can also use Depends: data.table but we strongly discourage this approach (and may disallow it in future) because this loads data.table into your user’s workspace; i.e. it enables data.table functionality in your user’s scripts without them requesting that. Imports: is the proper way to use data.table within your package without inflicting data.table on your user. In fact, we hope the Depends: field is eventually deprecated in R since this is true for all packages. "

https://cran.r-project.org/web/packages/data.table/vignettes/datatable-importing.html

Another motivation is that it would be beneficial to cedta()'s awkward implementation on its last line where it needs to do the tryCatch just for packages which Depend. That line could be removed.

Eventually we'll add a message, then a warning, and after a few years possibly only allow Importing and not Depending on data.table. The next time you update your package on CRAN, are you ok to change from Depends to Imports please?

We're tracking this issue here : #3076

It's very early days on this issue and this is just a first email to let you know our thoughts and the plan. Feedback and comments are very welcome, ideally directly in the GitHub issue.

Best, Matt

@mattdowle mattdowle modified the milestones: 1.12.2, 1.12.4 Feb 27, 2019
@mattdowle mattdowle changed the title from "Disallow Depends:; require Imports: only." to "Encourage Imports not Depends" Feb 27, 2019
@mbannert

mbannert commented Feb 27, 2019

First of all thanks a ton for reaching out to the package maintainers.

I fully agree. I always use data.table for my backends and have imported it in my more recent packages. In my case, panelaggregation was an attempt to put a package on CRAN back when people wore pyjamas and lived life slow. The last time a data.table change caused issues, I tried to get the package off CRAN but was discouraged from doing so. I'd rather put my time into my newer packages than fix a package which is more or less abandoned. On the other hand, it would probably not be that much of a deal to fix it, though I haven't looked at the package in years.

Bottom line: from my point of view, disallowing would be a bit too much, since it would forcibly get my old package off CRAN.

@tdhock
Member

tdhock commented Feb 27, 2019

hi I just updated my penaltyLearning package, tdhock/penaltyLearning@ef8e7fb. now it imports data.table instead of depends. basically I just had to add a bunch of library(data.table) lines in tests and examples.
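
For maintainers making the same switch, the DESCRIPTION side of the change might be sketched as below. `move_to_imports()` is a hypothetical helper written for this illustration (it is not part of data.table or any other package), and the toy DESCRIPTION contents are made up. It only uses base R's `read.dcf()` and `write.dcf()`.

```r
# Hypothetical helper: move `pkg` from Depends to Imports in a DESCRIPTION file.
move_to_imports = function(path, pkg = "data.table") {
  desc = read.dcf(path)[1L, ]   # named character vector of fields
  split_field = function(field) {
    if (!field %in% names(desc)) return(character())
    trimws(strsplit(desc[[field]], ",", fixed = TRUE)[[1L]])
  }
  depends = split_field("Depends")
  imports = split_field("Imports")
  # Compare on the bare name, e.g. "data.table (>= 1.9.6)" -> "data.table".
  hit = sub("[[:space:]]*\\(.*$", "", depends) == pkg
  if (!any(hit)) return(invisible(FALSE))   # nothing to change
  imports = c(imports, depends[hit])
  depends = depends[!hit]
  desc[["Imports"]] = paste(imports, collapse = ", ")
  if (length(depends)) {
    desc[["Depends"]] = paste(depends, collapse = ", ")
  } else {
    desc = desc[names(desc) != "Depends"]   # drop an empty Depends field
  }
  write.dcf(t(desc), file = path)
  invisible(TRUE)
}

# Usage on a throwaway DESCRIPTION file (toy contents).
path = tempfile()
writeLines(c("Package: toy",
             "Version: 0.1.0",
             "Depends: R (>= 3.1.0), data.table (>= 1.9.6)",
             "Imports: methods"), path)
move_to_imports(path)
result = read.dcf(path)
print(result[1L, c("Depends", "Imports")])
```

This covers only the DESCRIPTION file. The package's NAMESPACE still needs to import data.table, and, as noted above, tests and examples that relied on data.table being attached need their own `library(data.table)` lines.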

@bozenne

bozenne commented Feb 28, 2019

Hi thanks for the email and for data.table
I have also changed to imports in the Github version of the BuyseTest package.
Will be updated on CRAN at some point.

Best
brice

@MichaelChirico
Member

Adding these from Bioconductor:

- [ ] amplican
- [ ] Chicago
- [ ] chimeraviz
- [ ] GenoGAM
- [ ] GGtools
- [ ] GOTHiC
- [ ] metavizr
- [ ] OUTRIDER
- [ ] PGA
- [ ] QUALIFIER
- [ ] QuartPAC
- [ ] rBiopaxParser
- [ ] RCAS
- [ ] rfPred
- [ ] rTANDEM
- [ ] SNPhood
- [ ] TIN

Also added some code for repeating this & some more new packages from CRAN under <details> in OP

brown-jason added a commit to USEPA/CompTox-ToxCast-tcpl that referenced this issue Jun 5, 2019
And updated files to pass devtools::check()
@brown-jason

tcpl package has moved data.table to imports

@mattdowle
Member Author

mattdowle commented Sep 19, 2019

Reminder email sent today to these 74 :

 [1] "Ac3net"                   "AF"                       "batchtools"              
 [4] "bea.R"                    "behavr"                   "BuyseTest"               
 [7] "cdparcoord"               "classifierplots"          "clickstream"             
[10] "corpustools"              "CoSMoS"                   "cvAUC"                   
[13] "damr"                     "dfmeta"                   "drgee"                   
[16] "easycsv"                  "EBPRS"                    "edgeRun"                 
[19] "EurosarcBayes"            "gbp"                      "GenomicTools"            
[22] "GenomicTools.fileHandler" "GenoScan"                 "glmaag"                  
[25] "greport"                  "haploReconstruct"         "heims"                   
[28] "ie2misc"                  "iemisc"                   "JWileymisc"              
[31] "LabourMarketAreas"        "lookupTable"              "lori"                    
[34] "LSPFP"                    "miLineage"                "mrMLM"                   
[37] "mrMLM.GUI"                "multicastR"               "musica"                  
[40] "networkR"                 "openSTARS"                "orgR"                    
[43] "panelaggregation"         "partools"                 "phenocamapi"             
[46] "pkggraph"                 "PROSPER"                  "QuantTools"              
[49] "Rbitcoin"                 "rblt"                     "reinsureR"               
[52] "riskRegression"           "Rnets"                    "robCompositions"         
[55] "RSauceLabs"               "RWildbook"                "sentometrics"            
[58] "shinyML"                  "simPop"                   "sitreeE"                 
[61] "skm"                      "slim"                     "somspace"                
[64] "sparseFLMM"               "SVN"                      "synthACS"                
[67] "TreeLS"                   "ttwa"                     "Tushare"                 
[70] "twl"                      "validateRS"               "VIM"                     
[73] "WGScan"                   "word.alignment"  

Dear maintainer,

Hello! Thanks for using data.table. Your package is one of 74 CRAN packages that Depend on data.table. Another 667 Import data.table. Importing is much preferred.

The data.table importing vignette now contains this text :

" Besides the Imports: field, you can also use Depends: data.table but we strongly discourage this approach (and may disallow it in future) because this loads data.table into your user’s workspace; i.e. it enables data.table functionality in your user’s scripts without them requesting that. Imports: is the proper way to use data.table within your package without inflicting data.table on your user. In fact, we hope the Depends: field is eventually deprecated in R since this is true for all packages. "

https://cran.r-project.org/web/packages/data.table/vignettes/datatable-importing.html

Another motivation is that it would be beneficial to cedta()'s awkward implementation on its last line where it needs to do the tryCatch just for packages which Depend. That line could be removed.

Eventually we'll add a message, then a warning, and after a few years possibly only allow Importing and not Depending on data.table. The next time you update your package on CRAN, are you ok to change from Depends to Imports please?

We're tracking this issue here : #3076

It has been 6 months since I first emailed. Each email and package status is logged in the tracking issue. Feedback and comments are very welcome, ideally directly in the GitHub issue.

Best, Matt

@mattdowle mattdowle modified the milestones: 1.12.4, 1.13.0 Sep 19, 2019
@ekstroem

Updated networkR package with Depends -> Imports on its way to CRAN

@jangorecki
Member

jangorecki commented Sep 21, 2019

Added badges of Depends/Imports count to first post. They are being refreshed daily.

@gyang274

Hi, @mattdowle

Thank you for the notice, and the wonderful data.table package.

I experienced inconsistent behavior when communicating with Rcpp when I used Imports instead of Depends. This could have been caused by other issues. But at that time, Depends was a reasonable choice, as the package relied on and was built around data.table features.

Let me test on imports and get back to you.

Thank you.

@sborms

sborms commented Sep 26, 2019

Thanks for the reminder! Should be done for the sentometrics package by next CRAN release.

Sam

@bozenne

bozenne commented Nov 18, 2019

Hello,
I have (finally) updated the version of the BuyseTest package on CRAN (1.8)
where data.table has been moved to the imports section, as you suggested.

Regards
brice

@JWiley

JWiley commented Nov 22, 2019

Hello,

This is fixed in JWileymisc now propagating through CRAN. I took the opportunity to change all depends into imports.

Cheers,

Josh

@mattdowle mattdowle modified the milestones: 1.12.7, 1.12.9 Dec 8, 2019
@alexWhitworth

synthACS now uses Imports instead of Depends. Push sent to CRAN.

carlosparadis added a commit to sailuh/kaiaulu that referenced this issue May 27, 2020
Removing the packages from Depends in DESCRIPTION
required adding @importFrom and @export across all
functions, hence this commit slightly modifies the
entire codebase.

Vignettes were edited to use require() instead of attaching
using library(), following best practices.

Rather than copy and paste data.table across all functions,
I followed the XGBoost package's approach and loaded it in only
a single file. Since the whole codebase won't work without parsers,
that seems the most logical place. Other more situational
dependencies on functions will use the typical :: approach,
which is hard for data.table.

Arguably data.table could belong on Depends, but even the
package maintainer argues against it, see:

Rdatatable/data.table#3076
@mattdowle mattdowle modified the milestones: 1.13.1, 1.13.3 Oct 17, 2020
@jangorecki jangorecki modified the milestones: 1.14.3, 1.14.5 Jul 19, 2022
@mattdowle
Member Author

mattdowle commented Oct 31, 2022

Currently 79 packages that Depend rather than Import and they continue to be created or updated.

download.file("http://cloud.r-project.org/web/packages/packages.rds", tmp<-tempfile())  # has more fields than PACKAGES.rds
x = readRDS(tmp)
rownames(x) = x[,"Package"]
deps = tools::package_dependencies("data.table", which="Depends", reverse=TRUE)[[1]]
length(deps)   # 90
y = as.matrix(table(substring(x[deps,"Published"],1,7)))
y = y[sort(rownames(y), decreasing=TRUE),,drop=FALSE]
y
# Number of packages Depending (not Importing) data.table by publish month
2024-03    5
2024-02    4
2024-01    3
2023-12    1
2023-11    3
2023-10    4
2023-09    2
2023-08    6
2023-06    1
2023-04    2
...
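
To see what the tabulation step above does without downloading `packages.rds`, here is a minimal sketch on made-up data. The package names and dates in `published` are hypothetical; only the `table(substring(...))` and sorting logic mirrors the code above.

```r
# Hypothetical "Published" dates keyed by (made-up) package name.
published = c(pkgA = "2024-03-02", pkgB = "2024-03-15",
              pkgC = "2023-08-30", pkgD = "2024-01-07")

# Keep only "YYYY-MM", count packages per month, then order the
# months from newest to oldest.
y = as.matrix(table(substring(published, 1, 7)))
y = y[sort(rownames(y), decreasing = TRUE), , drop = FALSE]
print(y)
```

Sorting the row names as strings works because the `YYYY-MM` format orders lexicographically in the same way as chronologically.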

I never did get to the bottom of Jan's objections above to deprecating (very gently and slowly over several years) Depend'ing; i.e. what the valid use cases of Depend'ing are: why Import'ing won't work or is not as good as Depend'ing.
