This repository has been archived by the owner on Jan 12, 2019. It is now read-only.

The right way to auto-load packages at R startup #13

Open
JackieMium opened this issue Mar 9, 2018 · 0 comments

2017-12-21

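Today, while reading Hadley Wickham's R for Data Science, I stumbled into a pitfall, so here is a quick write-up.

It happened in the exercises for section 4.4 of chapter 4, Workflow: basics. The chapter is dead simple, a five-minute read, and the exercises are simple too, mostly spotting typos and the like. Then came the second exercise: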

Tweak each of the following R commands so that they run correctly:

library(tidyverse)

ggplot(dota = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy))

fliter(mpg, cyl = 8)
filter(diamond, carat > 3

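In the ggplot call, data is misspelled as dota. OK, so Hadley is a gamer too.

After fixing that, it runs fine.

filter is misspelled as fliter, and = should be ==. Hmph, so easy.

And then, and then, it still errored: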

R >>> filter(mpg, cyl == 8)
Error in stats::filter(mpg, cyl == 8) : object 'cyl' not found
In addition: Warning messages:
1: In data.matrix(data) : NAs introduced by coercion
2: In data.matrix(data) : NAs introduced by coercion
3: In data.matrix(data) : NAs introduced by coercion
4: In data.matrix(data) : NAs introduced by coercion
5: In data.matrix(data) : NAs introduced by coercion
6: In data.matrix(data) : NAs introduced by coercion

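I instantly lost it.

I leaned in close to check whether some l was actually a 1 or something like that, but everything looked fine.

As a last resort, I figured the code had been copy-pasted and might contain invisible characters, so I decided to type it out by hand. And then, and then, and then: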

R >>> filter(mpg, cyl == 8)
Error in stats::filter(mpg, cyl == 8) : object 'cyl' not found
In addition: Warning messages:
1: In data.matrix(data) : NAs introduced by coercion
2: In data.matrix(data) : NAs introduced by coercion
3: In data.matrix(data) : NAs introduced by coercion
4: In data.matrix(data) : NAs introduced by coercion
5: In data.matrix(data) : NAs introduced by coercion
6: In data.matrix(data) : NAs introduced by coercion

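For a moment I thought I had gone blind.

Fine, maybe the environment was messed up. Let me start a fresh R session. And then, and then, and then, and then: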

R >>> filter(mpg, cyl == 8)
Error in stats::filter(mpg, cyl == 8) : object 'cyl' not found
In addition: Warning messages:
1: In data.matrix(data) : NAs introduced by coercion
2: In data.matrix(data) : NAs introduced by coercion
3: In data.matrix(data) : NAs introduced by coercion
4: In data.matrix(data) : NAs introduced by coercion
5: In data.matrix(data) : NAs introduced by coercion
6: In data.matrix(data) : NAs introduced by coercion

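... dead ...

Nothing left but Google, and what do you know, some other poor soul had hit the same problem: Unable to run examples, asked directly as an issue in Hadley's GitHub repo. Hadley, being Hadley, nailed the cause in one sentence: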

Are you loading dplyr in your .Rprofile?

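Exactly. Out of laziness I had been loading several frequently used packages in ~/.Rprofile. I wrote about this in another post: R startup settings. Back then I only loaded the colorout package. Later I added a few more, tidyverse among them, and dplyr, as a proud member of the tidyverse family bundle, naturally got loaded along with it.

Hadley explained the cause further down, and below that someone proposed a solution outright: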

That's a bad idea for exactly this reason. It gets loaded before stats, so stats::filter() overrides dplyr::filter()

A better way to handle this is to set the defaultPackages option, and ensure the packages are set in the order you wish to load them. E.g. in your .Rprofile you could have:

.First <- function() {
   autoloads <- c("dplyr", "ggplot2", "Hmisc")
   options(defaultPackages = c(getOption("defaultPackages"), autoloads))
}

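In other words, dplyr was loaded too early, before stats, so stats::filter ended up masking dplyr::filter. That means the error above was being raised by stats::filter (with a bit more care I would have spotted that in the message long ago). To verify: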

R >>> stats::filter(mpg, cyl == 8)
Error in stats::filter(mpg, cyl == 8) : object 'cyl' not found
In addition: Warning messages:
1: In data.matrix(data) : NAs introduced by coercion
2: In data.matrix(data) : NAs introduced by coercion
3: In data.matrix(data) : NAs introduced by coercion
4: In data.matrix(data) : NAs introduced by coercion
5: In data.matrix(data) : NAs introduced by coercion
6: In data.matrix(data) : NAs introduced by coercion

R >>> dplyr::filter(mpg, cyl == 8)
# A tibble: 70 x 11
   manufacturer              model displ  year   cyl      trans   drv   cty   hwy    fl   class
          <chr>              <chr> <dbl> <int> <int>      <chr> <chr> <int> <int> <chr>   <chr>
 1         audi         a6 quattro   4.2  2008     8   auto(s6)     4    16    23     p midsize
 2    chevrolet c1500 suburban 2wd   5.3  2008     8   auto(l4)     r    14    20     r     suv
 3    chevrolet c1500 suburban 2wd   5.3  2008     8   auto(l4)     r    11    15     e     suv
 4    chevrolet c1500 suburban 2wd   5.3  2008     8   auto(l4)     r    14    20     r     suv
 5    chevrolet c1500 suburban 2wd   5.7  1999     8   auto(l4)     r    13    17     r     suv
 6    chevrolet c1500 suburban 2wd   6.0  2008     8   auto(l4)     r    12    17     r     suv
 7    chevrolet           corvette   5.7  1999     8 manual(m6)     r    16    26     p 2seater
 8    chevrolet           corvette   5.7  1999     8   auto(l4)     r    15    23     p 2seater
 9    chevrolet           corvette   6.2  2008     8 manual(m6)     r    16    26     p 2seater
10    chevrolet           corvette   6.2  2008     8   auto(s6)     r    15    25     p 2seater
# ... with 60 more rows

Gotcha. Just as expected.
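The error message is less mysterious once you see what stats::filter() is for. It applies a linear filter to a numeric series and evaluates its second argument as an ordinary value, so `cyl == 8` is looked up as a plain variable in the calling environment, where no `cyl` exists. A minimal base-R sketch (the two-row data frame is made up for illustration):

```r
# stats::filter() is a linear filter for numeric series,
# here a centred 3-point moving average.
smoothed <- stats::filter(1:5, rep(1/3, 3))
as.numeric(smoothed)   # roughly NA 2 3 4 NA

# The second argument is evaluated as a plain value, not inside the data,
# so a bare column name cannot be found.
msg <- tryCatch(
  stats::filter(data.frame(cyl = c(4, 8)), cyl == 8),
  error = function(e) conditionMessage(e)
)
msg
```

dplyr::filter(), by contrast, evaluates `cyl == 8` inside the data frame, which is why the same call works there.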

With the new way of loading, defaultPackages already contains stats, so stats is guaranteed to be loaded first and dplyr after it, and dplyr's filter is no longer masked.
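The masking mechanism can be reproduced without dplyr at all. The sketch below uses a throwaway list as a stand-in for a package that exports its own filter(). The names fake_pkg, demo_late and demo_early are made up for illustration, and it assumes stats is attached, as it is in a default R session. Whatever sits earlier on the search path wins, and a package attached later lands earlier on that path.

```r
# Stand-in for a package exporting filter(); not a real package.
fake_pkg <- list(filter = function(...) "fake filter")

# Case 1: attached after stats (default position 2), so it sits ahead of
# stats on the search path and a bare filter() resolves to it.
attach(fake_pkg, name = "demo_late")
find("filter")                      # "demo_late" is listed before "package:stats"
who_wins_late <- filter()
detach("demo_late", character.only = TRUE)

# Case 2: placed behind stats on the search path, which is where a package
# ends up if it is attached before stats. Now stats::filter wins.
stats_pos <- match("package:stats", search())
attach(fake_pkg, name = "demo_early", pos = stats_pos + 1L)
who_wins_early <- environmentName(environment(filter))
detach("demo_early", character.only = TRUE)

who_wins_late    # "fake filter"
who_wins_early   # "stats"
```

Case 2 corresponds to calling library() in .Rprofile, and case 1 to appending the package to defaultPackages after stats.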

With that, the problem is fully solved.

Life is full of pitfalls.

By the way, my .Rprofile has since been upgraded:

# customized options
options(prompt="\033[0;36mR >>> \033[0m", continue="... ")
options(editor="vim", menu.graphics=FALSE)
options(stringsAsFactors = FALSE, show.signif.stars = TRUE, digits = 4)

# launch Bioconductor and set Bioconductor mirror at startup
#source("http://bioconductor.org/biocLite.R")
#options(BioC_mirror="http://mirrors.ustc.edu.cn/bioc/")   # mighty USTC
#options(BioC_mirror="https://mirrors.tuna.tsinghua.edu.cn/bioconductor")   # TUNA then
#source("https://bioc.ism.ac.jp/biocLite.R")   # an alternative Japanese mirror

# set CRAN mirror. Better add at least two in case that one of them stops working
options(repos=c("http://mirrors.ustc.edu.cn/CRAN/", "https://mirrors.tongji.edu.cn/CRAN/",
                "http://mirrors.tuna.tsinghua.edu.cn/CRAN/", "https://mirrors.aliyun.com/CRAN/"))

# launching Bioconductor may take too long; disable auto start and define a DIY func
# to start it when needed
source.bio <- function(){
	source("http://bioconductor.org/biocLite.R")
	options(BioC_mirror="http://mirrors.ustc.edu.cn/bioc/")
}


# useful little customized functions
cd <- setwd
pwd <- getwd
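# show at most the first 5 rows and 5 columns of d
hh <- function(d) {
  row_num <- min(5, nrow(d))
  col_num <- min(5, ncol(d))
  # seq_len() stays correct when d has zero rows or columns (1:0 would not)
  return(d[seq_len(row_num), seq_len(col_num)])
}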

# load favorite packages automatically at startup
options(defaultPackages=c(getOption("defaultPackages"), 'beepr',
       "colorout"))

# display greeting message at startup
.First <- function(){
	message("Welcome back, ", Sys.getenv("USER"),"!\n","Current working directory: ", getwd(),
                "\nDate and time: ", format(Sys.time(), "%Y-%m-%d %H:%M"), "\r\n")
	# display a message when all above loaded successfully
	message("###### SUCCESSFULLY LOADED. LET'S DO THIS! ######")
}

# goodbye at closing
.Last <- function() {
	cat("\nGoodbye at ", date(), "\n")
}
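One possible refinement, sketched under the assumption that the same .Rprofile is shared between machines where not every package is installed: filter the auto-load list before appending it to defaultPackages. The helper name installed_only is hypothetical and is not part of the profile above.

```r
# Hypothetical helper: keep only the packages that are actually installed.
# system.file(package = p) returns "" when package p cannot be found.
installed_only <- function(pkgs) {
  is_installed <- vapply(
    pkgs,
    function(p) nzchar(system.file(package = p)),
    logical(1),
    USE.NAMES = FALSE
  )
  pkgs[is_installed]
}

# Same auto-load list as in the profile above, minus anything missing.
autoloads <- installed_only(c("beepr", "colorout"))
options(defaultPackages = c(getOption("defaultPackages"), autoloads))
```

Because the extra packages are still appended after the existing defaultPackages, they are attached after stats, which preserves the load order that fixed the filter problem.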

I've decided to keep my config files backed up on GitHub as well.
