Update the code and data for the 统计词话 (Song ci word statistics) example #31

Merged
merged 8 commits into from
Aug 21, 2019
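XiangyunHuang (Contributor)

Re-encoded the data as UTF-8 and updated the plotting code for the latest version of the igraph package.

Thanks to 董安澜 for upgrading the code.

* Re-encode the data as UTF-8 and update the plotting code for the latest version of the igraph package
* Run every demo once, add the R packages the demos need to the Suggests field with version numbers, and bump the MSG package to version 1.0 to mark the first edition of 现代统计图形 (Modern Statistical Graphics)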

yihui (Owner) left a comment

In the future, please submit code cleanup and the real work (here, fixing the encoding and the igraph problems) as separate PRs, one thing per PR. That way they can be merged faster.

}
return(invisible(list(root = xy[i, ], color = cl)))
gradient <- function(x, y, z, main = NULL, ..., FUN = function(x,
y) x^2 + 2 * y^2, rg = c(-3, -3, 3, 3), init = c(-3, 3),

yihui (Owner)

Was this code cleanup done automatically with formatR::tidy_source()? Places like this, where the line break is clearly unreasonable, are best adjusted by hand.

You replaced all the equals signs with arrows, and I am hesitant about that change. My personal style is to assign with the equals sign. In the book, an option can replace equals signs with arrows automatically: https://github.com/dr-harper/rmarkdown-cookbook/blob/d2967bb489327d79ba23e7a204435ae09da93137/index.Rmd#L27 But in my own packages I prefer the equals sign.
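
For readers who have not used the option mentioned here, the following is a minimal sketch of formatR rewriting `=` assignment as `<-` at formatting time. It assumes the formatR package is installed; the code string `src` and the helper `tidy_with_arrow()` are made up for illustration, and this is not necessarily how the linked book sets the option.

```r
# Sketch: let formatR convert `=` assignment to `<-` while tidying.
# Assumes the formatR package is installed; `src` is a made-up example.
src <- "x = c(1, 2, 3)"

# Hypothetical helper, not part of formatR or MSG.
tidy_with_arrow <- function(code) {
  if (!requireNamespace("formatR", quietly = TRUE)) {
    return(NA_character_)  # formatR is not available, nothing to show
  }
  # arrow = TRUE asks tidy_source() to replace `=` assignment with `<-`;
  # output = FALSE returns the result instead of printing it.
  res <- formatR::tidy_source(text = code, arrow = TRUE, output = FALSE)
  res$text.tidy
}

tidied <- tidy_with_arrow(src)
print(tidied)
```

In a knitr document the same argument can be passed through chunk options, for example `tidy = TRUE, tidy.opts = list(arrow = TRUE)`, so the package source can keep `=` while the rendered book shows `<-`.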


XiangyunHuang (Contributor, Author)

It was manual line breaking plus styler formatting. It should be reverted now; everything uses the equals sign again.

I really did mean to split this into several PRs, but my hand slipped and I pushed everything at once.

DESCRIPTION Outdated
Version: 0.3
Date: 2016-02-13
Version: 1.0
Date: 2019-08-21

yihui (Owner)

Bump the version to 0.3.1 and simply drop the date.

DESCRIPTION Outdated
plotrix (>= 3.7.6),
ggplot2 (>= 3.2.1),
grid (>= 3.6.0),
sna (>= 2.4)
URL: http://yihui.name/cn/publication

yihui (Owner)

Just change the URL to https://github.com/yihui/MSG

yihui (Owner) left a comment

Thanks a lot!

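yihui merged commit 4ea93b4 into yihui:master Aug 21, 2019

pzhaonet (Contributor) commented Jul 14, 2020

@yihui @XiangyunHuang

The example data for the 统计词话 case, SongWords.rda, still loads as garbled text on my machine. As a control, HighFreq100.rda loads correctly.

The same passage in the web version of 现代统计图形 is garbled as well.

The output and system information are as follows: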

> Sys.setlocale("LC_ALL", "Chinese")
[1] "LC_COLLATE=Chinese (Simplified)_China.936;LC_CTYPE=Chinese (Simplified)_China.936;LC_MONETARY=Chinese (Simplified)_China.936;LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_China.936"
> load(system.file("extdata", "SongWords.rda", package = "MSG"))
> SongWords[1:6, 1:4]
       涓樺鏈<ba> 鍛ㄩ偊褰<a6> 濮滃 鏅忓嚑閬<93>
浜洪棿          11            3     7            0
绁炰粰          11            0     0            0
娲炲ぉ           9            1     0            0
鐗╁            10            0     0            0
鐧惧勾           8            0     2            0
閫嶉仴           9            0     0            1
> load(system.file("extdata", "HighFreq100.rda", package = "MSG"))
> HighFreq100[1:6, 1:4]
            自然        逍遥        人间        何处
自然 0.000000000 0.077464789 0.010000000 0.003236246
逍遥 0.077464789 0.000000000 0.010489510 0.006802721
人间 0.010000000 0.010489510 0.000000000 0.042704626
何处 0.003236246 0.006802721 0.042704626 0.000000000
物外 0.047058824 0.109649123 0.008064516 0.003906250
明月 0.011494253 0.037344398 0.012295082 0.028340081
> sessionInfo()
R version 4.0.1 (2020-06-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936  LC_CTYPE=Chinese (Simplified)_China.936   
[3] LC_MONETARY=Chinese (Simplified)_China.936 LC_NUMERIC=C                              
[5] LC_TIME=Chinese (Simplified)_China.936    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] MSG_0.4.1

loaded via a namespace (and not attached):
[1] compiler_4.0.1     tools_4.0.1        RColorBrewer_1.1-2
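
The garbled names in the SongWords output look like UTF-8 bytes being decoded as code page 936 (GBK). That reading is an assumption drawn from the output, not a confirmed diagnosis of SongWords.rda. The sketch below reproduces the effect in base R on the string 人间 and then reverses it; it assumes the platform's iconv recognizes the encoding name "GBK".

```r
# "人间" written with Unicode escapes so the script itself is ASCII-only.
x <- "\u4eba\u95f4"

# Misread the UTF-8 bytes as GBK: iconv() converts the raw bytes of x
# as if they had been written in the `from` encoding.
garbled <- iconv(x, from = "GBK", to = "UTF-8")
print(garbled)

# Undo it: turn the garbled text back into the original bytes,
# then declare those bytes to be UTF-8.
fixed <- iconv(garbled, from = "UTF-8", to = "GBK")
Encoding(fixed) <- "UTF-8"
print(identical(fixed, x))
```

If the assumption holds, the first print should show the same characters as the first row name in the SongWords output above. The same round trip, applied to the dimnames of a loaded object, is one possible way to repair names that were saved without a UTF-8 mark.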

yihui added a commit that referenced this pull request Jul 21, 2020

yihui (Owner) commented Jul 21, 2020

@pzhaonet It should be fixed now.

yihui (Owner) commented Jul 21, 2020

That said, these days I am not fond of saving data in the .rda or .RData format and prefer .rds: https://yihui.org/en/2017/12/save-vs-saverds/ Change it if you think it is worth the effort; otherwise leave it.

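pzhaonet (Contributor)

Fixed! That is another worry off my mind. I think it is fine as it is this time, with no need to switch to .rds. Let's leave some room for improvement in later revisions.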

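yihui (Owner) commented Jul 22, 2020

The problem is that this kind of change is breaking: either don't do it at all, or do it as early as possible (especially before any large-scale promotion). After switching to rds, explicit assignment is required: the old load('foo.rda') has to become foo = readRDS('foo.rds')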
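
The difference described here can be seen in a small base R sketch. The file paths, the object name `foo`, and the name `words` are placeholders chosen for illustration; they are not files or objects from the MSG package.

```r
# Sketch of load() versus readRDS(), using a temporary directory
# and made-up object names.
tmp <- tempfile("demo")
dir.create(tmp)
rda <- file.path(tmp, "foo.rda")
rds <- file.path(tmp, "foo.rds")

foo <- 1:3
save(foo, file = rda)      # stores the object together with its name
saveRDS(foo, file = rds)   # stores only the value

foo <- "something else I care about"
load(rda)                  # silently replaces the current `foo`
print(foo)

words <- readRDS(rds)      # the caller chooses the name explicitly
print(words)
```

After `load()`, the character string that was in `foo` is gone without any warning, while `readRDS()` leaves the choice of name to the caller. That is why existing `load()` calls would have to be rewritten if the files were converted.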

pzhaonet (Contributor)

Then let's change it!

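pzhaonet (Contributor) commented Jul 24, 2020

Are you worried that rda files silently take over object names? But rds files cannot be discovered by data(package="MSG"), can they? If so, I think rda is better because it is easier to call. For example, when the book uses datasets from other packages it always calls data() and never assigns explicitly.

If object name clashes are the worry, just give the rda objects odd names. For example, prefix them all with xx: xxChinaPop.rda.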

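yihui (Owner) commented Jul 24, 2020

What we are talking about here is the data under https://github.com/yihui/MSG/tree/master/inst/extdata. These files cannot be loaded with data(), and the reason I put them under inst/extdata is precisely that they have the Chinese-text problem. Otherwise I would have put them under https://github.com/yihui/MSG/tree/master/data.

Quoting pzhaonet: "Are you worried that rda files silently take over object names?"

Yes. Code that can silently overwrite objects should be avoided as much as possible. Of course data() has the same problem, so I also think data() should have been avoided in the first place, but it has existed for too long and is too widely used, so this bad habit is probably beyond correcting.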
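
One way to keep `load()` from overwriting objects, offered here as an illustration and not something proposed in the thread, is to load into a separate environment. The file and the object name `dat` below are made up for the demonstration.

```r
# Made-up data saved to a temporary .rda file for the demonstration.
f <- tempfile(fileext = ".rda")
dat <- matrix(1:4, nrow = 2)
save(dat, file = f)

# An object of the same name that we do not want to lose.
dat <- "my own notes"

# Load into a fresh environment instead of the calling one.
e <- new.env()
loaded <- load(f, envir = e)   # load() returns the names it restored
print(loaded)

# The workspace copy is untouched; the loaded copy lives in `e`.
print(dat)
print(dim(e$dat))
```

For a file shipped under inst/extdata, the path would come from `system.file("extdata", ..., package = "MSG")` as in the session above. This keeps the .rda format while making the assignment explicit, at the cost of an extra step for the reader.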

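pzhaonet (Contributor)

This is a minor detail. The two dishes each have their own flavor, and it comes down to the diner's taste.

The readers of this book are people interested in making plots, so it is enough to make plotting convenient for them. I think rda is harmless: less work for the author and more convenient for readers. They will not run into object name clashes. And one book alone cannot change how widely data() is used. Better to let it be.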

yihui (Owner) commented Jul 25, 2020

OK, I'll go with your call.
