-
Notifications
You must be signed in to change notification settings - Fork 2
/
19-R-intro.Rmd
278 lines (276 loc) · 24.1 KB
/
19-R-intro.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
# Appendix B Invoking R
<p>Users of R on Windows or macOS should read the OS-specific section first, but command-line use is also supported.</p>
<hr />
<p><a href="" id="Invoking-R-from-the-command-line"></a> <a href="" id="Invoking-R-from-the-command-line-1"></a></p>
<h3 id="b.1-invoking-r-from-the-command-line" class="section">B.1 Invoking R from the command line</h3>
<p>When working at a command line on UNIX or Windows, the command ‘R’ can be used both for starting the main R program in the form</p>
<div class="example">
<pre class="display"><code>R [options] [<infile] [>outfile],</code></pre>
</div>
<p>or, via the <code class="calibre2">R CMD</code> interface, as a wrapper to various R tools (e.g., for processing files in R documentation format or manipulating add-on packages) which are not intended to be called “directly”.</p>
<p>At the Windows command-line, <code class="calibre2">Rterm.exe</code> is preferred to <code class="calibre2">R</code>.</p>
<p>You need to ensure that either the environment variable <code class="calibre2">TMPDIR</code> is unset or it points to a valid place to create temporary files and directories.</p>
<p>Most options control what happens at the beginning and at the end of an R session. The startup mechanism is as follows (see also the on-line help for topic ‘Startup’ for more information, and the section below for some Windows-specific details).</p>
<ul>
<li>Unless --no-environ was given, R searches for user and site files to process for setting environment variables. The name of the site file is the one pointed to by the environment variable <code class="calibre2">R_ENVIRON</code>; if this is unset, R_HOME/etc/Renviron.site is used (if it exists). The user file is the one pointed to by the environment variable <code class="calibre2">R_ENVIRON_USER</code> if this is set; otherwise, files .Renviron in the current or in the user’s home directory (in that order) are searched for. These files should contain lines of the form ‘name=value’. (See <code class="calibre2">help("Startup")</code> for a precise description.) Variables you might want to set include <code class="calibre2">R_PAPERSIZE</code> (the default paper size), <code class="calibre2">R_PRINTCMD</code> (the default print command) and <code class="calibre2">R_LIBS</code> (specifies the list of R library trees searched for add-on packages).</li>
<li>Then R searches for the site-wide startup profile unless the command line option --no-site-file was given. The name of this file is taken from the value of the <code class="calibre2">R_PROFILE</code> environment variable. If that variable is unset, the default R_HOME/etc/Rprofile.site is used if this exists.</li>
<li>Then, unless --no-init-file was given, R searches for a user profile and sources it. The name of this file is taken from the environment variable <code class="calibre2">R_PROFILE_USER</code>; if unset, a file called .Rprofile in the current directory or in the user’s home directory (in that order) is searched for.</li>
<li>It also loads a saved workspace from file .RData in the current directory if there is one (unless --no-restore or --no-restore-data was specified).</li>
<li>Finally, if a function <code class="calibre2">.First()</code> exists, it is executed. This function (as well as <code class="calibre2">.Last()</code> which is executed at the end of the R session) can be defined in the appropriate startup profiles, or reside in .RData.</li>
</ul>
<p>In addition, there are options for controlling the memory available to the R process (see the on-line help for topic ‘Memory’ for more information). Users will not normally need to use these unless they are trying to limit the amount of memory used by R.</p>
<p>R accepts the following command-line options.</p>
<dl>
<dt>--help<br />
-h</dt>
<dd><p>Print short help message to standard output and exit successfully.</p>
</dd>
<dt>--version</dt>
<dd><p>Print version information to standard output and exit successfully.</p>
</dd>
<dt>--encoding=enc</dt>
<dd><p>Specify the encoding to be assumed for input from the console or <code class="calibre2">stdin</code>. This needs to be an encoding known to <code class="calibre2">iconv</code>: see its help page. (<code class="calibre2">--encoding enc</code> is also accepted.) The input is re-encoded to the locale R is running in and needs to be representable in the latter’s encoding (so e.g. you cannot re-encode Greek text in a French locale unless that locale uses the UTF-8 encoding).</p>
</dd>
<dt>RHOME</dt>
<dd><p>Print the path to the R “home directory” to standard output and exit successfully. Apart from the front-end shell script and the man page, R installation puts everything (executables, packages, etc.) into this directory.</p>
</dd>
<dt>--save<br />
--no-save</dt>
<dd><p>Control whether data sets should be saved or not at the end of the R session. If neither is given in an interactive session, the user is asked for the desired behavior when ending the session with q(); in non-interactive use one of these must be specified or implied by some other option (see below).</p>
</dd>
<dt>--no-environ</dt>
<dd><p>Do not read any user file to set environment variables.</p>
</dd>
<dt>--no-site-file</dt>
<dd><p>Do not read the site-wide profile at startup.</p>
</dd>
<dt>--no-init-file</dt>
<dd><p>Do not read the user’s profile at startup.</p>
</dd>
<dt>--restore<br />
--no-restore<br />
--no-restore-data</dt>
<dd><p>Control whether saved images (file .RData in the directory where R was started) should be restored at startup or not. The default is to restore. (--no-restore implies all the specific --no-restore-* options.)</p>
</dd>
<dt>--no-restore-history</dt>
<dd><p>Control whether the history file (normally file .Rhistory in the directory where R was started, but can be set by the environment variable <code class="calibre2">R_HISTFILE</code>) should be restored at startup or not. The default is to restore.</p>
</dd>
<dt>--no-Rconsole</dt>
<dd><p>(Windows only) Prevent loading the Rconsole file at startup.</p>
</dd>
<dt>--vanilla</dt>
<dd><p>Combine --no-save, --no-environ, --no-site-file, --no-init-file and --no-restore. Under Windows, this also includes --no-Rconsole.</p>
</dd>
<dt>-f file<br />
--file=file</dt>
<dd><p>(not <code class="calibre2">Rgui.exe</code>) Take input from file: ‘-’ means <code class="calibre2">stdin</code>. Implies --no-save unless --save has been set. On a Unix-alike, shell metacharacters should be avoided in file (but spaces are allowed).</p>
</dd>
<dt>-e expression</dt>
<dd><p>(not <code class="calibre2">Rgui.exe</code>) Use expression as an input line. One or more -e options can be used, but not together with -f or --file. Implies --no-save unless --save has been set. (There is a limit of 10,000 bytes on the total length of expressions used in this way. Expressions containing spaces or shell metacharacters will need to be quoted.)</p>
</dd>
<dt>--no-readline</dt>
<dd><p>(UNIX only) Turn off command-line editing via <strong>readline</strong>. This is useful when running R from within Emacs using the ESS (“Emacs Speaks Statistics”) package. See <a href="#The-command_002dline-editor">The command-line editor</a>, for more information. Command-line editing is enabled for default interactive use (see --interactive). This option also affects tilde-expansion: see the help for <code class="calibre2">path.expand</code>.</p>
</dd>
<dt>--min-vsize=N<br />
--min-nsize=N</dt>
<dd><p>For expert use only: set the initial trigger sizes for garbage collection of vector heap (in bytes) and <em>cons cells</em> (number) respectively. Suffix ‘M’ specifies megabytes or millions of cells respectively. The defaults are 6Mb and 350k respectively and can also be set by environment variables <code class="calibre2">R_NSIZE</code> and <code class="calibre2">R_VSIZE</code>.</p>
</dd>
<dt>--max-ppsize=N</dt>
<dd><p>Specify the maximum size of the pointer protection stack as N locations. This defaults to 10000, but can be increased to allow large and complicated calculations to be done. Currently the maximum value accepted is 100000.</p>
</dd>
<dt>--max-mem-size=N</dt>
<dd><p>(Windows only) Specify a limit for the amount of memory to be used both for R objects and working areas. This is set by default to the smaller of the amount of physical RAM in the machine and for 32-bit R, 1.5Gb<a href="appendix-f-references.html#FOOT26" id="DOCF26"><sup>26</sup></a>, and must be between 32Mb and the maximum allowed on that version of Windows.</p>
</dd>
<dt>--quiet<br />
--silent<br />
-q</dt>
<dd><p>Do not print out the initial copyright and welcome messages.</p>
</dd>
<dt>--slave</dt>
<dd><p>Make R run as quietly as possible. This option is intended to support programs which use R to compute results for them. It implies --quiet and --no-save.</p>
</dd>
<dt>--interactive</dt>
<dd><p>(UNIX only) Assert that R really is being run interactively even if input has been redirected: use if input is from a FIFO or pipe and fed from an interactive program. (The default is to deduce that R is being run interactively if and only if stdin is connected to a terminal or <code class="calibre2">pty</code>.) Using -e, -f or --file asserts non-interactive use even if --interactive is given.</p>
<p>Note that this does not turn on command-line editing.</p>
</dd>
<dt>--ess</dt>
<dd><p>(Windows only) Set <code class="calibre2">Rterm</code> up for use by <code class="calibre2">R-inferior-mode</code> in ESS, including asserting interactive use (without the command-line editor) and no buffering of stdout.</p>
</dd>
<dt>--verbose</dt>
<dd><p>Print more information about progress, and in particular set R’s option <code class="calibre2">verbose</code> to <code class="calibre2">TRUE</code>. R code uses this option to control the printing of diagnostic messages.</p>
</dd>
<dt>--debugger=name<br />
-d name</dt>
<dd><p>(UNIX only) Run R through debugger name. For most debuggers (the exceptions are <code class="calibre2">valgrind</code> and recent versions of <code class="calibre2">gdb</code>), further command line options are disregarded, and should instead be given when starting the R executable from inside the debugger.</p>
</dd>
<dt>--gui=type<br />
-g type</dt>
<dd><p>(UNIX only) Use type as graphical user interface (note that this also includes interactive graphics). Currently, possible values for type are ‘X11’ (the default) and, provided that ‘Tcl/Tk’ support is available, ‘Tk’. (For back-compatibility, ‘x11’ and ‘tk’ are accepted.)</p>
</dd>
<dt>--arch=name</dt>
<dd><p>(UNIX only) Run the specified sub-architecture.</p>
</dd>
<dt>--args</dt>
<dd><p>This flag does nothing except cause the rest of the command line to be skipped: this can be useful to retrieve values from it with <code class="calibre2">commandArgs(TRUE)</code>.</p>
</dd>
</dl>
<p>Note that input and output can be redirected in the usual way (using ‘<’ and ‘>’), but the line length limit of 4095 bytes still applies. Warning and error messages are sent to the error channel (<code class="calibre2">stderr</code>).</p>
<p>The command <code class="calibre2">R CMD</code> allows the invocation of various tools which are useful in conjunction with R, but not intended to be called “directly”. The general form is</p>
<div class="example">
<pre class="example1"><code>R CMD command args</code></pre>
</div>
<p>where command is the name of the tool and args the arguments passed on to it.</p>
<p>Currently, the following tools are available.</p>
<dl>
<dt><code class="calibre2">BATCH</code></dt>
<dd><p>Run R in batch mode. Runs <code class="calibre2">R --restore --save</code> with possibly further options (see <code class="calibre2">?BATCH</code>).</p>
</dd>
<dt><code class="calibre2">COMPILE</code></dt>
<dd><p>(UNIX only) Compile C, C++, Fortran … files for use with R.</p>
</dd>
<dt><code class="calibre2">SHLIB</code></dt>
<dd><p>Build shared library for dynamic loading.</p>
</dd>
<dt><code class="calibre2">INSTALL</code></dt>
<dd><p>Install add-on packages.</p>
</dd>
<dt><code class="calibre2">REMOVE</code></dt>
<dd><p>Remove add-on packages.</p>
</dd>
<dt><code class="calibre2">build</code></dt>
<dd><p>Build (that is, package) add-on packages.</p>
</dd>
<dt><code class="calibre2">check</code></dt>
<dd><p>Check add-on packages.</p>
</dd>
<dt><code class="calibre2">LINK</code></dt>
<dd><p>(UNIX only) Front-end for creating executable programs.</p>
</dd>
<dt><code class="calibre2">Rprof</code></dt>
<dd><p>Post-process R profiling files.</p>
</dd>
<dt><code class="calibre2">Rdconv</code><br />
<code class="calibre2">Rd2txt</code></dt>
<dd><p>Convert Rd format to various other formats, including HTML, LaTeX, plain text, and extracting the examples. <code class="calibre2">Rd2txt</code> can be used as shorthand for <code class="calibre2">Rd2conv -t txt</code>.</p>
</dd>
<dt><code class="calibre2">Rd2pdf</code></dt>
<dd><p>Convert Rd format to PDF.</p>
</dd>
<dt><code class="calibre2">Stangle</code></dt>
<dd><p>Extract S/R code from Sweave or other vignette documentation</p>
</dd>
<dt><code class="calibre2">Sweave</code></dt>
<dd><p>Process Sweave or other vignette documentation</p>
</dd>
<dt><code class="calibre2">Rdiff</code></dt>
<dd><p>Diff R output ignoring headers etc</p>
</dd>
<dt><code class="calibre2">config</code></dt>
<dd><p>Obtain configuration information</p>
</dd>
<dt><code class="calibre2">javareconf</code></dt>
<dd><p>(Unix only) Update the Java configuration variables</p>
</dd>
<dt><code class="calibre2">rtags</code></dt>
<dd><p>(Unix only) Create Emacs-style tag files from C, R, and Rd files</p>
</dd>
<dt><code class="calibre2">open</code></dt>
<dd><p>(Windows only) Open a file via Windows’ file associations</p>
</dd>
<dt><code class="calibre2">texify</code></dt>
<dd><p>(Windows only) Process (La)TeX files with R’s style files</p>
</dd>
</dl>
<p>Use</p>
<div class="example">
<pre class="example1"><code>R CMD command --help</code></pre>
</div>
<p>to obtain usage information for each of the tools accessible via the <code class="calibre2">R CMD</code> interface.</p>
<p>In addition, you can use options --arch=, --no-environ, --no-init-file, --no-site-file and --vanilla between <code class="calibre2">R</code> and <code class="calibre2">CMD</code>: these affect any R processes run by the tools. (Here --vanilla is equivalent to --no-environ --no-site-file --no-init-file.) However, note that <code class="calibre2">R CMD</code> does not of itself use any R startup files (in particular, neither user nor site Renviron files), and all of the R processes run by these tools (except <code class="calibre2">BATCH</code>) use --no-restore. Most use --vanilla and so invoke no R startup files: the current exceptions are <code class="calibre2">INSTALL</code>, <code class="calibre2">REMOVE</code>, <code class="calibre2">Sweave</code> and <code class="calibre2">SHLIB</code> (which uses --no-site-file --no-init-file).</p>
<div class="example">
<pre class="example1"><code>R CMD cmd args</code></pre>
</div>
<p>for any other executable <code class="calibre2">cmd</code> on the path or given by an absolute filepath: this is useful to have the same environment as R or the specific commands run under, for example to run <code class="calibre2">ldd</code> or <code class="calibre2">pdflatex</code>. Under Windows cmd can be an executable or a batch file, or if it has extension <code class="calibre2">.sh</code> or <code class="calibre2">.pl</code> the appropriate interpreter (if available) is called to run it.</p>
<hr />
<p><a href="" id="Invoking-R-under-Windows"></a> <a href="" id="Invoking-R-under-Windows-1"></a></p>
<h3 id="b.2-invoking-r-under-windows" class="section">B.2 Invoking R under Windows</h3>
<p>There are two ways to run R under Windows. Within a terminal window (e.g. <code class="calibre2">cmd.exe</code> or a more capable shell), the methods described in the previous section may be used, invoking by <code class="calibre2">R.exe</code> or more directly by <code class="calibre2">Rterm.exe</code>. For interactive use, there is a console-based GUI (<code class="calibre2">Rgui.exe</code>).</p>
<p>The startup procedure under Windows is very similar to that under UNIX, but references to the ‘home directory’ need to be clarified, as this is not always defined on Windows. If the environment variable <code class="calibre2">R_USER</code> is defined, that gives the home directory. Next, if the environment variable <code class="calibre2">HOME</code> is defined, that gives the home directory. After those two user-controllable settings, R tries to find system defined home directories. It first tries to use the Windows "personal" directory (typically <code class="calibre2">My Documents</code> in recent versions of Windows). If that fails, and environment variables <code class="calibre2">HOMEDRIVE</code> and <code class="calibre2">HOMEPATH</code> are defined (and they normally are) these define the home directory. Failing all those, the home directory is taken to be the starting directory.</p>
<p>You need to ensure that either the environment variables <code class="calibre2">TMPDIR</code>, <code class="calibre2">TMP</code> and <code class="calibre2">TEMP</code> are either unset or one of them points to a valid place to create temporary files and directories.</p>
<p>Environment variables can be supplied as ‘name=value’ pairs on the command line.</p>
<p>If there is an argument ending .RData (in any case) it is interpreted as the path to the workspace to be restored: it implies --restore and sets the working directory to the parent of the named file. (This mechanism is used for drag-and-drop and file association with <code class="calibre2">RGui.exe</code>, but also works for <code class="calibre2">Rterm.exe</code>. If the named file does not exist it sets the working directory if the parent directory exists.)</p>
<p>The following additional command-line options are available when invoking <code class="calibre2">RGui.exe</code>.</p>
<dl>
<dt>--mdi<br />
--sdi<br />
--no-mdi</dt>
<dd><p>Control whether <code class="calibre2">Rgui</code> will operate as an MDI program (with multiple child windows within one main window) or an SDI application (with multiple top-level windows for the console, graphics and pager). The command-line setting overrides the setting in the user’s Rconsole file.</p>
</dd>
<dt>--debug</dt>
<dd><p>Enable the “Break to debugger” menu item in <code class="calibre2">Rgui</code>, and trigger a break to the debugger during command line processing.</p>
</dd>
</dl>
<p>Under Windows with <code class="calibre2">R CMD</code> you may also specify your own .bat, .exe, .sh or .pl file. It will be run under the appropriate interpreter (Perl for .pl) with several environment variables set appropriately, including <code class="calibre2">R_HOME</code>, <code class="calibre2">R_OSTYPE</code>, <code class="calibre2">PATH</code>, <code class="calibre2">BSTINPUTS</code> and <code class="calibre2">TEXINPUTS</code>. For example, if you already have latex.exe on your path, then</p>
<div class="example">
<pre class="example1"><code>R CMD latex.exe mydoc</code></pre>
</div>
<p>will run LaTeX on mydoc.tex, with the path to R’s share/texmf macros appended to <code class="calibre2">TEXINPUTS</code>. (Unfortunately, this does not help with the MiKTeX build of LaTeX, but <code class="calibre2">R CMD texify mydoc</code> will work in that case.)</p>
<hr />
<p><a href="" id="Invoking-R-under-macOS"></a> <a href="" id="Invoking-R-under-macOS-1"></a></p>
<h3 id="b.3-invoking-r-under-macos" class="section">B.3 Invoking R under macOS</h3>
<p>There are two ways to run R under macOS. Within a <code class="calibre2">Terminal.app</code> window by invoking <code class="calibre2">R</code>, the methods described in the first subsection apply. There is also console-based GUI (<code class="calibre2">R.app</code>) that by default is installed in the <code class="calibre2">Applications</code> folder on your system. It is a standard double-clickable macOS application.</p>
<p>The startup procedure under macOS is very similar to that under UNIX, but <code class="calibre2">R.app</code> does not make use of command-line arguments. The ‘home directory’ is the one inside the R.framework, but the startup and current working directory are set as the user’s home directory unless a different startup directory is given in the Preferences window accessible from within the GUI.</p>
<hr />
<p><a href="" id="Scripting-with-R"></a> <a href="" id="Scripting-with-R-1"></a></p>
<h3 id="b.4-scripting-with-r" class="section">B.4 Scripting with R</h3>
<p>If you just want to run a file foo.R of R commands, the recommended way is to use <code class="calibre2">R CMD BATCH foo.R</code>. If you want to run this in the background or as a batch job use OS-specific facilities to do so: for example in most shells on Unix-alike OSes <code class="calibre2">R CMD BATCH foo.R &</code> runs a background job.</p>
<p>You can pass parameters to scripts via additional arguments on the command line: for example (where the exact quoting needed will depend on the shell in use)</p>
<div class="example">
<pre class="example1"><code>R CMD BATCH "--args arg1 arg2" foo.R &</code></pre>
</div>
<p>will pass arguments to a script which can be retrieved as a character vector by</p>
<div class="example">
<pre class="example1"><code>args <- commandArgs(TRUE)</code></pre>
</div>
<p>This is made simpler by the alternative front-end <code class="calibre2">Rscript</code>, which can be invoked by</p>
<div class="example">
<pre class="example1"><code>Rscript foo.R arg1 arg2</code></pre>
</div>
<p>and this can also be used to write executable script files like (at least on Unix-alikes, and in some Windows shells)</p>
<div class="example">
<pre class="example1"><code>#! /path/to/Rscript
args <- commandArgs(TRUE)
...
q(status=<exit status code>)</code></pre>
</div>
<p>If this is entered into a text file runfoo and this is made executable (by <code class="calibre2">chmod 755 runfoo</code>), it can be invoked for different arguments by</p>
<div class="example">
<pre class="example1"><code>runfoo arg1 arg2</code></pre>
</div>
<p>For further options see <code class="calibre2">help("Rscript")</code>. This writes R output to stdout and stderr, and this can be redirected in the usual way for the shell running the command.</p>
<p>If you do not wish to hardcode the path to <code class="calibre2">Rscript</code> but have it in your path (which is normally the case for an installed R except on Windows, but e.g. macOS users may need to add /usr/local/bin to their path), use</p>
<div class="example">
<pre class="example1"><code>#! /usr/bin/env Rscript
...</code></pre>
</div>
<p>At least in Bourne and bash shells, the <code class="calibre2">#!</code> mechanism does <strong>not</strong> allow extra arguments like <code class="calibre2">#! /usr/bin/env Rscript --vanilla</code>.</p>
<p>One thing to consider is what <code class="calibre2">stdin()</code> refers to. It is commonplace to write R scripts with segments like</p>
<div class="example">
<pre class="example1"><code>chem <- scan(n=24)
2.90 3.10 3.40 3.40 3.70 3.70 2.80 2.50 2.40 2.40 2.70 2.20
5.28 3.37 3.03 3.03 28.95 3.77 3.40 2.20 3.50 3.60 3.70 3.70</code></pre>
</div>
<p>and <code class="calibre2">stdin()</code> refers to the script file to allow such traditional usage. If you want to refer to the process’s stdin, use <code class="calibre2">"stdin"</code> as a <code class="calibre2">file</code> connection, e.g. <code class="calibre2">scan("stdin", ...)</code>.</p>
<p>Another way to write executable script files (suggested by François Pinard) is to use a <em>here document</em> like</p>
<div class="example">
<pre class="example1"><code>#!/bin/sh
[environment variables can be set here]
R --slave [other options] <<EOF
R program goes here...
EOF</code></pre>
</div>
<p>but here <code class="calibre2">stdin()</code> refers to the program source and <code class="calibre2">"stdin"</code> will not be usable.</p>
<p>Short scripts can be passed to <code class="calibre2">Rscript</code> on the command-line <em>via</em> the -e flag. (Empty scripts are not accepted.)</p>
<p>Note that on a Unix-alike the input filename (such as foo.R) should not contain spaces nor shell metacharacters.</p>
<hr />
<p><a href="" id="The-command_002dline-editor"></a> <a href="" id="The-command_002dline-editor-1"></a></p>