# wch/r-source

Dear Emacs, please make this -*-Text-*- mode!

This file covers NEWS up to the release of R-2.0.0.
See 'ONEWS' for subsequent changes.

        **************************************************
        *                                                *
        *                2.0 SERIES NEWS                 *
        *                                                *
        **************************************************

                CHANGES IN R VERSION 2.0.0

USER-VISIBLE CHANGES

    o The stub packages from 1.9.x have been removed: the library() function selects the new home for their code.

    o 'Lazy loading' of R code has been implemented, and is used for the standard and recommended packages by default. Rather than keep R objects in memory, they are kept in a database on disc and only loaded on first use. This accelerates startup (down to 40% of the time for 1.9.x) and reduces memory usage -- the latter is probably unimportant of itself, but reduces commensurately the time spent in garbage collection.

      Packages are by default installed using lazy loading if they have more than 25Kb of R code and did not use a saved image. This can be overridden by INSTALL --[no-]lazy or via a field in the DESCRIPTION file. Note that as with --save, any other packages which are required must be already installed.

      As the lazy-loading databases will be consulted often, R will be slower if run from a slow network-mounted disc.

    o All the datasets formerly in packages 'base' and 'stats' have been moved to a new package 'datasets'. data() does the appropriate substitution, with a warning. However, calls to data() are not normally needed as the data objects are visible in the 'datasets' package.

      Packages can be installed to make their data objects visible via R CMD INSTALL --lazy-data or via a field in the DESCRIPTION file.

    o Package 'graphics' has been split into 'grDevices' (the graphics devices shared between base and grid graphics) and 'graphics' (base graphics). Each of the 'graphics' and 'grid' packages load 'grDevices' when they are attached. Note that ps.options() has been moved to grDevices and user hooks may need to be updated.
    o The semantics of data() have changed (and were incorrectly documented in recent releases) and the function has been moved to package 'utils'. Please read the help page carefully if you use the 'package' or 'lib.loc' arguments.

      data() now lists datasets, and not just names which data() accepts.

    o Dataset 'phones' has been renamed to 'WorldPhones'.

    o Datasets 'sunspot.month' and 'sunspot.year' are available separately but not via data(sunspot) (which was used by package lattice to retrieve a dataset 'sunspot').

    o Packages must have been re-installed for this version, and library() will enforce this.

    o Package names must now be given exactly in library() and require(), regardless of whether the underlying file system is case-sensitive or not. So 'library(mass)' will not work, even on Windows.

    o R no longer accepts associative use of relational operators. That is, 3 < 2 < 1 (which used to evaluate as TRUE!) now causes a syntax error. If this breaks existing code, just add parentheses -- or braces in the case of plotmath.

    o The R parser now allows multiline strings, without escaping the newlines with backslashes (the old method still works). Patch by Mark Bravington.

NEW FEATURES

    o There is a new atomic vector type, class "raw". See ?raw for full details including the operators and utility functions provided.

    o The default barplot() method by default uses a gamma-corrected grey palette (rather than the heat color palette) for coloring its output when given a matrix.

    o The 'formula' method for boxplot() has a 'na.action' argument, defaulting to NULL. This is mainly useful if the response is a matrix when the previous default of 'na.omit' would omit entire rows. (Related to PR#6846.)

      boxplot() and bxp() now obey global 'par' settings and also allow the specification of graphical options in more detail, compatibly with S-PLUS (fulfilling wishlist entry PR#6832) thanks to contributions from Arni Magnusson. For consistency, 'boxwex' is not an explicit argument anymore.
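The entries above on relational operators, multiline strings and the "raw" type can be tried out with a short sketch like the following. It assumes a current R installation (details may differ from 2.0.0), and the variable names are purely illustrative.

```r
# Chained comparisons such as 3 < 2 < 1 are rejected by the parser.
chained_fails <- inherits(try(parse(text = "3 < 2 < 1"), silent = TRUE),
                          "try-error")
print(chained_fails)

# Parentheses make the grouping explicit: (3 < 2) is FALSE, and FALSE < 1
# compares 0 with 1.
grouped <- (3 < 2) < 1
print(grouped)

# A string literal may span lines without backslash-escaping the newline.
s <- "first line
second line"
has_newline <- grepl("\n", s, fixed = TRUE)
print(has_newline)

# Raw vectors hold bytes and print in hexadecimal.
r <- as.raw(c(1, 16, 255))
print(r)
print(charToRaw("R"))
print(as.integer(r))
```

The parenthesised form shows why the old behaviour was surprising: the left comparison is evaluated first and its logical result is then compared with the remaining number.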
    o chull() has been moved to package graphics (as it uses xy.coords).

    o There is now a coef() method for summaries of "nls" objects.

    o compareVersion(), packageDescription() and read.00Index() have been moved to package 'utils'.

    o convolve(), fft(), mvfft() and nextn() have been moved to package stats.

    o coplot() now makes use of cex.lab and font.lab par() settings.

    o cumsum/prod/max/min() now preserve names.

    o data(), .path.packages() and .find.packages() now interpret package = NULL to mean all loaded packages.

    o data.frame() and its replacement methods remove the names from vector columns. Using I() will ensure that names are preserved.

    o data.frame(check.names = TRUE) (the default) enforces unique names, as S does.

    o .Defunct() now has 'new' and 'package' arguments like those of .Deprecated().

    o The plot() method for "dendrogram" objects now respects many more nodePar and edgePar settings and for edge labeling computes the extents of the diamond more correctly.

    o deparse(), dput() and dump() have a new 'control' argument to control the level of detail when deparsing. dump() defaults to the most detail, the others default to less. See ?.deparseOpts for the details.

      They now evaluate promises by default: see ?dump for details.

    o dir.create() now expands '~' in filenames.

    o download.file() has a new progress meter (under Unix) if the length of the file is known -- it uses 50 equals signs.

    o dyn.load() and library.dynam() return an object describing the DLL that was loaded. For packages with namespaces, the DLL objects are stored in a list within the namespace.

    o New function eapply() - apply for environments. The supplied function is applied to each element of the environment; the order of application is not specified.

    o edit() and fix() use the object name in the window caption on some platforms (e.g. Windows).

    o New function file.edit(): like file.show(), but allows editing.
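A few of the entries above (eapply(), name preservation in cumsum(), and unique column names in data.frame()) can be illustrated with a minimal sketch. It assumes a current R installation, and the object names are made up for the example.

```r
# eapply() applies a function to every object in an environment.
e <- new.env()
assign("a", 1:3, envir = e)
assign("b", 4:6, envir = e)
totals <- eapply(e, sum)
# The order of application is not specified, so sort by name before use.
totals <- totals[order(names(totals))]
str(totals)

# Cumulative functions keep the names of their input.
running <- cumsum(c(x = 1, y = 2, z = 3))
print(running)

# check.names = TRUE (the default) makes duplicated column names unique.
df <- data.frame(v = 1, v = 2)
print(names(df))
```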
    o Function file.info() can return file sizes > 2G if the underlying OS supports such.

    o fisher.test(*, conf.int=FALSE) allows the confidence interval computation to be skipped.

    o formula() methods for classes "lm" and "glm" used the expanded formula (with '.' expanded) from the terms component.

    o The 'formula' method for ftable() now looks for variables in the environment of the formula before the usual search path.

    o A new function getDLLRegisteredRoutines() returns information about the routines available from a DLL that were explicitly registered with R's dynamic loading facilities.

    o A new function getLoadedDLLs() returns information about the DLLs that are currently loaded within this session.

    o The package element returned by getNativeSymbolInfo() contains reference to both the internal object used to resolve symbols with the DLL, and the internal DllInfo structure used to represent the DLL within R.

    o help() now returns information about available documentation for a given topic, and notifies about multiple matches. It has a separate print() method.

      If the latex help files were not installed, help() will offer to create a latex file on-the-fly from the installed .Rd file.

    o heatmap() has a new argument 'reorderfun'.

    o Most versions of install.packages() have a new optional argument 'dependencies = TRUE' which will not only fetch the packages but also their uninstalled dependencies and their dependencies ....

      The Unix version of install.packages() attempts to install packages in an order that reflects their dependencies. (This is not needed for binary installs as used under Windows.)

    o interaction() has new argument 'sep'.

    o interaction.plot() allows 'type = "b"' and doesn't give spurious warnings when passed a matplot()-only argument such as 'main'.

    o is.integer() and is.numeric() always return FALSE for a factor.
      (Previously they were true and false respectively for well-formed factors, but it is possible to create factors with non-integer codes by underhand means.)

    o New functions is.leaf(), dendrapply() and a labels() method for dendrogram objects.

    o legend() has an argument 'pt.lwd' and setting 'density' now works because 'angle' now defaults to 45 (mostly contributed by Uwe Ligges).

    o library() now checks the version dependence (if any) of required packages mentioned in the Depends: field of the DESCRIPTION file.

    o load() now detects and gives a warning (rather than an error) for empty input, and tries to detect (but not correct) files which have had LF replaced by CR.

    o ls.str() and lsf.str() now return an object of class "ls_str" which has a print method.

    o make.names() has a new argument allow_, which if false allows its behaviour in R 1.8.1 to be reproduced.

    o The 'formula' method for mosaicplot() has a 'na.action' argument defaulting to 'na.omit'.

    o model.frame() now warns if it is given data = newdata and it creates a model frame with a different number of rows from that implied by the size of 'newdata'.

      Time series attributes are never copied to variables in the model frame unless na.action = NULL. (This was always the intention, but they sometimes were as the result of an earlier bug fix.)

    o There is a new 'padj' argument to mtext() and axis(). Code patch provided by Uwe Ligges (fixes PR#1659 and PR#7188).

    o Function package.dependencies() has been moved to package 'tools'.

    o The 'formula' method for pairs() has a 'na.action' argument, defaulting to 'na.pass', rather than the value of getOption("na.action").

    o There are five new par() settings:

      'family' can be used to specify a font family for graphics text. This is a device-independent family specification which gets mapped by the graphics device to a device-specific font specification (see, for example, postscriptFonts()). Currently, only PostScript, PDF, X11, Quartz, and Windows respond to this setting.
      'lend', 'ljoin', and 'lmitre' control the cap style and join style for drawing lines (only noticeable on thick lines or borders). Currently, only PostScript, PDF, X11, and Quartz respond to these settings.

      'lheight' is a multiplier used in determining the vertical spacing of multi-line text.

      All of these settings are currently only available via par() (i.e., not in-line as arguments to plot(), lines(), ...)

    o PCRE (as used by grep etc) has been updated to version 5.0.

    o A 'version' argument has been added to pdf() device. If this is set to "1.4", the device will support transparent colours.

    o plot.xy(), the workhorse function of points(), lines() and plot.default() now has 'lwd' as explicit argument instead of implicitly in '...', and now recycles lwd where it makes sense, i.e. for line-based plot symbols.

    o The png() and jpeg() devices (and the bmp() device under Windows) now allow a nominal resolution to be recorded in the file.

    o New functions to control mapping from device-independent graphics font family to device-specific family: postscriptFont() and postscriptFonts() (for both postscript() and pdf()); X11Font() and X11Fonts(); windowsFont() and windowsFonts(); quartzFont() and quartzFonts().

    o power (x^y) has optimised code for y == 2.

    o prcomp() is now generic, with a formula method (based on an idea of Jari Oksanen).

      prcomp() now has a simple predict() method.

    o printCoefmat() has a new logical argument 'signif.legend'.

    o quantile() has the option of several methods described in Hyndman & Fan (1996). (Contributed by Rob Hyndman.)

    o rank() has two new 'ties.method's, "min" and "max".

    o New function read.fortran() reads Fortran-style fixed-format specifications.

    o read.fwf() reads multiline records, is faster for large files.

    o read.table() now accepts "NULL", "factor", "Date" and "POSIXct" as possible values of colClasses, and colClasses can be a named character vector.

    o readChar() can now read strings with embedded nuls.
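The rank(), quantile() and read.table() entries above can be sketched as follows. This is an illustration under the assumption of a current R installation; the small data set and its column names are invented for the example.

```r
# Tied values get the lowest or highest rank of the tie group.
x <- c(10, 20, 20, 30)
rank_min <- rank(x, ties.method = "min")
rank_max <- rank(x, ties.method = "max")
print(rank_min)
print(rank_max)

# quantile() selects one of the Hyndman & Fan definitions via 'type'.
# Type 7 interpolates linearly; type 1 inverts the empirical CDF.
q7 <- quantile(1:10, 0.5, type = 7, names = FALSE)
q1 <- quantile(1:10, 0.5, type = 1, names = FALSE)
print(c(type7 = q7, type1 = q1))

# colClasses may be a named character vector and may request "Date".
tab <- read.table(text = "id when\n1 2004-06-21\n2 2004-10-04",
                  header = TRUE,
                  colClasses = c(id = "integer", when = "Date"))
str(tab)
```

Naming the elements of colClasses ties each class to a column by name rather than by position, which is easier to keep correct when columns are reordered.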
    o The "dendrogram" method for reorder() now has an 'agglo.FUN' argument for specification of a weights agglomeration function.

    o New reorder() method for factors, slightly extending that in lattice. Contributed by Deepayan Sarkar.

    o Replaying a plot (with replayPlot() or via autoprinting) now automagically opens a device if none is open.

    o replayPlot() issues a warning if an attempt is made to replay a plot that was recorded using a different R version (the format for recorded plots is not guaranteed to be stable across different R versions). The Windows-menu equivalent (History...Get from variable) issues a similar warning.

    o reshape() can handle multiple 'id' variables.

    o It is now possible to specify colours with a full alpha transparency channel via the new 'alpha' argument to the rgb() and hsv() functions, or as a string of the form "#RRGGBBAA".

      NOTE: most devices draw nothing if a colour is not opaque, but PDF and Quartz devices will render semitransparent colours.

      A new argument 'alpha' to the function col2rgb() provides the ability to return the alpha component of colours (as well as the red, green, and blue components).

    o save() now checks that a binary connection is used.

    o seek() on connections now accepts and returns a double for the file position. This allows >2Gb files to be handled on a 64-bit platform (and some 32-bit platforms).

    o source() with 'echo = TRUE' uses the function source attribute when displaying commands as they are parsed.

    o setClass() and its utilities now warn if either superclasses or classes for slots are undefined. (Use setOldClass to register S3 classes for use as slots)

    o str(obj) now displays more reasonably the STRucture of S4 objects. It is also improved for language objects and lists with promise components.

      The method for class "dendrogram" has a new argument 'stem' and indicates when it's not printing all levels (as typically when e.g., 'max.level = 2').
      Specifying 'max.level = 0' now suppresses all but the top level for hierarchical objects such as lists. This differs from the previous behaviour, in which 'max.level = 0' gave all levels. The default behaviour of giving all levels is unchanged, but is now specified by 'max.level = NA'.

    o system.time() has a new argument 'gcFirst' which, when TRUE, forces a garbage collection before timing begins.

    o tail() of a matrix now displays the original row numbers.

    o The default method for text() now coerces a factor to character and not to its internal codes. This is incompatible with S but seems what users would expect.

      It now also recycles (x,y) to the length of 'labels' if that is longer. This is now compatible with grid.text() and S. (See also PR#7084.)

    o TukeyHSD() now labels comparisons when applied to an interaction in an aov() fit. It detects non-factor terms in 'which' and drops them if sensible to do so.

    o There is now a replacement method for window(), to allow a range of values of time series to be replaced by specifying the start and end times (and optionally a frequency).

    o If writeLines() is given a connection that is not open, it now attempts to open it in mode = "wt" rather than the default mode specified when creating the connection.

    o The screen devices x11(), windows() and quartz() have a new argument 'bg' to set the default background colour.

    o Subassignments involving NAs and with a replacement value of length > 1 are now disallowed. (They were handled inconsistently in R < 2.0.0, see PR#7210.) For data frames they are disallowed altogether, even for logical matrix indices (the only case which used to work).

    o The way the comparison operators handle a list argument has been rationalized so a few more cases will now work -- see ?Comparison.

    o Indexing a vector by a character vector was slow if both the vector and index were long (say 10,000). Now hashing is used and the time should be linear in the longer of the lengths (but more memory is used).
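The tail(), window() replacement and alpha-channel colour entries earlier in this list can be tried with a minimal sketch such as the one below. It assumes a current R installation, and all object names are illustrative only.

```r
# tail() on a matrix labels the rows with their original row numbers.
m <- matrix(1:20, nrow = 10)
last <- tail(m, 2)
print(rownames(last))

# The replacement form of window() overwrites a time range of a series.
x <- ts(1:10, start = 2001)
window(x, start = 2003, end = 2005) <- 0
print(x)

# Colours can carry an alpha channel, written as "#RRGGBBAA".
semi <- rgb(1, 0, 0, alpha = 0.5)
print(semi)
# col2rgb(alpha = TRUE) returns the alpha component as a fourth row.
channels <- col2rgb(semi, alpha = TRUE)
print(channels)
```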
    o Printing a character string with embedded nuls now prints the whole string, and non-printable characters are represented by octal escape sequences.

    o Objects created from a formally defined class now include the name of the corresponding package as an attribute in the object's class. This allows packages with namespaces to have private (non-exported) classes.

    o Changes to package 'grid':

      - Calculation of number of circles to draw in circleGrob now looks at length of y and r as well as length of x.

      - Calculation of number of rectangles to draw in rectGrob now looks at length of y, w, and h as well as length of x.

      - All primitives (rectangles, lines, text, ...) now handle non-finite values (NA, Inf, -Inf, NaN) for locations and sizes.

        Non-finite values for locations, sizes, and scales of viewports result in error messages.

        There is a new vignette ("nonfinite") which describes this new behaviour.

      - Fixed (unreported) bug in drawing circles. Now checks that radius is non-negative.

      - downViewport() now reports the depth it went down to find a viewport. Handy for "going back" to where you started, e.g., ...

            depth <- downViewport("vpname")
            upViewport(depth)

      - The "alpha" gpar() is now combined with the alpha channel of colours when creating a gcontext as follows: (internal C code)

            finalAlpha = gpar("alpha")*(R_ALPHA(col)/255)

        This means that gpar(alpha=) settings now affect internal colours so grid alpha transparency settings now are sent to graphics devices.

        The alpha setting is also cumulative. For example, ...

            grid.rect(width=0.5, height=0.5, gp=gpar(fill="blue"))  # alpha = 1
            pushViewport(viewport(gp=gpar(alpha=0.5)))
            grid.rect(height=0.25, gp=gpar(fill="red"))             # alpha = 0.5
            pushViewport(viewport(gp=gpar(alpha=0.5)))
            grid.rect(width=0.25, gp=gpar(fill="red"))              # alpha = 0.25 !

      - Editing a gp slot in a grob is now incremental. For example ...
            grid.lines(name="line")
            grid.edit("line", gp=gpar(col="red")) # line turns red
            grid.edit("line", gp=gpar(lwd=3))     # line becomes thick
                                                  # AND STAYS red

      - The "cex" gpar is now cumulative. For example ...

            grid.rect(height=unit(4, "char"))         # cex = 1
            pushViewport(viewport(gp=gpar(cex=0.5)))
            grid.rect(height=unit(4, "char"))         # cex = 0.5
            pushViewport(viewport(gp=gpar(cex=0.5)))
            grid.rect(height=unit(4, "char"))         # cex = 0.25 !!!

      - New childNames() function to list the names of children of a gTree.

      - The "grep" and "global" arguments have been implemented for grid.[add|edit|get|remove]Grob() functions.

        The "grep" argument has also been implemented for the grid.set() and setGrob().

      - New function grid.grab() which creates a gTree from the current display list (i.e., the current page of output can be converted into a single gTree object with all grobs on the current page as children of the gTree and all the viewports used in drawing the current page in the childrenvp slot of the gTree).

      - New "lineend", "linejoin", and "linemitre" gpar()s:

            line end can be "round", "butt", or "square".
            line join can be "round", "mitre", or "bevel".
            line mitre can be any number larger than 1 (controls when a mitre join gets turned into a bevel join; proportional to angle between lines at join; very big number means that conversion only happens for lines that are almost parallel at join).

      - New grid.prompt() function for controlling whether the user is prompted before starting a new page of output.

        Grid no longer responds to the par(ask) setting in the "graphics" package.

    o The tcltk package has had the tkcmd() function renamed as tcl() since it could be used to invoke commands that had nothing to do with Tk. The old name is retained, but will be deprecated in a future release. Similarly, we now have tclopen(), tclclose(), tclread(), tclputs(), tclfile.tail(), and tclfile.dir() replacing counterparts starting with "tk", with old names retained for now.
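The grid entries above on cumulative "cex" and on the depth reported by downViewport() can be checked with a short sketch. It assumes a current R installation with the grid package, draws to a null PDF device so that no file is written, and uses made-up viewport names.

```r
library(grid)

pdf(file = NULL)   # null device: nothing is written to disc
grid.newpage()

# Two nested viewports, each halving cex.
pushViewport(viewport(name = "outer", gp = gpar(cex = 0.5)))
pushViewport(viewport(name = "inner", gp = gpar(cex = 0.5)))

# cex is cumulative, so the current value is the product of both settings.
cumulative_cex <- get.gpar("cex")$cex
print(cumulative_cex)

# Return to the top level, then navigate down by name.
upViewport(0)
depth <- downViewport("inner")   # number of levels descended
print(depth)
upViewport(depth)                # go back to where we started

invisible(dev.off())
```

Storing the value returned by downViewport() is what makes the round trip safe: the caller does not need to know how deeply the named viewport is nested.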
UTILITIES

    o R CMD check now checks for file names in a directory that differ only by case.

    o R CMD check now checks Rd files using R code from package tools, and gives refined diagnostics about "likely" Rd problems (stray top-level text which is silently discarded by Rdconv).

    o R CMD INSTALL now fails for packages with incomplete/invalid DESCRIPTION metadata, using new code from package tools which is also used by R CMD check.

    o list_files_with_exts (package tools) now handles zipped directories.

    o Package 'tools' now provides Rd_parse(), a simple top-level parser/analyzer for R documentation format.

    o tools::codoc() (and hence R CMD check) now checks any documentation for registered S3 methods and unexported objects in packages with namespaces.

    o Package 'utils' contains several new functions:

      - Generics toBibtex() and toLatex() for converting R objects to BibTeX and LaTeX (but almost no methods yet).

      - A much improved citation() function which also has a package argument. By default the citation is auto-generated from the package DESCRIPTION, the file 'inst/CITATION' can be used to override this, see help(citation) and help(citEntry).

      - sessionInfo() can be used to include version information about R and R packages in text or LaTeX documents.

DOCUMENTATION

    o The DVI and PDF manuals are now all made on the paper specified by R_PAPERSIZE (default 'a4'), even the .texi manuals which were made on US letter paper in previous versions.

    o The reference manual now omits 'internal' help pages.

    o There is a new help page shown by help("Memory-limits") which documents the current design limitations on large objects.

    o The format of the LaTeX version of the documentation has changed. The old format is still accepted, but only the new resolves cross-references to object names containing _, for example.

    o HTML help pages now contain a reference to the package and version in the footer, and HTML package index pages give their name and version at the top.
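The new 'utils' functions listed under UTILITIES above can be combined as in the following sketch. It assumes a current R installation, where more toLatex() and toBibtex() methods exist than at the time of 2.0.0; the variable names are illustrative.

```r
# Version information about R and attached packages.
info <- sessionInfo()
print(class(info))

# The same information as LaTeX source, ready to paste into a document.
tex <- toLatex(info)
print(tex)

# The citation for R itself, converted to a BibTeX entry.
cit <- citation()
bib <- toBibtex(cit)
print(bib)
```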
    o All manuals in the 2.x series have new ISBN numbers.

    o The 'R Data Import/Export' manual has been revised and has a new chapter on 'Reading Excel spreadsheets'.

C-LEVEL FACILITIES

    o The PACKAGE argument for .C/.Call/.Fortran/.External can be omitted if the call is within code within a package with a namespace. This ensures that the native routine being called is found in the DLL of the correct version of the package if multiple versions of a package are loaded in the R session. Using a namespace and omitting the PACKAGE argument is currently the only way to ensure that the correct version is used.

    o The header Rmath.h contains a definition for R_VERSION_STRING which can be used to track different versions of R and libRmath.

    o The Makefile in src/nmath/standalone now has 'install' and 'uninstall' targets -- see the README file in that directory.

    o More of the header files, including Rinternals.h, Rdefines.h and Rversion.h, are now suitable for calling directly from C++.

    o Configure looks for a suitable option for inlining C code, which is made available as macro R_INLINE: see 'Writing R Extensions' for further details.

DEPRECATED & DEFUNCT

    o Direct use of R INSTALL|REMOVE|BATCH|COMPILE|SHLIB has been removed: use R CMD instead.

    o La.eigen(), tetragamma(), pentagamma(), package.contents() and package.description() are defunct.

    o The undocumented function newestVersion() is no longer exported from package utils. (Mainly because it was not completely general.)

    o C-level entry point ptr_R_GetX11Image has been removed, as it was replaced by R_GetX11Image at 1.7.0.

    o The undocumented C-level entry point R_IsNaNorNA has been removed. It was used in a couple of packages, and should be replaced by a call to the documented macro ISNAN.

    o The gnome/GNOME graphics device is now defunct.
INSTALLATION CHANGES

    o Arithmetic supporting +/-Inf, NaNs and the IEC 60559 (aka IEEE 754) standard is now required -- the partial and often untested support for more limited arithmetic has been removed.

      The C99 macro isfinite is used in preference to finite if available (and its correct functioning is checked at configure time).

      Where isfinite or finite is available and works, it is used as the substitution value for R_FINITE. On some platforms this leads to a performance gain. (This applies to compiled code in packages only for isfinite.)

    o The dynamic libraries libR and libRlapack are now installed in R_HOME/lib rather than R_HOME/bin.

    o When --enable-R-shlib is specified, the R executable is now a small executable linked against libR: see the R-admin manual for further discussion. The 'extra' libraries bzip2, pcre, xdr and zlib are now compiled in a way that allows the code to be included in a shared library only if this option is specified, which might improve performance when it is not.

    o The main R executable is now R_HOME/exec/R not R_HOME/R.bin, to ease issues on MacOS X. (The location is needed when debugging core dumps, on other platforms.)

    o Configure now tests for 'inline' and alternatives, and the src/extra/bzip2 code now (potentially) uses inlining where available and not just under gcc.

    o The XPG4 sed is used on Solaris for forming dependencies, which should now be done correctly.

    o Makeinfo 4.5 or later is now required for building the HTML and Info versions of the manuals. However, binary distributions need to be made with 4.7 or later to ensure some of the links are correct.

    o f2c is not allowed on 64-bit platforms, as it uses longs for Fortran integers.

    o There are new options on how to make the PDF version of the reference manual -- see the 'R Administration and Installation Manual' section 2.2.

    o The concatenated Rd files in the installed 'man' directory are now compressed and the R CMD check routines can read the compressed files.
    o There is a new configure option --enable-linux-lfs that will build R with support for > 2Gb files on suitably recent 32-bit Linux systems.

PACKAGE INSTALLATION CHANGES

    o The DESCRIPTION file of packages may contain an 'Imports:' field for packages whose namespaces are used but do not need to be attached. Such packages should no longer be listed in 'Depends:'.

    o There are new optional fields 'SaveImage', 'LazyLoad' and 'LazyData' in the DESCRIPTION file. Using 'SaveImage' is preferred to using an empty file 'install.R'.

    o A package can contain a file 'R/sysdata.rda' to contain system datasets to be lazy-loaded into the namespace/package environment.

    o The packages listed in 'Depends' are now loaded before a package is loaded (or its image is saved or it is prepared for lazy loading). This means that almost all uses of R_PROFILE.R and install.R are now unnecessary.

    o If installation of any package in a bundle fails, R CMD INSTALL will back out the installation of all of the bundle, not just the failed package (on both Unix and Windows).

BUG FIXES

    o Complex superassignments were wrong when a variable with the same name existed locally, and were not documented in R-lang.

    o rbind.data.frame() dropped names/rownames from columns in all but the first data frame.

    o The dimnames<- method for data.frames was not checking the validity of the row names.

    o Various memory leaks reported by valgrind have been plugged.

    o gzcon() connections would sometimes read the crc bytes from the wrong place, possibly uninitialized memory.

    o Rd.sty contained a length \middle that was not needed after a revision in July 2000. It caused problems with LaTeX systems based on e-TeX which are starting to appear.

    o save() to a connection did not check that the connection was open for writing, nor that non-ascii saves cannot be made to a text-mode connection.

    o phyper() uses a new algorithm based on Morten Welinder's bug report (PR#6772).
      This leads to faster code for large arguments and more precise code, e.g. for phyper(59, 150,150, 60, lower=FALSE). This also fixes bug (PR#7064) about fisher.test().

    o print.default(*, gap = ) now in principle accepts all non-negative values.

    o smooth.spline(...)$pen.crit had a typo in its computation; note this was printed in print.smooth.spline(*) but not used in other "smooth.spline" methods.

    o write.table() handles zero-row and zero-column inputs correctly.

    o debug() works on trivial functions instead of crashing. (PR#6804)

    o eval() could alter a data.frame/list second argument, so with(trees, Girth[1] <- NA) altered 'trees' (and any copy of 'trees' too).

    o cor() could corrupt memory when the standard deviation was zero. (PR#7037)

    o inverse.gaussian() always printed 1/mu^2 as the link function.

    o constrOptim() now passes ... arguments through optim to the objective function.

    o object.size() now has a better estimate for character vectors: it was in general too low (but only significantly so for very short character strings) but over-estimated NA and duplicated elements.

    o quantile() now interpolates correctly between finite and infinite values (giving +/-Inf rather than NaN).

    o library() now gives more informative error messages mentioning the package being loaded.

    o Building the reference manual no longer uses roman upright quotes in typewriter output.

    o model.frame() no longer builds invalid data frames if the data contains time series and rows are omitted by na.action.

    o write.table() did not escape quotes in column names. (PR#7171)

    o Range checks missing in recursive assignments using [[ ]]. (PR#7196)

    o packageStatus() reported partially-installed bundles as installed.

    o apply() failed on an array of dimension >=3 when for each iteration the function returns a named vector of length >=2. (PR#7205)

    o The GNOME interface was in some circumstances failing if run from a menu -- it needed to always specify that R be interactive.
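Two of the fixes above, the eval()/with() entry and the quantile() interpolation entry, describe behaviour that is easy to check. The sketch below assumes a current R installation and only demonstrates the corrected behaviour; it cannot reproduce the original bugs.

```r
# Assigning inside with() must not modify the data set itself.
before <- trees$Girth[1]
with(trees, Girth[1] <- NA)
unchanged <- identical(trees$Girth[1], before)
print(unchanged)

# Interpolating between a finite and an infinite value gives Inf, not NaN.
q <- quantile(c(1, Inf), 0.5, names = FALSE)
print(q)
```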
    o depMtrxToStrings (part of pkgDepends) applied nrow() to a non-matrix and aborted on the result.

    o Fix some issues with nonsyntactical names in modelling code (PR#7202), relating to backquoting. There are likely more.

    o Support for S4 classes that extend basic classes has been fixed in several ways. as() methods and x@.Data should work better.

    o hist() and pretty() accept (and ignore) infinite values. (PR#7220)

    o It is no longer possible to call gzcon() more than once on a connection.

    o t.test() now detects nearly-constant input data. (PR#7225)

    o mle() had problems if ndeps or parscale was supplied in the control arguments for optim(). Also, the profiler is now more careful to reevaluate modified mle() calls in its parent environment.

    o Fix to rendering of accented superscripts and subscripts e.g., expression((b[dot(a)])). (Patch from Uwe Ligges.)

    o attach(*, pos=1) now gives a warning (and will give an error).

    o power.*test() now gives an error when 'sig.level' is outside [0,1]. (PR#7245)

    o Fitting a binomial glm with a matrix response lost the names of the response, which should have been transferred to the residuals and fitted values.

    o print.ts() could get the year wrong because of a rounding issue (PR#7255)

        **************************************************
        *                                                *
        *                1.9 SERIES NEWS                 *
        *                                                *
        **************************************************

                CHANGES IN R VERSION 1.9.1 Patched

INSTALLATION ISSUES

    o Installation will now work even in Norwegian and Danish locales which sort AA at the end (for package stats4 which has AAA.R).

BUG FIXES

    o Various memory leaks have been plugged and uses of strcpy() with overlapping src and dest corrected.

    o R CMD INSTALL now also works for /bin/sh's such as the one from Solaris 8 which fail when a function has the same name as a variable.

    o The Date method for trunc() failed.

    o window() failed if both start and end were outside the time range of the original series (possible if extend = TRUE).

    o coplot(..)
      doesn't give an extraneous warning anymore when called on a fresh device.

    o hasArg() used wrong logic to get the parent function. (sys.function() behaves differently from what is documented.)

    o prompt(f) now gives proper \usage{..} for

          f <- function(x, g = function(u) { v <- u^2 ; sin(v)/v }) { g(x) }

    o package.skeleton() now uses the supplied name in the DESCRIPTION file.

    o options(list('digits', 'scipen')) no longer seg.faults, the problem being the misuse of a list (PR#7078).

    o summary.Date() now has a more sensible default for 'digits'.

    o list.files(all.files = TRUE, recursive = TRUE) died on infinite recursion. (PR#7100)

    o cor(as.array(c(a=1,b=2)), cbind(1:2)) no longer seg.faults (PR#7116). cor(), cov() and var() no longer accidentally work with list() arguments as if they were unlist()ed.

    o as.matrix(data.frame(d=as.POSIXct("2004-07-20"))) doesn't give a wrong warning anymore.

    o gsub(perl=TRUE) code got the R length of return strings wrong in some circumstances (when the string was shortened). (Fixed also PR#7108)

    o summaryRprof() was ignoring functions whose name begins with dot, e.g. .C, .Call, .Fortran. (PR#7137)

    o loglin() could segfault if 'start' was of the wrong length. (PR#7123)

    o model.tables(type="means") could fail in a design where a projection gave all zeros. (PR#7132)

    o Applying attributes() to a pairlist, e.g. .Options, could segfault.

    o The checking of R versions incorrectly assumed 1.9.1 >= 1.50.

    o str(Surv(..)) failed for type = "counting" Surv objects and for promises.

    o approx(c(1,2),c(NA,NA),1.5,rule=2) does not segfault anymore (PR#7177), but gives an error.

    o nls(model = TRUE) was broken.

    o Subsetted assignments of the form A[i1, i2, i3] <- B stopped as soon as an NA was encountered in an index so subsequent non-NA indices were ignored. (PR#7210)

    o Fixed bug in handling of lwd=NA in contour().

    o is.na() was returning undefined results on nested lists.
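The version-checking entry above (1.9.1 wrongly treated as >= 1.50) is a reminder that version strings compare component by component, not as decimal numbers. The sketch below assumes a current R installation; the numeric comparison at the end is only one hypothetical way such a mistake can arise, since the NEWS entry does not say how the original check was written.

```r
# compareVersion() returns -1, 0 or 1 after comparing each component
# numerically: 1 == 1, then 9 < 50.
cmp <- compareVersion("1.9.1", "1.50")
print(cmp)

# package_version objects support the usual comparison operators.
older <- package_version("1.9.1") < package_version("1.50")
print(older)

# Hypothetical pitfall: reading the leading "major.minor" part as a decimal
# number reverses the ordering, because 1.9 > 1.5.
naive <- as.numeric("1.9") >= as.numeric("1.50")
print(naive)
```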
CHANGES IN R VERSION 1.9.1 NEW FEATURES o as.Date() now has a method for "POSIXlt" objects. o mean() has a method for "difftime" objects and so summary() works for such objects. o legend() has a new argument 'pt.cex'. o plot.ts() has more arguments, particularly 'yax.flip'. o heatmap() has a new 'keep.dendro' argument. o The default barplot method now handles vectors and 1-d arrays (e.g., obtained by table()) the same, and uses grey instead of heat color palettes in these cases. (Also fixes PR#6776.) o nls() now looks for variables and functions in its formula in the environment of the formula before the search path, in the same way lm() etc look for variables in their formulae. INSTALLATION ISSUES o src/modules/X11/dataentry.c would not build on some XFree 4.4.0 systems. (This is a bug in their header files but we have added a workaround.) o Building with gcc/g77 3.4.0 on ix86 platforms failed to produce a working build: the critical LAPACK routines are now compiled with -ffloat-store. o Added patches to enable 64-bit builds on AIX 5.1: see the R-admin manual for details. o Added some patches to allow non-IEEE-754 installations to work reasonably well. (Infs and NAs are still not handled properly in complex arithmetic and functions such as sin(). See also Deprecated, as support for non-IEEE-754 installations is about to be removed.) o Installation will now work in Estonian (et_EE*) locales, which sort z before u. (PR#6958) DEPRECATED & DEFUNCT o Support for non-IEEE-754 arithmetic (which has been untested for some time) will be removed in the next full release. o Direct use of R INSTALL|REMOVE|BATCH|COMPILE|SHLIB is deprecated: use R CMD instead. o The gnome/GNOME graphics device is deprecated and will be removed in the next full release. BUG FIXES o pbinom(q, N, prob) is now more accurate when prob is close to 0. (PR#6757) o pcauchy(x, .., log.p) is now more accurate for large x, particularly when log.p = TRUE. 
(PR#6756) o pgeom(q, prob, lower.tail, log.p) is now (sometimes much) more accurate when prob is very small. (PR#6792) The code for pgeom(prob=1) assumed IEEE 754 arithmetic, and gave NaNs under gcc 3.4.0 -fPIC, for example. o makeARIMA() was not handling an ARMA(0, 0) model correctly. o as.Date() was failing on factors. (PR#6779) o min(), max() and range() were failing on "difftime" objects. o as.data.frame.list() could fail on some unusual list names. (PR#6782) o type.convert() ignored na.strings when no conversion was done. (PR#6781, not needed for its primary use in read.table.) o Fixed a clipping problem in the quartz() device. o Subsetting a factor swapped the order of the attributes, which identical() cares about. (PR#6799) o The L-BFGS-B option of optim() apparently needs part of its workspace zeroed. (PR#6720) o extractAIC.survreg() needed updating. o When using the header Rmath.h in standalone mode, the case where TRUE, FALSE are already defined is now handled correctly. o Package utils now exports several functions that are needed for writing Sweave drivers. o Comparison of two lists/expressions was giving nonsensical (and often random) answers, and is now an error. o The C-level function ncols was returning a random answer (often 0) for a 1D array. This caused model.matrix to misbehave (perhaps segfault) if a term was a 1D array. (PR#6838) o The configure script now finds the pdf viewers ggv and gpdf. o Workaround for the problems strptime on MacOS X has with dates before 1900. o 'R CMD build' works in a directory whose path contains spaces. (PR#6830 under Unix/Linux: it already worked under Windows.) Also 'R CMD check'. o mosaicplot() stops cleanly if given a table containing missing values. o install.packages() from a local CRAN was broken. o bxp() fixed for e.g., boxplot(..., border=2:4) o approx(list(x=rep(NaN,9), y=1:9), xout=NaN) does not seg.fault anymore (PR#6809). 
o plot(1, pch=NA) does not give an error anymore and plot(1:2, pch=c("o",NA)) only prints one symbol (PR#6876). o diffinv(matrix(3, 7,0)) now works. o plot.ts(z) for multivariate 'z' now properly draws all 'nc' xlabs when nc > 1 and obeys 'ann=FALSE' or 'axes=FALSE'. o aggregate(.data.frame) failed if the answer would have had one row. o recordPlot() and replayPlot() failed to duplicate the display list, so further plotting altered the saved or replayed object. o Assignments of the form adf[i,j] <- value now accept a data-frame value as well as a list value. o dir.create() sometimes erroneously continued to report a directory already existed after the first instance. (PR#6892) o arima.sim() allows a null model. o which.min() & which.max()'s C code now PROTECT()'s its result. o Building standalone nmath did not support some of the DEBUG options. o mle() got confused if start value list was not in same order as arguments of likelihood function (reported by Ben Bolker) o backsolve(r, x, k) now allows k < nrow(x) - as its documentation always claimed. o update.packages("mgcv") and old.packages(*) now give a better error message; and installed.packages("mgcv") properly returns . o stats:::as.dendrogram.hclust() is documented and no longer re-sorts the two children at each node. This fixes as.dendrogram(hh) for the case where hh is a "reordered" hclust object. plot.dendrogram(x) now draws leaves 'x' more sensibly. reorder.dendrogram() now results in a dendrogram with correct "midpoint"s, and hence reordered dendrograms are plotted correctly. stats:::midcache.dendrogram() and hence the reorder() and rev() dendrogram methods do not return bloated dendrograms. o heatmap(*, labRow=., labCol=.) now also reorders the labels when specified---not only when using default labels. o Copying lattice (grid) output to another device now works again (There were intermittent problems in 1.9.0 - PR#6915, #6947/8.) 
o hist() uses a more robust choice of its 'diddle' factor, used to detect if an observation is on a bin boundary. (PR#6931) o jitter(x) now returns x when length(x) == 0. o Under some rare circumstances the locale-specific tables used by the perl=TRUE option to grep() etc were being corrupted and so matches were missed. o qbinom(*, prob = 0, lower.tail = FALSE) now properly gives 0. (PR#6972) o Class "octmode" needed a "[" method to preserve the class: see example(file.info) for an example. CHANGES IN R VERSION 1.9.0 USER-VISIBLE CHANGES o Underscore '_' is now allowed in syntactically valid names, and make.names() no longer changes underscores. Very old code that makes use of underscore for assignment may now give confusing error messages. o Package 'base' has been split into packages 'base', 'graphics', 'stats' and 'utils'. All four are loaded in a default installation, but the separation allows a 'lean and mean' version of R to be used for tasks such as building indices. Packages ctest, eda, modreg, mva, nls, stepfun and ts have been merged into stats, and lqs has been returned to MASS. In all cases a stub has been left that will issue a warning and ensure that the appropriate new home is loaded. All the time series datasets have been moved to package stats. Sweave has been moved to utils. Package mle has been moved to stats4 which will become the central place for statistical S4 classes and methods distributed with base R. Package mle remains as a stub. Users may notice that code in .Rprofile is run with only the new base loaded and so functions may now not be found. For example, ps.options(horizontal = TRUE) should be preceded by library(graphics) or called as graphics::ps.options or, better, set as a hook -- see ?setHook. o There has been a concerted effort to speed up the startup of an R session: it now takes about 2/3rds of the time of 1.8.1. o A warning is issued at startup in a UTF-8 locale, as currently R only supports single-byte encodings. 
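The underscore change in the user-visible list above can be illustrated with a short sketch, assuming a current R. The names used here (total_count and the sample strings) are made up for the example.

```r
# Underscores are kept by make.names(); other invalid characters are
# replaced by dots, and a leading digit gets an "X" prefix.
nm <- make.names(c("a_b", "my var", "1x"))
print(nm)   # "a_b" "my.var" "X1x"

# An underscore is an ordinary name character, not an assignment operator.
total_count <- 3
print(total_count + 1)   # 4
```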
NEW FEATURES o $, $<-, [[, [[<- can be applied to environments. Only character arguments are allowed and no partial matching is done. The semantics are basically that of get/assign to the environment with inherits=FALSE. o There are now print() and [ methods for "acf" objects. o aov() will now handle singular Error() models, with a warning. o arima() allows models with no free parameters to be fitted (to find log-likelihood and AIC values, thanks to Rob Hyndman). o array() and matrix() now allow 0-length 'data' arguments for compatibility with S. o as.data.frame() now has a method for arrays. o as.matrix.data.frame() now coerces an all-logical data frame to a logical matrix. o New function assignInNamespace() parallelling fixInNamespace. o There is a new function contourLines() to produce contour lines (but not draw anything). This makes the CRAN package clines (with its clines() function) redundant. o D(), deriv(), etc now also differentiate asin(), acos(), atan() (thanks to a contribution of Kasper Kristensen). o The 'package' argument to data() is no longer allowed to be a (unquoted) name and so can be a variable name or a quoted character string. o There is a new class "Date" to represent dates (without times) plus many utility functions similar to those for date-times. See ?Date. o Deparsing (including using dump() and dput()) an integer vector now wraps it in as.integer() so it will be source()d correctly. (Related to PR#4361.) o .Deprecated() has a new argument 'package' which is used in the warning message for non-base packages. o The print() method for "difftime" objects now handles arrays. o dir.create() is now an internal function (rather than a call to mkdir) on Unix as well as on Windows. There is now an option to suppress warnings from mkdir, which may or may not have been wanted. o dist() has a new method to calculate Minkowski distances.
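As a rough illustration of the Minkowski method for dist() mentioned just above, the sketch below uses the 'method' and 'p' arguments as they exist in current R. The two sample points are chosen so the results are easy to check by hand: p = 1 reduces to the Manhattan distance and p = 2 to the Euclidean distance.

```r
# Two points whose coordinate differences are 3 and 4.
pts <- rbind(c(0, 0), c(3, 4))

d1 <- as.numeric(dist(pts, method = "minkowski", p = 1))  # |3| + |4|
d2 <- as.numeric(dist(pts, method = "minkowski", p = 2))  # sqrt(3^2 + 4^2)

print(d1)   # 7
print(d2)   # 5

# The same values via the dedicated methods.
print(as.numeric(dist(pts, method = "manhattan")))
print(as.numeric(dist(pts, method = "euclidean")))
```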
o expand.grid() returns appropriate array dimensions and dimnames in the attribute "out.attrs", and this is used by the predict() method for loess to return a suitable array. o factanal(), loess() and princomp() now explicitly check for numerical inputs; they might have silently coded factor variables in formulae. o New functions factorial(x) defined as gamma(x+1) and, for S-PLUS compatibility, lfactorial(x) defined as lgamma(x+1). o findInterval(x, v) now allows +/-Inf values, and NAs in x. o formula.default() now looks for a "terms" component before a 'formula' argument in the saved call: the component will have '.' expanded and probably will have the original environment set as its environment. And what it does is now documented. o glm() arguments 'etastart' and 'mustart' are now evaluated via the model frame in the same way as 'subset' and 'weights'. o Functions grep(), regexpr(), sub() and gsub() now coerce their arguments to character, rather than give an error. The perl=TRUE argument now uses character tables prepared for the locale currently in use each time it is used, rather than those of the C locale. o New functions head() and tail() in package 'utils'. (Based on a contribution by Patrick Burns.) o legend() has a new argument 'text.col'. o methods(class=) now checks for a matching generic, and so no longer returns methods for non-visible generics (and eliminates various mismatches). o A new function mget() will retrieve multiple values from an environment. o model.frame() methods, for example those for "lm" and "glm", pass relevant parts of ... onto the default method. (This has long been documented but not done.) The default method is now able to cope with model classes such as "lqs" and "ppr". o nls() and ppr() have a 'model' argument to allow the model frame to be returned as part of the fitted object. o "POSIXct" objects can now have a "tzone" attribute that determines how they will be converted and printed.
This means that date-time objects which have a timezone specified will generally be regarded as in their original time zone. o postscript() device output has been modified to work around rounding errors in low-precision calculations in gs >= 8.11. (PR#5285, which is not a bug in R.) It is now documented how to use other Computer Modern fonts, for example italic rather than slanted. o ppr() now fully supports categorical explanatory variables. ppr() is now interruptible at suitable places in the underlying FORTRAN code. o princomp() now warns if both 'x' and 'covmat' are supplied, and returns scores only if the centring used is known. o psigamma(x, deriv=0), a new function, generalizes digamma() etc. All these (psigamma, digamma, trigamma,...) now also work for x < 0. o pchisq(*, ncp > 0) and hence qchisq() now work with much higher values of ncp; it has become much more accurate in the left tail. o read.table() now allows embedded newlines in quoted fields. (PR#4555) o rep.default(0-length-vector, length.out=n) now gives a vector of length n and not length 0, for compatibility with S. If both 'each' and 'length.out' have been specified, it now recycles rather than fills with NAs for S compatibility. If both 'times' and 'length.out' have been specified, 'times' is now ignored for S compatibility. (Previously padding with NAs was used.) The "POSIXct" and "POSIXlt" methods for rep() now pass ... on to the default method (as expected by PR#5818). o rgb2hsv() is new, an R interface to the C API function with the same name. o User hooks can be set for onLoad, library, detach and onUnload of packages/namespaces: see ?setHook. o save() default arguments can now be set using option "save.defaults", which is also used by save.image() if option "save.image.defaults" is not present. o New function shQuote() to quote strings to be passed to OS shells. o sink() now has a split= argument to direct output to both the sink and the current output connection.
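The split= argument of sink() described in the entry just above can be tried with a temporary file. This is only a sketch, assuming a current R session with no other sink active. The file name comes from tempfile() and the variable names are illustrative.

```r
tmp <- tempfile(fileext = ".txt")

sink(tmp, split = TRUE)   # write to the file and keep showing output
print(1:3)
sink()                    # remove the sink again

# The printed line was also captured in the file.
captured <- readLines(tmp)
print(captured)

unlink(tmp)               # tidy up the temporary file
```

With split = FALSE (the default) the printed vector would go to the file only.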
o split.screen() now works for multiple devices at once. o On some OSes (including Windows and those using glibc) strptime() did not validate dates correctly, so we have added extra code to do so. However, this cannot correct scanning errors in the OS's strptime (although we have been able to work around these on Windows). Some examples are now tested for during configuration. o strsplit() now has 'fixed' and 'perl' arguments and split="" is optimized. o subset() now allows a 'drop' argument which is passed on to the indexing method for data frames. o termplot() has an option to smooth the partial residuals. o varimax() and promax() add class "loadings" to their loadings component. o Model fits now add a "dataClasses" attribute to the terms, which can be used to check that the variables supplied for prediction are of the same type as those used for fitting. (It is currently used by predict() methods for classes "lm", "mlm", "glm" and "ppr", as well as methods in packages MASS, rpart and tree.) o New command-line argument --max-ppsize allows the size of the pointer protection stack to be set higher than the previous limit of 10000. o The fonts on an X11() device (also jpeg() and png() on Unix) can be specified by a new argument 'fonts' defaulting to the value of a new option "X11fonts". o New functions in the tools package: pkgDepends, getDepList and installFoundDepends. These provide functionality for assessing dependencies and the availability of them (either locally or from on-line repositories). o The parsed contents of a NAMESPACE file are now stored at installation and if available used to speed loading the package, so packages with namespaces should be reinstalled. o Argument 'asp' although not a graphics parameter is accepted in the ... of graphics functions without a warning. It now works as expected in contour(). o Package stats4 exports S4 generics for AIC() and BIC().
o The Mac OS X version now produces an R framework for easier linking of R into other programs. As a result, R.app is now relocatable. o Added experimental support for conditionals in NAMESPACE files. o Added as.list.environment to coerce environments to lists (efficiently). o New function addmargins() in the stats package to add marginal summaries to tables, e.g. row and column totals. (Based on a contribution by Bendix Carstensen.) o dendrogram edge and node labels can now be expressions (to be plotted via stats:::plotNode called from plot.dendrogram). The diamond frames around edge labels are more nicely scaled horizontally. o Methods defined in the methods package can now include default expressions for arguments. If these arguments are missing in the call, the defaults in the selected method will override a default in the generic. See ?setMethod. o Changes to package 'grid': - Renamed push/pop.viewport() to push/popViewport(). - Added upViewport(), downViewport(), and seekViewport() to allow creation and navigation of viewport tree (rather than just viewport stack). - Added id and id.lengths arguments to grid.polygon() to allow multiple polygons within single grid.polygon() call. - Added vpList(), vpStack(), vpTree(), and current.vpTree() to allow creation of viewport "bundles" that may be pushed at once (lists are pushed in parallel, stacks in series). current.vpTree() returns the current viewport tree. - Added vpPath() to allow specification of viewport path in downViewport() and seekViewport(). See ?viewports for an example of its use. NOTE: it is also possible to specify a path directly, e.g., something like "vp1::vp2", but this is only advised for interactive use (in case I decide to change the separator :: in later versions). - Added "just" argument to grid.layout() to allow justification of layout relative to parent viewport *IF* the layout is not the same size as the viewport. There's an example in help(grid.layout).
- Allowed the "vp" slot in a grob to be a viewport name or a vpPath. The interpretation of these new alternatives is to call downViewport() with the name or vpPath before drawing the grob and upViewport() the appropriate amount after drawing the grob. Here's an example of the possible usage:
    pushViewport(viewport(w=.5, h=.5, name="A"))
    grid.rect()
    pushViewport(viewport(w=.5, h=.5, name="B"))
    grid.rect(gp=gpar(col="grey"))
    upViewport(2)
    grid.rect(vp="A", gp=gpar(fill="red"))
    grid.rect(vp=vpPath("A", "B"), gp=gpar(fill="blue"))
- Added engine.display.list() function. This allows the user to tell grid NOT to use the graphics engine display list and to handle ALL redraws using its own display list (including redraws after device resizes and copies). This provides a way to avoid some of the problems with resizing a device when you have used grid.convert(), or the gridBase package, or even base functions such as legend(). There is a document discussing the use of display lists in grid on the grid web site (http://www.stat.auckland.ac.nz/~paul/grid/grid.html) - Changed the implementation of grob objects. They are no longer implemented as external references. They are now regular R objects which copy-by-value. This means that they can be saved/loaded like normal R objects. In order to retain some existing grob behaviour, the following changes were necessary: + grobs all now have a "name" slot. The grob name is used to uniquely identify a "drawn" grob (i.e., a grob on the display list). + grid.edit() and grid.pack() now take a grob name as the first argument instead of a grob. (Actually, they take a gPath - see below) + the "grobwidth" and "grobheight" units take either a grob OR a grob name (actually a gPath - see below). Only in the latter case will the unit be updated if the grob "pointed to" is modified. In addition, the following features are now possible with grobs: + grobs now save()/load() like any normal R object.
+ many grid.*() functions now have a *Grob() counterpart. The grid.*() version is used for its side-effect of drawing something or modifying something which has been drawn; the *Grob() version is used for its return value, which is a grob. This makes it more convenient to just work with grob objects without producing any graphical output (by using the *Grob() functions). + there is a gTree object (derived from grob), which is a grob that can have children. A gTree also has a "childrenvp" slot which is a viewport which is pushed and then "up"ed before the children are drawn; this allows the children of a gTree to place themselves somewhere in the viewports specified in the childrenvp by having a vpPath in their vp slot. + there is a gPath object, which is essentially a concatenation of grob names. This is used to specify the child of (a child of ...) a gTree. + there is a new API for creating/accessing/modifying grob objects: grid.add(), grid.remove(), grid.edit(), grid.get() (and their *Grob() counterparts) can be used to add, remove, edit, or extract a grob or the child of a gTree. NOTE: the new grid.edit() API is incompatible with the previous version. - Added stringWidth(), stringHeight(), grobWidth(), and grobHeight() convenience functions (they produce "strwidth", "strheight", "grobwidth", and "grobheight" unit objects, respectively). - Allowed viewports to turn off clipping altogether. Possible settings for viewport clip arg are now: "on" = clip to the viewport (was TRUE); "inherit" = clip to whatever parent says (was FALSE); "off" = turn off clipping. Logical values are still accepted (and NA maps to "off"). UTILITIES o R CMD check now runs the (Rd) examples with default RNGkind (uniform & normal) and set.seed(1). example(*, setRNG = TRUE) does the same. o undoc() in package 'tools' has a new default of 'use.values = NULL' which produces a warning whenever the default values of function arguments differ between documentation and code.
Note that this affects "R CMD check" as well. o Testing examples via massage-examples.pl (as used by R CMD check) now restores the search path after every help file. o checkS3methods() in package 'tools' now also looks for generics in the loaded namespaces/packages listed in the Depends fields of the package's DESCRIPTION file when testing an installed package. o The DESCRIPTION file of packages may contain a 'Suggests:' field for packages that are only used in examples or vignettes. o Added an option to package.dependencies() to handle the 'Suggests' levels of dependencies. o Vignette dependencies can now be checked and obtained via vignetteDepends. o Option 'repositories' to list URLs for package repositories added. o package.description() has been replaced by packageDescription(). o R CMD INSTALL/build now skip Subversion's .svn directories as well as CVS directories. C-LEVEL FACILITIES o arraySubscript and vectorSubscript take a new argument which is a function pointer that provides access to character strings (such as the names vector) rather than assuming these are passed in. o R_CheckUserInterrupt is now described in 'Writing R Extensions' and there is a new equivalent subroutine rchkusr for calling from FORTRAN code. o hsv2rgb and rgb2hsv are newly in the C API. o Salloc and Srealloc are provided in S.h as wrappers for S_alloc and S_realloc, since current S versions use these forms. o The type used for vector lengths is now R_len_t rather than int, to allow for a future change. o The internal header nmath/dpq.h has slightly improved macros R_DT_val() and R_DT_Cval(), a new R_D_LExp() and improved R_DT_log() and R_DT_Clog(); this improves accuracy in several [dpq]-functions {for "extreme" arguments}. DEPRECATED & DEFUNCT o print.coefmat() is defunct, replaced by printCoefmat(). o codes() and codes<-() are defunct. o anovalist.lm (replaced in 1.2.0) is now defunct. o glm.fit.null(), lm.fit.null() and lm.wfit.null() are defunct. o print.atomic() is defunct.
o The command-line arguments --nsize and --vsize are no longer recognized as synonyms for --min-nsize and --min-vsize (which replaced them in 1.2.0). o Unnecessary methods coef.{g}lm and fitted.{g}lm have been removed: they were each identical to the default method. o La.eigen() is deprecated now eigen() uses LAPACK by default. o tetragamma() and pentagamma() are deprecated, since they are equivalent to psigamma(, deriv=2) and psigamma(, deriv=3). o LTRUE/LFALSE in Rmath.h have been removed: they were deprecated in 1.2.0. o package.contents() and package.description() have been deprecated. INSTALLATION CHANGES o The defaults for configure are now --without-zlib --without-bzlib --without-pcre. The included PCRE sources have been updated to version 4.5 and PCRE >= 4.0 is now required if --with-pcre is used. The included zlib sources have been updated to 1.2.1, and this is now required if --with-zlib is used. o configure no longer lists bzip2 and PCRE as 'additional capabilities' as all builds of R have had them since 1.7.0. o --with-blas=goto to use K. Goto's optimized BLAS will now work. BUG FIXES o When lm.{w}fit() disregarded arguments in ... they reported the values and not the names. o lm(singular.ok = FALSE) was looking for 0 rank, not rank < p. o The substitution code for strptime in the sources no longer follows glibc in silently 'correcting' invalid inputs. o The cor() function did not remove missing values in the non-Pearson case. o [l]choose() use a more accurate formula which also slightly improves p- and qhyper(); choose(n, k) now returns 0 instead of NaN for k < 0 or > n. o find(simple.words=TRUE) (the default) was still using regular expressions for e.g. "+" and "*". Also, it checked the mode only of the first object matching a regular expression found in a package. o Memory leaks in [dpq]wilcox and [dqr]signrank have been plugged. These only occurred when multiple values of m or n > 50 were used in a single call. (PR#5314, plus another potential leak.)
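The choose() entry in the bug-fix list above (0 rather than NaN when k is out of range) can be checked with a few calls. A minimal sketch, assuming a current R. The variable names are illustrative.

```r
inside <- choose(5, 2)    # ordinary case
below  <- choose(5, -1)   # k < 0
above  <- choose(5, 6)    # k > n

print(inside)   # 10
print(below)    # 0
print(above)    # 0
```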
o Non-finite input values to eigen(), La.eigen(), svd() and La.svd() are now errors: they often caused infinite looping. (PR#5406, PR#4366, PR#3723: the fix for 3723/4366 returned a vector of NAs, not a matrix, for the eigenvectors.) o stepfun(x,y) now gives an error when 'x' has length 0 instead of an invalid result (that could lead to a segmentation fault). o buildVignettes() uses file.remove() instead of unlink() to remove temporary files. o methods(class = "lqs") does not produce extraneous entries anymore. o Directly calling a method that uses NextMethod() no longer produces the erroneous error message 'function is not a closure'. o chisq.test(x, simulate.p.value = TRUE) could hang in an infinite loop or segfault, as r2dtable() did, when the entries in x were large. (PR#5701) o fisher.test(x) could give a P-value of 'Inf' in similar cases which now result in an error (PR#4688). It silently truncated non-integer 'x' instead of rounding. o cutree(a, h=h) silently gave wrong results when 'a' was an agnes object; now gives an error and reminds of as.hclust(). o postscript() could crash if given a font value outside the valid range 1...5. o qchisq(1-e, .., ncp=.) did not terminate for small e. (PR#6421 (PR#875)) o contrasts() turns a logical variable into a factor. This now always has levels c("FALSE", "TRUE") even if only one (or none) of these occur in the variable. o model.frame()'s lm and glm methods had 'data' and 'na.action' arguments which they ignored and have been removed. o The defaults data=list() in lm() and glm() could never be used and have been removed. glm had na.action=na.fail, again never used. o The internal tools function for listing all internal S3 generics was omitting all the members of the S3 group generics, which also accept methods for members. o Some BLASes were returning NA %*% 0 as 0 and some as NA. Now slower but more careful code is used if NAs are present.
(PR#4582) o package.skeleton() no longer generates invalid filenames for code and help files. Also, care is taken not to generate filenames that differ only by case. o pairs() now respects axis graphical parameters such as cex.main, font.main and las. o Saving images of packages with namespaces (such as mle) was not compressing the image. o When formula.default() returned a terms object, it returned a result of class c("terms", "formula") with different subsetting rules from an object of class "formula". o The standalone Rmath library did not build correctly on systems with inaccurate log1p. o Specifying asp is now respected in calls like plot(1, 10, asp=1) with zero range on both axes. o outer() called rep() with an argument the generic does not have, and discarded the class of the answer. o object.size() now returns a real (not integer) answer and so can cope with objects occupying more than 2Gb. o Lookups base:: and ::: were not confining their search to the named package/namespace. o qbinom() was returning NaN for prob = 0 or 1 or size = 0 even though the result is well-defined. (In part, PR#5900.) o par(mgp)[2] was being interpreted as relative to par(mgp)[3]. (PR#6045) o Versioned install was broken both with and without namespaces: no R code was loaded. o methods(), getS3method() and the registration of S3 methods in namespaces were broken if the S3 generic was converted into an S4 generic by setting an S4 method. o Title and copyright holder of the reference manual are now in sync with the citation() command. o The validation code for POSIXlt dates and hence seq(, by="DSTdays") now works for large mday values (not just those in -1000...1000). (PR#6212) o The print() method for data frames now copes with data frames containing arrays (other than matrices). o texi2dvi() and buildVignettes() use clean=FALSE as default because the option is not supported on some Solaris machines. 
For buildVignettes() this makes no difference as it uses an internal cleanup mechanism. o The biplot() method for "prcomp" was not registered nor exported. (PR#6425) o Latex conversion of .Rd files was missing newline before \end{Section} etc which occasionally gave problems, as fixed for some other \end{Foo} in 1.8.1. (PR#5645) o Work around a glibc bug to make the %Z format usable in strftime(). o The glm method for rstandard() was wrongly scaled for cases where summary(model)$dispersion != 1. o Calling princomp() with a covariance matrix (rather than a list) failed to predict scores rather than predict NA as intended. (PR#6452) o termplot() is more tolerant of variables not in the data= argument. (PR#6327) o isoreg() could segfault on monotone input sequences. (PR#6494) o Rdconv detected \link{\url{}} only very slowly. (PR#6496) o aov() with Error() term and no intercept incorrectly assigned terms to strata. (PR#6510) o ftable() incorrectly handled arguments named "x". (PR#6541) o vector(), matrix(), array() and their internal equivalents report correctly that the number of elements specified was too large (rather than reporting it as negative). o Minor copy-paste error in example(names). (PR#6594) o length<-() now works correctly on factors (and is now generic with a method for factors). o x <- 2^32; x:(x+3) no longer generates an error (but gives a result of type "double"). o pgamma(30, 100, lower=FALSE, log=TRUE) is not quite 0, now. pgamma(x, alph) now only uses a normal approximation for alph > 1e5 instead of alph > 1000. This also improves the accuracy of ppois(). o qgamma() now does one or more final Newton steps, increasing accuracy from around 2e-8 to 3e-16 in some cases. (PR#2214). It allows values p close to 1 not returning Inf, with accuracy for 'lower=FALSE', and values close to 0 not returning 0 for 'log=TRUE'. These also apply to qchisq(), e.g., qchisq(1e-13, 4, lower=FALSE) is now finite and qchisq(1e-101, 1) is positive. 
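The two qchisq() calls quoted in the entry just above can be run directly, together with a round trip through pchisq() as a sanity check. This is a sketch only, assuming a current R. The variable names and the 1e-6 tolerance are choices made for the illustration, not values taken from the NEWS text.

```r
# Extreme upper-tail quantile: should be finite rather than Inf.
q_upper <- qchisq(1e-13, 4, lower.tail = FALSE)

# Extreme lower-tail quantile: should be positive rather than 0.
q_lower <- qchisq(1e-101, 1)

# Round trip: the upper-tail probability at q_upper should be about 1e-13.
p_back  <- pchisq(q_upper, 4, lower.tail = FALSE)
rel_err <- abs(p_back / 1e-13 - 1)

print(is.finite(q_upper))
print(q_lower > 0)
print(rel_err < 1e-6)
```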
o gamma(-n) now gives NaN for all negative integers -n. o The Unix version of browseURL() now protects the URL from the shell, for example allowing & and $ to occur in the URL. It was incorrectly attempting to use -remote "openURL()" for unknown browsers. o extractAIC.coxph() works around an inconsistency in the $loglik output from coxph. (PR#6646) o stem() was running into integer overflows with nearly-constant inputs, and scaling badly for constant ones. (Partly PR#6645) o system() under Unix was losing the 8095th char if the output was split. (PR#6624) o plot.lm() gave incorrect results if there were zero weights. (PR#6640) o Binary operators warned for inconsistent lengths on vector op vector operations, but not on vector op matrix ones. (PR#6633 and more.) Comparison operators did not warn about inconsistent lengths for real vectors, but did for integer, logical and character vectors. o spec.pgram(x, ..., pad, fast, ...) computed the periodogram with a bias (downward) whenever 'pad > 0' (non-default) or 'fast = TRUE' (default) and nextn(n) > n where n = length(x); similarly for 'df' (approximate degrees of freedom for chisq). o dgamma(0, a) now gives Inf for a < 1 (instead of NaN), and so does dchisq(0, 2*a, ncp). o pcauchy() is now correct in the extreme tails. o file.copy() did not check that any existing 'from' file had been truncated before appending the new contents. o The QC files now check that their file operations succeeded. o replicate() worked by making the supplied expression the body of an anonymous function(x), leading to a variable capture issue. Now, function(...) is used instead. o chisq.test(simulate.p.value = TRUE) was returning slightly incorrect p values, notably p = 0 when the data gave the most extreme value. o terms.formula(simplify = TRUE) was losing offset terms. Multiple offset terms were not being removed correctly if two of them appeared first or last in the formula.
(PR#6656) o Rd conversion to latex did not add a new line before \end{Section} in more cases than were corrected in 1.8.1. o split.default() dropped NA levels in its internal code but returned them as NA in all components in the interpreted code for factors. (PR#6672) o points.formula() had problems if there was a subset argument and no data argument. (PR#6652) o as.dist() does a bit more checking of its first argument and now warns when applied to non-square matrices. o mle() gives a more understandable error message when its 'start' argument is not ok. o All uses of dir.create() check the return value. download.packages() checks that destdir exists and is a directory. o Methods dispatch corrects an error that failed to find methods for classes that extend sealed classes (class unions that contain basic classes, e.g.). o Sweave no longer wraps the output of code chunks with echo=false and results=tex in Schunk environments. o termplot() handles models with missing data better, especially with na.action=na.exclude. o 1:2 * 1e-100 now prints with correct number of spaces. o Negative subscripts that were out of range or NA were not handled correctly. Mixing negative and NA subscripts is now caught as an error: it was not caught on some platforms and segfaulted on others. o gzfile() connections had trouble at EOF when used on uncompressed file. o The Unix version of dataentry segfaulted if the Copy' button was used. (PR#6605) o unlist on lists containing expressions now works (PR#5628) o D(), deriv() and deriv3() now also can deal with gamma and lgamma. o The X11 module can now be built against XFree86 4.4.0 headers (still with some warnings). o seq.POSIXt(from, to, by="DSTdays") was shorter than expected for rare times in the UK time zone. (PR#4558) o c/rbind() did not support vectors/matrices of mode "list". (PR#6702) o summary() methods for POSIX[cl]t and Date classes coerced the number of NAs to a date on printing. 
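The D()/deriv() entry a few items above (support for gamma and lgamma) can be illustrated with a short sketch. The evaluation point 2.5 is an arbitrary choice, and the comparison relies on the standard identities d/dx lgamma(x) = digamma(x) and d/dx gamma(x) = gamma(x) * digamma(x).

```r
# Symbolic derivatives of expressions involving lgamma() and gamma().
d_lgamma <- D(quote(lgamma(x)), "x")
d_gamma  <- D(quote(gamma(x)), "x")

# Evaluate both derivatives at an arbitrary point and compare them
# with the known closed forms.
x <- 2.5
stopifnot(
  isTRUE(all.equal(eval(d_lgamma), digamma(x))),
  isTRUE(all.equal(eval(d_gamma), gamma(x) * digamma(x)))
)
```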
o KalmanSmooth would sometimes return NA values with NA inputs. (PR#6738) o fligner.test() worked correctly only if data were already sorted by group levels. (PR#6739) ************************************************** * * * 1.8 SERIES NEWS * * * ************************************************** CHANGES IN R VERSION 1.8.1 NEW FEATURES o There is now a "Complex" S3 group generic (a side-effect of fixing up the corresponding S4 group generic). o help("regex") now gives a description of the regular expressions used in R. o The startup message now shows the R Foundation as copyright holder, and includes the R ISBN number and a pointer to the new citation() function. o The solve() function now uses the tol' argument for all non-complex cases. The default tolerance for LINPACK is 1e-7, as before. For LAPACK it currently is .Machine$double.eps but may be changed in later versions of R. o help.search() now defaults to agrep = FALSE when keyword= is specified, since no one wants fuzzy matching of categories. o Function texi2dvi() in package tools can be used to compile latex files from within R, provided the OS has a command texi2dvi or texify. o Objects with formal S4 classes saved in pre-1.8 versions and loaded into the current version have incompatible class attributes (no package information). A new function, fixPre1.8() in package methods, will fix the class attributes. See the help for this function. o heatmap() allows Rowv/Colv = NA, suppressing the corresponding dendrogram. o An "antifeature": Tcl 8.0 is now officially unsupported. In 1.8.0 it just didn't work. This very old version lacks several features that are needed for the new version of the tcltk package. R will still build the tcltk package against Tcl 8.0 but the resulting package will not load. BUG FIXES o symnum(x) now behaves as documented when length(x) == 0 and uses lower.triangular = FALSE for logical arrays. o c() now has a method for "noquote" objects and hence works as expected. 
o split(1:10, c(1,2)) no longer gives a spurious warning. o The "Complex" S4 group generic now works. o abbreviate() doesn't get into infinite loops on input that differs only by leading/trailing space o Added check for user interrupt in Rvprintf to allow printing to be interrupted. o Fixed bug that would cause segfault on protect stack overflow. o crossprod() on matrices with zero extents would return an uninitialized matrix (rather than one filled with zeros). o DF[[i,j]] for a data frame used row j and col i, not as intended row i and col j. o Even more user errors in recursive indexing of lists are now caught. (PR#4486) o cor(, use = "pairwise") gave wrong result in 1.8.0 (only). (PR#4646) o merge.data.frame() could give incorrect names when one of the arguments had only one column. (PR#4299) o Subsetting a one-dimensional array dropped dimensions even when they were not of length one. (Related to PR#4110) o The plot() method for ecdf' objects, plot.ecdf(), now allows to set a ylab' argument (different from the default). o cor.test(*, method="spearman") gave wrong results randomly' (because of a wrong type passed to C; PR#4718). o dist() objects with NA's didn't print these, now do. (PR#4866). o regexpr(fixed = TRUE) returned 0-based indices. o df[, length_1_index] <- value did not recycle short rhs. (PR#4820) o median() no longer works' for odd-length factor variables. o packageStatus() is more robust to failing to contact a repository, and contacts the correct paths in the repositories under Windows. o .setOldIs (methods) contained a typo stopping POSIXct objects (etc) being included in formal classes. o terms() sometimes removed offset() terms incorrectly, since it counted by variables and not terms. Its "offset" attribute was incorrectly documented as referring to terms not variables. (Related to PR#4941) o buildVignettes() and pkgVignettes() in package tools are now documented. 
The call to texi2dvi is wrapped in the new function texi2dvi() which also works on Windows. o hclust() was sometimes not finding the correct inter-cluster distances with non-monotone methods. (PR#4195) o plot.hclust() now tolerates mode changes on dumped objects. (PR#4361) o prompt() no longer insists files are in the current directory. (PR#4978) o filter() did not use init in reverse order as documented. (PR#5017) o contrasts<-() and model.matrix() now have sanity checks that factors have at least 2 levels (or one level and a contrast matrix): model.matrix() gave nonsensical results for 0-level factors. o writeChar() could segfault if more characters were requested than exist. (PR#5090) o round() and signif() dropped attributes, but only for 0-length inputs. (PR#4710) o The default graphics device in the GNOME interface was gtk, which is no longer in the base package. It is now X11. o The print button on the toolbar of the GNOME graphics device did not work. o The example code on the man page for TkWidgetcmds had not been updated after the change that made tkget (et al.) return tclObj objects, so the "Submit" button didn't work. o Rd conversion to latex did not add a new line before \end{Section} for the section environments, which caused problems if the last thing in a section was \preformatted{} (and potentially elsewhere). o Under some circumstances mosaicplot() failed if main was supplied as it was passed on to model.frame.default(). o Conversion to POSIXlt (including printing) of POSIXct dates before 1902 and after 2038 computed which were leap years from (year-1900) so got some xx00 years wrong. CHANGES IN R VERSION 1.8.0 MACOS CHANGES o As from this release there is only one R port for the Macintosh, which runs only on Mac OS X. (The 'Carbon' port has been discontinued, and the 'Darwin' port is part of the new version.) The current version can be run either as a command-line application or as an 'Aqua' console. 
There is a 'Quartz' device quartz(), and the download and installation of both source and binary packages is supported from the Aqua console. Those CRAN and BioC packages which build under Mac OS X have binary versions updated daily. USER-VISIBLE CHANGES o The defaults for glm.control(epsilon=1e-8, maxit=25) have been tightened: this will produce more accurate results, slightly slower. o sub, gsub, grep, regexpr, chartr, tolower, toupper, substr, substring, abbreviate and strsplit now handle missing values differently from "NA". o Saving data containing name space references no longer warns about name spaces possibly being unavailable on load. o On Unix-like systems interrupt signals now set a flag that is checked periodically rather than calling longjmp from the signal handler. This is analogous to the behavior on Windows. This reduces responsiveness to interrupts but prevents bugs caused by interrupting computations in a way that leaves the system in an inconsistent state. It also reduces the number of system calls, which can speed up computations on some platforms and make R more usable with systems like Mosix. CHANGES TO THE LANGUAGE o Error and warning handling has been modified to incorporate a flexible condition handling mechanism. See the online documentation of 'tryCatch' and 'signalCondition'. Code that does not use these new facilities should remain unaffected. o A triple colon operator can be used to access values of internal variables in a name space (i.e. a:::b is the value of the internal variable b in name space a). o Non-syntactic variable names can now be specified by inclusion between backticks `Like This`. The deparse() code has been changed to output non-syntactic names with this convention, when they occur as operands in expressions. This is controlled by a 'backtick' argument, which is by default TRUE for composite expressions and FALSE for single symbols. This should give minimal interference with existing code. 
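Two of the language changes above, condition handling and backtick-quoted names, can be sketched briefly. The variable name `my var` and the fallback value -1L are illustrative choices only and do not come from the NEWS entries.

```r
# Condition handling: catch the coercion warning and return a fallback
# value instead of letting the warning propagate.
result <- tryCatch(
  as.integer("not a number"),     # signals a warning and would return NA
  warning = function(w) -1L       # handler supplies the fallback
)

# Backticks: create and use a variable whose name is not syntactic.
`my var` <- 3
total <- `my var` + 1

stopifnot(identical(result, -1L), total == 4)
```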
o Variables in formulae can be quoted by backticks, and such formulae can be used in the common model-fitting functions. terms.formula() will quote (by backticks) non-syntactic names in its "term.labels" attribute. [Note that other code using terms objects may expect syntactic names and/or not accept quoted names: such code will still work if the new feature is not used.] NEW FEATURES o New function bquote() does partial substitution like LISP backquote. o capture.output() takes arbitrary connections for file' argument. o contr.poly() has a new scores' argument to use as the base set for the polynomials. o cor() has a new argument method = c("pearson","spearman","kendall")' as cor.test() did forever. The two rank based measures do work with all three missing value strategies. o New utility function cov2cor() {Cov -> Corr matrix}. o cut.POSIXt() now allows breaks' to be more general intervals as allowed for the by' argument to seq.POSIXt(). o data() now has an 'envir' argument. o det() uses an LU decomposition and LAPACK. The method' argument to det() no longer has any effect. o dev.control() now accepts "enable" as well as "inhibit". (Wishlist PR#3424) o *, - and / work more generally on "difftime" objects, which now have a diff() method. o dt(*, ncp = V) is now implemented, thanks to Claus Ekstrom. o dump() only quotes object names in the file where necessary. o eval() of a promise forces the promise o file.path() now returns an empty character vector if given at least one zero-length argument. o format() and hence print() make an effort to handle corrupt data frames, with a warning. o format.info() now also works with nsmall' in analogy with format.default(). o gamma(n) is very slightly more precise for integer n in 11:50. o ? and help() will accept more un-quoted arguments, e.g. NULL. o The "?" operator has new forms for querying documentation on S4 methods. See the online documentation. o New argument frame.plot = axes (== TRUE) for filled.contour(). 
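The bquote() and cov2cor() entries above lend themselves to a small illustration. The values below are made up for the example.

```r
# bquote(): quote an expression but substitute the parts wrapped in .()
a <- 2
expr <- bquote(x + .(a))              # the unevaluated call  x + 2

# cov2cor(): rescale a covariance matrix to the corresponding
# correlation matrix.
V <- matrix(c(4, 2,
              2, 9), nrow = 2)        # variances 4 and 9, covariance 2
R <- cov2cor(V)                       # off-diagonal is 2 / (2 * 3)

stopifnot(identical(expr, quote(x + 2)),
          isTRUE(all.equal(R[1, 2], 1 / 3)))
```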
o New argument fixed = TRUE for grep() and regexpr() to avoid the need to escape strings to match. o grep(x, ..., value = TRUE) preserves names of x. o hist.POSIXt() can now pass arguments to hist.default() o legend() and symbols() now make use of xy.coords() and accept a wider range of coordinate specifications. o Added function library.dynam.unload() to call dyn.unload() on a loaded DLL and tidy up. This is called for all the standard packages in namespaces with DLLs if their namespaces are unloaded. o lm(singular.ok = FALSE) is now implemented. o Empty lm() and glm() fits are now handled by the normal code: there are no methods for classes "lm.null" and "glm.null". Zero-rank fits are handled consistently. o make.names() has improvements, and there is a new auxiliary function make.unique(). (Based on code contributed by Tom Minka, since converted to a .Internal function.) In particular make.names() now recognises that names beginning with a dot are valid and that reserved words are not. o methods() has a print method which asterisks functions which are not user-visible. methods(class = "foo") now lists non-visible functions, and checks that there is a matching generic. o model.matrix() now warns when it removes the response from the rhs of the formula: that this happens is now documented on its help page. o New option locatorBell' to control the confirmation beep during the use of locator() and identify(). o New option("scipen") provides some user control over the printing of numbers in fixed-point or exponential notation. (Contributed by David Brahm.) o plot.formula() now accepts horizontal=TRUE and works correctly when boxplots are produced. (Wishlist PR#1207) The code has been much simplified and corrected. o polygon() and rect() now interpret density < 0 or NA to mean filling (by colour) is desired: this allows filling and shading to be mixed in one call, e.g. from legend(). 
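The grep(fixed = TRUE) and make.names()/make.unique() entries above can be illustrated as follows. The input strings are invented for the example.

```r
files <- c("a.b", "axb", "a.b.c")

# As a regular expression, "." matches any character ...
regex_hits <- grep("a.b", files)
# ... whereas fixed = TRUE matches the string literally, with no escaping.
fixed_hits <- grep("a.b", files, fixed = TRUE)

# make.unique() appends sequence numbers to duplicated names.
unique_names <- make.unique(c("a", "a", "b", "a"))

# make.names() accepts a leading dot but alters reserved words.
valid_names <- make.names(c(".hidden", "if"))

stopifnot(identical(fixed_hits, c(1L, 3L)))
```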
o The predict() methods for classes lm, glm, mlm and lqs take a na.action' argument that controls how missing values in newdata' are handled (and defaults to predicting NA). [Previously the value of getOption("na.action") was used and this by default omitted cases with missing values, even if set to na.exclude'.] o print.summary.glm() now reports omitted coefficients in the same way as print.summary.lm(), and both show them as NAs in the table of coefficients. o print.table() has a new argument zero.print' and is now documented. o rank(x, na.last = "keep") now preserves NAs in x', and the argument ties.method' allows to use non-averaging ranks in the presence of ties. o read.table()'s 'as.is' argument can be character, naming columns not to be converted. o rep() is now a generic function, with default, POSIXct and POSIXlt methods. For efficiency, the base code uses rep.int() rather than rep() where possible. o New function replicate() for repeated evaluation of expression and collection of results, wrapping a common use of sapply() for simulation purposes. o rev() is now a generic function, with default and dendrogram methods. o serialize() and unserialize() functions are available for low-level serialization to connections. o socketSelect() allows waiting on multiple sockets. o sort(method = "quick", decreasing = TRUE) is now implemented. o sort.list() has methods "quick" (a wrapper for sort(method = "quick", index.return = TRUE) and "radix" (a very fast method for small integers). The default "shell" method works faster on long vectors with many ties. o stripchart() now has log', add' and at' arguments. o strsplit(x, *) now preserves names() but won't work for non-character x' anymore {formerly used as.character(x), destroying names(x)}. o textConnection() now has a local argument for use with output connections. local = TRUE means the variable containing the output is assigned in the frame of the caller. 
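The 'local' argument of textConnection() described above might be used like this. It is a minimal sketch; the function name collect_lines and the variable name captured are hypothetical.

```r
collect_lines <- function() {
  # With local = TRUE the output variable 'captured' is created in the
  # frame of this function rather than in the user's workspace.
  con <- textConnection("captured", open = "w", local = TRUE)
  writeLines(c("first", "second"), con)
  close(con)
  captured
}

lines <- collect_lines()
stopifnot(identical(lines, c("first", "second")))
```

Keeping the variable local avoids leaving stray objects in the workspace when output is collected inside a function.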
o Using UseMethod() with more than two arguments now gives a warning (as R-lang.texi has long claimed it did). o New function vignette() for viewing or listing vignettes. o which.min(x) and which.max(x) now preserve names. o xy.coords() coerces "POSIXt" objects to "POSIXct", allowing lines etc to be added to plot.POSIXlt() plots. o .Machine has a new entry, sizeof.pointer. o .Random.seed is only looked for and stored in the user's workspace. Previously the first place a variable of that name was found on the search path was used. o Subscripting for data.frames has been rationalized: - Using a single argument now ignores any 'drop' argument (with a warning). Previously using 'drop' inhibited list-like subscripting. - adf$name <- value now checks for the correct length of 'value', replicating a whole number of times if needed. - adf[j] <- value and adf[[j]] <- value did not convert character vectors to factors, but adf[,j] <- value did. Now none do. Nor is a list 'value' coerced to a data frame (thereby coercing character elements to factors). - Where replicating the replacement value a whole number of times will produce the right number of values, this is always done (rather than sometimes but not others). - Replacement list values can include NULL elements. - Subsetting a data frame can no longer produce duplicate column names. - Subsetting with drop=TRUE no longer sometimes drops dimensions on matrix or data frame columns of the data frame. - Attributes are no longer stripped when replacing part of a column. - Columns added in replacement operations will always be named, using the names of a list value if appropriate. - as.data.frame.list() did not cope with list names such as 'check.rows', and formatting/printing data frames with such column names now works. - Row names in extraction are still made unique, but without forcing them to be syntactic names. - adf[x] <- list() failed if x was of length zero. o Setting dimnames to a factor now coerces to character, as S does. 
(Earlier versions of R used the internal codes.) o When coercion of a list fails, a meaningful error message is given. o Adding to NULL with [[ ]] generates a list if more than one element is added (as S does). o There is a new command-line flag --args that causes the rest of the command line to be skipped (but recorded in commandArgs() for further processing). o S4 generic functions and method dispatch have been modified to make the generic functions more self-contained (e.g., usable in apply-type operations) and potentially to speed dispatch. o The data editor is no longer limited to 65535 rows, and will be substantially faster for large numbers of columns. o Standalone Rmath now has a get_seed function as requested (PR#3160). o GC timing is not enabled until the first call to gc.time(); it can be disabled by calling gc.time(FALSE). This can speed up the garbage collector and reduce system calls on some platforms. STANDARD PACKAGES o New package 'mle'. This is a simple package to find maximum likelihood estimates, and perform likelihood profiling and approximate confidence limits based upon it. A well-behaved likelihood function is assumed, and it is the responsibility of the user to gauge the applicability of the asymptotic theory. This package is based on S4 methods and classes. o Changes in package 'mva': - factanal() now returns the test statistic and P-value formerly computed in the print method. - heatmap() has many more arguments, partly thanks to Wolfgang Huber and Andy Liaw. - Arguments unit' and hmin' of plclust() are now implemented. - prcomp() now accepts complex matrices, and there is biplot() method for its output (in the real case). - dendrograms are slightly better documented, methods working with "label", not "text" attribute. New rev() method for dendrograms. - plot.dendrogram() has an explicit frame.plot' argument defaulting to FALSE (instead of an implicit one defaulting to TRUE). 
o Changes in package 'tcltk': - The package is now in a namespace. To remove it you will now need to use unloadNamespace("tcltk"). - The interface to Tcl has been made much more efficient by evaluating Tcl commands via a vector of Tcl objects rather than by constructing the string representation. - An interface to Tcl arrays has been introduced. - as.tclObj() has gained a 'drop' argument to resolve an ambiguity for vectors of length one. o Changes in package 'tools': - Utilities for testing and listing files, manipulating file paths, and delimited pattern matching are now exported. - Functions checkAssignFuns(), checkDocArgs() and checkMethods() have been renamed to checkReplaceFuns(), checkDocFiles(), and checkS3methods(), to give better descriptions of what they do. - R itself is now used for analyzing the markup in the \usage sections. Hence in particular, replacement functions or S3 replacement methods are no longer ignored. - checkDocFiles() now also determines 'over-documented' arguments which are given in the \arguments section but not in \usage. - checkDocStyle() and checkS3methods() now know about internal S3 generics and S3 group generics. - S4 classes and methods are included in the QC tests. Warnings will be issued from undoc() for classes and methods defined but not documented. Default methods automatically generated from nongeneric functions do not need to be documented. - New (experimental) functions codocClasses() and codocData() for code/documentation consistency checking for S4 classes and data sets. o Changes in package 'ts': - arima.sim() now checks for inconsistent order specification (as requested in PR#3495: it was previously documented not to). - decompose() has a new argument 'filter'. - HoltWinters() has new arguments 'optim.start' and 'optim.control', and returns more components in the fitted values. The plot method allows 'ylim' to be set. 
- plot.ts() has a new argument nc' controlling the number of columns (with default the old behaviour for plot.mts). - StructTS() now allows the first value of the series to be missing (although it is better to omit leading NAs). (PR#3990) USING PACKAGES o library() has a pos argument, controlling where the package is attached (defaulting to pos=2 as before). o require() now maintains a list of required packages in the toplevel environment (typically, .GlobalEnv). Two features use this: detach() now warns if a package is detached that is required by an attached package, and packages that install with saved images no longer need to use require() in the .First as well as in the main source. o Packages with name spaces can now be installed using --save. o Packages that use S4 classes and methods should now work with or without saved images (saved images are still recommended for efficiency), writing setMethod(), etc. calls with the default for argument where'. The topenv() function and sys.source() have been changed correspondingly. See the online help. o Users can specify in the DESCRIPTION file the collation order for files in the R source directory of a package. DOCUMENTATION CHANGES o Changes in R documentation format: - New logical markup commands for emphasizing (\strong) and quoting (\sQuote and \dQuote) text, for indicating the usage of an S4 method (\S4method), and for indicating specific kinds of text (\acronym, \cite, \command, \dfn, \env, \kbd, \option, \pkg, \samp, \var). - New markup \preformatted for pre-formatted blocks of text (like \example but within another section). (Based on a contribution by Greg Warnes.) - New markup \concept for concept index entries for use by help.search(). o Rdconv now produces more informative output from the special \method{GENERIC}{CLASS} markup for indicating the usage of S3 methods, providing the CLASS info in a comment. 
o \dontrun sections are now marked within comments in the user-readable versions of the converted help pages. o \dontshow is now the preferred name for \testonly. INSTALLATION CHANGES o The zlib code in the sources is used unless the external version found is at least version 1.1.4 (up from 1.1.3). o The regression checks now have to be passed exactly, except those depending on recommended packages (which cannot be assumed to be present). o The target make check-all now runs R CMD check on all the recommended packages (and not just runs their examples). o There are new macros DYLIB_* for building dynamic libraries, and these are used for the dynamic Rmath library (which was previously built as a shared object). o If a system function log1p is found, it is tested for accuracy and if inadequate the substitute function in src/nmath is used, with name remapped to Rlog1p. (Apparently needed on OpenBSD/NetBSD.) C-LEVEL FACILITIES o There is a new installed header file R_ext/Parse.h which allows R_ParseVector to be called by those writing extensions. (Note that the interface is changed from that used in the unexported header Parse.h in earlier versions, and is not guaranteed to remain unchanged.) o The header R_ext/Mathlib.h has been removed. It was replaced by Rmath.h in R 1.2.0. o PREXPR has been replaced by two macros, PREXPR for obtaining the expression and PRCODE for obtaining the code for use in eval. The macro BODY_EXPR has been added for use with closures. For a closure with a byte compiled body, the macro BODY_EXPR returns the expression that was compiled; if the body is not compiled then the body is returned. This is to support byte compilation. o Internal support for executing byte compiled code has been added. A compiler for producing byte compiled code will be made available separately and should become part of a future R release. o On Unix-like systems calls to the popen() and system() C library functions now go through R_popen and R_system. 
On Mac OS X these suspend SIGALRM interrupts around the library call. (Related to PR#1140.) UTILITIES o R CMD check accepts "ORPHANED" as package maintainer. Package maintainers can now officially orphan a package, i.e., resign from maintaining a package. o R CMD INSTALL (Unix only) is now 'safe': if the attempt to install a package fails, leftovers are removed. If the package was already installed, the old version is restored. o R CMD build excludes possible (obsolete) data and vignette indices in DCF format (and hence also no longer rebuilds them). o R CMD check now tests whether file names are valid across file systems and supported operating system platforms. There is some support for code/documentation consistency checking for data sets and S4 classes. Replacement functions and S3 methods in \usage sections are no longer ignored. o R CMD Rdindex has been removed. DEPRECATED & DEFUNCT o The assignment operator _' has been removed. o printNoClass() is defunct. o The classic Mac OS port is no longer supported, and its files have been removed from the sources. o The deprecated argument 'white' of parse() has been removed. o Methods pacf/plot.mts() have been removed and their functionality incorporated into pacf.default/plot.ts(). o print.coefmat() is deprecated in favour of printCoefmat() (which is identical apart from the default for na.print which is changed from "" to "NA", and better handling of the 0-rank case where all coefficients are missing). o codes() and codes<-() are deprecated, as almost all uses misunderstood what they actually do. o The use of multi-argument return() calls is deprecated: use a (named) list instead. o anovalist.lm (replaced in 1.2.0) is now deprecated. o - and Ops methods for POSIX[cl]t objects are removed: the POSIXt methods have been used since 1.3.0. o glm.fit.null(), lm.fit.null() and lm.wfit.null() are deprecated. o Classes "lm.null" and "glm.null" are deprecated and all of their methods have been removed. 
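For the deprecation of multi-argument return() calls noted above, the suggested replacement is a (named) list. A minimal sketch, with a hypothetical function name span:

```r
# Deprecated style would have been:  return(lo, hi)
# Recommended style: collect the results in a named list.
span <- function(x) {
  return(list(lo = min(x), hi = max(x)))
}

s <- span(c(3, 1, 4))
stopifnot(s$lo == 1, s$hi == 4)
```

Callers then pick out components by name, which also keeps the return value self-describing.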
o Method weights.lm(), a copy of weights.default(), has been removed. o print.atomic() is now deprecated. o The back-compatibility entry point Rf_log1p in standalone Rmath has been removed. BUG FIXES o ARMAacf() sometimes gave too many results or failed if lag.max' was used. o termplot() with a subset of terms now gives correct partial residuals o Functions anova.glm(), contrasts(), getS3method(), glm() and make.tables() were applying get() without asking for a function and/or not starting the search in the environment of the caller. o as.data.frame.matrix() ignored the row.names' argument. o as.data.frame.list(optional = TRUE) was converting names, and hence data.frame(list(...), check.names = FALSE) was. (PR#3280) o as.dist(m) {mva} now obeys diag=TRUE' or upper=TRUE' in all cases. o as.double(list()) etc was regarded as an error, because of a bug in isVectorizable. o On some platforms the wday component of the result of as.POSIXlt() was corrupted when trying to guess the DST offset at dates the OS was unable to handle. o ave(x, g) didn't work when g' had unused levels. o biplot.default() allows xlim and ylim to be set. (PR#3168) o bgroup with a null (.) delimiter was setting font to Greek. (PR#3099) o body() and formals() were looking for named functions in different places: they now both look starting at the environment in which they are called. Several documentation errors for these functions have been corrected. o boxplot() was ignoring cex.axis. (PR#2628) o cut.POSIXt() now passes on ... to cut.default(), as documented. o crossprod() now works for 1d arrays with unnamed dimnames (PR#4092). o data() sometimes failed with multiple files, as the paths variable got corrupted. o data.frame() failed with a nonsensical error message if it grabbed row names from an argument that was subsequently recycled. Now they are discarded, with a warning. 
o data.matrix() was documented to replace factors by their codes, but in fact re-coded into the alphabetical ordering of the levels. o decompose() with even frequency used an asymmetric moving average window. o demo() was using 'topic' as a regexp rather than an exact match. o dotchart() now does recycle the 'color' argument and better documents the 'bg' one (PR#4343). o getAnywhere() did not correctly check for S3 methods when the generic or the class name contains a "." (PR#4275). o file.copy() ignored the overwrite argument. (PR#3529) o filter(method="recursive") was unnecessarily requiring the time series to be longer than the filter. o format(*, nsmall = m) with m > 0 now returns exponential format less often. o get() and exists() were ignoring the 'mode' argument for variables in base. The error message for get() now mentions the mode requested if not "any". A bug in setting the NAMED field in do_get was fixed. o getS3method(f, cl, optional=TRUE) now returns NULL if 'f' does not exist. o HoltWinters() would segfault if only gamma was optimized, and not converge if gamma=0 and seasonal="mult". o hyperref.cfg now contains definitions for colors it uses. o identify.default() detects zero-length arguments. (PR#4057) o legend() allows shading without filling again. o legend(x, y, leg) doesn't triple 'leg' anymore when it is a call. o Corrected many problems with 0-rank (but not necessarily empty model) lm() and glm() fits. o lm.influence() now handles 0-rank models, and names its output appropriately. It also ensures that hat values are not greater than one, and rounds values within rounding error of one. o The 'method' argument to loess() did not work. (PR#3332) o lsfit() was returning incorrect residuals for 0-rank fits. o methods("$") and methods("$<-") were failing to find methods. o methods() and getS3method() find methods if the generic dispatches on a name other than its own. (The cases of coefficients() and fitted.values() were fixed in 1.7.1.) 
o model.matrix.default() was throwing an error on 0-term models, but now handles them correctly. o Printing nls' objects misbehaved when data' was a composite expression. o .NotYetImplemented() gave "Error in .NotYet...(): .." o numericDeriv() was failing if the first argument was a name rather than a call. (PR#3746) o pacf() was failing if called on a one-column matrix. o paste() applied to 0-length vectors gave "" not a 0-length vector. o The length of a string specification of par(lty=) is now checked: it should be 2, 4, 6 or 8. o Using lty=as.integer(NA) and as.double(NA) were being accepted but giving nonsensical results. Those are not documented valid values for lty. (PR#3217) o Erroneously calling par(new=TRUE) with no plot was not caught and so could result in invalid graphics files. (PR#4037) o par(tck=) was being interpreted incorrectly. It is now documented in the same way as S, and now behaves as documented. (PR#3504) o plclust() [and hence plot.hclust()] sometimes now uses correct ylim's also in unusual cases. (PR#4197) o plot.POSIX[cl]t no longer passes col, lty, lwd to axis.POSIXt. o The png(), jpeg(), png() and win.metafile() devices now enforce the length limit on the filename. (PR#3466) o pnorm(x, 1, 0) does not give NaN anymore; also, pnorm(x, m, s=Inf) == lim{s -> Inf} pnorm(x,m,s). Similar changes for dnorm(), cf PR#1218. o On some machines the internal rounding used in postscript() was imperfect, causing unnecessarily verbose output (-0.00 instead of 0) and problems with make check. o qqnorm()'s result now keeps NAs from its input. (PR#3750) o rank() sometimes preserved and sometimes dropped names. o readBin(what = "foo") didn't convert what' to its type. (PR#4043) o reorder.dendrogram() now properly resets the "midpoint" attributes such that reorder()ed dendrograms now plot properly. o rmultinom(1,100, c(3, 4, 2, 0,0))[3] was NA. (PR#4431) o sapply() for matrix result does not return list(NULL,NULL) dimnames anymore. 
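Two of the fixes above, paste() on zero-length vectors and the rmultinom() call from PR#4431, can be checked with a short sketch. The seed is arbitrary, and only properties that hold for any draw are asserted.

```r
# paste() on a zero-length vector now gives a zero-length result;
# supplying 'collapse' still yields a single (empty) string.
empty  <- paste(character(0))
joined <- paste(character(0), collapse = "")

# rmultinom() with zero-weight categories: no NA counts, and the
# zero-weight categories receive no observations.
set.seed(1)
draw <- rmultinom(1, 100, c(3, 4, 2, 0, 0))

stopifnot(length(empty) == 0, identical(joined, ""),
          !anyNA(draw), sum(draw) == 100)
```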
o scan() now interprets quoting in fields to be skipped. (PR#4128)
o seq.POSIXt(from, to, by="DSTday") was failing or calculating the length incorrectly.
o sort() and unique.default() were failing on 0-level factors.
o step() adds a fuzz for reduction in AIC for 0-df terms. (PR#3491)
o str(x) gives better output when x is of mode "(". Its "dendrogram" method obeys the 'give.attr' argument which now defaults to FALSE.
o strwidth(f) and strheight(f) could segfault when 'f' was a function. The fix [to C-level coerceVector()] now gives an error instead of passing through. This may catch other potential problems.
o Sweave() reports the chunk number rather than the driver call when a try error gets caught.
o trunc.POSIXt(x) for 0-length x does not return invalid structures anymore. (PR#3763)
o warnings() now returns NULL instead of an error when no warnings have occurred yet. (PR#4389)
o Using write.table() setting the 'dec' argument and with no numeric columns failed. (PR#3532)
o $<- did not duplicate when it needed to.
o Recursive indexing of lists had too little error-checking. (related to PR#3324)
o Removed warning about names in persistent strings when a namespace is saved.
o Fixed some malformed error messages in the methods package.
o Pipes were not opening properly when profiling on Mac OS. (PR#1140)
o Lapack error messages (PR#3494) and call to DGEQP3 (PR#2867) are corrected.
o Rd conversion was limiting a file to 1000 pairs of braces, without any warning. Now the limit is 10000, with a warning. (PR#3400)
o In the tcltk package, the tkimage.*() commands were defined nonsensically as widget commands. They have been redefined to be more useful now.
o Registered group generics were not being used. (PR#3536)
o Subsetting data frames did not always correctly detect that non-existent columns were specified.
o There are many more checks for over-running internal buffers, almost always reporting errors.
o Added some buffer overflow checking in gram.y.
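The entry above on sort() and unique.default() failing for 0-level factors concerns an edge case that is easy to construct. A minimal sketch, assuming a current R release:

```r
# A factor with no elements and therefore no levels
f <- factor(character(0))
print(nlevels(f))

# Both calls now return an empty factor instead of failing
s <- sort(f)
u <- unique(f)
print(length(s))
print(length(u))
```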
o Internals for complex assignment did not check that function name was a symbol, which could cause a segfault.
o Fixed bug in S4 methods dispatch that made local variables in the generic visible when executing the body of a method, thus violating lexical scope.

    **************************************************
    *                                                *
    *                1.7 SERIES NEWS                 *
    *                                                *
    **************************************************

CHANGES IN R VERSION 1.7.1

NEW FEATURES

o The help pages give appropriate references to the Blue, White or Green books for functions based on the descriptions of S functions given there. (E&OE)
o Function getAnywhere() can find non-exported objects, for namespaces or registered methods.

DEPRECATED & DEFUNCT

o The (unimplemented) argument 'white' of parse() is deprecated.
o The tkfilefind demo in the tcltk library is deprecated, since it never worked well, and apparently not at all with Tcl/Tk 8.4.

BUG FIXES

o print.table() used too much white space in some cases in 1.7.0.
o selectMethod() failed if 'f' was a non-generic and optional=TRUE, and gave a confusing error message if optional=FALSE.
o pchisq(*, ncp) and qchisq(*, ncp) work in more cases for large ncp or quantile and give warning or error messages otherwise.
o str(x) now also works when x is an "externalptr" (or "weakref").
o rbeta(), rf(), and rt() now support infinite parameter values; other distributions return NaN instead of NA for such.
o Redefining a class is now safer if the new definition generates an error (previously some invalid metadata could be left behind).
o A number of errors are now caught in setClass() that previously either went unchecked or waited until new() to appear:
  - classes may not contain themselves, directly or indirectly;
  - classes appearing either as slots or as superclasses must themselves be defined;
  - slot names (direct or inherited) must be unique.
  In related changes, prototype() now works as documented, and is the recommended way to provide prototype objects.
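To illustrate the recommendation to supply prototype objects via prototype(), here is a minimal sketch. The class name "Account" and its slots are hypothetical and chosen only for the example.

```r
library(methods)

# Hypothetical class: the prototype supplies the default slot values
setClass("Account",
         representation(owner = "character", balance = "numeric"),
         prototype(owner = NA_character_, balance = 0))

# new() with no further arguments returns an object built from the prototype
acct <- new("Account")
print(acct@balance)
print(is.na(acct@owner))
```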
o Sorting an ordered factor would return an unordered one. This caused some trouble with panel.superpose (PR#974).
o methods() could return duplicates if a method in a namespace was both exported and registered.
o The internal zip.unpack() could crash if more than 500 files were to be extracted. (PR#2818)
o The "r+" and "r+b" modes of opening file connections disallowed writing.
o library() now warns the user if the chosen package name doesn't match the internal package name, and corrects the error. (PR#2816)
o qr(LAPACK=TRUE) (and qr for complex arguments) might have failed to pivot for rank-deficient inputs. (PR#2867)
o Only re-mapped symbols are exported by regex.o, to avoid problems with embedded R on RedHat 9.
o arima() did not set transform.pars to FALSE if AR parameters were fixed, although it claimed to.
o pnorm() was slower than necessary in the outer tails in some cases due to a typo in the improvements from PR#699. (PR#2883)
o setGeneric() and setMethod() now catch some examples where the generic and the method have different argument lists; the evaluator checks for internal consistency of these argument lists.
o expand.grid(x) {the rare case of one argument} now treats factor levels as in the typical case of two or more arguments.
o Some implicit coercions to lists could cause segfaults, e.g. x <- matrix(nrow=20000, ncol=20); x$any <- numeric(0) due to a PROTECT bug. (PR#2923)
o The replacement functions for colnames() and rownames() did not work for arrays with more than two dimensions. They could create dimnames of the form list(NULL, NULL) rather than remove the dimnames attribute.
o termplot() gave incorrect answers with rug=TRUE or partial=TRUE for factors whose levels were not in lexicographical order.
o A serious performance flaw in as() computations was fixed (the methods were not being cached properly.)
o model.frame(~1, data) always returned 1 row. (PR#2958)
o The data editor was truncating objects to 65535 rows.
  Pro tem, editing objects with more than 65535 rows is an error, and objects cannot be extended beyond that row. This restriction will be removed in 1.8.0. (PR#2962)
o A bug could produce apparent loops in formal method selection when inheritance was restricted (used for the as() function). A related problem sometimes broke attaching a package that had methods for basic functions, such as names(), used in method selection.
o Empty expressions as in return(x,) could generate subsequent segfaults: they are now errors. (PR#2880)
o The Kinderman-Ramage Normal Random Generator had several problems leading to not-quite normally distributed variates (PR#2846). One problem was traced to an error in the original 1976 JASA paper! Thanks to Josef Leydold and his team for investigating this.
  The old generator has been retained for reproducibility of older results, under the name "Buggy Kinderman-Ramage". A warning is issued if you select it (also indirectly via RNGversion()).
o promptMethods() now puts the \alias lines for methods in the normal place, near the top of the file, and quotes class names in signatures.
o getS3method() and methods() were not finding methods for coefficients() and fitted.values() (which dispatch on "coef" and "fitted" respectively).
o scan() (and hence read.table) was not finding matches for separator chars with the upper bit set. (PR#3035)
o lm.(w)fit failed if the fit had rank 0.
o lqs() did not report explicitly that it had failed if all samples gave singular fits.
o predict.lm(*, se=TRUE) {w/ weights, w/o newdata} now gives correct SE's. (PR#3043)
o cor.test(x, y, method="spearman") now also works for length(x) > 1290.
o Matrices were printed mis-aligned if right=TRUE and na.print was specified. (PR#3058)
o R CMD check now gives a clearer message when latex produces errors on the package manual. (PR#3070)
o isSeekable() was incorrectly returning FALSE on all file connections.
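For the Kinderman-Ramage entry above, a minimal sketch of selecting the corrected normal generator explicitly and then restoring the previous settings. The seed and sample size are arbitrary.

```r
# Remember the generators currently in use so they can be restored
old_kinds <- RNGkind()

# Select the corrected Kinderman-Ramage generator for normal variates
RNGkind(normal.kind = "Kinderman-Ramage")
set.seed(123)
draws <- rnorm(5)
print(draws)

# Restore the uniform and normal generators that were active before
RNGkind(kind = old_kinds[1], normal.kind = old_kinds[2])
```

Saving and restoring the result of RNGkind() keeps the change local, which matters when the surrounding code relies on the default generators for reproducibility.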
o tkpager() wasn't quite using its title and header arguments in the way prescribed by file.show().
o legend(*, pch=p, lty=l) now works better when 'p' or 'l' have NAs.
o All braces in regular expressions used by Sweave() are now escaped by a backslash.
o unloadNamespace() failed because getNamespaceImports() now coerces a string argument to a name space.
o deriv3 gave incorrect Hessians for some very simple expressions such as expression(x*y) (since the comments in the C code were incorrect). (PR#2577)
o power.t.test(..., delta=NULL,alternative='two.sided') failed. (PR#2993)
o Lines on postscript() plots with thousands of segments might have been plotted inaccurately in 1.7.0. (PR#3132)
  Solid lines in postscript() output are split into groups of 1000 segments to help some PostScript interpreters (typically old level-1 interpreters).
o cut.POSIXt failed when the breaks were date/time objects. (PR#3181)
o Usage of methods in dist.Rd is now correctly documented (as.matrix.dist() is not an exported symbol).
o The predict() method for ar fits was not retrieving the series from the parent environment.
o eigen() and La.eigen() were not returning a matrix of eigenvectors for a 1x1 input.
o hsv() and rgb() now return character(0) when one of their args has length 0. This also fixes terrain.colors(1). (PR#3233)
o [[<-.data.frame checked if a replacement was too short, but not if it was too long. (related to PR#3229)
o qt(x, df) was quite inaccurate for df=1+epsilon; it is now much more accurate for df in (1,2) and more precise for other df. (PR#2991)
o qbeta() now has slightly improved C code in two places, as suggested in the 2nd followup to PR#2894.

CHANGES IN R VERSION 1.7.0

USER-VISIBLE CHANGES

o solve(), chol(), eigen() and svd() now use LAPACK routines unless a new back-compatibility option is turned on. The signs and normalization of eigen/singular vectors may change from earlier versions.
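Because the signs of eigenvectors may differ between versions, checks on eigen() output are best written so that they do not depend on sign. A minimal sketch with an arbitrary symmetric matrix:

```r
# Small symmetric matrix chosen for illustration
A <- matrix(c(2, 1, 1, 3), nrow = 2)
e <- eigen(A)

# V diag(lambda) V' reproduces A whatever sign each eigenvector column has
reconstructed <- e$vectors %*% diag(e$values) %*% t(e$vectors)
ok <- isTRUE(all.equal(reconstructed, A))
print(ok)

# A 1x1 input also yields a matrix of eigenvectors
single <- eigen(matrix(2))
print(is.matrix(single$vectors))
```

Comparing the reconstruction rather than the raw vectors keeps the check valid under either sign convention.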
o The 'methods', 'modreg', 'mva', 'nls' and 'ts' packages are now attached by default at startup (in addition to 'ctest'). The option "defaultPackages" has been added which contains the initial list of packages. See ?Startup and ?options for details. Note that .First() is no longer used by R itself.
  class() now always (not just when 'methods' is attached) gives a non-null class, and UseMethod() always dispatches on the class that class() returns. This means that methods like foo.matrix and foo.integer will be used. Functions oldClass() and oldClass<-() get and set the "class" attribute as R without 'methods' used to.
o The default random number generators have been changed to 'Mersenne-Twister' and 'Inversion'. A new RNGversion() function allows you to restore the generators of an earlier R version if reproducibility is required.
o Namespaces can now be defined for packages other than 'base': see 'Writing R Extensions'. This hides some internal objects and changes the search path from objects in a namespace. All the base packages (except methods and tcltk) have namespaces, as well as the recommended packages 'KernSmooth', 'MASS', 'boot', 'class', 'nnet', 'rpart' and 'spatial'.
o Formulae are no longer automatically simplified when terms() is called, so the formulae in results may still be in the original form rather than the equivalent simplified form (which may have reordered the terms): the results are now much closer to those of S.
o The tables for plotmath, Hershey and Japanese have been moved from the help pages (example(plotmath) etc) to demo(plotmath) etc.
o Errors and warnings are sent to stderr not stdout on command-line versions of R (Unix and Windows).
o The R_X11 module is no longer loaded until it is needed, so do test that x11() works in a new Unix-alike R installation.

NEW FEATURES

o if() and while() give a warning if called with a vector condition.
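The change to class() and UseMethod() described above can be sketched as follows. The generic describe() and its methods are hypothetical names introduced only for this example.

```r
# Hypothetical generic with methods for implicit classes
describe <- function(x) UseMethod("describe")
describe.default <- function(x) "something else"
describe.integer <- function(x) "an integer vector"
describe.matrix  <- function(x) "a matrix"

v <- 1:3
m <- matrix(1:4, nrow = 2)

# class() is never NULL, whereas oldClass() reports the bare "class" attribute
print(class(v))
print(is.null(oldClass(v)))

# Dispatch uses the implicit class, so the integer and matrix methods are found
print(describe(v))
print(describe(m))
print(describe("text"))
```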
o Installed packages under Unix without compiled code are no longer stamped with the platform and can be copied to other Unix-alike platforms (but not to other OSes because of potential problems with line endings and OS-specific help files).
o The internal random number generators will now never return values of 0 or 1 for runif. This might affect simulation output in extremely rare cases. Note that this is not guaranteed for user-supplied random-number generators, nor when the standalone Rmath library is used.
o When assigning names to a vector, a value that is too short is padded by character NAs. (Wishlist part of PR#2358)
o It is now recommended to use the 'SystemRequirements:' field in the DESCRIPTION file for specifying dependencies external to the R system.
o Output text connections no longer have a line-length limit.
o On platforms where vsnprintf does not return the needed buffer size the output line-length limit for fifo(), gzfile() and bzfile() has been raised from 10k to 100k chars.
o The Math group generic does not check the number of arguments supplied before dispatch: it used to if the default method had one argument but not if it had two. This allows trunc.POSIXt() to be called via the group generic trunc().
o Logical matrix replacement indexing of data frames is now implemented (interpreted as if the lhs was a matrix).
o Recursive indexing of lists is allowed, so x[[c(4,2)]] is shorthand for x[[4]][[2]] etc. (Wishlist PR#1588)
o Most of the time series functions now check explicitly for a numeric time series, rather than fail at a later stage.
o The postscript output makes use of relative moves, and so is somewhat more compact.
o %*% and crossprod() for complex arguments make use of BLAS routines and so may be much faster on some platforms.
o arima() has coef(), logLik() (and hence AIC) and vcov() methods.
o New function as.difftime() for time-interval data.
o basename() and dirname() are now vectorized.
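Two of the entries above, recursive indexing of lists and the padding of short names, can be seen in a few lines. The list contents and names are arbitrary illustration data.

```r
x <- list(1, "two", c(3, 3, 3), list("a", "b", "c"))

# x[[c(4, 2)]] picks the 2nd element of the 4th element
nested <- x[[c(4, 2)]]
print(nested)
print(identical(nested, x[[4]][[2]]))

# A names value that is too short is padded with character NAs
v <- 1:3
names(v) <- c("a", "b")
print(names(v))
```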
o biplot.default() {mva} allows 'xlab' and 'ylab' parameters to be set (without partially matching to 'xlabs' and 'ylabs'). (Thanks to Uwe Ligges.)
o New function capture.output() to send printed output from an expression to a connection or a text string.
o ccf() (package ts) now coerces its x and y arguments to class "ts".
o chol() and chol2inv() now use LAPACK routines by default.
o as.dist(.) is now idempotent, i.e., works for "dist" objects.
o Generic function confint() and 'lm' method (formerly in package MASS, which has 'glm' and 'nls' methods).
o New function constrOptim() for optimisation under linear inequality constraints.
o Add 'difftime' subscript method and methods for the group generics. (Thereby fixing PR#2345)
o download.file() can now use HTTP proxies which require 'basic' username/password authentication.
o dump() has a new argument 'envir'. The search for named objects now starts by default in the environment from which dump() is called.
o The edit.matrix() and edit.data.frame() editors can now handle logical data.
o New argument 'local' for example() (suggested by Andy Liaw).
o New function file.symlink() to create symbolic file links where supported by the OS.
o New generic function flush() with a method to flush connections.
o New function force() to force evaluation of a formal argument.
o New functions getFromNamespace(), fixInNamespace() and getS3method() to facilitate developing code in packages with namespaces.
o glm() now accepts 'etastart' and 'mustart' as alternative ways to express starting values.
o New function gzcon() which wraps a connection and provides (de)compression compatible with gzip.
  load() now uses gzcon(), so can read compressed saves from suitable connections.
o help.search() can now reliably match individual aliases and keywords, provided that all packages searched were installed using R 1.7.0 or newer.
o hist.default() now returns the nominal break points, not those adjusted for numerical tolerances.
  To guard against unthinking use, 'include.lowest' in hist.default() is now ignored, with a warning, unless 'breaks' is a vector. (It either generated an error or had no effect, depending how prettification of the range operated.)
o New generic functions influence(), hatvalues() and dfbeta() with lm and glm methods; the previously normal functions rstudent(), rstandard(), cooks.distance() and dfbetas() became generic. These have changed behavior for glm objects -- all originating from John Fox' car package.
o interaction.plot() has several new arguments, and the legend is not clipped anymore by default. It internally uses axis(1,*) instead of mtext(). This also addresses "bugs" PR#820, PR#1305, PR#1899.
o New isoreg() function and class for isotonic regression ('modreg' package).
o La.chol() and La.chol2inv() now give interpretable error messages rather than LAPACK error codes.
o legend() has a new 'plot' argument. Setting it 'FALSE' gives size information without plotting (suggested by U.Ligges).
o library() was changed so that when the methods package is attached it no longer complains about formal generic functions not specific to the library.
o list.files()/dir() have a new argument 'recursive'.
o lm.influence() has a new 'do.coef' argument allowing *not* to compute casewise changed coefficients. This makes plot.lm() much quicker for large data sets.
o load() now returns invisibly a character vector of the names of the objects which were restored.
o New convenience function loadURL() to allow loading data files from URLs (requested by Frank Harrell).
o New function mapply(), a multivariate lapply().
o New function md5sum() in package tools to calculate MD5 checksums on files (e.g. on parts of the R installation).
o medpolish() {package eda} now has an 'na.rm' argument (PR#2298).
o methods() now looks for registered methods in namespaces, and knows about many objects that look like methods but are not.
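The description of mapply() as a multivariate lapply() above is easiest to see by example. A minimal sketch with arbitrary inputs:

```r
# The function is applied to the 1st elements, then the 2nd, and so on
sums <- mapply(function(a, b) a + b, 1:3, 4:6)
print(sums)

# When the results differ in length, a list is returned
reps <- mapply(rep, 1:3, 3:1)
print(reps)
```

The first call simplifies to a vector because every result has length one; the second cannot be simplified and stays a list.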
o mosaicplot() has a new default for 'main', and supports the 'las' argument (contributed by Uwe Ligges and Wolfram Fischer).
o An attempt to open() an already open connection will be detected and ignored with a warning. This avoids improperly closing some types of connections if they are opened repeatedly.
o optim(method = "SANN") can now cover combinatorial optimization by supplying a move function as the 'gr' argument (contributed by Adrian Trapletti).
o PDF files produced by pdf() have more extensive information fields, including the version of R that produced them.
o On Unix(-alike) systems the default PDF viewer is now determined during configuration, and available as the 'pdfviewer' option.
o pie(...) has always accepted graphical pars but only passed them on to title(). Now pie(, cex=1.5) works.
o plot.dendrogram ('mva' package) now draws leaf labels if present by default.
o New plot.design() function as in S.
o The postscript() and pdf() drivers now allow the title to be set.
o New function power.anova.test(), contributed by Claus Ekstrom.
o power.t.test() now behaves correctly for negative delta in the two-tailed case.
o power.t.test() and power.prop.test() now have a 'strict' argument that includes rejections in the "wrong tail" in the power calculation. (Based in part on code suggested by Ulrich Halekoh.)
o prcomp() is now fast for n x m inputs with m >> n.
o princomp() no longer allows the use of more variables than units: use prcomp() instead.
o princomp.formula() now has principal argument 'formula', so update() can be used.
o Printing an object with attributes now dispatches on the class(es) of the attributes. See ?print.default for the fine print. (PR#2506)
o print.matrix() and prmatrix() are now separate functions. prmatrix() is the old S-compatible function, and print.matrix() is a proper print method, currently identical to print.default(). prmatrix() and the old print.matrix() did not print attributes of a matrix, but the new print.matrix() does.
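The two principal-components entries above (prcomp() for wide inputs, princomp() rejecting more variables than units) can be sketched with simulated data. The dimensions and seed are arbitrary.

```r
set.seed(1)
# 5 units (rows) and 50 variables (columns): more variables than units
X <- matrix(rnorm(5 * 50), nrow = 5)

# prcomp() handles the wide case
pc <- prcomp(X)
print(length(pc$sdev))

# princomp() refuses it, so the call is wrapped in try()
res <- try(princomp(X), silent = TRUE)
print(inherits(res, "try-error"))
```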
o print.summary.{lm,glm} now default to symbolic.cor = FALSE, but symbolic.cor can be passed to the print methods from the summary methods. print.summary.{lm,glm} print correlations to 2 decimal places, and the symbolic printout avoids abbreviating labels.
o If a prompt() method is called with 'filename' as 'NA', a list-style representation of the documentation shell generated is returned.
  New function promptData() for documenting objects as data sets.
o qqnorm() and qqline() have an optional logical argument 'datax' to transpose the plot (S-PLUS compatibility).
o qr() now has the option to use LAPACK routines, and the results can be used by the helper routines qr.coef(), qr.qy() and qr.qty(). The LAPACK-using versions may be much faster for large matrices (using an optimized BLAS) but are less flexible.
o QR objects now have class "qr", and solve.qr() is now just the method for solve() for the class.
o New function r2dtable() for generating random samples of two-way tables with given marginals using Patefield's algorithm.
o rchisq() now has a non-centrality parameter 'ncp', and there's a C API for rnchisq().
o New generic function reorder() with a dendrogram method; new order.dendrogram() and heatmap().
o require() has a new argument, character.only, to make it align with library().
o New functions rmultinom() and dmultinom(), the first one with a C API.
o New function runmed() for fast running medians ('modreg' package).
o New function slice.index() for identifying indexes with respect to slices of an array.
o solve.default(a) now gives the dimnames one would expect.
o stepfun() has a new 'right' argument for right-continuous step function construction.
o str() now shows ordered factors differently from unordered ones. It also differentiates "NA" and as.character(NA), also for factor levels.
o symnum() has a new logical argument 'abbr.colnames'.
o summary() now mentions NA's as suggested by Goran Brostrom.
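A short sketch of r2dtable(), introduced above: every generated table has exactly the requested row and column totals. The marginals and seed are arbitrary, and the two sets of totals must have the same sum.

```r
set.seed(42)
row_totals <- c(3, 2)
col_totals <- c(2, 2, 1)

# Three random 2 x 3 tables with the given marginals
tabs <- r2dtable(3, row_totals, col_totals)

# Check the marginals of every generated table
marginals_ok <- all(sapply(tabs, function(tab) {
  all(rowSums(tab) == row_totals) && all(colSums(tab) == col_totals)
}))
print(marginals_ok)
```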
o summaryRprof() now prints times with a precision appropriate to the sampling interval, rather than always to 2dp.
o New function Sys.getpid() to get the process ID of the R session.
o table() now allows exclude= with factor arguments (requested by Michael Friendly).
o The tempfile() function now takes an optional second argument giving the directory name.
o The ordering of terms for terms.formula(keep.order=FALSE) is now defined on the help page and used consistently, so that repeated calls will not alter the ordering (which is why delete.response() was failing: see the bug fixes). The formula is not simplified unless the new argument 'simplify' is true.
o Added "[" method for terms objects.
o New argument 'silent' to try().
o ts() now allows arbitrary values for y in start/end = c(x, y): it always allowed y < 1 but objected to y > frequency.
o unique.default() now works for POSIXct objects, and hence so does factor().
o Package tcltk now allows return values from the R side to the Tcl side in callbacks and the R_eval command. If the return value from the R function or expression is of class "tclObj" then it will be returned to Tcl.
o A new HIGHLY EXPERIMENTAL graphical user interface using the tcltk package is provided. Currently, little more than a proof of concept. It can be started by calling "R -g Tk" (this may change in later versions) or by evaluating tkStartGUI(). Only Unix-like systems for now. It is not too stable at this point; in particular, signal handling is not working properly.
o Changes to support name spaces:
  - Placing base in a name space can no longer be disabled by defining the environment variable R_NO_BASE_NAMESPACE.
  - New function topenv() to determine the nearest top level environment (usually .GlobalEnv or a name space environment).
  - Added name space support for packages that do not use methods.
o Formal classes and methods can be 'sealed', by using the corresponding argument to setClass or setMethod.
  New functions isSealedClass() and isSealedMethod() test sealing.
o packages can now be loaded with version numbers. This allows for multiple versions of files to be installed (and potentially loaded). Some serious testing will be going on, but it should have no effect unless specifically asked for.

INSTALLATION CHANGES

o TITLE files in packages are no longer used, the Title field in the DESCRIPTION file being preferred. TITLE files will be ignored in both installed packages and source packages.
o When searching for a Fortran 77 compiler, configure by default now also looks for Fujitsu's frt and Compaq's fort, but no longer for cf77 and cft77.
o Configure checks that mixed C/Fortran code can be run before checking compatibility on ints and doubles: the latter test was sometimes failing because the Fortran libraries were not found.
o PCRE and bzip2 are built from versions in the R sources if the appropriate library is not found.
o New configure option --with-lapack to allow high-performance LAPACK libraries to be used: a generic LAPACK library will be used if found. This option is not the default.
o New configure options --with-libpng, --with-jpeglib, --with-zlib, --with-bzlib and --with-pcre, principally to allow these libraries to be avoided if they are unsuitable.
o If the precious variable R_BROWSER is set at configure time it overrides the automatic selection of the default browser. It should be set to the full path unless the browser appears at different locations on different client machines.
o Perl requirements are down again to 5.004 or newer.
o Autoconf 2.57 or later is required to build the configure script.
o Configure provides a more comprehensive summary of its results.
o Index generation now happens when installing source packages using R code in package tools. An existing 'INDEX' file is used as is; otherwise, it is automatically generated from the \name and \title entries in the Rd files.
  Data, demo and vignette indices are computed from all available files of the respective kind, and the corresponding index information (in the Rd files, the 'demo/00Index' file, and the \VignetteIndexEntry{} entries, respectively). These index files, as well as the package Rd contents data base, are serialized as R objects in the 'Meta' subdirectory of the top-level package directory, allowing for faster and more reliable index-based computations (e.g., in help.search()). For vignettes an HTML index is generated and linked into the HTML help system.
o The Rd contents data base is now computed when installing source packages using R code in package tools. The information is represented as a data frame without collapsing the aliases and keywords, and serialized as an R object. (The 'CONTENTS' file in Debian Control Format is still written, as it is used by the HTML search engine.)
o A NAMESPACE file in the root directory of a source package is copied to the root of the package installation directory. Attempting to install a package with a NAMESPACE file using --save signals an error; this is a temporary measure.
o The default configure settings for Darwin systems are --with-blas='-framework vecLib' --with-lapack --with-aqua, which by default build R as a framework and install it in /Library/Frameworks as R.framework. Then, make install just installs R.framework in /Library/Frameworks unless a different location is specified at configure time using --enable-R-framework=[DIR] or at installation time using the --prefix flag.

DEPRECATED & DEFUNCT

o The assignment operator '_' will be removed in the next release and users are now warned on every usage: you may even see multiple warnings for each usage.
  If environment variable R_NO_UNDERLINE is set to anything of positive length then use of '_' becomes a syntax error.
o machine(), Machine() and Platform() are defunct.
o restart() is defunct. Use try(), as has long been recommended.
o The deprecated arguments 'pkg' and 'lib' of system.file() have been removed.
o printNoClass() {methods} is deprecated (and moved to base, since it was a copy of a base function).
o Primitives dataClass() and objWithClass() have been replaced by class() and class<-(); they were internal support functions for use by package methods.
o The use of SIGUSR2 to quit a running R process under Unix is deprecated: the signal may need to be reclaimed for other purposes.

UTILITIES

o R CMD check more compactly displays the tests of DESCRIPTION meta-information. It now reports demos and vignettes without available index information. Unless installation tests are skipped, checking is aborted if the package dependencies cannot be resolved at run time. Rd files are now also explicitly checked for empty \name and \title entries. The examples are always run with T and F redefined to give an error if used instead of TRUE and FALSE.
o The Perl code to build help now removes an existing example file if there are no examples in the current help file.
o R CMD Rdindex is now deprecated in favor of function Rdindex() in package tools.
o Sweave() now encloses the Sinput and Soutput environments of each chunk in an Schunk environment. This makes it possible to fix some vertical spacing problems when using the latex class slides.

C-LEVEL FACILITIES

o A full double-precision LAPACK shared library is made available as -lRlapack. To use this include $(LAPACK_LIBS) $(BLAS_LIBS) in PKG_LIBS.
o Header file R_ext/Lapack.h added. C declarations of BLAS routines moved to R_ext/BLAS.h and included in R_ext/Applic.h and R_ext/Linpack.h for backward compatibility.
o R will automatically call initialization and unload routines, if present, in shared libraries/DLLs during dyn.load() and dyn.unload() calls. The routines are named R_init_ and R_unload_, respectively. See the Writing R Extensions Manual for more information.
o Routines exported directly from the R executable for use with .C(), .Call(), .Fortran() and .External() are now accessed via the registration mechanism (optionally) used by packages. The ROUTINES file (in src/appl/) and associated scripts to generate FFTab.h and FFDecl.h are no longer used.
o Entry point Rf_append is no longer in the installed headers (but is still available). It is apparently unused.
o Many conflicts between other headers and R's can be avoided by defining STRICT_R_HEADERS and/or R_NO_REMAP -- see 'Writing R Extensions' for details.
o New entry point R_GetX11Image and formerly undocumented ptr_R_GetX11Image are in new header R_ext/GetX11Image. These are used by package tkrplot.

BUG FIXES

o The redefinition of the internal do_dataentry by both the aqua and X11 modules caused a bus error when launching R without the --gui=aqua option under X11 using a version of R built to use the aqua module. This has now been fixed. (PR#6438)
o Sys.sleep() on Unix was having trouble with waits of less than 0.5s.
o The fix to PR#2396 broke read.table() on files with CR line endings. (PR#2469) Separate problem with this on Carbon Mac OS build fixed as well.
o Converting Sweave files to noweb syntax using SweaveSyntConv() was broken.
o Printing numbers near the minimum could get the number of significant figures wrong due to underflow: for example 4e-308 might print as 4.00000e-308. (Seen on some Windows builds, and also on numbers around 1e-317 on Linux.)
o wilcox.test() could give integer overflow warnings on very long vectors. Also added tests for numeric inputs, as per the help page. (PR#2453)
o Printing unquoted character vectors containing escape characters was computing the wrong length and hence misaligning names. This was due to a bug in Rstrlen which might have had other effects.
o if(logical(0)) and while(logical(0)) now report zero length, not 'missing value where logical is needed'.
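The if(logical(0)) entry above can be observed by trapping the error. This is a minimal sketch; the exact wording of the message depends on the R version and locale, so it is printed rather than compared.

```r
# A zero-length condition is an error; capture its message instead of stopping
result <- tryCatch(
  if (logical(0)) "taken" else "not taken",
  error = function(e) conditionMessage(e)
)
print(result)
```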
o The gaussian() and inverse.gaussian() families were documented to allow only one link, which has not been true in R for at least four years.
o prmatrix() forced conversion to character if 'na.print' was used, and that conversion neither respected 'digits' nor 'quote'.
o Rprof() might give misleading results for too small values of 'interval' and in practice the default 20ms was about as small as is advisable on Linux. Now the interval is forced to be at least one clock tick.
o summary.data.frame() was not giving interpretable results when the data frame contained a data frame as a column. (PR#1891)
o delete.response() might re-order the rhs terms so prediction might fail or even give incorrect results. (PR#2206)
o StructTS() now accepts numeric time series of integer storage mode.
o all(), any() now handle NAs as documented.
o Subsetting arrays to a result with 0 dimension(s) failed if the array had dimnames. (PR#2507)
o If the call to data.frame() included 0-row arguments, it tried to replicate them to the maximum number of rows, and failed if this was 1 or more.
o replicate() now understands data frames to which na.omit() has been applied.
o is.ts() was too liberal: a time series must have at least one point.
o methods() was sorting by package, not by name.
o symbols(thermometers=) was often giving a spurious warning about the range.
o tcltk was using deprecated internals of the Tcl library when accessing error messages. Not likely to be a user-visible change.
o The automatic search for BLAS libs now tries Sun's libsunperf the way the latest versions require. (PR#2530)
o str(array(1)) now does show the array. str(Surv(...)) now works again.
o step(), add1.default() and drop1.default() now work somewhat better if called from a function.
o page() was searching from the wrong environment, and so searching base before the workspace.
o crossprod(Z) for complex Z was returning nonsense.
o La.chol2inv() gave incorrect results unless the matrix was square.
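For the entry above on all() and any() handling NAs as documented, the rule is that NA is returned only when the missing value could change the answer. A minimal sketch:

```r
# all(): a single FALSE decides the result; otherwise an NA leaves it unknown
all_unknown <- all(c(TRUE, NA))
all_false   <- all(c(FALSE, NA))

# any(): a single TRUE decides the result; otherwise an NA leaves it unknown
any_true    <- any(c(TRUE, NA))
any_unknown <- any(c(FALSE, NA))

# na.rm = TRUE drops the missing values before the test
all_removed <- all(c(TRUE, NA), na.rm = TRUE)

print(c(all_unknown, all_false, any_true, any_unknown, all_removed))
```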
o When the POSIXt date functions were required to guess DST, they sometimes guessed correctly that DST was in force but converted a POSIXlt time as if standard time was given.
o c/rbind were not handling zero col/row matrices correctly. (PR#2541 was one symptom.)
o approx() and approxfun() now work with 1 knot if method = "constant". stepfun(), ecdf() and plot.stepfun() do so as well.
o AIC.lm/default was failing if multiple objects and k were specified. (PR#2518)
o removeMethods{methods} was broken. (PR#2519)
o summary.glm() had two 'aic' components in the returned object.
o autoload() was returning the value of its last command, a promise, even though it was documented to have no value. As a result some packages (e.g. nlme) were loading packages they meant to autoload.
o Fixes to methods and classes:
  - show() is consistent with using setOldClass for S3 classes.
  - several problems with the coerce and replace methods generated by setIs have been fixed.
  - more thorough tests & informative messages for invalid 'def' arguments to setGeneric
  - setGeneric will now create the generic function even when a generic of the same name already exists (it does issue a warning).
o unz() connections could no longer be opened. (PR#2579)
o unique(ordered factor) returned an unordered factor. (PR#2591)
o x[] <- value coerced x to the mode of value if and only if x had length 0! (Should only happen if x is null: PR#2590)
o lm() mislabelled the cols of the qr decomposition. (cause of PR#2586)
o data() looks for file extensions in an order prescribed in the help file: previously whether foo.R or foo.csv was used was locale-dependent.
o sys.function() now returns the actual function being evaluated in the specified frame rather than one inferred from the call.
o match.call() now uses the definition of the actual function being evaluated rather than one inferred from the call.
o abbreviate(*, dot = TRUE) now only adds a "." where abbreviations did happen.
o Changing timezones in the POSIXt functions was not working on some Linux systems, and this has been corrected.
o ks.test() in package ctest had numerical problems in the lower tail of the asymptotic distribution (PR#2571).
o Sweave() now handles empty chunks at the end of files correctly.
o [<-() lost the object bit if coercion was involved.
o package::object wasn't being deparsed properly.
o seq.POSIXt() with 'by' an object of class "difftime" ignored the units.
o rank(c("B", NA)) no longer returns character.
o A reference to by() has been added in ?tapply.
o ?lm now describes what happens with a matrix response.
o The X11 device has improved event handling. In particular it used to often miss the last of a series of resize events.
o lm.influence() and related functions now work again for the multivariate case and when there are zero weights.
o format( ) now always keeps names and dimnames.
o table(factor(c(2,NA), exclude=NULL)) prints better now.
o predict(foo, type = "terms") and hence residuals(foo, type = "partial") now work for lm and glm objects with zero weights. Further, model.matrix() is now only called once.
o R CMD config now works correctly when called from a Makefile using GNU make.
o The data.frame method for rbind() was
  - converting character columns to factors,
  - converting ordered factor columns to unordered factors,
  - failing to append correctly a factor to a character column and vice versa.
o as.hclust.twins() now does provide proper 'labels', 'method' and 'call' components.
o cycle() sometimes failed on a time series which started at a cycle other than 1.
o read.dcf() incorrectly read files which did not end in a newline.
o read.socket() dropped certain non-alphanumeric characters. (PR#2639)
o termplot() handles missing data better (PR#2687, )
o Corrected MacRoman encoding for Icircumflex etc.
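
Two of the fixes listed above (unique() on an ordered factor, and the data.frame method for rbind()) can be checked with a short R sketch. This is only an illustration, assuming a current R; the object names (sizes, a, b, ab) are hypothetical, and stringsAsFactors = FALSE is set explicitly so that the check does not depend on the default of any particular R version.

```r
# Hypothetical ordered factor used only for illustration.
sizes <- factor(c("lo", "hi", "lo"), levels = c("lo", "hi"), ordered = TRUE)

# unique() on an ordered factor should keep the ordering.
u <- unique(sizes)
is.ordered(u)

# Two small data frames with a character column and an ordered factor column.
a <- data.frame(id = c("x", "y"), size = sizes[1:2], stringsAsFactors = FALSE)
b <- data.frame(id = "z", size = sizes[3], stringsAsFactors = FALSE)

# rbind() should keep 'id' as character and 'size' as an ordered factor.
ab <- rbind(a, b)
is.character(ab$id)
is.ordered(ab$size)
```

If any of the three checks returned FALSE, that would correspond to one of the symptoms described in the items above.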
**************************************************
*                                                *
*                1.6 SERIES NEWS                 *
*                                                *
**************************************************

CHANGES IN R VERSION 1.6.2

BUG FIXES

o plot.stepfun() now obeys a 'ylim=.' specification.
o removeClass() does a better job of removing inheritance information.
o setIs() will not allow mismatched representations between two classes (without an explicit coerce method).
o The code underlying polygon drawing contained a memory leak. This showed up in persp, but did not affect other graphics functions. It is now possible to draw big DEMs.
o logLik.nls() gave wrong df. (PR#2295)
o rbind() with a mixture of data frames and matrices treated the matrices as vectors. (PR#2266)
o stripchart(method="stack") was not handling missing values. (PR#2018)
o Arithmetic functions such as log() lost the object bit from classed objects if coercion was needed. (PR#2315)
o exp_rand would go into an infinite loop if unif_rand returned 0.
o formatC(x, format="fg") could return exponential format if rounding pushed x over a positive power of 10. (PR#2299)
o attr(x, foo) used partial matching for 'foo' (even though not documented to do so), and failed to find 'foo' if there were two or more partial matches before the exact match in the list of attributes.
o Rdconv now creates direct HTML hyperlinks when linking to documentation in the same package. The code now ensures that links which can be resolved within the package are so resolved, even when there are possible resolutions in other packages.
o If readBin(what=character()) is used incorrectly on a file which does not contain C-style character strings, warnings (usually many) are now given.
o Building libR.so with the zlib in the R sources was not finding the local zlib headers.
o system(intern=TRUE) had an undocumented line length limit of 119 chars both on Unix and Windows. The limit is now 8096 and documented. On Unix (only) every 120th character used to be discarded.
o plot.POSIX[cl]t were not passing graphics parameters on to axis.POSIXct.
o On some HP-UX systems, installed scripts were not executable when using the BSD-compatible install system program found by configure. We now always use install-sh on HP-UX. (PR#2091)
o c() was converting NA names to "NA": now proper NA strings are used wherever possible. (PR#2358)
o Checks in the C code prevent possible memory faults when standardGeneric is called invalidly.
o Macros NEW_OBJECT (aka NEW) and MAKE_CLASS have been added; they are required by the .Call interface to generate arbitrary objects.
o A typo was causing segfaults when using data.entry under SuSE.
o mostattributes<-() was failing to copy across dimnames when one component was NULL, affecting pmax() and pmin() when the first argument was a matrix. (root cause of PR#2357)
o The pdf() device now initialises graphical parameters properly. (PR#2281)
o A problem that prevented package tcltk from working with Tcl/Tk 8.4 (crash on initialization) has been resolved. (Note that binaries may still require an older Tcl/Tk, for example on Windows.)
o type.convert() was not getting the levels right if passed a character vector containing NAs, and 'na.strings' did not contain "NA". This affected read.table().
o Internal match function did not check for nor handle 0-length vectors. (The R function match() did.) This could cause type.convert() to segfault.
o The line length limit in output text connections has been raised to 8095 chars.
o Sweave now uses anonymous file rather than text connections to avoid the limits of the latter (see previous item).
o Parsing did not work on connections when pushback was used (as it had never been implemented). (PR#2396)
o max.col() only found NAs in the first column (typo).
o Added a workaround for recent versions of glibc (e.g. RedHat 8.0) with inconsistent mktime/localtime functions which caused conversion to/from POSIXct times prior to 1970-01-01 to be inconsistent. On such platforms this is a run-time test to allow e.g. R compiled on RH7.2 to run on RH8.0.
o Clipping was not being reset properly between plots on the gtk() device (the default under the GNOME interface). (PR#2366)
o axis(*, fg= cc) now works (again) the same as axis(*, col = cc).

CHANGES IN R VERSION 1.6.1

NEW FEATURES

o Added a few "trivial and obviously missing" functions to tcltk: tkchooseDirectory, tkpopup, tkdialog, tkread.
o barplot() has a new argument 'axis.lty', which if set to 1 allows the pre-1.6.0 behaviour of plotting the axis and tick marks for the categorical axis. (This was apparently not intentional, but axis() used to ignore lty=0.) The argument 'border' is no longer ".NotYetUsed".

BUG FIXES

o hist(, cex.axis = f) now works for x-axis too.
o prompt() gave wrong \usage{.} for long argument default expressions.
o summary(x) gives more information when 'x' is a logical (or a data frame with a logical column which is now quite customary).
o seq.POSIXt(from, to, length.out= . ) could give too long results.
o summaryRprof() was counting nested calls to the same function twice.
o Printing of objects of mode "expression" did strange things if there were "%" characters in the deparsed expression (PR#2120).
o as.matrix.data.frame converted missings to "NA" not character NA. (PR#2130)
o spec.pgram() was only interpolating zero freq for one series. (PR#2123)
o help(randu) had % unescaped in the example. (PR#2141)
o Making html links would fail if packages-head.html was not writable. (PR#2133)
o Sweave.sty was not installed to $R_HOME/share/texmf when builddir != srcdir. On Windows backslashes in latex paths have to be replaced by slashes.
o A memory leak in deparsing was introduced when eliminating static variables (thanks to Achim Zeileis for spotting this).
  A similar problem in loading workspaces has been corrected.
o TclInterface.Rd incorrectly used \synopsis for \usage so that the usage section wasn't output.
o Readline stack off-by-one error. (PR#2165)
o R_ExpandFileName had a memory leak in the case libreadline was used under Unix-alikes.
o sys.save.image() now closes all connections so it will work even if the connection list has become full.
o loess() had an unstated limit of four predictors: this is now documented and enforced.
o ${R_HOME}/etc/Renviron.site is now not read if R_ENVIRON is set, as documented. Previously it was read unless R_ENVIRON pointed to an actual file.
o Startup.Rd described the processing under Unix-alikes but incorrectly implied it happened that way on the Windows and Mac OS ports. Neither uses Renviron.site, for example.
o besselK(x,*) now returns 0 instead of Inf for large x. (PR#2179)
o The Tcl console code didn't work with Tcl/Tk 8.0, and has been #ifdef'd out. (PR#2090)
o format.AsIs() was not handling matrices.
o sd() was not passing na.rm to var() for matrices and data frames.
o dist() {mva} silently treated +/-Inf as NA.
o setwd() now returns NULL invisibly.
o basename() and dirname() did not check the length of their input and ignored elements after the first. This affected undoc {tools}.
o If A had dimnames, eigen(A) had inappropriate dimnames. (PR#2116)
o as.POSIXct.dates had a sign error for the origin. (PR#2222)
o The claim that pie charts should be avoided (in pie.Rd) is now supported by a quote from Cleveland (1985).
o The vsnprintf() functions supplied for systems that don't supply their own had a bug in the output of fractional parts, corrupting data if using save() with ascii=TRUE. (PR#2144)
o pretty() could return values close to 0 in some cases; these are now 0. (PR#1032 and D.Brahm's mails)
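
Two of the 1.6.1 fixes above are easy to check interactively: basename() and dirname() handling every element of their input, and besselK() returning 0 rather than Inf for large x. The following is a minimal R sketch, assuming a current R; the paths are made-up examples and need not exist on disk.

```r
# Hypothetical paths; basename()/dirname() only manipulate the strings.
paths <- c("/usr/lib/R/bin/R", "/tmp/data/foo.csv")

# Both functions should return one result per input element,
# not just a result for the first element.
basename(paths)
dirname(paths)

# For large x the (unscaled) modified Bessel function K underflows,
# so the result should be 0 rather than Inf.
besselK(1e4, 1)
```

Using a vector of length two is enough to expose the old behaviour, since elements after the first were the ones being ignored.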