CRAN Task View: Archaeological Science
Maintainer: Ben Marwick | Contact: benmarwick at gmail.com | Version: 2018-05-01
This CRAN Task View contains a list of packages useful for scientific work in Archaeology, grouped by topic. Note that this is not an official CRAN Task View, just one I have prepared for my own convenience, so it includes some packages only on GitHub and other non-CRAN resources I find useful. Many of the most highly recommended packages listed here can be installed in a single step by installing the tidyverse package.
Besides these packages, a very wide variety of functions suitable for scientific work in Archaeology is provided by both the basic R system (and its set of recommended core packages), and a number of other packages on the Comprehensive R Archive Network (CRAN) and GitHub. Consequently, several of the other CRAN Task Views may contain suitable packages, in particular the Social Sciences, Spatial, Spatio-temporal, Cluster analysis, Multivariate Statistics, Bayesian inference, Visualization, and Reproducible research Task Views.
Contributions to this Task View are always welcome, and encouraged. The source file for this particular task view file resides in a GitHub repository (see below), and pull requests are the preferred method for contributions.
- The ideal method is to export your spreadsheets from Excel (or whatever program you made them in) as CSV files (comma-separated values: a simple, non-proprietary, plain-text format that is human-readable, easily machine-processable and suitable for archival storage) and read them into R using the base function read.csv(). Data in other types of plain text files can be read in with read.table().
- To read Microsoft Excel files into R there are a number of packages: readxl (requires Rcpp, which in turn requires Rtools for Windows or XCode for OSX), gdata (requires Perl), openxlsx (requires Rcpp, which in turn requires Rtools for Windows or XCode for OSX), XLConnect (requires rJava and Java), xlsx (also requires rJava and Java). OpenDocument Spreadsheet files can be read into R using readODS.
- For working with untidy and/or complex Excel spreadsheets (i.e. many tables in one sheet, coloured cells, cells with formulas, etc.), use jailbreakr, xlsxtractr, tidyxl and unpivotr
- Text data (as in sentences and paragraphs) can be read in and analysed with the tm package or tidytext. If the text is in a Microsoft Word file, use textreadr, or if it's a PDF file, use pdftools. Text in image files can be extracted with tesseract
- For quickly reading in a very large number of CSV files, or very large CSV files, use fread() from the data.table package or functions in the readr package. For plain text (CSV, TSV, etc.) files that are too big to fit in memory, use chunked to read and operate on them in small chunks.
- RODBC, RMySQL, RPostgreSQL, RSQLite for connecting R to SQL databases. RODBC can connect to Microsoft Access databases.
- The haven and foreign packages can be used for reading and writing files from certain versions of Minitab, S, SAS, SPSS, Stata, Systat and Weka.
- ESRI shapefiles can be read using rgdal or maptools
- R can receive data directly from the web using rvest, httr, XML, jsonlite, and RSelenium (requires Selenium 2.0 Remote WebDriver). R can be programmed to be a web-scraper using rvest and/or RSelenium. The Web Technologies task view gives more details.
- Google spreadsheets can be read into R using the googlesheets package
- Tables can be read directly from Microsoft Word documents with docxtractr, and from PDF documents with tabulizer
- Datasets from the Open Context repository can be browsed and read into R using the opencontext package
- dplyr and data.table for splitting the data up by groups, applying some common or custom functions, and combining the output back into a convenient form (i.e. typical aggregation, splitting and summarising operations). Both packages are fast on very large datasets.
- For cleaning, examining, and making quick summaries of data, janitor, skimr, statar, and xda may be useful. tabplot has functions for exploratory data visualisation of tables.
- tidyr for rearranging the data from long to wide forms, and more complex reshaping.
- purrr simplifies working with lists, applying functions to list elements and collecting the results
- broom takes the output of many built-in functions in R, such as t.test(), and turns it into tidy data frames.
- measurements, convertr, and units convert between metric and imperial units, or calculate a dimension's unknown value from other dimensions' measurements.
- ggplot2 produces a very wide variety of attractive plots with a highly flexible and logical syntax.
- To combine and align multiple ggplot2 plots (and other image types) in one panel, use cowplot, patchwork, egg, gridExtra, or multipanelfigure (the most versatile for combining different elements)
- Extensions include ggbiplot (PCA biplots with ellipses), GGally (plot matrices), ggtern (ternary plots), ggfortify (many methods for plotting PCA, clustering, linear model output, etc., using ggplot2), ggalt (more geoms, coords, stats, scales and fonts, including splines, 1d and 2d densities), waffle (for square pie charts), ggraph for treemaps, ggfan for fanplots, tidybayes for plotting output of Bayesian analyses, ggridges for ridge plots, and ggrepel for moving overlapping text labels away from each other.
- For showing distributions across several categories: ggforce, ggbeeswarm, vipor, sinaplot
- plotly and ggiraph make ggplots interactive with mouse-over pop-ups, zooming, click-actions, etc. scatterD3 makes highly interactive scatter plots
- circlize implements Circos in R for circular and chord plots. Rose plots can be made with ggplot2. Schmidt diagrams can be made with the net function in RFOC or the Stereo* functions in RockFab.
- plotrix has the function battleship.plot() to make Ford's battleship diagrams.
- ggmap combines the spatial information of static maps from Google Maps, OpenStreetMap, Stamen Maps or CloudMade Maps with the layered grammar of graphics implementation of ggplot2
- Stratigraphic data plots can be drawn using tidypaleo, the Stratiplot() function in analogue, and the functions strat.plot() and strat.plot.simple() in the rioja package. The rioja package also includes chclust() for stratigraphically constrained clustering, and related dendrogram plotting methods.
- rgl and plotly draw interactive 3D plots, and scatterplot3d draws 3D point clouds.
- tabplot for exploratory data visualisation of tables
- For schematic diagrams, such as Harris matrices, DiagrammeR is useful.
- For colour schemes in plots: viridis for perfectly perceptually-uniform colours, RColorBrewer, wesanderson, and munsell for exploring and using the Munsell colour system, and for some extra themes for ggplot2, including some Tufte-inspired themes, see ggthemes.
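To make the CSV import and split-apply-combine steps listed earlier more concrete, here is a minimal sketch using only base R. The data frame, its column names, and its values are invented for illustration; dplyr or data.table would express the same grouping more compactly.

```r
# Hypothetical artefact measurements; all names and values are invented.
artefacts <- data.frame(
  context   = c("A", "A", "B", "B", "B"),
  length_mm = c(32.1, 28.4, 45.0, 41.2, 39.8)
)

# Write to a temporary CSV file, then read it back in,
# as one would with a spreadsheet exported from Excel.
csv_path <- tempfile(fileext = ".csv")
write.csv(artefacts, csv_path, row.names = FALSE)
artefacts_in <- read.csv(csv_path, stringsAsFactors = FALSE)

# Split by context, apply mean() to each group, and combine the results.
mean_by_context <- aggregate(length_mm ~ context, data = artefacts_in, FUN = mean)
print(mean_by_context)
```

The formula interface of aggregate() keeps the grouping variable and the summarised variable visible in one place, which is roughly what group_by() and summarise() do in dplyr.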
Analysis in general
- Base R, especially the stats package, has a lot of functionality useful for analysing archaeological data, for example power.anova.test(), among many others. Hmisc includes bootstrapping, setting confidence intervals, and power analysis functions; see also psych for useful descriptive statistics and visualizations.
- corrr contains many convenient functions for exploring correlations.
- Bayesian and resampling variants of these also exist, for example in the MCMCpack, BEST (requires JAGS), Bayesian First Aid (also requires JAGS) packages (see the Bayesian task view for more) and the coin, boot, and bootstrap packages (see also msm).
- For analysing change over time, bcp, changepoint, ecp, AnomalyDetection and BreakoutDetection provide functions for detecting distributional changes within time-ordered observations.
- abc provides functions for parameter estimation, model selection, and goodness-of-fit.
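As an illustration of the power analysis functions in the stats package mentioned above, the following sketch asks how many samples per group a one-way ANOVA would need. The number of groups and both variances are assumed values chosen only for the example.

```r
# Assumed design: 3 groups, with hypothetical between- and within-group variances.
# Leaving n unspecified asks power.anova.test() to solve for it.
pw <- power.anova.test(groups = 3, between.var = 1, within.var = 3,
                       power = 0.80, sig.level = 0.05)

# Round up, because a fraction of a sample cannot be collected.
n_per_group <- ceiling(pw$n)

# Check the power actually achieved with the rounded-up sample size.
achieved <- power.anova.test(groups = 3, n = n_per_group,
                             between.var = 1, within.var = 3)$power
print(c(n_per_group = n_per_group, achieved_power = achieved))
```

Exactly one of groups, n, between.var, within.var, power, or sig.level is left as NULL, and the function solves for that one.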
Analysis of categorical and count data
- The table() function in the base package and the xtabs() and ftable() functions in the stats package construct contingency tables.
- The chisq.test() and fisher.test() functions in the stats package may be used to test for independence in two-way contingency tables.
- The assocstats() function in the vcd package computes the Pearson chi-squared test, the likelihood ratio chi-squared test, the phi coefficient, the contingency coefficient and Cramér's V for plain or stratified contingency tables.
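A short sketch of the contingency table workflow described above, using only base R. The counts are invented, and Cramér's V is computed by hand from the uncorrected chi-squared statistic rather than with vcd::assocstats(), so that the example has no package dependencies.

```r
# Hypothetical counts of two artefact types in two site areas (invented data).
finds <- data.frame(
  area = rep(c("north", "south"), times = c(30, 30)),
  type = c(rep(c("flake", "core"), times = c(22, 8)),
           rep(c("flake", "core"), times = c(12, 18)))
)

# Cross-tabulate area against artefact type.
tab <- table(area = finds$area, type = finds$type)
print(tab)

# Tests of independence between area and type.
ft <- fisher.test(tab)
ct <- chisq.test(tab, correct = FALSE)

# Cramer's V: sqrt(chi-squared / (n * min(rows - 1, cols - 1))).
cramers_v <- sqrt(as.numeric(ct$statistic) / (sum(tab) * min(dim(tab) - 1)))
print(c(fisher_p = ft$p.value, chisq_p = ct$p.value, cramers_v = cramers_v))
```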
Linear, generalized linear models, and non-linear models
- Linear models can be fitted (via OLS) with lm() (from stats). The modelr package has helper functions for pipeable modelling (e.g. cross-validation, bootstrapping). For data from a non-normal population, or when there are apparent outliers, lmPerm computes linear models using permutation tests.
- Bayesian fitting of linear and non-linear models is possible with rstanarm, brms, and rethinking.
- The nls() function (from stats) as well as the package minpack.lm allow the solution of nonlinear least squares problems.
- Correlated and/or unequal variances can be modeled using the gnls() function of the nlme package and by nlreg. The nlme package is supported by Pinheiro & Bates (2000) Mixed-effects Models in S and S-PLUS, Springer, New York.
- The generic anova() function in the stats package constructs sequential analysis of variance and analysis of deviance tables, and can compute F and likelihood-ratio tests for nested models. (It is typical for other classes of statistical models in R to have anova methods as well.) The Anova() function in the car package (associated with Fox, An R and S-PLUS Companion to Applied Regression, Sage, 2002) constructs so-called "Type-II" and "Type-III" tests for linear and generalized linear models.
- cerUB for multivariate statistic protocols for integrating archaeometric data (geochemical, mineralogical, petrographic)
- The Cluster task view provides a more detailed discussion of available cluster analysis methods and appropriate R functions and packages.
- caret and FactoMineR are popular packages with a suite of multivariate methods
- aplpack provides bagplots, among other plots. lda() and qda() within MASS provide linear and quadratic discrimination respectively.
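As a rough sketch of the lm() and anova() workflow described in this section, the example below fits two nested linear models to simulated data and compares them with an F test. The variable names (rim_diameter, ware, height) and the simulated relationship are assumptions made for illustration only.

```r
set.seed(1)

# Simulated vessel measurements: height depends on rim diameter but not on ware.
n <- 60
pots <- data.frame(
  rim_diameter = runif(n, min = 10, max = 30),
  ware = factor(sample(c("coarse", "fine"), n, replace = TRUE))
)
pots$height <- 5 + 0.8 * pots$rim_diameter + rnorm(n, sd = 2)

# Two nested models: the second adds ware as a predictor.
fit_small <- lm(height ~ rim_diameter, data = pots)
fit_large <- lm(height ~ rim_diameter + ware, data = pots)

# F test for whether the extra term improves the fit.
cmp <- anova(fit_small, fit_large)
print(coef(fit_small))
print(cmp)
```

anova() on two nested fits gives a sequential comparison; Type-II or Type-III tests would instead need car::Anova().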
Model Testing and Validation
- caret provides many functions for model training, testing, and validation. There is also Max Kuhn's excellent companion book called "Applied Predictive Modeling" (Springer, also available as an ebook). The mlr package also provides classification, regression, and machine learning methods.
- Other packages that enable tuning and evaluation of models include bootstrap and Hmisc
- The cvTools and DAAG packages include cross-validation functions for evaluating the optimality of tuning parameters, such as sample sizes or number of predictors, in statistical models
- rsample creates different types of resamples and corresponding classes for their analysis
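The cross-validation idea behind several of these packages can be sketched by hand in base R. This is a simplified illustration on simulated data, not the interface of caret, cvTools, or rsample; the fold assignment and the RMSE metric are choices made for the example.

```r
set.seed(42)

# Simulated data with a known linear relationship.
n <- 100
d <- data.frame(x = rnorm(n))
d$y <- 2 + 3 * d$x + rnorm(n)

# Randomly assign each row to one of k folds of equal size.
k <- 5
fold <- sample(rep(seq_len(k), length.out = n))

# For each fold: fit on the other folds, predict the held-out fold, record the RMSE.
rmse <- vapply(seq_len(k), function(i) {
  train <- d[fold != i, ]
  test  <- d[fold == i, ]
  fit   <- lm(y ~ x, data = train)
  sqrt(mean((test$y - predict(fit, newdata = test))^2))
}, numeric(1))

cv_rmse <- mean(rmse)
print(cv_rmse)
```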
Hierarchical cluster analysis
- The package cluster provides functions for cluster analysis following the methods described in Kaufman and Rousseeuw (1990) Finding Groups in data: an introduction to cluster analysis, Wiley, New York
- There is also hclust() in the stats package.
- pvclust is a package for assessing the uncertainty in hierarchical cluster analysis. It provides approximately unbiased p-values as well as bootstrap p-values. Enhanced plotting is also available through the dendextend package.
- dendextend offers a set of functions for extending dendrogram objects in R. It lets you adjust a tree's graphical parameters (the colour, size, type, etc. of its branches, nodes and labels) and visually (and statistically) compare different dendrograms to one another.
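A minimal sketch of hierarchical clustering with hclust() from the stats package, on simulated data with two well-separated groups. The sample labels, the average-linkage method, and the choice of two clusters are assumptions for the example.

```r
set.seed(7)

# Two simulated groups of 10 samples each, measured on 2 variables.
m <- rbind(
  matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2)
)
rownames(m) <- paste0("s", 1:20)

# Euclidean distances, then average-linkage agglomerative clustering.
d  <- dist(m)
hc <- hclust(d, method = "average")

# Cut the tree into two groups and tabulate the membership.
groups <- cutree(hc, k = 2)
print(table(groups))
```

Calling plot(hc) draws the dendrogram; pvclust and dendextend build on the same kind of object.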
Other partitioning methods
- kmeans() in stats provides k-means clustering and cmeans() in e1071 implements a fuzzy version of the k-means algorithm. The recommended package cluster also provides functions for various partitioning methodologies.
- To compute the optimum number of clusters there is the pamk() function in the fpc package.
- Self-organising maps can be produced with the kohonen package.
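The k-means approach mentioned above can be sketched with kmeans() from stats. The data are simulated, and the "elbow" of total within-cluster sum of squares is used here as an informal stand-in for choosing the number of clusters; it is not what fpc::pamk() does.

```r
set.seed(11)

# Two simulated groups of 20 samples each, measured on 2 variables.
m <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 4, sd = 0.3), ncol = 2)
)

# k-means with several random starts to avoid a poor local optimum.
km <- kmeans(m, centers = 2, nstart = 10)
print(km$size)

# Total within-cluster sum of squares for k = 1 to 5; look for the bend.
wss <- vapply(1:5, function(k) {
  kmeans(m, centers = k, nstart = 10)$tot.withinss
}, numeric(1))
print(round(wss, 2))
```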
Mixture models and model-based cluster analysis
Principal components and other projection, scaling, and ordination methods
- Principal Components Analysis (PCA) is available via the prcomp() function (based on svd). rda() (in package vegan), pca() (in package labdsv) and dudi.pca() (in package ade4) provide more ecologically-orientated implementations. Plotting of PCA output is available in ggbiplot and ggfortify.
- Redundancy Analysis (RDA) is available via rda() in vegan.
- Canonical Correspondence Analysis (CCA) is implemented in cca() in both vegan and ade4.
- Detrended Correspondence Analysis (DCA) is implemented in decorana() in vegan.
- Principal coordinates analysis (PCO) is implemented in pco() in ecodist, and cmdscale() in package stats.
- Non-metric multidimensional scaling (NMDS) is provided by isoMDS() in package MASS. nmds(), a wrapper function for isoMDS(), is also provided by package labdsv. vegan provides the helper function metaMDS(), implementing random starts of the algorithm and standardised scaling of the NMDS results. The approach adopted by vegan with metaMDS() is the recommended approach for ecological data.
- corresp() and mca() in MASS provide simple and multiple correspondence analysis respectively. ca also provides single, multiple and joint correspondence analysis.
- mca() in ade4 provide correspondence and multiple correspondence analysis respectively, as well as adding homogeneous table analysis with hta(). Further functionality is also available within vegan, and co-correspondence analysis is available from cocorresp. FactoMineR provides CA() and MCA(), which also enable simple and multiple correspondence analysis as well as associated graphical routines. CAinterprTools has functions for correspondence analysis and diagnostics.
- Seriation methods are available in seriation, which includes bertinplot() for producing battleship plots, and CAseriation, which also has a battleship plotting function.
- tabula provides methods to analyse and visualise archaeological count data (artifacts, faunal remains, etc.) using diversity measures and Ford (1962) and Bertin (1977) diagrams.
- Distance and dissimilarity matrices can be computed with dist() in standard package stats, daisy() in recommended package cluster, distance() in ecodist, a suite of functions in ade4, and rdist for very fast distance computations.
- simba provides functions for the calculation of similarity and multiple plot similarity measures with binary data (for instance presence/absence data)
- distance() in the analogue package can be used to calculate dissimilarity between samples of one matrix and those of a second matrix. The same function can be used to produce pair-wise dissimilarity matrices, though the other functions listed above are faster.
- distance() can also be used to generate matrices based on Gower's coefficient for mixed data (mixtures of binary, ordinal/nominal and continuous variables). Function daisy() in package cluster provides a faster implementation of Gower's coefficient for mixed-mode data than distance() if a standard dissimilarity matrix is required. Function gowdis() in package FD also computes Gower's coefficient and implements extensions to ordinal variables.
- DistatisR provides functions for three-way multidimensional scaling for the analysis of multiple distance/covariance matrices collected on the same set of observations.
- Simple and partial Mantel tests to compute the Mantel statistic as a matrix correlation between two dissimilarity matrices are available in vegan and ecodist
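To connect the PCA, distance matrix, and principal coordinates entries above, here is a small base R sketch. The compositional data are simulated and the element names are placeholders. It relies on a standard result: classical scaling (cmdscale) of Euclidean distances recovers the PCA scores, up to the sign of each axis.

```r
set.seed(3)

# Simulated measurements of 4 variables on 30 samples (invented data).
comp <- matrix(rnorm(30 * 4), ncol = 4,
               dimnames = list(paste0("sherd", 1:30), c("Fe", "Ca", "K", "Ti")))

# PCA on centred, unit-variance variables.
pca <- prcomp(comp, center = TRUE, scale. = TRUE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Euclidean distances on the same standardised variables,
# then principal coordinates analysis in two dimensions.
d   <- dist(scale(comp))
pco <- cmdscale(d, k = 2)

# The first PCO axis matches the first PCA axis apart from a possible sign flip.
print(all.equal(abs(unname(pco[, 1])), abs(unname(pca$x[, 1])), tolerance = 1e-6))
```

With a non-Euclidean dissimilarity (for example Gower's coefficient from daisy()), the two ordinations would no longer coincide, which is the usual reason to choose PCO or NMDS over PCA.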
Making maps and using R as a Geographical Information System
- Making maps: ggspatial for easy plotting of most kinds of spatial data, see also: maps, rworldmap, mapdata, maptools, mapproj, ggplot2, ggmap, RgoogleMaps, cartography, RColorBrewer
- Scale bars and North arrows can be added to maps made with ggplot and ggmap using GISTools, ggsn or legendMap.
- Interactive mapping of spatial objects with zooming and panning is possible with leaflet and geomapview
- To interactively create and edit spatial objects (points, lines, polygons), use mapedit, and to smooth drawn polygons, use smoothr
- R has many packages that enable it to be used as a GIS for spatial analysis: sf, sp, raster, rasterVis, shapefiles, spatial, spatstat, splancs, ipdw, geoR, argosfilter, ads, spdep, gstat, GISTools
- spgrass6 and rgrass7 provide facilities for using all GRASS geographical information system commands from the R command line. RQGIS establishes an interface between R and QGIS, i.e. it allows the user to access QGIS functionalities from within R.
- spdplyr provides methods for dplyr verbs for 'sp' and 'Spatial' class objects.
- rgdal uses the GDAL (Geospatial Data Abstraction Library) (raster) and OGR (vector) data I/O library, as well as PROJ.4 for CRS (coordinate reference systems) (re)projections
- rgeos uses the GEOS (Geometry Open Source) library, which powers PostGIS: does the 'usual' geometry operations for features
- The Spatial and Spatio Temporal task views have more details.
- recexcavAAR for 3D reconstruction of archaeological excavations
- SiteExploitationTerritories implements the Tobler Hiking Function for spatial time-cost analysis of rasters
- klrfome provides functions to model a single scalar outcome (e.g. presence/absence of an archaeological site) to a distribution of features (such as landscape and environmental variables).
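For readers unfamiliar with the Tobler hiking function mentioned above, the sketch below writes out the published formula, W = 6 exp(-3.5 |S + 0.05|), where W is walking speed in km/h and S is slope expressed as rise over run. The function names tobler_speed() and travel_time_h() are hypothetical helpers written for this example; they are not the interface of the SiteExploitationTerritories package.

```r
# Walking speed in km/h for a given slope (rise / run; negative is downhill).
# Hypothetical helper implementing Tobler's published formula.
tobler_speed <- function(slope) {
  6 * exp(-3.5 * abs(slope + 0.05))
}

# Time in hours to cover a horizontal distance with a given change in elevation.
# Hypothetical helper; distance_m and rise_m are both in metres.
travel_time_h <- function(distance_m, rise_m) {
  slope <- rise_m / distance_m
  (distance_m / 1000) / tobler_speed(slope)
}

# Speed on flat ground, on a gentle descent, and on a 10% climb.
print(tobler_speed(c(0, -0.05, 0.10)))

# Time to walk 1 km on the flat versus 1 km while climbing 100 m.
print(c(flat = travel_time_h(1000, 0), uphill = travel_time_h(1000, 100)))
```

In a raster cost analysis the same calculation would be applied between neighbouring cells, with slope taken from a digital elevation model.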
Environmental & geological analysis
- Transfer function models including weighted averaging (WA), modern analogue technique (MAT), Locally-weighted WA, & maximum likelihood (aka Gaussian logistic) regression (GLR) are provided by analogue, vegan, and rioja for stratigraphic analyses
- G2Sd gives full descriptive statistics and a physical description of sediments based on grain-size distributions. soiltexture and ggtern draw ternary plots of soil texture.
- Constrained clustering of stratigraphic data is provided by function chclust() in the form of constrained hierarchical clustering in rioja.
- Stratigraphic columns can be plotted and analysed with the SDAR package.
- Benn diagrams can be drawn with plotrix and Woodcock diagrams with RFOC.
- Functions for circular statistics, such as the Rayleigh test and many others, can be found in CircStats, RFOC, circular, Directional, and heR.Misc
- The siar package takes data on organism isotopes and fits a Bayesian model to their dietary habits based upon a Gaussian likelihood with a mixture Dirichlet-distributed prior on the mean
- The zooaRch package has functions for survival analysis of zooarchaeological datasets
- Functions for tree ring analysis can be found in dplR
- See the Environmetrics task view for more.
- Radiocarbon dates can be calibrated using Bchron with various calibration curves (including user-generated ones). Bchron also does age-depth modelling, relative sea level rate estimation incorporating time uncertainty in polynomial regression models, and non-parametric phase modelling via Gaussian mixtures as a means to determine the activity of a site (and as an alternative to the OxCal function SUM). Some of these methods can also be found in rcarbon.
- Bayesian age-depth modelling of radiocarbon dates is also available in Bacon, and clam contains functions for "classical", non-Bayesian age-depth modelling. These are not R packages, but clam has been packaged for easy use.
- The oxcAAR package allows you to use R to connect to a local installation of the OxCal software to calibrate radiocarbon dates and a variety of other OxCal operations.
- ArchaeoPhases provides statistical tools to analyze and to estimate archaeological phases from the posterior distribution (i.e. MCMC samples) of a sequence of dates. Includes testing procedures to check the presence of a gap between two successive phases or periods.
- Various R functions for Luminescence Dating data analysis are in the Luminescence package (including radial plotting) and in the numOSL package, including equivalent dose calculation, annual dose rate determination, growth curve fitting, decay curve decomposition, statistical age model optimization, and statistical plot visualization.
- The archSeries package makes chronologies from information from multiple entities with varying chronological resolution and overlapping date ranges
- For time series analysis using calendar dates, zoo and padr are useful.
Phylogenetics, morphometrics, evolution and shape analysis
- The Phylogenetics task view provides more detailed coverage of the subject area and related functions within R.
- Packages specifically tailored for the analysis of phylogenetic and evolutionary data include: ape, phytools, phangorn, Rphylip (requires PHYLIP), ouch, and pegas.
- For plotting trees most of these packages include their own modifications of the base plot() function, and there are also ggtree, ggdendro, dendextend, and ggphylo
- Morphometric and shape analysis methods are provided by shapes, geomorph, paleomorph and Momocs. Related packages include shapeR, Anthropometry and Morpho.
- StereoMorph allows users to collect 3D landmarks and curves from objects using two standard digital cameras.
- pixmap provides methods for creating, plotting and converting bitmapped images in three different formats: RGB, grey and indexed pixmaps. Similarly, jpeg provides an easy and simple way to read, write and display bitmap images stored in the JPEG format.
- EBImage (requires ImageMagick) provides general purpose functionality for the reading, writing, processing and analysis of images (and is very well documented). Various functions for image processing and analysis can also be found in ripa and imager
- magick provides bindings to the ImageMagick image-processing library, the most comprehensive open-source image processing package available.
- NetLogoR provides functions to easily create agent-based models in R following the NetLogo framework
- RNetLogo links R and NetLogo
- simecol for simulating ecological (and other) dynamic systems. It can be used for differential equations, individual-based (or agent-based) and other models as well.
- One-dimensional cellular automata are also possible to model with the package CellularAutomaton.
- The two major packages are igraph, which is a generic network analysis and visualisation package, and sna, which performs social analysis of networks.
- Other packages include statnet, intergraph, network (manipulates and displays network objects), ergm (a set of tools to analyze and simulate networks based on exponential random graph models), hergm (implements hierarchical exponential random graph models), and RSiena (allows the analysis of the evolution of social networks using dynamic actor-oriented models)
- mortAAR for analysis of archaeological mortality data
Writing Reproducible manuscripts
- rrtools is a package that provides instructions, templates, and functions for making a basic compendium suitable for writing reproducible research reports and articles with R.
- The rticles package includes templates for converting markdown documents into PDF files formatted ready for submission for publication, such as in the PLOS journals, the Frontiers In journals, and Elsevier journals. Similarly, the papaja package contains a template for converting a markdown file into an APA-formatted PDF. These packages depend on pandoc, a universal document format converter (not an R package). In this context it is used to convert rmarkdown or LaTeX to PDF, MS Word or HTML files. It is included with RStudio but can also be used stand-alone from the command line.
- RStudio is an integrated development environment that simplifies developing R code with numerous built-in conveniences, including vim keyboard shortcuts. There are also packages that make scholarly writing in RStudio easy: wordcountaddin, citr. And several for making nice tables: kableExtra, carpenter, htmlTable, pixiedust, pander, simpletable, stargazer
- The rmarkdown package implements the simple markdown document formatting language with some minor customizations to recognize R code blocks and inline code. The bookdown package provides tools for single file and multi-chapter rmarkdown documents with all the usual scholarly accessories: citations, figures, tables, captions and cross-referencing.
- redoc is an experimental package to enable a two-way R-Markdown ⟷ Microsoft Word workflow.
- Emacs is a highly flexible text editor (easiest to use as the spacemacs distribution), which when used with the Emacs Speaks Statistics package, is a comprehensive writing and R development environment. Org-mode provides a literate programming environment in Emacs similar to knitr.
- knitr enables R code and text with formatting instructions (eg. markdown or LaTeX) to be combined in a single document and executed to produce a document that contains rendered plots, analysed data and formatted text. The remake package has functions that enable declarative workflows so that each time an analysis is run, it updates only the parts of the workflow that have changed.
- The rrrpkg essay explains why the R package is a suitable file-and-folder structure for almost any research project, with real-world examples. manuscriptPackage, template, template (yes, that's two slightly different packages with the same name), prodigenr, makeProject, and ProjectTemplate are packages that give templates for organising an analysis as an R package (e.g. where the manuscript is the package vignette, or similarly bundled with the package). rlp is a package that lets you write an analysis as an Rmd file and then converts it into a package
- packrat supports the development of isolated, stand-alone projects that include all the packages used and their dependencies. miniCRAN has functions to create a local repository from which packages (and their dependencies) can be installed without internet access. Related packages include rbundler for package development, which manages dependencies listed in a package's DESCRIPTION file by storing them in a local project-specific library for installation, and pkgsnap for creating a snapshot of your installed CRAN packages with 'snap', and then using 'restore' on another system to recreate exactly the same environment.
- checkpoint allows you to install R packages from a specific snapshot date in the past, ensuring that you use the same package version that you started with, not a more recent one (related: gRAN can retrieve and build sources for any version of any non-base package that has ever been released on CRAN or BioConductor).
- rocker is a project that provides Docker containers to run R in a lightweight virtual environment; the hadleyverse container includes dplyr, ggplot2, etc., as well as RStudio server and LaTeX. The package harbor provides functions for controlling Docker containers on local and remote hosts. The analogsea package has functions for deploying R and RStudio quickly and easily on DigitalOcean clusters using Docker images for cloud computing. The dockertest package contains functions for generating Dockerfiles from R packages and other R projects, and building Docker containers that contain all the package dependencies. liftr helps with persistent reproducible reporting by containerization of R Markdown documents.
Developing R code and packages
- devtools (requires Rtools for Windows or Xcode for OSX) for easily creating R packages. usethis automates many tasks surrounding package-making, including use_travis() and related functions for easily adding continuous integration for automated building and testing during package development. Mason helps you to quickly build R packages using an interactive Q&A to generate metadata files, READMEs with badges, git repositories, etc.
- goodpractice gives advice about good practices when building R packages. Advice includes functions and syntax to avoid, package structure, code complexity, code formatting, etc.
- badgecreatr generates badges for your readme file to signal the quality and current status of your package.
- roxygen2 for simplifying the creation of documentation for packages,
- testthat for developing tests of functions in packages
- Rcpp enables the use of C++ code in R packages for high performance computing, requires Rtools for Windows or Xcode for OSX
- editR is a basic Rmarkdown editor with instant previewing of your document. It allows you to create and edit Rmarkdown documents while instantly previewing the result of your writing and coding.
- Style guide for writing R code by Hadley Wickham, and the packages formatR and rfmt which are designed to reformat R code to improve readability. The lintr package analyses code to check that it conforms to Hadley Wickham's style guide (this package is built into RStudio)
- Idioms of R are discussed in the vignette of the rockchalk package, and in Pat Burns' essay The R Inferno.
- archdata contains eleven archaeological datasets from around the world reported in published studies. These represent typical forms of archaeological data (and so are useful for teaching)
- binford contains more than 200 variables coding aspects of hunter-gatherer subsistence, mobility, and social organization for 339 ethnographically documented groups of hunter-gatherers, as used in Binford (2001) Constructing Frames of Reference: An Analytical Method for Archaeological Theory Building Using Ethnographic and Environmental Data Sets
- BSDA contains a dataset of 60 radiocarbon ages of observations taken from an archaeological site with four phases of occupation.
- cawd contains 15 datasets of ancient Greek, Roman and Persian maps and digital atlas data
- chemometrics contains a dataset of elemental concentrations for 180 archaeological glass vessels excavated from 15th - 17th century contexts in Antwerp.
- zooaRch contains two zooarchaeological datasets.
- gsloid Contains published data sets for global benthic d18O data for 0-5.3 Myr and global sea levels based on marine sediment core data for 0-800 ka
- evoarchdata contains four published datasets widely used in archaeological studies of cultural evolution
Places to go for help
- ?mean to get built-in help on the mean function
- sos::findFn("rose diagram") searches all installed packages for the search term, using the sos package
- Most major packages come with vignettes that narrate typical uses of the package's core functions. Vignettes can be accessed with the command browseVignettes()
- Google searches in the form: r help [search terms]
- A custom search engine of R resources: http://www.rseek.org/
- Graphical output from the examples in the documentation for all CRAN packages
- All R package documentation (including CRAN, GitHub and Bioconductor packages) is online in an easy-to-read format at http://www.rdocumentation.org/
- Cheatsheets to print for handy reference: short one on base, longer one on base, ggplot2, dplyr, tidyr, rmarkdown, making packages, data.table, using colours, colours, numerous others
- stackoverflow is an online Q&A point-scoring website where questions and answers can be voted on to indicate their quality. Many highly skilled R programmers are active participants. Cross Validated is a similar Q&A site for questions about statistics
- You could send a message to the official r-help email list, but do be sure to read, follow and cite the posting guide. The list is also searchable.
- CRAN Task View: SocialSciences
- CRAN Task View: Spatial
- CRAN Task View: Spatio-temporal
- CRAN Task View: Cluster analysis
- CRAN Task View: Multivariate Statistics
- CRAN Task View: Bayesian inference
- CRAN Task View: Phylogenetics
- CRAN Task View: Robust
- CRAN Task View: Visualization
- CRAN Task View: Reproducible research
- David L. Carlson's guides on using R for 'Quantifying Archaeology' by Stephen Shennan and 'Statistics for Archaeologists' by Robert Drennan
- Matt Peeples' scripts for archaeological statistics
- Gianmarco Alberti's pages on Correspondence Analysis in Archaeology
- Quantitative Archaeology Wiki, including some code for battleship plots
- Michael Baxter's 'Notes on Quantitative Archaeology using R'
- GitHub repository for this Task View
Publications that include R code
d'Alpoim Guedes J, Jin G, Bocinsky RK (2015) The Impact of Climate on the Spread of Rice to North-Eastern China: A New Look at the Data from Shandong Province. PLOS ONE 10(6): e0130430. doi: http://dx.doi.org/10.1371/journal.pone.0130430
Angourakis, Andreas, Verònica Martínez Ferreras, Alexis Torrano, and Josep M. Gurt Esparraguera. 2018. “Presenting Multivariate Statistical Protocols in R Using Roman Wine Amphorae Productions in Catalonia, Spain.” Journal of Archaeological Science 93 (May): 150–65. https://doi.org/10.1016/j.jas.2018.03.007. Describes the cerUB pkg
Barton, C. M., Tortosa, J. E. A., Garcia-Puchol, O., Riel-Salvatore, J. G., Gauthier, N., Conesa, M. V., & Bouchard, G. P. (2017). Risk and resilience in the late glacial: A case study from the western Mediterranean. Quaternary Science Reviews. https://doi.org/10.1016/j.quascirev.2017.09.015
Beheim, Bret A., and Adrian V. Bell. 2011. Inheritance, Ecology and the Evolution of the Canoes of East Oceania. Proceedings of the Royal Society B: Biological Sciences, February. https://doi.org/10.1098/rspb.2011.0060 https://github.com/babeheim/polynesian-canoe-analysis
Bicho, N. and Cascalheira, J. (2018) The use of lithic assemblages for the definition of short-term occupations in hunter-gatherer prehistory. In Picin, A. and Cascalheira, J. (eds.) Short-term occupations in Paleolithic Archaeology. Interdisciplinary Contributions to Archaeology. Springer. https://doi.org/10.17605/OSF.IO/3WGSA
Birch, T. and M. Martinón-Torres (2019) Shape as a measure of weapon standardisation: From metric to geometric morphometric analysis of the Iron Age ‘Havor’ lance from Southern Scandinavia. Journal of Archaeological Science 101: 34-51 https://doi.org/10.1016/j.jas.2018.11.002
Breslawski RP, Etter BL, Jorgeson I, Boulanger MT (2018). The Atlatl to Bow Transition: What Can We Learn from Modern Recreational Competitions? Lithic Technology http://doi.org/10.1080/01977261.2017.1416918 https://github.com/taphocoenose/The-atlatl-to-bow-transition
Breslawski RP, Playford T (2017). Probabilistic Models of Seasonal Bison Exploitation Based on Fetal Prey Osteometry and Reproductive Phenology. Archaeological and Anthropological Sciences http://doi.org/10.1007/s12520-017-0500-y https://github.com/taphocoenose/Probabilistic-Models-of-Seasonal-Bison-Exploitation
Cardillo, Marcelo, Scartascini Federico Luis and Zangrando Atilio Francisco (2015) Combining morphological and metric variations in the study of design and functionality in stone weights. A comparative approach from continental and insular Patagonia, Argentina. Journal of Archaeological Science: Reports 4:578-587. http://dx.doi.org/10.1016/j.jasrep.2015.10.028
Carleton, W. C., J. Conolly, and G. Iannone (2012) A locally-adaptive model of archaeological potential (LAMAP). Journal of Archaeological Science 39(11), 3371-3385. https://doi.org/10.1016/j.jas.2012.05.022, https://github.com/wccarleton/lamap
Carleton, W., McCauley, B., Costopoulos, A., & Collard, M. (2018). An evolutionary agent-based model contradicts Dunnell’s version of the waste hypothesis for cultural elaboration. https://doi.org/10.31235/osf.io/2h36u https://github.com/wccarleton/abm_waste
Clarkson, C., Mike Smith, Ben Marwick, Richard Fullagar, Lynley A. Wallis, Patrick Faulkner, Tiina Manne, Elspeth Hayes, Richard G. Roberts, Zenobia Jacobs, Xavier Carah, Kelsey M. Lowe, Jacqueline Matthews, S. Anna Florin (2015) The archaeology, chronology and stratigraphy of Madjedbebe (Malakunanja II): A site in northern Australia with early occupation. Journal of Human Evolution 83, 46–64 http://dx.doi.org/10.1016/j.jhevol.2015.03.014
Conrad, C., Higham, C., Eda, M. and Marwick, B. (2016) Paleoecology and Forager Subsistence Strategies During the Pleistocene-Holocene Transition: A Reinvestigation of the Zooarchaeological Assemblage from Spirit Cave, Mae Hong Son Province, Thailand. Asian Perspectives 55(1). https://github.com/cylerc/AP_SC
Contreras, Daniel A., Joël Guiot, Romain Suarez, and Alan Kirman. (2018) "Reaching The Human Scale: A Spatial and Temporal Downscaling Approach To The Archaeological Implications Of Paleoclimate Data." Journal of Archaeological Science 93:54-67. doi:10.1016/j.jas.2018.02.013
Contreras, Daniel A. and John Meadows. (2014) “Summed radiocarbon calibrations as a population proxy: a critical evaluation using a realistic simulation approach.” Journal of Archaeological Science 52:591-608. doi:10.1016/j.jas.2014.05.030, http://www.sciencedirect.com/science/article/pii/S0305440314002088
Coto-Sarmiento, M., Rubio-Campillo, X., Remesal, J., 2018. Identifying social learning between Roman amphorae workshops through morphometric similarity. Journal of Archaeological Science 96, 117–123. https://doi.org/10.1016/j.jas.2018.06.002, https://github.com/Mcotsar/LearningBaetica
Crema, E.R., Kandler, A., Shennan, S., 2016. Revealing patterns of cultural transmission from frequency data: equilibrium and non-equilibrium assumptions, Scientific Reports 6, 39122.
Crema, E. R., J. Habu, K. Kobayashi and M. Madella (2016). "Summed Probability Distribution of 14C Dates Suggests Regional Divergences in the Population Dynamics of the Jomon Period in Eastern Japan." PLoS ONE 11(4): e0154809. GitHub repo, Zenodo repo.
Crema, E.R., K. Edinborough, T. Kerig, S.J. Shennan (2014) An Approximate Bayesian Computation approach for inferring patterns of cultural evolutionary change, Journal of Archaeological Science, Volume 50 Pages 160-170 http://dx.doi.org/10.1016/j.jas.2014.07.014
DiNapoli, R. J., Lipo, C. P., Brosnan, T., Hunt, T. L., Hixon, S., Morrison, A. E., & Becker, M. (2019). Rapa Nui (Easter Island) monument (ahu) locations explained by freshwater sources. PLOS ONE, 14(1), e0210409. https://doi.org/10.1371/journal.pone.0210409
Drake BL, Wills WH, Hamilton MI, Dorshow W (2014) Strontium Isotopes and the Reconstruction of the Chaco Regional System: Evaluating Uncertainty with Bayesian Mixing Models. PLoS ONE 9(5): e95580. doi:10.1371/journal.pone.0095580
Drake, Brandon L., David T. Hanson, James L. Boone (2012) The use of radiocarbon-derived Δ13C as a paleoclimate indicator: applications in the Lower Alentejo of Portugal, Journal of Archaeological Science, Volume 39, Issue 9, September 2012, Pages 2888-2896, http://dx.doi.org/10.1016/j.jas.2012.04.027
Drake, Brandon L., (2012) The influence of climatic change on the Late Bronze Age Collapse and the Greek Dark Ages, Journal of Archaeological Science, Volume 39, Issue 6, June 2012, Pages 1862-1870 http://dx.doi.org/10.1016/j.jas.2012.01.029
Drake, Brandon L., WH Wills, and Erik B Erhardt (2012) The 5.1 ka aridization event, expansion of piñon-juniper woodlands, and the introduction of maize (Zea mays) in the American Southwest The Holocene December 2012 22: 1353-1360, first published on July 9, 2012 doi:10.1177/0959683612449758
Dye, Thomas S. (2011). “A Model-based Age Estimate for Polynesian Colonization of Hawai‘i”. Archaeology in Oceania 46, pp. 130–138 https://github.com/tsdye/hawaii-colonization
Dye, T. S. (2016). "Long-term rhythms in the development of Hawaiian social stratification." Journal of Archaeological Science 71: 1-9. http://www.sciencedirect.com/science/article/pii/S030544031630053X
Giusti, D., Konidaris, G. E., Tourloukis, V., Marini, M., Maron, M., Zerboni, A., … Harvati, K. (2019). Recursive anisotropy: a spatial taphonomic study of the Early Pleistocene vertebrate assemblage of Tsiotra Vryssi, Mygdonia Basin, Greece. Boreas, 0(0). https://doi.org/10.1111/bor.12368
Giusti, D., Tourloukis, V., Konidaris, G., Thompson, N., Karkanas, P., Panagopoulou, E., & Harvati, K. (2018). Beyond maps: patterns of formation processes at the Middle Pleistocene open-air site of Marathousa 1, Megalopolis Basin, Greece. Quaternary International. https://doi.org/10.1016/j.quaint.2018.01.041
Giusti, D. and M. Arzarello, 2016, The need for a taphonomic perspective in spatial analysis: Formation processes at the Early Pleistocene site of Pirro Nord (P13), Apricena, Italy, Journal of Archaeological Science: Reports 8, 235–249, code and data: https://github.com/dncgst/pirronord_jas-reports
Hu, Y., Marwick, B., Zhang, J.-F., Rui, X., Hou, Y.-M., Yue, J.-P., ... Li, B. (2018). Late Middle Pleistocene Levallois stone-tool technology in southwest China. Nature https://doi.org/10.1038/s41586-018-0710-1, https://doi.org/10.17605/OSF.IO/ERNTJ
Huffer, D. and Graham, S. 2017 The Insta-Dead: the rhetoric of the human remains trade on Instagram, Internet Archaeology 45. https://doi.org/10.11141/ia.45.5, code & data: https://zenodo.org/record/546132
King, C. L., Millard, A. R., Gröcke, D. R., Standen, V. G., Arriaza, B. T., & Halcrow, S. E. (2018). Marine resource reliance in the human populations of the Atacama Desert, northern Chile–A view from prehistory. Quaternary Science Reviews, 182, 163-174. https://doi.org/10.1016/j.quascirev.2017.12.009
Lightfoot E and O'Connell TC (2016). “On The Use of Biomineral Oxygen Isotope Data to Identify Human Migrants in the Archaeological Record: Intra-Sample Variation, Statistical Methods and Geographical Considerations.” PLoS ONE 11(4). doi:10.1371/journal.pone.0153850, code and data: https://www.repository.cam.ac.uk/handle/1810/252773
Lowe, K., Wallis, L., Pardoe, C., Marwick, B., Clarkson, C., Manne, T., Smith, M. and R. Fullagar 2014 Ground-penetrating radar and burial practices in western Arnhem Land, Australia. Archaeology in Oceania 49(3): 148–157 http://onlinelibrary.wiley.com/doi/10.1002/arco.5039/abstract
Mackay, Alex, Sam C. Lin, Lachlan S. Kenna, and Alex F. Blackwood. 2018. “Variance in the Response of Silcrete to Rapid Heating Complicates Assumptions about Past Heat Treatment Methods.” Archaeological and Anthropological Sciences, June 20, 2018, 1–12. https://doi.org/10.1007/s12520-018-0663-1.
Mackay A, Sumner A, Jacobs Z, Marwick B, Bluff K and Shaw M 2014. Putslaagte 1 (PL1), the Doring River, and the later Middle Stone Age in southern Africa's Winter Rainfall Zone. Quaternary International http://dx.doi.org/10.1016/j.quaint.2014.05.007
Marwick, B., Hiscock, P., Sullivan, M., & Hughes, P. 2017 Landform boundary effects on Holocene forager landscape use in arid South Australia. Journal of Archaeological Science: Reports http://doi.org/10.1016/j.jasrep.2017.07.004
Marwick, Ben, Elspeth Hayes, Chris Clarkson and Richard Fullagar 2017 Movement of lithics by trampling: An experiment in the Madjedbebe sediments, northern Australia. Journal of Archaeological Science 79:73-85. http://dx.doi.org/10.1016/j.jas.2017.01.008, https://github.com/benmarwick/mjbtramp, http://dx.doi.org/10.17605/OSF.IO/32A87
Marwick, B., Van Vlack, H.G., Conrad, C., Shoocongdej, R., Thongcharoenchaikit, C., Kwak, S. 2016 Adaptations to sea level change and transitions to agriculture at Khao Toh Chong rockshelter, Peninsular Thailand, Journal of Archaeological Science http://dx.doi.org/10.1016/j.jas.2016.10.010, https://github.com/benmarwick/ktc11, https://osf.io/axxf8/
Marwick, B, C. Clarkson, S. O'Connor & S. Collins 2016 "Pleistocene-aged stone artefacts from Jerimalai, East Timor: Long term conservatism in early modern human technology in island Southeast Asia" Journal of Human Evolution DOI: 10.1016/j.jhevol.2016.09.004, https://github.com/benmarwick/Pleistocene-aged-stone-artefacts-from-Jerimalai--East-Timor, https://osf.io/63zey
Marwick, B. 2017. Computational reproducibility in archaeological research: Basic principles and a case study of their implementation. Journal of Archaeological Method and Theory 1-27. doi: 10.1007/s10816-015-9272-9, text source repo
Marwick, B., 2013. Multiple Optima in Hoabinhian flaked stone artefact palaeoeconomics and palaeoecology at two archaeological sites in Northwest Thailand. Journal of Anthropological Archaeology 32, 553-564. http://dx.doi.org/10.1016/j.jaa.2013.08.004
Marwick, B. 2013. Discovery of Emergent Issues and Controversies in Anthropology Using Text Mining, Topic Modeling, and Social Network Analysis of Microblog Content. In Yanchang Zhao, Yonghua Cen (eds) Data Mining Applications with R Elsevier. p. 63-93 https://github.com/benmarwick/AAA2011-Tweets
McPherron SP 2018 Additional statistical and graphical methods for analyzing site formation processes using artifact orientations. PLOS ONE 13(1): e0190195. https://doi.org/10.1371/journal.pone.0190195
Nakoinz, O., D. Knitter (2016) Modelling Human Behaviour in Landscapes. Basic Concepts and Modelling Elements. Quantitative Archaeology and Archaeological Modelling 1. Springer, New York. https://github.com/dakni/mhbil, https://github.com/ISAAKiel
Negre, J., Muñoz, F., & Barceló, J. A. (2017). A Cost-Based Ripley’s K Function to Assess Social Strategies in Settlement Patterning. Journal of Archaeological Method and Theory, 1-18. https://doi.org/10.1007/s10816-017-9358-7
Negre, J., Muñoz, F., Lancelotti, C., 2016. Geostatistical modelling of chemical residues on archaeological floors in the presence of barriers, J Archaeol Sci 70, 91-101. https://github.com/famuvie/ArchaeologicalFloors
Orton, D., Gaastra, J., & Vander Linden, M. (2016). Between the Danube and the Deep Blue Sea: Zooarchaeological Meta-Analysis Reveals Variability in the Spread and Development of Neolithic Farming across the Western Balkans. Open Quaternary, 2, 6. DOI: http://doi.org/10.5334/oq.28 data & code: http://eprints.whiterose.ac.uk/104121/
Pargeter, Justin, Paloma de la Peña, and Metin I. Eren. 2018. “Assessing Raw Material’s Role in Bipolar and Freehand Miniaturized Flake Shape, Technological Structure, and Fragmentation Rates.” Archaeological and Anthropological Sciences, May, 1–15. https://doi.org/10.1007/s12520-018-0647-1 data & code: https://osf.io/38tsn/
Phillips, N., Pargeter, J., Low, M. et al. (2018). Open-air preservation of miniaturised lithics: experimental research in the Cederberg Mountains, southern Africa. Archaeol Anthropol Sci. https://doi.org/10.1007/s12520-018-0617-7
Porčić, M., Nikolić, M., 2016. The Approximate Bayesian Computation approach to reconstructing population dynamics and size from settlement data: demography of the Mesolithic-Neolithic transition at Lepenski Vir. Archaeol Anthropol Sci 8, 169–186. https://doi.org/10.1007/s12520-014-0223-2
Režek, Ž., Dibble, H.L., McPherron, S.P., Braun, D.R., Lin, S.C., 2018. Two million years of flaking stone and the evolutionary efficiency of stone tool technology. Nature Ecology & Evolution 1. https://doi.org/10.1038/s41559-018-0488-4, https://doi.org/10.5281/zenodo.1194711
Riris, P. 2018. Dates as Data Revisited: A Statistical Examination of the Peruvian Preceramic Radiocarbon Record. Journal of Archaeological Science 97 (September 1, 2018): 67–76. https://doi.org/10.1016/j.jas.2018.06.008
Rubio-Campillo, X., Montanier, J.M., Rull, G., Bermúdez Lorenzo, J.M., Moros Díaz, J., Pérez González, J., Remesal Rodríguez, J. 2018 The ecology of Roman trade. Reconstructing provincial connectivity with similarity measures, Journal of Archaeological Science, 92, pp. 37-47. doi:10.1016/j.jas.2018.02.010 https://github.com/xrubio/ecologyStamps
Rubio-Campillo, X., Coto-Sarmiento, M., Pérez-Gonzalez, J. and Remesal Rodríguez, J. 2017 Bayesian analysis and free market trade within the Roman Empire, Antiquity, 91(359), pp. 1241–1252. doi:10.15184/aqy.2017.131 https://github.com/xrubio/bayesRome
Shennan, SJ, Enrico R. Crema, Tim Kerig, (2014) Isolation-by-distance, homophily, and 'core' vs. 'package' cultural evolution models in Neolithic Europe, Evolution and Human Behavior, Available online 2 October 2014, http://dx.doi.org/10.1016/j.evolhumbehav.2014.09.006
Sinensky, R. J., and A. Farahani. 2018. Diversity-Disturbance Relationships in the Late Archaic Southwest: Implications for Farmer-Forager Foodways. American Antiquity 83 (2): 364–364. https://doi.org/10.1017/aaq.2017.74
Steele, Teresa E., Alex Mackay, Kathryn E. Fitzsimmons, Marina Igreja, Ben Marwick, Jayson Orton, Steve Schwortz, and Mareike C. Stahlschmidt (2016) "Varsche Rivier 003: A Middle and Later Stone Age Site with Still Bay and Howieson's Poort Assemblages in Southern Namaqualand, South Africa" PaleoAnthropology 2016:100-163 http://www.paleoanthro.org/media/journal/content/PA20160100.pdf, http://dx.doi.org/10.5281/zenodo.31903
Ullah, Isaac I. T., Ian Kuijt, and Jacob Freeman. 2015. “Toward a Theory of Punctuated Subsistence Change.” Proceedings of the National Academy of Sciences 112 (31): 9579–84. https://doi.org/10.1073/pnas.1503628112. http://figshare.com/articles/Cross_cultural_data_for_multivariate_analysis_of_subsistence_strategies/1404233
Ben Marwick, Agustin Diez Castillo, Allar Haav, Sebastian Heath, Phil Riris, Tom Brughmans, Lee Drake, Stefano Costa, Enrico Crema, Domenico Giusti, Matt Peeples, Mark Madsen, Daniel Contreras, Tal Galili