
Summary of the R ArcGIS Bridge

Aaron Weinstock edited this page Jul 5, 2018 · 5 revisions

Aaron Weinstock, ESRI R&D Arlington




Introduction


The R-ArcGIS Bridge is implemented in the R package arcgisbinding. This package allows for easy transfer of ArcGIS data into R for analysis; in other words, it provides a clean transition between ArcGIS datasets and R spatial data structures. Importantly, the functions in this package are NOT meant for analysis, but rather for establishing a data connection between ArcGIS and R. Primarily, this package allows data to be read and written easily between the two programs. It consequently allows geoprocessing tools to be written as R scripts.

Below, the functions contained in arcgisbinding will be detailed, in an effort to describe the range of functionalities available via the R-ArcGIS Bridge. For more detail, check out the package documentation.

Installing the R-ArcGIS Bridge


Though the R-ArcGIS Bridge exists in an R package, this package is not available on CRAN (the Comprehensive R Archive Network), so it cannot be installed using the standard install.packages() call in R. Instead, it is easiest to initialize the Bridge via ArcMap or ArcGIS Pro; creating the connection in this way will automatically install arcgisbinding in R, allowing you to simply call library(arcgisbinding) from RStudio or an R console whenever you want to use it in R.

Please note that the R-ArcGIS Bridge will only work on Windows, as the package arcgisbinding is only supported for Windows operating systems! If you are attempting to use the R-ArcGIS Bridge and the arcgisbinding package on a Mac, note that a Mac OS instance of R will be unable to communicate over a Virtual Machine (VM) to a Windows instance of ArcMap or ArcGIS Pro. Installing R on the Virtual Machine will be necessary.

First, ensure that the latest version of R is installed (on the Windows VM if working on a Mac). If R is not already installed, follow these steps:

  1. Proceed to the CRAN Mirrors page.

  2. Choose the link for the mirror from the institution closest to you.

  3. On the following page, select "Download for Windows" (3rd option under the "Download and Install R" subheading).

  4. On the following page, select "install R for the first time" (to the right of the "base" subheading).

  5. The above downloads the R installer to your machine. Open and run this program to install R; it is safe to keep all defaults recommended or preselected by the installer.

If you already have R and simply want to update the program, copy the following code into RStudio or an R console:

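 # Install installr if it is missing, then load it
 if (!require(installr)) {
     install.packages("installr")
     library(installr)
 }
 # Check for a newer version of R and walk through the update
 updateR()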

This will check for any updates, and guide you through the update process should any be available.

Note that to use the R-ArcGIS Bridge, R is the only required program. However, if you plan on doing any R coding (which presumably is the case if you're reading this tutorial!), RStudio may be advantageous in that it provides a cleaner, more user-friendly interface to R than a standard R console. If RStudio is not already installed, follow these steps:

  1. Proceed to the RStudio downloads page.

  2. Select the green "Download" button under "RStudio Desktop" (the leftmost column).

  3. The above downloads the RStudio installer to your machine. Open and run this program to install RStudio; as with R, it is safe to keep all defaults recommended or preselected by the installer.

Once R and RStudio are installed, follow these steps to initialize the R-ArcGIS Bridge and install arcgisbinding in R:

  1. Open or create an ArcMap or ArcGIS Pro project, and select the blue "Project" ribbon in the top left corner of the window.

  2. On the left sidebar, select "Options."

  3. On the left sidebar under the "Application" setting, select "Geoprocessing" (6th option from the top).

  4. At the bottom of the page under the "R-ArcGIS Support" subheading, select the version of R to which you would like to connect the Bridge. Most users will have only one version of R downloaded on their machine, so there will be only one option; if you have multiple versions of R downloaded on your machine, make the selection based on the version of R you run most frequently (usually, this will be the most current version).

    • At this point, if the arcgisbinding package has not already been installed, a message will appear below this dropdown indicating that the package must be installed. If this is your first time installing the package, select the icon to the right of this message, select "Install from the internet", and the install will complete automatically. If prompted, be sure to select a download for the most recent version of the package, and allow R to install the necessary dependencies for the arcgisbinding install.
  5. Select "OK" in the bottom right corner of the "Options" window. The Bridge is now initialized, and arcgisbinding is installed! Upon return to the "R-ArcGIS Support" section of the Options page, the software will now indicate that arcgisbinding has been installed, and report the version number of the package.

If the arcgisbinding package or your local version of R is ever updated, return to the "R-ArcGIS Support" section of the "Options" page to update the connection.

  1. For a package update: select the icon to the right of the "Installed" message (below the dropdown menu), and select "Check package for updates." If an update is available, follow the prompts given to complete the update; if no update is available, you will receive notification that you are already using the most recent version. If changes are made, remember to click "OK" in the bottom right corner of the "Options" window upon completion to save the changes!

    • To ensure you have access to all of the functions provided by arcgisbinding, the package should be updated with each new release. If you use the Bridge infrequently, check the package for updates before each use; if you use it regularly, check for updates every few weeks.
  2. For an R update: in the dropdown menu, select the new, updated version of R. Remember to click "OK" in the bottom right corner of the "Options" window upon completion to save the changes!

    • Unlike the package, it is not imperative that R is updated with each new release (as the base function of the program remains constant). However, to ensure proper function of packages and access to new developments in the program, it is recommended that R is updated as frequently as possible. A new version of R tends to be released every 1-3 months, and new releases will always be reflected in the "News" subheading of the R Project site. When an update is available, use the code provided above to quickly update R from RStudio or an R console, or (if you want to keep your old version of R on your machine as well), download the latest version of R using the steps outlined above.

Once initialization and install are complete, work in R can begin!
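
Before moving on, it can be reassuring to confirm from the R side that the install worked. The sketch below is one possible check using only base R; the helper name package_status() is made up for this example and is not part of arcgisbinding.

```r
# Report whether a package is installed and, if so, which version.
# Base R only, so this runs even when arcgisbinding is absent.
package_status <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    paste(pkg, as.character(utils::packageVersion(pkg)), "is installed")
  } else {
    paste(pkg, "is not installed")
  }
}

print(package_status("arcgisbinding"))
```

If the message says the package is not installed, return to the "R-ArcGIS Support" section of the "Options" page and repeat the install.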

Reading ArcGIS Data into R


Reading ArcGIS data into R begins with the call library(arcgisbinding) (to load the package in R) followed by the function arc.check_product(). This function takes no arguments, and simply establishes the connection between your licensed ArcGIS software and R. All scripts utilizing the R-ArcGIS Bridge must begin with this function call! Conveniently, running library(arcgisbinding) will prompt you to run arc.check_product().

 arc.check_product()

If you are interested, the following useful product details on the installed ArcGIS software are accessible via the $ extension to arc.check_product():

  • app -- indicates the product: ArcMap or ArcGIS Pro

  • license -- indicates the license: basic, standard, or advanced

  • version -- indicates the build number of the licensed ArcGIS software

  • path -- indicates the local file location of the licensed ArcGIS software

Note that these details will rarely be needed by ArcGIS professionals already familiar with their product. Usually, this function will be used without any calls to its stored data.
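
As a rough illustration of the $ extension, the sketch below stores the result of arc.check_product() and pulls out the details listed above. The helper describe_product() is a hypothetical name, and the live call is guarded so that the block does nothing on a machine without the Bridge.

```r
# Summarise the product details in one line.
# The element names (app, license, version) are the ones listed above.
describe_product <- function(info) {
  sprintf("%s (%s license), version %s", info$app, info$license, info$version)
}

# Only contact ArcGIS when arcgisbinding is actually installed.
if (requireNamespace("arcgisbinding", quietly = TRUE)) {
  library(arcgisbinding)
  info <- arc.check_product()
  print(describe_product(info))
  print(info$path)
}
```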

Once the connection is established, the data may be read into R. This is accomplished using the function arc.open() for both vector and raster data. This function takes only a path argument, which indicates the full system path to the file of interest.

 arc.open(file_path_to_data)

This function can handle ArcGIS data in the following forms:

  • feature class (shapefile, geodatabase, coverage)

  • layer

  • table

  • raster
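
A minimal sketch of a read is shown below. The geodatabase location and the feature class name "parcels" are placeholders invented for this example; substitute the full path to your own data. The read itself is guarded so that it is skipped when the Bridge or the data is missing.

```r
# Hypothetical location: a feature class "parcels" inside "city.gdb".
gdb_path <- file.path("C:", "data", "city.gdb")
fc_path  <- file.path(gdb_path, "parcels")
print(fc_path)

# Only attempt the read when the Bridge is available and the data exists.
if (requireNamespace("arcgisbinding", quietly = TRUE) && file.exists(gdb_path)) {
  library(arcgisbinding)
  arc.check_product()
  parcels <- arc.open(fc_path)
  print(class(parcels))
}
```

Building the path with file.path() avoids having to escape backslashes by hand in Windows paths.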

The data is read in as an S4 object of class arc.dataset (vector) or arc.datasetraster (raster). Should you accidentally read in the incorrect data, or ever want to delete the ArcGIS dataset, use the function arc.delete(). This will delete the file itself from your local environment, not the variable in R! If you learn mid-analysis that you have the wrong data, or want to delete intermediates along the way, this can be used as a substitute for navigating to the file in your documents and deleting it manually.

 arc.delete(arc.dataset_or_arc.datasetraster)

If you are interested, the function arc.metadata() allows you to retain the pertinent environmental information of your data read-in process. It saves data slots for date and time of read-in, ArcGIS format, and synchronization. This might be useful for updating metadata information should any major changes to the data be made in processing.

 arc.metadata(arc.dataset_or_arc.datasetraster)

Once the ArcGIS data is in R, it can be explored, or converted into an R-based structure for analysis.

Setup for R-Based Analysis of ArcGIS Data


Basic exploration can actually begin from an arc.dataset or arc.datasetraster itself, as loading the data yields access to some of the file's metadata. The following useful metadata slots are accessible via the @ extension to the saved dataset (@ is used instead of $ due to the S4 object class):

  • shapeinfo -- the basic geometry information of the file: geometry type, WKT projection, WKID, and indicators of Z and M coordinates

  • extent -- the coordinates of the bounding box for the data

  • fields -- the names of the attributes of the data, along with their types (e.g. string, integer)

  • path -- the local file location

  • dataset_type -- the file type: feature class, layer, table, or raster
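
The slots above can be gathered in one place with a small helper. This is only a sketch for vector data: describe_dataset() is a name invented here, and it assumes the object passed in exposes the dataset_type, path, fields, and extent slots described in the list.

```r
# Collect a few metadata slots of an opened vector dataset into a plain list.
# "ds" is expected to be the S4 object returned by arc.open().
describe_dataset <- function(ds) {
  list(type   = ds@dataset_type,
       path   = ds@path,
       fields = ds@fields,
       extent = ds@extent)
}
```

With a dataset opened as in the previous section, str(describe_dataset(my_dataset)) would print a compact overview before any data is pulled into a dataframe.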

The names() function can be used to obtain attribute names for vector data, or band names for raster data.

 names(arc.dataset_or_arc.datasetraster)

For raster data, the dim() function will output the raster dimensions in an integer vector of the form < # of rows, # of columns, # of bands >.

 dim(arc.datasetraster)
  • This function will also work on an arc.raster object -- see below for more information

More in depth exploration will begin with a conversion away from the arc.dataset or arc.datasetraster object. This is accomplished with the arc.select() function for vector data, or the arc.raster() function for raster data.

arc.select() transforms a loaded dataset into a data.frame object known as an arc.dataframe. This function is particularly powerful because it allows the ArcGIS data to be processed using the same standard methodology regularly used to analyze dataframes in R. The function can also work like a "filter", allowing you to keep only the fields of interest, or only those features with specific field values.

 arc.select(object       = arc.dataset,
            fields       = vector_of_names_for_fields_you_want_to_keep,
            where_clause = "SQL_expression_based_on_field_name(s)_of_interest",
            selected     = TRUE,   # TRUE to use only selected features, FALSE if not
            sr           = NULL)   # NULL unless transforming geometry to a spatial reference
  • object is the only required field

  • selected = TRUE and sr = NULL are defaults
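
To make the template above more concrete, here is a hedged sketch of a filtered read. The field names NAME and POP2010 and the threshold are invented for illustration; the where clause is an SQL-style expression passed as a string. The helper select_large_places() is hypothetical and only calls arc.select() when it is run against a real dataset.

```r
# Build the where clause from its parts (hypothetical field and threshold).
min_pop      <- 50000
where_clause <- paste("POP2010 >", format(min_pop, scientific = FALSE))
print(where_clause)

# Keep two fields and only the features matching the where clause.
# "dataset" is expected to be the object returned by arc.open().
select_large_places <- function(dataset, where) {
  arcgisbinding::arc.select(object       = dataset,
                            fields       = c("NAME", "POP2010"),
                            where_clause = where,
                            selected     = FALSE)
}
```

Filtering at read time like this keeps the resulting dataframe small, which can matter for large feature classes.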

arc.raster() transforms a loaded raster dataset into an object of class arc.raster. Unlike an arc.dataframe, which is edited via arguments in arc.select(), an arc.raster is edited via references to its data slots, either within the arc.raster() function (by specifying the slot by name) or after creation (via the $ extension to the saved arc.raster).

 arc.raster(object = arc.datasetraster, 
            bands  = integer_vector_of_bands,
            ...)
  • object is the only required field

  • bands defaults to all bands

  • ... indicates the optional references to editable data slots. The slots include:

    • sr -- spatial reference

    • extent -- bounding extent (useful for filtering raster by size)

    • nrow -- number of rows

    • ncol -- number of columns

    • cellsize -- pixel size

    • pixel_type -- pixel type

    • pixel_depth -- pixel depth

    • nodata -- "no data" value

    • resample_type -- resampling type

    • colormap -- color map table

    • bands -- band information

If you do not want to edit this information, you can still view it upon creation via the $ extension to the saved arc.raster. For editing purposes, note that pixel_type and resample_type have a prescribed set of supported values: to see them, check out the package documentation sections "pixeltypes" (page 22) and "resampletypes" (page 23).
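
Both ways of working with the slots can be sketched as follows. The function names read_small_raster() and raster_summary() are made up for this example, and the choice of band 1 and a 200 x 200 grid is arbitrary; treat it as an illustration of the pattern rather than a recommended setting.

```r
# Edit slots at creation: read band 1 only, resampled to size x size cells.
# "raster_dataset" is expected to be the object returned by arc.open().
read_small_raster <- function(raster_dataset, size = 200L) {
  arcgisbinding::arc.raster(object = raster_dataset,
                            bands  = 1L,
                            nrow   = size,
                            ncol   = size)
}

# View slots after creation with the $ extension.
raster_summary <- function(r) {
  list(nrow       = r$nrow,
       ncol       = r$ncol,
       cellsize   = r$cellsize,
       pixel_type = r$pixel_type)
}
```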

For experienced R users, it should be noted that arcgisbinding provides dplyr support for dataframes created by its functions. The supporting functions won't be detailed here (as that is a lesson in R), but for the sake of awareness, they include:

  • filter.arc.data()

  • arrange.arc.data()

  • mutate.arc.data()

  • group_by.arc.data()

  • ungroup.arc.data()

With the data now in a standard R dataframe format, second-level conversions can now be applied in preparation for R-based spatial analysis.

Converting Data to R Spatial Forms


While arcgisbinding exists primarily for ease of data transfer between ArcGIS and R, many R packages have already been developed for actual spatial data analysis. Thankfully, arcgisbinding provides the ability to convert data not just to dataframes, but also to R-based spatial data structures. In particular, arcgisbinding facilitates transitions between ArcGIS and the data structures of the R packages sp, sf, and raster. Note that the objects created by these functions are not necessarily restricted to use with the package that defines them. In fact, almost all spatial analysis packages in R employ one of these types of objects. However, use of these functions depends on the associated package being installed and loaded in the R session.

One conversion function, arc.select(), was detailed above because of its filtering power; its base role, however, is to convert ArcGIS datasets to standard R dataframes. Other functions that focus on conversions (sorted by required package) include:

1. Requires sp

 arc.data2sp(arc.dataframe_or_arc.raster)
  • Vector: converts an arc.dataframe to an sp SpatialDataFrame object. Based on the input data, this can create a SpatialPointsDataFrame, SpatialLinesDataFrame, or SpatialPolygonsDataFrame.

  • Raster: converts an arc.raster to an sp SpatialGridDataFrame object.


 arc.sp2data(SpatialDataFrame_object)
  • Vector: reverse of above - converts a SpatialDataFrame to an arc.dataframe.

2. Requires sf

 arc.data2sf(arc.dataframe)
  • Vector: converts an arc.dataframe to an sf Simple Feature object.

3. Requires raster

 as.raster(arc.raster) 
  • Raster: converts an arc.raster to a raster RasterLayer or RasterBrick

4. No package requirements

 arc.fromP4ToWkt(projection4_string)
  • Data type independent: converts a PROJ.4 projection string to its well-known-text form.

 arc.fromWktToP4(WKT_string_or_WKID_Integer)
  • Data type independent: reverse of above - converts a well-known-text projection string (or its associated ID) to its PROJ.4 form.

Applying these functions creates smooth back-and-forth transitions between ArcGIS- and R-style data, depending on the needs of the analysis.
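
One way to wrap the vector conversions is sketched below. convert_vector() is a hypothetical helper, not part of arcgisbinding; it checks that the target package (sf or sp) is installed before calling arc.data2sf() or arc.data2sp(). Whether the target package must also be attached with library() is not checked here.

```r
# Convert an arc.dataframe (from arc.select()) to an sf or sp object.
convert_vector <- function(df, target = c("sf", "sp")) {
  target <- match.arg(target)  # rejects anything other than "sf" or "sp"
  if (!requireNamespace(target, quietly = TRUE)) {
    stop("Package '", target, "' must be installed first.")
  }
  switch(target,
         sf = arcgisbinding::arc.data2sf(df),
         sp = arcgisbinding::arc.data2sp(df))
}
```

For example, convert_vector(my_dataframe, "sf") would hand back an object that sf-based analysis packages can use directly.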

Dealing with ArcGIS Shapes in R


arcgisbinding also provides the capability to work with the shapes associated with ArcGIS data. When an arc.dataframe is created by arc.select(), the shape information is stored in the attribute Shape. Note that the functions below work only with vector data, taking either arc.datasets or arc.dataframes as initial inputs.

Basic shape information can be obtained directly from an arc.dataset using the function arc.shapeinfo() and its stored data slots. This allows a user to observe the fundamental information about the geometry associated with a shape.

 arc.shapeinfo(arc.dataset)

The data slots, accessible via the $ extension for the saved shape information, include:

  • type -- indicates the geometry type: outputs Point, Polyline, or Polygon

  • hasZ -- indicates if the geometry has Z-values: outputs TRUE if it does

  • hasM -- indicates if the geometry has M-values: outputs TRUE if it does

  • WKT -- indicates the WKT form of the shape's spatial reference

  • WKID -- indicates the ID associated with the WKT form of the shape's spatial reference

Note that the arc.shapeinfo() function provides equivalent information to the @shapeinfo slot of an arc.dataset.

The ArcGIS shape itself, an object of class arc.shape, can be obtained using the aptly named arc.shape() function. This gives access to the spatial component of the ArcGIS data. The function takes an arc.dataframe as its input, and creates the shape from this information.

 arc.shape(arc.dataframe)

Once an ArcGIS shape is obtained, it can be converted to an sp spatial geometry using the function arc.shape2sp() or an sf simple feature geometry using the function arc.shape2sf() for analysis in R. Spatial geometries and simple feature geometries differ from spatial dataframes or simple features (respectively) in that they do not contain a slot for attribute data.

 arc.shape2sp(shape = arc.shape_object, 
              wkt   = wkt_spatial_reference)
  • the wkt argument will automatically read the projection of the input shape, so will (likely) be changed only in rare cases

 arc.shape2sf(shape = arc.shape_object)

It should be noted here that there is no reverse-conversion tool (as with the dataframe conversions) -- ArcGIS shapes cannot be obtained from R spatial geometries!

These functions empower you to create spatial geometries in addition to spatial dataframes. This is useful for processing efficiency (there is no need to keep attributes you are not interested in), as well as for basic data visualization.
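
The shape workflow described in this section can be sketched in two small helpers. Both names are invented for this example: is_polygon_layer() inspects the type element of the result of arc.shapeinfo(), and geometry_as_sf() chains arc.shape() and arc.shape2sf() to obtain geometry without attributes.

```r
# TRUE when the shape information describes polygon geometry.
# "info" is expected to be the result of arc.shapeinfo(arc.dataset).
is_polygon_layer <- function(info) {
  identical(info$type, "Polygon")
}

# Extract the geometry of an arc.dataframe as an sf simple feature geometry.
geometry_as_sf <- function(df) {
  shp <- arcgisbinding::arc.shape(df)
  arcgisbinding::arc.shape2sf(shp)
}
```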

R Scripts as ArcGIS Geoprocessing Tools


One of the primary advantages of the R-ArcGIS Bridge is that R-based geoprocessing scripts can be created from data that have been converted from ArcGIS form to R form. However, it might be the case that preset geoprocessing environment variables (in ArcGIS) affect the usefulness or effectiveness of an R-based tool. The arc.env() function allows a user to obtain the local geoprocessing settings and double check that they are set appropriately for analysis.

 arc.env() 

This function is most often used in R functions written into geoprocessing scripts. Referencing the geoprocessing environment settings from within a function ensures that the analysis uses the appropriate processing parameters. Furthermore, since these local settings cannot be changed from within an R script, knowing what these settings are is useful for debugging (should an R tool fail due to an attempt to override existing settings).
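
For debugging, it can help to print whatever settings arc.env() reports. The sketch below makes no assumption about which setting names are present; format_env() is a hypothetical helper that turns any named list into readable "name = value" lines, and the live call is guarded so the block is harmless without the Bridge.

```r
# Turn a named list of settings into "name = value" strings.
format_env <- function(env) {
  vapply(names(env), function(n) {
    paste0(n, " = ", paste(format(env[[n]]), collapse = ", "))
  }, character(1))
}

# Only query ArcGIS when arcgisbinding is actually installed.
if (requireNamespace("arcgisbinding", quietly = TRUE)) {
  library(arcgisbinding)
  arc.check_product()
  print(format_env(arc.env()))
}
```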

To monitor progress in ArcGIS as an R geoprocessing script runs, arcgisbinding provides a few handy, but not requisite, tools for processing: the functions arc.progress_label() and arc.progress_pos(). These, respectively, control the text label and fill amount for the progress bar at the top of a running script. Like arc.env(), these are most often used within functions - they give updates on time to completion, allowing users to fine-tune the application experience. An example is provided below. This example would produce a progress bar with 0% fill labeled "Start" when the process begins, and a 50% fill labeled "Step 1 Complete" after the first step is fully completed:

 function(in_params, out_params){
    arc.progress_label("Start")
    arc.progress_pos(0)
    # ... some processing ...
    arc.progress_label("Step 1 Complete")
    arc.progress_pos(50)
    # ... more processing ...
 }

It should be noted here that these two "progress" functions currently work only in ArcGIS Pro.
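
When a script has several steps, the fill amount can be computed rather than typed by hand. The sketch below assumes the common convention of naming the tool's entry function tool_exec(in_params, out_params); that name, the step labels, and the helper progress_percent() are assumptions made for this example.

```r
# Percentage of work done after finishing "step" of "total" steps.
progress_percent <- function(step, total) {
  as.integer(round(100 * step / total))
}

# Skeleton of a script tool that reports progress after each step.
tool_exec <- function(in_params, out_params) {
  steps <- c("Reading data", "Fitting model", "Writing output")
  for (i in seq_along(steps)) {
    arcgisbinding::arc.progress_label(steps[i])
    # ... the work for this step would go here ...
    arcgisbinding::arc.progress_pos(progress_percent(i, length(steps)))
  }
  out_params
}
```

Keeping the step labels in one vector means the progress bar stays in sync if steps are added or removed later.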

In short, processing ArcGIS data in R should always involve a call to arc.env(), to ensure that a tool operates properly. Use of the "progress" functions, while not necessary, offers an easy way for a script author to customize the processing interface.

Once completed, processing scripts may be run as a tool directly from ArcGIS, should the R script be uploaded to an ArcGIS toolbox.

Writing ArcGIS Data from R


So far, the arcgisbinding package has been explored in the context of moving from ArcGIS to R. The function arc.write() is the package's method for moving from R back to ArcGIS, by allowing for the writing of R dataframes into ArcGIS datasets.

 arc.write(path       = full_file_path_for_output_ArcGIS_dataset,
           data       = dataframe_to_be_written_to_ArcGIS_dataset,
           coords     = NULL,
           shape_info = NULL)
  • The data argument can take a standard data.frame object; an arc.dataframe object; an sp SpatialPointsDataFrame, SpatialLinesDataFrame, or SpatialPolygonsDataFrame object; an arc.raster object; or a raster RasterBrick or RasterLayer object.

  • For any dataframe object, if the input dataframe includes a spatial attribute, a feature dataset will be written. Otherwise, a table will be written.

  • For any raster object, a raster dataset will be written.

  • coords is an optional list containing geometry information (i.e. xy coordinates, useful for adding spatial properties to an existing non-spatial dataframe).

  • shape_info is an optional list containing geometry type (point, line, or polygon) and spatial reference (WKT projection).

This function can be used to re-write files (from ArcGIS data that has been since transformed in R), or write files from scratch (from a dataframe created in R).
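
As an example of writing a file from scratch, the sketch below builds a small dataframe of points in R and wraps the arc.write() call in a helper. The site names and coordinates are invented, write_sites() is a hypothetical name, and the sketch assumes that coords may name the coordinate columns and that shape_info accepts a type and a WKID (4326 is WGS 84); check the package documentation before relying on either.

```r
# Three hypothetical sample sites with longitude/latitude coordinates.
sites <- data.frame(
  name = c("A", "B", "C"),
  lon  = c(-77.10, -77.08, -77.05),
  lat  = c(38.88, 38.89, 38.90),
  stringsAsFactors = FALSE
)

# Write the dataframe as a point feature class at out_path.
write_sites <- function(out_path, df) {
  arcgisbinding::arc.write(path       = out_path,
                           data       = df,
                           coords     = c("lon", "lat"),
                           shape_info = list(type = "Point", WKID = 4326))
}
```

Calling write_sites() with a path inside an existing geodatabase would then create the new feature class there; without coords and shape_info, the same dataframe would be written as a plain table.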

With this function, the R-ArcGIS Bridge becomes a two-way street, allowing you to create ArcGIS datasets from dataframes that were either edited or created in R.

Further Information


The above is a simple summary of key functions and their uses in the arcgisbinding package, which creates the R-ArcGIS Bridge. Again, for more detailed information, please consider exploring the arcgisbinding package documentation.