initial commit
cole-brokamp committed Oct 15, 2017
0 parents commit 54d2dfe
Showing 15 changed files with 865 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .Rbuildignore
@@ -0,0 +1,4 @@
^.*\.Rproj$
^\.Rproj\.user$
^README\.Rmd$
^README-.*\.png$
3 changes: 3 additions & 0 deletions .gitignore
@@ -0,0 +1,3 @@
.Rproj.user
.Rhistory
.RData
13 changes: 13 additions & 0 deletions DESCRIPTION
@@ -0,0 +1,13 @@
Package: OfflineGeocodeR
Title: Geocoding Inside R Based on a Dockerized Offline TIGER/Line Range Geocoder
Version: 0.1
Authors@R: person("Cole", "Brokamp", email = "cole.brokamp@gmail.com", role = c("aut", "cre"))
Description: Uses a local Docker installation to pull and run an offline
    geocoding container in the background. The geocode() function calls the
    container and aggregates the results, optionally using a cache for memoization.
Depends:
    R (>= 3.3.2)
License: GPL-3 + file LICENSE
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.0.1
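
The Description field mentions an optional cache for memoization. As a rough illustration of the idea only, the sketch below memoizes a per-address function in memory. It is not how `CB::mappp` implements its cache; `make_cached()` and `slow_lookup()` are hypothetical names introduced for this example, and `slow_lookup()` merely stands in for an expensive call such as `gc_call()`.

```r
# Minimal in-memory memoization sketch (an assumption-laden illustration,
# NOT the caching code used by CB::mappp).
make_cached <- function(f) {
  cache <- new.env(parent = emptyenv())   # one entry per distinct key
  function(key) {
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(key), envir = cache)  # compute once, then store
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

n_calls <- 0
slow_lookup <- function(address) {
  n_calls <<- n_calls + 1   # count how often the expensive call really runs
  toupper(address)          # placeholder result; a real geocoder returns matches
}

cached_lookup <- make_cached(slow_lookup)
results <- lapply(c("1 main st", "2 oak ave", "1 main st"), cached_lookup)
n_calls   # the repeated address is served from the cache
```

Keys must be non-empty strings here, since they are used as variable names in the cache environment. A cache that survives across sessions would need to write results to disk instead.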
595 changes: 595 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

8 changes: 8 additions & 0 deletions NAMESPACE
@@ -0,0 +1,8 @@
# Generated by roxygen2: do not edit by hand

export(find_docker_cmd)
export(gc_call)
export(geocode)
export(start_geocoder_container)
export(stop_geocoder_container)
importFrom(jsonlite,fromJSON)
21 changes: 21 additions & 0 deletions OfflineGeocodeR.Rproj
@@ -0,0 +1,21 @@
Version: 1.0

RestoreWorkspace: No
SaveWorkspace: No
AlwaysSaveHistory: Default

EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 2
Encoding: UTF-8

RnwWeave: knitr
LaTeX: pdfLaTeX

AutoAppendNewline: Yes
StripTrailingWhitespace: Yes

BuildType: Package
PackageUseDevtools: Yes
PackageInstallArgs: --no-multiarch --with-keep.source
PackageRoxygenize: rd,collate,namespace
61 changes: 61 additions & 0 deletions R/docker.R
@@ -0,0 +1,61 @@
#' find the docker executable
#'
#' @export
find_docker_cmd <- function() {
  docker_cmd <- Sys.which('docker')  # returns "" when docker is not on the PATH
  if (!nzchar(docker_cmd)) stop(paste('\n','Docker command not found. ','\n',
                                      'Please install docker: ','\n',
                                      'https://www.docker.com/products/overview#/install_the_platform'))
  docker_check <- suppressWarnings(system2(docker_cmd,'ps',stderr=TRUE,stdout=TRUE))
  if(!is.null(attr(docker_check,'status'))) stop(paste0('Cannot connect to the Docker daemon. ',
                                                        'Is the docker daemon running on this host?'))
  return(docker_cmd)
}

#' start geocoding container
#'
#' @param image_name name of geocoding image; can be used to specify the version,
#' e.g. \code{degauss/geocoder_slim:2.4}; defaults to
#' \code{degauss/geocoder_slim:latest}
#'
#' @export
start_geocoder_container <- function(image_name = 'degauss/geocoder_slim') {
  message('starting geocoding container...')
  docker_cmd <- find_docker_cmd()
  system2(docker_cmd,
          args = c('run','-it','-d','--name gs',
                   '--entrypoint /bin/bash',
                   image_name))
  message('loading address range database...')
  invisible(gc_call('3333 Burnet Ave Cincinnati OH 45229'))
}

#' stop geocoding container
#'
#' @export
stop_geocoder_container <- function() {
  docker_cmd <- find_docker_cmd()
  message('stopping geocoding container...')
  system2(docker_cmd,
          args = c('stop','gs'))
  system2(docker_cmd,
          args = c('rm','gs'))
}

#' call a running geocoding container to geocode an address
#'
#' This is used internally by \code{geocode()} and normally should not be called directly.
#'
#' @param address a string; see vignette for best practices and examples
#'
#' @export
#' @importFrom jsonlite fromJSON
gc_call <- function(address) {
  docker_cmd <- find_docker_cmd()
  docker_out <- system2(docker_cmd,
                        args = c('exec','gs',
                                 'ruby','/root/geocoder/geocode.rb',
                                 shQuote(address)),
                        stderr=TRUE,stdout=TRUE)
  jsonlite::fromJSON(docker_out)
}
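
A note on the executable lookup in `find_docker_cmd()`: base R's `Sys.which()` returns a named character vector with one element per requested program, and uses an empty string for any program it cannot find on the `PATH`. The sketch below illustrates that behaviour. The program name `no-such-program-xyz` is made up and assumed to be absent, and `is_on_path()` is a hypothetical helper that is not part of this package.

```r
# Sys.which() always returns one element per name; missing programs map to "".
missing_cmd <- Sys.which("no-such-program-xyz")  # made-up name, assumed absent

length(missing_cmd)   # 1, so a length() == 0 test cannot detect absence
nzchar(missing_cmd)   # FALSE: the empty string signals "not found"

# Hypothetical helper (not part of OfflineGeocodeR): TRUE if `cmd` is on the PATH.
is_on_path <- function(cmd) {
  unname(nzchar(Sys.which(cmd)))
}

is_on_path("no-such-program-xyz")
```

Checking with `nzchar()` rather than `length()` is what lets the "Docker command not found" message fire before any attempt to run the command.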
25 changes: 25 additions & 0 deletions R/geocode.R
@@ -0,0 +1,25 @@
#' geocode
#'
#' To prevent errors and optimize accuracy, please remove non-alphanumeric
#' characters from address strings prior to geocoding. In general, address
#' cleaning is outside the scope of this package and should be completed prior
#' to geocoding.
#'
#' By default, geocoding results will be cached in a local folder named
#' \code{geocoding_cache}. See help for \code{CB::mappp} to change these
#' defaults.
#'
#' @param addresses a list or vector of addresses
#' @param ... additional arguments passed to \code{CB::mappp}; set options for
#' cache here
#'
#' @return list of geocoding results with one address per element; some
#' addresses may return more than one geocoding result if there is a tie among
#' the best matches
#' @export
geocode <- function(addresses, ...){
  start_geocoder_container()
  on.exit(stop_geocoder_container())
  message('now geocoding...')
  CB::mappp(addresses, gc_call, parallel=FALSE, cache=TRUE, cache.name='geocoding_cache', ...)
}
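
The documentation for `geocode()` asks callers to remove non-alphanumeric characters before geocoding but leaves the cleaning itself out of scope. One possible pre-processing step is sketched below. `clean_address()` is a hypothetical helper, not part of this package, and real address data will likely need more care (unit numbers, PO boxes, and so on).

```r
# Hypothetical helper (not part of OfflineGeocodeR): keep letters, digits and
# spaces, then collapse runs of whitespace.
clean_address <- function(x) {
  x <- gsub("[^[:alnum:] ]", " ", x)   # replace every other character with a space
  x <- gsub("\\s+", " ", x)            # collapse repeated whitespace
  trimws(x)                            # drop leading and trailing spaces
}

clean_address("3333 Burnet Ave., Cincinnati, OH 45229")
```

Replacing punctuation with a space, rather than deleting it, avoids fusing neighbouring tokens such as a city and state separated only by a comma.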
33 changes: 33 additions & 0 deletions README.Rmd
@@ -0,0 +1,33 @@
---
output:
  md_document:
    variant: markdown_github
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

# OfflineGeocodeR

`OfflineGeocodeR` wraps calls to a Docker container (`degauss/geocoder_slim`) so that addresses can be geocoded from R without using the internet. Two advantages of this approach are protecting the privacy of protected health information and keeping the research workflow reproducible. See our manuscript about using DeGAUSS (Decentralized Geomarker Assessment for Multi-Site Studies) [here]().

## Installing

OfflineGeocodeR is a private package hosted on GitHub.

Install with:

```{r eval=FALSE}
remotes::install_github('cole-brokamp/OfflineGeocodeR')
```

You must also have Docker installed and available on your system. In the future, it will hopefully be possible to use this package to call a remote docker machine, but for now, it must be local.

*Note: This package was originally designed as a personal convenience and has not been tested on setups other than Unix-based operating systems with Docker running natively (i.e., not Docker Toolbox). Your mileage may vary.*
20 changes: 20 additions & 0 deletions README.md
@@ -0,0 +1,20 @@
<!-- README.md is generated from README.Rmd. Please edit that file -->
OfflineGeocodeR
===============

`OfflineGeocodeR` wraps calls to a Docker container (`degauss/geocoder_slim`) so that addresses can be geocoded from R without using the internet. Two advantages of this approach are protecting the privacy of protected health information and keeping the research workflow reproducible. See our manuscript about using DeGAUSS (Decentralized Geomarker Assessment for Multi-Site Studies) [here]().

Installing
----------

OfflineGeocodeR is a private package hosted on GitHub.

Install with:

``` r
remotes::install_github('cole-brokamp/OfflineGeocodeR')
```

You must also have Docker installed and available on your system. In the future, it will hopefully be possible to use this package to call a remote docker machine, but for now, it must be local.

*Note: This package was originally designed as a personal convenience and has not been tested on setups other than Unix-based operating systems with Docker running natively (i.e., not Docker Toolbox). Your mileage may vary.*
11 changes: 11 additions & 0 deletions man/find_docker_cmd.Rd

Some generated files are not rendered by default.

14 changes: 14 additions & 0 deletions man/gc_call.Rd

30 changes: 30 additions & 0 deletions man/geocode.Rd

16 changes: 16 additions & 0 deletions man/start_geocoder_container.Rd

11 changes: 11 additions & 0 deletions man/stop_geocoder_container.Rd
