
Predicting on tiled rasters, in parallel #93

Closed
ctlamb opened this issue Feb 12, 2018 · 40 comments

Comments

@ctlamb commented Feb 12, 2018

My goal is to speed up the model predictions on large RasterLayers. I had hoped I could break a RasterLayer into many tiles, predict the model on each of these smaller tiles in parallel using googleComputeEngineR, then reassemble.

However, I seem to be stumbling. Below I provide a simplified, reproducible example. It seems the function fails to find ".filetype" after loading the raster package, although I can run the function with no issues locally, as shown at the end of the script.

#Test to split up raster and predict
library(raster)
library(googleComputeEngineR)
library(future)
library(SpaDES.tools)

##create raster
row <- 8
col <- 8
r <- raster(nrows=row, ncols=col,xmn=0, xmx=row, ymn=0, ymx=col, vals=c(1:(row*col)))
plot(r)

##Split
r_split <- splitRaster(r, nx=2, ny=2)

##create model
df <- data.frame(y=c(1:10),layer=c(1:5,7,6,8:10))
mod <- glm(y~layer, data=df)

## auto auth to GCE via environment file arguments
## create CPUs names
vm_names <- paste0("my-server",1:4)

## make sure vms won't get shut off
preemptible = list(preemptible = FALSE)

## start up VMs with R base on them (can also customise via Dockerfiles using gce_vm_template instead)
vms <- lapply(vm_names, gce_vm, predefined_type = "n1-standard-1", template = "r-base", scheduling = preemptible)

## add any ssh details, username etc.
vms <- lapply(vms, gce_ssh_setup)

## once all launched, add to cluster
plan(cluster, workers = as.cluster(vms))

## the action you want to perform via cluster
##function
my_single_function <- function(x){
  install.packages("raster", dependencies=TRUE)
  library(raster)
  a <- predict(r_split[[x]], mod)
  return(a)
}


#parallel
system.time(save3 <- future_lapply(1:4, my_single_function))


## tidy up
lapply(vms, FUN = gce_vm_stop)

##why does this not work?
##try locally:
predict(r_split[[1]], mod)
lapply(1:4, function(x) { a <- predict(r_split[[x]], mod)
       return(a)})


@MarkEdmondson1234 (Collaborator) commented Feb 12, 2018

Just eliminating the obvious, but are you running this line:

## tidy up
lapply(vms, FUN = gce_vm_stop)

...before doing your calculations? As that line deletes all the VMs you started. :) It's meant to be run after you've done all your stuff.

@ctlamb (Author) commented Feb 12, 2018

Always good to eliminate the obvious first!! But no, I run that after I should be done with the VMs. The code below that line is just me running the function locally to ensure it should in fact work.

@MarkEdmondson1234 (Collaborator) commented Feb 12, 2018

OK, moving on then :) Are you sure the installation of raster works? Perhaps raise an error via require(). As it's a very stripped-down default Docker image, you may need to install Linux dependencies first; without them the install fails.
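For example, a sketch of that suggestion (not code from the thread; it mirrors the function from the original post, including its reliance on the globals r_split and mod):

```r
# inside the function sent to the cluster: fail loudly if raster is unusable
my_single_function <- function(x) {
  install.packages("raster", dependencies = TRUE)
  if (!require("raster", character.only = TRUE)) {
    stop("raster failed to install/load on the worker")
  }
  predict(r_split[[x]], mod)
}
```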

I would suggest it's better to install raster in a custom Dockerfile that also installs any Linux dependencies, since you can better control what's going on. You may as well install future in there too, since the cluster needs it to communicate.

@ctlamb (Author) commented Feb 12, 2018

I agree. I have struggled a bit with understanding the Docker code. If I want to install raster and future, would my Dockerfile look like this?

Modified from (https://cloudyr.github.io/googleComputeEngineR/articles/docker.html#dockerfiles)

    
## Install packages from CRAN
RUN install2.r --error \ 
    -r 'http://cran.rstudio.com' \
    raster \
    ## clean up
    && rm -rf /tmp/downloaded_packages/ /tmp/*.rds

##Not sure how to add future??

Or like this:
from (https://github.com/glamp/r-docker/blob/master/Dockerfile)

#setup R configs
RUN echo "r <- getOption('repos'); r['CRAN'] <- 'http://cran.us.r-project.org'; options(repos = r);" > ~/.Rprofile
RUN Rscript -e "install.packages('raster')"
RUN Rscript -e "install.packages('future')"

@MarkEdmondson1234 (Collaborator) commented Feb 15, 2018

The first is, I think, easier to maintain and takes advantage of littler being installed in the base rocker/r-base image - for multiple packages you add them to the RUN, and you can also use its installGithub.r script to install GitHub dependencies:

An example from here

FROM rocker/tidyverse
MAINTAINER Mark Edmondson (r@sunholo.com)

# install R package dependencies
# only needed if the R packages in the second RUN need them
RUN apt-get update && apt-get install -y \
    ## here you would add any Unix dependencies needed by the R packages below
    && apt-get clean \ 
    && rm -rf /var/lib/apt/lists/ \ 
    && rm -rf /tmp/downloaded_packages/ /tmp/*.rds
    
## Install packages from CRAN
RUN install2.r --error \ 
    -r 'http://cran.rstudio.com' \
    googleAuthR \ 
    googleComputeEngineR \ 
    googleAnalyticsR \ 
    searchConsoleR \ 
    googleCloudStorageR \
    bigQueryR \ 
    ## install Github packages
    && installGithub.r MarkEdmondson1234/youtubeAnalyticsR \
                       MarkEdmondson1234/googleID \
    ## clean up
    && rm -rf /tmp/downloaded_packages/ /tmp/*.rds 

But it may not be a dependency issue, so I would first launch just one Docker container with your script to test whether it works OK. Once it works on one, you can add the complication of multiple VMs.

@ctlamb (Author) commented Feb 18, 2018

OK, so I have edited the Docker code below. It is not clear to me whether I can just call this as a text file, or whether I need to upload it to Docker. I first tried to upload to the Google Container Registry as per the instructions here (https://cloudyr.github.io/googleComputeEngineR/articles/docker.html), where it appeared I could make a build trigger, upload my Docker code, and then link to this file. But I had trouble setting up the trigger; the mirror never seemed to complete between Google Cloud and GitHub. I tried a second approach: I logged onto Docker Hub and created a repository, but it wasn't clear how to upload the code below.

So in short, I appear not to be succeeding in my Docker endeavour.

FROM rocker/tidyverse
MAINTAINER Mark Edmondson (r@sunholo.com)

# install R package dependencies
RUN apt-get update && apt-get install -y \
    ## clean up
    && apt-get clean \ 
    && rm -rf /var/lib/apt/lists/ \ 
    && rm -rf /tmp/downloaded_packages/ /tmp/*.rds
    
## Install packages from CRAN
RUN install2.r --error \ 
    -r 'http://cran.rstudio.com' \
    googleAuthR \ 
    googleComputeEngineR \ 
    raster \
    ## install Github packages
    && installGithub.r rforge/raster/pkg/raster \
    ## clean up
    && rm -rf /tmp/downloaded_packages/ /tmp/*.rds

@MarkEdmondson1234 (Collaborator) commented Feb 18, 2018

The link between the Dockerfile and GitHub is the build trigger setting, I think you are just missing setting that up - https://cloud.google.com/container-builder/docs/running-builds/automate-builds

@ctlamb (Author) commented Feb 21, 2018

I have set up what I think is my Docker image (here), which I set up in the Google Container Registry:

[screenshots]

I added the Dockerfile as per here:

#Test to split up raster and predict
library(raster)
library(googleComputeEngineR)
library(future)
library(SpaDES.tools)

##create raster
row <- 8
col <- 8
r <- raster(nrows=row, ncols=col,xmn=0, xmx=row, ymn=0, ymx=col, vals=c(1:(row*col)))
plot(r)

##Split
r_split <- splitRaster(r, nx=2, ny=2)

##create model
df <- data.frame(y=c(1:10),layer=c(1:5,7,6,8:10))
mod <- glm(y~layer, data=df)

## auto auth to GCE via environment file arguments
## create CPUs names
vm_names <- paste0("my-server",1:2)

## make sure vms won't get shut off
preemptible = list(preemptible = FALSE)

## start up VMs with R base on them (can also customise via Dockerfiles using gce_vm_template instead)
vms <- lapply(vm_names, gce_vm, 
              predefined_type = "n1-standard-1", 
              template = "r-base", 
              scheduling = preemptible,
              dynamic_image = gce_tag_container("raster_docker", project = "LambEcoResearch"))

## add any ssh details, username etc.
vms <- lapply(vms, gce_ssh_setup)

## once all launched, add to cluster
plan(cluster, workers = as.cluster(vms))

## the action you want to perform via cluster
##function
my_single_function <- function(x){
  a <- predict(r_split[[x]], mod)
  return(a)
}


#parallel
system.time(save3 <- future_lapply(1:2, my_single_function))


## tidy up
lapply(vms, FUN = gce_vm_stop)

But I am still not getting the raster package loaded onto my VMs.

[screenshot]

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

Ah, it strikes me that although you installed the raster package on the image, your function isn't explicitly loading it. Specify the function as raster::predict and/or call library()/require() inside the function you send to the cluster.

You can think of the function you send up as existing only in the Docker image's environment, not in the script you are sending it from.

So your function will also need to send up the data it's computing on:

my_single_function <- function(x, mod, r_split){
  raster::predict(r_split[[x]], mod)
}
save3 <- future_lapply(1:2, my_single_function, mod = mod, r_split = r_split)

@ctlamb (Author) commented Feb 22, 2018

Thanks for your dedicated assistance and patience, Mark!

The VM still seems to be having issues finding the raster package, so it must be an issue with my Docker image (here)? It is just a text file I uploaded to GitHub and then linked with the Container Registry, as shown above.

Can I check from googleComputeEngineR whether raster has been loaded?

[screenshot]
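One quick way to check (a sketch, not from the thread, assuming a future plan pointing at the cluster is already active so the probe runs on a worker):

```r
library(future)

# send a small probe to a worker; returns TRUE if raster is installed there
f <- future(requireNamespace("raster", quietly = TRUE))
value(f)
```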

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

Your build trigger is probably not working correctly. Looking at your repo, the Docker file needs to be called exactly Dockerfile, not raster_docker as it is at the moment - oh, unless you used the build trigger option, which I see now. Maybe ignore that.

@ctlamb (Author) commented Feb 22, 2018

I feel like I am very close, just not quite there…

I edited the repo name to "Dockerfile" here.

I remade the trigger:
[screenshots]

I edited the R code to reflect the new trigger name:
[screenshot]

But I get the same issue:
[screenshot]

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

I just forked the Dockerfile and am building it, so I can reproduce and hopefully fix this.

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

I found that although we create the VMs with your image, we're not using it in the plan() cluster, so it still defaults back to r-base. You need this bit:

plan(cluster, workers = as.cluster(
  vms, 
  docker_image=gce_tag_container("raster_docker", project = "LambEcoResearch"),
  rscript=c("docker", "run", c("--net=host"),
            gce_tag_container("raster_docker", project = "LambEcoResearch"), 
            "Rscript")))

@ctlamb (Author) commented Feb 22, 2018

oooo, good catch!

I assume I edited your script above correctly to reference my Dockerfile?

plan(cluster, workers = as.cluster(
  vms, 
  docker_image=gce_tag_container("dockerfile", project = "LambEcoResearch"),
  rscript=c("docker", "run", c("--net=host"),
            gce_tag_container("dockerfile", project = "LambEcoResearch"), 
            "Rscript")))

It's throwing a capitalization error now. Did you not get that error?

[screenshot]

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

Yes, that is a case of updating your build trigger name. I'm working on it now, so I hope to post a working example by the end of today. In the Container Registry you can look at your build history to see whether the builds succeed.

@ctlamb (Author) commented Feb 22, 2018

OK, will check back in tomorrow. Thank you!

I've changed all names in the Container Registry to lowercase. The only two uppercase strings are my project name "LambEcoResearch" and the name "Dockerfile" from GitHub.

When I try to run the trigger from within the Container Registry, it fails, but I'm not sure whether that is expected. Once again, I really appreciate your help on this.

[screenshot]

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

This is working up to some error about connections, but I have verified that raster is installed on the cluster:

#Test to split up raster and predict
library(raster)
library(googleComputeEngineR)
library(future)
library(SpaDES.tools)

gce_global_project("xxxx")

##create raster
row <- 8
col <- 8
r <- raster(nrows=row, ncols=col,xmn=0, xmx=row, ymn=0, ymx=col, vals=c(1:(row*col)))
plot(r)

##Split
r_split <- splitRaster(r, nx=2, ny=2)

##create model
df <- data.frame(y=c(1:10),layer=c(1:5,7,6,8:10))
mod <- glm(y~layer, data=df)

## auto auth to GCE via environment file arguments
## create CPUs names
vm_names <- paste0("my-server",1:2)

## make sure vms won't get shut off
preemptible = list(preemptible = FALSE)

my_image <- gce_tag_container("your-docker-name", project = "xxxx")
## start up VMs with R base on them (can also customise via Dockerfiles using gce_vm_template instead)
vms <- lapply(vm_names, gce_vm, 
              predefined_type = "n1-standard-1", 
              template = "r-base", 
              scheduling = preemptible,
              dynamic_image = my_image)

## add any ssh details, username etc.
vms <- lapply(vms, gce_ssh_setup)

## once all launched, add to cluster
plan(cluster, workers = as.cluster(
  vms, 
  docker_image=my_image,
  rscript=c("docker", "run", c("--net=host"),
            my_image, 
            "Rscript")))

## the action you want to perform via cluster
##function
my_single_function <- function(x, r_split, mod){
  raster::predict(r_split[[x]], mod)
}


#parallel
save3 <- future_lapply(1:2, my_single_function, r_split = r_split, mod = mod)
#> Error: cannot open the connection

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

This is my build trigger config

[screenshot]

@ctlamb (Author) commented Feb 22, 2018

Hmmm, I set up my build trigger config the same as yours, but I am getting capitalization errors. I haven't made it to the connection issues yet:

2018-02-22 16:09:01> External IP for instance my-server1 : 35.230.8.205
Warning: Permanently added '35.230.8.205' (RSA) to the list of known hosts.
docker: Error parsing reference: "gcr.io/LambEcoResearch/Raster" is not a valid repository/tag: repository name must be lowercase.
See 'docker run --help'.

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

OK, the source of my connection error is that the raster package writes files to the working directory (in layers) for its calculations, and those files are not present when the cluster attempts to work on them. The function then needs to upload those files (I guess one per node) so they exist for the node to work on.

I'm not familiar with the package, so I'm not sure of the best way to do that, but you can see the file locations in the test r_split object:

r_split
[[1]]
class       : RasterLayer 
dimensions  : 4, 4, 16  (nrow, ncol, ncell)
resolution  : 1, 1  (x, y)
extent      : 0, 4, 0, 4  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
data source : /Users/mark/dev/R/xxx/layer/layer_tile1.grd 
names       : layer 
values      : 33, 60  (min, max)


[[2]]
class       : RasterLayer 
dimensions  : 4, 4, 16  (nrow, ncol, ncell)
resolution  : 1, 1  (x, y)
extent      : 0, 4, 4, 8  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
data source : /Users/mark/dev/R/xxx/layer/layer_tile2.grd 
names       : layer 
values      : 1, 28  (min, max)


[[3]]
class       : RasterLayer 
dimensions  : 4, 4, 16  (nrow, ncol, ncell)
resolution  : 1, 1  (x, y)
extent      : 4, 8, 0, 4  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
data source : /Users/mark/dev/R/xxxx/layer/layer_tile3.grd 
names       : layer 
values      : 37, 64  (min, max)


[[4]]
class       : RasterLayer 
dimensions  : 4, 4, 16  (nrow, ncol, ncell)
resolution  : 1, 1  (x, y)
extent      : 4, 8, 4, 8  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
data source : /Users/mark/dev/R/xxx/layer/layer_tile4.grd 
names       : layer 
values      : 5, 32  (min, max)

@ctlamb (Author) commented Feb 22, 2018

You can pull each of those RasterLayer objects in the list out of the working directory and into memory using raster::readAll(r_split[[1]]). So for each node, say node 3, we want to send the following data:

raster::readAll(r_split[[3]]) and mod
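To illustrate the file-backed vs in-memory distinction locally, here is a minimal sketch (not part of the original thread; it uses writeRaster() with a temporary file to stand in for a splitRaster tile):

```r
library(raster)

# a small file-backed raster, standing in for a splitRaster tile
r <- raster(nrows = 4, ncols = 4, vals = 1:16)
tile <- writeRaster(r, tempfile(fileext = ".grd"))

inMemory(tile)       # FALSE: the values live in the .grd file on disk
tile_mem <- readAll(tile)
inMemory(tile_mem)   # TRUE: now safe to serialise and send to a remote node
```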

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

Thanks for that, got it working!

You can use plan(sequential) to test out the loop locally before using plan(cluster, ...) to send it up.

#Test to split up raster and predict
library(raster)
library(googleComputeEngineR)
library(future)
library(SpaDES.tools)

gce_global_project("xxxx")

##create raster
row <- 8
col <- 8
r <- raster(nrows=row, ncols=col,xmn=0, xmx=row, ymn=0, ymx=col, vals=c(1:(row*col)))
plot(r)

##Split
r_split <- splitRaster(r, nx=2, ny=2)

##create model
df <- data.frame(y=c(1:10),layer=c(1:5,7,6,8:10))
mod <- glm(y~layer, data=df)

## auto auth to GCE via environment file arguments
## create CPUs names
vm_names <- paste0("my-server",1:2)

# made via build triggers, installed raster on custom Docker image
my_image <- gce_tag_container("raster", project = "xxxx")

## start up VMs with custom Dockerfile
vms <- lapply(vm_names, gce_vm, 
              predefined_type = "n1-standard-1", 
              template = "r-base", 
              dynamic_image = my_image)

## add any ssh details, username etc.
vms <- lapply(vms, gce_ssh_setup)

## once all launched, add to cluster with custom Dockerfile
# use plan(sequential) for local testing
plan(cluster, workers = as.cluster(
  vms, 
  docker_image=my_image,
  rscript=c("docker", "run", c("--net=host"),
            my_image, 
            "Rscript")))

## make the vector of stuff to send to nodes
o <- lapply(r_split, readAll)

## the action you want to perform on the elements in the cluster
my_single_function <- function(x){
  raster::predict(x, mod)
}

#parallel - working!
result <- future_lapply(o, my_single_function)

## tidy up
lapply(vms, FUN = gce_vm_stop)
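The original goal also included reassembling the tiles, which the thread does not show. A sketch of that final step, assuming result is the list of predicted RasterLayer tiles returned above:

```r
# merge the predicted tiles back into a single RasterLayer
# (raster::merge combines rasters that align on a common grid)
final <- do.call(raster::merge, result)
plot(final)
```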

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

If you don't mind, I'll add this example to the documentation?

@ctlamb (Author) commented Feb 22, 2018

Thank you, Mark! Very exciting!

Please do add to the documentation.

To ensure it works for others, we may want to figure out my capitalization error first, as I still can't get past the plan() call due to:

2018-02-22 17:14:34> External IP for instance my-server1 : 35.230.8.205
Warning: Permanently added '35.230.8.205' (RSA) to the list of known hosts.
docker: Error parsing reference: "gcr.io/LambEcoResearch/raster" is not a valid repository/tag: repository name must be lowercase.
See 'docker run --help'.

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

You can change that in the build trigger settings when setting the name - this bit:
[screenshot]

It doesn't necessarily have to be your Google project name. Then change the gce_tag_container("raster", project = "xxxx") accordingly.

@ctlamb (Author) commented Feb 22, 2018

So I removed the caps. My project name is LambEcoResearch but, as suggested above, it needn't be my Google project name:
[screenshot]

I then changed:
my_image <- gce_tag_container("raster", project = "lambecoresearch")

But it appears to look for project "lambecoresearch" and of course can't find it because it doesn't exist. Perhaps I need to change my project name...

2018-02-22 17:29:31> External IP for instance my-server1 : 35.230.8.205
Unable to find image 'gcr.io/lambecoresearch/raster:latest' locally
docker: Error response from daemon: Get https://gcr.io/v2/lambecoresearch/raster/manifests/latest: denied: Please enable or contact project owners to enable the Google Container Registry API in Cloud Console at https://console.cloud.google.com/apis/api/containerregistry.googleapis.com/overview?project=lambecoresearch before performing this operation..
See 'docker run --help'.

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

What's the project ID on the Google console home screen? I'd be surprised if it allows uppercase; perhaps you are using the project name instead?

@ctlamb (Author) commented Feb 22, 2018

[screenshot]

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

Yep, you need t-skyline-188419 as your project ID.

@ctlamb (Author) commented Feb 22, 2018

Very good!!

It seems I need to allow permission somehow:

2018-02-22 17:42:20> External IP for instance my-server1 : 35.199.190.241
Unable to find image 'gcr.io/t-skyline-188419/raster:latest' locally
Pulling repository gcr.io/t-skyline-188419/raster
docker: unauthorized: authentication required.
See 'docker run --help'.

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

That will happen if your VMs are in a different project from your Docker images, which may be the case if your project IDs were different.

@ctlamb (Author) commented Feb 22, 2018

Is there a function I can call to check the project ID of the Docker images and VMs? Looking at the info inside my vms object, it appears they are associated with project ID "t-skyline-188419". I deleted all the VMs and restarted them to make sure.

And it seems I had originally set up the SSH link through the terminal on my Mac with the same ID, so it should be OK?
[screenshot]

My Docker image is also inside project ID "t-skyline-188419", as far as I can tell.

@ctlamb (Author) commented Feb 22, 2018

I did just notice this error at the beginning, though, when I set the global project:
[screenshot]

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

Oh, that's an old bug that I thought was fixed in the GitHub version. If not, sadly the easiest fix is to make a new project without numbers.

@MarkEdmondson1234 (Collaborator) commented Feb 22, 2018

I'm not 100% positive it will work if you start the VMs using gcloud, since the R launcher customises the VM; one of those customisations is to allow auth with all cloud services, so that may be it.

MarkEdmondson1234 added a commit that referenced this issue Feb 22, 2018

@ctlamb (Author) commented Mar 1, 2018

I made and authenticated a new project with name and ID lambspatialgrid.

However, I still get a similar error:

> # use plan(sequential) for local testing
> plan(cluster, workers = as.cluster(
+   vms, 
+   docker_image=my_image,
+   rscript=c("docker", "run", c("--net=host"),
+             my_image, 
+             "Rscript")))
2018-03-01 18:18:40> External IP for instance my-server1 : 35.230.73.230
Unable to find image 'gcr.io/lambspatialgrid/raster:latest' locally
docker: Error response from daemon: repository gcr.io/lambspatialgrid/raster not found: does not exist or no pull access.
See 'docker run --help'.

@MarkEdmondson1234 (Collaborator) commented Mar 4, 2018

You have to be so close! Is the raster build trigger on the same project too? Just to check, I have put the raster image I built in the public directory, which should have no authentication problems, so it is available via:

googleComputeEngineR::gce_tag_container("raster", project = "gcer-public")
#[1] "gcr.io/gcer-public/raster"

But part of the power is having your own private images, so this is only a workaround.

@ctlamb (Author) commented Mar 9, 2018

This is working!! Thank you!!

@MarkEdmondson1234 (Collaborator) commented Mar 15, 2018

Great, thanks for your patience in getting it up and running.
