User-defined function not found during catalog processing for LiDAR metrics extraction #267

Closed
komazsofi opened this issue Aug 1, 2019 · 5 comments

komazsofi commented Aug 1, 2019

Dear JR,

I am trying to use lidR 2.1.0 with a user-defined function to extract LiDAR metrics. I get the following error message during processing:

 4s Error: could not find function "myMetrics"

Here is the full code using the external dataset of yours:

library(lidR)
library(future)

#Global settings
workdir="D:/R-3.5.1/library/lidR/extdata/"
setwd(workdir)

chunksize=2500
buffer=10
resolution=10
groupid=10

rasterOptions(maxmemory = 200000000000)

# Set up catalog
plan(multisession, workers = 2L)
set_lidr_threads(2L)

ctg <- catalog(workdir)

opt_chunk_buffer(ctg) <- buffer
opt_chunk_size(ctg) <- chunksize
opt_output_files(ctg) <- ""

myMetrics = function(z, i) {
  
  metrics = list(
    imean  = mean(i),      
    z95p = quantile(z, 0.95))  
  
  return(metrics)
}

metrics = grid_metrics(ctg, ~myMetrics(Z,Intensity),res=resolution, filter = ~Classification == 1L)

I followed the CRAN example to write this code. If I pass a single R function call (like ~quantile(Z, 0.95)), it works. But if I pass my own function, the code crashes. Do you know what the issue could be?

Thanks in advance for your help.

Best,
Zsofia

Jean-Romain commented Aug 1, 2019

Reproducible example

library(lidR)
library(future)

LASfile <- system.file("extdata", "Megaplot.laz", package="lidR")

plan(multisession, workers = 2L)

ctg <- catalog(LASfile)
opt_chunk_buffer(ctg) <- 20
opt_chunk_size(ctg) <- 160

myMetrics = function(z, i) 
{
  metrics = list(
    imean  = mean(i),      
    z95p = quantile(z, 0.95))  
  
  return(metrics)
}

metrics = grid_metrics(ctg, myMetrics(Z,Intensity), res=10)

Jean-Romain commented Aug 1, 2019

This error comes back again and again and again... It has been solved several times and yet it comes back anyway 😢. The problem is that your user-defined function is not exported to each worker. grid_metrics automatically detects user-defined functions and exports them, but in v2.1.0 a stray letter was added to the code (I suspect my cat 🐈) and broke the automatic export. I fixed it in 36bcc35.

Thanks for reporting the issue.
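The export problem can be illustrated outside lidR. The following is a minimal sketch using the future package directly; it does not reproduce lidR's internal export code, and the name my_metrics and the toy input vectors are made up for illustration. With multisession, each worker is a fresh R session, so a function defined in the main session is only visible there if it is shipped along with the expression.

```r
library(future)

# Two background R sessions, as in the thread
plan(multisession, workers = 2L)

# Hypothetical user-defined metric function (same shape as myMetrics)
my_metrics <- function(z, i) {
  list(imean = mean(i), z95p = quantile(z, 0.95))
}

# globals = FALSE: nothing from the main session is exported, so the
# worker cannot resolve my_metrics and value() re-raises the worker's error
f_missing <- future(my_metrics(c(1, 5, 10, 20), c(10, 20, 30, 40)),
                    globals = FALSE)
msg <- tryCatch(value(f_missing), error = function(e) conditionMessage(e))
print(msg)

# globals = TRUE (the default): my_metrics is detected and exported
f_ok <- future(my_metrics(c(1, 5, 10, 20), c(10, 20, 30, 40)),
               globals = TRUE)
res <- value(f_ok)
print(res$imean)

# Shut the workers down again
plan(sequential)
```

The first call fails in the same way as the report above, because the worker never receives the function; the second succeeds because the function travels with the expression.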


That being said, set_lidr_threads(2L) is useless here: grid_metrics is not parallelized at the C++ level. If your intention was to use 4 cores, you can write plan(multisession, workers = 4L).

Also, filter = ~Classification == 1L is suboptimal here. You read all your points into R and then do the computation on a subset of the points. Just don't read the points you don't need:

opt_filter(ctg) <- "-keep_class 1"
opt_select(ctg) <- "xyzi"
metrics = grid_metrics(ctg, myMetrics(Z,Intensity), res=10)

Finally, you don't have control over the buffer in grid_metrics, so opt_chunk_buffer(ctg) <- buffer has no effect here.
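The effect of filtering at read time can be checked on a single file. This is a small sketch, assuming the Megaplot.laz example file shipped with lidR; it uses readLAS directly, with the same select and filter strings that opt_select and opt_filter pass on for each chunk of a catalog.

```r
library(lidR)

LASfile <- system.file("extdata", "Megaplot.laz", package = "lidR")

# Read every point and every attribute, then count class-1 points in R
las_all <- readLAS(LASfile)
n_class1 <- sum(las_all@data$Classification == 1L)

# Read only class-1 points, and only X, Y, Z and Intensity
las_sub <- readLAS(LASfile, select = "xyzi", filter = "-keep_class 1")

# Same points of interest, but far less data loaded into memory
print(nrow(las_sub@data) == n_class1)
print(names(las_sub@data))
```

Both approaches end up with the same class-1 points; the second simply never loads the points and attributes that the metrics do not need.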

komazsofi commented Aug 2, 2019

Hi,

I have re-installed the latest version from GitHub (master branch) after the fix, but the problem is still there. What has changed is that filter = ~Classification == 1L now breaks too (I know it is not optimal...).

So here are the details:

library(lidR)
library(future)

LASfile <- system.file("extdata", "Megaplot.laz", package="lidR")

plan(multisession, workers = 2L)

ctg <- catalog(LASfile)
opt_chunk_buffer(ctg) <- 20
opt_chunk_size(ctg) <- 160
opt_select(ctg) <- "xyzi"

myMetrics = function(z, i) 
{
  metrics = list(
    imean  = mean(i),      
    z95p = quantile(z, 0.95))  
  
  return(metrics)
}

metrics = grid_metrics(ctg, ~myMetrics(Z, Intensity), res=10)

Both with and without ~, this gives the following error:

Error: could not find function "myMetrics"

If I try to run the following:

metrics = grid_metrics(ctg, ~quantile(Z, 0.95), res=10,filter = ~Classification == 1L)

I get the following new error:

The items in the 'by' or 'keyby' list are length (4312). Each must be length 2812; the same length as there are rows in x (after subsetting if i is provided).

@Jean-Romain Jean-Romain reopened this Aug 2, 2019
Jean-Romain added a commit that referenced this issue Aug 2, 2019
Jean-Romain commented

OK, for the first issue there were actually two typos (I don't understand why it appeared to work yesterday). It works now.

For the second issue, I fixed it. The filter was not actually working. I suspect a change in the data.table package between the moment I added the feature and now (I don't really use it myself). I will add unit tests for that.
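The 'by' error message quoted earlier in the thread comes from data.table. As a rough illustration only (this is not lidR's actual code, and the table dt, the vector cell and the mask keep are invented for the example), a mismatch of this kind arises when rows are subset but the grouping vector still has one entry per row of the full table:

```r
library(data.table)

# Toy point table with one grouping id per row of the FULL table
dt <- data.table(Z = c(1, 2, 3, 4, 5),
                 Classification = c(1L, 2L, 1L, 1L, 2L))
cell <- c("a", "a", "b", "b", "b")
keep <- dt$Classification == 1L

# Rows are subset to 3, but 'cell' still has 5 entries: data.table refuses
err <- tryCatch(
  dt[keep, .(zmax = max(Z)), by = list(cell = cell)],
  error = function(e) conditionMessage(e)
)
print(err)

# Subsetting the grouping vector with the same mask makes the lengths agree
res <- dt[keep, .(zmax = max(Z)), by = list(cell = cell[keep])]
print(res)
```

Applying the same mask to both the rows and the grouping vector is one way to keep the two in sync.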

komazsofi commented

Thank you, now it is working.

@Jean-Romain Jean-Romain self-assigned this Aug 5, 2019
@Jean-Romain Jean-Romain added the Bug A bug in the package label Aug 5, 2019