Segfault adding #13

Closed
mikepb opened this Issue Feb 1, 2016 · 8 comments

@mikepb
Contributor
mikepb commented Feb 1, 2016

I ran into this segfault today after upgrading Rcpp and rebuilding RcppAnnoy. It was working yesterday, with the older Rcpp, but I'm not sure if that's the underlying issue. This seems like one of those "was working yesterday" bugs...

> str(knn_dt)
Classes ‘data.table’ and 'data.frame':  213451 obs. of  1050 variables:
 $ gender                                                                            : num  NaN NaN NaN 2 NaN 2 2 NaN 1 1 ...
 $ signup_method                                                                     : num  1 1 1 1 1 1 1 1 1 1 ...
 $ signup_flow                                                                       : num  1 1 1 1 17 1 1 16 3 4 ...
 $ language                                                                          : num  6 6 6 6 6 6 6 6 6 6 ...
 $ affiliate_channel                                                                 : num  3 3 3 3 3 3 3 3 7 3 ...
 $ affiliate_provider                                                                : num  5 5 5 5 5 5 5 5 9 5 ...
 $ first_affiliate_tracked                                                           : num  4 7 7 7 7 7 7 7 7 7 ...
 $ signup_app                                                                        : num  4 4 4 4 3 4 4 1 4 4 ...
 $ first_device_type                                                                 : num  6 4 9 6 9 9 6 7 6 6 ...
 $ first_browser                                                                     : num  43 30 17 43 8 17 17 NaN 17 8 ...
 $ age                                                                               : num  31 NaN NaN 40 NaN 38 41 NaN 34 28 ...
 $ age_ul                                                                            : num  0 0 0 0 0 0 0 0 0 0 ...
 $ age_ll                                                                            : num  0 0 0 0 0 0 0 0 0 0 ...
 $ date_account_created.wday                                                         : num  3 0 5 2 1 3 1 4 2 5 ...
 $ date_account_created.mday                                                         : num  14 4 27 3 8 27 8 24 27 21 ...
 $ date_account_created.yday                                                         : num  133 215 208 336 188 57 188 113 360 264 ...
 $ date_account_created.mon                                                          : num  4 7 6 11 6 1 6 3 11 8 ...
 $ date_account_created.yweek                                                        : num  2 4 4 6 3 1 3 2 6 5 ...
 $ date_account_created.year                                                         : num  2014 2013 2012 2013 2013 ...
 $ date_account_created.wday0                                                        : num  0 1 0 0 0 0 0 0 0 0 ...
 $ date_account_created.wday1                                                        : num  0 0 0 0 1 0 1 0 0 0 ...
 $ date_account_created.wday2                                                        : num  0 0 0 1 0 0 0 0 1 0 ...
 $ date_account_created.wday3                                                        : num  1 0 0 0 0 1 0 0 0 0 ...
 $ date_account_created.wday4                                                        : num  0 0 0 0 0 0 0 1 0 0 ...
 $ date_account_created.wday5                                                        : num  0 0 1 0 0 0 0 0 0 1 ...
 $ date_account_created.wday6                                                        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ timestamp_first_active.wday                                                       : num  3 0 5 2 1 3 1 4 2 5 ...
 $ timestamp_first_active.mday                                                       : num  14 4 27 3 8 27 8 24 27 21 ...
 $ timestamp_first_active.yday                                                       : num  133 215 208 336 188 57 188 113 360 264 ...
 $ timestamp_first_active.mon                                                        : num  4 7 6 11 6 1 6 3 11 8 ...
 $ timestamp_first_active.hour                                                       : num  19 22 9 13 17 19 16 17 1 5 ...
 $ timestamp_first_active.yweek                                                      : num  2 4 4 6 3 1 3 2 6 5 ...
 $ timestamp_first_active.year                                                       : num  2014 2013 2012 2013 2013 ...
 $ timestamp_first_active.wday0                                                      : num  0 1 0 0 0 0 0 0 0 0 ...
 $ timestamp_first_active.wday1                                                      : num  0 0 0 0 1 0 1 0 0 0 ...
 $ timestamp_first_active.wday2                                                      : num  0 0 0 1 0 0 0 0 1 0 ...
 $ timestamp_first_active.wday3                                                      : num  1 0 0 0 0 1 0 0 0 0 ...
 $ timestamp_first_active.wday4                                                      : num  0 0 0 0 0 0 0 1 0 0 ...
 $ timestamp_first_active.wday5                                                      : num  0 0 1 0 0 0 0 0 0 1 ...
 $ timestamp_first_active.wday6                                                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Android_App_Unknown_Phone_Tablet                                   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Android_Phone                                                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Blackberry                                                         : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Chromebook                                                         : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Linux_Desktop                                                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Mac_Desktop                                                        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Opera_Phone                                                        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Tablet                                                             : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Windows_Desktop                                                    : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Windows_Phone                                                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent._unknown_                                                          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.iPad_Tablet                                                        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.iPhone                                                             : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.iPodtouch                                                          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Android_App_Unknown_Phone_Tablet           : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Android_Phone                              : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Blackberry                                 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Chromebook                                 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Linux_Desktop                              : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Mac_Desktop                                : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Opera_Phone                                : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Tablet                                     : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Windows_Desktop                            : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Windows_Phone                              : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time._unknown_                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.iPad_Tablet                                : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.iPhone                                     : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.iPodtouch                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Android_App_Unknown_Phone_Tablet             : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Android_Phone                                : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Blackberry                                   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Chromebook                                   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Linux_Desktop                                : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Mac_Desktop                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Opera_Phone                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Tablet                                       : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Windows_Desktop                              : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Windows_Phone                                : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time._unknown_                                    : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.iPad_Tablet                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.iPhone                                       : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.iPodtouch                                    : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Android_App_Unknown_Phone_Tablet            : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Android_Phone                               : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Blackberry                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Chromebook                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Linux_Desktop                               : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Mac_Desktop                                 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Opera_Phone                                 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Tablet                                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Windows_Desktop                             : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Windows_Phone                               : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time._unknown_                                   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.iPad_Tablet                                 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.iPhone                                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.iPodtouch                                   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sd.secs_elapsed.Compose.device_time.Android_App_Unknown_Phone_Tablet              : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sd.secs_elapsed.Compose.device_time.Android_Phone                                 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sd.secs_elapsed.Compose.device_time.Blackberry                                    : num  0 0 0 0 0 0 0 0 0 0 ...
  [list output truncated]
 - attr(*, ".internal.selfref")=<externalptr> 
>   apply(knn_dt[1,], 1, function(row) {
+     knn$addItem(as.integer(row[[na_col]]), row[names(row) != na_col])
+   })

 *** caught segfault ***
address 0xfffff7c600000008, cause 'memory not mapped'

Traceback:
 1: .External(list(name = "CppMethod__invoke_void", address = <pointer: 0x7f92f6a37840>,     dll = list(name = "Rcpp", path = "/usr/local/lib/R/3.2/site-library/Rcpp/libs/Rcpp.so",         dynamicLookup = TRUE, handle = <pointer: 0x7f92f36f9360>,         info = <pointer: 0x107138670>), numParameters = -1L),     <pointer: 0x7f92f6e39f00>, <pointer: 0x7f92f3495f10>, .pointer,     ...)
 2: knn$addItem(as.integer(row[[na_col]]), row[names(row) != na_col])
 3: FUN(newX[, i], ...)
 4: apply(knn_dt[1, ], 1, function(row) {    knn$addItem(as.integer(row[[na_col]]), row[names(row) !=         na_col])})
@mikepb
Contributor
mikepb commented Feb 1, 2016

Let me try again with gcc instead of clang and update this thread in a few hours.

@eddelbuettel
Owner

There is nothing reproducible in your bug report so my ability to help you from here is limited.

Did you use 0.0.7 from CRAN as your base, or the current master from GitHub (which is marginally ahead and which I called 0.0.7.1)? FWIW, I can run R CMD check on both just fine (and this builds and installs the package to run the tests). But this stuff has been affected by R changes and/or g++ changes.

@erikbern
erikbern commented Feb 1, 2016

I pushed some pretty extensive changes to annoy yesterday, but I don't think they are part of RcppAnnoy yet, right? Just wanted to double-check.

@eddelbuettel
Owner

Hi @erikbern and thanks for being so proactive. RcppAnnoy does not automagically copy code from Annoy; it is pretty much a manual copy (along with copious regression checks).

Now, I can't speak for @mikepb, but I don't think he copied code either. At this point we need more details (e.g. as provided by R's sessionInfo(), along with OS, compiler, ... versions). As I said, "works for me", and it also works at CRAN or else they'd come after me.

@mikepb
Contributor
mikepb commented Feb 1, 2016

Sorry, this was my bad! I had NA for the item value. Here's reproducible code:

> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-apple-darwin15.2.0 (64-bit)
Running under: OS X 10.11.3 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RcppAnnoy_0.0.7

loaded via a namespace (and not attached):
[1] Rcpp_0.12.3      codetools_0.2-14
> f <- 1049
> knn <- new(AnnoyEuclidean, f)
> sv <- rnorm(f)
> knn$addItem(NA, sv)
 *** caught segfault ***
address 0xfffff7c600000008, cause 'memory not mapped'

Traceback:
 1: .External(list(name = "CppMethod__invoke_void", address = <pointer: 0x7fb5e3431380>,     dll = list(name = "Rcpp", path = "/usr/local/lib/R/3.2/site-library/Rcpp/libs/Rcpp.so",         dynamicLookup = TRUE, handle = <pointer: 0x7fb5e3558f40>,         info = <pointer: 0x10a7af3d0>), numParameters = -1L),     <pointer: 0x7fb5e3544400>, <pointer: 0x7fb5e354d760>, .pointer,     ...)
 2: knn$addItem(NA, sv)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

Should the API check for this error condition?

@mikepb
Contributor
mikepb commented Feb 2, 2016

The segfault likely comes from annoy passing the NA to realloc, which then crashes the program.

https://github.com/eddelbuettel/rcppannoy/blob/master/inst/include/annoylib.h#L422
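
One way to sanity-check where the bad address comes from is to redo the pointer arithmetic by hand. The sketch below is only an illustration under stated assumptions: it assumes NA arrives in C++ as the smallest 32-bit integer (which is how R stores NA_integer_), and it assumes a node size of 4212 bytes for f = 1049 (a hypothetical layout of 16 header bytes plus 1049 four-byte floats, not confirmed against annoylib.h). The names kNaInteger, kAssumedNodeBytes and node_address are made up for this example.

```cpp
#include <cstdint>
#include <limits>

// R stores NA_integer_ as the smallest 32-bit integer.
constexpr std::int32_t kNaInteger = std::numeric_limits<std::int32_t>::min();

// ASSUMPTION: bytes per node for f = 1049, taken as 16 header bytes plus
// 1049 four-byte floats. This layout is a guess, not read from annoylib.h.
constexpr std::int64_t kAssumedNodeBytes = 16 + 4 * 1049;  // 4212

// Address of node `item` relative to `base`, using wrapping 64-bit
// arithmetic to mimic what indexing off a base pointer would produce.
inline std::uint64_t node_address(std::uint64_t base, std::int32_t item) {
    const std::int64_t offset = kAssumedNodeBytes * static_cast<std::int64_t>(item);
    return base + static_cast<std::uint64_t>(offset);
}
```

With a null base pointer, node_address(0, kNaInteger) + 8 works out to 0xfffff7c600000008, the address in the segfault message. That is consistent with an unchecked negative index being used to address a node, though it is not proof of which line actually faults.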

@eddelbuettel
Owner

Thanks for looking into this. We can probably catch that.

@eddelbuettel
Owner

Sorry for taking so long. This basically blew up whenever x in knn$addItem(x, vector) was negative, and NA is a special negative value (actually INT_MIN, the smallest 32-bit integer). But because the value has already been converted from R's SEXP to an int32_t at that point, I no longer have R's NA tests (easily), so I just test for negative values.

Commit and fix coming up.
