rhbase segfaults if hb.list.tables() is called before hb.init() #152

Open
jbarber opened this Issue Oct 30, 2012 · 2 comments


jbarber commented Oct 30, 2012

(warning: I've only just started using hadoop + RHadoop, it's entirely possible I'm doing something wrong)

I get a segfault in rhbase (from the rmr-2.0.0 tag) if I don't call hb.init() before hb.list.tables():

> library(rhbase)
> hb.list.tables()

 *** caught segfault ***
address 0x10000b8, cause 'memory not mapped'

Traceback:
 1: .Call("hb_get_tables", hbc, PACKAGE = "rhbase")
 2: hb.list.tables()

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 

gdb and the core file place the crash at line 61 of tools.cc:

(gdb) bt
#0  0x00007fca4005720c in hb_get_tables (r=<value optimized out>) at tools.cc:61
#1  0x0000003ed12a921c in ?? () from /usr/lib64/R/lib/libR.so
#2  0x0000003ed12dd6fb in Rf_eval () from /usr/lib64/R/lib/libR.so
#3  0x0000003ed12e44b0 in ?? () from /usr/lib64/R/lib/libR.so
#4  0x0000003ed12dd51b in Rf_eval () from /usr/lib64/R/lib/libR.so
#5  0x0000003ed12def50 in ?? () from /usr/lib64/R/lib/libR.so
#6  0x0000003ed12dd51b in Rf_eval () from /usr/lib64/R/lib/libR.so
#7  0x0000003ed12df831 in Rf_applyClosure () from /usr/lib64/R/lib/libR.so
#8  0x0000003ed12dd3f8 in Rf_eval () from /usr/lib64/R/lib/libR.so
#9  0x0000003ed1314a98 in Rf_ReplIteration () from /usr/lib64/R/lib/libR.so
#10 0x0000003ed1314d29 in ?? () from /usr/lib64/R/lib/libR.so
#11 0x0000003ed1315260 in run_Rmainloop () from /usr/lib64/R/lib/libR.so
#12 0x000000000040084b in main ()
(gdb) f 0
#0  0x00007fca4005720c in hb_get_tables (r=<value optimized out>) at tools.cc:61
61        client->getTableNames(tables);
(gdb) list
56    SEXP hb_get_tables(SEXP r){
57      HbaseClient *client  = static_cast<HbaseClient*>(R_ExternalPtrAddr(r));
58      std::vector<std::string> tables;
59      SEXP result = R_NilValue;
60      try{
61        client->getTableNames(tables);
62        if(tables.size()>0){
63          PROTECT(result = Rf_allocVector(STRSXP,tables.size()));
64          for(unsigned int i=0;i < tables.size(); i++){
65            SET_STRING_ELT(result,i,Rf_mkChar(static_cast<const char*>(tables[i].c_str())));
(gdb) p tables
$4 = std::vector of length 0, capacity 0
(gdb) p client
$5 = (apache::hadoop::hbase::thrift::HbaseClient *) 0x10d5238

I guess either "tables" or "client" isn't initialized properly, but my C++ and R skills are too weak to diagnose it further.


ghost commented Oct 30, 2012

You have to call hb.init() before invoking any other function in the rhbase package. Please look at the examples in the documentation and the unit tests in the package.

jbarber commented Oct 30, 2012

Thank you for pointing out the documentation. However, on reviewing it I don't think it says that:

  1. the call to hb.init() is required
  2. your entire R environment will blow up if you don't call hb.init()

In addition, I wouldn't normally expect an R function call to crash the entire environment.

Can I suggest improving the user-friendliness of the package by detecting that hb.init() hasn't been called and then doing some combination of:

  1. emitting a warning
  2. calling hb.init() with the defaults

and not causing a segfault.
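
A minimal sketch of the kind of guard being suggested, for illustration only. It is self-contained C++ and does not use the real rhbase or R headers: FakeClient, Handle, list_tables and try_list are hypothetical stand-ins, not names from the package. The idea is simply to validate the connection handle before dereferencing it and to report a readable error instead of crashing. In the real tools.cc the equivalent check would presumably test that the argument is an external pointer (TYPEOF(r) == EXTPTRSXP) with a non-null R_ExternalPtrAddr(r), and raise the error with Rf_error; that mapping is an assumption, not something taken from the rhbase source.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the Thrift-generated HbaseClient.
struct FakeClient {
    std::vector<std::string> names;
    void getTableNames(std::vector<std::string>& out) const { out = names; }
};

// Hypothetical stand-in for the value passed from R: a flag saying whether
// it really is a connection handle, plus the address it wraps.
struct Handle {
    bool is_connection;
    FakeClient* client;
};

// Guarded lookup: refuse to dereference anything that is not a live handle.
std::vector<std::string> list_tables(const Handle& h) {
    if (!h.is_connection || h.client == nullptr) {
        throw std::runtime_error("no HBase connection: call hb.init() first");
    }
    std::vector<std::string> tables;
    h.client->getTableNames(tables);
    return tables;
}

// Returns the error message for an invalid handle, or "" on success.
std::string try_list(const Handle& h) {
    try {
        list_tables(h);
        return "";
    } catch (const std::runtime_error& e) {
        return e.what();
    }
}
```

With a guard like this, calling the lookup on an uninitialized handle yields an error the caller can read and recover from, which is the behaviour requested above. Falling back to hb.init() with its defaults would be an alternative design, but it is not shown here.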

Regards
