error when using topicmodel package #194

Closed
simm13 opened this issue Aug 2, 2013 · 2 comments

Comments

@simm13

simm13 commented Aug 2, 2013

I ran into a problem when using the topicmodels package with rmr2.

The code is:
tm_mapreduce <- function(x) {
  words <- strsplit(x, ',')
  corpus <- Corpus(VectorSource(words))
  corpus <- tm_map(corpus, removeWords, stopwords("english"))
  sample.dtm <- DocumentTermMatrix(corpus, control = list(wordLengths = c(2, Inf)))
  summary(col_sums(sample.dtm))
  term_tfidf <- tapply(sample.dtm$v / row_sums(sample.dtm)[sample.dtm$i], sample.dtm$j, mean) *
    log2(nDocs(sample.dtm) / col_sums(sample.dtm > 0))
  sample.dtm <- sample.dtm[, term_tfidf >= 0.1]
  sample.dtm <- sample.dtm[row_sums(sample.dtm) > 0, ]
  k <- 3
  SEED <- 2013

  sample_TM <- list(
    VEM = LDA(sample.dtm, k = k, control = list(seed = SEED)),
    VEM_fixed = LDA(sample.dtm, k = k, control = list(estimate.alpha = FALSE, seed = SEED)),
    Gibbs = LDA(sample.dtm, k = k, method = "Gibbs", control = list(seed = SEED, burnin = 1000, thin = 100, iter = 1000)),
    CTM = CTM(sample.dtm, k = k, control = list(seed = SEED, var = list(tol = 10^-4), em = list(tol = 10^-3)))
  )

  VEM = LDA(sample.dtm, k = k, control = NULL)

  CTM = CTM(sample.dtm, k = k, control = list(seed = SEED, var = list(tol = 10^-4), em = list(tol = 10^-3)))

  Terms <- terms(sample_TM[["VEM"]], 10)

  Terms <- terms(VEM, 10)
  return(Terms)
}

keyword <- function(input, output, split = '\001') {
  mapreduce(
    input = input, output = output,
    map = function(k, v) {
      keyval(word(v, 3, sep = fixed("\001")), word(v, 5, sep = fixed("\001")))
    },
    reduce = function(k, vv) {
      d <- data.frame(k, vv)
      acc_nbr <- unique(d$k)
      term <- lapply(acc_nbr, function(x) {
        tm_mapreduce(as.character(d[which(d$k == x), ]$vv))
      })
      #keyval(1,as.list(dlply(data.frame(k,vv)$vv,~vv)))
      keyval(acc_nbr, term)
      #keyval(k,vv)
    }
  )
}

When I run it, the datanode returns:

Loading required package: slam
Loading required package: topicmodels
Loading required package: tm
Loading required package: methods
Loading required package: rmr2
Loading required package: Rcpp
Loading required package: RJSONIO
Loading required package: bitops
Loading required package: digest
Loading required package: functional
Loading required package: stringr
Loading required package: plyr
Loading required package: reshape2
Error in as(control, "CTM_VEMcontrol") :
no method or default for coercing “list” to “CTM_VEMcontrol”
Calls: ... lapply -> FUN -> tm_mapreduce -> CTM -> method -> as
Execution halted
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:137)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:479)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
log4j:WARN No appenders could be found for logger (org.apache.hadoop.hdfs.DFSClient).
log4j:WARN Please initialize the log4j system properly.

How can I solve this problem? Thanks.

@fcastanedo

It seems that this user does not have permission to read from or write to HDFS. Try changing the value of dfs.permissions in the HDFS config file to false and restarting your cluster. That is, check that you have this property

  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>

in the conf/hdfs-site.xml file.

Hope this helps.

@piccolbo
Collaborator

piccolbo commented Aug 2, 2013

Please file this in the appropriate issue tracker; this one is closed.

@piccolbo piccolbo closed this as completed Aug 2, 2013