Parallel execution fails on Windows when TmParallelApply called from inside a function #21

Closed
tlutz1 opened this issue Apr 29, 2016 · 3 comments

Comments


@tlutz1 tlutz1 commented Apr 29, 2016

I think this is related to the environment in which TmParallelApply looks up variables. If an object with the right name (stopword_vec) exists in my workspace, the error does not occur. Examples below.

Fails, because nothing in my workspace is named stopword_vec:

library(textmineR)  # provides CreateDtm and nih_sample
rm(list = ls())
data(nih_sample)
dtm <- CreateDtm(nih_sample$ABSTRACT_TEXT,
                 doc_names = nih_sample$APPLICATION_ID,
                 ngram_window = c(1, 2))

Fails for the same reason, even when stopword_vec is passed explicitly:

library(textmineR)  # provides CreateDtm and nih_sample
rm(list = ls())
data(nih_sample)
dtm <- CreateDtm(nih_sample$ABSTRACT_TEXT,
                 stopword_vec = c("blah"),
                 doc_names = nih_sample$APPLICATION_ID,
                 ngram_window = c(1, 2))

Does not fail, even though the stopword_vec in my workspace is not the one passed to the function:

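library(textmineR)  # provides CreateDtm and nih_sample
rm(list = ls())
data(nih_sample)
stopword_vec <- "blah"  # unrelated object that happens to share the name
dtm <- CreateDtm(nih_sample$ABSTRACT_TEXT,
                 doc_names = nih_sample$APPLICATION_ID,
                 ngram_window = c(1, 2))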

It looks like the source might be parallel::clusterExport, which looks up variables in .GlobalEnv by default.
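
A minimal sketch of the suspected mechanism, using only the base parallel package rather than textmineR itself. clusterExport resolves the names in varlist in the environment given by its envir argument, so a variable that exists only inside the calling function is not found when the lookup starts from .GlobalEnv, unless a same-named object happens to sit in the workspace. The helper export_from_function and the variable local_words are hypothetical names used only for illustration, and the sketch assumes no object called local_words exists in the workspace.

```r
library(parallel)

# Hypothetical helper: tries to export a function-local variable to workers.
export_from_function <- function(use_local_env) {
  cl <- makeCluster(2)            # socket (PSOCK) cluster
  on.exit(stopCluster(cl))
  local_words <- c("blah")        # exists only inside this function

  # Choose where clusterExport should look up "local_words".
  search_env <- if (use_local_env) environment() else .GlobalEnv

  tryCatch({
    clusterExport(cl, "local_words", envir = search_env)
    clusterEvalQ(cl, local_words) # confirm the workers can see it
    "exported"
  }, error = function(e) "failed")
}

res_global <- export_from_function(FALSE)  # lookup starts in .GlobalEnv
res_local  <- export_from_function(TRUE)   # lookup starts in the function frame
```

Under those assumptions the first call cannot find the variable and the second can, which matches the pattern in the three examples above.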

Owner

@TommyJones TommyJones commented Apr 29, 2016

Eek. Ok. Looking into this.

Owner

@TommyJones TommyJones commented Apr 29, 2016

Think I got it. I added an option to declare a default search environment to TmParallelApply. Testing now. Will close the issue if it passes all tests.
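
A rough sketch of what a search-environment option could look like, written against the base parallel package. This is not the actual TmParallelApply source: parallel_apply_sketch, strip_words, and remove_stopwords are hypothetical names, and defaulting envir to parent.frame() is an assumption about how such an option might be wired up.

```r
library(parallel)

# Hypothetical wrapper: `envir` says where exported names are looked up.
# Defaulting to parent.frame() means "the function that called me".
parallel_apply_sketch <- function(X, FUN, export = NULL, cpus = 2,
                                  envir = parent.frame()) {
  force(envir)                    # resolve the caller's frame up front
  cl <- makeCluster(cpus)
  on.exit(stopCluster(cl))
  if (!is.null(export)) {
    clusterExport(cl, varlist = export, envir = envir)
  }
  parLapply(cl, X, FUN)
}

# Worker function defined at top level; it relies on `stopword_vec`
# having been exported to each worker's global environment.
strip_words <- function(d) {
  words <- strsplit(d, " ", fixed = TRUE)[[1]]
  paste(words[!words %in% stopword_vec], collapse = " ")
}

# Hypothetical caller: `stopword_vec` exists only as a function argument.
remove_stopwords <- function(docs, stopword_vec) {
  unlist(parallel_apply_sketch(docs, strip_words, export = "stopword_vec"))
}

cleaned <- remove_stopwords(c("blah one", "two blah three"),
                            stopword_vec = "blah")
```

With the environment passed through explicitly, the export no longer depends on what happens to be in the workspace.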

Owner

@TommyJones TommyJones commented Apr 29, 2016

Tested. Please open a new issue if it crops up again elsewhere.

@TommyJones TommyJones closed this Apr 29, 2016