Verifying a bag throws an exception: Unable to create new native threads #121

Open
chinuhub opened this issue Jul 11, 2018 · 5 comments · May be fixed by #125

chinuhub commented Jul 11, 2018

When submitting an issue please include:

  • Bagit library version 5.1.1
  • MacOS version 10.12.6
  • If available Attach all logs, and or output, and or screenshots

Given

  • I have a bag of size 5.8 GB. Number of data files is 28730.

When

  • I run bag.verify() method on this bag from java (JDK 1.8)

Then

  • After some time verify throws an exception saying
    Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:714)
    at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:950)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1368)
    at gov.loc.repository.bagit.verify.PayloadVerifier.checkAllFilesListedInManifestExist(PayloadVerifier.java:146)
    at gov.loc.repository.bagit.verify.PayloadVerifier.verifyPayload(PayloadVerifier.java:103)

The exception is thrown in the method "checkAllFilesListedInManifestExist(Set files)" in PayloadVerifier.java, at the line
this.executor.execute(new CheckIfFileExistsTask(file, missingFiles, latch));
when a new task is about to be executed on the executor.

When I checked the thread creation limit on the Mac, it was 709.
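
To illustrate why a limit of roughly 709 threads can be reached, here is a minimal, self-contained sketch (not the library's code) that submits one short blocking task per "file" to a cached pool and to a fixed pool, then reports the peak thread count for each. The class name `PoolGrowthDemo`, the helper `peakThreads`, the 50 ms sleep, and the task count of 64 are all assumptions made for the demonstration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it does not use or reproduce any bagit-java code.
public class PoolGrowthDemo {

    // Submits one short blocking task per "file", waits for all of them,
    // and returns the largest number of threads the pool held at once.
    static int peakThreads(ThreadPoolExecutor pool, int taskCount) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(taskCount);
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(() -> {
                    try {
                        Thread.sleep(50); // stand-in for a slow file system check
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        latch.countDown();
                    }
                });
            }
            latch.await(30, TimeUnit.SECONDS);
            return pool.getLargestPoolSize();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int tasks = 64; // kept small on purpose; the bag in this issue has 28730 files
        ThreadPoolExecutor cached = (ThreadPoolExecutor) Executors.newCachedThreadPool();
        ThreadPoolExecutor fixed = (ThreadPoolExecutor) Executors.newFixedThreadPool(4);
        System.out.println("cached pool peak threads: " + peakThreads(cached, tasks));
        System.out.println("fixed pool peak threads:  " + peakThreads(fixed, tasks));
    }
}
```

A cached pool starts a new thread whenever no idle thread is available, so its peak grows with the number of tasks in flight, while a fixed pool queues the extra work instead. With tens of thousands of files, that growth would plausibly run into the per-process thread limit.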

@chinuhub
Author

When instantiating BagVerifier I am now using Executors.newSingleThreadExecutor(). It fixed the issue.

@jscancella
Contributor

jscancella commented Jul 17, 2018

For a large bag, you may want to use Executors.newFixedThreadPool() and specify how many threads to use rather than a single thread, since multi-threading will be much faster (as long as you aren't hitting IO problems).

@acdha does Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors()) seem reasonable for being the default instead of Executors.newCachedThreadPool()?
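
As a rough sketch of what that default could look like in practice, the following self-contained example runs a per-file existence check on a pool sized to the CPU count. It only mirrors the general pattern visible in the stack trace (one task per file, a shared set of missing files, a latch); the class `BoundedExistenceCheck` and the method `findMissing` are hypothetical and are not the library's implementation.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Collection;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration; not taken from PayloadVerifier.
public class BoundedExistenceCheck {

    // Runs one existence check per file on the given executor and
    // returns the paths that were not found.
    static Set<Path> findMissing(Collection<Path> files, ExecutorService executor)
            throws InterruptedException {
        Set<Path> missing = ConcurrentHashMap.newKeySet(); // thread-safe set
        CountDownLatch latch = new CountDownLatch(files.size());
        for (Path file : files) {
            executor.execute(() -> {
                try {
                    if (!Files.exists(file)) {
                        missing.add(file);
                    }
                } finally {
                    latch.countDown(); // always count down, even if the check throws
                }
            });
        }
        latch.await(); // block until every file has been checked
        return missing;
    }

    public static void main(String[] args) throws InterruptedException {
        // Bounded pool: at most one worker thread per available processor.
        ExecutorService executor =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            Set<Path> missing = findMissing(
                    Arrays.asList(Paths.get("."), Paths.get("no-such-file.bin")), executor);
            System.out.println("missing: " + missing);
        } finally {
            executor.shutdown();
        }
    }
}
```

Because the executor is passed in rather than created inside the method, a caller can still swap in a single-thread or differently sized pool for unusual cases.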

@acdha
Member

acdha commented Jul 17, 2018

That's exactly what I was wondering — as long as someone can override it for unusual cases, the CPU count seems like a reasonable default.

@jscancella
Contributor

Yup, they are able to because a different user asked to be able to finely tune that threadpool. That's why there are 4 different constructors on that class, to be able to override various parts of it or keep the defaults.

@acdha
Member

acdha commented Jul 17, 2018

Yeah, I figure there are a few cases where someone would need to change that value but it really does seem like most of them would be edge cases.

@jscancella jscancella linked a pull request Jul 22, 2018 that will close this issue
rvanheest pushed a commit to DANS-KNAW/dans-bagit-lib that referenced this issue Feb 20, 2019