How can I run multiple PDF extraction processes concurrently? #53
Comments
We are using a |
@knowtheory & @Natim - I'm trying to do the same thing as Quyen, but I'm having some trouble figuring out Circus. Would you happen to know of any tutorial covering the set-up for using Circus to run multiple processes? Thanks in advance.
@antonlakin Have a look at these projects: https://github.com/novagile/insight-reloaded and https://github.com/novagile/insight-installer. There are some configuration examples here: https://github.com/novagile/insight-installer/tree/master/chef/cookbooks/insight/templates/default
Just for some additional details, DocumentCloud uses CloudCrowd for distributed queuing of jobs which use DocSplit. You can check out the actions we've written, and in particular note the document_import action.
I'd like to be able to extract PDFs concurrently, but it does not seem to be possible with the docsplit gem.
When I tried to convert 2 ppt files to PDF in parallel threads, the gem failed to process them.
The code is below; please replace path_to_docsplit.rb, path_to_test_file1.ppt, and path_to_test_file2.ppt with real paths.
I'm looking forward to your answer.
Thank you,
Quyen
#!/usr/bin/ruby
require 'path_to_docsplit.rb'

# Convert a single document to PDF with Docsplit.
def extraction(path_to_file)
  Docsplit.extract_pdf(path_to_file)
end

puts('start extraction')
# Start both conversions in parallel threads, then wait for each to finish.
t1 = Thread.new { extraction('path_to_test_file1.ppt') }
t2 = Thread.new { extraction('path_to_test_file2.ppt') }
t1.join
t2.join
puts('end extraction')
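For comparison with the threaded script above, here is a minimal sketch of the process-based approach the comments point towards: each conversion runs in its own OS process instead of a thread. `run_concurrently` is a hypothetical helper, not part of Docsplit. The commented usage assumes the `docsplit` command-line tool is installed and on the PATH. Nothing in this thread confirms that separate processes avoid the failure, so treat it as something to try rather than a known fix.

```ruby
# Hypothetical helper (not part of Docsplit): run each command in its own
# OS process and wait for all of them to finish.
# `commands` is a list of argv arrays, e.g. [['docsplit', 'pdf', 'a.ppt'], ...].
# Returns a hash mapping each argv array to true (exit status 0) or false.
def run_concurrently(commands)
  # Start every process first so that they all run at the same time.
  started = commands.map { |argv| [argv, Process.spawn(*argv)] }

  # Then collect the exit status of each child process.
  started.each_with_object({}) do |(argv, pid), results|
    _, status = Process.wait2(pid)
    results[argv] = status.success?
  end
end

# Example usage (assumes the `docsplit` executable is available):
#
#   files   = ['path_to_test_file1.ppt', 'path_to_test_file2.ppt']
#   results = run_concurrently(files.map { |f| ['docsplit', 'pdf', f] })
#   results.each { |argv, ok| puts "#{argv.last}: #{ok ? 'ok' : 'failed'}" }
```

`Process.spawn` raises `Errno::ENOENT` if the command cannot be found, so a missing `docsplit` binary shows up immediately rather than as a silent failure.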