PDF ingest upload Actor/Job implementation
Fixes #33.

* Add an independent NewspaperIssueIngest method that creates child
NewspaperPage works with TIFF masters. It is meant to be called by the job
triggered on Hyrax upload, since Hyrax already attaches the PDF to the issue.

* NewspaperIssueIngest saves the issue only once, without validation, after
all child pages have been added.

* Add CreateIssuePagesJob and the NewspaperWorksUploadActor that calls it;
the actor is inserted into the actor stack before Hyrax's stock
CreateWithFilesActor.

* Admin set of NewspaperIssue work, or default admin set, saved to created
child NewspaperPage objects on PDF upload.

* Factories for upload (pdf), ability, user.

* Add faraday as a testing/development dependency, useful for checking
Fedora in RSpec tests.

* Use :async not :sidekiq as presumed default in tests when disabling
inline after temporary use.

* Get Travis CI to install FITS.
seanupton authored and ebenenglish committed Jun 25, 2018
1 parent 017462d commit 998498e
Showing 14 changed files with 189 additions and 14 deletions.
4 changes: 4 additions & 0 deletions .travis.yml
@@ -20,6 +20,10 @@ before_install:
- gem update --system
- gem install bundler
- google-chrome-stable --headless --disable-gpu --no-sandbox --remote-debugging-port=9222 http://localhost &
- sudo wget -P /opt/install https://brussels.lib.utah.edu/FITS/fits-1.3.0.zip
- sudo unzip /opt/install/fits-1.3.0.zip -d /opt/install/fits-1.3.0
- sudo chmod +x /opt/install/fits-1.3.0/*.sh
- sudo ln -s /opt/install/fits-1.3.0/fits.sh /usr/local/bin/fits.sh

rvm:
- 2.5.0
4 changes: 4 additions & 0 deletions Gemfile
@@ -23,7 +23,9 @@ else
gem 'rails', github: 'rails/rails'
ENV['ENGINE_CART_RAILS_OPTIONS'] = '--edge --skip-turbolinks'
else
# rubocop:disable Bundler/DuplicatedGem
gem 'rails', ENV['RAILS_VERSION']
# rubocop:enable Bundler/DuplicatedGem
end
end

@@ -33,7 +35,9 @@ else
gem 'responders', '~> 2.0'
gem 'sass-rails', '>= 5.0'
when /^4.[01]/
# rubocop:disable Bundler/DuplicatedGem
gem 'sass-rails', '< 5.0'
# rubocop:enable Bundler/DuplicatedGem
end
end
# END ENGINE_CART BLOCK
57 changes: 57 additions & 0 deletions app/actors/newspaper_works/actors/newspaper_works_upload_actor.rb
@@ -0,0 +1,57 @@
module NewspaperWorks
module Actors
class NewspaperWorksUploadActor < Hyrax::Actors::BaseActor
def create(env)
# If NewspaperIssue, we might have a PDF to split...
handle_issue_upload(env) if env.curation_concern.class == NewspaperIssue
# pass to next actor
next_actor.create(env)
end

def update(env)
handle_issue_upload(env) if env.curation_concern.class == NewspaperIssue
# pass to next actor
next_actor.update(env)
end

def default_admin_set
AdminSet.find_or_create_default_admin_set_id
end

def queue_job(work, paths, user, admin_set_id)
NewspaperWorks::CreateIssuePagesJob.perform_later(
work,
paths,
user,
admin_set_id
)
end

def handle_issue_upload(env)
return unless env.attributes.keys.include? 'uploaded_files'
upload_ids = filter_file_ids(env.attributes['uploaded_files'])
return if upload_ids.empty?
uploads = Hyrax::UploadedFile.find(upload_ids)
paths = uploads.map(&method(:upload_path))
paths = paths.select { |path| path.end_with?('.pdf') }
return if paths.empty?
work = env.curation_concern
# must persist work to serialize job using it
work.save!(validate: false)
user = env.current_ability.current_user.user_key
env.attributes[:admin_set_id] ||= default_admin_set
queue_job(work, paths, user, env.attributes[:admin_set_id])
end

# Given Hyrax::Upload object, return path to file on local filesystem
def upload_path(upload)
# so many layers to this onion:
upload.file.file.file
end

def filter_file_ids(input)
Array.wrap(input).select(&:present?)
end
end
end
end
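The PDF filter in `handle_issue_upload` matches on a literal `.pdf` suffix, so an upload named `ISSUE.PDF` would not trigger page creation. The following is a minimal sketch of a case-insensitive variant in plain Ruby. `pdf_paths` is a hypothetical helper, not part of this commit, and it assumes each path is a String (or responds to `to_s`).

```ruby
# Hypothetical helper (not in this commit): keep only paths whose file
# extension is .pdf, ignoring case.
def pdf_paths(paths)
  paths.select { |path| File.extname(path.to_s).casecmp?('.pdf') }
end

paths = ['/tmp/issue.pdf', '/tmp/SCAN.PDF', '/tmp/page1.tiff', '/tmp/notes']
puts pdf_paths(paths).inspect
```

Whether mixed-case extensions should be accepted is a design decision for the project; the sketch only shows one way the check could be loosened.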
17 changes: 17 additions & 0 deletions app/jobs/newspaper_works/create_issue_pages_job.rb
@@ -0,0 +1,17 @@
module NewspaperWorks
# Create child page works for issue
class CreateIssuePagesJob < NewspaperWorks::ApplicationJob
def perform(work, pdf_paths, user, admin_set_id)
# we will need depositor set on work, if it is nil
work.depositor ||= user
# if we do not have admin_set_id yet, set it on the issue work:
work.admin_set_id ||= admin_set_id
# create child pages for each page within each PDF uploaded:
pdf_paths.each do |path|
adapter = NewspaperWorks::Ingest::NewspaperIssueIngest.new(work)
adapter.load(path)
adapter.create_child_pages
end
end
end
end
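The job uses `||=` so that the depositor and admin set passed in by the actor act only as fallbacks. A small sketch of that behavior, using a `Struct` as a stand-in for the issue work (`FakeWork` and `apply_defaults` are illustrative names, not part of the gem):

```ruby
# Stand-in for the issue work; the real object is a NewspaperIssue.
FakeWork = Struct.new(:depositor, :admin_set_id)

# Mirrors the two assignments at the top of CreateIssuePagesJob#perform:
# values are filled in only when the attribute is nil (or false).
def apply_defaults(work, user, admin_set_id)
  work.depositor ||= user
  work.admin_set_id ||= admin_set_id
  work
end

fresh = apply_defaults(FakeWork.new(nil, nil), 'skroob', 'admin_set/default')
preset = apply_defaults(FakeWork.new('lonestar', 'custom_set'), 'skroob', 'admin_set/default')
puts fresh.to_a.inspect
puts preset.to_a.inspect
```

Values already present on the work are left alone, which is why the job can safely run against an issue that was fully populated before the upload.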
6 changes: 6 additions & 0 deletions lib/newspaper_works/engine.rb
@@ -8,5 +8,11 @@ module NewspaperWorks
# Engine Class
class Engine < ::Rails::Engine
isolate_namespace NewspaperWorks

config.to_prepare do
# Register actor to handle any NewspaperWorks upload behaviors before
# CreateWithFilesActor gets to them:
Hyrax::CurationConcern.actor_factory.insert_before Hyrax::Actors::CreateWithFilesActor, NewspaperWorks::Actors::NewspaperWorksUploadActor
end
end
end
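The `insert_before` call above controls ordering: the new actor must see the upload attributes before `CreateWithFilesActor` does. The sketch below illustrates only that ordering idea with a plain-Ruby stand-in; `SketchStack` and the symbol names are assumptions for illustration and are not the Hyrax or Rails implementation.

```ruby
# Plain-Ruby stand-in for an ordered actor stack (illustration only).
class SketchStack
  def initialize(*actors)
    @actors = actors
  end

  # Place new_actor immediately ahead of existing in call order.
  def insert_before(existing, new_actor)
    index = @actors.index(existing)
    raise ArgumentError, "#{existing} is not in the stack" if index.nil?
    @actors.insert(index, new_actor)
  end

  # Return a copy so callers cannot mutate the stack's internal order.
  def to_a
    @actors.dup
  end
end

stack = SketchStack.new(:create_with_files_actor, :model_actor)
stack.insert_before(:create_with_files_actor, :newspaper_works_upload_actor)
puts stack.to_a.inspect
```

Registering inside `config.to_prepare` means the insertion is re-applied whenever Rails reloads application code in development.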
21 changes: 16 additions & 5 deletions lib/newspaper_works/ingest/newspaper_issue_ingest.rb
@@ -5,20 +5,31 @@ def import
# first, handle the PDF itself on the issue...
super
# ...then create child works from split pages
pages = NewspaperWorks::Ingest::PdfPages.new(path)
create_child_pages
end

# Creates child pages with attached TIFF masters, can be called by
# `import`, or independently if `load` is called first. The
# latter is appropriate if framework is already handling the
# NewspaperIssue file attachment (e.g. Hyrax upload via browser).
def create_child_pages
pages = NewspaperWorks::Ingest::PdfPages.new(path).to_a
pages.each_with_index do |tiffpath, idx|
new_child_page_with_file(tiffpath, idx)
page = new_child_page_with_file(tiffpath, idx)
@work.members.push(page)
end
@work.save!(validate: false) unless pages.empty?
end

def new_child_page_with_file(tiffpath, idx)
page = NewspaperPage.new
page.title = [format("Page %<pagenum>i", pagenum: idx + 1)]
# Set depositor and admin-set id:
page.depositor = @work.depositor
page.save!
@work.members.push(page)
@work.save!
page.admin_set_id = @work.admin_set_id
NewspaperPageIngest.new(page).ingest(tiffpath)
page.save!
page
end
end
end
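The change above moves the issue save out of the per-page method so the issue is persisted once per PDF rather than once per page. A minimal sketch of that pattern, using the same title format string; `CountingWork`, `page_title`, and `attach_pages` are hypothetical stand-ins, and plain strings take the place of NewspaperPage objects:

```ruby
# Stand-in for the issue work that counts how often it is saved.
class CountingWork
  attr_reader :members, :save_count

  def initialize
    @members = []
    @save_count = 0
  end

  def save!
    @save_count += 1
  end
end

# Same format string as new_child_page_with_file: zero-based index in,
# one-based page label out.
def page_title(idx)
  format('Page %<pagenum>i', pagenum: idx + 1)
end

# Push every page first, then save the parent a single time.
def attach_pages(work, tiff_paths)
  tiff_paths.each_with_index do |_tiffpath, idx|
    work.members.push(page_title(idx))
  end
  work.save! unless tiff_paths.empty?
  work
end

work = attach_pages(CountingWork.new, ['p1.tiff', 'p2.tiff', 'p3.tiff'])
puts work.members.inspect
puts work.save_count
```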
2 changes: 1 addition & 1 deletion newspaper_works.gemspec
@@ -28,12 +28,12 @@ SUMMARY
spec.add_development_dependency 'bixby'
spec.add_development_dependency 'engine_cart', '~> 2.0'
spec.add_development_dependency "factory_bot", '~> 4.4'
spec.add_development_dependency "faraday"
spec.add_development_dependency 'fcrepo_wrapper', '~> 0.1'
spec.add_development_dependency 'rspec'
spec.add_development_dependency 'rspec-rails', '~> 3.1'
spec.add_development_dependency 'solr_wrapper', '~> 0.4'
spec.add_development_dependency 'sqlite3'

spec.add_dependency 'simple_form', '~> 3.2', '<= 3.5.0'

end
@@ -0,0 +1,52 @@
require 'faraday'
require 'spec_helper'

RSpec.describe NewspaperWorks::Actors::NewspaperWorksUploadActor do
let(:issue) { build(:newspaper_issue) }
let(:ability) { build(:ability) }
let(:uploaded_pdf_file) { create(:uploaded_pdf_file) }
let(:uploaded_file_ids) { [uploaded_pdf_file.id] }
let(:attributes) { { uploaded_files: uploaded_file_ids } }
let(:terminator) { Hyrax::Actors::Terminator.new }
let(:env) { Hyrax::Actors::Environment.new(issue, ability, attributes) }
let(:middleware) do
stack = ActionDispatch::MiddlewareStack.new.tap do |middleware|
middleware.use described_class
end
stack.build(terminator)
end

let(:uploaded_issue) do
Rails.application.config.active_job.queue_adapter = :inline
middleware.public_send(:create, env)
Rails.application.config.active_job.queue_adapter = :async
# return work, reloaded, because env.curation_concern will be stale after
# running actor.
NewspaperIssue.find(env.curation_concern.id)
end

describe "NewspaperIssue upload of PDF" do
# we over-burden one example, because sadly RSpec does not do well with
# shared state across examples (without use of `before(:all)` which is
# mutually exclusive with `let` in practice, and ruffles rubocop's
# overzealous sense of moral duty, speaking of which:
# rubocop:disable RSpec/ExampleLength
it "correctly creates child pages for issue" do
pages = uploaded_issue.members.select { |w| w.class == NewspaperPage }
expect(pages.size).to eq 2
pages.each do |page|
# Page needs correct admin set:
expect(page.admin_set_id).to eq 'admin_set/default'
file_sets = page.members.select { |v| v.class == FileSet }
expect(file_sets.size).to eq 1
files = file_sets[0].files
url = files[0].uri.to_s
# fetch the thing from Fedora Commons:
response = Faraday.get(url)
stored_size = response.body.length
expect(stored_size).to be > 0
end
end
# rubocop:enable RSpec/ExampleLength
end
end
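The spec switches the queue adapter to `:inline` and then sets it to `:async` afterwards; if the actor call raised, the second assignment would not run. One possible alternative, sketched here under stated assumptions, wraps the toggle in a helper that restores the previous value with `ensure`. `SketchConfig` is a hypothetical stand-in for `Rails.application.config.active_job`, and `with_queue_adapter` is not an existing helper in this gem.

```ruby
# Hypothetical settings holder standing in for
# Rails.application.config.active_job.
SketchConfig = Struct.new(:queue_adapter)

# Run the block with a temporary adapter, restoring the previous value
# even if the block raises. Returns the block's result.
def with_queue_adapter(config, adapter)
  previous = config.queue_adapter
  config.queue_adapter = adapter
  yield
ensure
  config.queue_adapter = previous
end

config = SketchConfig.new(:async)
seen = with_queue_adapter(config, :inline) { config.queue_adapter }
puts seen.inspect
puts config.queue_adapter.inspect
```

Restoring the previous value, rather than a hard-coded one, also avoids having to guess the default adapter.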
6 changes: 6 additions & 0 deletions spec/factories/ability.rb
@@ -0,0 +1,6 @@
FactoryBot.define do
factory :ability do
user
initialize_with { new(user) }
end
end
9 changes: 9 additions & 0 deletions spec/factories/uploaded_pdf_file.rb
@@ -0,0 +1,9 @@
FactoryBot.define do
factory :uploaded_pdf_file, class: Hyrax::UploadedFile do
initialize_with do
base = File.join(NewspaperWorks::GEM_PATH, 'spec', 'fixtures', 'files')
pdf_path = File.join(base, 'sample-color-newsletter.pdf')
new(file: File.open(pdf_path), user: create(:user))
end
end
end
13 changes: 13 additions & 0 deletions spec/factories/user.rb
@@ -0,0 +1,13 @@
FactoryBot.define do
factory :user do
id "skroob"
email "spaceballs@example.com"
password "password_is_12345"
initialize_with do
User.find_or_create_by(id: id) do |user|
user.email = email
user.password = password
end
end
end
end
@@ -16,7 +16,7 @@
adapter = build(:newspaper_page_ingest)
# Rails.application.config.active_job.queue_adapter = :inline
adapter.ingest(path)
# Rails.application.config.active_job.queue_adapter = :sidekiq
# Rails.application.config.active_job.queue_adapter = :async
file_sets = adapter.work.members.select { |w| w.class == FileSet }
expect(file_sets[0].title).to contain_exactly 'page1.tiff'
expect(file_sets.size).to eq 1
8 changes: 2 additions & 6 deletions spec/spec_helper.rb
@@ -7,15 +7,11 @@
EngineCart.load_application!

RSpec.configure do |config|

# enable FactoryBot:
require 'factory_bot'
config.include FactoryBot::Syntax::Methods
# require to load specific factories:
require 'factories/newspaper_issue'
require 'factories/newspaper_issue_ingest'
require 'factories/newspaper_page'
require 'factories/newspaper_page_ingest'
# auto-detect and load all factories in spec/factories:
FactoryBot.find_definitions

# require shared examples
require 'lib/newspaper_works/ingest/ingest_shared'
@@ -4,7 +4,7 @@
module NewspaperWorks
class InstallGeneratorTest < Rails::Generators::TestCase
tests InstallGenerator
destination Rails.root.join('tmp/generators')
destination Rails.root.join('tmp', 'generators')
setup :prepare_destination

# test "generator runs without errors" do