-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add active_encode support to hydra-derivatives #161
Conversation
transcoders besides ffmpeg. Story avalonmediasystem/avalon#1785
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cjcolvar the PR looks good. I added a few comments on code style. I have done little with AWS so I am not inclined to approve or disapprove...
module Hydra::Derivatives::Processors | ||
class ActiveEncodeError < StandardError | ||
def initialize(status, source_path, errors = []) | ||
msg = "ActiveEncode status was \"#{status}\" for #{source_path}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cjcolvar Could this be I18N?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There currently isn't any i18n in this gem and I haven't found any examples of non-UI strings using i18n in hydra gems. Do you think it is still worth doing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i18nizing hydra-derivatives from the ground up feels like a separate stream of work.
end | ||
end | ||
|
||
def wait_for_encode_with_timeout(encode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this method be part of the public interface or should it and the methods below be private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Makes sense to me.
# After a timeout error, try to cancel the encoding. | ||
def cleanup_after_timeout(encode) | ||
encode.cancel! | ||
rescue => e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be safer to rescue a typed error here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point it isn't really defined what errors may be thrown. Seems like what might need to happen is ActiveEncode's adapters should catch any errors thrown and raise standard errors defined by ActiveEncode. I believe the default type for rescue is StandardError and I think that is the best that can be done right now.
end | ||
end | ||
|
||
def self.processor_options(options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this method and the ones below be public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.processor_options
is the only method here that isn't an override of a public method in Runner
. I'll move it to be private.
@cjcolvar You bet. |
24bc216
to
e68faca
Compare
@@ -74,6 +74,7 @@ Hydra::Derivatives::Processors::Video::Processor.timeout = 10.minutes | |||
Hydra::Derivatives::Processors::Document.timeout = 5.minutes | |||
Hydra::Derivatives::Processors::Audio.timeout = 10.minutes | |||
Hydra::Derivatives::Processors::Image.timeout = 5.minutes | |||
Hydra::Derivatives::Processors::ActiveEncode.timeout = 5.minutes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps a naïve question: if I have video files, how do I know whether to use the Video
processor or the ActiveEncode
processor? How would you feel about adding a short section to the README immediately above ## Configuration
that makes clear what the options are for generating derivatives from video? I.e., answering the question, "what does ActiveEncode support bring to hydra-derivatives, and why/when would I use it?"
hydra-derivatives.gemspec
Outdated
@@ -27,6 +27,7 @@ Gem::Specification.new do |spec| | |||
spec.add_dependency 'mini_magick', '>= 3.2', '< 5' | |||
spec.add_dependency 'activesupport', '>= 4.0', '< 6' | |||
spec.add_dependency 'mime-types', '> 2.0', '< 4.0' | |||
spec.add_dependency 'active_encode' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this pin to '~> 0.1'
, or put otherwise: what's the roadmap look like for AE, and might changes come along in 1.0.0 that would break hydra-derivatives?
lib/hydra/derivatives.rb
Outdated
@@ -8,9 +8,12 @@ module Derivatives | |||
extend Deprecation | |||
self.deprecation_horizon = "hydra-derivatives 1.0" | |||
|
|||
require 'active_encode' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this require
here? Other dependencies, like mini_magick
, are required where needed rather than in the Derivatives
module.
module Hydra::Derivatives::Processors | ||
class ActiveEncodeError < StandardError | ||
def initialize(status, source_path, errors = []) | ||
msg = "ActiveEncode status was \"#{status}\" for #{source_path}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i18nizing hydra-derivatives from the ground up feels like a separate stream of work.
class ActiveEncodeError < StandardError | ||
def initialize(status, source_path, errors = []) | ||
msg = "ActiveEncode status was \"#{status}\" for #{source_path}" | ||
msg = "#{msg}: #{errors.join(' ; ')}" unless errors.empty? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use if errors.any?
here instead of unless errors.empty?
?
class << self | ||
private | ||
|
||
def processor_options(options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of this method? It looks like it's taking one hash and building a new, similar one from it. I get that much, but why's it needed? (Pls add a comment above this method to make that clear.)
external_file.content = '' | ||
external_file.mime_type = "message/external-body; access-type=URL; URL=\"#{output[:url]}\"" | ||
# external_file.external_url = output[:url] | ||
external_file.original_name = URI.parse(output[:url]).path.split('/').last |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use Addressable::URI.parse
here, it would be more unicode-friendly.
module Hydra::Derivatives | ||
class RemoteSourceFile | ||
# Finds the file name of the remote source file. | ||
# @param [String, ActiveFedora::Base] String file name, or an object that has a method that will return the file name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the second String
here should be object
, the name of the param.
require 'spec_helper' | ||
|
||
describe Hydra::Derivatives::Processors::ActiveEncode do | ||
# before { # ActiveEncode::Base.engine_adapter = :test } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cruft?
end | ||
|
||
it 'processes the encoding without a timeout' do | ||
expect(processor).to receive(:wait_for_encode_with_timeout).never |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would you think of using .not_to receive
here?
@mjgiarlo I think I took care of everything that isn't documentation or comments changes. I'll come back to those later today/tomorrow. |
@cjcolvar OK, thanks! Ping me when it's 💚 and all ready to go. |
Changes Unknown when pulling ee0ad3f on active_encode_dev_con into ** on master**. |
2 similar comments
Changes Unknown when pulling ee0ad3f on active_encode_dev_con into ** on master**. |
Changes Unknown when pulling ee0ad3f on active_encode_dev_con into ** on master**. |
@mjgiarlo Ready. Any additional changes you'd like to see? |
@cjcolvar Taking one last look. Thanks for all the changes, and yr patience! |
🎉 Thanks for the contribution! Great work. (Do you want/need a new release now that this is merged?) |
The ActiveEncode runner allows using ActiveEncode within hydra-derivatives to generate derivatives using the adapters supported by ActiveEncode. Currently Amazon Elastic Transcoder is the only production ready adapter available but prototypes exist for zencoder and shingoncoder (FFmpeg through ActiveJob). The key differences between ActiveEncode and hydra-derivatives are:
In support of this a source file service and output file service have been added. The
RemoteServiceFile
takes the:source
option and calls a method of that name on the source object to determine the source file path or url to send to the processor. ThePersistExternalFileOutputService
creates anActiveFedora::File
with a url to the derivative output instead of storing the actual bit contents of the derivative in Fedora.Documentation is added for configuring and using the ActiveEncode runner with Amazon Elastic Transcoder which seems like the main use case for users at this point.
Thanks to @val99erie for a lot of this work.