Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add active_encode support to hydra-derivatives #161

Merged
merged 8 commits into from
Jun 9, 2017

Conversation

cjcolvar
Copy link
Member

The ActiveEncode runner allows using ActiveEncode within hydra-derivatives to generate derivatives using the adapters supported by ActiveEncode. Currently Amazon Elastic Transcoder is the only production ready adapter available but prototypes exist for zencoder and shingoncoder (FFmpeg through ActiveJob). The key differences between ActiveEncode and hydra-derivatives are:

  • ActiveEncode can produce multiple derivatives of potentially different formats (video derivatives, thumbnail images) from a single invocation
  • ActiveEncode does not necessarily operate on local files
  • ActiveEncode returns a url to the derivatives it creates and not a bitstream
  • ActiveEncode allows for polling to get updates on its progress

In support of this a source file service and output file service have been added. The RemoteServiceFile takes the :source option and calls a method of that name on the source object to determine the source file path or url to send to the processor. The PersistExternalFileOutputService creates an ActiveFedora::File with a url to the derivative output instead of storing the actual bit contents of the derivative in Fedora.

Documentation is added for configuring and using the ActiveEncode runner with Amazon Elastic Transcoder which seems like the main use case for users at this point.

Thanks to @val99erie for a lot of this work.

Copy link
Contributor

@carolyncole carolyncole left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjcolvar the PR looks good. I added a few comments on code style. I have done little with AWS so I am not inclined to approve or disapprove...

module Hydra::Derivatives::Processors
class ActiveEncodeError < StandardError
def initialize(status, source_path, errors = [])
msg = "ActiveEncode status was \"#{status}\" for #{source_path}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjcolvar Could this be I18N?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There currently isn't any i18n in this gem and I haven't found any examples of non-UI strings using i18n in hydra gems. Do you think it is still worth doing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i18nizing hydra-derivatives from the ground up feels like a separate stream of work.

end
end

def wait_for_encode_with_timeout(encode)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this method be part of the public interface or should it and the methods below be private?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Makes sense to me.

# After a timeout error, try to cancel the encoding.
def cleanup_after_timeout(encode)
encode.cancel!
rescue => e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be safer to rescue a typed error here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point it isn't really defined what errors may be thrown. Seems like what might need to happen is ActiveEncode's adapters should catch any errors thrown and raise standard errors defined by ActiveEncode. I believe the default type for rescue is StandardError and I think that is the best that can be done right now.

end
end

def self.processor_options(options)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this method and the ones below be public?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.processor_options is the only method here that isn't an override of a public method in Runner. I'll move it to be private.

@cjcolvar
Copy link
Member Author

@Cam156 Thanks for your comments. I think I responded or handled them in the last commit I pushed.
@mjgiarlo Do you mind reviewing as well since you looked over things while they were WIP?

@mjgiarlo
Copy link
Member

@cjcolvar You bet.

@@ -74,6 +74,7 @@ Hydra::Derivatives::Processors::Video::Processor.timeout = 10.minutes
Hydra::Derivatives::Processors::Document.timeout = 5.minutes
Hydra::Derivatives::Processors::Audio.timeout = 10.minutes
Hydra::Derivatives::Processors::Image.timeout = 5.minutes
Hydra::Derivatives::Processors::ActiveEncode.timeout = 5.minutes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a naïve question: if I have video files, how do I know whether to use the Video processor or the ActiveEncode processor? How would you feel about adding a short section to the README immediately above ## Configuration that makes clear what the options are for generating derivatives from video? I.e., answering the question, "what does ActiveEncode support bring to hydra-derivatives, and why/when would I use it?"

@@ -27,6 +27,7 @@ Gem::Specification.new do |spec|
spec.add_dependency 'mini_magick', '>= 3.2', '< 5'
spec.add_dependency 'activesupport', '>= 4.0', '< 6'
spec.add_dependency 'mime-types', '> 2.0', '< 4.0'
spec.add_dependency 'active_encode'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this pin to '~> 0.1', or put otherwise: what's the roadmap look like for AE, and might changes come along in 1.0.0 that would break hydra-derivatives?

@@ -8,9 +8,12 @@ module Derivatives
extend Deprecation
self.deprecation_horizon = "hydra-derivatives 1.0"

require 'active_encode'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this require here? Other dependencies, like mini_magick, are required where needed rather than in the Derivatives module.

module Hydra::Derivatives::Processors
class ActiveEncodeError < StandardError
def initialize(status, source_path, errors = [])
msg = "ActiveEncode status was \"#{status}\" for #{source_path}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i18nizing hydra-derivatives from the ground up feels like a separate stream of work.

class ActiveEncodeError < StandardError
def initialize(status, source_path, errors = [])
msg = "ActiveEncode status was \"#{status}\" for #{source_path}"
msg = "#{msg}: #{errors.join(' ; ')}" unless errors.empty?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use if errors.any? here instead of unless errors.empty??

class << self
private

def processor_options(options)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this method? It looks like it's taking one hash and building a new, similar one from it. I get that much, but why's it needed? (Pls add a comment above this method to make that clear.)

external_file.content = ''
external_file.mime_type = "message/external-body; access-type=URL; URL=\"#{output[:url]}\""
# external_file.external_url = output[:url]
external_file.original_name = URI.parse(output[:url]).path.split('/').last
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use Addressable::URI.parse here, it would be more unicode-friendly.

module Hydra::Derivatives
class RemoteSourceFile
# Finds the file name of the remote source file.
# @param [String, ActiveFedora::Base] String file name, or an object that has a method that will return the file name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the second String here should be object, the name of the param.

require 'spec_helper'

describe Hydra::Derivatives::Processors::ActiveEncode do
# before { # ActiveEncode::Base.engine_adapter = :test }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cruft?

end

it 'processes the encoding without a timeout' do
expect(processor).to receive(:wait_for_encode_with_timeout).never
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would you think of using .not_to receive here?

@cjcolvar
Copy link
Member Author

@mjgiarlo I think I took care of everything that isn't documentation or comments changes. I'll come back to those later today/tomorrow.

@mjgiarlo
Copy link
Member

@cjcolvar OK, thanks! Ping me when it's 💚 and all ready to go.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling ee0ad3f on active_encode_dev_con into ** on master**.

2 similar comments
@coveralls
Copy link

Coverage Status

Changes Unknown when pulling ee0ad3f on active_encode_dev_con into ** on master**.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling ee0ad3f on active_encode_dev_con into ** on master**.

@cjcolvar
Copy link
Member Author

cjcolvar commented Jun 9, 2017

@mjgiarlo Ready. Any additional changes you'd like to see?

@mjgiarlo
Copy link
Member

mjgiarlo commented Jun 9, 2017

@cjcolvar Taking one last look. Thanks for all the changes, and yr patience!

@mjgiarlo mjgiarlo merged commit 6e68c3d into master Jun 9, 2017
@mjgiarlo mjgiarlo deleted the active_encode_dev_con branch June 9, 2017 15:16
@mjgiarlo
Copy link
Member

mjgiarlo commented Jun 9, 2017

🎉 Thanks for the contribution! Great work. (Do you want/need a new release now that this is merged?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants