Merge 7d3ab98 into ddbca12

cjcolvar committed May 17, 2017
2 parents ddbca12 + 7d3ab98 commit ccd8459f4a3d041a7255822d249dbdc4bb80f247
@@ -74,6 +74,7 @@ Hydra::Derivatives::Processors::Video::Processor.timeout = 10.minutes
Hydra::Derivatives::Processors::Document.timeout = 5.minutes
Hydra::Derivatives::Processors::Audio.timeout = 10.minutes
Hydra::Derivatives::Processors::Image.timeout = 5.minutes
Hydra::Derivatives::Processors::ActiveEncode.timeout = 5.minutes
```

@@ -88,6 +89,28 @@ Hydra::Derivatives::Processors::Video::Processor.config.mkv.codec = '-vcodec ffv
Hydra::Derivatives::Processors::Video::Processor.config.jpeg.codec = '-vcodec mjpeg'
```

### Configuration for Audio/Video Processing with ActiveEncode

```ruby
# Set the transcoding engine
ActiveEncode::Base.engine_adapter = :elastic_transcoder
# Sleep time (in seconds) to poll for status of encoding job
Hydra::Derivatives.active_encode_poll_time = 10
# If you want to use a different class for the source file service
Hydra::Derivatives::ActiveEncodeDerivatives.source_file_service = MyCustomSourceFileService
# If you want to use a different class for the output file service
Hydra::Derivatives::ActiveEncodeDerivatives.output_file_service = MyCustomOutputFileService
```

Note: These two settings are configured on the `ActiveEncodeDerivatives` runner class. Don't confuse them with the similarly named settings on the `Hydra::Derivatives` module itself:
`Hydra::Derivatives.source_file_service` and `Hydra::Derivatives.output_file_service`

For additional documentation on using ActiveEncode, see:
* [Using Amazon Elastic Transcoder](doc/amazon_elastic_transcoder.md)

### Additional Directives

#### Layers
@@ -0,0 +1,116 @@
# Create Audio and Video Derivatives using Amazon Elastic Transcoder

`hydra-derivatives` uses the
[active\_encode gem](https://github.com/projecthydra-labs/active_encode)
to allow you to use different encoding services.
These instructions are for Amazon's Elastic Transcoder service.

## Prerequisites

### Set up the Elastic Transcoder Pipeline

Set up a pipeline on AWS Elastic Transcoder that defines:

* input bucket
* bucket for transcoded files
* bucket for thumbnails

### Configure AWS credentials

Optional: If you don't want to pass these values in your Ruby code using `Aws.config`, you can set environment variables instead:

* AWS\_ACCESS\_KEY\_ID
* AWS\_SECRET\_ACCESS\_KEY
* AWS\_REGION
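
A minimal sketch of how those three variables line up with the `Aws.config` keys used later in this guide. The helper name `aws_settings_from_env` is hypothetical and the values are placeholders; the AWS SDK reads these variables itself, so this is only meant to show the mapping and to fail loudly if one is missing.

```ruby
# Hypothetical helper: map the environment variables onto the Aws.config keys.
# `fetch` raises KeyError if a variable is not set.
def aws_settings_from_env(env = ENV)
  {
    access_key_id: env.fetch('AWS_ACCESS_KEY_ID'),
    secret_access_key: env.fetch('AWS_SECRET_ACCESS_KEY'),
    region: env.fetch('AWS_REGION')
  }
end

# Placeholder values, passed as a plain Hash so nothing real is needed to run this.
settings = aws_settings_from_env(
  'AWS_ACCESS_KEY_ID' => 'example-access-key',
  'AWS_SECRET_ACCESS_KEY' => 'example-secret-key',
  'AWS_REGION' => 'us-east-1'
)
puts settings[:region]
```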

### Install gems

Add to your `Gemfile`:

* aws-sdk

### Configure initializer

In an initializer file such as `config/initializers/active_encode.rb`, make sure you have the following code:

```ruby
# Use Amazon's Elastic Transcoder
ActiveEncode::Base.engine_adapter = :elastic_transcoder
```

## How to create derivatives (Multiple derivatives per Elastic Transcoder job)

```ruby
# Access config for AWS
Aws.config[:access_key_id] = 'put your access key here'
Aws.config[:secret_access_key] = 'put your secret key here'
Aws.config[:region] = 'us-east-1'
# The pipeline that I set up in Elastic Transcoder
pipeline_id = '1490715200916-25b08y'
# The file "sample_data.mp4" has already been uploaded to the input bucket for my pipeline.
input_file = 'sample_data.mp4'
# Choose a name for the output files
base_file_name = 'output_file_17'
# Settings for a low-res video derivative using a preset for a 320x240 resolution mp4 file
low_res_video = { key: "#{base_file_name}.mp4", preset_id: '1351620000001-000061' }
# Settings for a flash video derivative
flash_video = { key: "#{base_file_name}.flv", preset_id: '1351620000001-100210' }
# Settings to send to the Elastic Transcoder job
job_settings = { pipeline_id: pipeline_id, output_key_prefix: "active_encode-demo_app/", outputs: [low_res_video, flash_video] }
# Run the encoding
Hydra::Derivatives::ActiveEncodeDerivatives.create(input_file, outputs: [job_settings])
# Note: Your rails console will not return to the prompt until the encoding is complete,
# so it might sit there for several minutes with no feedback.
# Use the AWS console to see the current status of the encoding.
```
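
To make the shape of `job_settings` concrete, the sketch below rebuilds the same hash in plain Ruby and lists the object keys the job would be expected to write. It assumes Elastic Transcoder's documented behavior of prepending `output_key_prefix` to each output `key`; the helper name `expected_output_keys` is hypothetical and not part of `hydra-derivatives`.

```ruby
# Hypothetical helper: the keys one job should produce in the output bucket,
# assuming the prefix is prepended to each output key.
def expected_output_keys(job_settings)
  prefix = job_settings.fetch(:output_key_prefix, '')
  job_settings.fetch(:outputs).map { |output| "#{prefix}#{output[:key]}" }
end

base_file_name = 'output_file_17'
job_settings = {
  pipeline_id: '1490715200916-25b08y',
  output_key_prefix: 'active_encode-demo_app/',
  outputs: [
    { key: "#{base_file_name}.mp4", preset_id: '1351620000001-000061' },
    { key: "#{base_file_name}.flv", preset_id: '1351620000001-100210' }
  ]
}

puts expected_output_keys(job_settings)
```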

## How to create derivatives (One derivative per Elastic Transcoder job)

If you want to run a separate Elastic Transcoder job for each derivative file, you could do something like this:

```ruby
# Reuses pipeline_id and input_file from the previous example.
# Settings for a low-res video derivative using a preset for a 320x240 resolution mp4 file.
low_res_preset_id = '1351620000001-000061'
low_res_output_file = 'output_15.mp4'
low_res_video = { pipeline_id: pipeline_id, output_key_prefix: "active_encode-demo_app/", outputs: [{ key: low_res_output_file, preset_id: low_res_preset_id }] }
# Settings for a flash video derivative
flash_preset_id = '1351620000001-100210'
flash_output_file = 'output_15.flv'
flash_video = { pipeline_id: pipeline_id, output_key_prefix: "active_encode-demo_app/", outputs: [{ key: flash_output_file, preset_id: flash_preset_id }] }
Hydra::Derivatives::ActiveEncodeDerivatives.create(input_file, outputs: [low_res_video, flash_video])
```

## How to pass in a Ruby object

If you want to pass in an `ActiveFedora::Base` object (or some other record) instead of just a String for the input file name, you need to set the `source` option to specify which method to call on your object to get the file name. For example:

```ruby
# Some object that contains the source file name
class Video
attr_accessor :source_file_name
end
video_record = Video.new
video_record.source_file_name = 'sample_data.mp4'
# low_res_video is the job settings hash from the previous section
Hydra::Derivatives::ActiveEncodeDerivatives.create(video_record, source: :source_file_name, outputs: [low_res_video])
```

## How to pass in a custom encode class

If you don't want to use the default encode class `::ActiveEncode::Base`, you can pass in `encode_class`:

```ruby
Hydra::Derivatives::ActiveEncodeDerivatives.create(video_record, encode_class: MyCustomEncode, source: :source_file_name, outputs: [low_res_video])
```
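
The processor added in this commit only calls a handful of methods on the encode class: `create`, and on the returned job `reload`, `running?`, `completed?`, `state`, `errors`, `output` and `cancel!`. The sketch below is a hypothetical plain-Ruby stand-in that satisfies that surface so the call sequence can be tried without an encoding service. A real `MyCustomEncode` would more likely subclass `::ActiveEncode::Base`; the three-poll countdown and the output URL here are invented for illustration.

```ruby
# Hypothetical stand-in for an encode class; not an ActiveEncode subclass.
class MyCustomEncode
  attr_reader :state, :errors, :output

  def self.create(source_path, directives)
    new(source_path, directives)
  end

  def initialize(source_path, directives)
    @source_path = source_path
    @directives = directives
    @polls_left = 3 # pretend the job needs three status checks
    @state = :running
    @errors = []
    @output = []
  end

  # Refresh the job status; returns self so calls can be chained.
  def reload
    @polls_left -= 1
    if @polls_left <= 0
      @state = :completed
      @output = [{ url: "https://example.com/#{@source_path}" }]
    end
    self
  end

  def running?
    state == :running
  end

  def completed?
    state == :completed
  end

  def cancel!
    @state = :cancelled
    self
  end
end

# Same polling shape as the processor, with a short sleep for the demo.
job = MyCustomEncode.create('sample_data.mp4', {})
sleep 0.01 while job.reload.running?
puts job.state
```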

@@ -27,6 +27,8 @@ Gem::Specification.new do |spec|
spec.add_dependency 'mini_magick', '>= 3.2', '< 5'
spec.add_dependency 'activesupport', '>= 4.0', '< 6'
spec.add_dependency 'mime-types', '> 2.0', '< 4.0'
spec.add_dependency 'active_encode', '~>0.1'
spec.add_dependency 'addressable', '~>2.5'
spec.add_dependency 'deprecation'
end

@@ -11,6 +11,7 @@ module Derivatives
# Runners take a single input and produce one or more outputs
# The runner typically accomplishes this by using one or more processors
autoload_under 'runners' do
autoload :ActiveEncodeDerivatives
autoload :AudioDerivatives
autoload :DocumentDerivatives
autoload :FullTextExtract
@@ -30,8 +31,10 @@ module Derivatives

autoload_under 'services' do
autoload :RetrieveSourceFileService
autoload :RemoteSourceFile
autoload :PersistOutputFileService
autoload :PersistBasicContainedOutputFileService
autoload :PersistExternalFileOutputFileService
autoload :TempfileService
autoload :MimeTypeService
end
@@ -48,7 +51,7 @@ def self.reset_config!
end

CONFIG_METHODS = [:ffmpeg_path, :libreoffice_path, :temp_file_base, :fits_path, :kdu_compress_path,
:kdu_compress_recipes, :enable_ffmpeg, :source_file_service, :output_file_service].freeze
:kdu_compress_recipes, :enable_ffmpeg, :source_file_service, :output_file_service, :active_encode_poll_time].freeze
CONFIG_METHODS.each do |method|
module_eval <<-RUBY
def self.#{method}
@@ -5,7 +5,8 @@ module Derivatives
class Config
attr_writer :ffmpeg_path, :libreoffice_path, :temp_file_base,
:source_file_service, :output_file_service, :fits_path,
:enable_ffmpeg, :kdu_compress_path, :kdu_compress_recipes
:enable_ffmpeg, :kdu_compress_path, :kdu_compress_recipes,
:active_encode_poll_time

def ffmpeg_path
@ffmpeg_path ||= 'ffmpeg'
@@ -72,6 +73,13 @@ def kdu_compress_recipes
"Stiles={1024,1024}" ).gsub(/\s+/, " ").strip
}
end

# The poll time (in seconds) that the active encode
# processor will sleep before it checks the status of an
# encoding job.
def active_encode_poll_time
@active_encode_poll_time ||= 10
end
end
end
end
@@ -6,6 +6,7 @@ module Processors
autoload :Processor
end

autoload :ActiveEncode
autoload :Audio
autoload :Document
autoload :Ffmpeg
@@ -0,0 +1,56 @@
require 'active_encode'

module Hydra::Derivatives::Processors
class ActiveEncodeError < StandardError
def initialize(status, source_path, errors = [])
msg = "ActiveEncode status was \"#{status}\" for #{source_path}"
msg = "#{msg}: #{errors.join(' ; ')}" if errors.any?
super(msg)
end
end

class ActiveEncode < Processor
class_attribute :timeout
attr_accessor :encode_class
attr_reader :encode_job

def initialize(source_path, directives, opts = {})
super
@encode_class = opts.delete(:encode_class) || ::ActiveEncode::Base
end

def process
@encode_job = encode_class.create(source_path, directives)
timeout ? wait_for_encode_job_with_timeout : wait_for_encode_job
encode_job.output.each do |output|
output_file_service.call(output, directives)
end
end

private

def wait_for_encode_job_with_timeout
Timeout.timeout(timeout) { wait_for_encode_job }
rescue Timeout::Error
cleanup_after_timeout
end

# Wait until the encoding job is finished. If the status
# is anything other than 'completed', raise an error.
def wait_for_encode_job
sleep Hydra::Derivatives.active_encode_poll_time while encode_job.reload.running?
raise ActiveEncodeError.new(encode_job.state, source_path, encode_job.errors) unless encode_job.completed?
end

# After a timeout error, try to cancel the encoding.
def cleanup_after_timeout
encode_job.cancel!
rescue => e
cancel_error = e
ensure
msg = "Unable to process ActiveEncode derivative: The command took longer than #{timeout} seconds to execute. Encoding will be cancelled."
msg = "#{msg} An error occurred while trying to cancel encoding: #{cancel_error}" if cancel_error
raise Hydra::Derivatives::TimeoutError, msg
end
end
end
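
The timeout path can be tried in isolation with Ruby's standard `Timeout` module. The sketch below is not the processor itself: `StuckJob` is a hypothetical job that never finishes on its own, and `wait_with_timeout` is an illustrative name. It follows the same shape as the code above, where a `Timeout::Error` triggers a cancel attempt and an error is raised afterwards whether or not the cancel succeeded.

```ruby
require 'timeout'

# Hypothetical job that stays in the running state until it is cancelled.
class StuckJob
  attr_reader :cancelled

  def reload
    self
  end

  def running?
    !cancelled
  end

  def cancel!
    @cancelled = true
  end
end

# Poll until the job stops running, or give up after `limit` seconds,
# try to cancel, and raise whether or not the cancel succeeded.
def wait_with_timeout(job, limit:, poll_time:)
  Timeout.timeout(limit) { sleep poll_time while job.reload.running? }
rescue Timeout::Error
  begin
    job.cancel!
  rescue StandardError => e
    cancel_error = e
  end
  msg = "Encoding took longer than #{limit} seconds and will be cancelled."
  msg = "#{msg} Cancel failed: #{cancel_error}" if cancel_error
  raise msg
end

job = StuckJob.new
begin
  wait_with_timeout(job, limit: 0.05, poll_time: 0.01)
rescue RuntimeError => e
  puts e.message
end
```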
@@ -0,0 +1,43 @@
module Hydra::Derivatives
class ActiveEncodeDerivatives < Runner
# @param [String, ActiveFedora::Base] object_or_filename source file name (or path), or an object that has a method that will return the file name
# @param [Hash] options options to pass to the encoder
# @option options [Symbol] :source a method that can be called on the object to retrieve the source file's name
# @option options [Class] :encode_class the class of the encode object (usually a subclass of ::ActiveEncode::Base)
# @option options [Array] :outputs a list of desired outputs
def self.create(object_or_filename, options)
processor_opts = processor_options(options)
source_file(object_or_filename, options) do |file_name|
transform_directives(options.delete(:outputs)).each do |instructions|
processor = processor_class.new(file_name, instructions, processor_opts)
processor.process
end
end
end

# Use the source service configured for this class or default to the remote file service
def self.source_file_service
@source_file_service || RemoteSourceFile
end

# Use the output service configured for this class or default to the external file service
def self.output_file_service
@output_file_service || PersistExternalFileOutputFileService
end

def self.processor_class
Processors::ActiveEncode
end

class << self
private

def processor_options(options)
opts = { output_file_service: output_file_service }
encode_class = options.delete(:encode_class)
opts = opts.merge(encode_class: encode_class) if encode_class
opts
end
end
end
end
@@ -0,0 +1,20 @@
require 'addressable'

module Hydra::Derivatives
class PersistExternalFileOutputFileService < PersistOutputFileService
# Persists a new file at specified location that points to external content
# @param [Hash] output information about the external derivative file
# @option output [String] :url the location of the external content
# @param [Hash] directives directions which can be used to determine where to persist to.
# @option directives [String] :url This can determine the path of the object.
def self.call(output, directives)
external_file = ActiveFedora::File.new(directives[:url])
# TODO: Replace the following two lines with the shorter call to #external_url once active_fedora/pull/1234 is merged
external_file.content = ''
external_file.mime_type = "message/external-body; access-type=URL; URL=\"#{output[:url]}\""
# external_file.external_url = output[:url]
external_file.original_name = Addressable::URI.parse(output[:url]).path.split('/').last
external_file.save
end
end
end
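
The two derived values in this service, the `message/external-body` MIME type and the file name taken from the URL path, can be checked without Fedora. The sketch below uses Ruby's standard `URI` in place of `Addressable::URI` (a substitution made only so that it runs without extra gems), and the helper names and example URL are hypothetical.

```ruby
require 'uri'

# Hypothetical helper: the MIME type string that marks external content.
def external_body_mime_type(url)
  "message/external-body; access-type=URL; URL=\"#{url}\""
end

# Hypothetical helper: last path segment of the URL, used as the original name.
def original_name_for(url)
  URI.parse(url).path.split('/').last
end

url = 'https://example.com/active_encode-demo_app/output_file_17.mp4'
puts external_body_mime_type(url)
puts original_name_for(url)
```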
@@ -0,0 +1,18 @@
# For the case where the source file is a remote file, and we
# don't want to download the file locally, just return the
# file name or file path (or whatever we need to pass to the
# encoding service so that it can find the file).

module Hydra::Derivatives
class RemoteSourceFile
# Finds the file name of the remote source file.
# @param [String, ActiveFedora::Base] object file name, or an object that has a method that will return the file name
# @param [Hash] options
# @option options [Symbol] :source a method that can be called on the object to retrieve the source file's name
# @yield [String] the file name
def self.call(object, options, &_block)
source_name = options.fetch(:source, :to_s)
yield(object.send(source_name))
end
end
end
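
How the `:source` option changes what gets yielded is easiest to see by running it. The sketch below copies the `call` body into a standalone class named `RemoteSourceFileSketch` (a hypothetical name, chosen so it does not clash with the real one) and uses a `Struct` as a stand-in record.

```ruby
# Standalone copy of the lookup logic, for experimentation only.
class RemoteSourceFileSketch
  def self.call(object, options)
    source_name = options.fetch(:source, :to_s)
    yield(object.send(source_name))
  end
end

# Stand-in for a record that knows its source file name.
Video = Struct.new(:source_file_name)

names = []
# A plain String: with no :source option, :to_s is called and the string is yielded.
RemoteSourceFileSketch.call('sample_data.mp4', {}) { |name| names << name }
# A record: the :source option names the method to call.
RemoteSourceFileSketch.call(Video.new('from_record.mp4'), { source: :source_file_name }) { |name| names << name }

puts names
```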
