move full text extraction to hydra:derivatives #90

Closed
elrayle opened this issue Oct 5, 2015 · 6 comments

Comments


elrayle commented Oct 5, 2015

Move the full text extraction code from Hydra::Works into hydra-derivatives, so that the #makes_derivatives configuration process defined by hydra-derivatives can be used to specify that full text extraction should occur.

See issue samvera/hydra-works#220 in Hydra::Works for more details.

Member

jcoyne commented Oct 5, 2015

There is no makes_derivatives in the API any longer.

Author

elrayle commented Oct 5, 2015

Upon review, I see that a major change was made recently. Hydra-works is not currently in sync with this change as it still uses #makes_derivatives.

This is an interesting change to the approach for generating derivatives. Based on discussions I've heard previously, I was under the impression that there was a desire for apps to be able to choose which derivatives they wanted generated.

For example, once full text extraction was moved over, #makes_derivatives might look something like...

      makes_derivatives do |obj|
        case obj.original_file.mime_type
        when *pdf_mime_types
          obj.transform_file :original_file, thumbnail: { format: 'jpg', size: '338x493' }
          obj.transform_file :original_file, ext_text: { format: 'txt' }
        when ...
        end
      end

If other derivatives are added and are desired for PDFs, then another transform_file call would be added for each one that should run for PDFs. A simple example of something an app might want to do now is generate multiple thumbnails at various sizes.
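The per-app selection described above can be sketched in plain Ruby without hydra-derivatives loaded. Everything here is an assumption made for illustration: `FakeFile`, `request_pdf_derivatives`, the `small_thumbnail` name, and the `100x100` size are hypothetical, and `transform_file` is a stand-in that only records what was requested rather than generating anything.

```ruby
# Stand-in object for illustration only; NOT the hydra-derivatives implementation.
class FakeFile
  attr_reader :mime_type, :requested

  def initialize(mime_type)
    @mime_type = mime_type
    @requested = []
  end

  # Hypothetical recorder that mimics the call shape of transform_file
  # used in the snippet above. It stores each requested output.
  def transform_file(source, outputs)
    outputs.each { |name, opts| @requested << [source, name, opts] }
  end
end

PDF_MIME_TYPES = ['application/pdf'].freeze

# The app decides which derivatives a PDF gets: one transform_file call each.
def request_pdf_derivatives(obj)
  case obj.mime_type
  when *PDF_MIME_TYPES
    obj.transform_file :original_file, thumbnail: { format: 'jpg', size: '338x493' }
    obj.transform_file :original_file, small_thumbnail: { format: 'jpg', size: '100x100' }
    obj.transform_file :original_file, ext_text: { format: 'txt' }
  end
  obj.requested
end
```

Adding or dropping a derivative for PDFs is then a one-line change in the `when` branch, which is the flexibility being asked about.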

The new approach seems to be deciding what gets generated for you and bundling together a predefined set of derivatives that you get based on file type. This seems like a less flexible approach. What were the thoughts behind making this change?

Author

elrayle commented Oct 5, 2015

BTW, the lack of #makes_derivatives doesn't affect the validity of this issue. It does affect the implementation approach. Also, we will have to consider whether this change goes only in the master branch destined for 3.0 or also in a new branch destined for a 2.x release.

Member

jcoyne commented Oct 5, 2015

@elrayle hydra-works has been updated to use the current version of hydra-derivatives, and it no longer uses makes_derivatives. See samvera/hydra-works#219. And yes, this is just a note to the implementer. Applications can choose which derivatives they generate by overriding create_derivatives, as curation_concerns does: https://github.com/projecthydra-labs/curation_concerns/blob/master/curation_concerns-models/app/models/concerns/curation_concerns/generic_file/derivatives.rb#L14-L37
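The override pattern mentioned here can be illustrated with a small self-contained sketch. It does not reproduce the linked curation_concerns code or call any hydra-derivatives runner; `DefaultDerivatives`, `GenericFileSketch`, and the returned symbols are hypothetical names that only show how an application-level `create_derivatives` can replace a default and fall back to it with `super`.

```ruby
# Hypothetical stand-in for a library-provided default; not real hydra-works code.
module DefaultDerivatives
  # Pretend default: every file type gets only a thumbnail.
  def create_derivatives
    [:thumbnail]
  end
end

# Hypothetical application model that overrides the default per MIME type.
class GenericFileSketch
  include DefaultDerivatives

  attr_reader :mime_type

  def initialize(mime_type)
    @mime_type = mime_type
  end

  # Application-level override: PDFs also get extracted text,
  # everything else falls back to the included default via super.
  def create_derivatives
    case mime_type
    when 'application/pdf' then [:thumbnail, :extracted_text]
    else super
    end
  end
end
```

In a real application the branches would invoke the derivative services instead of returning symbols; the symbols only make the chosen set visible.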

Author

elrayle commented Oct 5, 2015

I'm back up to date on the recent changes. The original issue stands as described, except that the #create_derivatives block described in the README will be used to specify which derivatives to run instead of #makes_derivatives. Thanks, Justin, for the clarifications.

Member

jcoyne commented Dec 1, 2015

Closed by 51d81e7

jcoyne closed this as completed Dec 1, 2015