Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include fcr:version in IIIF image URI #3165

Closed
wants to merge 1 commit into from
Closed

Conversation

dlpierce
Copy link
Contributor

Fixes #2910

Include fcr:version in IIIF image URI so the cached file's hashed filename created by riiif is version specific.
Use Rails' URI escape library (instead of CGI) in the spec because colon is handled differently.

Guidance for testing, such as acceptance criteria or new user interface behaviors:

  • Create a work with an image file.
  • Add a new visually distinct version of the image file.
  • Verify the universal viewer displays the new version on the work show page.

@samvera/hyrax-code-reviewers

Use Rails' URI escape library because colon is handled differently.
@RudyOnRails
Copy link
Contributor

Hi @dlpierce - we've been anticipating this one!

What are your thoughts on this patch also being applied to Sufia while using the Universal Viewer? Would it work the same way?

@cjcolvar
Copy link
Member

This is a creative approach! I think this could work, but am still wondering about IIIF manifests that are already in the wild. It seems like some guidance might be useful in the release notes. I think expiring the RIIIF cache after upgrading with this patch would temporarily solve this issue for both new and existing manifests. I'm guessing that if new versions are added for a file then it will still affect old manifests unless local RIIIF cache expiration happens.

@no-reply no-reply added this to Review in Hyrax WG -- Sprint 2 via automation Jul 30, 2018
@no-reply no-reply added this to the 2.x series milestone Jul 30, 2018
@RudyOnRails
Copy link
Contributor

RudyOnRails commented Jul 30, 2018

@cjcolvar - I agree, I believe the final solution will involve two parts. The iiif_endpoint change (which @dlpierce did above), and a change to the HTTPFileResolver.

Like you said above, only newly cached network_files will work okay. As soon as a new version is ingested, and this runs again:

      def fetch
        download_file unless ::File.exist?(file_name)
        file_name
      end

the previous version's image will be displayed since unless ::File.exist?(file_name) will return true. As @jcoyne and you mentioned we'd need to improve the resolver like you suggested. I'm toying around with a url + timestamp idea now.

@dlpierce
Copy link
Contributor Author

Using the file's parent FileSet timestamp from Solr might be a better performing solution since it should be able to avoid some requests to Fedora to determine available versions. There is still the preexisting original_file request, however.

Perhaps the riiif resover could compare its cached file's filesystem mtime to the FileSet timestamp?

@cjcolvar
Copy link
Member

cjcolvar commented Aug 1, 2018

Looking at this again I have a small concern about manifests being downloaded after this change is applied and then being stuck with fedora specific path elements in the uri. Upgrading to a valkyrie-based Hyrax would probably break these uris unless special routes were added to continue to support them. I'm not sure if that should block this PR though which makes real progress in fixing a bug.

@dlpierce Maybe the timestamp approach you mention would deal with this concern?
@no-reply @vantuyls @chrisdaaz What are your thoughts?

@jcoyne
Copy link
Member

jcoyne commented Aug 1, 2018

Another way this could be fixed would be to create a listener to the Fedora messages and invalidate the cache entries when an appropriate message is received.

@vantuyls
Copy link

vantuyls commented Aug 1, 2018

@cjcolvar i'd be interested to hear @no-reply and others thoughts on whether we should be predictively flagging PRs based on what we think valkyrie based hyrax is going to look like. should we, instead, be flagging this type of thing as work we need to do on the valkyrie - hyrax sprints to ensure that this doesn't break?

@@ -13,17 +13,18 @@ def display_image
return nil unless ::FileSet.exists?(id) && solr_document.image? && current_ability.can?(:read, id)
# @todo this is slow, find a better way (perhaps index iiif url):
original_file = ::FileSet.find(id).original_file
latest_file_id = original_file.has_versions? ? ActiveFedora::File.uri_to_id(original_file.versions.last.uri) : original_file.id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dlpierce - I was just wondering - does this need to be encoded? So that Apache or the like doesn't receive a decoded path? I am learning a bit here and in my Suifa application, I had to apply your spec's ActionDispatch::Journey::Router::Utils.escape_segment(uri) above so that the image requests went through ok.

@dlpierce dlpierce self-assigned this Aug 7, 2018
@dlpierce dlpierce moved this from Review to In progress in Hyrax WG -- Sprint 2 Aug 7, 2018
@@ -13,17 +13,18 @@ def display_image
return nil unless ::FileSet.exists?(id) && solr_document.image? && current_ability.can?(:read, id)
# @todo this is slow, find a better way (perhaps index iiif url):
original_file = ::FileSet.find(id).original_file
latest_file_id = original_file.has_versions? ? ActiveFedora::File.uri_to_id(original_file.versions.last.uri) : original_file.id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, original_file.has_versions? always seems to return true even when a file_set has only one version, so our iiif_endpoints changed for all our manifests, invalidating Rails.cache entries. Perhaps something like this is needed?:

latest_file_id = original_file.versions.count > 1 ? ActiveFedora::File.uri_to_id(original_file.versions.last.uri) : original_file.id

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of has_versions? was meant as a safeguard in case a file did not have support for versioning, although I do not know if that is really an issue in any application.

For now I'm working on the potential timestamp based solution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the knowledge. And thanks for getting this solution going so we could begin applying it to Sufia.

Not sure if it helps to share, but we are experiencing some older version Riiif tiles still when a new file_set version derivative is not completed yet. FileSetPresenter#display_image returns the new iiif_endpoint using the new Fedora file version id, but when the images requests are sent to Riiif, the new file id invalidates the Rails.cache keys, sparking the file resolver to check it's cache for a matching hashed file name, but it's checking the derivative path and the timestamp of the old version. That new file can sometimes not be ready yet, since it is created in a background job.

What i plan to do is control the iiif_endpoint change from some sort of after_perform on the CreateDerivativesJob - that way the derivative file will be ready when the HTTPFileResolver fetches.

What I'm not 100% sure of yet is how I will control the versioning. I may just increment some solr field on the file_set that acts as the version.

Any thoughts from the community are welcome, even those that correct my thinking. All of this is new to me and I may be a smidgeon off.

Best,
Kevin

@no-reply no-reply self-requested a review August 29, 2018 16:53
@no-reply no-reply added this to Review in Hyrax WG -- Sprint 3 Sep 4, 2018
@dlpierce dlpierce moved this from Review to In progress in Hyrax WG -- Sprint 3 Sep 13, 2018
@no-reply no-reply added this to In progress in Hyrax WG -- Sprint 4 Oct 1, 2018
@stale
Copy link

stale bot commented Oct 11, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Oct 11, 2018
@no-reply
Copy link
Member

@dlpierce how can we unblock finishing this work?

@stale stale bot removed the stale label Oct 13, 2018
@dlpierce
Copy link
Contributor Author

@no-reply I believe I can focus on this now that CircleCI is usable.

@vantuyls vantuyls added this to Ready in Hyrax WG -- Sprint 5 Oct 29, 2018
@ghost
Copy link

ghost commented Nov 14, 2018

@dlpierce any updates on this? tx

@dlpierce
Copy link
Contributor Author

I've been working on an alternative approach that doesn't modify the URIs, but instead compares timestamps. I plan to put that work in its own PR, so this one could be closed unless people think it should be used after all.

@cjcolvar
Copy link
Member

Closed per suggestion from @dlpierce

@cjcolvar cjcolvar closed this Nov 28, 2018
@mjgiarlo mjgiarlo deleted the 2910-versioned-iiif branch April 1, 2019 17:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Hyrax WG -- Sprint 2
  
In progress
Hyrax WG -- Sprint 3
  
In progress
Hyrax WG -- Sprint 4
  
In progress
Development

Successfully merging this pull request may close these issues.

None yet

6 participants