Multiple Islandora Sites Connecting to a Single Fedora Repository. #245

Closed
ruebot opened this issue May 20, 2016 · 13 comments

ruebot commented May 20, 2016

Issue by dmoses
Wednesday Feb 04, 2015 at 21:26 GMT
Originally opened as https://github.com/islandora-interest-groups/Islandora-Fedora4-Interest-Group/issues/14


Title (Goal): Multiple Drupal Sites Connecting to a Single Fedora Repository
Primary Actor: Repository and Multisite Implementers
Scope: Architecture, access, security
Level:
Story: As a repository / Drupal site implementer, I want to be able to connect multiple Drupal sites within my Drupal multi-site [1] to a single Fedora repository.

Examples:

  • This is a use case commonly adopted by consortia (e.g. UPEI’s CAIRN repository, University of Florida Virtual Campus, Colorado Alliance of Research Libraries).
  • UPEI uses multi-sites locally to separate publicly accessible collections (a group of sites connecting to a single 'public' Fedora) from dark ones (a group of sites connecting to a separate Fedora repository).

Remarks:

  • “Multi-site allows you to share a single Drupal installation (including core code, contributed modules, and themes) among several sites”. This option simplifies site maintenance, module upgrades, management of dependencies, documentation, standards maintenance, XML Form deployment, etc.
  • Security concerns:
    • Users may belong to several Drupal sites and may have different permissions and roles in each.
    • Users authenticated in a particular Drupal site must see only that site's content, not other content in the shared repository.
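
To make the security concern above concrete, here is a minimal sketch of site-scoped visibility. It is not Islandora or Fedora code: `RepoObject`, `User`, `owning_site`, and `visible_objects` are hypothetical names, and it assumes each object in the shared repository is tagged with the one Drupal site that owns it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RepoObject:
    pid: str
    owning_site: str  # hypothetical field: the Drupal site this object belongs to


@dataclass
class User:
    name: str
    # Roles are kept per site because one person may hold different roles on each site.
    roles_by_site: dict = field(default_factory=dict)


def visible_objects(user, current_site, objects):
    """Return the objects the user may see while browsing current_site."""
    # No role on this site means no access, whatever roles the user holds elsewhere.
    if not user.roles_by_site.get(current_site):
        return []
    # Even with a role, only objects owned by the current site are returned.
    return [obj for obj in objects if obj.owning_site == current_site]
```

A real implementation would more likely enforce this in the repository (for example with XACML policies) than in application code; the sketch only shows the rule being asked for.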

[1] https://www.drupal.org/documentation/install/multi-site

ruebot commented May 20, 2016

Comment by mjordan
Wednesday Feb 04, 2015 at 21:55 GMT


In the F4 re-architecting of Drupal's role in the Islandora stack, web-facing derivatives will be managed by Drupal (presumably as file attachments to nodes) and not fetched from Fedora 4 on demand as they currently are.

Given this architecture, a specific use case following from the one @dmoses describes is: I have two Drupals, each containing separate collections that share an object (and therefore each containing a copy of that object's web-facing derivatives). I, as admin, or some external process, regenerate the derivatives for that object. Now both copies of those derivatives in the separate Drupals need to be refreshed. Would the REST APIs on both Drupals receive a request from Fedora 4 to replace the relevant files (so that admin and end users wouldn't know the difference)?
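
One way to picture the refresh scenario above is a fan-out: something keeps track of which Drupals hold a copy of an object's derivatives and notifies each of them when the derivatives are regenerated. The sketch below is only an illustration under that assumption. `DerivativeRefreshFanout` and its methods are hypothetical names, and `send` stands in for whatever would actually call each Drupal's REST API; whether Fedora 4 itself or some middleware would play this role is exactly the open question.

```python
from collections import defaultdict


class DerivativeRefreshFanout:
    """Tracks which Drupal sites hold derivative copies of each repository object."""

    def __init__(self):
        # object id -> set of site names holding a copy of its derivatives
        self._sites_by_object = defaultdict(set)

    def register_copy(self, object_id, site):
        """Record that a site holds derivatives for the given object."""
        self._sites_by_object[object_id].add(site)

    def derivatives_regenerated(self, object_id, send):
        """Call send(site, object_id) once per site holding a copy; return the sites notified."""
        notified = []
        for site in sorted(self._sites_by_object.get(object_id, ())):
            send(site, object_id)
            notified.append(site)
        return notified
```

With two sites registered for the same object, a single regeneration event results in two `send` calls, one per Drupal, and none for objects that no site has copied.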

ruebot commented May 20, 2016

Comment by DiegoPino
Thursday Feb 05, 2015 at 14:56 GMT


This leads to a question (a derivative question!): how does Fedora 4 know a derivative is needed if it is not stored inside the new object structure (or, put another way, why should Fedora need to know about this)? I mean in terms of object portability. If derivatives are no longer part of the "entity", will the notion of e.g. RELS-INT describing that an image is a representation (cover) of a book still have a place (something like an external datastream)? Or is leaving derivatives out of the repo a deliberate decision rather than a technical restriction?
We need components/parts of an object to stay attached to the notion of the object, because our objects are sometimes "authoritative" representations of something, and multiple others are related to those.
So having a unique real path (URI) to a derivative would be optimal, or at least some form of owl:sameAs.
My biggest concern (OK, not really so bad) about having something in Drupal that is not the real object but a "node" representing a "state" plus some extras, like derivatives, of what we really have in the repo is exactly what @mjordan and @dmoses are suggesting: synchronisation. If I need to move my Drupal site, I can't reconstruct everything from the repo; I need to move my Drupal data too (nodes, files) and regenerate everything based on decisions I made in Drupal (like some stored states), not on what I implicitly stated in my objects.
Also, my last concern: derivatives (URIs) will be left out of Solr (if Solr is object-based and not Drupal-based). So if I want to make some very fancy Islandora Solr display, I need to fetch data from Solr and then ask Drupal what extra info is attached to the node representing the object. In our case Solr is cloud-based, so our displays use the real path/URI of the object. I would need to write an additional Drupal service that lets me ask other Drupals (in my network) what they have locally for my PID and add this to my Solr results. Just some random thoughts I'm having right now!

ruebot commented May 20, 2016

Comment by ksclarke
Thursday Feb 05, 2015 at 15:12 GMT


I don't think derivatives should be in Fedora, but I'm also not sure that, in many cases, they should be in Drupal either. For images, for instance, I'd like to see an IIIF Drupal module and have Drupal, and in this case many Drupals, reference the images from that. That consolidates the image serving (though the image server could be a clustered thing in itself) and relieves us from having to move around Drupal files directories.

I could imagine something similar done with video (though I'm not sure if there is a video IIIF-like thing).

And, I think, using a standard like IIIF.io would mean you could put the derivatives' URIs in Solr because you have a pattern that you know they will be resolvable at even if, at a particular stage in the process, they have not yet been generated.
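
As a rough illustration of the point about predictable URIs: the IIIF Image API defines request URIs of the form `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`, so a derivative's URI can be computed from the object's identifier alone. The sketch below assumes the Image API 2.x defaults (`full` region and size, `default` quality); the server base URL and the use of a PID as the IIIF identifier are assumptions made for the example, not anything decided in this thread.

```python
from urllib.parse import quote


def iiif_image_uri(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URI for the given identifier."""
    # The identifier is percent-encoded in full so that characters such as
    # ':' or '/' in a PID cannot be confused with URI path separators.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical image server and PID, for illustration only.
print(iiif_image_uri("https://images.example.org/iiif", "islandora:123"))
# prints https://images.example.org/iiif/islandora%3A123/full/full/0/default.jpg
```

Because the URI depends only on the base and the identifier, it could be written into a Solr document at index time, before any image has actually been generated.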

Edit: Totally forgot about pulibrary/iiif_image_field. I haven't looked at it yet, but did notice when it popped up in my stream. Making a note here about it to help remind me to look at it.

ruebot commented May 20, 2016

Comment by daniel-dgi
Thursday Feb 05, 2015 at 17:04 GMT


I have no experience with IIIF, but that seems incredibly sane. When I was in the game industry, we would put our assets on a CDN box tuned specifically to serving static assets so the application logic server could be tuned for dynamic requests. Pretty sure Drupal has the ability to 'manage' externally referenced files using the Media and File Entity modules, and decoupling that type of content serving from both Drupal's Apache and Fedora would definitely be the most appropriate way to handle this situation. Single site, multi sites, and even multiple Drupals could all benefit from some sort of setup like this.

Standards are good, so let's check out the iiif_image_field module too!

And Kevin, I think you've touched on the elephant in the room there with video. MASSIVE videos have never been handled well; maybe we should open another issue to deal with that. If we can handle large videos, we can handle anything 😃

ruebot commented May 20, 2016

Comment by ruebot
Thursday Feb 05, 2015 at 17:06 GMT


...oh, I have lots of massive video files!

ruebot commented May 20, 2016

Comment by daniel-dgi
Thursday Feb 05, 2015 at 17:08 GMT


And to get back to the original post, YES! Quite a few of our users run multi-sites for lots of reasons: separate branding/theming per collection, consortia, etc. We are 100% committed to making sure this will happen. Thanks for opening the issue and starting the discussion, Donald.

ruebot commented May 20, 2016

Comment by ruebot
Thursday Feb 19, 2015 at 18:08 GMT


I'd like to move to formatting our use cases the way the Fedora community formats theirs. A little bit of standardization, and we get to flesh things out a little better.

So, @dmoses, when you have a moment, check out the template I set up and edit your initial comment. You can check out an example I did here.

thanks!

ruebot commented May 20, 2016

Comment by dmoses
Thursday Feb 19, 2015 at 19:09 GMT


Took a poke at that.
https://github.com/Islandora/Islandora-Fedora4-Interest-Group/issues/14
dm

ruebot commented May 20, 2016

Comment by ruebot
Thursday Feb 19, 2015 at 20:41 GMT


@dmoses++

ruebot commented May 20, 2016

Comment by dmoses
Thursday Feb 19, 2015 at 20:55 GMT


@ksclarke one of the reasons to leave the derivatives in Fedora is that your XACML policies can apply to the content at the object or datastream level. If versions of the content are stored in multiple places, then some sort of inherited(?) security will need to apply to that content. Having it all in Fedora simplifies the security. If collections are public, then having them in multiple places is less of a concern, as there usually isn't a policy applied.

Straying from the use case: there are other practical considerations around object syncing and distributed datastreams. When you delete an object, how will Islandora know where the derivatives are, so it can remove them as well, if they aren't 'part' of the object? Likewise, when you update an object with a new binary, how will Islandora know how to update the existing derivatives? Maybe this is understood by others (redirect or external datastream?) and I am missing something.
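
One possible answer to the delete/update questions above, sketched under the assumption that some component records where each derivative lives and which version of the source binary it was made from. `DerivativeIndex` and its methods are hypothetical names, not part of Islandora or Fedora, and the checksum is whatever fingerprint of the source binary is available.

```python
class DerivativeIndex:
    """Remembers where an object's derivatives live and what they were generated from."""

    def __init__(self):
        # object id -> {derivative location: checksum of the source binary used}
        self._entries = {}

    def record(self, object_id, location, source_checksum):
        """Note that a derivative at `location` was generated from the given source."""
        self._entries.setdefault(object_id, {})[location] = source_checksum

    def stale_locations(self, object_id, current_checksum):
        """After a binary update: locations whose derivative predates the new binary."""
        entries = self._entries.get(object_id, {})
        return sorted(loc for loc, checksum in entries.items() if checksum != current_checksum)

    def forget_object(self, object_id):
        """After an object delete: return every location that must be cleaned up."""
        return sorted(self._entries.pop(object_id, {}))
```

Deleting an object then means calling `forget_object` and removing each returned location, and updating a binary means regenerating whatever `stale_locations` reports.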

ruebot commented May 20, 2016

Comment by daniel-dgi
Thursday Feb 19, 2015 at 21:00 GMT


Our XACML policies typically mirror Drupal permissions anyway, and when derivatives are managed by Drupal they will have security applied to them at the Drupal layer. But yeah, you're right, we'll also have to add some management code for those files in the Drupal hooks to make sure they're cleaned up.

I think in Kevin's case there are discussions going on about IIIF and security. I'll gracefully bow out of that, since I've not been keeping up.

ruebot commented May 20, 2016

Comment by ksclarke
Friday Feb 20, 2015 at 15:58 GMT


Yes, though I haven't been tracking it too well either, documentation on the IIIF auth stuff is available at: https://github.com/IIIF/auth/blob/master/iiif_authentication_working_group_charter.md

My thought (without any implementation behind it) is that once IIIF works out auth, Fedora would be the definitive source of that information, and an IIIF image server that worked with Fedora would use what's defined in Fedora as the source from which to make auth decisions. Though I need to do this myself, I think it would be good for those of us thinking of using an IIIF server and Fedora 4 to make sure our use cases are handled by the IIIF auth work.

@dannylamb
Contributor

Closing old use cases until after MVP doc is released.
