The concepts covered in this tutorial can be applied in any application with the active-fedora gem installed. (Don’t forget to require "active-fedora".)

Dependencies

You will need Ruby 1.8.7+ to go through this tutorial. If you don’t have Ruby 1.8.7 installed, RVM is the best way to install it.

Get a Copy of the Code

Because this tutorial uses the sample files included with ActiveFedora, it is probably easiest to clone a full copy of the active-fedora code so you have access to the sample files that are stored there.

First, clone the git repository and cd into the root:

git clone git@github.com:mediashelf/active_fedora.git
cd active_fedora

If you don’t have bundler installed yet, install it:

gem install bundler

Now let bundler handle installing active-fedora’s dependencies:

bundle install

Run Solr & Fedora

Get a copy of hydra-jetty and start it (easiest)

hydra-jetty is a copy of a Jetty server with Fedora and Solr installed and ready to use with ActiveFedora. Grab a copy and start it up.

git clone git://github.com/projecthydra/hydra-jetty.git
cd hydra-jetty
java -jar start.jar

This will start Jetty and spit a bunch of info out onto the console. Leave that terminal window open and open a new one to play around with ActiveFedora.

You can also set up your own copies of Fedora and Solr to run against. For info on that, see “Setting up Fedora and Solr for use with ActiveFedora”.

Open up the (Rails) Console (or, use irb)

script/console
 
require "rubygems"
require "active-fedora"
require "active_fedora/samples" # these are the sample models and datastreams that come with ActiveFedora

Initialize ActiveFedora

In order to function, ActiveFedora needs to know where Fedora and Solr are running. It gets this information from a YAML file. You can see a sample fedora.yml in the ActiveFedora code on GitHub: http://github.com/mediashelf/active_fedora/blob/master/config/fedora.yml

ActiveFedora.init
I, [...]  INFO -- : Using the default fedora.yml that comes with active-fedora.  
If you want to override this, pass the path to fedora.yml as an argument to ActiveFedora.init or set RAILS_ROOT and put fedora.yml into #{RAILS_ROOT}/config.
I, [...]  INFO -- : FEDORA: loading ActiveFedora config from /opt/local/lib/ruby/gems/1.8/gems/active-fedora-1.1.3/config/fedora.yml
I, [...]  INFO -- : FEDORA: initializing ActiveFedora::SolrService with solr_config: {:url=>"http://127.0.0.1:8983/solr/development"}
I, [...8]  INFO -- : FEDORA: initialized Solr with ActiveFedora.solr_config: #<ActiveFedora::SolrService:0x1021245f0 @conn=#<Solr::Connection:... >>>
I, [...]  INFO -- : FEDORA: initializing Fedora with fedora_config: {:url=>"http://fedoraAdmin:fedoraAdmin@127.0.0.1:8983/fedora"}
I, [...]  INFO -- : FEDORA: initialized Fedora as: #<Fedora::Repository:0x102123b78 ....>
=> true

As you can see, ActiveFedora.init defaults to using the fedora.yml included in the gem, which points to a local instance of jetty running on port 8983 with fedora and solr installed.

If you want to use a different yml file, put your info (pointing ActiveFedora to specific Fedora & Solr URLs) into the file and pass its path to ActiveFedora.init:

ActiveFedora.init("path/to/fedora.yml")
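As a rough sketch of what such a file might contain, the snippet below parses an environment-keyed YAML document with Ruby’s standard YAML library. The layout (a top-level development key holding fedora and solr entries) is an assumption based on the URLs in the log output above; the sample fedora.yml linked earlier is the authoritative reference. SAMPLE_FEDORA_YML and service_urls are illustrative names, not part of ActiveFedora.

```ruby
require "yaml"

# Assumed layout. The URLs are the ones shown in the ActiveFedora.init log output.
SAMPLE_FEDORA_YML = <<-EOS
development:
  fedora:
    url: http://fedoraAdmin:fedoraAdmin@127.0.0.1:8983/fedora
  solr:
    url: http://127.0.0.1:8983/solr/development
EOS

# Hypothetical helper: returns the Fedora and Solr URLs for one environment.
def service_urls(yaml_text, environment)
  config = YAML.load(yaml_text).fetch(environment)
  { :fedora => config["fedora"]["url"], :solr => config["solr"]["url"] }
end

urls = service_urls(SAMPLE_FEDORA_YML, "development")
puts "Fedora: #{urls[:fedora]}"
puts "Solr:   #{urls[:solr]}"
```

Checking the parsed values this way before calling ActiveFedora.init can make a mistyped URL easier to spot.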

ActiveFedora within Rails

If you are running a Rails app, ActiveFedora.init will automatically look for config/fedora.yml.

Also, within a Rails app, you should create a file in config/initializers (e.g. fedora_config.rb) that calls ActiveFedora.init.
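A minimal sketch of such an initializer might look like the following. It only uses ActiveFedora.init with a path argument, as shown above; fedora_config_path is a hypothetical helper, and the fallback to the current directory when RAILS_ROOT is not defined is an assumption made so the snippet can run outside Rails.

```ruby
# config/initializers/fedora_config.rb (file name taken from the text above)

# Hypothetical helper: builds the path to config/fedora.yml under an app root.
def fedora_config_path(app_root)
  File.join(app_root, "config", "fedora.yml")
end

# Assumption: fall back to the current directory when not running inside Rails.
app_root = defined?(RAILS_ROOT) ? RAILS_ROOT : Dir.pwd

if defined?(ActiveFedora)
  ActiveFedora.init(fedora_config_path(app_root))
else
  puts "ActiveFedora is not loaded; would init with #{fedora_config_path(app_root)}"
end
```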

Load a Fixture Object To Play With

The ActiveFedora code includes sample Fedora objects (as foxml files) that you can load into a Fedora repository and play around with. Here we will load the one called hydrangea_fixture_mods_article1.foxml.xml:

filename = File.join(File.dirname(__FILE__),"spec","fixtures", "hydrangea_fixture_mods_article1.foxml.xml")
file = File.new(filename, "r")
result = foxml = Fedora::Repository.instance.ingest(file.read)

If you get an error that starts with the lines below, this means that you already have a copy of that object in fedora.

  Fedora::ServerError: Failed with 500 Error from Fedora: javax.ws.rs.WebApplicationException: org.fcrepo.server.errors.ObjectExistsException: The PID 'hydrangea:fixture_mods_article1' already exists in the registry; the object can't be re-created.

The easiest way to delete an object from Fedora is to use the following line. Note that this will raise an error if the object didn’t exist in the first place.

  ActiveFedora::Base.load_instance("hydrangea:fixture_mods_article1").delete

To see a more complete implementation of importing and deleting Fedora objects, see the code in this gem’s fedora rake tasks https://github.com/mediashelf/active_fedora/blob/master/lib/tasks/fedora.rake

When you’re done playing around with importing and deleting, make sure that you leave a copy of hydrangea:fixture_mods_article1 in fedora so we can play with it.
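If you want the ingest step to be repeatable, one option is to treat the “already exists” error as harmless and let every other error through. The sketch below is not part of ActiveFedora: ignoring_errors_matching is a hypothetical helper, and the lambda stands in for the real ingest call so the snippet runs without Fedora.

```ruby
# Hypothetical helper: runs the block, returns true on success, returns false
# if the error message matches the pattern, and re-raises anything else.
def ignoring_errors_matching(pattern)
  yield
  true
rescue StandardError => e
  raise unless e.message =~ pattern
  false
end

# Stand-in for the ingest call; it raises a message shaped like the one above.
fake_ingest = lambda do
  raise "org.fcrepo.server.errors.ObjectExistsException: The PID " \
        "'hydrangea:fixture_mods_article1' already exists in the registry"
end

ingested = ignoring_errors_matching(/ObjectExistsException/) { fake_ingest.call }
puts ingested ? "ingested" : "already present, left as-is"

# With Fedora running, the block would wrap the real call from this tutorial:
#   ignoring_errors_matching(/ObjectExistsException/) do
#     Fedora::Repository.instance.ingest(File.read(filename))
#   end
```

Matching on the message text is fragile; it is used here only because the error shown above carries the Java exception name in its message.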

Define a Model for Your (Active)Fedora Objects

Look at the SpecialThing model defined in lib/active_fedora/samples/special_thing.rb to see how you declare an ActiveFedora model.

Create an instance of the SpecialThing class:

newthing = SpecialThing.new

Get the pid of your new object:

newthing.pid
=> "changeme:30"

This pid was retrieved from Fedora’s getNextPid method.

Your object will not show up in the actual Fedora repository until you save it using newthing.save, but let’s hold off on saving it for now.

Fedora RELATIONSHIPS in ActiveFedora

ActiveFedora provides convenience methods for creating and editing Fedora RELS-EXT relationships. It also auto-generates methods for searching these relationships with Solr. (see https://github.com/projecthydra/solrizer and https://github.com/projecthydra/solrizer-fedora)

Use the relationships method to list the object’s relationships:

newthing.relationships
=> {:self=>{}}

The SpecialThing class definition contains these lines:

  #
  # RELATIONSHIPS
  #
  
  # This is an example of how you can add a custom relationship to a model
  # This will allow you to call .derivations on instances of the model to get a list of all of the _outbound_ "hasDerivation" relationships in the RELS-EXT datastream
  has_relationship "derivations", :has_derivation

  # This will allow you to call .inspirations on instances of the model to get a list of all of the objects that assert "hasDerivation" relationships pointing at this object
  has_relationship "inspirations", :has_derivation, :inbound => true  

ActiveFedora creates methods for the SpecialThing object based on has_relationship lines. So we can call the “inspirations” method that is automatically created by ActiveFedora:

newthing.inspirations
=> []

This method is actually making a search request to Solr — it is looking in Solr to see if the “newthing” object has any “inspirations” relationships.

Now we’ll create another Fedora object (using the default ActiveFedora object model) and we’ll use ActiveFedora’s add_relationship method to relate our new object to the SpecialThing object. We’ll also save our new object in our Fedora repository.

newobj = ActiveFedora::Base.new
newobj.add_relationship(:has_derivation, newthing)
=> true
newobj.relationships
=> {:self=>{:has_derivation=>["info:fedora/changeme:30"]}}
newobj.save
=> ...
newobj.pid
=> "changeme:164" # this is the pid you want to put in the following URLs as a replacement for (PID)

You can see objects in Fedora by going to http://localhost:8983/fedora/objects/(PID) and you can see the relationships for an object by looking at the RELS-EXT datastream content: http://localhost:8983/fedora/objects/(PID)/datastreams/RELS-EXT/content
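To avoid pasting pids into URLs by hand, you could build them in the console. This sketch just follows the two URL patterns given above; the helper names and the FEDORA_BASE_URL constant are made up for illustration and assume the default hydra-jetty host and port.

```ruby
# Assumption: Fedora is reachable at the default hydra-jetty address.
FEDORA_BASE_URL = "http://localhost:8983/fedora"

# Hypothetical helper: URL of an object's profile page.
def fedora_object_url(pid, base_url = FEDORA_BASE_URL)
  "#{base_url}/objects/#{pid}"
end

# Hypothetical helper: URL of the raw content of one datastream.
def fedora_datastream_content_url(pid, dsid, base_url = FEDORA_BASE_URL)
  "#{fedora_object_url(pid, base_url)}/datastreams/#{dsid}/content"
end

puts fedora_object_url("changeme:164")
puts fedora_datastream_content_url("changeme:164", "RELS-EXT")
```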

Now let’s see if the “newthing” object has an “inspiration” relationship with our “newobj”:

newthing.inspirations
=> ... (FIXME: put expected output here)
newthing.inspirations.each {|pt| puts pt.pid }
=> ... (FIXME: put expected output here)
newthing.inspirations(:response_format=>:id_array)
=> ... (FIXME: put expected output here)

Note that you didn’t have to save the “newthing” object in order for this relationship to show up in Solr because it is an inbound relationship.

Only the ActiveFedora object making the assertion needs to be saved in order for the search to work. In our example above, newobj asserts :has_derivation (rather than the derivative asserting :is_derivation_of), so only newobj had to be saved.
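The hash returned by the relationships method stores targets as info:fedora/ URIs rather than bare pids. The sketch below works on a plain hash shaped like the newobj.relationships output shown earlier, so it runs without Fedora; pid_to_uri, uri_to_pid and related_pids are hypothetical helpers, not ActiveFedora methods.

```ruby
FEDORA_URI_PREFIX = "info:fedora/"

# Hypothetical helper: "changeme:30" -> "info:fedora/changeme:30"
def pid_to_uri(pid)
  FEDORA_URI_PREFIX + pid
end

# Hypothetical helper: strips the info:fedora/ prefix if it is present.
def uri_to_pid(uri)
  uri.sub(/\A#{Regexp.escape(FEDORA_URI_PREFIX)}/, "")
end

# Hypothetical helper: pids of the objects targeted by one outbound predicate.
def related_pids(relationships, predicate)
  relationships.fetch(:self, {}).fetch(predicate, []).map { |uri| uri_to_pid(uri) }
end

# Same shape as the newobj.relationships output shown earlier.
relationships = { :self => { :has_derivation => ["info:fedora/changeme:30"] } }
puts related_pids(relationships, :has_derivation).inspect
```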

Fedora DATASTREAMS & METADATA in ActiveFedora

Blobs (a.k.a. File Datastreams, a.k.a Managed Content Datastreams)

Here we create a simple Datastream (using the default Datastream model).

file = File.new('spec/fixtures/minivan.jpg')
=> #<File:spec/fixtures/minivan.jpg>
file_ds = ActiveFedora::Datastream.new(:dsID => "minivan", :dsLabel => 'hello', :controlGroup => 'M', :blob => file)
=> ...
newthing.add_datastream(file_ds)
=> "minivan"
newthing.save
=> true

Now use your browser to find the file datastreams in Fedora …

On auto-generating datastream ids

If you don’t specify a dsid, ActiveFedora will generate one for you. In the example below, “DS1” is the dsID assigned to the new datastream.

file_ds2 = ActiveFedora::Datastream.new(:dsLabel => 'Minivan Plays', :altIDs => 'default', :controlGroup => 'M', :blob => file)
newthing.add_datastream(file_ds2)
=> "DS1"
newthing.datastreams.keys
=> ["DS1", "descMetadata", "minivan", "RELS-EXT", "rightsMetadata", "DC", "extraMetadataForFun"]
newthing.datastreams_in_memory["DS1"] == file_ds2
=> true

You can choose a different prefix for the dsid by passing a :prefix value to add_datastream (be careful to ensure that the resulting dsid is a valid XMLString, or fedora will reject it!)

file_ds3 = ActiveFedora::Datastream.new(:dsLabel => 'Minivan Plays', :altIDs => 'default', :controlGroup => 'M', :blob => file)
newthing.add_datastream(file_ds3, :prefix=>"Foo")
=> "Foo1"
newthing.datastreams.keys
newthing.save
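To make the naming pattern concrete, here is one way a prefix-plus-counter id could be chosen from the existing datastream keys. This is an illustrative guess, not ActiveFedora’s actual implementation, and both helper names are hypothetical. The validity check is a loose, ASCII-only approximation of an XML name; Fedora’s real rules may differ.

```ruby
# Hypothetical helper: first "<prefix><n>" (n = 1, 2, ...) not already in use.
def next_dsid(existing_dsids, prefix = "DS")
  n = 1
  n += 1 while existing_dsids.include?("#{prefix}#{n}")
  "#{prefix}#{n}"
end

# Hypothetical helper: loose check that an id starts with a letter or
# underscore and continues with letters, digits, ".", "-" or "_".
def plausible_dsid?(dsid)
  !!(dsid =~ /\A[A-Za-z_][A-Za-z0-9._-]*\z/)
end

keys = ["descMetadata", "minivan", "RELS-EXT", "rightsMetadata", "DC", "extraMetadataForFun"]
puts next_dsid(keys)             # no DS ids in use yet
puts next_dsid(keys + ["DS1"])   # DS1 is taken
puts next_dsid(keys, "Foo")      # custom prefix
puts plausible_dsid?("1Foo")     # starts with a digit
```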

Retrieving Existing Fedora Repository Objects

When you want your code to interact with existing digital objects in a Fedora repository, use the ActiveFedora load_instance method.

You can use the load_instance class method on any kind of ActiveFedora::Base class to load objects from Fedora. In the example below, the ActiveFedora object model for “copy_as_base” is ActiveFedora::Base.

newthing.pid
=> "changeme:30"
copy_as_base = ActiveFedora::Base.load_instance("changeme:30")
copy_as_base.pid
=> "changeme:30"
newthing.relationships
=> {:self=>{:has_model=>["info:fedora/afmodel:SpecialThing"]}} 
copy_as_base.relationships
=> {:self=>{:has_model=>["info:fedora/afmodel:SpecialThing"]}} 
newthing.datastreams.keys
=> ["DS1", "descMetadata", "Foo1", "minivan", "RELS-EXT", "rightsMetadata", "DC", "extraMetadataForFun"] 
copy_as_base.datastreams.keys
=> ["DS1", "descMetadata", "Foo1", "minivan", "RELS-EXT", "rightsMetadata", "DC", "extraMetadataForFun"] 

As you can see, ActiveFedora::Base will load the object, its datastreams, its generic Fedora Object information, and even its RELS-EXT relationships. It will not, however, know how to deserialize any model-specific metadata datastreams. In other words, ActiveFedora::Base treats all datastreams as generic Fedora datastreams.

copy_as_base.datastreams["extraMetadataForFun"].class
=> ActiveFedora::Datastream

If you want the model-specific metadata to be deserialized, you must call load_instance on the appropriate ActiveFedora model class. This will load all of the same info as ActiveFedora::Base, but it will also attempt to deserialize the xml from any metadata datastreams that were declared by the has_metadata method in the model.

copy_as_specialthing = SpecialThing.load_instance(newthing.pid)
copy_as_specialthing.datastreams["descMetadata"].class
=> Hydra::ModsArticleDatastream
copy_as_specialthing.datastreams["extraMetadataForFun"].class
=> Marpa::DcDatastream

You can use “find” instead of “load_instance”. In practice, we tend to use load_instance though — it’s more direct.

ActiveFedora::Base.find("changeme:30")
SpecialThing.find("changeme:30")

Finding Fedora Repository Objects of the Same Class

All descendants of the ActiveFedora::Base class inherit the “find” method which searches Solr for Fedora repository objects of the given class. The method is somewhat incomplete at the moment, but is functional. We are actively working on making it better.

Finding Instances of the Class

Imitating ActiveRecord, the find(:all) method (inherited from the ActiveFedora::Base class) searches for instances of the calling class. In current versions of the gem, this method searches solr using the active_fedora_model_field. In future versions it will not hit solr at all, instead relying on Fedora’s Resource Index and searching for anything that asserts “conformsTo” or “hasModel” relationships pointing at the given model.

ActiveFedora::Base.find(:all)
SpecialThing.find(:all)

You can also query solr directly like so:

solr_result = ActiveFedora::SolrService.instance.conn.query('has_model_s:info\:fedora/afmodel\:SpecialThing')

This query will return a Solr::Result containing all of the objects that have hasModel relationships pointing at info:fedora/afmodel:SpecialThing in their RELS-EXT. ActiveFedora adds this relationship to the RELS-EXT whenever you save an object as a given model, and it is not erased if you later save the object as a different model.
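The backslashes in that query are there because the colon separates field from value in Solr’s query syntax, so colons inside the value need escaping. A small sketch of building the same query string for any model name follows; both helpers are hypothetical, and only colons are escaped, which is enough for this URI but not a general-purpose Solr escaper.

```ruby
# Hypothetical helper: backslash-escapes colons only.
def escape_solr_colons(value)
  value.gsub(":") { "\\:" }
end

# Hypothetical helper: query on the has_model_s field used in the example above.
def has_model_query(model_name)
  "has_model_s:" + escape_solr_colons("info:fedora/afmodel:#{model_name}")
end

puts has_model_query("SpecialThing")
```

The resulting string could then be passed to the conn.query call shown above.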

More About ActiveFedora Models

ActiveFedora Models for Fedora objects don’t actually do much. They mainly keep a list of datastream ids and associate them with classes that help you use the content from those datastreams.

Let’s load an instance of the SpecialThing model and take a look at its datastreams.

st = SpecialThing.load_instance("hydrangea:fixture_mods_article1")
st.datastreams
=> ... (whoa, that's a lot of stuff; how about just the datastream ids?)
st.datastreams.keys
 => ["descMetadata", "RELS-EXT", "rightsMetadata", "DC", "extraMetadataForFun", "properties"] 

We see the three datastreams that are declared by the SpecialThing Model, but where did the other datastreams come from?

Default Fedora Datastreams

The RELS-EXT is where Fedora objects store their relationships, so SpecialThing uses that datastream when it uses the methods created by has_relationship.

The other two datastreams, DC and properties, were already there in the object we imported. Our model doesn’t define anything about those datastreams, so they are loaded as mere ActiveFedora::Datastreams. When a datastream is loaded in this way, you can still see it and access/update its content as a blob, but your model doesn’t know anything special about its contents. This behavior is what allows us to have multiple interfaces for the same content. One model might care only about the descMetadata and the properties while another model only cares about the descMetadata and rightsMetadata. The two models only need to be consistent with each other when they are both operating on the same datastream in the same object.

Let’s see what classes the datastreams have been bound to

st.datastreams.keys.each do |dsid|
  puts "#{dsid}:"  
  puts "    #{st.datastreams[dsid].class}"
end

This will output

descMetadata:
    Hydra::ModsArticleDatastream
RELS-EXT:
    ActiveFedora::RelsExtDatastream
rightsMetadata:
    Hydra::RightsMetadataDatastream
DC:
    ActiveFedora::Datastream
extraMetadataForFun:
    Marpa::DcDatastream
properties:
    ActiveFedora::Datastream

Notice that properties and DC have been loaded as ActiveFedora::Datastream, RELS-EXT has been loaded as ActiveFedora::RelsExtDatastream, and the other three have been loaded as the classes specified in the Model.
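The binding behaviour can be pictured as a lookup table with a generic fallback. The sketch below is a toy model of that idea, not ActiveFedora’s code: the classes are empty stand-ins so it runs without the gem, DECLARED_DATASTREAMS and datastream_class_for are hypothetical names, and the special handling of RELS-EXT is left out.

```ruby
# Stand-in classes so the sketch runs without ActiveFedora loaded.
class GenericDatastream; end
class ModsArticleDatastream < GenericDatastream; end
class RightsMetadataDatastream < GenericDatastream; end
class DcDatastream < GenericDatastream; end

# Hypothetical model definition: the dsids a model declares and their classes.
DECLARED_DATASTREAMS = {
  "descMetadata"        => ModsArticleDatastream,
  "rightsMetadata"      => RightsMetadataDatastream,
  "extraMetadataForFun" => DcDatastream
}

# Declared dsids get their class; anything else falls back to the generic one.
def datastream_class_for(dsid, declared = DECLARED_DATASTREAMS)
  declared.fetch(dsid, GenericDatastream)
end

%w[descMetadata rightsMetadata DC extraMetadataForFun properties].each do |dsid|
  puts "#{dsid}: #{datastream_class_for(dsid)}"
end
```

A second model could carry a different table over the same dsids, which is the “multiple interfaces for the same content” idea described earlier.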

Where to Find More Information

You can examine the files in lib/active_fedora/samples to learn more about how to define ActiveFedora models and OM-based datastreams. We also suggest you read about OM-based NokogiriDatastreams to learn about manipulating XML contained in datastreams.
