Content negotiation in RDF.rb clients #12

Closed
njh opened this issue May 27, 2010 · 8 comments

@njh

njh commented May 27, 2010

http://lists.w3.org/Archives/Public/public-rdf-ruby/2010May/0024.html

Implement content negotiation in RDF.rb clients. Ideally with q= values for each of the supported parsers.

I would like to be able to do this:

repo = RDF::Repository.new
repo.load('http://www.bbc.co.uk/programmes/b00jnwlc#programme')
repo.each { |s| s.inspect! }
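For illustration, the q= values mentioned above could be assembled into an Accept header along these lines. This is only a sketch: the PREFERENCES table, its weights, and the accept_header helper are hypothetical and are not part of RDF.rb.

```ruby
# Hypothetical table of media types and preference weights; the numbers
# are illustrative, not RDF.rb defaults.
PREFERENCES = {
  'application/rdf+xml' => 1.0,
  'text/turtle'         => 0.9,
  'text/n3'             => 0.8,
  'text/html'           => 0.5,
  '*/*'                 => 0.1
}.freeze

# Build an HTTP Accept header value, highest preference first.
# A weight of 1.0 is the HTTP default, so its q parameter is omitted.
def accept_header(preferences)
  preferences
    .sort_by { |_type, q| -q }
    .map { |type, q| q >= 1.0 ? type : format('%s;q=%.1f', type, q) }
    .join(', ')
end

puts accept_header(PREFERENCES)
```

Each installed parser would contribute its own media types and weight, so a server that offers several serializations can pick the one the client handles best.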
@njh
Author

njh commented Jun 3, 2010

It would be great if multiple HTTP clients were supported, for example http://github.com/toland/patron

@gkellogg
Member

gkellogg commented Jun 7, 2010

Reader/format classes should implement a simple Regexp test on the content to be parsed, for use when the format is not detected from the extension, mime-type or an explicit request. For instance:

input.match(/<html/i) && RDF::RDFa::Format

@artob
Member

artob commented Jun 11, 2010

Related issue #24 (with regards to improving the HTTP client functionality).

@gkellogg
Member

More from a recent email response to hellekin at cepheide.org:

RDF::Reader.for needs to be somewhat smarter.

The symbol case is limited to using an element of the classname (e.g. RDF::RDFXML => :rdfxml). It would be nice to specify alternate symbols (e.g., :rdf). Of course, this can be done through for(:extension => "rdf").
RDF::Reader.open, when loading a remote resource, should look at the returned Mime-Type to do a format match, rather than requiring it be provided explicitly. Arto seems to be of the opinion that this is done via LinkedData, but it seems to be a fair thing to do directly in RDF.rb.
I believe that Format specifications should also provide a RegExp to match against the beginning of the content (I use the first 1000 bytes in RdfContext). This would be used within RDF::Reader.open in case a format couldn't be found through other means. Consider the following:

    # Heuristically detect the format of the input stream
    def detect_format(stream)
      # Got to look into the file to see
      if stream.respond_to?(:rewind)
        stream.rewind
        string = stream.read(1000)
        stream.rewind
      else
        string = stream.to_s
      end
      case string
      when /<(\w+:)?RDF/ then :rdfxml
      when /<(\w+:)?html/i then :rdfa
      when /@prefix/i then :n3
      else :ntriples
      end
    end

This could instead be found by looping through available Format subclasses and looking for a #match method. Within RDFXML::Format, I could perform the following:

    class Format < RDF::Format
      MATCH = %r(<(\w+:)?RDF)

      content_type 'text/turtle', :extension => :ttl
      content_type 'text/n3', :extension => :n3
      content_encoding 'utf-8'

      reader { RDF::N3::Reader }
      writer { RDF::N3::Writer }

      def match(content)
        content.to_s.match(MATCH)
      end
    end
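The idea of looping through format classes and asking each one whether it matches could be sketched roughly as follows. Everything here is a hypothetical stand-in: SketchFormat, its registry, the matches declaration, and detect_format_class are invented for illustration and do not use RDF.rb's real Format classes.

```ruby
# Hypothetical base class standing in for a format registry.
class SketchFormat
  class << self
    attr_reader :pattern

    # Record every subclass so the detector can iterate over them.
    def inherited(subclass)
      super
      SketchFormat.registry << subclass
    end

    def registry
      @registry ||= []
    end

    # Declare the pattern that identifies this format.
    def matches(pattern)
      @pattern = pattern
    end

    # True when the sample looks like this format.
    def match?(sample)
      !pattern.nil? && sample.to_s.match?(pattern)
    end
  end
end

class RDFXMLSketch < SketchFormat
  matches(/<(\w+:)?RDF/)
end

class RDFaSketch < SketchFormat
  matches(/<(\w+:)?html/i)
end

class N3Sketch < SketchFormat
  matches(/@prefix/i)
end

# Return the first registered format whose pattern matches, or nil.
def detect_format_class(sample)
  SketchFormat.registry.find { |format| format.match?(sample) }
end
```

With this shape, adding a new format means defining one more subclass; the detection loop itself does not change.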

In RDF::Reader.open, first look for a reader using the options. Failing that, open the file and look for a mime-type. Failing that, loop through Format instances and see if the Format matches the string content.

In most cases, this will do what the user expects.
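The fallback order described above (explicit option, then mime-type, then content sniffing) might look something like this. It is a sketch under stated assumptions: resolve_format, BY_CONTENT_TYPE, and SNIFFERS are hypothetical names with illustrative entries, not RDF.rb API.

```ruby
# Hypothetical lookup tables; a real implementation would consult the
# registered formats instead of hard-coded hashes.
BY_CONTENT_TYPE = {
  'application/rdf+xml' => :rdfxml,
  'text/html'           => :rdfa,
  'text/n3'             => :n3
}.freeze

SNIFFERS = {
  rdfxml: /<(\w+:)?RDF/,
  rdfa:   /<(\w+:)?html/i,
  n3:     /@prefix/i
}.freeze

# Resolve a format symbol by trying each source of evidence in turn.
def resolve_format(format: nil, content_type: nil, sample: nil)
  # 1. An explicit request always wins.
  return format if format

  # 2. Otherwise use the media type, ignoring parameters such as charset.
  media_type = content_type.to_s.split(';').first.to_s.strip.downcase
  return BY_CONTENT_TYPE[media_type] if BY_CONTENT_TYPE.key?(media_type)

  # 3. Finally, sniff the beginning of the content.
  found = SNIFFERS.find { |_name, pattern| sample.to_s.match?(pattern) }
  found&.first
end
```

A caller would pass whatever it has, for example resolve_format(content_type: 'text/n3; charset=utf-8'), and treat a nil result as an unknown format.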

@njh
Author

njh commented Aug 31, 2010

@gkellogg
Member

This one looks pretty interesting too:
http://github.com/eric1234/open_uri_db_cache

@gkellogg
Member

Recently I needed to revisit this issue in RdfContext to support RDFa 1.1 profiles. Profiles are a mechanism for defining RDF prefixes and terms in a separate document. The spec encourages implementers to cache these vocabularies, for obvious reasons. I implemented this using a ConjunctiveGraph, which is a graph over all quads within a Store (or Repository). When I see a profile, I look for it as a context within the ProfileGraph, and download, parse, and add it to the store as necessary.

To do this in RDF.rb is difficult, because RDF::Reader.open inverts finding the reader and opening the resource. Ideally, the resource should be opened first so that, for example, the mime-type can be retrieved to perform content negotiation, and the resource can be inspected to see if it is up-to-date. The following is a potential refactor of RDF::Reader.open that extracts the open and provides the same simple Kernel.open implementation. This makes it easier for another module to override this, or perhaps to register an alternative reader to provide better HTTP semantics.

    module RDF
      class Reader
        def self.open(filename, options = {}, &block)
          # Open the resource first so its mime-type can guide reader selection
          resource = URLResource.new(filename)
          reader = self.for(options.slice(:format).merge(:content_type => resource.mime_type))
          reader ||= self.for(filename)
          raise FormatError.new("unknown RDF format: #{options[:format] || filename}") unless reader

          reader.new(resource.io, options, &block)
        end

        class URLResource
          attr_reader :url, :mime_type, :etag, :format
          attr_reader :modified_at, :checked_at

          def initialize(url)
            @file = Kernel.open(url, "r")
          end

          def io; @file; end
        end
      end
    end
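As a rough illustration of opening the resource first and then reading its metadata, the sketch below uses Ruby's open-uri standard library, whose returned IO objects expose content_type, meta, and last_modified. The ResourceSketch class and its opener: parameter are hypothetical and only stand in for the URLResource idea above; the opener is injectable so that another HTTP client, or a caching layer, could be substituted.

```ruby
require 'open-uri'

# Hypothetical wrapper that opens a resource before any reader is chosen.
class ResourceSketch
  attr_reader :url, :io, :mime_type, :etag, :modified_at, :checked_at

  # opener: any callable that takes a URL and returns an IO-like object.
  # The default relies on URI.open from open-uri.
  def initialize(url, opener: ->(u) { URI.open(u) })
    @url = url
    @io = opener.call(url)
    @checked_at = Time.now

    # open-uri extends HTTP responses with OpenURI::Meta; plain files and
    # other IO objects carry no such metadata, so the fields stay nil.
    return unless @io.respond_to?(:meta)

    @mime_type = @io.content_type
    @etag = @io.meta['etag']
    @modified_at = @io.last_modified
  end
end
```

A reader lookup could then use mime_type before falling back to the filename, and etag or modified_at could drive a freshness check for cached profiles.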

There still remains the question of how best to implement this in RDF::RDFa, but that is a different conversation.

@gkellogg
Member

gkellogg commented Feb 6, 2012

Since 0.3.4, RDF.rb can perform format detection in RDF::Reader.for (or RDF::Format.for with :sample option or a block which returns a sample). Of course, content-negotiation is handled using rack-linkeddata or sinatra-linkeddata (and soon rack-sparql or sinatra-sparql).

artob pushed a commit that referenced this issue Jan 19, 2013
Try running the specs in Ruby 2.0.0 and head