Feature Request: Populate from Parent #22

Closed
etc opened this Issue Sep 18, 2011 · 17 comments



@etc
Contributor
etc commented Sep 18, 2011

Many BibTeX::Entry objects of type :inbook, :incollection, and :inproceedings have empty fields that should be populated from their parents as determined by :crossref. So it would be nice to have a method in BibTeX::Entry to allow all fields undefined in one entry to be populated from those defined in another entry.

I am happy to have a go at implementing such a method myself, but I am a Ruby novice and suspect I wouldn't do a very good job. I also suspect something quite simple may be possible by employing the convert() methods, but I am not sure. Any thoughts?
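To make the request concrete, here is a minimal sketch of the "populate from parent" idea using plain hashes rather than BibTeX::Entry objects. The helper name `populate_from_parent` and the sample data are hypothetical and are not part of BibTeX-Ruby; the sketch only shows that fields already defined in the child should win over the parent's.

```ruby
# Hypothetical helper, not part of BibTeX-Ruby: returns a new hash in which
# every field missing from the child is taken from the parent. Fields the
# child already defines are left untouched.
def populate_from_parent(child, parent)
  parent.merge(child)
end

# Made-up sample data standing in for a @book and an @incollection entry.
book    = { title: 'Collected Essays', publisher: 'Example Press', year: '2001' }
chapter = { title: 'On Method', crossref: 'essays2001' }

filled = populate_from_parent(chapter, book)
filled[:title]     # => "On Method" (the child's own value is kept)
filled[:publisher] # => "Example Press" (filled in from the parent)
```

Note that this naive version copies fields one-to-one; it does not address mappings such as a parent's title becoming the child's booktitle.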

@inukshuk
Owner

You're right, it makes a lot of sense to support cross references; we'll try to include this in the upcoming 2.0 release due (hopefully) in a couple of weeks.

Needless to say, your help would be much appreciated. For example, the first thing we'll need is to define the expected behavior from a very high level of abstraction as a new cucumber feature: you could add a simple BibTeX file with a couple of cross references and describe what behavior you would expect (e.g., if there is no 'booktitle' set in an :incollection entry it should return the 'title' of the referenced element). Then we'll extract a feature specification from the examples and decide how to implement it.

@etc
Contributor
etc commented Sep 18, 2011

It would be excellent to see this in 2.0. I'm writing a Nesta plugin to support citations and reference lists, and this will make handling cross-references simple. I'll have a go at writing a Cucumber feature and get back to you.

@etc
Contributor
etc commented Sep 18, 2011

Pull request is here.

@etc
Contributor
etc commented Sep 18, 2011

Here's a quick fix I'm using to do this until there's a method built into the class itself: https://gist.github.com/1225498

@inukshuk
Owner

So what functionality should we support?

A BibTeX::Entry should

  • be able to tell you whether or not it contains a valid (i.e. resolvable in the current Bibliography) crossref: #has_crossref?
  • be able to tell you if there are any entries containing references to the entry: #crossref?
  • be able to list all entries containing a reference to the entry: #referenced_by (or perhaps just #references?)
  • when accessing a field that is not defined in an entry where #has_crossref? is true, return the 'corresponding' field value of the cross-referenced entry
  • in order to resolve the correspondent fields the entry should define a mapping (e.g., :title => :booktitle)
  • be able to populate all empty fields with the cross-referenced values permanently (i.e., resolve the reference): #resolve_references or #populate_references?
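
The list above could be sketched roughly as follows. This is only an illustration under assumptions: `SketchBibliography`, `SketchEntry`, `crossref_target`, and `FIELD_ALIASES` are hypothetical names invented for the example, and the method names simply follow the proposals in the list, not any released API.

```ruby
# Hypothetical sketch classes; they are not the gem's implementation.
class SketchBibliography
  def initialize
    @entries = {}
  end

  def add(entry)
    entry.bibliography = self
    @entries[entry.key] = entry
    self
  end

  def [](key)
    @entries[key]
  end

  def entries
    @entries.values
  end
end

class SketchEntry
  # Field requested on the child => field that supplies it in the parent.
  FIELD_ALIASES = { booktitle: :title }.freeze

  attr_reader :key, :fields
  attr_accessor :bibliography

  def initialize(key, fields = {})
    @key = key
    @fields = fields
  end

  # The entry named by the crossref field, or nil if it cannot be resolved.
  def crossref_target
    bibliography && fields[:crossref] && bibliography[fields[:crossref]]
  end

  # True only for a crossref that resolves in the current bibliography.
  def has_crossref?
    !crossref_target.nil?
  end

  # All entries whose crossref names this entry (a linear scan).
  def referenced_by
    return [] unless bibliography
    bibliography.entries.select { |e| e.fields[:crossref] == key }
  end

  def crossref?
    !referenced_by.empty?
  end

  # Local value first; otherwise fall back to the parent, trying the alias.
  def [](name)
    return fields[name] if fields.key?(name)
    parent = crossref_target
    return nil unless parent
    parent.fields[name] || parent.fields[FIELD_ALIASES[name]]
  end

  # Permanently copy every resolvable missing field into this entry.
  def resolve_references
    parent = crossref_target
    return self unless parent
    (parent.fields.keys + FIELD_ALIASES.keys).each do |name|
      value = self[name]
      fields[name] = value if value && !fields.key?(name)
    end
    self
  end
end

bib     = SketchBibliography.new
book    = SketchEntry.new('essays2001', { title: 'Collected Essays', publisher: 'Example Press' })
chapter = SketchEntry.new('method2001', { title: 'On Method', crossref: 'essays2001' })
bib.add(book).add(chapter)

chapter.has_crossref? # => true
chapter[:booktitle]   # => "Collected Essays" (via the :booktitle => :title alias)
book.crossref?        # => true
```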

What do you think?

@inukshuk inukshuk added a commit that referenced this issue Sep 19, 2011
@inukshuk removed duplicate fields (#22)
it should not be necessary to duplicate title/booktitle entries in the parent entry
9762486
@inukshuk
Owner

I added example specs for the functionality described above (some are still pending); feel free to add additional examples and/or modify them!

@inukshuk
Owner

I've implemented the functionality suggested above (this should make your cucumber features pass). Could you review the changes and let me know if you have any suggestions or additions, or simply add any functionality you still require?

If these changes implement all your requirements, it would be awesome if you could briefly document the new features in the Readme with one or two usage examples.

@etc
Contributor
etc commented Sep 19, 2011

Thanks for your very rapid implementation of this! Here are some initial thoughts on the methods you have suggested. I'll think more about what else might be useful and get back to you.

  • We might want to distinguish between having a crossref and having a valid crossref, perhaps by implementing both #has_crossref? and #has_valid_crossref?
  • I prefer #is_crossreferenced? to #crossref?, as the latter is too ambiguous.
  • I prefer #crossreferenced_by to #referenced_by, since I think it is good to keep all crossref functionality with "crossref" or "crossreference" in the names. After all, this is software to deal with bibliographic references, so the term "reference" can mean myriad things here.
  • For the same reason, I like #populate_crossreferences rather than #populate_references (can be aliased to #populate_crossrefs).

What do you think?

@etc
Contributor
etc commented Sep 19, 2011

I've confirmed that your new commits make crossref.feature pass. (Though in order to see this, I have to move crossref.feature from features/issues to features before running cucumber).

@inukshuk
Owner

Really? This is strange; if invoked without arguments, cucumber should pick up all feature files in the features directory. If you run

$ bundle exec cucumber

it ought to work. Which version of Ruby do you have installed?

@etc
Contributor
etc commented Sep 19, 2011

Yes, that works. I was running cucumber crossref.feature, which I now understand is not the way to do it...

@inukshuk
Owner

One thing to keep in mind is that, because the BibTeX field is called 'crossref', we have the standard accessor #crossref (or entry[:crossref]) that will return a BibTeX::Value which is more or less a string. Apart from that we need consistent naming for the functionality above.

It's definitely useful to distinguish between valid and missing cross references. At the moment #has_crossref? returns true only for valid cross references, so we could implement #has_invalid_crossref? straight away as

has_field?(:crossref) && !has_crossref?

But I'm not convinced 'invalid' is the best name. Perhaps something like #crossref_missing??
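
A small sketch of the three cases this distinguishes (valid crossref, dangling crossref, no crossref). `SketchRef` and its `known_keys` member are hypothetical stand-ins for an entry and its Bibliography; only the one-line predicate is taken from the comment above, and `crossref_missing?` is just one of the names under discussion.

```ruby
# Hypothetical stand-in for an entry; `known_keys` plays the role of the
# Bibliography the crossref is resolved against.
SketchRef = Struct.new(:fields, :known_keys) do
  def has_field?(name)
    fields.key?(name)
  end

  # True only when the crossref names a key that actually exists.
  def has_crossref?
    has_field?(:crossref) && known_keys.include?(fields[:crossref])
  end

  # The predicate from the discussion: a crossref is set but cannot be resolved.
  def crossref_missing?
    has_field?(:crossref) && !has_crossref?
  end
end

keys     = ['essays2001']
valid    = SketchRef.new({ crossref: 'essays2001' }, keys)
dangling = SketchRef.new({ crossref: 'nowhere' }, keys)
plain    = SketchRef.new({ title: 'Standalone' }, keys)

valid.crossref_missing?    # => false
dangling.crossref_missing? # => true
plain.crossref_missing?    # => false
```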

Good point about the 'reference' being misleading; I would prefer 'cross_reference' over 'crossref' although that involves slightly more typing (that's why I originally shortened it to reference). In Ruby, you should typically not use 'is_' for predicates (e.g., Object#nil? not Object#is_nil?) – this is beneficial when using other libraries, such as RSpec, and allows you to use automatic matchers, such as:

 Entry.new.should_not be_a_cross_reference

Having said that, how about these names:

  • #has_cross_reference? – I'm not too happy with the possessive though. What do you call it if an entry contains a cross reference to another one?
  • #has_missing_cross_reference? or #cross_reference_missing?
  • #cross_reference? – as explained above, this would be a good predicate for entries which are cross referenced by other entries.
  • #cross_reference – this could return the element defined by the crossref. This is problematic though, because it seems to belong to the #cross_reference? predicate when in fact it is more closely related to #has_cross_reference?
  • #cross_referenced_by
  • #cross_referenced_field_names or #inherited_field_names
  • #populate_cross_referenced_fields or #populate_inherited_fields or #resolve_cross_referenced_fields_permanently

Some of these methods become awfully long ;-)

We should add the populate/resolve methods to the Bibliography as well; furthermore, we should add an option to all export methods for whether or not to include the referenced fields.

@etc
Contributor
etc commented Sep 22, 2011

I've given this a little more thought, and I share your worry that the names are getting too long. I do think "valid" is the right term though. So here is another proposal for the names to consider: instead of "crossreference" and "crossreferenced" we could simply use "parent" and "children". So, in the same order as your last comment, we could simplify to:

  • #child?
  • #valid_child?
  • #parent?
  • #parent
  • #children
  • #inherited_fields
  • #fill_inherited_fields

Your other suggestions are good ones. What do you think of these method names? Final call is of course yours!

@inukshuk
Owner

I've mulled this over and arrived at similar conclusions; I'm not sure about the parent/child diction yet, but I think focusing on viewing the fields as 'inherited' makes a lot of sense. I'll refactor the names when I have the time, and if these functions offer all the functionality you require, we're good to go. Thanks a lot for getting this into BibTeX-Ruby!

@inukshuk
Owner

Alright, I've opted for a parallel solution; basically, you can use the parent/child names or the longer cross_reference version. Here's a quick summary:

  • #has_parent? or #has_cross_reference?
  • #parent or #cross_reference – returns the actual Entry object
  • #parent_missing? or #cross_reference_missing? – true if you have a 'crossref' value but no corresponding entry in the Bibliography
  • #has_children? or #cross_referenced?
  • #children or #cross_referenced_by – these are based on a query; runtime is linear in the length of the Bibliography
  • #inherited_fields and #save_inherited_fields
  • #provides?(name) and #provide(name) – #provides? checks whether the entry can supply a value for name and #provide returns that value, possibly resolving aliases (e.g., #provide(:booktitle) might infer :booktitle from :title)

Generally, using Array accessors [] or the new alias #get on an Entry will return the field's value or try to resolve it through the parent (and the parent's aliases). You can use #inherits?(name) to check if a value is inherited (i.e., not defined locally and provided by the parent). I also threw in a #fetch(name, default) which works like Hash#fetch.
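
The lookup behavior described above might work along these lines. This is a self-contained sketch, not the gem's code: `InheritingEntry` and its `ALIASES` table are hypothetical, the parent is passed in directly instead of being resolved through a Bibliography, and `fetch` is simplified to return the default rather than raise as Hash#fetch does without one.

```ruby
# Hypothetical sketch of local-then-inherited field lookup.
class InheritingEntry
  # Requested field => field that can stand in for it when providing values.
  ALIASES = { booktitle: :title }.freeze

  attr_reader :fields, :parent

  def initialize(fields, parent = nil)
    @fields = fields
    @parent = parent
  end

  # Value this entry can supply for `name`, resolving aliases if needed.
  def provide(name)
    fields[name] || fields[ALIASES[name]]
  end

  def provides?(name)
    !provide(name).nil?
  end

  # Local value if defined; otherwise whatever the parent provides.
  def get(name)
    return fields[name] if fields.key?(name)
    parent && parent.provide(name)
  end
  alias_method :[], :get

  # True when the value is not defined locally but the parent provides it.
  def inherits?(name)
    !fields.key?(name) && !parent.nil? && parent.provides?(name)
  end

  # Simplified: falls back to `default` instead of raising KeyError.
  def fetch(name, default = nil)
    value = get(name)
    value.nil? ? default : value
  end
end

book    = InheritingEntry.new({ title: 'Collected Essays', year: '2001' })
chapter = InheritingEntry.new({ title: 'On Method' }, book)

chapter[:title]              # => "On Method" (defined locally)
chapter[:booktitle]          # => "Collected Essays" (parent's :title via alias)
chapter.inherits?(:year)     # => true
chapter.inherits?(:title)    # => false
chapter.fetch(:pages, 'n/a') # => "n/a"
```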

@etc
Contributor
etc commented Sep 22, 2011

Sounds good to me. Thanks again for your very quick implementation. I'm looking forward to cleaning up Maldini to take account of it.

@inukshuk
Owner

FYI, I've pushed a pre-release gem 2.0.0pre1 that includes crossref support. I'll have to take a look at Nesta and Maldini (looks very promising, plus two great players!)

@inukshuk inukshuk closed this Sep 25, 2011