Refactor Entry#to_rdf #83

Merged
merged 74 commits into inukshuk:master

3 participants

@tmaier

This pull request is a work in progress and related to #82.
Criticism, suggestions, or hints for the conversion of more fields are very welcome.

Todo

  • Add conversion for more fields
  • Add tests
tmaier added some commits
@tmaier tmaier Remove current Entry#to_rdf 16d151f
@tmaier tmaier Refactor Entry#to_rdf by introducing Entry::RDFConverter
Every conversion is in its own method. This allows proper testing of every conversion and keeps its scope clear.

* Cleaned up BIBO_TYPES
* Added support for more fields
ee044c7
@tmaier tmaier Update rdf gem dependency in Gemfile
There is no reason to limit ourselves to this old version.
44c7202
@tmaier tmaier Add #number bb697b0
@tmaier tmaier Add #note 4d73962
@inukshuk
Owner

Looks great so far!

One minor suggestion: I would add an instance method #convert! (or something similar) that goes through all the methods, instead of doing that in the constructor by default. I imagine this will give you more flexibility and make testing easier (you can create an instance and call individual converters, or change the source object in between, etc.).

Have you considered adding an RDF to BibTeX conversion as well? Do you think that would be useful?
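
To make the suggestion concrete, here is a minimal sketch of the pattern, assuming a plain Hash as the source object and another Hash in place of the RDF graph. `SketchConverter` and its method names are hypothetical and are not taken from the actual converter in this PR.

```ruby
# Hypothetical converter sketch; class and method names are illustrative,
# not the actual bibtex-ruby implementation.
class SketchConverter
  attr_reader :source, :output

  def initialize(source)
    @source = source # plain Hash standing in for an entry
    @output = {}     # plain Hash standing in for the RDF graph
  end

  # Runs every individual converter; nothing happens in the constructor.
  def convert!
    %i[title year].each { |name| send(name) }
    output
  end

  def title
    output[:title] = source[:title] if source.key?(:title)
  end

  def year
    output[:issued] = source[:year].to_s if source.key?(:year)
  end
end

converter = SketchConverter.new({ title: 'A Title', year: 2014 })
converter.title                  # a single converter can be exercised alone
puts converter.output.inspect
puts converter.convert!.inspect  # or everything at once
```

Because the constructor does no work, a test can build an instance, call one converter method, and inspect the result in isolation.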

@tmaier

The #convert! thing is exactly what I thought about when I woke up today :)

Yes, I will definitely write an RDF to BibTeX conversion. I just wonder whether bibtex-ruby is the right place for it and, if so, where exactly.

One more thing: I would like to avoid data loss when converting to RDF. Theoretically, it would be possible to mix BibTeXML into the RDF for fields which cannot be represented as expected. What do you think?

@inukshuk
Owner

Unfortunately, I have too little practical experience using RDF to have an informed opinion on field mappings or mixing in BibTeXML. Perhaps you could ask @bdarcus for advice?

One more thing I noticed: please make sure to require the rdf gem only the first time that to_rdf is called — otherwise users who do not use RDF will be confused by the load error message.
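
A minimal sketch of that kind of on-demand loading, assuming a hypothetical helper named `require_on_demand` and a stand-in class `SketchEntry`. The message wording is an assumption, not what the library prints.

```ruby
# Hypothetical helper: loads a library on first use and explains what to do
# if it is missing. The helper name and message wording are assumptions.
def require_on_demand(library, feature)
  require library
  true
rescue LoadError
  raise LoadError, "#{feature} needs the '#{library}' library; please install it first."
end

class SketchEntry
  def to_rdf
    # The require only runs when to_rdf is actually called, so users who
    # never export RDF never see a load error.
    require_on_demand('rdf', 'SketchEntry#to_rdf')
    # ... build and return the graph here ...
  end
end
```

Calling `require` repeatedly is cheap, since Ruby skips libraries that are already loaded.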

@bdarcus

RDF is a general data model. BibTeX is a very specific one. So it doesn't make sense to talk about converting from RDF, unless you talk about specific vocabularies. I'd leave it out.

@tmaier

Yeah, this is the reason why I'm not sure if bibtex-ruby is the right place for it.

But coming back to adding BibTeXML to the RDF: is this OK (as an addition to BIBO)?

@bdarcus
tmaier added some commits
@tmaier tmaier Rename #output to #convert!. Move conversion call to #convert! 9d5cab4
@tmaier tmaier Fix spelling in comment f493c37
@tmaier tmaier Only show load error message when Entry#to_rdf is called 5ef44e3
@tmaier tmaier Add #remove_from_fallback and call where appropriate 196be4b
@tmaier tmaier Refactor Entry#to_xml by introducing Entry::BibTeXMLConverter c5837b4
@tmaier tmaier Fix #convert! f76549f
@tmaier tmaier Add BibTeXML fallback
The fields are added directly to the entry.
BibTeXML would usually require a <bibtex:entry> tag followed by e.g. <bibtex:book>. <bibtex:book> would then contain the fields.
As neither <bibtex:entry> nor <bibtex:book> is a proper predicate, I omitted them for now.
5d6b2f2
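
The fallback idea (together with #remove_from_fallback from 196be4b) can be pictured roughly as follows. This is only a sketch under assumptions: a Hash stands in for the entry, an Array of pairs for the graph, and the `dc:` and `bibtex:` strings are placeholder predicates rather than real vocabulary terms from the converter.

```ruby
# Illustrative only: a Hash stands in for the entry and an Array of
# [predicate, value] pairs for the graph. Predicate strings are placeholders.
class FallbackSketch
  attr_reader :statements

  def initialize(fields)
    @fields = fields
    @fallback = fields.keys # every field starts out unhandled
    @statements = []
  end

  def convert!
    title
    fallback
    statements
  end

  private

  def title
    return unless @fields.key?(:title)

    remove_from_fallback(:title)
    statements << ['dc:title', @fields[:title]]
  end

  # Marks fields as handled so they are not emitted a second time.
  def remove_from_fallback(*names)
    @fallback -= names
  end

  # Emits whatever no dedicated converter claimed, directly on the entry.
  def fallback
    @fallback.each { |name| statements << ["bibtex:#{name}", @fields[name]] }
  end
end

p FallbackSketch.new({ title: 'A Title', annote: 'my remark' }).convert!
```

Fields with a dedicated converter appear once under their mapped predicate; everything else survives under the fallback prefix, which is how data loss is avoided.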
@tmaier tmaier Add support for DC type f5c06a6
@tmaier tmaier Add support for DC source a023717
@tmaier tmaier Add more BIBO types 8e7cd94
@tmaier tmaier Add support for pagination in DC source 7db8d02
@tmaier tmaier Add more thesis types db399c5
@tmaier tmaier Add DC identifiers b3cea2c
@tmaier tmaier Add bibo publisher 9d1de79
@tmaier tmaier Add DC isPartOf 65cfe13
@tmaier tmaier Add keywords a1fd92d
@tmaier tmaier Refactor Entry#to_citeproc by introducing Entry::CiteProcConverter 0116c42
@tmaier tmaier Add #chapter 0aae4c0
@tmaier tmaier Add BIBO shortTitle fbdd9b6
@tmaier tmaier Add parent 5da5528
@tmaier tmaier Add series
Will only be added if it does not have the same :title, :series or :issn as the parent
9c210bf
@tmaier tmaier Add booktitle 9ed2079
@tmaier tmaier Add children 32fcebf
@tmaier tmaier Add organization fd26ab1
@tmaier tmaier Refactor #author and #editor using #agent and #create_agent c7efc9d
@tmaier tmaier Refactor #organization using #agent and #create_agent a04f15f
@tmaier tmaier Refactor #publisher. Add fallback to :organisation or :school
Using #agent and #create_agent
791c964
@tmaier tmaier Fix syntax error d61cb9e
@tmaier tmaier Add institution 60cd6bf
@tmaier tmaier Add school 4501f0d
@tmaier tmaier Add DC isPartOf for #journal 59956b1
@tmaier tmaier Build proper nodes for #booktitle and #series acd8b45
@tmaier tmaier Replace event:hasAgent with bibo:organizer in #organization 8d15b89
@tmaier tmaier Only add bibo:shortTitle if there is a subtitle 9bac2fe
@tmaier tmaier Add bibo:abstract to #abstract fc837f3
@tmaier tmaier Add #copyright 590a6d1
@tmaier tmaier Add #location e76e169
@tmaier tmaier Add #lccn 86e5589
@tmaier tmaier Add #url 031fa97
@tmaier tmaier Add #volumes b446b46
@tmaier tmaier Add #translator 055fa34
@tmaier tmaier Add standard to BIBO_TYPES b802ea8
@tmaier tmaier Add #pagetotal 0976da1
@tmaier tmaier Introduce #bibo_class
bibtex[:type], if it exists, is expected to carry more accurate information about the type than bibtex.type. Thus we prioritize it over the latter.
5f02d8a
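
A small sketch of that priority rule, assuming a made-up excerpt of the type table. `SKETCH_TYPES` and `sketch_bibo_class` are hypothetical names, and the `:Document` default is an assumption for illustration.

```ruby
# Hypothetical mapping excerpt; the real BIBO_TYPES table is larger and its
# exact contents are not shown in this thread.
SKETCH_TYPES = {
  'book' => :Book,
  'article' => :Article,
  'phdthesis' => :Thesis
}.freeze

# entry_type stands in for bibtex.type, fields[:type] for bibtex[:type].
def sketch_bibo_class(entry_type, fields)
  explicit = SKETCH_TYPES[fields[:type].to_s.downcase]
  # The explicit :type field wins; otherwise fall back to the entry type.
  explicit || SKETCH_TYPES.fetch(entry_type.to_s.downcase, :Document)
end

p sketch_bibo_class(:misc, { type: 'Book' })
p sketch_bibo_class(:article, {})
p sketch_bibo_class(:unknown, {})
```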
@tmaier tmaier Sort BIBO_TYPES alphabetically 681d9d0
@tmaier tmaier Add #howpublished 9adeb44
@tmaier tmaier Refactor fallback 73b43a2
@tmaier tmaier Fix #series 374dfaa
@tmaier tmaier Parse names and month before #convert! b0a8d02
@tmaier tmaier Fix some errors f1133b3
@tmaier tmaier Add basic tests 7b65c36
@tmaier

Almost done.

I didn't come up with a good way to test the graph. I would appreciate some ideas.

@inukshuk
Owner

Great job! Thanks also for moving the converters to separate files and tidying up.

What's your own use case for the RDF export? In my experience this usually gives you a good test case (e.g., some expectation you have on the graph object) if it's possible to strip it down to the bare minimum. Any ideas?

Otherwise just let me know when you want this merged; I'll also push this out to rubygems afterwards.

tmaier added some commits
@tmaier tmaier Add graph to initializer 9e80a54
@tmaier tmaier Remove :crossref from fallback db81a0c
@tmaier tmaier Use Entry#identifier in #children and #parent a09dfef
@tmaier tmaier Add agent to initializer
The key of #agent could be a BibTeX::Name, which would always be a different object. This is why we use key#to_hash as the key.
a30f327
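
The keying problem can be reproduced without the library. In this sketch `SketchName` is a stand-in for a parsed name object and `AgentCache` is a hypothetical cache; neither mirrors the real classes.

```ruby
# SketchName is a stand-in for a parsed name object; it deliberately does not
# define #hash or #eql?, so two equal-looking instances are distinct Hash keys.
class SketchName
  def initialize(first, last)
    @first = first
    @last = last
  end

  def to_hash
    { first: @first, last: @last }
  end
end

class AgentCache
  def initialize
    @agents = {}
    @counter = 0
  end

  # Returns the same node for names with the same content, because Hash keys
  # that are themselves Hashes are compared by value.
  def agent(name)
    @agents[name.to_hash] ||= create_agent
  end

  def size
    @agents.size
  end

  private

  def create_agent
    @counter += 1
    "_:agent#{@counter}"
  end
end

cache = AgentCache.new
a = cache.agent(SketchName.new('Ada', 'Lovelace'))
b = cache.agent(SketchName.new('Ada', 'Lovelace'))
puts a == b
puts cache.size
```

Keying on the object itself would create a second agent for the second `SketchName`, even though both describe the same person.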
@tmaier tmaier Fix calls to #agent 41afdcf
@tmaier tmaier Fix #parent and #children
As we did not check whether an Entry was already in our graph, the converter would run indefinitely.
b3e05f6
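
The fix amounts to a visited check before following parent and child links. A minimal sketch, assuming a hypothetical `collect_entries` helper and a Hash of links in place of real entries:

```ruby
require 'set'

# Hypothetical walker: `links` maps an entry key to related keys (parent or
# children); `seen` plays the role of "already in our graph".
def collect_entries(key, links, seen = Set.new)
  return seen if seen.include?(key) # guard: stop when already visited

  seen << key
  links.fetch(key, []).each { |other| collect_entries(other, links, seen) }
  seen
end

links = {
  'book' => %w[chapter1 chapter2], # parent -> children
  'chapter1' => ['book'],          # child -> parent (closes the cycle)
  'chapter2' => ['book']
}
p collect_entries('chapter1', links).to_a
```

Without the guard on the first line of the method, the child would convert its parent, which would convert the child again, and so on.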
@tmaier tmaier Fix Entry#contained?
Some applications, like BibDesk, set the title and the booktitle to the same value. This would lead to a false positive when calling #contained?
171983f
@tmaier tmaier Skip #booktitle if title == booktitle bc58702
@tmaier tmaier Fix #year
Output was "2014-feb". Is now "2014-2"
d605dda
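
The change in output format can be sketched like this. The month table and the `issued_value` helper are illustrative names, and the handling of a missing month is an assumption.

```ruby
# Illustrative month table; names and helper are hypothetical.
MONTH_NUMBERS = %w[jan feb mar apr may jun jul aug sep oct nov dec]
                .each_with_index.map { |name, index| [name, index + 1] }.to_h.freeze

# Builds "year-month" with a numeric month instead of the BibTeX abbreviation.
def issued_value(year, month = nil)
  number = MONTH_NUMBERS[month.to_s.downcase[0, 3]]
  number ? "#{year}-#{number}" : year.to_s
end

puts issued_value(2014, 'feb')
puts issued_value(2014)
```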
@tmaier tmaier Remove bibo:created and bibo:issued from #year
They do not exist. (Links to Dublin Core)
2498a64
@tmaier tmaier Use DC.issued only in #year
bibo says about DC.issued: "Used to describe the issue date of a bibliographic resource"
whereas it says about DC.created "Used to describe the creation date of a bibliographic item".
12ba937
@tmaier tmaier Add #date_added 0609b3a
@tmaier tmaier Remove some bibtex fields from the fallback by default d17eb03
@tmaier tmaier Bibliography#to_rdf
Every entry was included multiple times in the returned graph, because the individual entry conversions did not know about each other when they created their children and parents.

The code style follows Entry#to_rdf
d6c6dc5
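
One way to picture the shared-graph approach, as a sketch only: a Set of triples stands in for the RDF graph, and `SharedGraphSketch` is a hypothetical class. The real converter works on an RDF graph object and its deduplication details may differ.

```ruby
require 'set'

# Stand-in converter: the "graph" is a Set of [subject, predicate, object]
# triples, so inserting the same triple twice has no effect.
class SharedGraphSketch
  def initialize(entry, graph = Set.new)
    @entry = entry
    @graph = graph
  end

  def convert!
    @graph << [@entry[:key], 'dc:title', @entry[:title]]
    parent = @entry[:parent]
    # The parent is converted into the same graph, not a fresh one.
    SharedGraphSketch.new(parent, @graph).convert! if parent
    @graph
  end
end

book = { key: 'book', title: 'Collected Works' }
entries = [
  book,
  { key: 'ch1', title: 'Chapter One', parent: book },
  { key: 'ch2', title: 'Chapter Two', parent: book }
]

# Bibliography-level conversion: every entry writes into one shared graph.
graph = Set.new
entries.each { |entry| SharedGraphSketch.new(entry, graph).convert! }
puts graph.size
```

Here the parent is reached three times (once directly, twice through its chapters) but ends up in the shared graph only once.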
@tmaier tmaier Improve code style 3ab4582
@tmaier tmaier Add missing basic tests a183500
@tmaier tmaier Add syntax highlighting for XML example in README cc3ba53
@tmaier

Alright. I consider it done for now.

The only thing that should be highlighted in the README is that one needs to convert the LaTeX code to UTF-8 before exporting to RDF or anything else:

convert(:latex).to_rdf

Please have an extra look at 171983f. This change is not really related to this pull request. If you're unhappy with this, please let me know. I would remove it from here and open a new PR just for this change.

@inukshuk inukshuk merged commit 153c3cb into inukshuk:master

1 check passed: the Travis CI build passed
@inukshuk
Owner

That's fine. Landed in 3.1.0. Cheers!

@tmaier tmaier referenced this pull request
Closed

RDF Export #51

Commits by date

  • Jan 17, 2014: 5 commits
  • Jan 18, 2014: 3 commits
  • Jan 19, 2014: 48 commits
  • Jan 20, 2014: 18 commits