Request to support bibtex import #329

Closed
ronaldtse opened this issue Dec 10, 2021 · 16 comments
@ronaldtse
Contributor

ronaldtse commented Dec 10, 2021

From @cportele and @gbuehler

  • How can I activate support for bibtex citations?
    Not using bibtex in the future is no problem for me, I had only used it because this is what the OGC templates did.

That said, we already implement BibTeX support in Relaton and we have previously developed the asciidoctor-bibliography plugin.

It shouldn't be difficult to re-implement that support in Metanorma. At least we should let people have a way to migrate BibTeX files into either Relaton files or Metanorma ADoc.

And of course, we already have that ticket here: metanorma/metanorma-standoc#319.

@ronaldtse
Contributor Author

From @ghobona:

To assist editors of OGC documents with creating and managing bibliographies when authoring documents in metanorma-asciidoc, a python script is provided (temporarily) in the bibliography_generation branch of the templates repo.

https://github.com/opengeospatial/templates/tree/bibliography_generation/bibliography_management

The script does the following three things that are currently not supported by metanorma:

  • The script reads in BibTeX files. Many research journals publish citation information in bibtex format, so being able to copy and paste a bibtex record saves editors time.
  • The script organises the references in the bibliography in order of appearance. This is one of the most time consuming steps when done manually, so the script saves time.
  • The script renders the bibliography according to the LNCS style. This saves editors time and reduces the likelihood of error when formatting the bibliography.

Ideally, in the future, the same functionality would be provided directly in metanorma.
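
As a rough illustration of the second point above (ordering references by first appearance), here is a minimal Python sketch. It is not the script from the templates repo. The `<<key>>` citation syntax, the function name `order_by_first_citation`, and the sample text are assumptions made for illustration only.

```python
import re

# Assumed citation syntax: <<key>> or <<key,locality>>.
CITATION = re.compile(r"<<([^,>]+)[^>]*>>")

def order_by_first_citation(text, keys):
    """Return keys sorted by where each is first cited; uncited keys go last."""
    first_seen = {}
    for match in CITATION.finditer(text):
        key = match.group(1).strip()
        if key in keys and key not in first_seen:
            first_seen[key] = match.start()
    cited = sorted(first_seen, key=first_seen.get)
    uncited = [key for key in keys if key not in first_seen]
    return cited + uncited

sample = "See <<Pross2018>> and <<VanZyl2009,page 16>>; again <<Pross2018>>."
print(order_by_first_citation(sample, ["VanZyl2009", "Pross2018", "OGCTechTrends2018"]))
```

Uncited entries are kept at the end in their original order rather than dropped, so nothing silently disappears from the bibliography.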

@opoudjis
Contributor

opoudjis commented Sep 4, 2023

@ronaldtse do we consider this ticket implemented?

@ghobona

ghobona commented Nov 6, 2023

I verified that a bibtex file can indeed be imported. The screenshot is below.

Screenshot 2023-11-06 at 12 56 30

The bibtex file contained the following content.

@article{VanZyl2009,
abstract = {Global Earth Observing System of Systems (GEOSS) presents a great challenge of System of Systems integration across organisational and political boundaries. One existing paradigm that can address the scale of the challenge is that of the Sensor Web. In this paradigm, the internet is evolving into an active, macro sensing instrument, capable of drawing sensory data from around the globe to the fingertips of individuals. The Sensor Web will support scientific research and facilitate transparent political decision making. This article presents some of the technologies explored and activities engaged in by the GEOSS Sensor Web community, towards achieving GEOSS goals. Keywords:},
author = {van Zyl, Terence and Simonis, Ingo and McFerren, Graeme},
doi = {10.1080/17538940802439549},
file = {:Users/isi/Library/Application Support/Mendeley Desktop/Downloaded/van Zyl, Simonis, McFerren - 2009 - The Sensor Web systems of sensor systems.pdf:pdf},
isbn = {1753894080243},
issn = {1753-8947},
journal = {International Journal of Digital Earth},
keywords = {Sensor Web,data acquisition,digital earth architecture,earth observation,systems of systems},
month = {mar},
number = {1},
pages = {16--30},
title = {{The Sensor Web: Systems of Sensor Systems}},
howpublished = "\publisher{Taylor {\&} Francis},\url{http://www.tandfonline.com/doi/abs/10.1080/17538940802439549}",
volume = {2},
year = {2009}
}

@techreport{Pross2018,
abstract = {This Engineering Report (ER) addresses the development of a consistent, flexible, adaptable workflow that will run behind the scenes.},
keywords = {workflow, testbed, bpmn, wps},
author = {Benjamin Pross and Christoph Stasch},
title = {OGC Testbed-13: Workflows Engineering Report},
number = {OGC 17-029r1},
howpublished = "\publisher{Open Geospatial Consortium},\url{http://docs.opengeospatial.org/per/17-029r1.html}",
month = {1},
year = {2018}
}

@URL{OGCTechTrends2018,
  title={OGC Technology Trends},
  author={{Open Geospatial Consortium}},
  url={https://github.com/opengeospatial/OGC-Technology-Trends},
  year={2018}
}

I followed the instructions described in this commit.

As shown in the screenshot above, although the bibtex file is indeed imported, the reference is not correctly formatted. There shouldn't be any curly brackets displayed around the title of the article.

@opoudjis opoudjis self-assigned this Nov 6, 2023
@opoudjis opoudjis added the bug Something isn't working label Nov 6, 2023
@opoudjis opoudjis assigned andrew2net and unassigned opoudjis Nov 12, 2023
@opoudjis
Contributor

opoudjis commented Nov 12, 2023

Bug with BibTeX import.

@andrew2net Please use the attached files to replicate this.

Archive.zip

The BibTeX file:

@article{VanZyl2009,
abstract = {Global Earth Observing System of Systems (GEOSS) presents a great challenge of System of Systems integration across organisational and political boundaries. One existing paradigm that can address the scale of the challenge is that of the Sensor Web. In this paradigm, the internet is evolving into an active, macro sensing instrument, capable of drawing sensory data from around the globe to the fingertips of individuals. The Sensor Web will support scientific research and facilitate transparent political decision making. This article presents some of the technologies explored and activities engaged in by the GEOSS Sensor Web community, towards achieving GEOSS goals. Keywords:},
author = {van Zyl, Terence and Simonis, Ingo and McFerren, Graeme},
doi = {10.1080/17538940802439549},
file = {:Users/isi/Library/Application Support/Mendeley Desktop/Downloaded/van Zyl, Simonis, McFerren - 2009 - The Sensor Web systems of sensor systems.pdf:pdf},
isbn = {1753894080243},
issn = {1753-8947},
journal = {International Journal of Digital Earth},
keywords = {Sensor Web,data acquisition,digital earth architecture,earth observation,systems of systems},
month = {mar},
number = {1},
pages = {16--30},
title = {{The Sensor Web: Systems of Sensor Systems}},
howpublished = "\publisher{Taylor {\&} Francis},\url{http://www.tandfonline.com/doi/abs/10.1080/17538940802439549}",
volume = {2},
year = {2009}
}

@techreport{Pross2018,
abstract = {This Engineering Report (ER) addresses the development of a consistent, flexible, adaptable workflow that will run behind the scenes.},
keywords = {workflow, testbed, bpmn, wps},
author = {Benjamin Pross and Christoph Stasch},
title = {OGC Testbed-13: Workflows Engineering Report},
number = {OGC 17-029r1},
howpublished = "\publisher{Open Geospatial Consortium},\url{http://docs.opengeospatial.org/per/17-029r1.html}",
month = {1},
year = {2018}
}

is importing into Relaton with the following results:

<bibitem id="in" type="article" schema-version="v1.2.4">
  <title type="main" format="text/plain">{The Sensor Web: Systems of Sensor Systems}</title>
  <uri type="doi">10.1080/17538940802439549</uri>
  <docidentifier type="isbn">1753894080243</docidentifier>
  <docidentifier type="issn">1753-8947</docidentifier>
  <date type="published">
    <on>2009-03-01</on>
  </date>
  <contributor>
    <role type="author"/>
    <person>
      <name>
        <forename>Terence</forename>
        <surname>van Zyl</surname>
      </name>
    </person>
  </contributor>
  <contributor>
    <role type="author"/>
    <person>
      <name>
        <forename>Ingo</forename>
        <surname>Simonis</surname>
      </name>
    </person>
  </contributor>
  <contributor>
    <role type="author"/>
    <person>
      <name>
        <forename>Graeme</forename>
        <surname>McFerren</surname>
      </name>
    </person>
  </contributor>
  <note type="howpublished">\publisher{Taylor {\&amp;} Francis},\url{http://www.tandfonline.com/doi/abs/10.1080/17538940802439549}</note>
  <series type="journal">
    <title format="text/plain">International Journal of Digital Earth</title>
    <number>1</number>
  </series>
  <extent type="page">
    <referenceFrom>16</referenceFrom>
  </extent>
  <extent type="volume">
    <referenceFrom>2</referenceFrom>
  </extent>
  <keyword>Sensor Web,data acquisition,digital earth architecture,earth observation,systems of systems</keyword>
</bibitem>

The following issues arise:

  • The curly brackets around the title need to be removed.
  • keyword should be treated as comma delimited (actually, CSV parsed, so that "a, b" is parsed as a single token).
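
To make the CSV point concrete, here is a small Python sketch of the intended keyword splitting. It only illustrates the behaviour and is not the Relaton implementation (which is Ruby); the function name `split_keywords` is hypothetical.

```python
import csv

def split_keywords(raw):
    """Split a BibTeX keywords value with CSV rules, so quoted commas survive."""
    # skipinitialspace lets '"a, b", c' be read as two fields: 'a, b' and 'c'.
    reader = csv.reader([raw], skipinitialspace=True)
    fields = next(reader, [])
    return [field.strip() for field in fields if field.strip()]

print(split_keywords("Sensor Web,data acquisition,digital earth architecture"))
print(split_keywords('"a, b", c'))
```

A plain `raw.split(",")` would break the quoted `"a, b"` token in two, which is the case the CSV parse is meant to avoid.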

howpublished is correctly treated as opaque by Relaton, but given that standards are being forced to use howpublished (https://tex.stackexchange.com/questions/470113/biblatex-how-to-ensure-to-also-print-publisher-field-when-using-misc-type), I wonder if we can attempt parsing the content of howpublished in case it is bibtex, anyway. @ghobona Is this treatment of howpublished common in your experience?

@andrew2net
Contributor

  • The curly brackets around the title need to be removed.

@opoudjis we already remove curly brackets around titles, but the title has doubled curly brackets:

...
title = {{The Sensor Web: Systems of Sensor Systems}},
...

Is the title correct? Do we really need to remove inner brackets?

@opoudjis
Contributor

Yes, because it turns out double brackets have a distinct function in BibTeX:

https://tex.stackexchange.com/questions/294870/different-behavior-of-double-curly-braces-from-single-curly-braces-and-quotation

From what I'm seeing, braces can be nested indefinitely in BibTeX, and BibTeX processors are meant to respect case in deeper-nested braces.

This is also mentioned (very briefly) in https://www.bibtex.org/Format/

This is news to me too :(

Please have a look at

https://mirror.cse.unsw.edu.au/pub/CTAN/biblio/bibtex/contrib/doc/btxFAQ.pdf
https://bibtex.eu

In case there are other hidden bad surprises like that.
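
Since braces can nest to any depth, stripping only one outer pair is not enough for plain-text output. A minimal Python sketch of removing grouping braces at every level follows. It is an illustration only, not the relaton-bib code; the function name `strip_braces` is hypothetical, and it assumes the case-protection role of the braces no longer matters once the title is stored as plain text.

```python
def strip_braces(value):
    """Drop BibTeX grouping braces at any depth; keep backslash escapes intact."""
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # Copy the backslash and the escaped character together,
            # so an escaped brace or ampersand is never mistaken for grouping.
            out.append(value[i:i + 2])
            i += 2
            continue
        if ch not in "{}":
            out.append(ch)
        i += 1
    return "".join(out)

# The title value after the outer field delimiters have been removed:
print(strip_braces("{The Sensor Web: Systems of Sensor Systems}"))
```

The same function handles braces used mid-title to protect individual words or letters, because it does not depend on where in the string the braces occur.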

@andrew2net
Contributor

@opoudjis @ghobona I found that the howpublished field can only be used in @misc and @booklet BibTeX entries https://www.bibtex.com/f/howpublished-field/
Other BibTeX entries can use the note field for the same purpose https://www.bibtex.com/f/note-field/

Anyway, the bibtex-ruby gem doesn't parse the howpublished or note content. So we need to create our own parser.

@ghobona

ghobona commented Nov 16, 2023

@andrew2net In the past we used howpublished for getting the url, publisher, and number (e.g. OGC 12-345) to appear in the bibliography. This was to address an apparent (past) limitation of the utility that we were using at the time.

Given that metanorma handles references to OGC Standards centrally, it is not necessary to implement a new parser for howpublished.

Supporting the properties listed at https://www.bibtex.com/e/entry-types/ would be sufficient.

andrew2net added a commit to relaton/relaton-bib that referenced this issue Nov 17, 2023
@andrew2net
Contributor

@opoudjis @ghobona fixed in relaton-bib v1.16.5. The XML output now is:

<bibitem id="VanZyl2009" type="article" schema-version="v1.2.5">
  <title type="main" format="text/plain">The Sensor Web: Systems of Sensor Systems</title>
  <uri type="doi">10.1080/17538940802439549</uri>
  <docidentifier type="isbn">1753894080243</docidentifier>
  <docidentifier type="issn">1753-8947</docidentifier>
  <date type="published">
    <on>2009-03-01</on>
  </date>
  <contributor>
    <role type="author"/>
    <person>
      <name>
        <forename>Terence</forename>
        <surname>van Zyl</surname>
      </name>
    </person>
  </contributor>
  <contributor>
    <role type="author"/>
    <person>
      <name>
        <forename>Ingo</forename>
        <surname>Simonis</surname>
      </name>
    </person>
  </contributor>
  <contributor>
    <role type="author"/>
    <person>
      <name>
        <forename>Graeme</forename>
        <surname>McFerren</surname>
      </name>
    </person>
  </contributor>
  <note type="howpublished">\publisher{Taylor {\&amp;} Francis},\url{http://www.tandfonline.com/doi/abs/10.1080/17538940802439549}</note>
  <series type="journal">
    <title format="text/plain">International Journal of Digital Earth</title>
    <number>1</number>
  </series>
  <extent type="page">
    <referenceFrom>16</referenceFrom>
    <referenceTo/>
  </extent>
  <extent type="volume">
    <referenceFrom>2</referenceFrom>
  </extent>
  <keyword>Sensor Web</keyword>
  <keyword>data acquisition</keyword>
  <keyword>digital earth architecture</keyword>
  <keyword>earth observation</keyword>
  <keyword>systems of systems</keyword>
</bibitem>

@ronaldtse
Contributor Author

@andrew2net there is a problem with the publisher: shouldn't it be parsed?

  <note type="howpublished">\publisher{Taylor {\&amp;} Francis},\url{http://www.tandfonline.com/doi/abs/10.1080/17538940802439549}</note>

@ronaldtse
Contributor Author

The issue with howpublished is that it is only used in BibTeX's misc type, and it is meant to give textual content. It's a "hack" to render data fields that are not supported in misc.

In modern BibTeX, the URL and the publisher are normally given in their own fields (url = ..., publisher = ...) rather than as \url{...} and \publisher{...} inside howpublished.

The problem is that the publisher = ... field is not used in @techreport:

Screenshot 2023-11-17 at 12 11 14 PM

And the url = ... field is only used in select styles:

I think what we can do here is to update the relaton-bibtex importer to parse howpublished and extract the publisher and url content. What do you think @andrew2net ?

@andrew2net
Contributor

I think what we can do here is to update the relaton-bibtex importer to parse howpublished and extract the publisher and url content. What do you think @andrew2net ?

@ronaldtse I was going to implement the parser, but @ghobona said that we don't need it

Given that metanorma handles references to OGC Standards centrally, it is not necessary to implement a new parser for howpublished.

Supporting the properties listed at https://www.bibtex.com/e/entry-types/ would be sufficient.

@ronaldtse
Contributor Author

@andrew2net while OGC standards don't need to be parsed this way, the other BibTeX entries like:

@article{VanZyl2009,
abstract = {Global Earth Observing System of Systems (GEOSS) presents a great challenge of System of Systems integration across organisational and political boundaries. One existing paradigm that can address the scale of the challenge is that of the Sensor Web. In this paradigm, the internet is evolving into an active, macro sensing instrument, capable of drawing sensory data from around the globe to the fingertips of individuals. The Sensor Web will support scientific research and facilitate transparent political decision making. This article presents some of the technologies explored and activities engaged in by the GEOSS Sensor Web community, towards achieving GEOSS goals. Keywords:},
author = {van Zyl, Terence and Simonis, Ingo and McFerren, Graeme},
doi = {10.1080/17538940802439549},
file = {:Users/isi/Library/Application Support/Mendeley Desktop/Downloaded/van Zyl, Simonis, McFerren - 2009 - The Sensor Web systems of sensor systems.pdf:pdf},
isbn = {1753894080243},
issn = {1753-8947},
journal = {International Journal of Digital Earth},
keywords = {Sensor Web,data acquisition,digital earth architecture,earth observation,systems of systems},
month = {mar},
number = {1},
pages = {16--30},
title = {{The Sensor Web: Systems of Sensor Systems}},
howpublished = "\publisher{Taylor {\&} Francis},\url{http://www.tandfonline.com/doi/abs/10.1080/17538940802439549}",
volume = {2},
year = {2009}
}

have howpublished fields that need to be parsed.

@andrew2net
Contributor

Have howpublished that need to be parsed.

@ronaldtse so the "Taylor & Francis" is a contributor[@type='publisher']/organization and the URL is a uri[@type='src'] in the Relaton data model, right?

@ronaldtse
Contributor Author

I believe so.
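
For readers following along, the extraction discussed above can be sketched in Python roughly as follows. This is an assumption-laden illustration, not the code that went into relaton-bib: the function names `extract_commands` and `clean` are hypothetical, and the sketch only handles values made of `\name{...}` groups like the one in this thread.

```python
import re

COMMAND = re.compile(r"\\([A-Za-z]+)\{")

def extract_commands(value):
    """Map each TeX-style command name to its braced argument, honouring nesting."""
    found = {}
    pos = 0
    while True:
        match = COMMAND.search(value, pos)
        if match is None:
            break
        depth = 1
        i = match.end()
        while i < len(value) and depth > 0:
            if value[i] == "\\" and i + 1 < len(value):
                i += 2  # skip an escaped character such as an escaped ampersand
                continue
            if value[i] == "{":
                depth += 1
            elif value[i] == "}":
                depth -= 1
            i += 1
        if depth != 0:
            break  # unbalanced braces: stop rather than guess
        found[match.group(1)] = value[match.end():i - 1]
        pos = i
    return found

def clean(text):
    """Drop grouping braces, then unescape common TeX specials."""
    text = re.sub(r"(?<!\\)[{}]", "", text)
    return re.sub(r"\\([&%$#_{}])", r"\1", text)

raw = r"\publisher{Taylor {\&} Francis},\url{http://www.tandfonline.com/doi/abs/10.1080/17538940802439549}"
parsed = extract_commands(raw)
print(clean(parsed["publisher"]))
print(parsed["url"])
```

The two extracted values would then map to the publisher contributor and the `src` URI mentioned above; anything in `howpublished` that does not match this pattern is simply left out of the result and could still be kept as an opaque note.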

andrew2net added a commit to relaton/relaton-bib that referenced this issue Dec 7, 2023
@andrew2net
Contributor

Have howpublished that need to be parsed.

Implemented in v1.17.2
