OSIS export does not provide options to choose which elements to milestone (but always milestones verse elements) #8

Closed
j2l opened this issue Jun 15, 2017 · 5 comments

@j2l

j2l commented Jun 15, 2017

Hi,

First of all, thank you very much for this tool; a truly open source Bible tool is a breath of fresh air!

My first attempt at converting Zefania XML to OSIS produced an XML file, but the format looks incorrect. Here are the details:

The Zefania XML format for a verse is:
<VERS vnumber="1">Au commencement, Dieu créa les cieux et la terre.</VERS>

The result from BibleMultiConverter is:
<verse osisID="Gen.1.1" sID="Gen.1.1"/>Au commencement, Dieu créa les cieux et la terre.<verse eID="Gen.1.1"/>
You can see that the tags are self-closing before and after the verse, the double quote (") is kept, and the attributes are duplicated before and after. The result should be something like:
<verse osisID='Gen.1.1'>Au commencement, Dieu créa les cieux et la terre.</verse>

Additionally, the "work/title" tag is written in the header, but the header work tags that follow it are written to "a new first div", in place of Genesis.

Is it possible to fix this, or could you point me to the XSL template so I can fix it myself?
Thank you very much, God bless you!

@schierlm
Owner

Hello Phil,

thank you for your feedback.

Actually, the fact that the verses are self closing and have a "sID", and there is a duplicate tag at the end with an "eID" is a feature of OSIS, called OSIS milestones: https://www.crosswire.org/wiki/OSIS_Bibles#OSIS_Milestones

According to the specification, there are several kinds of elements (chapter, verse, paragraph, line group, quote) that can be milestoned (but if a tag is milestoned, all occurrences of the same tag have to be milestoned).

The OSIS importer can (I believe) handle all cases of milestones; however, the exporter is currently limited to a single format (verses are milestoned, everything else is not).

The background for creating milestoned elements is that logical content (like verses) can span physical content (e. g. a verse can start in the middle of a paragraph and end in the middle of a line group, or a quote can start in the middle of one verse and end in the middle of another).
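To illustrate the overlap problem described above, here is a minimal Python sketch (not part of BibleMultiConverter). The verse text is made up for illustration, and the snippets omit the OSIS namespace for brevity. It checks that a quote spanning two verses stays well-formed when the verses are milestoned, but not when the verses are written as containers:

```python
import xml.etree.ElementTree as ET

# Illustrative content only: a quote that starts inside verse 1 and ends
# inside verse 2. With milestoned verses, <q> remains a normal container.
milestoned = (
    '<p>'
    '<verse osisID="Gen.1.1" sID="Gen.1.1"/>He said: <q>first half '
    '<verse eID="Gen.1.1"/>'
    '<verse osisID="Gen.1.2" sID="Gen.1.2"/>second half</q> and left.'
    '<verse eID="Gen.1.2"/>'
    '</p>'
)

# The same content with verses as containers: <verse> and <q> overlap,
# so the closing tags no longer match their opening tags.
overlapping = (
    '<p>'
    '<verse osisID="Gen.1.1">He said: <q>first half </verse>'
    '<verse osisID="Gen.1.2">second half</q> and left.</verse>'
    '</p>'
)


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(milestoned))
print(is_well_formed(overlapping))
```

The milestoned variant parses, while the container variant is rejected by the parser because the element boundaries cross.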

When converting from certain source formats (like Zefania XML or Haggai XML or TheWord), none of these cases can occur, though, as those formats do not support quotes, and paragraph breaks have to fall at the end of verses.

Therefore, I'll add an export option to the exporter to choose which tags you want to have milestoned (so you can choose if you prefer quotes or verses, or neither in case the bible does not have overlaps here).

In the general case, removing milestones using XSL is impossible, as the resulting elements may overlap and the output would no longer be well-formed XML. Therefore it is probably easier to add that option.
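For the simple case where no verse crosses an element boundary, the conversion can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how BibleMultiConverter does it: the function name `unmilestone_verses` is hypothetical, the OSIS namespace is omitted, and the start and end milestones are assumed to be direct children of the same parent element. Anything else is rejected with an error, which is exactly the general case mentioned above:

```python
import xml.etree.ElementTree as ET


def unmilestone_verses(parent):
    """Wrap content between sibling verse milestones in <verse> containers.

    Hypothetical helper: only handles verses whose sID and eID milestones
    are direct children of `parent`. Raises ValueError otherwise.
    """
    new_children = []
    current = None      # container verse currently being filled
    current_id = None   # sID of that verse
    for child in list(parent):
        if child.tag == "verse" and "sID" in child.attrib:
            if current is not None:
                raise ValueError("verse started inside another verse")
            current_id = child.get("sID")
            current = ET.Element("verse", {"osisID": child.get("osisID", current_id)})
            current.text = child.tail  # text after the start milestone
            new_children.append(current)
        elif child.tag == "verse" and "eID" in child.attrib:
            if current is None or child.get("eID") != current_id:
                raise ValueError("verse end without matching start in this parent")
            current.tail = child.tail  # text after the end milestone
            current = None
        elif current is not None:
            current.append(child)      # element inside the verse
        else:
            new_children.append(child)
    if current is not None:
        raise ValueError("verse crosses an element boundary")
    for child in list(parent):
        parent.remove(child)
    parent.extend(new_children)


chapter = ET.fromstring(
    '<chapter><verse osisID="Gen.1.1" sID="Gen.1.1"/>'
    'Au commencement, Dieu créa les cieux et la terre.'
    '<verse eID="Gen.1.1"/></chapter>'
)
unmilestone_verses(chapter)
print(ET.tostring(chapter, encoding="unicode"))
```

A verse that starts in one paragraph and ends in another makes the function raise `ValueError` instead of producing overlapping elements.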

May I ask which program you are trying to import this OSIS file into, one that claims to support OSIS but does not support milestoned verses? I know of a few programs that cannot import unmilestoned verses, but none for the other way round.

For the suboptimal handling of metadata in some conversion directions (e. g. Zefania XML to OSIS) I've opened a new issue, #9, but I'm not planning to fix it immediately.

@schierlm schierlm self-assigned this Jun 15, 2017
@schierlm schierlm added this to the v0.0.6 milestone Jun 15, 2017
@schierlm schierlm changed the title from "Great tool! OSIS export is incorrectly formatted though" to "OSIS export does not provide options to choose which elements to milestone (but always milestones verse elements)" Jun 15, 2017
@schierlm
Owner

Can you try the latest git version (compiled version attached)? When you pass "-" as the second parameter to the OSIS export, it should now create verse tags in the style you prefer.

BibleMultiConverter-0.0.5.3.zip

@j2l
Author

j2l commented Jun 16, 2017

Wow! Fantastically fast, thank you Michael!
Actually, I'm new to OSIS and wanted to convert a few Bibles that I couldn't find in OSIS format (here, for instance, Hindi), because I didn't find any Bible in XML on the Crosswire website. Where can Bibles in OSIS format be found?
I want JSON in the end (and BrowserBible doesn't output what I want), so the end is not near for me :)
I'll try your update and let you know here.
Thanks again.

@j2l
Author

j2l commented Jun 16, 2017

It's working.
For others stumbling on this topic, the command is java -jar BibleMultiConverter.jar ZefaniaXML "SF_2015-08-16_HIN_HINERV_(EASY-TO-READ VERSION (HINDI ERV)).xml" OSIS test.xml -
to get
<verse osisID="Gen.1.1">आदि में परमेश्वर ने आकाश और पृथ्वी को बनाया।</verse>

Note that you have to manually check and change ALL abbreviated book names to generate valid osisIDs, since this Zefania Bible doesn't provide correct abbreviations (nor correct book names).

@schierlm
Owner

I am not aware of any large number of free Bibles available in OSIS format. OSIS has become the de facto standard for publishing Bibles commercially, due to the flexibility of the format. I tend to find free Bibles in Zefania XML instead, which is a simpler format in case you want to convert to JSON anyway.
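As a rough illustration of why Zefania XML is convenient for this, the following Python sketch reads verses straight from the XML and emits JSON. It assumes the usual Zefania nesting of BIBLEBOOK, CHAPTER and VERS elements with bnumber, bname, cnumber and vnumber attributes (only VERS and vnumber appear in this thread; the rest is my understanding of the format). The function name `zefania_to_dict` and the JSON layout are hypothetical choices for the example:

```python
import json
import xml.etree.ElementTree as ET


def zefania_to_dict(xml_text):
    """Convert Zefania XML text into a nested dict (hypothetical layout)."""
    root = ET.fromstring(xml_text)
    books = []
    for book in root.iter("BIBLEBOOK"):
        chapters = []
        for chapter in book.iter("CHAPTER"):
            # itertext() also collects text inside inline child elements.
            verses = {
                vers.get("vnumber"): "".join(vers.itertext()).strip()
                for vers in chapter.iter("VERS")
            }
            chapters.append({"number": chapter.get("cnumber"), "verses": verses})
        books.append({
            "number": book.get("bnumber"),
            "name": book.get("bname"),
            "chapters": chapters,
        })
    return {"books": books}


sample = (
    '<XMLBIBLE><BIBLEBOOK bnumber="1" bname="Genesis"><CHAPTER cnumber="1">'
    '<VERS vnumber="1">Au commencement, Dieu créa les cieux et la terre.</VERS>'
    '</CHAPTER></BIBLEBOOK></XMLBIBLE>'
)
bible = zefania_to_dict(sample)
print(json.dumps(bible, ensure_ascii=False, indent=2))
```

Passing `ensure_ascii=False` keeps non-ASCII verse text (French, Hindi) readable in the JSON output instead of escaping it.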

Crosswire only publishes Bibles in their own binary SWORD format. Some of them were converted from OSIS, others from ThML, others from Zefania XML. But it does not matter in the end, since BibleMultiConverter also has an import filter for SWORD Bibles.

If you are trying to build a website with Bible texts, perhaps have a look at http://biblewebapp.com/study/ (which is available on GitHub, and BibleMultiConverter also has an export filter for their internal format). But having other alternatives would be great too :)
