validation fails due to missing file #53

Closed
tcatapano opened this issue Apr 20, 2022 · 8 comments

@tcatapano
Member

Both @teodorgeorgiev and @gsautter report that validation is failing due to a missing file.

from @teodorgeorgiev: failed to load external entity "../nlm/JATS-mathmlsetup1.ent" on line 226

from @gsautter in plazi/ggxml2taxpub#43 (comment)

... I encountered one error: JATS-mathmlsetup1.ent doesn't seem to exist in the repo folder you point me to, and none of its subfolders, either ... is this an oversight during prior-version cleanup, or a missing repo file?

@tcatapano tcatapano added the bug label Apr 20, 2022
@tcatapano tcatapano self-assigned this Apr 20, 2022
@tcatapano
Member Author

tcatapano commented Apr 20, 2022

Using oXygen, I am not getting this error when validating against a local copy of https://github.com/plazi/TaxPub/tree/v1.0-gamma using either the default engine (Xerces?) or xmllint (which does send warnings regarding duplicate models and the non-deterministic nomenclature with its clumsy use of x

@tcatapano
Member Author

tcatapano commented Apr 20, 2022

In case some of the base JATS files in the repo might have been lost, one could simply download the official JATS 1.1 files at: https://ftp.ncbi.nih.gov/pub/jats/publishing/1.1/JATS-Publishing-1-1-MathML3-DTD.zip

and then simply place the files:

tax-treatment-NS0-v1.dtd
taxpubcustom-classes-NS0-v1.ent
taxpubcustom-elements-NS0-v1.ent
taxpubcustom-mixes-NS0-v1.ent
taxpubcustom-models-NS0-v1.ent
taxpubcustom-modules-NS0-v1.ent

from https://github.com/plazi/TaxPub/tree/v1.0-gamma alongside the extracted JATS files

and then validate against tax-treatment-NS0-v1.dtd in that context.

This is probably the preferred method anyway, as it ensures that one is using the correct set of base JATS files. TaxPub extends those base files entirely through the files listed above.
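
The setup described above could be sketched as a small script. This is only an illustration under stated assumptions: the JATS archive is assumed to be already downloaded and extracted to a local directory, the TaxPub repository is assumed to be checked out locally, and the function name `overlay_taxpub` and its arguments are hypothetical, not part of TaxPub or JATS.

```python
import shutil
from pathlib import Path

# The six TaxPub extension files listed above.
TAXPUB_FILES = [
    "tax-treatment-NS0-v1.dtd",
    "taxpubcustom-classes-NS0-v1.ent",
    "taxpubcustom-elements-NS0-v1.ent",
    "taxpubcustom-mixes-NS0-v1.ent",
    "taxpubcustom-models-NS0-v1.ent",
    "taxpubcustom-modules-NS0-v1.ent",
]


def overlay_taxpub(taxpub_dir, jats_dir):
    """Copy the TaxPub files into the extracted JATS 1.1 directory.

    Returns the path of the copied driver DTD. Raises FileNotFoundError
    if any expected TaxPub file is missing from taxpub_dir.
    """
    taxpub_dir, jats_dir = Path(taxpub_dir), Path(jats_dir)
    missing = [n for n in TAXPUB_FILES if not (taxpub_dir / n).is_file()]
    if missing:
        raise FileNotFoundError("missing TaxPub files: " + ", ".join(missing))
    for name in TAXPUB_FILES:
        shutil.copy2(taxpub_dir / name, jats_dir / name)
    return jats_dir / "tax-treatment-NS0-v1.dtd"
```

The returned path is the DTD to validate against; the base JATS files it references are then resolved from the same directory.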

@tcatapano
Member Author

Doing this, again, I am not able to replicate the missing file error. Perhaps in other validation scenarios and environments it does not work. @teodorgeorgiev and @gsautter, how are you performing validation?

@teodorgeorgiev
Collaborator

@tcatapano we are using the standard PHP DOMDocument::validate. It takes the DTD from the XML, which in our case we store locally:
<!DOCTYPE article PUBLIC "-//TaxonX//DTD Taxonomic Treatment Publishing DTD v0 20100105//EN" "../../nlm/tax-treatment-NS0.dtd">
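
Since the validator takes the DTD location from the DOCTYPE, a relative system identifier such as the one above is typically resolved against the location of the XML file itself. A minimal sketch of that resolution follows. The helper name `resolve_doctype_dtd` and the example path in the usage note are hypothetical, and the regular expression only handles double-quoted identifiers in a `PUBLIC` or `SYSTEM` DOCTYPE.

```python
import posixpath
import re

# Captures the system identifier of a PUBLIC or SYSTEM DOCTYPE declaration.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+\S+\s+(?:PUBLIC\s+"[^"]*"\s+|SYSTEM\s+)"([^"]+)"'
)


def resolve_doctype_dtd(xml_text, xml_path):
    """Return the DTD path the DOCTYPE points to, or None if there is none.

    xml_path is the (POSIX-style) path of the XML file; a relative system
    identifier is joined to the directory containing that file.
    """
    match = DOCTYPE_RE.search(xml_text)
    if match is None:
        return None
    system_id = match.group(1)
    base_dir = posixpath.dirname(xml_path)
    return posixpath.normpath(posixpath.join(base_dir, system_id))
```

For a file stored at a hypothetical `/data/articles/2022/a.xml`, the DOCTYPE above would resolve two directories up and into `nlm/`, so that is where `tax-treatment-NS0.dtd` and every entity file it references need to exist.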

@gsautter
Collaborator

@tcatapano also having a local copy of the DTD files from Pensoft available to the validator does fix the problem, and the server currently uses it this way ...
Mainly wanted to make sure I don't validate against any older and stricter versions of now-relaxed definitions (as with tp:material-citation) and thus tried to validate against https://github.com/plazi/TaxPub/tree/v1.0-gamma alone, which led to the JATS-mathmlsetup1.ent error ...
Could we make this repo self-contained, just to avoid similar scenarios with third-party TaxPub users?
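
One way to check whether a DTD directory is self-contained is to list the external parameter entities a file declares and see which of them are missing on disk. The sketch below is a rough illustration, not a full DTD parser: it assumes double-quoted, relative system identifiers and UTF-8 files, does not skip comments or conditional sections, and does not recurse into the referenced files. The function name `missing_external_entities` is hypothetical.

```python
import re
from pathlib import Path

# Captures the system identifier of an external parameter entity declaration,
# e.g. <!ENTITY % name PUBLIC "public id" "file.ent"> or SYSTEM "file.ent".
ENTITY_RE = re.compile(
    r'<!ENTITY\s+%\s+\S+\s+(?:PUBLIC\s+"[^"]*"\s+|SYSTEM\s+)"([^"]+)"'
)


def missing_external_entities(dtd_path):
    """Return system identifiers declared in dtd_path that do not exist
    relative to the directory containing dtd_path."""
    dtd_path = Path(dtd_path)
    text = dtd_path.read_text(encoding="utf-8")
    missing = []
    for system_id in ENTITY_RE.findall(text):
        if not (dtd_path.parent / system_id).is_file():
            missing.append(system_id)
    return missing
```

Run against the driver DTD in a fresh checkout, a non-empty result would point at files such as `JATS-mathmlsetup1.ent` that a validator will later fail to load.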

@teodorgeorgiev
Collaborator

@tcatapano @gsautter

OK, now I see ... so far I was trying to validate it against tax-treatment-NS0.dtd and the result was:
failed to load external entity "../nlm/JATS-mathmlsetup1.ent" on line 226

I did as you suggested above (downloaded the official JATS 1.1 and added all "-NS0-v1" files).
I validate the XML against tax-treatment-NS0-v1.dtd and voilà ... I did not get this one anymore :)

However, now although I think my XML is valid I get the following error:

validity error : Content model of nomenclature is not determinist: (sec-meta? , label? , tp:taxon-name , x? , tp:taxon-authority? , x? , tp:taxon-status? , x? , tp:taxon-identifier* , xref* , x? , tp:nomenclature-citation-list* , x? , (tp:type-genus | tp:type-species)? , x? , tp:taxon-type-location? , x?)

Here is my test file
test_taxpub.zip
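
The error above concerns the content model itself rather than the test document: with several optional `x?` particles separated only by other optional particles, a parser that reads an `x` element cannot tell which `x?` it belongs to. The sketch below illustrates that idea with a simplified check for flat sequences. It is not the algorithm libxml2 uses, it treats the `(tp:type-genus | tp:type-species)?` group as a single opaque particle, and the names `find_ambiguous_particles` and `NOMENCLATURE` are hypothetical.

```python
def find_ambiguous_particles(model):
    """Find names that make a flat sequence content model non-deterministic.

    model is a list of (name, occurrence) pairs, with occurrence one of
    "", "?", "*" or "+". A particle conflicts with a later particle of the
    same name when it is optional or repeatable and everything between the
    two can be skipped.
    """
    conflicts = []
    for i, (name, occ) in enumerate(model):
        # A required, non-repeating particle is consumed exactly once,
        # so it cannot compete with a later particle of the same name.
        if occ == "":
            continue
        for later_name, later_occ in model[i + 1:]:
            if later_name == name:
                conflicts.append(name)
                break
            if later_occ in ("", "+"):
                break  # a required particle separates the two positions
    return conflicts


# The nomenclature model from the error message, flattened.
NOMENCLATURE = [
    ("sec-meta", "?"), ("label", "?"), ("tp:taxon-name", ""), ("x", "?"),
    ("tp:taxon-authority", "?"), ("x", "?"), ("tp:taxon-status", "?"),
    ("x", "?"), ("tp:taxon-identifier", "*"), ("xref", "*"), ("x", "?"),
    ("tp:nomenclature-citation-list", "*"), ("x", "?"),
    ("(tp:type-genus | tp:type-species)", "?"), ("x", "?"),
    ("tp:taxon-type-location", "?"), ("x", "?"),
]

print(find_ambiguous_particles(NOMENCLATURE))
```

Every conflict reported for this model involves `x`, which matches the complaint about the repeated optional `x?` particles.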

@tcatapano
Member Author

@teodorgeorgiev: Yes. It's a known issue. See: #52. In the meantime, if at all possible try using the Xerces parser (https://xerces.apache.org/index.html) which I do not think will report this error. I'll prioritize a patch for this. Hope to get it out this weekend.

@tcatapano
Member Author

Closing
