definition of NCName too strict #540

jensopetersen · 2015-03-23T14:22:55Z

I would normally use only ASCII characters for xml:id values and the like, but for a specific project it makes sense to use Chinese characters as xml:id values. Here eXist appears to me to be too strict. While it accepts characters in the Basic Multilingual Plane (the second one below, U+4E00), it refuses later additions to Unicode.

<graphs>
    <graph cp="3400" xml:id="&#x3400;"/>
    <graph cp="4E00" xml:id="&#x4e00;"/>
    <graph cp="20000" xml:id="&#x20000;"/>
    <graph cp="f0000" xml:id="&#xf0000;"/>
    <graph cp="100000" xml:id="&#x100000;"/>
</graphs>

jensopetersen · 2016-07-13T09:25:17Z

<x xml:id="1"/>

is well-formed, but Xerces holds on to the original XML 1.0 definition of NCName, which was later superseded within XML 1.0 itself (fifth edition) by the definition made for XML 1.1. Since eXist uses Xerces, this means that xml:id values beginning with digits are held to be malformed in eXist, though they are well-formed according to XML 1.0.
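
To separate the two questions in this comment, here is a minimal sketch that checks well-formedness only. It uses Python's standard-library `xml.etree.ElementTree` (backed by expat, a non-validating parser) rather than Xerces or eXist, and `xml_id_of` is a hypothetical helper name. It shows only that the document parses; it does not test the xml:id value against any NCName definition.

```python
from typing import Optional
import xml.etree.ElementTree as ET

# Namespace URI that the predefined "xml" prefix is bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"

def xml_id_of(doc: str) -> Optional[str]:
    """Parse `doc` and return the root element's xml:id value, if any.

    Raises ET.ParseError when the document is not well-formed.
    No xml:id / NCName checking is performed here.
    """
    root = ET.fromstring(doc)
    return root.get(f"{{{XML_NS}}}id")

# The example from the comment parses, and the value is returned as-is.
print(xml_id_of('<x xml:id="1"/>'))
```

Whether the value is an acceptable xml:id is a separate check that a parser like this one does not make.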

adamretter · 2016-07-13T10:46:48Z

@jensopetersen I wonder if the Xerces property http://xml.org/sax/properties/document-xml-version could help, or perhaps even the feature http://xml.org/sax/features/xml-1.1?

See:
https://xerces.apache.org/xerces2-j/properties.html
https://xerces.apache.org/xerces2-j/features.html

tuurma · 2017-01-06T13:58:06Z

In the same vein, see the discussion at relaxng/jing-trang#188

I encountered the issue when trying to use some polytonic Greek characters:

ͷ 0377 GREEK SMALL LETTER PAMPHYLIAN DIGAMMA
ϝ 03DD GREEK SMALL LETTER DIGAMMA
Ͷ 0376 GREEK CAPITAL LETTER PAMPHYLIAN DIGAMMA

which are perfectly kosher NameChar under the fifth edition of the XML spec (https://www.w3.org/TR/REC-xml/#d0e804), but alas not a Letter according to the fourth (https://www.w3.org/TR/2006/REC-xml-20060816/#NT-Letter)
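
The fifth-edition rule referred to above can be sketched as a small checker. This is an illustration only, not eXist or Xerces code: `is_ncname` is a hypothetical helper, and the ranges are transcribed from the NameStartChar and NameChar productions of XML 1.0 (Fifth Edition), with the colon left out because NCName forbids it.

```python
import re

# NameStartChar, XML 1.0 (Fifth Edition) production [4], without ":".
_START = (
    "A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)
# NameChar, production [4a]: NameStartChar plus "-", ".", digits,
# U+00B7, combining marks U+0300-036F, and U+203F-2040.
_REST = _START + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"

_NCNAME = re.compile(f"[{_START}][{_REST}]*")

def is_ncname(value: str) -> bool:
    """True if `value` matches the fifth-edition Name rules and has no colon."""
    return _NCNAME.fullmatch(value) is not None

# The three Greek letters listed above fall in U+0370-037D / U+037F-1FFF.
for ch in ("\u0377", "\u03DD", "\u0376"):
    print(f"U+{ord(ch):04X}", is_ncname(ch))

# Code points from the original report; the last two lie above U+EFFFF,
# the upper bound of the NameStartChar ranges.
for cp in (0x3400, 0x4E00, 0x20000, 0xF0000, 0x100000):
    print(f"U+{cp:04X}", is_ncname(chr(cp)))
```

Under these ranges a name may contain digits but may not start with one, so `is_ncname("a1")` holds while `is_ncname("1")` does not.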

duncdrum · 2018-11-30T13:25:12Z

This seems to have been fixed, testing with the OP's examples in 4.5.0.
Please open a new issue if there are still problems with NCName handling.

wolfgangmm added this to the eXist-3.0 milestone Jun 3, 2015

wolfgangmm self-assigned this Jun 3, 2015

dizzzz modified the milestone: eXist-3.0 Feb 9, 2017

tuurma mentioned this issue Mar 23, 2017

patches validCName #1360

Closed

wolfgangmm mentioned this issue Apr 9, 2018

[bugfix] Align xml:id handling with saxon and basex #1809

Merged

joewiz added triage issue needs to be investigated and removed triage issue needs to be investigated labels Sep 17, 2018

duncdrum mentioned this issue Nov 30, 2018

FORG0001 cannot cast 'xs:anyAtomicType("ext")' to xs:NCName #2267

Open

duncdrum closed this as completed Nov 30, 2018

line-o removed the triage issue needs to be investigated label Apr 8, 2020
