Entities in external DTD are neglected #15

Open
donum opened this issue Oct 5, 2018 · 4 comments

donum commented Oct 5, 2018

Hi,

thank you for that cool package. I don't feel misfortunate. :)

Issue #8 was reported and fixed, which I am very happy about. This one is related, though.

I noticed that entity declarations within external DTDs are ignored.

(1) This works:
messages.xml:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE catalogue SYSTEM "catalogue.dtd" [
  <!ENTITY nbsp "&#160;">
  <!ENTITY shy "&#173;">
  <!ENTITY reg "&#174;">
  <!ENTITY trade "&#8482;">
  <!ENTITY ndash "&#8211;">
  <!ENTITY mdash "&#8212;">
  <!ENTITY rsquo "&#8217;">
]>
<catalogue xml:lang="de" name="messages">
  <message key="banner.title">Hello&shy;World</message>
</catalogue>

catalogue.dtd:

<!ELEMENT catalogue (message)>
<!ATTLIST catalogue name ID #REQUIRED>
<!ELEMENT message (#PCDATA)>
<!ATTLIST message key ID #REQUIRED>

(2) While this does not work:
messages.xml:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE catalogue SYSTEM "catalogue.dtd">
<catalogue xml:lang="de" name="messages">
  <message key="banner.title">Hello&shy;World!</message>
</catalogue>

catalogue.dtd:

<!ELEMENT catalogue (message)>
<!ATTLIST catalogue name ID #REQUIRED>
<!ELEMENT message (#PCDATA)>
<!ATTLIST message key ID #REQUIRED>
<!ENTITY nbsp "&#160;">
<!ENTITY shy "&#173;">
<!ENTITY reg "&#174;">
<!ENTITY trade "&#8482;">
<!ENTITY ndash "&#8211;">
<!ENTITY mdash "&#8212;">
<!ENTITY rsquo "&#8217;">

(3) And also this more advanced example doesn't work:

messages.xml:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE catalogue SYSTEM "catalogue.dtd">
<catalogue xml:lang="de" name="messages">
  <message key="banner.title">Hello&shy;World</message>
</catalogue>

catalogue.dtd:

<!ELEMENT catalogue (message)>
<!ATTLIST catalogue name ID #REQUIRED>
<!ELEMENT message (#PCDATA)>
<!ATTLIST message key ID #REQUIRED>
<!ENTITY % iso-lat1
	PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN//XML"
		"iso-lat1.ent">
%iso-lat1;
<!ENTITY % iso-lat2
	PUBLIC "ISO 8879:1986//ENTITIES Added Latin 2//EN//XML"
		"iso-lat2.ent">
%iso-lat2;

iso-lat1.ent:

...
<!ENTITY aacute	"&#x00E1;"> <!-- LATIN SMALL LETTER A WITH ACUTE -->
<!ENTITY Aacute	"&#x00C1;"> <!-- LATIN CAPITAL LETTER A WITH ACUTE -->
<!ENTITY acirc	"&#x00E2;"> <!-- LATIN SMALL LETTER A WITH CIRCUMFLEX -->
<!ENTITY Acirc	"&#x00C2;"> <!-- LATIN CAPITAL LETTER A WITH CIRCUMFLEX -->
<!ENTITY agrave	"&#x00E0;"> <!-- LATIN SMALL LETTER A WITH GRAVE -->
<!ENTITY Agrave	"&#x00C0;"> <!-- LATIN CAPITAL LETTER A WITH GRAVE -->
<!ENTITY aring	"&#x00E5;"> <!-- LATIN SMALL LETTER A WITH RING ABOVE -->
<!ENTITY Aring	"&#x00C5;"> <!-- LATIN CAPITAL LETTER A WITH RING ABOVE -->
<!ENTITY atilde	"&#x00E3;"> <!-- LATIN SMALL LETTER A WITH TILDE -->
...

The error message is always "Entity &shy; not defined".

It would be very nice if at least the second example worked.

Dan
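
For anyone reproducing example (2) outside Sublime Text, here is a minimal sketch of the underlying behaviour: an entity declared only in the external DTD can be expanded once the parser actually reads that external subset. It uses Python's standard-library expat bindings rather than Exalt or lxml, so it illustrates the XML mechanics and not Exalt's implementation. The names `message_text`, `DOCUMENT`, and `EXTERNAL_DTD` are made up for this sketch, and the DTD is trimmed to a single entity.

```python
import xml.parsers.expat as expat

# Example (2) from the report, inlined as strings (DTD trimmed to one entity).
DOCUMENT = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<!DOCTYPE catalogue SYSTEM "catalogue.dtd">\n'
    '<catalogue xml:lang="de" name="messages">\n'
    '  <message key="banner.title">Hello&shy;World!</message>\n'
    '</catalogue>\n'
)

EXTERNAL_DTD = (
    '<!ELEMENT catalogue (message)>\n'
    '<!ATTLIST catalogue name ID #REQUIRED>\n'
    '<!ELEMENT message (#PCDATA)>\n'
    '<!ATTLIST message key ID #REQUIRED>\n'
    '<!ENTITY shy "&#173;">\n'
)


def message_text(document, external_dtd):
    """Parse `document`, handing `external_dtd` to expat as the external subset."""
    parser = expat.ParserCreate()
    # expat only asks for the external subset when parameter-entity
    # parsing is switched on.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)

    chunks = []
    parser.CharacterDataHandler = chunks.append

    def load_external(context, base, system_id, public_id):
        # A real tool would locate `system_id` on disk or through a catalog.
        # This sketch always answers with the in-memory DTD text.
        sub_parser = parser.ExternalEntityParserCreate(context)
        sub_parser.Parse(external_dtd, True)
        return 1  # non-zero tells expat the entity was handled

    parser.ExternalEntityRefHandler = load_external
    parser.Parse(document, True)
    return "".join(chunks).strip()
```

Calling `message_text(DOCUMENT, EXTERNAL_DTD)` should give the message text with `&shy;` replaced by a soft hyphen (U+00AD).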

Owner

eerohele commented Oct 8, 2018

Thanks for the detailed bug report!

Entities in external DTDs should definitely be supported. I'll try to look into this as soon as I'm over this pesky flu.

eerohele pushed a commit that referenced this issue Oct 9, 2018

Owner

eerohele commented Oct 9, 2018

v0.3.5 should fix the problem. When it appears in Package Control (there's usually a slight delay), could you give it a try and let me know whether it fixes the issue for you?

Note, though, that the example you posted still won't work as is: you must either specify the absolute path to catalogue.dtd or (preferably) use an XML catalog.

Exalt always operates on the contents of a Sublime Text view, which means it doesn't (and indeed can't) know the path where the XML document is stored. Also, if the document is unsaved, it has no path at all.

That means Exalt can't resolve catalogue.dtd because it doesn't know where it's located relative to the document it's validating.
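
To make the path problem concrete, here is a small sketch of how a relative system identifier is normally resolved against the URL of the document that references it. This is a generic illustration using Python's `urllib.parse.urljoin`, not Exalt's code; the function name `resolve_system_id` and the file paths are hypothetical.

```python
from urllib.parse import urljoin


def resolve_system_id(system_id, document_url=None):
    """Resolve a DOCTYPE system identifier against the document's URL, if known."""
    if document_url is None:
        # No base to resolve against (for example an unsaved buffer, or a
        # parser that was only handed text): a relative identifier stays
        # relative and cannot be located.
        return system_id
    return urljoin(document_url, system_id)
```

With a known document URL such as `file:///home/dan/project/messages.xml` (a made-up path), `catalogue.dtd` resolves to a sibling file. Without one, the identifier comes back unchanged, which is the situation described above. An identifier that is already absolute is unaffected by the base, which is why the absolute path works.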

eerohele self-assigned this Oct 9, 2018
eerohele added the bug label Oct 9, 2018
eerohele added this to the 0.3.5 milestone Oct 9, 2018

Author

donum commented Oct 11, 2018

Super, thank you eerohele!

It works like a charm. Thank you also for your hint regarding the absolute path requirement.

Using the absolute system path makes it work very nicely.

I tried it this way to avoid storing my local system path:
<!DOCTYPE catalogue SYSTEM "http://127.0.0.1:1338/static/xml-catalogue/dtd/catalogue.dtd">

That doesn't work, though. Do you know why? The URL is accessible via the browser.

Owner

eerohele commented Oct 12, 2018

I believe the issue is that lxml (which is what Exalt uses) doesn't load schemas over the network by default.

One option would be to enable network loading, but I'm not sure what kinds of vulnerabilities I'd be exposing Exalt to if I did that… although I think Exalt already loads certain resources over the network in some scenarios. I'll need to check.

I'll consider enabling network requests, but if you're looking to avoid using absolute file system paths to DTDs in your XML documents, have you considered using an XML catalog? I believe that'd be a better solution to the problem. Or maybe you're trying to solve some other problem?
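
As a rough sketch of the XML catalog suggestion: an OASIS XML catalog is itself a small XML file that maps a system identifier to a local URI, so the document can keep a short identifier while the machine-specific path lives in the catalog. The snippet below generates such a file with Python's standard library. The function name `build_catalog` and the `file://` path are assumptions made for illustration, and how Exalt is pointed at a catalog file is not covered in this thread.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OASIS XML Catalogs specification.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"


def build_catalog(system_id, local_uri):
    """Return catalog XML mapping one system identifier to a local URI."""
    # Serialize the catalog namespace as the default namespace (no prefix).
    ET.register_namespace("", CATALOG_NS)
    catalog = ET.Element(f"{{{CATALOG_NS}}}catalog")
    ET.SubElement(
        catalog,
        f"{{{CATALOG_NS}}}system",
        {"systemId": system_id, "uri": local_uri},
    )
    return ET.tostring(catalog, encoding="unicode")


# Hypothetical mapping: the path is a placeholder, not a real location.
catalog_xml = build_catalog(
    "catalogue.dtd", "file:///home/dan/project/dtd/catalogue.dtd"
)
```

The resulting text can be saved as a catalog file. Whether a bare relative identifier like `catalogue.dtd` is matched depends on the resolver, so that part should be checked against the tool in use.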
