Jsoup.parse seems to remove system identifier in DOCTYPE #408

Closed
gzli92 opened this issue Apr 2, 2014 · 3 comments


gzli92 commented Apr 2, 2014

Specifically, when I call:

Document doc = Jsoup.parse(xhtml, "", Parser.xmlParser());

on an XHTML document that has the following doctype:

<!DOCTYPE html SYSTEM "exampledtdfile.dtd">

I end up with the following result in the document (SYSTEM is now missing):

<!DOCTYPE html "exampledtdfile.dtd"> 

But this works fine on a document with:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Since SYSTEM is a proper way of declaring a DTD, I believe this is an issue with Jsoup.
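For reference, the expected serialization can be sketched in plain Java. This is only an illustration of the XML 1.0 rule that an external ID is either `SYSTEM "sysid"` or `PUBLIC "pubid" "sysid"`; the class `DoctypeSketch` and its `render` method are hypothetical names made up for this example and are not part of jsoup, nor a description of how jsoup implements its output.

```java
public class DoctypeSketch {
    // Hypothetical helper (not a jsoup API): renders a DOCTYPE declaration.
    // With a public identifier the PUBLIC keyword is emitted; with only a
    // system identifier the SYSTEM keyword must be emitted before it.
    static String render(String name, String publicId, String systemId) {
        boolean hasPublic = publicId != null && !publicId.isEmpty();
        boolean hasSystem = systemId != null && !systemId.isEmpty();

        StringBuilder sb = new StringBuilder("<!DOCTYPE ").append(name);
        if (hasPublic) {
            sb.append(" PUBLIC \"").append(publicId).append('"');
            if (hasSystem) {
                sb.append(" \"").append(systemId).append('"');
            }
        } else if (hasSystem) {
            // The case reported in this issue: the keyword must not be dropped.
            sb.append(" SYSTEM \"").append(systemId).append('"');
        }
        return sb.append('>').toString();
    }

    public static void main(String[] args) {
        // prints: <!DOCTYPE html SYSTEM "exampledtdfile.dtd">
        System.out.println(render("html", null, "exampledtdfile.dtd"));
        System.out.println(render("html",
                "-//W3C//DTD XHTML 1.0 Transitional//EN",
                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"));
    }
}
```

Comparing the parsed document's output against a string built this way is one possible way to check whether the `SYSTEM` keyword survives a parse.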


ivanpgs commented Nov 19, 2015

@gzli92,

This is also happening to me, and it is not limited to the SYSTEM identifier.

Having a file header like:

Parsing my document twice:

Jsoup.parse(new FileInputStream(xmlFile), FILE_ENCODING, xmlFile.getPath(), Parser.xmlParser());

Will generate this in the first iteration (parse):

And then generate this in the second iteration (parse):

I am using the latest version so far (1.8.3).


zazi commented Sep 6, 2016

@jhy, could you comment on this issue, please?

jhy closed this as completed in c28e5bf on Oct 25, 2016
jhy (Owner) commented Oct 25, 2016

Thanks!
