Parses html files as rdfxml #13

Closed
LaurensRietveld opened this issue May 1, 2019 · 3 comments
Labels
bug Something isn't working

Comments

@LaurensRietveld

See the snippet below. Parsing it with this parser yields 9 statements (mostly involving blank nodes). I'd expect an error instead.

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
        "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
  <table>
    <tr>
      <td>some col 1</td>
      <td>some col 2</td>
      <td>some col 3</td>
    </tr>
   </table>
</body>
</html>
@rubensworks
Member

I don't think this should throw an error though (will double-check), as this is valid XML, just without any RDF triples.

In any case, no blank nodes should be emitted AFAICS, so something seems to be going wrong there.

@rubensworks rubensworks added the bug Something isn't working label May 2, 2019
@rubensworks
Member

I just went through the RDF/XML spec again, and as far as I can see, this should actually be valid RDF/XML. So the parser returns correct output here.

Concretely:

  • html is the root and, at the same time, a Typed Node Element, which forms _:b1 a <...html>.
  • body is a Property Element, which forms _:b1 <...body> _:b2.
  • table is a Typed Node Element
  • tr is a Property Element
  • ...
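
The alternation described above (node element, property element, node element, ...) can be sketched roughly as follows. This is only an illustration in Python using the standard-library `xml.etree.ElementTree`, not the parser's actual implementation. The function names `stripe` and `qname_to_iri`, the blank node labels, and the choice to emit one triple per child of a property element are all assumptions made for this sketch. Attributes, text content, `rdf:about`, and `parseType` are ignored, and the XML declaration and DOCTYPE from the snippet are left out for brevity.

```python
import xml.etree.ElementTree as ET
from itertools import count

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# The body of the snippet from this issue, without the XML prolog.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
  <table>
    <tr>
      <td>some col 1</td>
      <td>some col 2</td>
      <td>some col 3</td>
    </tr>
  </table>
</body>
</html>"""


def qname_to_iri(tag):
    """RDF/XML forms an IRI by concatenating namespace and local name."""
    # ElementTree renders namespaced tags as "{namespace}local".
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace + local
    return tag


def stripe(xml_text):
    """Hypothetical helper: walk the tree, alternating node and property elements."""
    ids = count()
    triples = []

    def node_element(element):
        # A typed node element yields a fresh blank node with an rdf:type triple.
        subject = f"_:b{next(ids)}"
        triples.append((subject, RDF_TYPE, qname_to_iri(element.tag)))
        for child in element:
            property_element(subject, child)
        return subject

    def property_element(subject, element):
        # Each child of a property element is treated as a node element
        # and becomes the object of a triple (an assumption of this sketch).
        predicate = qname_to_iri(element.tag)
        for child in element:
            triples.append((subject, predicate, node_element(child)))

    node_element(ET.fromstring(xml_text))
    return triples


triples = stripe(SAMPLE)
for triple in triples:
    print(triple)
print(len(triples))  # 9 for this sample
```

Under these assumptions the sample produces five `rdf:type` triples (html, table, and the three td elements) plus four linking triples (one for body, three for tr), which matches the 9 statements reported at the top of this issue.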

@LaurensRietveld
Author

Thanks for the check @rubensworks, you're right! I'll close this one.
