Comments before html or doctype. #26

Open
Ferada opened this issue Aug 12, 2019 · 3 comments · May be fixed by #28

Comments

@Ferada

Ferada commented Aug 12, 2019

I've seen some documents in the wild that have a <!-- ... --> comment block before the first html node (but after <!doctype>). I'm not sure whether that's "valid", but it's annoying that it causes an error. Since it's only a comment, I could see the error offering a restart that ignores any nodes before the actual start of the document. Alternatively, could those nodes be stored and handled once the document has been created?

@bakketun
Member

Comments before the html element are valid. Can you provide an example that causes the error? The following works fine for me.

(html5-parser:parse-html5 "<!doctype html><!-- comment --><html>hello</html>")

@Ferada
Author

Ferada commented Aug 15, 2019

Ah, sorry, I should've tried it without CXML. It works as you said; it only fails when :dom :cxml is added, i.e. when the result is mapped to the CXML DOM:

There is no applicable method for the generic function:
  #<STANDARD-GENERIC-FUNCTION DOM:CREATE-COMMENT #x30200308948F>
when called with arguments:
  (NIL " hello, world! ")
   [Condition of type NO-APPLICABLE-METHOD-EXISTS]

...

Backtrace:
...
  2: ((:INTERNAL HTML5-PARSER::WALK (HTML5-PARSER:TRANSFORM-HTML5-DOM ((EQL :CXML) T))) #<COMMENT-NODE NIL #x30200343D07D> NIL NIL)
  3: (MAP NIL #<COMPILED-LEXICAL-CLOSURE (:INTERNAL HTML5-PARSER::WALK (HTML5-PARSER:TRANSFORM-HTML5-DOM (# T))) #x3020034491DF> (#<DOCUMENT-TYPE html #x30200343D16D> #<COMMENT-NODE NIL #x30200343D07D> ..)..
  4: (#<STANDARD-METHOD HTML5-PARSER:TRANSFORM-HTML5-DOM ((EQL :CXML) T)> :CXML #<DOCUMENT nodes: 5 #x30200344752D>)
      Locals:
        TO-TYPE = :CXML
        NODE = #<DOCUMENT nodes: 5 #x30200344752D>
        DOCUMENT-TYPE = #<RUNE-DOM::DOCUMENT-TYPE #x3020034490AD>
        DOCUMENT = NIL
        DOCUMENT-FRAGMENT = NIL
        #:WALK = #<COMPILED-LEXICAL-CLOSURE (:INTERNAL HTML5-PARSER::WALK (HTML5-PARSER:TRANSFORM-HTML5-DOM (# T))) #x3020034491DF>
  5: (NIL #<Unknown Arguments>)
  6: (HTML5-PARSER::PARSE-HTML5-FROM-SOURCE "<!doctype html><!-- hello, world! --><html></html>" :CONTAINER NIL :ENCODING NIL :STRICTP NIL :DOM :CXML)
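
The locals in frame 4 show DOCUMENT = NIL at the point where the comment node is walked, so DOM:CREATE-COMMENT appears to be called with NIL as its document argument, and no method specializes on that. A minimal, self-contained sketch of the same dispatch failure follows; TOY-DOCUMENT, TOY-CREATE-COMMENT and TRY-CREATE-COMMENT are hypothetical stand-ins for illustration, not the CXML classes or functions:

```lisp
;; Toy stand-ins only; none of these names exist in CXML or cl-html5-parser.
(defclass toy-document () ())

(defgeneric toy-create-comment (document data)
  (:documentation "Create a comment owned by DOCUMENT."))

;; The only method specializes on TOY-DOCUMENT, so NIL matches nothing.
(defmethod toy-create-comment ((document toy-document) data)
  (list :comment data))

(defun try-create-comment (document data)
  "Return the new comment, or :DISPATCH-FAILED if the call signals an error."
  (handler-case (toy-create-comment document data)
    (error () :dispatch-failed)))
```

Calling TRY-CREATE-COMMENT with a TOY-DOCUMENT instance returns the comment, while passing NIL takes the error path, which mirrors the condition reported above.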

@Ferada
Author

Ferada commented Aug 15, 2019

Delaying the creation of the comment node in transform-html5-dom until the document exists fixes this.
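
A rough sketch of that deferral idea, on a toy node representation rather than the real transform-html5-dom code: comments seen before the root element are queued and attached once the document exists. It assumes the document only comes into existence when the root element is reached; TRANSFORM-TOP-LEVEL and the (KIND VALUE) node lists are hypothetical.

```lisp
;; Toy model only: nodes are (KIND VALUE) lists, not cl-html5-parser nodes.
(defun transform-top-level (nodes)
  "Return the toy document's children in source order.
Comments seen before the root element are queued until the document exists."
  (let ((document-exists-p nil)
        (pending '())
        (children '()))
    (dolist (node nodes)
      (destructuring-bind (kind value) node
        (ecase kind
          ;; The doctype is not a child in this toy model; skip it.
          (:doctype nil)
          (:element
           ;; Assumption: the document is created together with the root.
           (setf document-exists-p t)
           ;; Flush queued comments so they land ahead of the root element.
           (dolist (data (reverse pending))
             (push (list :comment data) children))
           (setf pending '())
           (push (list :element value) children))
          (:comment
           (if document-exists-p
               (push (list :comment value) children)
               (push value pending))))))
    (nreverse children)))
```

For the input from the backtrace, modelled as ((:doctype "html") (:comment " hello, world! ") (:element "html")), the comment ends up before the root element instead of triggering a call against a missing document. In this sketch, comments are silently dropped if no root element ever appears; a real fix would need to decide how to handle that case.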
