HTML Tags: self-closing tags not handled properly #31

Closed
earwig opened this Issue May 3, 2013 · 1 comment

earwig commented May 3, 2013

  $ python
  Python 2.7.3 (default, Aug  1 2012, 05:14:39)
  [GCC 4.6.3] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> import mwparserfromhell
  >>> from mwparserfromhell.parser.tokenizer import Tokenizer
  >>> from mwparserfromhell.parser.builder import Builder

  # Without self-closing ref tag, works
  >>> wikicode = Builder().build(Tokenizer().tokenize('I has a template!<ref name=foo>{{bar}}</ref>'))
  >>> wikicode.filter_tags()
  [u'<ref name=foo>{{bar}}</ref>']
  >>> wikicode.filter_tags(recursive=True)
  [u'<ref name=foo>{{bar}}</ref>']

  # With self-closing tag, doesn't work
  >>> wikicode = Builder().build(Tokenizer().tokenize('I has a template!<ref name=foo>{{bar}}</ref><ref name=baz/>'))
  >>> wikicode.filter_tags()
  []
  >>> wikicode.filter_text()
  [u'baz']
  >>> wikicode.filter_tags(recursive=True)
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 376, in filter_tags
      return list(self.ifilter_tags(recursive, matches, flags))
    File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 301, in ifilter
      for node in nodes:
    File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 82, in _get_all_nodes
      for child in self._get_children(node):
    File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 59, in _get_children
      for context, child in node.__iternodes__(self._get_all_nodes):
  AttributeError: 'NoneType' object has no attribute '__iternodes__'

  # Edge case with self-closing tag only:
  >>> wikicode = Builder().build(Tokenizer().tokenize('<ref name=foo/>'))
  >>> wikicode.filter_tags()
  []
  >>> wikicode.filter_text()
  [u'foo']

  # If the tag isn't "ref", different but still incorrect behavior:
  # it doesn't stack trace but doesn't work either...
  >>> wikicode = Builder().build(Tokenizer().tokenize('I has<bloop name=baz/> a template!'))
  >>> wikicode.filter_tags()
  []
  >>> wikicode.filter_tags(recursive=True)
  []
  >>> wikicode = Builder().build(Tokenizer().tokenize("==Epidemiology==\nFoo.<ref>hi<br />there</ref>"))
  # this looks OK:
  >>> wikicode.filter_tags()
  [u'<ref>hi<br />there</ref>']
  # but doing it recursively yields slightly different stack trace
  >>> wikicode.filter_tags(recursive=True)
  Traceback (most recent call last):
  ...
  AttributeError: 'NoneType' object has no attribute 'nodes'

@ghost ghost assigned earwig May 3, 2013

@earwig earwig closed this May 19, 2013

earwig commented May 19, 2013

All of the above tests are now passing.
