[BUG] conflict with lxml v4.9.1 #42

@ghost

Describe the bug
After upgrading lxml to v4.9.1, I get the following error:

Traceback (most recent call last):
  File "/Users/john/.pyenv/versions/3.10.5/lib/python3.10/site-packages/webchanges/filters.py", line 1220, in _get_filtered_elements
    root = etree.fromstring(self.data, self.parser)  # bandit B320: use defusedxml TODO
  File "src/lxml/etree.pyx", line 3254, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1908, in lxml.etree._parseMemoryDocument
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/john/.pyenv/versions/3.10.5/bin/webchanges", line 8, in <module>
    sys.exit(main())
  File "/Users/john/.pyenv/versions/3.10.5/lib/python3.10/site-packages/webchanges/cli.py", line 307, in main
    urlwatch_command.run()
  File "/Users/john/.pyenv/versions/3.10.5/lib/python3.10/site-packages/webchanges/command.py", line 688, in run
    self.handle_actions()
  File "/Users/john/.pyenv/versions/3.10.5/lib/python3.10/site-packages/webchanges/command.py", line 626, in handle_actions
    self.test_job(self.urlwatch_config.test_job)
  File "/Users/john/.pyenv/versions/3.10.5/lib/python3.10/site-packages/webchanges/command.py", line 201, in test_job
    raise job_state.exception
  File "/Users/john/.pyenv/versions/3.10.5/lib/python3.10/site-packages/webchanges/handler.py", line 170, in process
    filtered_data = FilterBase.process(filter_kind, subfilter, self, filtered_data)
  File "/Users/john/.pyenv/versions/3.10.5/lib/python3.10/site-packages/webchanges/filters.py", line 240, in process
    return filtercls(job_state.job, job_state).filter(data, subfilter)
  File "/Users/john/.pyenv/versions/3.10.5/lib/python3.10/site-packages/webchanges/filters.py", line 1313, in filter
    return lxml_parser.get_filtered_data()
  File "/Users/john/.pyenv/versions/3.10.5/lib/python3.10/site-packages/webchanges/filters.py", line 1249, in get_filtered_data
    elements = self._get_filtered_elements()
  File "/Users/john/.pyenv/versions/3.10.5/lib/python3.10/site-packages/webchanges/filters.py", line 1227, in _get_filtered_elements
    root = etree.fromstring(self.data, self.parser)  # bandit B320: use defusedxml TODO
  File "src/lxml/etree.pyx", line 3254, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1913, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1793, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1082, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Char 0x0 out of allowed range, line 1, column 2

To Reproduce

Expected behavior

Screen scrape/screenshots

Version info
Please run webchanges -v and paste the version information as follows (first 3 lines):

  • webchanges 3.10.2
  • Python 3.10.5 or 3.10.3
  • System macOS-10.15.7-x86_64-i386-64bit

Additional context
Downgrading to lxml v4.8.0 or v4.9.0 resolves this issue for me.
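
A possible workaround, short of downgrading, follows the hint in the first `ValueError` ("Please use bytes input"): convert the text to bytes before it reaches `etree.fromstring`. The sketch below is not webchanges code. `to_parser_bytes` is a hypothetical helper, and it assumes the data is XML held in a `str` and that UTF-8 is an acceptable encoding. Whether this also avoids the second `XMLSyntaxError` on this platform is untested here.

```python
import re

# Matches a leading XML declaration such as
# <?xml version="1.0" encoding="UTF-8"?>
_XML_DECL = re.compile(r"\A\s*<\?xml[^>]*\?>")


def to_parser_bytes(data, encoding="utf-8"):
    """Hypothetical helper: return bytes to hand to etree.fromstring().

    bytes are passed through untouched. For str input, the XML
    declaration is dropped first, so that a stale encoding= attribute
    cannot contradict the encoding chosen here.
    """
    if isinstance(data, bytes):
        return data
    text = _XML_DECL.sub("", data, count=1)
    return text.encode(encoding)


if __name__ == "__main__":
    sample = '<?xml version="1.0" encoding="UTF-8"?><root><item>x</item></root>'
    try:
        from lxml import etree
    except ImportError:
        etree = None  # lxml not installed; only the helper is exercised

    payload = to_parser_bytes(sample)
    if etree is not None:
        # Passing bytes avoids the str-only code path named in the traceback
        root = etree.fromstring(payload)
        print(root.tag)
```

In `filters.py` the equivalent change would be to pass the converted bytes where `self.data` is currently passed to `etree.fromstring`; the parser argument would stay as it is.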