Merge branch lxml-4.2 into master.
scoder committed Sep 9, 2018
2 parents 085ccc2 + 1dee355 commit 3f3082e
Showing 6 changed files with 29 additions and 14 deletions.
12 changes: 9 additions & 3 deletions CHANGES.txt
@@ -3,14 +3,20 @@ lxml changelog
==============

4.3.0 (2018-??-??)
==================

Features added
--------------

* The module ``lxml.sax`` is compiled using Cython in order to speed it up.


4.2.5 (2018-09-09)
==================

Bugs fixed
----------

* Javascript URLs that used URL escaping were not removed by the HTML cleaner.
Security problem found by Omar Eissa.


4.2.4 (2018-08-03)
==================
10 changes: 7 additions & 3 deletions doc/main.txt
@@ -157,8 +157,8 @@ Index <http://pypi.python.org/pypi/lxml/>`_ (PyPI). It has the source
that compiles on various platforms. The source distribution is signed
with `this key <pubkey.asc>`_.

The latest version is `lxml 4.2.4`_, released 2018-08-03
(`changes for 4.2.4`_). `Older versions <#old-versions>`_
The latest version is `lxml 4.2.5`_, released 2018-09-09
(`changes for 4.2.5`_). `Older versions <#old-versions>`_
are listed below.

Please take a look at the
@@ -250,7 +250,9 @@ See the websites of lxml
..
and the `latest in-development version <http://lxml.de/dev/>`_.

.. _`PDF documentation`: lxmldoc-4.2.4.pdf
.. _`PDF documentation`: lxmldoc-4.2.5.pdf

* `lxml 4.2.5`_, released 2018-09-09 (`changes for 4.2.5`_)

* `lxml 4.2.4`_, released 2018-08-03 (`changes for 4.2.4`_)

@@ -272,6 +274,7 @@ See the websites of lxml

* `older releases <http://lxml.de/3.7/#old-versions>`_

.. _`lxml 4.2.5`: /files/lxml-4.2.5.tgz
.. _`lxml 4.2.4`: /files/lxml-4.2.4.tgz
.. _`lxml 4.2.3`: /files/lxml-4.2.3.tgz
.. _`lxml 4.2.2`: /files/lxml-4.2.2.tgz
@@ -282,6 +285,7 @@ See the websites of lxml
.. _`lxml 4.0.0`: /files/lxml-4.0.0.tgz
.. _`lxml 3.8.0`: /files/lxml-3.8.0.tgz

.. _`changes for 4.2.5`: /changes-4.2.5.html
.. _`changes for 4.2.4`: /changes-4.2.4.html
.. _`changes for 4.2.3`: /changes-4.2.3.html
.. _`changes for 4.2.2`: /changes-4.2.2.html
2 changes: 1 addition & 1 deletion doc/rest2html.py
@@ -38,7 +38,7 @@ def pygments_directive(name, arguments, options, content, lineno,
content_offset, block_text, state, state_machine):
try:
lexer = get_lexer_by_name(arguments[0])
except ValueError, e:
except ValueError:
# no lexer found - use the text one instead of an exception
lexer = TextLexer()
# take an arbitrary option if more than one is given
5 changes: 3 additions & 2 deletions src/lxml/html/clean.py
@@ -8,9 +8,10 @@
import copy
try:
from urlparse import urlsplit
from urllib import unquote_plus
except ImportError:
# Python 3
from urllib.parse import urlsplit
from urllib.parse import urlsplit, unquote_plus
from lxml import etree
from lxml.html import defs
from lxml.html import fromstring, XHTML_NAMESPACE
@@ -477,7 +478,7 @@ def _kill_elements(self, doc, condition, iterate=None):

def _remove_javascript_link(self, link):
# links like "j a v a s c r i p t:" might be interpreted in IE
new = _substitute_whitespace('', link)
new = _substitute_whitespace('', unquote_plus(link))
if _is_javascript_scheme(new):
# FIXME: should this be None to delete?
return ''
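
A minimal sketch of why the `unquote_plus()` call matters, assuming simplified stand-ins for the module's private helpers. The two regular expressions below are assumptions made for illustration and are not copied from `clean.py`; `old_check` and `new_check` are hypothetical names that mirror the line before and after this change.

```python
import re
from urllib.parse import unquote_plus

# Assumed stand-ins for the cleaner's private helpers (illustration only).
_substitute_whitespace = re.compile(r'[\s\x00-\x08\x0B\x0C\x0E-\x19]+').sub
_is_javascript_scheme = re.compile(r'javascript:', re.I).search


def old_check(link):
    # Behaviour before the change: strip whitespace and control characters only.
    return bool(_is_javascript_scheme(_substitute_whitespace('', link)))


def new_check(link):
    # Behaviour after the change: decode URL escapes first, then strip.
    return bool(_is_javascript_scheme(_substitute_whitespace('', unquote_plus(link))))


if __name__ == "__main__":
    samples = [
        "javascript:evil_function()",
        "javascrip%20t%20:evil_function()",
        "http://example.com/a%20b",
    ]
    for link in samples:
        print(link, old_check(link), new_check(link))
```

With the escaped link from the updated test, the old check sees no literal whitespace to strip and so never finds a `javascript:` scheme, while the new check decodes `%20` to a space, strips it, and then matches.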
6 changes: 3 additions & 3 deletions src/lxml/html/tests/test_clean.txt
@@ -18,7 +18,7 @@
... <body onload="evil_function()">
... <!-- I am interpreted for EVIL! -->
... <a href="javascript:evil_function()">a link</a>
... <a href="j\x01a\x02v\x03a\x04s\x05c\x06r\x07i\x0Ep t:evil_function()">a control char link</a>
... <a href="j\x01a\x02v\x03a\x04s\x05c\x06r\x07i\x0Ep t%20:evil_function()">a control char link</a>
... <a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgidGVzdCIpOzwvc2NyaXB0Pg==">data</a>
... <a href="#" onclick="evil_function()">another link</a>
... <p onclick="evil_function()">a paragraph</p>
@@ -51,7 +51,7 @@
<body onload="evil_function()">
<!-- I am interpreted for EVIL! -->
<a href="javascript:evil_function()">a link</a>
<a href="javascrip t:evil_function()">a control char link</a>
<a href="javascrip t%20:evil_function()">a control char link</a>
<a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgidGVzdCIpOzwvc2NyaXB0Pg==">data</a>
<a href="#" onclick="evil_function()">another link</a>
<p onclick="evil_function()">a paragraph</p>
@@ -84,7 +84,7 @@
<body onload="evil_function()">
<!-- I am interpreted for EVIL! -->
<a href="javascript:evil_function()">a link</a>
<a href="javascrip%20t:evil_function()">a control char link</a>
<a href="javascrip%20t%20:evil_function()">a control char link</a>
<a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgidGVzdCIpOzwvc2NyaXB0Pg==">data</a>
<a href="#" onclick="evil_function()">another link</a>
<p onclick="evil_function()">a paragraph</p>
8 changes: 6 additions & 2 deletions tools/manylinux/build-wheels.sh
@@ -24,12 +24,16 @@ build_wheel() {
-w /io/$WHEELHOUSE
}

assert_importable() {
run_tests() {
# Install packages and test
for PYBIN in /opt/python/*/bin/; do
${PYBIN}/pip install $PACKAGE --no-index -f /io/$WHEELHOUSE

# check import as a quick test
(cd $HOME; ${PYBIN}/python -c 'import lxml.etree, lxml.objectify')

# run tests
(cd $HOME; ${PYBIN}/python /io/test.py)
done
}

@@ -74,5 +78,5 @@ show_wheels() {
prepare_system
build_wheels
repair_wheels
assert_importable
run_tests
show_wheels
