Merge pull request #6 from miki725/minify
Minify
miki725 committed Nov 22, 2017
2 parents 9134304 + 0990d64 commit 8b1cdd5
Showing 5 changed files with 24 additions and 14 deletions.
8 changes: 7 additions & 1 deletion HISTORY.rst
@@ -3,13 +3,19 @@
History
-------

+0.1.2 (2017-11-22)
+~~~~~~~~~~~~~~~~~~
+
+* Fixed: Not removing all spaces between html tags.
+Sometimes spaces matter for formatting.
+For example ``<strong>Hello</strong> <i>World</i>`` cannot be minified any further.
+
0.1.1 (2016-09-26)
~~~~~~~~~~~~~~~~~~

* Fixed: Cache properties now allow to set cache value via ``foo = bar``
syntax when cache descriptor has ``as_property == True``


0.1.0 (2015-11-26)
~~~~~~~~~~~~~~~~~~

2 changes: 1 addition & 1 deletion django_auxilium/__init__.py
@@ -1,2 +1,2 @@
-__version__ = '0.1.1'
+__version__ = '0.1.2'
__author__ = 'Miroslav Shubernetskiy'
17 changes: 9 additions & 8 deletions django_auxilium/utils/html.py
@@ -2,7 +2,6 @@
import re

import six
-from django.utils.html import strip_spaces_between_tags
from six.moves.html_entities import name2codepoint
from six.moves.html_parser import HTMLParser

@@ -11,6 +10,7 @@

EXCLUDE_TAGS = ('textarea', 'pre', 'code', 'script',)
RE_WHITESPACE = re.compile(r'\s{2,}|\n')
+RE_SPACE_BETWEEN_TAGS = re.compile(r'>(?:\s{2,}|\n)<')
RE_EXCLUDE_TAGS = re.compile(
"""( # group for results to be included in re.split
<(?:{0}) # match beginning of one of exclude tags
@@ -27,10 +27,11 @@ def simple_minify(html):
"""
Minify HTML with very simple algorithm.
-This function tries to minify HTML by stripping all spaces between all html tags
-(e.g. ``</div> <div>`` -> ``</div><div>``). This step is accomplished by using
-Django's ``strip_spaces_between_tags`` method. In addition to that, this function
-replaces all whitespace (more then two consecutive whitespace characters or new line)
+This function tries to minify HTML by stripping most spaces between all html tags
+(e.g. ``</div> <div>`` -> ``</div> <div>``). Note that not all spaces are removed
+since sometimes that can adjust rendered HTML (e.g. ``<strong>Hello</strong> <i></i>``).
+In addition to that, this function replaces all whitespace
+(more then two consecutive whitespace characters or new line)
with a space character except inside excluded tags such as ``pre`` or ``textarea``.
**Though process**:
@@ -54,8 +55,7 @@ def simple_minify(html):
appended to final HTML since as explained above, they are guaranteed
to be content of excluded tags hence do not require minification.
#. All even indexed elements are minified by stripping whitespace between
-tags by using Django's ``strip_spaces_between_tags`` and redundant
-whitespace is stripped in general via simple regex.
+tags and redundant whitespace is stripped in general via simple regex.
You can notice that the process does not involve parsing HTML since that
usually adds some overhead (e.g. using beautiful soup). By using 2 regex
@@ -65,7 +65,8 @@ def simple_minify(html):
html = ''
for i, component in enumerate(components):
if i % 2 == 0:
-component = strip_spaces_between_tags(component.strip())
+component = component.strip()
+component = RE_SPACE_BETWEEN_TAGS.sub('> <', component)
component = RE_WHITESPACE.sub(' ', component)
html += component
else:
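
To see what the html.py change does to a non-excluded component, the three substitutions from the new loop body can be run on their own. This is only a sketch: the two regexes are copied from the diff, but the helper names `collapse` and `old_collapse` are hypothetical, and `old_collapse` merely approximates Django's `strip_spaces_between_tags` as a `>\s+<` substitution so that Django does not need to be installed.

```python
import re

# Copied from the diff to django_auxilium/utils/html.py.
RE_WHITESPACE = re.compile(r'\s{2,}|\n')
RE_SPACE_BETWEEN_TAGS = re.compile(r'>(?:\s{2,}|\n)<')

# Assumption: stand-in for Django's strip_spaces_between_tags,
# which removes whitespace between tags entirely.
RE_OLD_BETWEEN_TAGS = re.compile(r'>\s+<')


def collapse(component):
    """New behaviour (hypothetical helper): keep one space between tags."""
    component = component.strip()
    component = RE_SPACE_BETWEEN_TAGS.sub('> <', component)
    return RE_WHITESPACE.sub(' ', component)


def old_collapse(component):
    """Approximation of the pre-0.1.2 behaviour: drop all space between tags."""
    component = RE_OLD_BETWEEN_TAGS.sub('><', component.strip())
    return RE_WHITESPACE.sub(' ', component)


sample = '<strong>Hello</strong> <i>World</i>'
print(old_collapse(sample))  # space between the inline tags is lost
print(collapse(sample))      # single space is preserved
print(collapse('<div>\n    <p>Hi</p>\n</div>'))  # runs collapse to one space
```

A single space between tags no longer matches `RE_SPACE_BETWEEN_TAGS`, so it survives; newlines and longer runs are reduced to one space rather than removed, which is why the updated `MINIFY_EXPECTED` in the tests contains `<html> <head>` instead of `<html><head>`.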
3 changes: 2 additions & 1 deletion requirements-dev.txt
@@ -5,9 +5,10 @@ django-formtools
flake8
importanize
mock
-pytest>=2.9
+pdbpp
pytest-cov
pytest-django
+pytest>=2.9
python-magic
sphinx
sphinx-autobuild
8 changes: 5 additions & 3 deletions tests/utils/test_html.py
@@ -18,6 +18,8 @@
</script>
</head>
<body>
+<strong>Hello</strong> <i>World</i>
+<strong>Hello</strong><i>Mars</i>
<div>Content Here</div>
<textarea>
Input
@@ -42,11 +44,11 @@
</html>
"""

-MINIFY_EXPECTED = """<html><head><title>Minify Test</title><script>
+MINIFY_EXPECTED = """<html> <head> <title>Minify Test</title><script>
(function() {
console.log('hello world');
})();
-</script></head><body><div>Content Here</div><textarea>
+</script></head> <body> <strong>Hello</strong> <i>World</i> <strong>Hello</strong><i>Mars</i> <div>Content Here</div><textarea>
Input
Here
123
@@ -61,7 +63,7 @@
(function() {
console.log('inside body script');
})();
-</script></body></html>"""
+</script></body> </html>"""


EXTRACT_INPUT = """