
Multiprocessing support for pyjscompressor.py #764

Closed
wants to merge 3 commits

2 participants

@happyalu

This adds a -j option to pyjscompress (like make), to speed up compression by using python's multiprocessing.
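The idea described here, farming per-file compression out to a pool of worker processes like `make -j`, can be sketched roughly as follows. The `compress` body below is a toy stand-in (the real one shells out to the Closure compiler); the file names are illustrative:

```python
import multiprocessing

def compress(path):
    # placeholder for the real per-file compression step;
    # returns (old_size, new_size) like the script's compressors do
    return (len(path), len(path) // 2)

def compress_all(paths, num_procs):
    # distribute the files across num_procs worker processes
    pool = multiprocessing.Pool(num_procs)
    results = pool.map(compress, paths)
    pool.close()
    pool.join()
    return results
```

Because each file is compressed independently, the work is embarrassingly parallel and `Pool.map` preserves input order, so the summary statistics come out the same as in the serial version.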

@happyalu happyalu closed this
@happyalu happyalu reopened this
@xtfxme
Owner

on the whole your changes look good/an improvement, but you've made far too many style changes for me to be able to review this without reviewing the file in its entirety.

style changes must be isolated in their own commit, free of any functional changes. if you can create a pull request with only the bare minimum needed to implement multiprocessing i will be happy to review and merge!

once that's merged, you can certainly submit a pull request to add option parsing, and/or better style, in two additional pull requests. in general, each request should encapsulate a single functional change else the review time grows exponentially.

lastly, pyjs needs to support python 2.5+, so things like argparse won't be available -- optparse must be used instead.

thanks @happyalu! just create another request with the multiprocessing bits and we'll go from there.

@xtfxme xtfxme closed this
@happyalu

@xtfxme Commit 6375156 only has style changes, and commit 5248c57 only has the stuff required for multiprocessing. I had to change some things (such as using the tempfile module instead of a temp/ directory) as part of the multiprocessing stuff.
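The tempfile change is what makes the parallel workers safe: each process writes to its own private temporary file instead of a shared temp/ directory, so concurrent compressions can never collide on a path. A minimal sketch of the pattern (the whitespace squeeze is a toy stand-in for the real CSS minifier):

```python
import tempfile

def compress_css_text(css):
    # write output to a private temporary file rather than a shared temp/
    # directory, so parallel workers cannot clobber each other's files
    out = tempfile.NamedTemporaryFile(mode='w', suffix='.ccss', delete=False)
    out.write(' '.join(css.split()))  # toy stand-in for the real minifier
    out.close()
    return out.name

path = compress_css_text("body {   color:  red; }")
```

With `delete=False` the caller is responsible for cleanup; in the patch the temporary's contents are copied back over the original before the handle goes away.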

About the style changes: they are mostly related to lines longer than 80 chars.

I am not sure how to create pull requests of single commits :) GitHub seems to have pushed the entire branch out as a pull request. What do you recommend?

@happyalu

Also, argparse falls back to optparse in 5248c57, and I tested that the fallback works, but I could only check it on Python 2.7.
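The fallback being discussed is the usual import-and-switch pattern: try argparse (Python 2.7+), and drop to optparse when the import fails. A sketch of the shape it takes (option names follow the patch, but this is illustrative, not the patch itself):

```python
try:
    import argparse
    HAVE_ARGPARSE = True   # Python 2.7+ only
except ImportError:
    import optparse
    HAVE_ARGPARSE = False  # Python 2.5/2.6 fall back here

def parse_args(argv):
    if HAVE_ARGPARSE:
        parser = argparse.ArgumentParser()
        parser.add_argument('directory')
        parser.add_argument('-j', dest='num_procs', type=int, default=-1)
        args = parser.parse_args(argv)
        return args.directory, args.num_procs
    # optparse has no positional-argument support, so the directory
    # comes back in the leftover args list instead
    parser = optparse.OptionParser()
    parser.add_option('-j', dest='num_procs', type='int', default=-1)
    options, args = parser.parse_args(argv)
    return args[0], options.num_procs
```

Both branches return the same tuple, so the rest of the script doesn't need to know which parser handled the command line.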

@xtfxme
Owner

@happyalu, ah ok, i see that now ... i didn't notice among all the green and red ;-)

a fallback in the manner you've written would be fine. as long as it added in an isolated commit/pull-request i've no problem pulling it.

@xtfxme
Owner

@happyalu, oops i missed your first response.

github links a pull request to a branch in your repository, so each pull request needs to be a separate branch; this also lets you work on issues independently. for projects i contribute to, i usually make a branch like defect/description-of-the-issue or feature/something-new-i-added then open a request from that branch.
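The branch-per-pull-request workflow described above boils down to a few commands; branch names here are just examples in the defect/ and feature/ naming style mentioned:

```shell
# sketch of the branch-per-pull-request workflow described above
repo=$(mktemp -d) && cd "$repo" && git init -q
git checkout -q -b feature/multiprocessing   # one branch per logical change
# commit only the bare-minimum multiprocessing changes on this branch, then:
#   git push origin feature/multiprocessing  # and open the PR from that branch
git symbolic-ref --short HEAD                # prints the new branch name
```

Keeping each branch scoped to a single change is what lets the reviewer pull them independently, in whatever order the dependencies allow.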

don't get me wrong, from what i see your changes are all improvements, i just need them incrementally else i can't review well ... the file is currently 170 LOC but your pull req shows a doubling of that, essentially a rewrite. again not a problem in the end, but needs to be pulled in a controlled manner.

simply make a branch, commit the absolute bare minimum needed to support multiprocessing, then push to github + open a request. you can later start another branch adding option handling, and finally a third with style changes. i'd love to pull everything you have, but i'm tight on resources and need to be able to review quickly.

@happyalu

@xtfxme no I totally understand :) I created a separate branch for the style changes. It would be easier for me if the style changes (pep8 related) were merged first, since the multiprocessing code was written after that .. :-P

I hope that is ok!

Do you recommend I bundle option handling with multiprocessing changes? (since -j is the option for supporting multiproc) Or should I divide those changes into two?

@xtfxme
Owner

@happyalu, i would prefer the option changes to be separate from multiprocessing as well. you can add multiprocessing first and simply default to using it -- i see no reason why multiprocessing can't be used by default :-)

once that's added, you can definitely clean up/add option handling, and thus add the ability to force a certain -j <num>.
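Defaulting to multiprocessing, as suggested, can be as simple as sizing the pool from the machine; a sketch of the fallback logic (the real script keeps a -j override on top of this):

```python
import multiprocessing

def default_num_procs():
    # use one worker per CPU when the platform can report a count,
    # falling back to a single process otherwise
    try:
        return multiprocessing.cpu_count()
    except NotImplementedError:
        return 1
```

`cpu_count()` can raise `NotImplementedError` on unusual platforms, so the single-process fallback keeps the script usable everywhere.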

Showing with 239 additions and 85 deletions.
  1. +232 −84 contrib/pyjscompressor.py
  2. +7 −1 library/pyjamas/HTTPRequest.ie6.py
316 contrib/pyjscompressor.py
@@ -1,10 +1,14 @@
#!/usr/bin/env python
# Copyright (C) 2010 Sujan Shakya, suzan.shakya@gmail.com
#
+# Modified July 2012 Alok Parlikar, aup@cs.cmu.edu
+# to add multiprocessing
+#
# This script works with the google closure compiler
# http://closure-compiler.googlecode.com/files/compiler-latest.zip
#
-# The closure compiler requires java to be installed and an entry for your java directory in your system PATH
+# The closure compiler requires java to be installed and an entry for
+# your java directory in your system PATH
#
# The script needs the path to your google closure compiler.jar file:
# Pass the path to your compiler as the second argument or
@@ -12,15 +16,36 @@
# Then run this script. This will reduce the output size to ~50%.
-# To run type:
-# python pyjscompressor.py <path_to_your_pyjamas_output_directory> [<compiler path>]
-# from command line in the directory of this script
-import re, os, sys, shutil, subprocess
+# Usage:
+# python pyjscompressor.py [-c COMPILER] [-j NUM] <pyjs_output_directory>
+#
+# optional arguments:
+# -h, --help show this help message and exit
+# -c COMPILER, --compiler COMPILER
+# Path to Google Closure compiler.jar
+#
+# -j NUM Run NUM processes in parallel
+
+
+import os
+import re
+import shutil
+import subprocess
+import sys
+import tempfile
+try:
+ import multiprocessing
+ enable_multiprocessing = True
+except ImportError:
+ enable_multiprocessing = False
-MERGE_SCRIPTS = re.compile('</script>\s*(?:<!--.*?-->\s*)*<script(?:(?!\ssrc).)*?>', re.DOTALL)
+
+MERGE_SCRIPTS = re.compile(
+ '</script>\s*(?:<!--.*?-->\s*)*<script(?:(?!\ssrc).)*?>', re.DOTALL)
SCRIPT = re.compile('<script(?:(?!\ssrc).)*?>(.*?)</script>', re.DOTALL)
+
def compile(js_file, js_output_file, html_file=''):
# SIMPLE_OPTIMIZATIONS has some problem with Opera, so we'll use
# WHITESPACE_ONLY for opera
@@ -28,40 +53,45 @@ def compile(js_file, js_output_file, html_file=''):
level = 'WHITESPACE_ONLY'
else:
level = 'SIMPLE_OPTIMIZATIONS'
- args = ['java', '-jar', COMPILER, '--compilation_level', level, '--js', js_file, '--js_output_file', js_output_file]
- error = subprocess.call(args=args, stdout=open(os.devnull, 'w'), stderr=subprocess.STDOUT)
+
+ global compiler_path
+ args = ['java',
+ '-jar', compiler_path,
+ '--compilation_level', level,
+ '--js', js_file,
+ '--js_output_file', js_output_file]
+
+ error = subprocess.call(args=args,
+ stdout=open(os.devnull, 'w'),
+ stderr=subprocess.STDOUT)
+
if error:
- shutil.rmtree("temp")
- raise Exception, 'Error(s) occurred while compiling %s, possible cause: file may be invalid javascript.' % js_file
+ raise Exception(' '.join([
+ 'Error(s) occurred while compiling %s' % js_file,
+ 'possible cause: file may be invalid javascript.']))
+
def compress_css(css_file):
- sys.stdout.write('Compressing %-40s' % css_file)
- sys.stdout.flush()
- css_output_file = 'temp/%s.ccss' % os.path.basename(css_file)
+ css_output_file = tempfile.NamedTemporaryFile()
f = open(css_file)
css = f.read()
css = re.sub(r"\s+([!{};:>+\(\)\],])", r"\1", css)
css = re.sub(r"([!{}:;>+\(\[,])\s+", r"\1", css)
css = re.sub(r"\s+", " ", css)
- f = open(css_output_file, 'w')
- f.write(css)
- f.close()
- return finish_compressors(css_output_file, css_file)
+ css_output_file.write(css)
+ css_output_file.flush()
+ return finish_compressors(css_output_file.name, css_file)
+
def compress_js(js_file):
- sys.stdout.write('Compressing %-40s' % js_file)
- sys.stdout.flush()
- js_output_file = 'temp/%s.cjs' % os.path.basename(js_file)
- compile(js_file, js_output_file)
- return finish_compressors(js_output_file, js_file)
+ js_output_file = tempfile.NamedTemporaryFile()
+ compile(js_file, js_output_file.name)
+ return finish_compressors(js_output_file.name, js_file)
+
def compress_html(html_file):
- sys.stdout.write('Compressing %-40s' % html_file)
- sys.stdout.flush()
- js_file = 'temp/pyjs%d.js'
- js_output_file = 'temp/pyjs%d.cjs'
- html_output_file = 'temp/compiled.html'
+ html_output_file = tempfile.NamedTemporaryFile()
f = open(html_file)
html = f.read()
@@ -73,98 +103,216 @@ def compress_html(html_file):
# now extract the merged scripts
template = '<!--compiled-js-%d-->'
scripts = []
+
def script_repl(matchobj):
scripts.append(matchobj.group(1))
return '<script type="text/javascript">%s</script>' % template % \
- (len(scripts)-1)
+ (len(scripts) - 1)
+
html = SCRIPT.sub(script_repl, html)
- # save js files in temp dir and compile them with simple optimizations
- for i, script in enumerate(scripts):
- f = open(js_file % i, 'w')
- f.write(script)
- f.close()
- compile(js_file % i, js_output_file % i, html_file)
+ # save js files as temporary files and compile them with simple
+ # optimizations
+
+ js_output_files = []
+ for script in scripts:
+ js_file = tempfile.NamedTemporaryFile()
+ js_file.write(script)
+ js_file.flush()
+ js_output_file = tempfile.NamedTemporaryFile()
+ js_output_files.append(js_output_file)
+ compile(js_file.name, js_output_file.name, html_file)
# now write all compiled js back to html file
- for i in xrange(len(scripts)):
- f = open(js_output_file % i)
- script = f.read()
- f.close()
- html = html.replace(template % i, script)
-
- f = open(html_output_file, 'w')
- f.write(html)
- f.close()
- return finish_compressors(html_output_file, html_file)
+ for idx, js_output_file in enumerate(js_output_files):
+ script = js_output_file.read()
+ html = html.replace(template % idx, script)
+
+ html_output_file.write(html)
+ html_output_file.flush()
+ return finish_compressors(html_output_file.name, html_file)
+
def finish_compressors(new_path, old_path):
- p_size, n_size = getsize(old_path),getsize(new_path)
- os.remove(old_path)
- os.rename(new_path, old_path)
- print ' Ratio: %4.1f%%'% getcompression(p_size, n_size)
- return p_size, n_size
+ p_size, n_size = getsize(old_path), getsize(new_path)
+ shutil.copyfile(new_path, old_path)
+ return p_size, n_size, old_path
+
def compress(path):
- ext = os.path.splitext(path)[1]
- if ext == '.css':
- return compress_css(path)
- elif ext == '.js':
- return compress_js(path)
- elif ext == '.html':
- return compress_html(path)
- uncomp_type_size = getsize(path)
- return (uncomp_type_size, uncomp_type_size)
+ try:
+ ext = os.path.splitext(path)[1]
+ if ext == '.css':
+ return compress_css(path)
+ elif ext == '.js':
+ return compress_js(path)
+ elif ext == '.html':
+ return compress_html(path)
+ uncomp_type_size = getsize(path)
+ return (uncomp_type_size, uncomp_type_size, path)
+ except KeyboardInterrupt:
+ pass
+
def getsize(path):
return os.path.getsize(path)
+
def getcompression(p_size, n_size):
try:
return n_size / float(p_size) * 100
except ZeroDivisionError:
return 100.0
-def compress_all(path):
- if not os.path.exists('temp'):
- os.makedirs('temp')
- print '%17s %45s' % ('Files', 'Compression')
+def compress_all(path, compiler_path, num_procs):
+ print('%45s %s' % ('Files', 'Compression'))
p_size = 0
n_size = 0
if os.path.isfile(path):
- p_size, n_size = compress(path)
+ p_size, n_size, oldpath = compress(path)
else:
+ files_to_compress = []
for root, dirs, files in os.walk(path):
- if 'temp' in root:
- continue
for file in files:
- dp, dn = compress(os.path.join(root, file))
+ files_to_compress.append(os.path.join(root, file))
+
+ if num_procs >= 0:
+ p = multiprocessing.Pool(num_procs)
+ result = p.imap(compress, files_to_compress, 1)
+
+ count_done = 0
+ count_total = len(files_to_compress)
+ try:
+ item = None
+ while count_done < count_total:
+ try:
+ item = result.next(0.5)
+ count_done += 1
+ except multiprocessing.TimeoutError:
+ continue
+ dp, dn, path = item
+ p_size += dp
+ n_size += dn
+ try:
+ ratio = dn / float(dp) * 100
+ except ZeroDivisionError:
+ ratio = 100.0
+ smallpath = os.path.basename(path)
+ smallpath = smallpath[:40] + (smallpath[40:] and '..')
+ print('%45s %4.1f%%' % (smallpath, ratio))
+ except KeyboardInterrupt:
+ p.terminate()
+ raise
+ else:
+ for file in files_to_compress:
+ try:
+ (dp, dn, path) = compress(file)
+ except TypeError:
+ break
p_size += dp
n_size += dn
+ try:
+ ratio = dn / float(dp) * 100
+ except ZeroDivisionError:
+ ratio = 100.0
+ smallpath = os.path.basename(path)
+ smallpath = smallpath[:40] + (smallpath[40:] and '..')
+ print('%45s %4.1f%%' % (smallpath, ratio))
compression = getcompression(p_size, n_size)
- shutil.rmtree("temp")
sizes = "Initial size: %.1fKB Final size: %.1fKB" % \
- (p_size/1024., n_size/1024.)
- print '%s %s' % (sizes.ljust(51), "%4.1f%%" % compression)
+ (p_size / 1024., n_size / 1024.)
+ print('%s %s' % (sizes.ljust(51), "%4.1f%%" % compression))
if __name__ == '__main__':
- if len(sys.argv) == 1:
- print('usage: python pyjs_compressor.py <pyjamas_output_dir> [<path to compiler.jar>]')
- sys.exit()
- elif len(sys.argv) == 2:
- dir = sys.argv[1]
- if not os.environ.has_key('COMPILER'):
- sys.exit('environment variable COMPILER is not defined.\n'
- 'In bash, export '
- 'COMPILER=/home/me/google/compiler/compiler.jar or pass the path to your compiler.jar as the second argument.')
- COMPILER = os.environ['COMPILER']
+ try:
+ import argparse
+ # Available only on Python 2.7+
+ mode = 'argparse'
+ except ImportError:
+ import optparse
+ mode = 'optparse'
+
+ # Take one position argument (directory)
+ # and optional arguments for compiler path and multiprocessing
+
+ num_procs = -1 # By default, disable multiprocessing
+
+ global compiler_path
+
+ if mode == 'argparse':
+ parser = argparse.ArgumentParser(
+ description='Compress HTML, CSS and JS in PYJS output')
+
+ parser.add_argument('directory', type=str,
+ help='Pyjamas Output Directory')
+ parser.add_argument('-c', '--compiler', type=str, default='',
+ help='Path to Google Closure compiler.jar')
+ parser.add_argument('-j', metavar='NUM', default=-1, type=int,
+ dest='num_procs',
+ help='Run NUM processes in parallel')
+ args = parser.parse_args()
+ directory = args.directory
+ compiler_path = args.compiler
+ try:
+ num_procs = args.num_procs
+ except:
+ num_procs = -1
else:
- dir = sys.argv[1]
- COMPILER = sys.argv[2]
+ # Use optparse
+ usage = 'usage: %prog [options] <pyjamas-output-directory>'
+ parser = optparse.OptionParser(usage=usage)
+ parser.add_option('-c', '--compiler', type=str, default='',
+ help='Path to Google Closure compiler.jar')
+ parser.add_option('-j', metavar='NUM', default=-1, type=int,
+ dest='num_procs',
+ help='Run NUM processes in parallel')
+ options, args = parser.parse_args()
+ if len(args) != 1:
+ parser.error('Please specify the directory to compress')
+
+ directory = args[0]
+ compiler_path = args.compiler
+ try:
+ num_procs = args.num_procs
+ except:
+ num_procs = -1
+
+ if not enable_multiprocessing:
+ num_procs = -1
+ print("multiprocessing not available.")
+
+ if num_procs == 0:
+ print("Detecting cpu_count")
+ try:
+ num_procs = multiprocessing.cpu_count()
+ except NotImplementedError:
+ print("Could not determine CPU Count. Using One process")
+ num_procs = 1
+
+ if num_procs > 0:
+ print("Running %d processes" % num_procs)
+
+ if not compiler_path:
+ # Not specified on command line
+ # Try environment
+ try:
+ compiler_path = os.environ['COMPILER']
+ except KeyError:
+ sys.exit('Closure compiler not found\n'
+ 'Either specify it using the -c option,\n'
+ 'or set the COMPILER environment variable to \n'
+ 'the location of compiler.jar')
+
+ if not os.path.isfile(compiler_path):
+ sys.exit('\n'.join([
+ 'Compiler path "%s" not valid.' % compiler_path,
+ 'Check the path to your compiler is correct.']))
- if not os.path.isfile(COMPILER):
- raise Exception, 'Compiler path "%s" not valid. Check the path to your compiler is correct.' % COMPILER
- compress_all(dir)
+try:
+ compress_all(directory, compiler_path, num_procs)
+except KeyboardInterrupt:
+ print()
+ print('Compression Aborted')
8 library/pyjamas/HTTPRequest.ie6.py
@@ -1,5 +1,11 @@
class HTTPRequest(object):
def doCreateXmlHTTPRequest(self):
- return JS("""new ActiveXObject("Msxml2['XMLHTTP']")""")
+ try:
+ return JS("""new ActiveXObject("Msxml2['XMLHTTP']")""")
+ except:
+ try:
+ return JS("""new ActiveXObject("Microsoft.XMLHTTP")""")
+ except:
+ return JS("""new window.XMLHttpRequest()""")