A fast Python tool for creating permutations of alphanumerics
What it does
If you have an input file containing one element per line:

```
1
2
3
```

this will produce:

```
1 2 3 12 13 21 23 31 32 123 132 213 231 312 321
```
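The behaviour above can be sketched with itertools.permutations (a minimal illustration, not the tool's exact code):

```python
from itertools import permutations

items = ["1", "2", "3"]
results = []
for length in range(1, len(items) + 1):      # every output length
    for p in permutations(items, length):    # ordered arrangements, no repeats
        results.append("".join(p))

print(" ".join(results))
# 1 2 3 12 13 21 23 31 32 123 132 213 231 312 321
```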
How to run it
I recommend using pypy for maximum speed.

```
pypy3 permute.py <input file> <output base name>
```
This will create multiple files, with the length of the strings in each file prepended to its name, e.g. if you give an output base name of "foo", then "7foo" will contain all permutations of length 7.
You can combine all permutations into one file with:
```
cat *<output base name> > <output base name>
```
Generating specific lengths
You can limit the lengths of output strings generated using permute-minmax.py and passing it the minimum and maximum lengths you'd like to generate, e.g.

```
pypy3 permute-minmax.py <input file> <output base name> <min length> <max length>
```
As with permute.py you can concatenate the multiple output files together if you'd like to combine them.
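Limiting the lengths just means narrowing the range handed to permutations(); a hedged sketch (the function and parameter names here are illustrative, not necessarily what permute-minmax.py uses):

```python
from itertools import permutations

def permute_minmax(items, min_len, max_len):
    """Yield joined permutations whose length lies in [min_len, max_len]."""
    for length in range(min_len, max_len + 1):
        for p in permutations(items, length):
            yield "".join(p)

# Only length-2 strings from the earlier "1 2 3" example:
out = list(permute_minmax(["1", "2", "3"], 2, 2))
# ['12', '13', '21', '23', '31', '32']
```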
When I first started creating this tool, it generated about 100M of output in 5s on my MBP running on a nearly dead battery. Currently, the code clocks in at 1.4G written in 5s on the same laptop. Here's a list of the things that worked and the things that didn't.
What worked
pypy is much faster than CPython thanks to its Just-In-Time compiler, with benchmark magic documented at http://speed.pypy.org/. For example, using the following code:

```python
from sys import argv
from itertools import permutations

with open(argv[1], 'r') as infile:
    start = infile.read().split("\n")
with open(argv[2], 'w') as outfile:
    for i in range(0, len(start) + 1):
        for x in permutations(start, i):
            outfile.write(''.join(x) + '\n')
```
I got the following output written in 5s:
```
INTERPRETER=python  SCRIPT=permute.py 117M out.txt
INTERPRETER=python3 SCRIPT=permute.py 131M out.txt
INTERPRETER=pypy    SCRIPT=permute.py 261M out.txt
INTERPRETER=pypy3   SCRIPT=permute.py 253M out.txt
```
That's a nearly 2x speedup from python3 to pypy3.
After reading https://www.reddit.com/r/unix/comments/6gxduc/how_is_gnu_yes_so_fast/ I decided to implement a simple counter that appended output to a temporary buffer, which was periodically flushed to disk. I hand-bisected flush thresholds from 10k down to 1, and 18 worked best for my data set. I suspect this is quite specific to my data set and machine, and you may have luck with other values.
This gave a 1.3X speedup on the pypy3 best speed from above.
Multithreading was a complete disaster; multiprocessing worked well.
This presented three problems: the first was “what work do I spawn in a different process”, the second was “how do I write to one file”, and the third was “how do I pass resources like file handles between processes”. I struggled to split up the work of the permutations() call into different processes, so settled instead on splitting each run of the for loop into a different process. Practically, this means you’ll have as many processes as lines in your input file, and that the first few processes will exit quickly (e.g. the first process will just re-output your input file). So by the end of a long run, you’ll have just one process taxing one core.
I didn’t handle the “how do I write to one file” problem, and instead wrote to multiple files, then cat them together (aka a hack).
This gives another 4.4X speedup on top of the pypy and hand-buffering speedups. Unbuffered, it still gives a 2.8X speedup.
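The per-length process split can be sketched roughly like this (illustrative, assuming the `<length><base>` file-naming convention from earlier and a base name without a directory component):

```python
from itertools import permutations
from multiprocessing import Process

def write_length(items, length, base):
    # Each process owns its own output file, so no file handles need to
    # cross process boundaries.
    with open(str(length) + base, "w") as outfile:
        for p in permutations(items, length):
            outfile.write("".join(p) + "\n")

def permute_parallel(items, base):
    procs = [Process(target=write_length, args=(items, n, base))
             for n in range(1, len(items) + 1)]
    for proc in procs:
        proc.start()
    for proc in procs:   # short lengths finish first; the longest runs last
        proc.join()
```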
What didn't work
Not using strings
In Python, strings are immutable (i.e. they can’t be changed). That means appending to a string creates a whole new string object. Lists are mutable and can be modified in place. The idea was that if I append()’ed to the buffer as a list, it would be less intensive. While this makes sense, the final list needs to be join()’ed, which I suspect is the slowdown on larger lists. Thus keeping the buffer as a string made more sense.
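For reference, the two buffer styles side by side (a reconstruction of the idea, not my benchmarked code):

```python
chunks = ["ab", "cd", "ef"]

# String buffer (what I kept): every += allocates a new string object.
buf = ""
for s in chunks:
    buf += s + "\n"

# List buffer (the rejected idea): append() is cheap, but the final
# join() over a large list is where I suspect the time went.
parts = []
for s in chunks:
    parts.append(s + "\n")
joined = "".join(parts)

assert buf == joined  # identical output either way
```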
Additionally, if you google for example usage of permutations() like a good programmer, you’ll see lots of people wrap the call in list() so they get a list instead of an iterator. I was doing that originally, and ended up with a program that would get killed by the OOM killer after the list grew to multiple Gs. That’s why I chose to write results to a file as soon as I got them.
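In other words, permutations() already returns a lazy iterator, so you can stream results straight to disk instead of materializing them; a minimal sketch:

```python
from itertools import permutations

items = ["1", "2", "3"]

# Dangerous on big inputs: list() materializes every permutation in RAM.
# all_perms = list(permutations(items, len(items)))

# Streaming instead: each permutation is handled and discarded immediately.
count = 0
for p in permutations(items, len(items)):
    line = "".join(p)   # in the real tool this goes straight to the output file
    count += 1
```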
Using fancy Python
Python has two handy features, map() and lambda. map() runs a function across an iterable in a single call instead of you hand-looping; lambda lets you define functions on the fly. So instead of this code:

```python
for x in permutations(start, i):
    outfile.write(''.join(x) + '\n')
```
You could have this code:
```python
map((lambda x: outfile.write(''.join(x) + '\n')), permutations(start, i))
```
This was, unfortunately, slower. (Note that in Python 3, map() is lazy, so the result has to be consumed for the writes to happen at all.)
Next up, I tried list comprehensions. These are another way to run a function across a list, with an optional filter. I didn’t need the filter, so tried this:

```python
[outfile.write(''.join(x) + '\n') for x in permutations(start, i)]
```
While pretty, it too was slower.
I mentioned this earlier, but threading turned into a huge failure, and I’m still not sure why, which bugs me. Python has some famous weirdness with multithreading due to the Global Interpreter Lock (GIL). I thought this wouldn’t be too much of a problem, given that it’s not supposed to impact I/O. It turns out it does, or at least my code was bad.
First I tried something really simple: using the threaded map() call with my fancy Python from above. Practically, that meant changing the map call from the above to:

```python
...
from multiprocessing.dummy import Pool as ThreadPool
...
pool = ThreadPool(4)
...
pool.map((lambda x: outfile.write(''.join(x) + '\n')), permutations(start, i))
...
```
This was a spectacular failure. Next I tried more vanilla threading as described at https://pymotw.com/2/threading/. It looked so similar to the thread-pool example that it’s not worth repeating here. Needless to say, it was the worst performer of any approach, with a 10x slowdown.