Sort functions by their asymptotic complexity in increasing order
>>> import complexity_sort >>> sample_funcs = ['1.000001 ** n', ... 'n**0.9999999 * log(n)', ... '1000000 * n', ... 'n**2', ... 'sin(n) + 10000000'] >>> complexity_sort.sort(sample_funcs, parser='sympy') [sin(n) + 10000000, n**0.9999999*log(n), 1000000*n, n**2, 1.000001**n]
- SymPy 1.0, although it may work on versions as early as 0.7.7 (untested)
The interface of complexity_sort comprises of a single function,
>>> help(complexity_sort.sort) Help on function sort in module complexity_sort.complexity_sort: sort(it, variable=None, parser=None) Sort an iterable of functions by their asymptotic complexity. Admissible functions: - Must have ranges restricted to the non-negative real numbers. Parameters: it (str or sympy.Expr) : Iterable of functions to be sorted variable (str or sympy.Expr, optional): Variable to compare complexities with respect to. Only necessary if there are multiple variables in the functions. Defaults to None. parser (str, optional): Parser with which to parse functions if iterable is of string type. Options are 'sympy' and 'mathematica'. Defaults to None. Returns: list : A sorted list of functions upon success. Raises: ValueError: - If the iterable does not contain enough comparable elements to sort.
The important thing to note here is that all functions must have ranges restricted to the positive reals in order to be admissible. This does include oscillating functions.
Furthermore, complexity_sort is not guaranteed to produce a sorted list for all admissible functions. It is guaranteed to produce a correct sorted list should it prove capable of returning a sorted list however, or at least that is the idea.
complexity_sort is nothing more than the application of a few ideas, leaning heavily on the symbolic mathematics library SymPy.
How it works
Direct approach first
We first try to sort the iterable simply using Python's built-in Timsort algorithm1 by applying a custom comparator between functions.
This custom comparator:
- Tries to directly apply Lemma 1 to determine the relation between two functions.
- If the limit for Lemma 1 does not exist (often because the function is oscillating), the comparator tries to apply Lemma 2 to determine the relation.
Falling back to topological sort
Should both attempts between two functions fail with the custom comparator, we reconsider our sorting with a directed graph2.
We let each function denote a vertex in our graph, and let the directed edges represent a "greater-than" relation between two functions.
We then attempt to apply a topological sort to the graph with the hope that it contains no directed cycles3. This is permissible due to the transitive relational properties of the asymptotic definitions.
Note: Although the comments above are the conventional way to describe topological sort, to me it is more intuitive to see it as a product of the Szpilrajn extension theorem, which intuitively just says that a set does not have to assert that every element is greater than or less than some other element.
Consider the list of functions from above:
We cannot easily determine the relation between
10000000n with Lemma 1 and Lemma 2. However as long as they are separable in asymptotic complexity order by at least one other function we can still we can sort the list.
More formally, we require that our corresponding graph is a directed acyclic graph4. Our corresponding directed graph in this case looks something like this.
This graph is evidently acyclic, as
10000000n is linear,
n^2 is quadratic, and
1.000001^n is exponential =>
n^2 ensures there is no cycle between the incomparable elements. Correspondingly,
complexity_sort.sort can successfully sort the functions.
Should both lemmas fail with direct sorting and our directed graph contain a cycle,
complexity_sort.sort will indicate that it does not have enough comparable elements to sort the iterable.
>>> a = ['1.000001 ** n', ... 'n**0.99999999 * log(n)', ... '10000000*n', ... 'sin(n) + 100000'] # remove the quadratic n^2 >>> complexity_sort.sort(a, parser='sympy') ValueError: List is not sortable with number of comparable elements.
To be implemented
More test cases on a greater diversity of function lists.
Provide proofs for these lemmas using basic limit definitions.
If you see anything else that could be improved, by all means go for it or drop me a mail and let me know your thoughts. My email can be find on my personal information page.
Copyright (c) 2016 Mitchell Edward Snaith
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- 1 : Timsort (Wiki): https://en.wikipedia.org/wiki/Timsort
- 2 : Directed graph (Wiki) : https://en.wikipedia.org/wiki/Directed_graph
- 3 : Directed cycles (Wiki) : https://en.wikipedia.org/wiki/Cycle_graph#Directed_cycle_graph
- 4 : Directed acyclic graph (Wiki) : https://en.wikipedia.org/wiki/Directed_acyclic_graph