dgasmith/opt_einsum

update docs with function references

jcmgray committed Mar 4, 2018
1 parent 7da53b7 commit 47dac4d77eebf8d4916312b5cbb6204f44554706
 @@ -61,3 +61,5 @@ docs/_build/ target/ \.pytest_cache/ docs/source/autosummary/
@@ -195,7 +195,7 @@ terms = ['a', 'a'] contraction = (0, 1)
 The most optimal path can be found by searching through every possible way to contract the tensors together, this includes all combinations with the new intermediate tensors as well.
 While this algorithm scales like N! and can often become more costly to compute than the unoptimized contraction itself, it provides an excellent benchmark.
-The function that computes this path in opt_einsum is called _path_optimal and works by iteratively finding every possible combination of pairs to contract in the current list of tensors.
+The function that computes this path in opt_einsum is called optimal and works by iteratively finding every possible combination of pairs to contract in the current list of tensors.
 This is iterated until all tensors are contracted together.
 The resulting paths are then sorted by total flop cost and the lowest one is chosen.
 This algorithm runs in about 1 second for 7 terms, 15 seconds for 8 terms, and 480 seconds for 9 terms limiting its overall usefulness for a large number of terms.
 By considering limited memory this can be sieved and can reduce the cost of computing the optimal function by an order of magnitude or more.
@@ -0,0 +1,13 @@
+==================
+Function Reference
+==================
+
+.. autosummary::
+    :toctree: autosummary
+
+    opt_einsum.contract
+    opt_einsum.contract_path
+    opt_einsum.contract_expression
+    opt_einsum.contract.ContractExpression
+    opt_einsum.paths.optimal
+    opt_einsum.paths.greedy
 @@ -19,6 +19,9 @@ # import os import sys import numpy import opt_einsum sys.path.insert(0, os.path.abspath('..')) # -- General configuration ------------------------------------------------ @@ -32,18 +35,23 @@ # ones. extensions = [ 'sphinx.ext.autodoc', 'sphinx.ext.autosummary', 'sphinx.ext.doctest', 'sphinx.ext.todo', 'sphinx.ext.coverage', 'sphinx.ext.mathjax', 'sphinx.ext.viewcode', 'sphinx.ext.napoleon', 'sphinx.ext.intersphinx', ] napoleon_google_docstring = False napoleon_use_param = False napoleon_use_ivar = True autosummary_generate = True autodoc_default_flags = ['members'] # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] @@ -144,8 +152,8 @@ else: html_context = { 'css_files': [ '//media.readthedocs.org/css/sphinx_rtd_theme.css', '//media.readthedocs.org/css/readthedocs-doc-embed.css', '//media.readthedocs.org/css/sphinx_rtd_theme.css', '//media.readthedocs.org/css/readthedocs-doc-embed.css', '_static/theme_overrides.css' ] } @@ -187,7 +195,7 @@ # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ['_static'] # html_static_path = ['_static'] # Add any extra paths that contain custom files (such as robots.txt or # .htaccess) here, relative to this directory. These files are copied @@ -388,3 +396,9 @@ def setup(app): 'issue': ('https://github.com/dgasmith/opt_einsum/issues/%s', 'GH#'), 'pr': ('https://github.com/dgasmith/opt_einsum/pull/%s', 'GH#') } intersphinx_mapping = { 'python': ('https://docs.python.org/3.6/', None), 'numpy': ('http://docs.scipy.org/doc/numpy/', None), }
@@ -1,9 +1,9 @@
-===========
-Greedy Path
-===========
+===============
+The Greedy Path
+===============

-Another way to find a path is to choose the best pair to contract at every iteration so that the formula scales like N^3. The "best" contraction pair is currently determined by the smallest of the tuple (-removed_size, cost) where removed size represents the product of the size of indices removed from the overall contraction and cost is the cost of the contraction.
+Another way to find a path is to choose the best pair to contract at every iteration so that the formula scales like N^3 - functionality provided by :func:`~opt_einsum.paths.greedy`.
 Basically, we want to remove the largest dimensions at the least cost.
 To prevent large outer products the results are sieved by the amount of memory available.
 Overall, this turns out to work extremely well and is only slower than the optimal path in several cases, and even then only by a factor of 2-4 while only taking 1 millisecond for terms of length 10.
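As a rough illustration of the selection rule quoted in this hunk, the sketch below contracts, at each step, the pair with the smallest `(-removed_size, cost)` tuple. It is a simplified stand-in and not the code of `opt_einsum.paths.greedy`: the names `greedy_path` and `product` are hypothetical, `cost` is assumed to be the product of the sizes of every index involved in the pair, and the memory sieve mentioned in the text is left out.

```python
from itertools import combinations


def product(indices, sizes):
    """Multiply together the sizes of the given indices (1 if there are none)."""
    total = 1
    for idx in indices:
        total *= sizes[idx]
    return total


def greedy_path(input_sets, output_set, sizes):
    """Hypothetical helper: pick the smallest (-removed_size, cost) pair each step."""
    remaining = [set(s) for s in input_sets]
    path = []
    while len(remaining) > 1:
        best = None
        for i, j in combinations(range(len(remaining)), 2):
            rest = [s for k, s in enumerate(remaining) if k not in (i, j)]
            # Indices still needed later: in the output or in another tensor.
            needed = set(output_set).union(*rest)
            involved = remaining[i] | remaining[j]
            removed = involved - needed
            # Assumed cost model: product of all index sizes touched by the pair.
            key = (-product(removed, sizes), product(involved, sizes))
            if best is None or key < best[0]:
                best = (key, (i, j), rest + [involved & needed])
        _, pair, remaining = best
        path.append(pair)
    return path


isets = [set('abd'), set('ac'), set('bdc')]
idx_sizes = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
print(greedy_path(isets, set(), idx_sizes))  # [(0, 2), (0, 1)]
```

The inputs are the ones used in the `greedy` docstring touched by this commit, and the sketch arrives at the same path under its simplified cost model.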
@@ -2,7 +2,7 @@
 opt_einsum
 ==========

-Einsum is a very powerful function for contracting tensors of arbitrary
+:func:`~numpy.einsum` is a very powerful function for contracting tensors of arbitrary
 dimension and index. However, it is only optimized to contract two terms at a time resulting in non-optimal scaling.
@@ -96,10 +96,19 @@ We can then view more details about the optimized contraction order:
     install
     examples

 .. toctree::
     :maxdepth: 2
     :caption: Path Information:

     path_finding
+    reusing paths
     optimal_path
     greedy_path
+
+.. toctree::
+    :maxdepth: 1
+    :caption: Function Reference:
+
+    api
@@ -4,7 +4,7 @@ The Optimal Path
 The most optimal path can be found by searching through every possible way to contract the tensors together, this includes all combinations with the new intermediate tensors as well.
 While this algorithm scales like N! and can often become more costly to compute than the unoptimized contraction itself, it provides an excellent benchmark.
-The function that computes this path in opt_einsum is called _path_optimal and works by iteratively finding every possible combination of pairs to contract in the current list of tensors.
+The function that computes this path in opt_einsum is called :func:`~opt_einsum.paths.optimal` and works by iteratively finding every possible combination of pairs to contract in the current list of tensors.
 This is iterated until all tensors are contracted together.
 The resulting paths are then sorted by total flop cost and the lowest one is chosen.
 This algorithm runs in about 1 second for 7 terms, 15 seconds for 8 terms, and 480 seconds for 9 terms limiting its overall usefulness for a large number of terms.
 By considering limited memory this can be sieved and can reduce the cost of computing the optimal function by an order of magnitude or more.
@@ -32,6 +32,7 @@ We can consider three possible combinations where we contract list positions (0,
     [ (9504, [(0, 1)], [set(['a', 'c']), set(['a', 'c', 'b', 'd']) ]),
       (1584, [(0, 2)], [set(['c', 'd']), set(['c', 'b']) ]),
       (864, [(1, 2)], [set(['a', 'c', 'b']), set(['a', 'c', 'd']) ])]
+
 We have now run through the three possible combinations, computed the cost of the contraction up to this point, and appended the resulting indices from the contraction to the list.
 As all contractions only have two remaining input sets the only possible contraction is (0, 1):
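The exhaustive search described in this hunk can be sketched roughly as follows. This is an illustration only, not the code of `opt_einsum.paths.optimal`: the names `contraction_cost` and `exhaustive_path` are hypothetical, the flop cost of a step is assumed to be the product of the sizes of all indices involved, and the memory-based sieving mentioned in the text is omitted. Candidates are stored as `(cost, path, remaining index sets)` tuples, mirroring the list shown above.

```python
from itertools import combinations


def contraction_cost(a, b, rest, output, sizes):
    """Return (cost, result_indices) for contracting index sets a and b."""
    # Indices that must survive: those in the output or in any other tensor.
    needed = set(output).union(*rest)
    involved = a | b
    # Assumed cost model: product of the sizes of every index involved.
    cost = 1
    for idx in involved:
        cost *= sizes[idx]
    return cost, involved & needed


def exhaustive_path(input_sets, output_set, sizes):
    """Hypothetical helper: try every pairwise order, return (cost, path)."""
    candidates = [(0, [], [set(s) for s in input_sets])]
    # Each pass contracts one more pair in every candidate.
    for _ in range(len(input_sets) - 1):
        extended = []
        for cost, path, remaining in candidates:
            for i, j in combinations(range(len(remaining)), 2):
                rest = [s for k, s in enumerate(remaining) if k not in (i, j)]
                step, result = contraction_cost(
                    remaining[i], remaining[j], rest, output_set, sizes)
                extended.append((cost + step, path + [(i, j)], rest + [result]))
        candidates = extended
    # Keep the candidate with the lowest total cost.
    best = min(candidates, key=lambda c: c[0])
    return best[0], best[1]


isets = [set('abd'), set('ac'), set('bdc')]
idx_sizes = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
print(exhaustive_path(isets, set(), idx_sizes))  # (27, [(0, 2), (0, 1)])
```

The number of candidates grows very quickly with the number of terms, which is consistent with the timings quoted in the text; pruning candidates that exceed a memory limit, as the text suggests, would be the natural next refinement.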
 @@ -21,20 +21,20 @@ This is a single possible path to the final answer (and notably, not the most op .. code:: python import opt_einsum as oe # Take a complex string einsum_string = 'bdik,acaj,ikab,ajac,ikbd->' # Build random views to represent this contraction unique_inds = set(einsum_string.replace(',', '')) index_size = [10, 17, 9, 10, 13, 16, 15, 14] sizes_dict = {c : s for c, s in zip(set(einsum_string), index_size)} views = oe.helpers.build_views(einsum_string, sizes_dict) path_info = oe.contract_path(einsum_string, *views) >>> print path_info [(1, 3), (0, 2), (0, 2), (0, 1)]   >>> print path_info
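One detail of the snippet in this hunk is worth a second look: `set(einsum_string)` also contains the separator characters `,`, `-` and `>`, and `zip` stops after the eight entries of `index_size`, so depending on set ordering an index letter could end up without a size. A minimal variant that maps only the index letters is sketched below. Sorting the letters is an assumption made here for reproducibility, so the sizes assigned will not necessarily match those of the original run.

```python
einsum_string = 'bdik,acaj,ikab,ajac,ikbd->'
index_size = [10, 17, 9, 10, 13, 16, 15, 14]

# Keep only index letters; ',', '-' and '>' are separators, not indices.
unique_inds = sorted(c for c in set(einsum_string) if c.isalpha())

# zip stops at the shorter sequence, so the unused eighth size is ignored.
sizes_dict = dict(zip(unique_inds, index_size))

# Every index used by an input term now has a size.
inputs = einsum_string.split('->')[0].split(',')
assert all(c in sizes_dict for term in inputs for c in term)
print(sizes_dict)
```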
@@ -0,0 +1,28 @@
+=============
+Reusing paths
+=============
+
+If you expect to repeatedly use a particular contraction it can make things simpler and more efficient to not compute the path each time. Instead, supplying :func:`~opt_einsum.contract_expression` with the contraction string and the shapes of the tensors generates a :class:`~opt_einsum.contract.ContractExpression` which can then be repeatedly called with any matching set of arrays. For example:
+
+.. code:: python
+
+    >>> my_expr = oe.contract_expression("abc,cd,dbe->ea", (2, 3, 4), (4, 5), (5, 3, 6))
+    >>> print(my_expr)
+    <ContractExpression> for 'abc,cd,dbe->ea':
+      1.  'dbe,cd->bce' [GEMM]
+      2.  'bce,abc->ea' [GEMM]
+
+Now we can call this expression with 3 arrays that match the original shapes without having to compute the path again:
+
+.. code:: python
+
+    >>> x, y, z = (np.random.rand(*s) for s in [(2, 3, 4), (4, 5), (5, 3, 6)])
+    >>> my_expr(x, y, z)
+    array([[ 3.08331541, 4.13708916],
+           [ 2.92793729, 4.57945185],
+           [ 3.55679457, 5.56304115],
+           [ 2.6208398 , 4.39024187],
+           [ 3.66736543, 5.41450334],
+           [ 3.67772272, 5.46727192]])
+
+Note that few checks are performed when calling the expression, and while it will work for a set of arrays with the same ranks as the original shapes but differing sizes, it might no longer be optimal.
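To see why the result printed in this new page has six rows and two columns, the shape bookkeeping can be sketched in plain Python. The helper `output_shape` is hypothetical and is not part of opt_einsum; it only assumes the explicit `inputs->output` subscript form used in the example, and it performs the kind of rank and size checks that, per the note in the page, the expression itself largely skips.

```python
def output_shape(subscripts, *shapes):
    """Hypothetical helper: infer the result shape of an einsum-style contraction."""
    inputs, output = subscripts.split('->')
    terms = inputs.split(',')
    if len(terms) != len(shapes):
        raise ValueError('expected %d shapes, got %d' % (len(terms), len(shapes)))
    sizes = {}
    for term, shape in zip(terms, shapes):
        if len(term) != len(shape):
            raise ValueError('term %r does not match the rank of %r' % (term, shape))
        for idx, dim in zip(term, shape):
            # setdefault records the first size seen and returns it afterwards.
            if sizes.setdefault(idx, dim) != dim:
                raise ValueError('index %r has conflicting sizes' % idx)
    return tuple(sizes[idx] for idx in output)


print(output_shape('abc,cd,dbe->ea', (2, 3, 4), (4, 5), (5, 3, 6)))  # (6, 2)
```

Arrays of the same ranks but different sizes pass the same check as long as shared indices agree, for example `(3, 3, 4), (4, 7), (7, 3, 2)`, which is the situation the closing note of the page describes.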
@@ -25,14 +25,14 @@ def contract_path(*operands, **kwargs): - if a list is given uses this as the path. - 'greedy' An algorithm that chooses the best pair contraction at each step. Scales cubically with the number of terms in the contraction. at each step. Scales cubically with the number of terms in the contraction. - 'optimal' An algorithm that tries all possible ways of contracting the listed tensors. Scales exponentially with the number of terms in the contraction. contracting the listed tensors. Scales exponentially with the number of terms in the contraction. use_blas : bool Use BLAS functions or not memory_limit : int, optional (default: largest input or output array size) Maximum number of elements allowed in intermediate arrays. @@ -305,11 +305,11 @@ def contract(*operands, **kwargs): - if a list is given uses this as the path. - 'greedy' An algorithm that chooses the best pair contraction at each step. Scales cubically with the number of terms in the contraction. at each step. Scales cubically with the number of terms in the contraction. - 'optimal' An algorithm that tries all possible ways of contracting the listed tensors. Scales exponentially with the number of terms in the contraction. contracting the listed tensors. Scales exponentially with the number of terms in the contraction. memory_limit : int or None (default : None) The upper limit of the size of tensor created, by default this will be @@ -339,7 +339,7 @@ def contract(*operands, **kwargs): Examples -------- See opt_einsum.contract_path or numpy.einsum See :func:`opt_einsum.contract_path` or :func:`numpy.einsum` """ optimize_arg = kwargs.pop('optimize', True)
@@ -32,7 +32,7 @@ def optimal(input_sets, output_set, idx_dict, memory_limit):
     >>> isets = [set('abd'), set('ac'), set('bdc')]
     >>> oset = set('')
     >>> idx_sizes = {'a': 1, 'b':2, 'c':3, 'd':4}
-    >>> _path_optimal(isets, oset, idx_sizes, 5000)
+    >>> optimal(isets, oset, idx_sizes, 5000)
     [(0, 2), (0, 1)]
     """
@@ -206,7 +206,7 @@ def greedy(input_sets, output_set, idx_dict, memory_limit):
     >>> isets = [set('abd'), set('ac'), set('bdc')]
     >>> oset = set('')
     >>> idx_sizes = {'a': 1, 'b':2, 'c':3, 'd':4}
-    >>> _path_greedy(isets, oset, idx_sizes, 5000)
+    >>> greedy(isets, oset, idx_sizes, 5000)
     [(0, 2), (0, 1)]
     """