Allow singletonizing Symbols #6
Conversation
Sure, but next week sometime.
Just to evaluate the possible advantage of using the singletonized version of symbols, here is a comparison of how long it takes to compare strings versus objects on my computer. This is the code I used for the comparison:
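The original snippet was not captured in this thread; a minimal stand-in benchmark (names and sizes are illustrative, not the actual code) might look like:

```python
import timeit

# Illustrative micro-benchmark: compare the cost of string equality against
# object identity, which is what singletonized Symbols would let us use.
a = "Global`SomeLongSymbolName"
b = "".join(["Global`", "SomeLongSymbolName"])  # equal value, distinct object

t_string = timeit.timeit("a == b", globals={"a": a, "b": b}, number=1_000_000)

class Symbol:
    """Plain object standing in for a singletonized symbol."""

s = Symbol()
t_identity = timeit.timeit("s is s", globals={"s": s}, number=1_000_000)

print(f"string ==: {t_string:.3f}s   identity is: {t_identity:.3f}s")
```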
Ok. This is for Python (what version?) versus Cython. Now how about a comparison using Pyston and PyPy? I hear that Microsoft is also coming out with a performance-enhanced version as well. And does the performance change with the Python version?
@rocky if Cython becomes mandatory, PyPy won't be supported anymore.
And maybe other JITs and Python implementations, current and future. That is why we need to look at benchmarks on other systems and not be so cavalier about rushing off to solutions, especially ones that change the code base a lot, before we have carefully considered the possibilities and made little benchmark tests of them first. Customizing for a specific kind of speedup complicates code (which, when not documented, is a mess) and sometimes makes it harder to undo when the current assumptions are superseded in light of changing understanding and technology. Lastly, I should note that code can be written in such a way that we get Cython speedups when desired, without Cython being mandatory.
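One common way to keep Cython optional is the import-with-fallback pattern. A minimal sketch (the module name `_expression_cy` and the function are hypothetical):

```python
# Sketch of optional Cython: use the compiled extension when it exists,
# and fall back to the pure-Python implementation otherwise, so PyPy
# and other interpreters keep working.
try:
    from _expression_cy import same_symbol  # Cython-built module, if present
except ImportError:
    def same_symbol(a, b):
        # Pure-Python fallback; identical behavior, just slower.
        return a == b
```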
Yes, an even better solution is to mix .py files carrying the code with .pxd files carrying the custom type declarations. Today I'm going to see what needs type declarations (they speed up the code a bit [when Cython is enabled]; it can reach 20% in some cases...).
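As a sketch of that approach (file names and declarations are hypothetical), Cython's "augmenting .pxd" mode lets a plain .py module stay runnable under any interpreter, while a sibling .pxd adds C type declarations that are only used when the module is compiled:

```python
# mymodule.py -- plain Python: runs unchanged on CPython, PyPy, etc.
def norm2(x, y):
    return x * x + y * y
```

```cython
# mymodule.pxd -- read only when Cython compiles mymodule.py,
# adding C type declarations without touching the .py source.
cpdef double norm2(double x, double y)
```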
My idea is that, for the most critical parts of the code, we could have different versions depending on which interpreter is available.
@rocky , I didn't check that. |
@mmatera |
This is going the wrong way. We don't want to have tests that start failing depending on options. There is some flakiness somewhere, and we don't want that. We have had that in the past because it was a given and we didn't know how to remove it. However, if something works intermittently, in my opinion it should be treated as a failure, unless there is a very good reason why we need to tolerate it.
Yes, I believe so. I am saying that performance gains from Cython are not a reason to lock us into Cython. Comparing object locations (identities) should be faster in straight Python too.
Totally agree: the other option would be just to skip the failing test, but that is not the idea. The good thing about having this extensive set of tests is that we can discover this flakiness when we change something a little bit.
I don't understand, |
Yep, but the result of the test shouldn't depend on the number of threads. I think that with
The tests also pass with The
@mmatera I tested it manually.
Edit: I ran GitHub Actions again, and it's failing; what am I doing wrong?
I tested it. A few type declarations don't change anything; in truth, half of them just slow down the code (to be more specific, the complex ones, e.g.:). Cython is more efficient when used completely, but that is very hard to do (malloc, C++ vectors, and more non-Python things).
Maybe the compilation of the Cython modules?
I ran
Here are the results of profiling the run; these are the 100 calls that take the most time:
ncalls | tottime (s) | percall (s) | cumtime (s) | percall (s) | filename:lineno(function)
---|---|---|---|---|---
8005288 | 21.473 | 0.0 | 359.838 | 0.0 | pattern.py:176(match) |
260903 | 15.363 | 0.0 | 16.168 | 0.0 | clusters.py:830(remove) |
10 | 14.251 | 1.425 | 31.244 | 3.124 | clusters.py:860(reduce) |
625952 | 12.986 | 0.0 | 387.027 | 0.022 | expression.py:1351(evaluate_next) |
4264462 | 12.627 | 0.0 | 331.853 | 0.0 | pattern.py:200(yield_choice) |
60565005 | 9.108 | 0.0 | 9.264 | 0.0 | {built-in method builtins.isinstance} |
4323487 | 8.047 | 0.0 | 364.302 | 0.0 | rules.py:22(apply) |
1659121 | 7.772 | 0.0 | 24.832 | 0.0 | expression.py:720(new) |
61017 | 6.952 | 0.0 | 7.227 | 0.0 | libintmath.py:231(isqrt_fast_python) |
12000024 | 6.282 | 0.0 | 12.864 | 0.0 | definitions.py:430(get_definition) |
876955 | 5.796 | 0.0 | 299.649 | 0.004 | pattern.py:471(match_leaf) |
8018506 | 5.757 | 0.0 | 8.316 | 0.0 | expression.py:2091(sameQ) |
1838672 | 5.263 | 0.0 | 13.676 | 0.0 | expression.py:95(from_python) |
4389077 | 5.242 | 0.0 | 39.355 | 0.0 | pattern.py:72(does_match) |
1311476 | 4.916 | 0.0 | 8.92 | 0.0 | expression.py:1247(sameQ) |
6085 | 4.861 | 0.001 | 10.745 | 0.002 | gammazeta.py:1512(gamma_fixed_taylor) |
8005288 | 4.821 | 0.0 | 17.731 | 0.0 | pattern.py:114(get_attributes) |
5356087 | 4.777 | 0.0 | 342.096 | 0.0 | pattern.py:145(match) |
564735 | 4.702 | 0.0 | 387.401 | 0.036 | expression.py:1295(evaluate) |
1361889 | 4.497 | 0.0 | 6.539 | 0.0 | expression.py:862(_rebuild_cache) |
2957 | 4.425 | 0.001 | 4.425 | 0.001 | {method 'encode' of 'ImagingEncoder' objects} |
8459165 | 4.295 | 0.0 | 13.313 | 0.0 | expression.py:2056(get_attributes) |
9728569 | 4.04 | 0.0 | 5.328 | 0.0 | expression.py:44(ensure_context) |
8461710 | 3.833 | 0.0 | 9.03 | 0.0 | definitions.py:507(get_attributes) |
3340 | 3.808 | 0.001 | 3.834 | 0.001 | :1(hypsum_0_3__QRR_R) |
22221142 | 3.726 | 0.0 | 5.727 | 0.0 | {method 'get' of 'dict' objects} |
5843890 | 3.38 | 0.0 | 6.515 | 0.0 | expression.py:330(get_head_name) |
4779243 | 3.333 | 0.0 | 8.886 | 0.0 | expression.py:1436(rules) |
4267359 | 3.244 | 0.0 | 336.165 | 0.0 | pattern.py:275(yield_head) |
6089 | 3.223 | 0.001 | 5.31 | 0.001 | gammazeta.py:1455(gamma_taylor_coefficients) |
1495612 | 2.942 | 0.0 | 5.9 | 0.0 | expression.py:1999(new) |
25 | 2.787 | 0.111 | 7.766 | 0.311 | plot.py:1625(apply) |
1496102 | 2.733 | 0.0 | 3.261 | 0.0 | util.py:172(subranges) |
1456118 | 2.722 | 0.0 | 297.731 | 0.001 | patterns.py:867(match) |
3673175 | 2.691 | 0.0 | 4.086 | 0.0 | expression.py:260(new) |
4264462 | 2.671 | 0.0 | 333.902 | 0.0 | pattern.py:335(get_pre_choices) |
450884 | 2.548 | 0.0 | 290.736 | 0.005 | rules.py:31(yield_match) |
9 | 2.53 | 0.281 | 2.53 | 0.281 | {method 'quantize' of 'ImagingCore' objects} |
1838672 | 2.39 | 0.0 | 5.177 | 0.0 | numbers.py:78(get_type) |
939456 | 2.284 | 0.0 | 3.81 | 0.0 | definitions.py:691(get_config_value) |
4999090 | 2.225 | 0.0 | 2.225 | 0.0 | {built-in method new of type object at 0x55ccb0870020} |
38952 | 2.135 | 0.0 | 3.318 | 0.0 | libelefun.py:1011(exponential_series) |
444 | 2.068 | 0.005 | 2.073 | 0.005 | :1(hypsum_1_1_R_R_R) |
18662350 | 2.058 | 0.0 | 2.058 | 0.0 | {built-in method builtins.id} |
1654266 | 1.958 | 0.0 | 3.94 | 0.0 | expression.py:992(has_form) |
814530 | 1.936 | 0.0 | 296.54 | 0.004 | pattern.py:642(yield_wrapping) |
12702763 | 1.907 | 0.0 | 1.907 | 0.0 | expression.py:956(get_head) |
9 | 1.89 | 0.21 | 1.89 | 0.21 | {method 'rankfilter' of 'ImagingCore' objects} |
10329258 | 1.843 | 0.0 | 1.843 | 0.0 | evaluation.py:568(check_stopped) |
804435 | 1.736 | 0.0 | 294.435 | 0.003 | pattern.py:621(match_yield) |
322710 | 1.645 | 0.0 | 287.352 | 0.007 | rules.py:118(do_replace) |
14358974 | 1.622 | 0.0 | 1.626 | 0.0 | {built-in method builtins.len} |
913639 | 1.616 | 0.0 | 4.439 | 0.0 | expression.py:65() |
818490 | 1.559 | 0.0 | 297.281 | 0.003 | pattern.py:446(get_wrappings) |
10990548 | 1.543 | 0.0 | 1.543 | 0.0 | expression.py:2059(get_name) |
1101608 | 1.533 | 0.0 | 289.956 | 0.002 | patterns.py:1072(match) |
1272269 | 1.507 | 0.0 | 16.499 | 0.0 | cache.py:69(wrapper) |
3392765 | 1.468 | 0.0 | 13.082 | 0.0 | expression.py:725() |
8 | 1.466 | 0.183 | 7.126 | 0.891 | libelefun.py:538(agm_fixed) |
526947 | 1.344 | 0.0 | 240.705 | 0.004 | expression.py:1367(eval_range) |
241681 | 1.322 | 0.0 | 4.949 | 0.0 | numbers.py:1031(new) |
4 | 1.29 | 0.322 | 1.492 | 0.373 | gammazeta.py:1374(zeta_array) |
2292319 | 1.288 | 0.0 | 6.059 | 0.0 | base.py:773(get_attributes) |
1234 | 1.21 | 0.001 | 1.22 | 0.001 | :1(hypsum_0_1__R_R) |
875727 | 1.18 | 0.0 | 1.71 | 0.0 | expression.py:906(_timestamp_cache) |
4913518 | 1.157 | 0.0 | 1.157 | 0.0 | pattern.py:32(init) |
1761399 | 1.1 | 0.0 | 1.578 | 0.0 | patterns.py:864(get_match_count) |
2016262 | 1.082 | 0.0 | 1.528 | 0.0 | random.py:237(_randbelow_with_getrandbits) |
49444 | 1.078 | 0.0 | 2.512 | 0.0 | facts.py:499(deduce_all_facts) |
1003944 | 1.066 | 0.0 | 2.251 | 0.0 | sympify.py:92(sympify) |
5657541 | 1.051 | 0.0 | 1.051 | 0.0 | expression.py:738(leaves) |
256502 | 1.046 | 0.0 | 2.047 | 0.0 | expression.py:663(__cmp) |
944155 | 1.038 | 0.0 | 4.694 | 0.0 | definitions.py:364(lookup_name) |
1096334 | 1.006 | 0.0 | 1.537 | 0.0 | expression.py:25(fully_qualified_symbol_name) |
2726608 | 1.002 | 0.0 | 1.37 | 0.0 | expression.py:322(get_lookup_name) |
738110 | 0.998 | 0.0 | 2.659 | 0.0 | sympify.py:486(_sympify) |
1892276 | 0.996 | 0.0 | 0.996 | 0.0 | expression.py:196(init) |
454524 | 0.967 | 0.0 | 2.125 | 0.0 | libmpf.py:291(from_man_exp) |
812974 | 0.964 | 0.0 | 1.685 | 0.0 | patterns.py:885(get_match_candidates) |
7305 | 0.956 | 0.0 | 1.251 | 0.0 | libelefun.py:634(log_taylor_cached) |
617686 | 0.923 | 0.0 | 3.101 | 0.0 | evaluation.py:572(inc_recursion_depth) |
74089 | 0.916 | 0.0 | 3.187 | 0.0 | convert.py:114(from_sympy) |
869153 | 0.914 | 0.0 | 0.914 | 0.0 | libmpf.py:153(_normalize) |
4701140 | 0.914 | 0.0 | 0.925 | 0.0 | {built-in method builtins.hasattr} |
702701 | 0.909 | 0.0 | 4.439 | 0.0 | expression.py:2065(get_sort_key) |
261120 | 0.877 | 0.0 | 1.776 | 0.0 | linalg.py:2362(norm) |
777395 | 0.86 | 0.0 | 1.195 | 0.0 | expression.py:57(strip_context) |
824500 | 0.844 | 0.0 | 38.54 | 0.0 | expression.py:2106(evaluate) |
49688 | 0.839 | 0.0 | 54.457 | 0.001 | arithmetic.py:92(apply) |
833677 | 0.826 | 0.0 | 4.107 | 0.0 | expression.py:1955(get_head) |
271706 | 0.822 | 0.0 | 3.293 | 0.0 | expression.py:1116(get_sort_key) |
407297 | 0.799 | 0.0 | 0.989 | 0.0 | numbers.py:151(mpf_norm) |
123881 | 0.795 | 0.0 | 2.333 | 0.0 | random.py:348(shuffle) |
175833 | 0.782 | 0.0 | 3.556 | 0.0 | evalf.py:1425(evalf) |
1583173 | 0.78 | 0.0 | 14.247 | 0.0 | {built-in method builtins.all} |
826647 | 0.757 | 0.0 | 1.674 | 0.0 | :1033(_handle_fromlist) |
133550 | 0.756 | 0.0 | 0.88 | 0.0 | definitions.py:762(init) |
1687352 | 0.756 | 0.0 | 1.549 | 0.0 | expression.py:989(get_lookup_name) |
23596 | 0.754 | 0.0 | 8.05 | 0.0 | mul.py:178(flatten) |
156054 | 0.727 | 0.0 | 9.503 | 0.001 | assumptions.py:464(_ask) |
100 most called functions

100 functions that cost the most "total time" (time inside the function plus the functions it calls)
@mmatera Thanks! This is useful and interesting. It doesn't seem at all different from when I last looked at things. It would be nice to have the code that was used to generate this, or the basic data from it, somewhere, so that we can rerun it over time or as we adapt it to narrow the focus.
Here are the raw data: |
The code to produce the profile
This prints the profile to stdout. I collected the output and did some formatting by hand (changing spaces to "\t").
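The actual driver was attached elsewhere; a hypothetical reconstruction of that kind of script, using the standard `cProfile`/`pstats` modules, looks like:

```python
import cProfile
import pstats
import sys

# Hypothetical profiling driver: profile a call and print the top 100
# entries sorted by tottime to stdout, which is the shape of the table above.
def work():
    # Stand-in for the Mathics evaluation that was actually profiled.
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

stats = pstats.Stats(profiler, stream=sys.stdout)
stats.sort_stats("tottime").print_stats(100)
```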
The code to read the "almost raw" data:
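That snippet is not reproduced in this thread; a minimal reader for a tab-separated profile dump (field names are my guess, mirroring cProfile's columns) could look like:

```python
import csv

# Hypothetical reader for the "\t"-formatted profile dump described above.
FIELDS = ("ncalls", "tottime", "percall", "cumtime", "percall_cum", "location")

def read_profile(path):
    rows = []
    with open(path, newline="") as f:
        for record in csv.reader(f, delimiter="\t"):
            if len(record) != len(FIELDS):
                continue  # skip the header and malformed lines
            row = dict(zip(FIELDS, record))
            for key in ("tottime", "percall", "cumtime", "percall_cum"):
                row[key] = float(row[key])
            rows.append(row)
    return rows
```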
The code to build the tables
The code to build the summary tables:
(I could also upload the ipynb file where I did all the experiments, but I was not sure where.)
Thanks - I will delve into this more deeply next week sometime.
Maybe Gist? |
(commits 3c2168e to c919dfc)
What is failing here is that
for some reason returns
@rocky, @TiagoCavalcante, finally I made it work. To do that, I had to change the code in several places. However, please notice that many of the changes come from #44. Maybe the best thing would be to start a new PR. What do you think?
I suspect a new PR would be easier. In the past, looking at your PRs feels like: let me try this; no, that didn't work, okay let me try something else; no, not that either; okay, now I'll merge in code from the main branch; okay, now let me add some debug stuff; ok, I'll remove it now; nope, didn't remove all of it; etc. These kinds of things make it very hard to follow what the essence is. Bonus points if you want to start filling out docstrings.
@rocky, maybe you can help me with this. For some reason, all the tests pass if I enable the line that singletonizes the `Symbol` class, except for one test in `combinatorica`. As you know that code better, could you help me investigate why this happens? If this starts to work, I guess we can get a performance improvement by using the `id()` of the symbols instead of the string name of the symbol in the pattern-matching routines.
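The singletonization idea can be sketched as follows (this is a minimal illustration, not the actual Mathics implementation): `__new__` hands back one cached instance per name, so equal-named symbols are the *same* object and can be compared by `id()`/`is` instead of by string name.

```python
# Minimal sketch of a singletonized Symbol class (hypothetical).
class Symbol:
    _instances = {}

    def __new__(cls, name):
        try:
            # Reuse the unique instance for this name, if it exists.
            return cls._instances[name]
        except KeyError:
            self = super().__new__(cls)
            self.name = name
            cls._instances[name] = self
            return self

a = Symbol("Global`x")
b = Symbol("Global`x")
assert a is b  # identity comparison substitutes for string comparison
```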