Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RestrictedPartitions is very slow and a huge memory hog #5478

Closed
dandrake opened this issue Mar 11, 2009 · 17 comments
Closed

RestrictedPartitions is very slow and a huge memory hog #5478

dandrake opened this issue Mar 11, 2009 · 17 comments

Comments

@dandrake
Copy link
Contributor

I have code that does the same as

RestrictedPartitions(3000, [1000, 500, 100, 50, 10])

but about 5.5 times faster...and on my system,

RestrictedPartitions(5000, [1000, 500, 100, 50, 10])

uses over two gigabytes of memory, whereas my code takes about 9.2 seconds with minimal memory usage.

I need to fiddle with my code so that it's a drop-in replacement for RestrictedPartitions, but I should have a patch up soon.

CC: @sagetrac-sage-combinat

Component: combinatorics

Keywords: combinat, partitions, gap

Issue created by migration from https://trac.sagemath.org/ticket/5478

@dandrake
Copy link
Contributor Author

comment:1

Maybe I should add that I'm interested, of course, in iterating through all those partitions. Just running "RestrictedPartitions(3000, [1000, 500, 100, 50, 10])" is very fast because it just returns an iterator. The actual code I'm running is stuff like

sum(1 for p in RestrictedPartitions(...))

@dandrake
Copy link
Contributor Author

comment:2

For ease of reviewing, I plan to have two patches here: the first one includes deprecation for RestrictedPartitions and fixes up some problems with the Partitions docstring. The second will actually implement the new Partitions(..., parts_in=...) code.

@dandrake
Copy link
Contributor Author

comment:3

The second patch gets doctests working and adds the new functionality. I'm not entirely sure I did the deprecation stuff correctly; I added a deprecation warning, and then fiddled with all the doctests in the old RestrictedPartitions code so that it passes doctests.

@dandrake
Copy link
Contributor Author

comment:4

A little ego update to the patch -- I added myself to the authors list. Anyone reviewing this might want to look at my test script: restricted_partitions_test_suite.sage. I've run over 10,000 successful tests with it.

@dandrake
Copy link
Contributor Author

comment:5

Some benchmarks (all on sage.math):

BEFORE
sage: get_memory_usage()
695.7421875
sage: ps = RestrictedPartitions(100, ([1,6..100] + [4,9..100]))
sage: %time sum(1 for p in ps)
CPU times: user 26.89 s, sys: 1.11 s, total: 28.00 s
Wall time: 28.69 s
74040
sage: get_memory_usage()
1781.41796875
AFTER
sage: get_memory_usage()
699.828125
sage: ps = Partitions(100, parts_in=([1,6..100] + [4,9..100]))
sage: %time sum(1 for p in ps)
CPU times: user 4.96 s, sys: 0.02 s, total: 4.98 s
Wall time: 4.98 s
74040
sage: get_memory_usage()
699.828125

The above example which prompted this ticket:

BEFORE
sage: get_memory_usage()
695.73828125
sage: ps = RestrictedPartitions(3000, [10,50,100,500,1000])
sage: %time sum(1 for p in ps)
CPU times: user 5.69 s, sys: 0.21 s, total: 5.90 s
Wall time: 6.05 s
3506
sage: get_memory_usage()
935.48046875
AFTER
sage: get_memory_usage()
699.82421875
sage: ps = Partitions(3000, parts_in=[10,50,100,500,1000])
sage: %time sum(1 for p in ps)
CPU times: user 1.08 s, sys: 0.00 s, total: 1.08 s
Wall time: 1.09 s
3506
sage: get_memory_usage()
699.82421875

@hivert
Copy link

hivert commented Apr 7, 2009

comment:6

Patch look good. However, before finishing my review I'm waiting it to be rebased on top of 3.4.1. Also some interface problem have been discussed by e-mail: It was decided more that one month ago that a combinatorial class should define:

  • __iter__ instead of iterator.
  • cardinality instead of count or len.

Florent

@hivert hivert changed the title RestrictedPartitions is very slow and a huge memory hog [with patch, needs minor work+rebase] RestrictedPartitions is very slow and a huge memory hog Apr 7, 2009
@sagetrac-mabshoff
Copy link
Mannequin

sagetrac-mabshoff mannequin commented Apr 7, 2009

comment:7

Please keep the subject standard so that the proper reports are picking up the right tickets.

Cheers,

Michael

@sagetrac-mabshoff sagetrac-mabshoff mannequin changed the title [with patch, needs minor work+rebase] RestrictedPartitions is very slow and a huge memory hog RestrictedPartitions is very slow and a huge memory hog Apr 7, 2009
@dandrake
Copy link
Contributor Author

dandrake commented Apr 9, 2009

deprecate RestrictedPartitions and fix up some documentation

@dandrake
Copy link
Contributor Author

dandrake commented Apr 9, 2009

Attachment: trac_5478-1.patch.gz

implement new Partitions class with parts_in keyword

@dandrake
Copy link
Contributor Author

dandrake commented Apr 9, 2009

comment:8

Attachment: trac_5478-2.patch.gz

I've updated the patches to apply on top of 3.4.1.rc1.

@hivert
Copy link

hivert commented Apr 9, 2009

comment:9

Anyone reviewing this might want to look at my test script: restricted_partitions_test_suite.sage. I've run over 10,000 successful tests with it.

This script is great. I should be put in sage in one place or one other. If someone tries to optimize your code (eg: Cythonize), he surely will be happy to have your test code. Why not shipping it into the patch with a more explicit name and a one-line example on how to use it with the comment " #longtest " ?

Florent

@hivert
Copy link

hivert commented Apr 9, 2009

comment:10

Some comment on the first patch:

I'm not a native English speaker but it seems to me that
If starting=p is passed, then the combinatorial class of partitions greater than or equal to p in lexicographic order is returned.
is clearer if phrased
If starting=p is passed, then the combinatorial class of partitions with part greater than or equal to p in lexicographic order is returned.

In the same way
If ending=p is passed, then the combinatorial class of partitions at most p in lexicographic order is returned.
is better phrased
If ending=p is passed, then the combinatorial class of partitions with part at most p in lexicographic order is returned.

Otherwise the rest of the first patch looks good. I'm working on the second one.

Florent

@dandrake
Copy link
Contributor Author

comment:11

Replying to @hivert:

Some comment on the first patch:

I'm not a native English speaker but it seems to me that
If starting=p is passed, then the combinatorial class of partitions greater than or equal to p in lexicographic order is returned.
is clearer if phrased
If starting=p is passed, then the combinatorial class of partitions with part greater than or equal to p in lexicographic order is returned.

"p" refers to a partition, not to a part, so the text is okay. Perhaps we should make that more clear, though. I plan on opening a ticket to improve the documentation in partition.py, so we can do that then.

Replying to @hivert:

Anyone reviewing this might want to look at my test script: restricted_partitions_test_suite.sage. I've run over 10,000 successful tests with it.

This script is great. I should be put in sage in one place or one other. If someone tries to
optimize your code (eg: Cythonize), he surely will be happy to have your test code. Why not
shipping it into the patch with a more explicit name and a one-line example on how to use it
with the comment " #longtest " ?

Well, right now, the script only halts if it finds an error, so it would be a really, really long test. :)

The tests also can use lots of memory, since it puts a list of the partitions into memory. But if anyone thinks (a mildly modified version of) the script should get run on a "long" test, I'm happy to have it included.

@hivert
Copy link

hivert commented Apr 12, 2009

Improved doctests for deprecation warnings.

@hivert
Copy link

hivert commented Apr 12, 2009

comment:13

Attachment: trac_5478-review.patch.gz

The patch looks good. The doctests passed except that depending on the base the doctests numbers may differ. So I added a review patch which replace the explicit doctest numbers with .... This should solve this problem.

I'm giving a positive review... Though maybe someone should reread my trivial review patch.

Concerning the script it is quite redundant with the .check() method we'll have with categories.

Florent

@dandrake
Copy link
Contributor Author

comment:14

Thanks for fixing up those line numbers in the doctests...I struggled a bit to get those right, and had forgotten about the "..." stuff.

@sagetrac-mabshoff
Copy link
Mannequin

sagetrac-mabshoff mannequin commented Apr 13, 2009

comment:15

Merged all three patches in Sage 3.4.1.rc3.

Cheers,

Michael

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants