/
enumerative.py
1068 lines (870 loc) · 37.9 KB
/
enumerative.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
"""
Algorithms and classes to support enumerative combinatorics.
Currently just multiset partitions, but more could be added.
Terminology (following Knuth, algorithm 7.1.2.5M TAOCP)
*multiset* aaabbcccc has a *partition* aaabc | bccc
The submultisets, aaabc and bccc of the partition are called
*parts*, or sometimes *vectors*. (Knuth notes that multiset
partitions can be thought of as partitions of vectors of integers,
where the ith element of the vector gives the multiplicity of
element i.)
The values a, b and c are *components* of the multiset. These
correspond to elements of a set, but in a multiset can be present
with a multiplicity greater than 1.
The algorithm deserves some explanation.
Think of the part aaabc from the multiset above. If we impose an
ordering on the components of the multiset, we can represent a part
with a vector, in which the value of the first element of the vector
corresponds to the multiplicity of the first component in that
part. Thus, aaabc can be represented by the vector [3, 1, 1]. We
can also define an ordering on parts, based on the lexicographic
ordering of the vector (leftmost vector element, i.e., the element
with the smallest component number, is the most significant), so
that [3, 1, 1] > [3, 1, 0] and [3, 1, 1] > [2, 1, 4]. The ordering
on parts can be extended to an ordering on partitions: First, sort
the parts in each partition, left-to-right in decreasing order. Then
partition A is greater than partition B if A's leftmost/greatest
part is greater than B's leftmost part. If the leftmost parts are
equal, compare the second parts, and so on.
In this ordering, the greatest partition of a given multiset has only
one part. The least partition is the one in which the components
are spread out, one per part.
The enumeration algorithms in this file yield the partitions of the
argument multiset in decreasing order. The main data structure is a
stack of parts, corresponding to the current partition. An
important invariant is that the parts on the stack are themselves in
decreasing order. This data structure is decremented to find the
next smaller partition. Most often, decrementing the partition will
only involve adjustments to the smallest parts at the top of the
stack, much as adjacent integers *usually* differ only in their last
few digits.
Knuth's algorithm uses two main operations on parts:
Decrement - change the part so that it is smaller in the
(vector) lexicographic order, but reduced by the smallest amount possible.
For example, if the multiset has vector [5,
3, 1], and the bottom/greatest part is [4, 2, 1], this part would
decrement to [4, 2, 0], while [4, 0, 0] would decrement to [3, 3,
1]. A singleton part is never decremented -- [1, 0, 0] is not
decremented to [0, 3, 1]. Instead, the decrement operator needs
to fail for this case. In Knuth's psuedocode, the decrement
operator is step m5.
Spread unallocated multiplicity - Once a part has been decremented,
it cannot be the rightmost part in the partition. There is some
multiplicity that has not been allocated, and new parts must be
created above it in the stack to use up this multiplicity. To
maintain the invariant that the parts on the stack are in
decreasing order, these new parts must be less than or equal to
the decremented part.
For example, if the multiset is [5, 3, 1], and its most
significant part has just been decremented to [5, 3, 0], the
spread operation will add a new part so that the stack becomes
[[5, 3, 0], [0, 0, 1]]. If the most significant part (for the
same multiset) has been decremented to [2, 0, 0] the stack becomes
[[2, 0, 0], [2, 0, 0], [1, 3, 1]]. In the psuedocode, the spread
operation for one part is step m2. The complete spread operation
is a loop of steps m2 and m3.
In order to facilitate the spread operation, Knuth stores, for each
component of each part, not just the multiplicity of that component
in the part, but also the total multiplicity available for this
component in this part or any lesser part above it on the stack.
One added twist is that Knuth does not represent the part vectors as
arrays. Instead, he uses a sparse representation, in which a
component of a part is represented as a component number (c), plus
the multiplicity of the component in that part (v) as well as the
total multiplicity available for that component (u). This saves
time that would be spent skipping over zeros.
"""
class PartComponent:
"""Internal class used in support of the multiset partitions
enumerators and the associated visitor functions.
Represents one component of one part of the current partition.
A stack of these, plus an auxiliary frame array, f, represents a
partition of the multiset.
Knuth's psuedocode makes c, u, and v separate arrays.
"""
def __init__(self):
"""Initialize self."""
# Component number
self.c = 0
# The as yet unpartitioned amount in component c
# *before* it is allocated by this triple
self.u = 0
# Amount of c component in the current part
# (v<=u). An invariant of the representation is
# that the next higher triple for this component
# (if there is one) will have a value of u-v in
# its u attribute.
self.v = 0
def __eq__(self, other):
"""Define value oriented equality, which is useful for testers."""
return (isinstance(other, self.__class__) and
self.c == other.c and
self.u == other.u and
self.v == other.v)
# This function tries to be a faithful implementation of algorithm
# 7.1.2.5M in Volume 4A, Combinatoral Algorithms, Part 1, of The Art
# of Computer Programming, by Donald Knuth. This includes using
# (mostly) the same variable names, etc. This makes for rather
# low-level Python.
# Changes from Knuth's psuedocode include
# - use PartComponent struct/object instead of 3 arrays
# - make the function a generator
# - map (with some difficulty) the GOTOs to Python control structures.
# - Knuth uses 1-based numbering for components, this code is 0-based
# - renamed variable l to lpart.
# - flag variable x takes on values True/False instead of 1/0
#
def multiset_partitions_taocp(multiplicities):
"""Enumerates partitions of a multiset.
Parameters
==========
multiplicities
list of integer multiplicities of the components of the multiset.
Yields
======
state
Internal data structure which encodes a particular partition.
This output is then usually processed by a vistor function
which combines the information from this data structure with
the components themselves to produce an actual partition.
Unless they wish to create their own visitor function, users will
have little need to look inside this data structure. But, for
reference, it is a 3-element list with components:
f
is a frame array, which is used to divide pstack into parts.
lpart
points to the base of the topmost part.
pstack
is an array of PartComponent objects.
The ``state`` output offers a peek into the internal data
structures of the enumeration function. The client should
treat this as read-only; any modification of the data
structure will cause unpredictable (and almost certainly
incorrect) results. Also, the components of ``state`` are
modified in place at each iteration. Hence, the visitor must
be called at each loop iteration. Accumulating the ``state``
instances and processing them later will not work.
Examples
========
>>> # variables components and multiplicities represent the multiset 'abb'
>>> components = 'ab'
>>> multiplicities = [1, 2]
>>> states = multiset_partitions_taocp(multiplicities)
>>> [list_visitor(state, components) for state in states]
[[['a', 'b', 'b']],
[['a', 'b'], ['b']],
[['a'], ['b', 'b']],
[['a'], ['b'], ['b']]]
"""
# Important variables.
# m is the number of components, i.e., number of distinct elements
m = len(multiplicities)
# n is the cardinality, total number of elements whether or not distinct
n = sum(multiplicities)
# The main data structure, f segments pstack into parts. See
# list_visitor() for example code indicating how this internal
# state corresponds to a partition.
# Note: allocation of space for stack is conservative. Knuth's
# exercise 7.2.1.5.68 gives some indication of how to tighten this
# bound, but this is not implemented.
pstack = [PartComponent() for i in range(n * m + 1)]
f = [0] * (n + 1)
# Step M1 in Knuth (Initialize)
# Initial state - entire multiset in one part.
for j in range(m):
ps = pstack[j]
ps.c = j
ps.u = multiplicities[j]
ps.v = multiplicities[j]
# Other variables
f[0] = 0
a = 0
lpart = 0
f[1] = m
b = m # in general, current stack frame is from a to b - 1
while True:
while True:
# Step M2 (Subtract v from u)
j = a
k = b
x = False
while j < b:
pstack[k].u = pstack[j].u - pstack[j].v
if pstack[k].u == 0:
x = True
elif not x:
pstack[k].c = pstack[j].c
pstack[k].v = min(pstack[j].v, pstack[k].u)
x = pstack[k].u < pstack[j].v
k = k + 1
else: # x is True
pstack[k].c = pstack[j].c
pstack[k].v = pstack[k].u
k = k + 1
j = j + 1
# Note: x is True iff v has changed
# Step M3 (Push if nonzero.)
if k > b:
a = b
b = k
lpart = lpart + 1
f[lpart + 1] = b
# Return to M2
else:
break # Continue to M4
# M4 Visit a partition
state = [f, lpart, pstack]
yield state
# M5 (Decrease v)
while True:
j = b-1
while pstack[j].v == 0:
j = j - 1
if j == a and pstack[j].v == 1:
# M6 (Backtrack)
if lpart == 0:
return
lpart = lpart - 1
b = a
a = f[lpart]
# Return to M5
else:
pstack[j].v = pstack[j].v - 1
for k in range(j + 1, b):
pstack[k].v = pstack[k].u
break # GOTO M2
# --------------- Visitor functions for multiset partitions ---------------
# A visitor takes the partition state generated by
# multiset_partitions_taocp or other enumerator, and produces useful
# output (such as the actual partition).
def factoring_visitor(state, primes):
"""Use with multiset_partitions_taocp to enumerate the ways a
number can be expressed as a product of factors. For this usage,
the exponents of the prime factors of a number are arguments to
the partition enumerator, while the corresponding prime factors
are input here.
Examples
========
To enumerate the factorings of a number we can think of the elements of the
partition as being the prime factors and the multiplicities as being their
exponents.
>>> primes, multiplicities = zip(*factorint(24).items())
>>> primes
(2, 3)
>>> multiplicities
(3, 1)
>>> states = multiset_partitions_taocp(multiplicities)
>>> [factoring_visitor(state, primes) for state in states]
[[24], [8, 3], [12, 2], [4, 6], [4, 2, 3], [6, 2, 2], [2, 2, 2, 3]]
"""
f, lpart, pstack = state
factoring = []
for i in range(lpart + 1):
factor = 1
for ps in pstack[f[i]: f[i + 1]]:
if ps.v > 0:
factor *= primes[ps.c] ** ps.v
factoring.append(factor)
return factoring
def list_visitor(state, components):
"""Return a list of lists to represent the partition.
Examples
========
>>> states = multiset_partitions_taocp([1, 2, 1])
>>> s = next(states)
>>> list_visitor(s, 'abc') # for multiset 'a b b c'
[['a', 'b', 'b', 'c']]
>>> s = next(states)
>>> list_visitor(s, [1, 2, 3]) # for multiset '1 2 2 3
[[1, 2, 2], [3]]
"""
f, lpart, pstack = state
partition = []
for i in range(lpart+1):
part = []
for ps in pstack[f[i]:f[i+1]]:
if ps.v > 0:
part.extend([components[ps.c]] * ps.v)
partition.append(part)
return partition
class MultisetPartitionTraverser:
"""
Has methods to ``enumerate`` and ``count`` the partitions of a multiset.
This implements a refactored and extended version of Knuth's algorithm
7.1.2.5M.
The enumeration methods of this class are generators and return
data structures which can be interpreted by the same visitor
functions used for the output of ``multiset_partitions_taocp``.
See Also
========
multiset_partitions_taocp
Examples
========
>>> m = MultisetPartitionTraverser()
>>> m.count_partitions([4, 4, 4, 2])
127750
>>> m.count_partitions([3, 3, 3])
686
References
==========
* Algorithm 7.1.2.5M in Volume 4A, Combinatoral Algorithms,
Part 1, of The Art of Computer Programming, by Donald Knuth.
* On a Problem of Oppenheim concerning
"Factorisatio Numerorum" E. R. Canfield, Paul Erdős, Carl
Pomerance, JOURNAL OF NUMBER THEORY, Vol. 17, No. 1. August
1983. See section 7 for a description of an algorithm
similar to Knuth's.
* Generating Multiset Partitions, Brent Yorgey, The
Monad.Reader, Issue 8, September 2007.
"""
def __init__(self):
"""Initialize self."""
# TRACING variables. These are useful for gathering
# statistics on the algorithm itself, but have no particular
# benefit to a user of the code.
self.k1 = 0
self.k2 = 0
self.p1 = 0
self.pstack = None
self.f = None
self.lpart = None
self.discarded = None
self.pcount = None
self.dp_stack = None
self.dp_map = None
#
# Helper methods for enumeration
#
def _initialize_enumeration(self, multiplicities):
"""Allocates and initializes the partition stack.
This is called from the enumeration/counting routines, so
there is no need to call it separately.
"""
num_components = len(multiplicities)
# cardinality is the total number of elements, whether or not distinct
cardinality = sum(multiplicities)
# pstack is the partition stack, which is segmented by
# f into parts.
self.pstack = [PartComponent() for i in
range(num_components * cardinality + 1)]
self.f = [0] * (cardinality + 1)
# Initial state - entire multiset in one part.
for j in range(num_components):
ps = self.pstack[j]
ps.c = j
ps.u = multiplicities[j]
ps.v = multiplicities[j]
self.f[0] = 0
self.f[1] = num_components
self.lpart = 0
# The decrement_part() method corresponds to step M5 in Knuth's
# algorithm. This is the base version for enum_all(). Modified
# versions of this method are needed if we want to restrict
# sizes of the partitions produced.
def decrement_part(self, part):
"""Decrements part (a subrange of pstack), if possible, returning
True iff the part was successfully decremented.
If you think of the v values in the part as a multi-digit
integer (least significant digit on the right) this is
basically decrementing that integer, but with the extra
constraint that the leftmost digit cannot be decremented to 0.
Parameters
==========
part
The part, represented as a list of PartComponent objects,
which is to be decremented.
"""
plen = len(part)
for j in range(plen - 1, -1, -1):
if (j == 0 and part[j].v > 1) or (j > 0 and part[j].v > 0):
# found val to decrement
part[j].v -= 1
# Reset trailing parts back to maximum
for k in range(j + 1, plen):
part[k].v = part[k].u
return True
return False
# Version to allow number of parts to be bounded from above.
# Corresponds to (a modified) step M5.
def decrement_part_small(self, part, ub):
"""Decrements part (a subrange of pstack), if possible, returning
True iff the part was successfully decremented.
Parameters
==========
part
part to be decremented (topmost part on the stack)
ub
the maximum number of parts allowed in a partition
returned by the calling traversal.
Notes
=====
The goal of this modification of the ordinary decrement method
is to fail (meaning that the subtree rooted at this part is to
be skipped) when it can be proved that this part can only have
child partitions which are larger than allowed by ``ub``. If a
decision is made to fail, it must be accurate, otherwise the
enumeration will miss some partitions. But, it is OK not to
capture all the possible failures -- if a part is passed that
shouldn't be, the resulting too-large partitions are filtered
by the enumeration one level up. However, as is usual in
constrained enumerations, failing early is advantageous.
The tests used by this method catch the most common cases,
although this implementation is by no means the last word on
this problem. The tests include:
1) ``lpart`` must be less than ``ub`` by at least 2. This is because
once a part has been decremented, the partition
will gain at least one child in the spread step.
2) If the leading component of the part is about to be
decremented, check for how many parts will be added in
order to use up the unallocated multiplicity in that
leading component, and fail if this number is greater than
allowed by ``ub``. (See code for the exact expression.) This
test is given in the answer to Knuth's problem 7.2.1.5.69.
3) If there is *exactly* enough room to expand the leading
component by the above test, check the next component (if
it exists) once decrementing has finished. If this has
``v == 0``, this next component will push the expansion over the
limit by 1, so fail.
"""
if self.lpart >= ub - 1:
self.p1 += 1 # increment to keep track of usefulness of tests
return False
plen = len(part)
for j in range(plen - 1, -1, -1): # pragma: no branch
# Knuth's mod, (answer to problem 7.2.1.5.69)
if (j == 0) and (part[0].v - 1)*(ub - self.lpart) < part[0].u:
self.k1 += 1
return False
if (j == 0 and part[j].v > 1) or (j > 0 and part[j].v > 0):
# found val to decrement
part[j].v -= 1
# Reset trailing parts back to maximum
for k in range(j + 1, plen):
part[k].v = part[k].u
# Have now decremented part, but are we doomed to
# failure when it is expanded? Check one oddball case
# that turns out to be surprisingly common - exactly
# enough room to expand the leading component, but no
# room for the second component, which has v=0.
if (plen > 1 and (part[1].v == 0) and
(part[0].u - part[0].v) ==
((ub - self.lpart - 1) * part[0].v)):
self.k2 += 1
return False
return True
def decrement_part_large(self, part, lb):
"""Decrements part, while respecting size constraint.
A part can have no children which are of sufficient size (as
indicated by ``lb``) unless that part has sufficient
unallocated multiplicity. When enforcing the size constraint,
this method will decrement the part (if necessary) by an
amount needed to ensure sufficient unallocated multiplicity.
Returns True iff the part was successfully decremented.
Parameters
==========
part
part to be decremented (topmost part on the stack)
lb
The partitions produced by the calling enumeration must
have more parts than this value.
"""
# Next, perform any needed additional decrementing to respect
# "sufficient unallocated multiplicity" (or fail if this is
# not possible).
min_unalloc = lb - self.lpart
if min_unalloc <= 0:
return True
total_mult = sum(pc.u for pc in part)
total_alloc = sum(pc.v for pc in part)
if total_mult <= min_unalloc:
return False
deficit = min_unalloc - (total_mult - total_alloc)
if deficit <= 0:
return True
for i in range(len(part) - 1, -1, -1): # pragma: no branch
if i == 0:
assert part[0].v > deficit
part[0].v -= deficit
return True
if part[i].v >= deficit:
part[i].v -= deficit
return True
deficit -= part[i].v
part[i].v = 0
def decrement_part_range(self, part, lb, ub):
"""Decrements part (a subrange of pstack), if possible, returning
True iff the part was successfully decremented.
Parameters
==========
part
part to be decremented (topmost part on the stack)
ub
the maximum number of parts allowed in a partition
returned by the calling traversal.
lb
The partitions produced by the calling enumeration must
have more parts than this value.
Notes
=====
Combines the constraints of _small and _large decrement
methods. If returns success, part has been decremented at
least once, but perhaps by quite a bit more if needed to meet
the lb constraint.
"""
# Constraint in the range case is just enforcing both the
# constraints from _small and _large cases. Note the 0 as the
# second argument to the _large call -- this is the signal to
# decrement only as needed to for constraint enforcement. The
# short circuiting and left-to-right order of the 'and'
# operator is important for this to work correctly.
return self.decrement_part_small(part, ub) and \
self.decrement_part_large(part, lb)
def spread_part_multiplicity(self):
"""Returns True if a new part has been created, and
adjusts pstack, f and lpart as needed.
Notes
=====
Spreads unallocated multiplicity from the current top part
into a new part created above the current on the stack. This
new part is constrained to be less than or equal to the old in
terms of the part ordering.
This call does nothing (and returns False) if the current top
part has no unallocated multiplicity.
"""
k = self.f[self.lpart + 1] # ub of current; potential base of next
base = k # save for later comparison
# Set to true when the new part (so far) is
# strictly less than (as opposed to less than
# or equal) to the old.
changed = False
for j in range(self.f[self.lpart], self.f[self.lpart + 1]):
self.pstack[k].u = self.pstack[j].u - self.pstack[j].v
if self.pstack[k].u == 0:
changed = True
else:
self.pstack[k].c = self.pstack[j].c
if changed: # Put all available multiplicity in this part
self.pstack[k].v = self.pstack[k].u
else: # Still maintaining ordering constraint
if self.pstack[k].u < self.pstack[j].v:
self.pstack[k].v = self.pstack[k].u
changed = True
else:
self.pstack[k].v = self.pstack[j].v
k = k + 1
if k > base:
# Adjust for the new part on stack
self.lpart = self.lpart + 1
self.f[self.lpart + 1] = k
return True
return False
def top_part(self):
"""Return current top part on the stack, as a slice of pstack."""
return self.pstack[self.f[self.lpart]:self.f[self.lpart + 1]]
# Same interface and functionality as multiset_partitions_taocp(),
# but some might find this refactored version easier to follow.
def enum_all(self, multiplicities):
"""Enumerate the partitions of a multiset.
Examples
========
>>> m = MultisetPartitionTraverser()
>>> states = m.enum_all([2, 2])
>>> [list_visitor(state, 'ab') for state in states]
[[['a', 'a', 'b', 'b']],
[['a', 'a', 'b'], ['b']],
[['a', 'a'], ['b', 'b']],
[['a', 'a'], ['b'], ['b']],
[['a', 'b', 'b'], ['a']],
[['a', 'b'], ['a', 'b']],
[['a', 'b'], ['a'], ['b']],
[['a'], ['a'], ['b', 'b']],
[['a'], ['a'], ['b'], ['b']]]
See also
========
multiset_partitions_taocp():
which provides the same result as this method, but is
about twice as fast. Hence, enum_all is primarily useful
for testing. Also see the function for a discussion of
states and visitors.
"""
self._initialize_enumeration(multiplicities)
while True:
while self.spread_part_multiplicity():
pass
# M4 Visit a partition
state = [self.f, self.lpart, self.pstack]
yield state
# M5 (Decrease v)
while not self.decrement_part(self.top_part()):
# M6 (Backtrack)
if self.lpart == 0:
return
self.lpart -= 1
def enum_small(self, multiplicities, ub):
"""Enumerate multiset partitions with no more than ``ub`` parts.
Equivalent to enum_range(multiplicities, 0, ub)
See also
========
enum_all, enum_large, enum_range
Parameters
==========
multiplicities
list of multiplicities of the components of the multiset.
ub
Maximum number of parts
Examples
========
>>> m = MultisetPartitionTraverser()
>>> states = m.enum_small([2, 2], 2)
>>> [list_visitor(state, 'ab') for state in states]
[[['a', 'a', 'b', 'b']],
[['a', 'a', 'b'], ['b']],
[['a', 'a'], ['b', 'b']],
[['a', 'b', 'b'], ['a']],
[['a', 'b'], ['a', 'b']]]
The implementation is based, in part, on the answer given to
exercise 69, in Knuth.
References
==========
* Algorithm 7.1.2.5M in Volume 4A, Combinatoral Algorithms,
Part 1, of The Art of Computer Programming, by Donald Knuth.
"""
# Keep track of iterations which do not yield a partition.
# Clearly, we would like to keep this number small.
self.discarded = 0
if ub <= 0:
return
self._initialize_enumeration(multiplicities)
while True:
good_partition = True
while self.spread_part_multiplicity():
if self.lpart >= ub:
self.discarded += 1
good_partition = False
self.lpart = ub - 2
break
# M4 Visit a partition
if good_partition:
state = [self.f, self.lpart, self.pstack]
yield state
# M5 (Decrease v)
while not self.decrement_part_small(self.top_part(), ub):
# M6 (Backtrack)
if self.lpart == 0:
return
self.lpart -= 1
def enum_large(self, multiplicities, lb):
"""Enumerate the partitions of a multiset with lb < num(parts)
See also
========
enum_all, enum_small, enum_range
Parameters
==========
multiplicities
list of multiplicities of the components of the multiset.
lb
Number of parts in the partition must be greater than
this lower bound.
Examples
========
>>> m = MultisetPartitionTraverser()
>>> states = m.enum_large([2, 2], 2)
>>> [list_visitor(state, 'ab') for state in states]
[[['a', 'a'], ['b'], ['b']],
[['a', 'b'], ['a'], ['b']],
[['a'], ['a'], ['b', 'b']],
[['a'], ['a'], ['b'], ['b']]]
"""
return self.enum_range(multiplicities, lb, sum(multiplicities))
def enum_range(self, multiplicities, lb, ub):
"""Enumerate the partitions of a multiset with
``lb < num(parts) <= ub``.
In particular, if partitions with exactly ``k`` parts are
desired, call with ``(multiplicities, k - 1, k)``. This
method generalizes enum_all, enum_small, and enum_large.
Examples
========
>>> m = MultisetPartitionTraverser()
>>> states = m.enum_range([2, 2], 1, 2)
>>> [list_visitor(state, 'ab') for state in states]
[[['a', 'a', 'b'], ['b']],
[['a', 'a'], ['b', 'b']],
[['a', 'b', 'b'], ['a']],
[['a', 'b'], ['a', 'b']]]
"""
# combine the constraints of the _large and _small
# enumerations.
self.discarded = 0
if ub <= 0 or lb >= sum(multiplicities):
return
self._initialize_enumeration(multiplicities)
self.decrement_part_large(self.top_part(), lb)
while True:
good_partition = True
while self.spread_part_multiplicity():
assert self.decrement_part_large(self.top_part(), lb)
if self.lpart >= ub:
self.discarded += 1
good_partition = False
self.lpart = ub - 2
break
# M4 Visit a partition
if good_partition:
state = [self.f, self.lpart, self.pstack]
yield state
# M5 (Decrease v)
while not self.decrement_part_range(self.top_part(), lb, ub):
# M6 (Backtrack)
if self.lpart == 0:
return
self.lpart -= 1
def count_partitions_slow(self, multiplicities):
"""Returns the number of partitions of a multiset whose elements
have the multiplicities given in ``multiplicities``.
Primarily for comparison purposes. It follows the same path as
enumerate, and counts, rather than generates, the partitions.
See Also
========
count_partitions
Has the same calling interface, but is much faster.
"""
# number of partitions so far in the enumeration
self.pcount = 0
self._initialize_enumeration(multiplicities)
while True:
while self.spread_part_multiplicity():
pass
# M4 Visit (count) a partition
self.pcount += 1
# M5 (Decrease v)
while not self.decrement_part(self.top_part()):
# M6 (Backtrack)
if self.lpart == 0:
return self.pcount
self.lpart -= 1
def count_partitions(self, multiplicities):
"""Returns the number of partitions of a multiset whose components
have the multiplicities given in ``multiplicities``.
For larger counts, this method is much faster than calling one
of the enumerators and counting the result. Uses dynamic
programming to cut down on the number of nodes actually
explored. The dictionary used in order to accelerate the
counting process is stored in the ``MultisetPartitionTraverser``
object and persists across calls. If the the user does not
expect to call ``count_partitions`` for any additional
multisets, the object should be cleared to save memory. On
the other hand, the cache built up from one count run can
significantly speed up subsequent calls to ``count_partitions``,
so it may be advantageous not to clear the object.
Examples
========
>>> m = MultisetPartitionTraverser()
>>> m.count_partitions([9, 8, 2])
288716
>>> m.count_partitions([2, 2])
9
>>> del m
Notes
=====
If one looks at the workings of Knuth's algorithm M, it
can be viewed as a traversal of a binary tree of parts. A
part has (up to) two children, the left child resulting from
the spread operation, and the right child from the decrement
operation. The ordinary enumeration of multiset partitions is
an in-order traversal of this tree, and with the partitions
corresponding to paths from the root to the leaves. The
mapping from paths to partitions is a little complicated,
since the partition would contain only those parts which are
leaves or the parents of a spread link, not those which are
parents of a decrement link.
For counting purposes, it is sufficient to count leaves, and
this can be done with a recursive in-order traversal. The
number of leaves of a subtree rooted at a particular part is a
function only of that part itself, so memoizing has the
potential to speed up the counting dramatically.
This method follows a computational approach which is similar
to the hypothetical memoized recursive function, but with two
differences:
1) This method is iterative, borrowing its structure from the
other enumerations and maintaining an explicit stack of
parts which are in the process of being counted. (There
may be multisets which can be counted reasonably quickly by
this implementation, but which would overflow the default
Python recursion limit with a recursive implementation.)
2) Instead of using the part data structure directly, a more
compact key is constructed. This saves space, but more
importantly coalesces some parts which would remain
separate with physical keys.
Unlike the enumeration functions, there is currently no _range
version of count_partitions. If someone wants to stretch
their brain, it should be possible to construct one by
memoizing with a histogram of counts rather than a single
count, and combining the histograms.
References
==========
* Algorithm 7.1.2.5M in Volume 4A, Combinatoral Algorithms,
Part 1, of The Art of Computer Programming, by Donald Knuth.
"""
# number of partitions so far in the enumeration
self.pcount = 0
# dp_stack is list of lists of (part_key, start_count) pairs
self.dp_stack = []
# dp_map is map part_key-> count, where count represents the
# number of multiset which are descendants of a part with this
# key, **or any of its decrements**