-
-
Notifications
You must be signed in to change notification settings - Fork 453
/
findstat.py
2881 lines (2212 loc) · 112 KB
/
findstat.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
r"""
FindStat - the Combinatorial Statistic Finder.
The FindStat database can be found at http://www.findstat.org .
Fix the following three notions:
- A *combinatorial collection* is a set `S` with interesting combinatorial properties,
- a *combinatorial map* is a combinatorially interesting map `f: S \to S'` between combinatorial collections, and
- a *combinatorial statistic* is a combinatorially interesting map `s: S \to \ZZ`.
You can use the sage interface to FindStat to:
- identify a combinatorial statistic from the values on a few small objects,
- obtain more terms, formulae, references, etc. for a given statistic,
- edit statistics and submit new statistics.
To access the database, use :class:`findstat<FindStat>`::
sage: findstat
The Combinatorial Statistic Finder (http://www.findstat.org/)
AUTHORS:
- Martin Rubey (2015): initial version.
A guided tour
-------------
Retrieving information
^^^^^^^^^^^^^^^^^^^^^^
The most straightforward application of the FindStat interface is to
gather information about a combinatorial statistic. To do this, we
supply :class:`findstat<FindStat>` with a list of `(object, value)`
pairs. For example::
sage: PM8 = PerfectMatchings(8)
sage: r = findstat([(m, m.number_of_nestings()) for m in PM8]); r # optional -- internet,random
0: (St000041: The number of nestings of a perfect matching. , [], 105)
...
The result of this query is a list (presented as a
:class:`sage.databases.oeis.FancyTuple`) of triples. The first
element of each triple is a :class:`FindStatStatistic` `s: S \to
\ZZ`, the second element a list of :class:`FindStatMap`'s `f_i: S_i
\to S_{i+1}`, and the third element is an integer::
sage: (s, list_f, quality) = r[0] # optional -- internet
The precise meaning of the result is as follows:
The composition `f_n \circ ... \circ f_2 \circ f_1` applied to
the objects sent to FindStat agrees with `quality` many `(object,
value)` pairs of `s` in the database. Moreover, there are no
other `(object, value)` pairs of `s` stored in the database,
i.e., there is no disagreement of values.
Put differently, if `quality` is not too small it is likely that the
statistic sent to FindStat equals `s \circ f_n \circ ... \circ f_2 \circ f_1`.
In the case at hand, the list of maps is empty and the integer
`quality` equals the number of `(object, value)` pairs passed to
FindStat. This means, that the set of `(object, value)` pairs of the
statistic `s` as stored in the FindStat database is a superset of the
data sent. We can now retrieve the description from the database::
sage: print s.description() # optional -- internet,random
The number of nestings of a perfect matching.
<BLANKLINE>
<BLANKLINE>
This is the number of pairs of edges $((a,b), (c,d))$ such that $a\le c\le d\le b$. i.e., the edge $(c,d)$ is nested inside $(a,b)$.
and check the references::
sage: s.references() # optional -- internet,random
0: [1] [[MathSciNet:1288802]]
1: [2] [[MathSciNet:1418763]]
If you prefer, you can look at this information also in your browser::
sage: findstat(41).browse() # optional -- webbrowser
Another interesting possibility is to look for equidistributed
statistics. Instead of submitting a list of pairs, we pass a pair of
lists::
sage: r = findstat((PM8, [m.number_of_nestings() for m in PM8])); r # optional -- internet,random
0: (St000041: The number of nestings of a perfect matching. , [], 105)
1: (St000042: The number of crossings of a perfect matching. , [], 105)
...
This results tells us that the database contains another entriy that is
equidistributed with the number of nestings on perfect matchings of
length `8`, namely the number of crossings.
Let us now look at a slightly more complicated example, where the
submitted statistic is the composition of a sequence of combinatorial
maps and a statistic known to FindStat. We use the occasion to
advertise yet another way to pass values to FindStat::
sage: r = findstat(Permutations(4), lambda pi: pi.saliances()[0]); r # optional -- internet,random
0: (St000051: The size of the left subtree. , [Mp00069: complement, Mp00061: to increasing tree], 24)
...
sage: (s, list_f, quality) = r[0] # optional -- internet
To obtain the value of the statistic sent to FindStat on a given
object, apply the maps in the list in the given order to this object,
and evaluate the statistic on the result. For example, let us check
that the result given by FindStat agrees with our statistic on the
following permutation::
sage: pi = Permutation([3,1,4,5,2]); pi.saliances()[0]
3
We first have to find out, what the maps and the statistic actually do::
sage: print s.description() # optional -- internet,random
The size of the left subtree.
sage: print s.code() # optional -- internet,random
def statistic(T):
return T[0].node_number()
sage: print list_f[0].code() + "\r\n" + list_f[1].code() # optional -- internet,random
def complement(elt):
n = len(elt)
return elt.__class__(elt.parent(), map(lambda x: n - x + 1, elt) )
<BLANKLINE>
def increasing_tree_shape(elt, compare=min):
return elt.increasing_tree(compare).shape()
So, the following should coincide with what we sent FindStat::
sage: pi.complement().increasing_tree_shape()[0].node_number()
3
Editing and submitting statistics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Of course, often a statistic will not be in the database::
sage: findstat([(d, randint(1,1000)) for d in DyckWords(4)]) # optional -- internet
a new statistic on Cc0005: Dyck paths
In this case, and if the statistic might be "interesting", please
consider submitting it to the database using
:meth:`FindStatStatistic.submit`.
Also, you may notice omissions, typos or even mistakes in the
description, the code and the references. In this case, simply
replace the value by using :meth:`FindStatStatistic.set_description`,
:meth:`FindStatStatistic.set_code` or
:meth:`FindStatStatistic.set_references`, and then
:meth:`FindStatStatistic.submit` your changes for review by the
FindStat team.
Classes and methods
-------------------
"""
#*****************************************************************************
# Copyright (C) 2015 Martin Rubey <martin.rubey@tuwien.ac.at>,
#
# Distributed under the terms of the GNU General Public License (GPL)
# as published by the Free Software Foundation; either version 2 of
# the License, or (at your option) any later version.
# http://www.gnu.org/licenses/
#*****************************************************************************
from sage.misc.inherit_comparison import InheritComparisonClasscallMetaclass
from sage.structure.element import Element
from sage.structure.parent import Parent
from sage.structure.unique_representation import UniqueRepresentation
from sage.misc.cachefunc import cached_method
from sage.misc.lazy_attribute import lazy_attribute
from sage.categories.sets_cat import Sets
from sage.structure.sage_object import SageObject
from sage.misc.misc import verbose
from sage.rings.integer import Integer
from sage.databases.oeis import FancyTuple
from string import join
from ast import literal_eval
from collections import OrderedDict
import re
import webbrowser
import tempfile
import time
import inspect
import json
import cgi
# import compatible with py2 and py3
from six.moves.urllib.parse import urlencode
from six.moves.urllib.request import Request, urlopen
from six.moves.urllib.error import HTTPError
# Combinatorial collections
from sage.combinat.alternating_sign_matrix import AlternatingSignMatrix, AlternatingSignMatrices
from sage.combinat.binary_tree import BinaryTree, BinaryTrees
from sage.combinat.core import Core, Cores
from sage.combinat.dyck_word import DyckWord, DyckWords
from sage.combinat.root_system.cartan_type import CartanType_abstract, CartanType
from sage.combinat.gelfand_tsetlin_patterns import GelfandTsetlinPattern, GelfandTsetlinPatterns
from sage.graphs.graph import Graph
from sage.combinat.composition import Composition, Compositions
from sage.combinat.partition import Partition, Partitions
from sage.combinat.ordered_tree import OrderedTree, OrderedTrees
from sage.combinat.parking_functions import ParkingFunction, ParkingFunction_class, ParkingFunctions
from sage.combinat.perfect_matching import PerfectMatching, PerfectMatchings
from sage.combinat.permutation import Permutation, Permutations
from sage.combinat.posets.posets import Poset, FinitePoset
from sage.combinat.posets.poset_examples import posets
from sage.combinat.tableau import SemistandardTableau, SemistandardTableaux, StandardTableau, StandardTableaux
from sage.combinat.set_partition import SetPartition, SetPartitions
from sage.graphs.graph_generators import graphs
from sage.combinat.words.word import Word
from sage.combinat.words.words import Words
from sage.combinat.words.abstract_word import Word_class
######################################################################
# the FindStat URLs
FINDSTAT_URL = 'http://www.findstat.org/'
FINDSTAT_URL_RESULT = FINDSTAT_URL + "StatisticFinder/Result/"
FINDSTAT_URL_LOGIN = FINDSTAT_URL + "StatisticFinder?action=login"
FINDSTAT_URL_NEW = FINDSTAT_URL + 'StatisticsDatabase/NewStatistic/'
FINDSTAT_URL_EDIT = FINDSTAT_URL + 'StatisticsDatabase/EditStatistic/'
FINDSTAT_URL_BROWSE = FINDSTAT_URL + 'StatisticsDatabase/'
FINDSTAT_URL_DOWNLOADS = 'http://downloads.findstat.org/'
FINDSTAT_URL_DOWNLOADS_STATISTICS = FINDSTAT_URL_DOWNLOADS + "statistics/%s.json"
FINDSTAT_URL_DOWNLOADS_COLLECTIONS = FINDSTAT_URL_DOWNLOADS + "collections.json"
FINDSTAT_URL_DOWNLOADS_MAPS = FINDSTAT_URL_DOWNLOADS + "maps.json"
######################################################################
# the number of values FindStat allows to search for at most
FINDSTAT_MAX_VALUES = 200
# the number of values FindStat needs at least to search for
FINDSTAT_MIN_VALUES = 3
# the number of maps that FindStat should compose at most to find a match
FINDSTAT_MAX_DEPTH = 5
# the number of values FindStat allows to submit at most
FINDSTAT_MAX_SUBMISSION_VALUES = 1200
# the fields of the FindStat database we expect
FINDSTAT_STATISTIC_IDENTIFIER = 'StatisticIdentifier'
FINDSTAT_STATISTIC_COLLECTION = 'StatisticCollection'
FINDSTAT_STATISTIC_DATA = 'StatisticData'
FINDSTAT_STATISTIC_GENERATING_FUNCTION = 'StatisticGeneratingFunction'
FINDSTAT_STATISTIC_NAME = 'StatisticTitle'
FINDSTAT_STATISTIC_DESCRIPTION = 'StatisticDescription'
FINDSTAT_STATISTIC_REFERENCES = 'StatisticReferences'
FINDSTAT_STATISTIC_CODE = 'StatisticCode'
FINDSTAT_STATISTIC_ORIGINAL_AUTHOR = 'StatisticOriginalAuthor' # unused, designates a dictionary with Name, Time
FINDSTAT_STATISTIC_UPDATE_AUTHOR = 'StatisticUpdateAuthor' # unused, designates a dictionary with Name, Time
FINDSTAT_POST_AUTHOR = 'StatisticAuthor' # designates the name of the author
FINDSTAT_POST_EMAIL = 'StatisticEmail'
FINDSTAT_POST_SAGE_CELL = 'SageCellField' # currently only used as post key
FINDSTAT_POST_EDIT = 'EDIT' # only used as post key
FINDSTAT_COLLECTION_IDENTIFIER = 'CollectionIdentifier'
FINDSTAT_COLLECTION_NAME = 'CollectionName'
FINDSTAT_COLLECTION_NAME_PLURAL = 'CollectionNamePlural'
FINDSTAT_COLLECTION_NAME_WIKI = 'CollectionNameWiki'
FINDSTAT_COLLECTION_LEVELS = 'CollectionLevelsSizes'
FINDSTAT_MAP_IDENTIFIER = 'MapIdentifier'
FINDSTAT_MAP_NAME = 'MapName'
FINDSTAT_MAP_DESCRIPTION = 'MapDescription'
FINDSTAT_MAP_DOMAIN = 'MapDomain'
FINDSTAT_MAP_CODOMAIN = 'MapCodomain'
FINDSTAT_MAP_CODE = 'MapCode'
FINDSTAT_MAP_CODE_NAME = 'MapSageName'
FINDSTAT_QUERY_MATCHES = 'QueryMatches'
FINDSTAT_QUERY_MATCHING_DATA = 'QueryMatchingData'
FINDSTAT_QUERY_MAPS = 'QueryMaps'
# the entries of this list are required as post arguments for submitting or editing a statistic
FINDSTAT_EDIT_FIELDS = set([FINDSTAT_STATISTIC_IDENTIFIER,
FINDSTAT_STATISTIC_COLLECTION,
FINDSTAT_STATISTIC_DATA,
FINDSTAT_STATISTIC_DESCRIPTION,
FINDSTAT_STATISTIC_REFERENCES,
FINDSTAT_STATISTIC_CODE,
FINDSTAT_POST_AUTHOR,
FINDSTAT_POST_EMAIL,
FINDSTAT_POST_SAGE_CELL,
FINDSTAT_POST_EDIT])
# separates name from description
FINDSTAT_SEPARATOR_NAME = "\r\n"
# separates references
FINDSTAT_SEPARATOR_REFERENCES = "\r\n"
######################################################################
# the format string for using POST
# WARNING: we use cgi.escape to avoid injection problems, thus we expect double quotes as field delimiters.
FINDSTAT_POST_HEADER = """
<script src="http://www.google.com/jsapi"></script>
<script>
google.load("jquery", "1.3.2");
</script>
<script>
$(document).ready(function() {$("#form").submit(); });
</script>
"""
FINDSTAT_NEWSTATISTIC_FORM_HEADER = '<form id="form" name="NewStatistic" action="%s" enctype="multipart/form-data" method="post" />'
FINDSTAT_NEWSTATISTIC_FORM_FORMAT = '<input type="hidden" name="%s" value="%s" />'
FINDSTAT_NEWSTATISTIC_FORM_FOOTER = '</form>'
######################################################################
class FindStat(SageObject):
r"""
The Combinatorial Statistic Finder.
:class:`FindStat` is a class representing results of queries to
the FindStat database. This class is also the entry point to
edit statistics and new submissions. Use the shorthand
:class:`findstat<FindStat>` to call it.
INPUT:
One of the following:
- an integer or a string representing a valid FindStat identifier
(e.g. 45 or 'St000045'). The keyword arguments ``depth`` and
``max_values`` are ignored.
- a list of pairs of the form (object, value), or a dictionary
from sage objects to integer values. The keyword arguments
``depth`` and ``max_values`` are passed to the finder.
- a list of pairs of the form (list of objects, list of values),
or a single pair of the form (list of objects, list of values).
In each pair there should be as many objects as values. The
keyword arguments ``depth`` and ``max_values`` are passed to
the finder.
- a collection and a list of pairs of the form (string, value),
or a dictionary from strings to integer values. The keyword
arguments ``depth`` and ``max_values`` are passed to the
finder. This should only be used if the collection is not yet
supported.
- a collection and a list of pairs of the form (list of strings,
list of values), or a single pair of the form (list of strings,
list of values). In each pair there should be as many strings
as values. The keyword arguments ``depth`` and ``max_values``
are passed to the finder. This should only be used if the
collection is not yet supported.
- a collection and a callable. The callable is used to generate
``max_values`` (object, value) pairs. The number of terms
generated may also be controlled by passing an iterable
collection, such as ``Permutations(3)``. The keyword arguments
``depth`` and ``max_values`` are passed to the finder.
OUTPUT:
An instance of a :class:`FindStatStatistic`, represented by
- the FindStat identifier together with its name, or
- a list of triples, each consisting of
- the statistic
- a list of strings naming certain maps
- a number which says how many of the values submitted agree
with the values in the database, when applying the maps in
the given order to the object and then computing the
statistic on the result.
EXAMPLES:
A particular statistic can be retrieved by its St-identifier or
number::
sage: findstat('St000041') # optional -- internet,random
St000041: The number of nestings of a perfect matching.
sage: findstat(51) # optional -- internet,random
St000051: The size of the left subtree.
The database can be searched by providing a list of pairs::
sage: q = findstat([(pi, pi.length()) for pi in Permutations(4)]); q # optional -- internet,random
0: (St000018: The [[/Permutations/Inversions|number of inversions]] of a permutation., [], 24)
1: (St000004: The [[/Permutations/Descents-Major|major index]] of a permutation., [Mp00062: inversion-number to major-index bijection], 24)
...
or a dictionary::
sage: p = findstat({pi: pi.length() for pi in Permutations(4)}); p # optional -- internet,random
0: (St000018: The [[/Permutations/Inversions|number of inversions]] of a permutation., [], 24)
1: (St000004: The [[/Permutations/Descents-Major|major index]] of a permutation., [Mp00062: inversion-number to major-index bijection], 24)
...
Note however, that the results of these two queries are not
necessarily the same, because we compare queries by the data
sent, and the ordering of the data might be different::
sage: p == q # optional -- internet
False
Another possibility is to send a collection and a function. In
this case, the function is applied to the first few objects of
the collection::
sage: findstat("Permutations", lambda pi: pi.length()) # optional -- internet,random
0: (St000018: The [[/Permutations/Inversions|number of inversions]] of a permutation., [], 200)
...
To search for a distribution, send a list of lists, or a single pair::
sage: S4 = Permutations(4); findstat((S4, [pi.length() for pi in S4])) # optional -- internet,random
0: (St000004: The [[/Permutations/Descents-Major|major index]] of a permutation., [], 24)
1: (St000018: The [[/Permutations/Inversions|number of inversions]] of a permutation., [], 24)
...
Note that there is a limit, ``FINDSTAT_MAX_DEPTH``, on the number
of elements that may be submitted to FindStat, which is currently
200. Therefore, the interface tries to truncate queries
appropriately, but this may be impossible, especially with
distribution searches::
sage: S6 = Permutations(6); S6.cardinality() # optional -- internet
720
sage: findstat((S6, [1 for a in S6])) # optional -- internet
Traceback (most recent call last):
...
ValueError: After discarding elements not in the range, too few (=0) values remained to send to FindStat.
"""
def __init__(self):
r"""
Initialize the database.
In particular, we set up a cache from integers to
:class:`FindStatStatistic` instances that avoids retrieving
the same statistic over and over again.
TESTS::
sage: findstat
The Combinatorial Statistic Finder (http://www.findstat.org/)
"""
self._statistic_cache = dict()
# user credentials if provided
self._user_name = ""
self._user_email = ""
def __call__(self, query_1, query_2=None, depth=2, max_values=FINDSTAT_MAX_VALUES):
r"""
Return an instance of a :class:`FindStatStatistic`.
This should be the only way to access
:class:`FindStatStatistic`. We do the preprocessing and
(some) checking of the data here. Then we call the
appropriate method of :class:`FindStatStatistic` to launch
the query. Thus, in principle :class:`FindStatStatistic`
could be called if no checking is desired.
EXAMPLES::
sage: from sage.databases.findstat import FindStatCollection
sage: findstat(FindStatCollection("Permutations"), lambda pi: pi.length()) # optional -- internet,random
0: (St000018: The [[/Permutations/Inversions|number of inversions]] of a permutation., [], 200)
1: (St000004: The [[/Permutations/Descents-Major|major index]] of a permutation., [Mp00062: inversion-number to major-index bijection], 200)
2: (St000067: The inversion number of the alternating sign matrix., [Mp00063: to alternating sign matrix], 152)
3: (St000246: The [[/Permutations/Inversions|number of non-inversions]] of a permutation., [Mp00064: reverse], 200)
4: (St000008: The major index of the composition., [Mp00062: inversion-number to major-index bijection, Mp00071: descent composition], 200)
TESTS::
sage: findstat("Permutations", lambda x: 1, depth="x")
Traceback (most recent call last):
...
ValueError: The depth of a FindStat query must be a non-negative integer less than or equal to 5.
sage: findstat("Permutations", lambda x: 1, depth=100)
Traceback (most recent call last):
...
ValueError: The depth of a FindStat query must be a non-negative integer less than or equal to 5.
sage: findstat("Permutations", 1) # optional -- internet
Traceback (most recent call last):
...
ValueError: The given arguments, Permutations and 1, cannot be used for a FindStat search.
sage: S = Permutation
sage: findstat([[S([1,2,3]),S([1,3,2])],[1,1]]) # optional -- internet
Traceback (most recent call last):
...
ValueError: After discarding elements not in the range, too few (=2) values remained to send to FindStat.
sage: findstat(([S([1,3,2]), S([1,2,3]), S([1,3,2])], [1,1,1])) # optional -- internet
Traceback (most recent call last):
...
ValueError: FindStat expects that every object occurs at most once.
sage: findstat([(S([1,2]), 1), ([S([1,3,2]), S([1,2])], [2,3])]) # optional -- internet
Traceback (most recent call last):
...
ValueError: FindStat expects that every object occurs at most once.
"""
try:
depth = int(depth)
assert 0 <= depth <= FINDSTAT_MAX_DEPTH
except (ValueError, AssertionError):
raise ValueError("The depth of a FindStat query must be a non-negative integer less than or equal to %i." %FINDSTAT_MAX_DEPTH)
def get_collection(collection=None, element=None):
if collection is None:
collection = FindStatCollection(element)
return (collection, collection.to_string())
else:
return (FindStatCollection(collection), lambda x: x)
def query_by_dict(query, collection=None):
# we expect a dictionary from objects or strings to
# integers
l = query.iteritems()
(key, value) = l.next()
(collection, to_str) = get_collection(collection, key)
data = ([([key], [to_str(key)], [Integer(value)])] +
[([key], [to_str(key)], [Integer(value)]) for (key, value) in l])
first_terms = [(key[0], value[0]) for (key, key_str, value) in data]
return FindStatStatistic(id=0, data=data,
first_terms=first_terms,
collection=collection,
depth=depth)._find_by_values(max_values=max_values)
def query_by_iterable(query, collection=None):
# either a pair (list of objects, list of integers)
# or a list of such or (object, integer) pairs
# values must always be converted to lists because
# otherwise we get a trailing comma when printing
if (len(query) == 2 and
isinstance(query[1], (list, tuple)) and
len(query[1]) != 0 and
isinstance(query[1][0], (int, Integer))):
# just a single pair, i.e., a pure distribution query
(collection, to_str) = get_collection(collection, query[0][0])
data = [(query[0], map(to_str, query[0]), map(Integer, query[1]))]
if len(data[0][0]) != len(data[0][2]):
raise ValueError("FindStat expects the same number of objects as values.")
if len(set(data[0][1])) != len(data[0][2]):
raise ValueError("FindStat expects that every object occurs at most once.")
return FindStatStatistic(id=0, data=data,
collection=collection,
depth=depth)._find_by_values(max_values=max_values)
else:
(key, value) = query[0]
if isinstance(value, (list, tuple)):
(collection, to_str) = get_collection(collection, key[0])
else:
(collection, to_str) = get_collection(collection, key)
data = []
is_statistic = True
for (key, value) in query:
if isinstance(value, (list, tuple)):
if len(key) != len(value):
raise ValueError("FindStat expects the same number of objects as values.")
if len(value) != 1:
is_statistic = False
data += [(key, map(to_str, key), map(Integer, value))]
else:
data += [([key], [to_str(key)], [Integer(value)])]
all_elements = [e for (elements, elements_str, values) in data for e in elements_str]
if len(set(all_elements)) != len(all_elements):
raise ValueError("FindStat expects that every object occurs at most once.")
if is_statistic:
return FindStatStatistic(id=0, data=data,
collection=collection,
first_terms=query,
depth=depth)._find_by_values(max_values=max_values)
else:
return FindStatStatistic(id=0, data=data,
collection=collection,
depth=depth)._find_by_values(max_values=max_values)
if query_2 is None:
if isinstance(query_1, str):
if re.match('^St[0-9]{6}$', query_1):
return self._statistic_find_by_id_cached(Integer(query_1[2:].lstrip("0")))
else:
raise ValueError("The value %s is not a valid FindStat statistic identifier." %query_1)
elif isinstance(query_1, (int, Integer)):
return self._statistic_find_by_id_cached(query_1)
elif isinstance(query_1, dict):
return query_by_dict(query_1)
elif isinstance(query_1, (list, tuple)):
return query_by_iterable(query_1)
else:
raise ValueError("The given query, %s, cannot be used for a FindStat search." %query_1)
else:
(collection, to_str) = get_collection(None, query_1)
if isinstance(query_2, dict):
return query_by_dict(query_2, collection)
elif isinstance(query_2, (list, tuple)):
return query_by_iterable(query_2, collection)
elif callable(query_2):
first_terms = collection.first_terms(query_2, max_values=max_values)
data = [([key], [to_str(key)], [Integer(value)]) for (key, value) in first_terms]
try:
code = inspect.getsource(query_2)
except IOError:
_ = verbose("inspect.getsource could not get code from function provided", caller_name='FindStat')
code = ""
return FindStatStatistic(id=0, first_terms=first_terms,
data=data, function=query_2, code=code,
collection=collection,
depth=depth)._find_by_values(max_values=max_values)
else:
raise ValueError("The given arguments, %s and %s, cannot be used for a FindStat search." %(query_1, query_2))
def __repr__(self):
r"""
Return the representation of ``self``.
EXAMPLES::
sage: findstat
The Combinatorial Statistic Finder (http://www.findstat.org/)
"""
return "The Combinatorial Statistic Finder (%s)" % FINDSTAT_URL
def browse(self):
r"""
Open the FindStat web page in a browser.
EXAMPLES::
sage: findstat.browse() # optional -- webbrowser
"""
webbrowser.open(FINDSTAT_URL)
def set_user(self, name=None, email=None):
r"""
Set the user for this session.
INPUT:
- ``name`` -- the name of the user.
- ``email`` -- an email address of the user.
This information is used when submitting a statistic with
:meth:`FindStatStatistic.submit`.
EXAMPLES::
sage: findstat.set_user(name="Anonymous", email="invalid@org")
.. NOTE::
It is usually more convenient to login into the FindStat
web page using the :meth:`login` method.
"""
if not isinstance(name, str):
raise ValueError("The given name is not a string.")
if not isinstance(email, str):
raise ValueError("The given email address is not a string.")
self._user_name = name
self._user_email = email
def login(self):
r"""
Open the FindStat login page in a browser.
EXAMPLES::
sage: findstat.login() # optional -- webbrowser
"""
webbrowser.open(FINDSTAT_URL_LOGIN)
######################################################################
def _statistic_find_by_id_cached(self, id):
r"""
INPUT:
- ``id`` -- an integer designating the FindStat id of a statistic.
OUTPUT:
An instance of :class:`FindStatStatistic`.
.. TODO::
this method caches the statistics. It may make sense to
provide a method that clears the cache, or reloads a
single statistic.
TESTS::
sage: findstat(41).set_description("") # optional -- internet, indirect doctest
sage: findstat(41).description() == "" # optional -- internet, indirect doctest
True
"""
if id > 0:
if id not in self._statistic_cache.keys():
self._statistic_cache[id] = FindStatStatistic(id)._find_by_id()
return self._statistic_cache[id]
else:
raise ValueError("A FindStat statistic identifier must be at least 1.")
######################################################################
class FindStatStatistic(SageObject):
r"""
The class of FindStat statistics.
Do not instantiate this class directly. Instead, use
:class:`findstat<FindStat>`.
"""
def __init__(self, id, first_terms=None, data=None, function=None, code="", collection=None, depth=None):
r"""
Initialize a FindStat query for a statistic from preprocessed
data.
INPUT:
- ``id`` -- an integer designating the FindStat id of a
statistic, or 0.
- ``first_terms`` -- (optional) a list of (object, value)
pairs, see :meth:`first_terms`.
- ``data`` -- (optional) a list of pairs of the form (list of
objects represented as strings, list of values), see
:meth:`data`.
- ``function`` -- (optional) a function taking a sage object
as input and returning the value of the statistic on this
object, see :meth:`function`.
- ``code`` -- (optional) a string containing code (possibly
pseudocode or code for a different computer algebra
system), see :meth:`code`.
- ``collection`` -- (optional) an instance of
:class:`FindStatCollection`, see :meth:`collection`.
- ``depth`` -- (optional) an integer between 0 and
FINDSTAT_MAX_DEPTH, which determines how many maps FindStat
should compose at most to find a match.
This method by itself does not launch the query, because
there are two rather different queries possible, see
:meth:`_find_by_id` and :meth:`_find_by_values`.
TESTS::
sage: from sage.databases.findstat import FindStatStatistic
sage: FindStatStatistic(1)._find_by_id() # optional -- internet
St000001: The number of ways to write a permutation as a minimal length product of simple transpositions.
"""
self._depth = depth
self._query = None
self._modified = False
self._id = id
self._result = None
self._first_terms = first_terms
self._data = data
self._function = function
self._code = code
self._collection = collection
self._description = ""
self._references = ""
def __repr__(self):
r"""
Return the representation of the FindStat query.
OUTPUT:
A string. If the query was by identifier, the identifier and
the name of the statistic. If the statistic was modified
(see :meth:`modified`) this is also indicated.
If the query was by values and at least one match was found,
the list of matches.
Otherwise, if no match was found, a message saying this.
EXAMPLES::
sage: findstat(1) # optional -- internet
St000001: ...
sage: findstat([(pi, randint(1,100)) for pi in Permutations(3)]) # optional -- internet
a new statistic on Cc0001: Permutations
sage: findstat([(pi, pi(1)) for pi in Permutations(3)]) # optional -- internet
0: (St000054: ...
1: ...
2: ...
"""
if self._query == "ID":
if self._modified:
return "%s(modified): %s" % (self.id_str(), self.name())
else:
return "%s: %s" % (self.id_str(), self.name())
elif self._query == "data":
if len(self._result) == 0:
return "a new statistic on " + repr(self._collection)
else:
return repr(self._result)
else:
raise ValueError("FindStatStatistic._query should be either 'ID' or 'data', but is %s. This should not happen. Please send an email to the developers." %self._query)
def __eq__(self, other):
"""Return ``True`` if ``self`` is equal to ``other`` and ``False``
otherwise.
INPUT:
- ``other`` -- a FindStat query, i.e., instance of :class:`FindStatStatistic`.
OUTPUT:
A boolean.
Two queries are considered equal if all of the following
applies:
- the queries are both of the same type, i.e., both are by
identifier or both are by values,
- if the queries are by identifier, the identifiers agree and
the statistics are both unmodified,
- if the queries are by values, the submitted data are the
same and the statistic data are both unmodified.
.. TODO::
this is *very* rudimentary
EXAMPLES::
sage: findstat(1) == findstat(41) # optional -- internet
False
sage: r1 = findstat(Permutations(3), lambda pi: pi.saliances()[0]) # optional -- internet
sage: r2 = findstat(Permutations(4), lambda pi: pi.saliances()[0]) # optional -- internet
sage: r1 == r2 # optional -- internet
False
"""
if (not isinstance(other, FindStatStatistic)):
return False
if self._query == "ID" and other._query == "ID":
if self._modified or other._modified:
return False
else:
return self._id == other._id
elif self._query == "data" and other._query == "data":
if self._modified or other._modified:
return False
else:
return self._data == other._data
else:
return False
def __ne__(self, other):
"""Determine whether ``other`` is a different query.
INPUT:
- ``other`` -- a FindStat query, i.e., instance of
:class:`FindStatStatistic`..
OUTPUT:
A boolean.
SEEALSO:
:meth:`__eq__`
EXAMPLES::
sage: r1 = findstat(Permutations(3), lambda pi: pi.saliances()[0]) # optional -- internet
sage: r2 = findstat(Permutations(4), lambda pi: pi.saliances()[0]) # optional -- internet
sage: r1 != r2 # optional -- internet
True
"""
return not self == other
######################################################################
# query = "ID"
######################################################################
def _find_by_id(self):
r"""
Retrieve the statistic matching ``self._id``.
OUTPUT:
The statistic ``self``.
Expects that ``_id`` is a valid identifier. Overwrites all
variables associated with the statistic, such as
``_description``, ``_code``, ``_references``, etc.
TESTS::
sage: findstat(999999) # optional -- internet, indirect doctest
Traceback (most recent call last):
...
ValueError: St999999 is not a FindStat statistic identifier.
"""
self._query = "ID"
# get the database entry from FindStat
url = FINDSTAT_URL_DOWNLOADS_STATISTICS %self.id_str()
_ = verbose("Fetching URL %s ..." %url, caller_name='FindStat')
try:
self._raw = json.load(urlopen(url), object_pairs_hook=OrderedDict)
except HTTPError as error:
if error.code == 404:
raise ValueError("%s is not a FindStat statistic identifier." %self.id_str())
else:
raise
self._description = self._raw[FINDSTAT_STATISTIC_DESCRIPTION].encode("utf-8")
self._name = self._raw[FINDSTAT_STATISTIC_NAME].encode("utf-8")
self._references = self._raw[FINDSTAT_STATISTIC_REFERENCES].encode("utf-8")
self._collection = FindStatCollection(self._raw[FINDSTAT_STATISTIC_COLLECTION])
self._code = self._raw[FINDSTAT_STATISTIC_CODE]
from ast import literal_eval
gf = self._raw[FINDSTAT_STATISTIC_GENERATING_FUNCTION]
self._generating_functions_dict = { literal_eval(key):
{ literal_eval(inner_key): inner_value
for inner_key, inner_value in value.iteritems() }
for key, value in gf.iteritems() }
from_str = self._collection.from_string()
# we want to keep FindStat's ordering here!
self._first_terms = [(from_str(obj), Integer(val)) for (obj, val) in self._raw[FINDSTAT_STATISTIC_DATA].iteritems()]
return self
######################################################################
# query = "data"
######################################################################
def _find_by_values(self, max_values=FINDSTAT_MAX_VALUES):
r"""
Retrieve the statistics matching ``self._data``.
OUTPUT:
The query ``self``.
Expects that ``_data`` is a list of triples of the form (list
of objects, list of objects represented as strings, list of
values), each containing as many values as objects, and that
``_collection`` is appropriately set.
TESTS::
sage: from sage.databases.findstat import FindStatCollection
sage: from sage.databases.findstat import FindStatStatistic
sage: collection = FindStatCollection("Dyck paths") # optional -- internet
sage: to_str = collection.to_string() # optional -- internet
sage: query = {dw:dw.area() for dw in DyckWords(4)}