## Sampling the space of pure data 



One easy way to produce sample data is with **ap const**, which repeats its arguments once for each input atom...

In [1]:
ap const a : 1 2 3 4 
ap const a b : 1 2 3 4 
ap const (:) : 1 2 3 4 

a a a a
a b a b a b a b
◎ ◎ ◎ ◎
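
As a rough mental model only, the behaviour above can be mimicked in Python. This is an analogy, not the coda implementation; the function name `ap_const` and the use of Python lists are assumptions made for illustration.

```python
def ap_const(args, inputs):
    """Hypothetical model of `ap const <args> : <inputs>`:
    emit the argument list once for every input item."""
    out = []
    for _ in inputs:          # the input values themselves are ignored
        out.extend(args)      # only their number matters
    return out

print(ap_const(["a"], [1, 2, 3, 4]))       # ['a', 'a', 'a', 'a']
print(ap_const(["a", "b"], [1, 2, 3, 4]))  # ['a', 'b', 'a', 'b', 'a', 'b', 'a', 'b']
```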

In most sampling situations, we want a specified collection d1, d2, d3, ..., where each of d1, d2, d3 is data.  To produce such a collection, you don't want to produce d1 d2 d3 directly, because concatenation merges the items into one data and loses the boundaries between them.  Instead, an easy solution is to produce the sequence

* (:d1) (:d2) (:d3)

where each data in the collection has its own container.  We'll do a bit of this by hand...

In [2]:
(put : x y z ) (put : a b c )

(:x y z) (:a b c)
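
The reason for the containers can be seen with a loose Python analogy (an assumption for illustration: coda data is modelled here as flat Python lists, and a container as a tuple). Concatenating two lists loses the boundary between them, while boxing each one keeps the items separate.

```python
d1 = ["x", "y", "z"]
d2 = ["a", "b", "c"]

# Plain concatenation: the boundary between d1 and d2 is lost.
flat = d1 + d2

# One container per item: each data survives as a separate element.
boxed = [tuple(d1), tuple(d2)]

print(flat)   # ['x', 'y', 'z', 'a', 'b', 'c']
print(boxed)  # [('x', 'y', 'z'), ('a', 'b', 'c')]
```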

For convenience, **sample.odd** and **sample.even** produce odd-length and even-length sequences of the simplest atom (:) ("Hydrogen"), packaged as above...

In [3]:
sample.odd : 4 
sample.even : 4

(:◎) (:◎ ◎ ◎) (:◎ ◎ ◎ ◎ ◎) (:◎ ◎ ◎ ◎ ◎ ◎ ◎)
◎ (:◎ ◎) (:◎ ◎ ◎ ◎) (:◎ ◎ ◎ ◎ ◎ ◎)
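
Judging from the output above, `sample.odd : 4` gives runs of length 1, 3, 5, 7 and `sample.even : 4` gives runs of length 0, 2, 4, 6. A small Python sketch of that pattern follows; the function names and the list representation are hypothetical, chosen only to mirror the output.

```python
ATOM = "◎"  # stands in for the atom (:)

def sample_odd(n):
    """First n odd-length runs of the atom: lengths 1, 3, 5, ..."""
    return [[ATOM] * (2 * k + 1) for k in range(n)]

def sample_even(n):
    """First n even-length runs of the atom: lengths 0, 2, 4, ..."""
    return [[ATOM] * (2 * k) for k in range(n)]

print([len(s) for s in sample_odd(4)])   # [1, 3, 5, 7]
print([len(s) for s in sample_even(4)])  # [0, 2, 4, 6]
```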

The **sample.pure** operator lets one sample the whole space of pure data via 

* sample.pure : width depth 

where **width** is the maximum data length and **depth** is the maximum depth.  For example...

In [29]:
#
#  sample.pure : <width> <depth> 
#
tab : pure : first 10 : sample.pure : 2 2 

 (:) 
 (:(:)) 
 (:(:(:))) 
 (:(:(:)(:))) 
 (:((:):)) 
 (:((:):(:))) 
 (:((:):(:)(:))) 
 (:((:)(:):)) 
 (:((:)(:):(:))) 
 (:((:)(:):(:)(:))) 


The direct approach is rarely useful because the universe of pure data grows extremely rapidly with **width** and **depth**, so much so that

* sample.pure : 2 3 

is already challenging for a laptop.  

### Size of sample.pure versus width and depth:

|         | width 1 | width 2    | width 3 | width 4 | width 5    |
|---------|--------:|-----------:|--------:|--------:|-----------:|
| depth 1 |       2 |          3 |       4 |       5 |          6 |
| depth 2 |       5 |         91 |   4,369 | 406,901 | 62,193,781 |
| depth 3 |      26 | 68,583,243 |       ? |       ? |          ? |
| depth 4 |     677 |          ? |       ? |       ? |          ? |
| depth 5 | 458,330 |          ? |       ? |       ? |          ? |
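
The known entries in this table are all consistent with a simple counting recurrence: if N is the number of data at the previous depth, there are N² atoms (A:B), and the next depth contains every sequence of 0 to **width** such atoms. This recurrence is inferred from the numbers in the table, not taken from the sample.pure source, and the function name `pure_size` is hypothetical. A minimal Python sketch under that assumption:

```python
def pure_size(width, depth):
    """Count pure data under the assumed recurrence
    N(w, d) = sum over k = 0..w of (N(w, d-1) ** 2) ** k, with N(w, 0) = 1."""
    n = 1  # depth 0: only the empty data
    for _ in range(depth):
        atoms = n * n  # an atom (A:B) pairs two data from the previous level
        n = sum(atoms ** k for k in range(width + 1))
    return n

# Reproduce the first two columns of the table for depths 1 to 3.
for d in range(1, 4):
    print(d, [pure_size(w, d) for w in (1, 2)])
```

If the assumption holds, the same function gives an estimate for the entries marked "?", which makes clear why enumerating them directly is impractical.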

Of course, the space of pure data includes all data, meaning all mathematical, logical and computational objects.  Undecided data ("variables"), for example, already appear in sample.pure : 2 2, which you can detect, for example, by counting.

In [5]:
ap {count : get : B} : sample.pure : 2 2 

(count:(({get }:◎):({ B}:◎))) (count:(({get }:(:◎)):({ B}:(:◎)))) (count:(({get }:(:(:◎))):({ B}:(:(:◎))))) (count:(({get }:(:(:◎ ◎))):({ B}:(:(:◎ ◎))))) (count:(({get }:(:𝟬)):({ B}:(:𝟬)))) (count:(({get }:(:𝝞)):({ B}:(:𝝞)))) (count:(({get }:(:(◎:◎ ◎))):({ B}:(:(◎:◎ ◎))))) (count:(({get }:(:(◎ ◎:))):({ B}:(:(◎ ◎:))))) (count:(({get }:(:(◎ ◎:◎))):({ B}:(:(◎ ◎:◎))))) (count:(({get }:(:(◎ ◎:◎ ◎))):({ B}:(:(◎ ◎:◎ ◎))))) (count:(({get }:(:◎ ◎)):({ B}:(:◎ ◎)))) (count:(({get }:(:◎ (:◎))):({ B}:(:◎ (:◎))))) (count:(({get }:(:◎ (:◎ ◎))):({ B}:(:◎ (:◎ ◎))))) (count:(({get }:(:◎ 𝟬)):({ B}:(:◎ 𝟬)))) (count:(({get }:(:◎ 𝝞)):({ B}:(:◎ 𝝞)))) (count:(({get }:(:◎ (◎:◎ ◎))):({ B}:(:◎ (◎:◎ ◎))))) (count:(({get }:(:◎ (◎ ◎:))):({ B}:(:◎ (◎ ◎:))))) (count:(({get }:(:◎ (◎ ◎:◎))):({ B}:(:◎ (◎ ◎:◎))))) (count:(({get }:(:◎ (◎ ◎:◎ ◎))):({ B}:(:◎ (◎ ◎:◎ ◎))))) (count:(({get }:(:(:◎) ◎)):({ B}:(:(:◎) ◎)))) (count:(({get }:(:(:◎) (:◎))):({ B}:(:(:◎) (:◎))))) (count:(({get }:(:(:◎) (:◎ ◎))):({ B}:(:(:◎) (:◎ ◎))))) 

### Practical searching 

Although **sample.pure** samples all pure data and, therefore, all mathematical objects, these samples are typically far too large to be practically helpful.  The operator **sample.data** is much more practical.  It takes a sequence of codas and a width.

* sample.data <...sequence of codas..> : width 

A few examples will illustrate. 

In [6]:
#
#   Just one argument coda "a" and width <= 5 gives 
#
sample.data a : 5 

◎ (:a a a a) (:a a a a a) (:a) (:a a) (:a a a)

In [7]:
#
#   Note that the sample is delivered, as usual, as (:data1) (:data2) ... 
#
#   If you provide two codas, a and b, all sequences of a and b up to the given width are included.
#
sample.data a b : 5

(:a b b b b) (:a b b b a) (:b b b a b) (:a a b a) (:b a b b a) (:b b b b a) (:b a b) ◎ (:b a b a) (:b a b a a) (:a b a) (:b b b a) (:a a) (:b b b a a) (:a a a a b) (:b b a b b) (:a b a b) (:a a b a a) (:a b a a b) (:b a a b) (:b b a a b) (:a b) (:a a a a) (:b b b b b) (:a a a b a) (:b a) (:a a a) (:b a a b a) (:a a b b b) (:b a a b b) (:b b) (:b b b) (:b a a a a) (:b b a a) (:a b b a b) (:b a b a b) (:a a b a b) (:a b b a) (:a b b b) (:a b b a a) (:a a a b) (:b a a) (:b b b b) (:b a a a b) (:b a b b b) (:a b a b a) (:b) (:a) (:b b a a a) (:a b b) (:a b a b b) (:b b a b a) (:b a a a) (:b b a b) (:b a b b) (:b b a) (:a b a a) (:a b a a a) (:a a b) (:a a b b a) (:a a a b b) (:a a a a a) (:a a b b)
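
The two outputs above contain 6 and 63 items respectively, which matches enumerating every sequence of the given codas with length from 0 to the width (1 + 2 + 4 + 8 + 16 + 32 = 63 for two codas). The Python sketch below models only this plain-coda case; the name `sample_data` is hypothetical, the output order of the real operator is not reproduced, and the language literals used in later cells are not covered.

```python
from itertools import product

def sample_data(codas, width):
    """All sequences over `codas` with length 0..width (order not modelled)."""
    return [seq for n in range(width + 1) for seq in product(codas, repeat=n)]

print(len(sample_data(["a"], 5)))       # 6
print(len(sample_data(["a", "b"], 5)))  # 63
```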

In [8]:
#
#   You can add language literals <A> and <B> and a special symbol <{$}> to create language expressions.
#
sample.data <A> <B> <{$}> a b (defs:Basic) : 2 

(:star) (:sum {B : if}) (:pass {is B}) (:pass {const B}) (:{B : bin} B) (:B {B isnt}) (:star {B : star}) (:{B sum} pass) (:{B : if} a) (:if {B null}) (:isnt {B : pass}) (:{null B} star) (:{pass : B} if) (:{nif : B} A) (:{B : if} left) (:put {b B}) (:left {isnt : B}) (:const {B star}) (:{B : get} sum) (:hasnt {B : arg}) (:{B sum} plus) (:{prod : B} const) (:{sum B} has) (:left {const : B}) (:{B : isnt} null) (:{prod B} b) (:get {B}) (:{B : nif} star) (:{B const} if) (:sum {B const}) (:pass A) (:{is : B} B) (:{null B} a) (:{B plus} get) (:{domain B} const) (:star {B prod}) (:plus {B null}) (:star {B : hasnt}) (:{plus B} has) (:null {B : domain}) (:{B is} A) (:A {get B}) (:is {a : B}) (:{null : B} put) (:{has : B} arg) (:{prod B} bin) (:{B nif} right) (:{left : B} get) (:{prod : B} right) (:prod {B isnt}) (:{left B} a) (:{prod : B} get) (:{left : B} B) (:const {B : star}) (:has {B : plus}) (:B {left B}) (:{b : B} B) (:if get) (:isnt {B prod}) (:{has : B} star) (:bin has) (:{right B} left)

In [9]:
#
#   To meaningfully search, one should add domains representing established definitions.  To do this, first see 
#   all currently available definitions like so...
#
defs:

= Def Let ab allcodes alphabet and ap apby aq ar arg as assign bin bool by cache.clear cache.reset cases co coda codes collect const count d2s def defaultTime defs demo digits dir do do1 dom domain domulti down down1 endswith equal equiv eval first float_div float_inv float_max float_min float_prod float_sort float_sum floats get has hasnt head help home if iff imply import info int_div int_inv int_max int_min int_prod int_sort int_sum ints is isnt join ker kernel language last left let letters localdef log logging logs module more multi name nand nat nat_max nat_min nat_prod nat_sort nat_sum nats nchar nif nor not nth nth1 null once or pass pause permutation plus post pre printable prod pure put read readpath rep rev right ring s2d sample.atom sample.data sample.even sample.odd sample.pure sample.window skip some source sources split star startswith stat stata step stick sum swap tab tail term text_sort theorem uni up up1 use use1 with wrap write xnor xor ◎ 𝝞 𝟬

In [10]:
#
#    Each definition is in a "module", which is either a Python file (.py) or a coda file (.co).
#
once : module : defs:

Logic  Define Apply Text Basic Variable Sequence Evaluation Search Source Collect IO Help Evaluate Path Number Import Language Log Time Sample

In [11]:
#
#    Typically one would want some but not all definitions for a search.  For instance, one typically doesn't want to search 
#    over IO operations or Sample operations (to avoid recursion) or over help system operations.  A typical choice is...
#
defs : Apply Basic Logic Number Sequence  

= ab and ap apby aq ar arg as bin bool by co const count domain equal first float_div float_inv float_max float_min float_prod float_sort float_sum floats get has hasnt head if iff imply int_div int_inv int_max int_min int_prod int_sort int_sum ints is isnt ker kernel last left nand nat nat_max nat_min nat_prod nat_sort nat_sum nats nif nor not nth nth1 null once or pass plus post pre prod put rep rev right ring skip some star stick sum swap tail text_sort xnor xor

In [12]:
sample.data (defs:Basic) : 2

(:is sum) (:is prod) (:domain nif) (:if left) (:put null) (:star null) (:left domain) (:pass sum) (:has arg) (:get get) (:pass star) (:if nif) (:has star) (:left is) (:plus pass) (:put domain) (:bin bin) (:put bin) (:is get) (:right star) (:get put) (:put hasnt) (:prod isnt) (:plus is) (:sum bin) (:isnt get) (:arg) (:nif plus) (:domain has) (:left has) (:right hasnt) (:const plus) (:get null) (:put isnt) (:put get) (:get nif) (:left prod) (:nif star) (:domain prod) (:is has) (:domain) (:star has) (:put put) (:left bin) (:if) (:null domain) (:domain plus) (:const if) (:isnt pass) (:nif arg) (:left sum) (:const const) (:domain put) (:arg isnt) (:if has) (:plus isnt) (:if if) (:hasnt left) (:if const) (:if right) (:right) (:plus const) (:nif) (:put arg) (:has sum) (:prod prod) (:sum left) (:isnt) (:sum is) (:left get) (:put prod) (:const pass) (:right has) (:get plus) (:domain get) (:is arg) (:null) (:arg get) (:domain is) (:prod const) (:has has) (:left arg) (:left right) (:bin nif) (:pa

In [13]:
count : defs : Sequence

15

In [14]:
sample.data (defs:Sequence) : 2

(:rev once) (:head count) (:count rev) (:once rev) (:by swap) (:once nth1) (:nth1 nth1) (:first head) (:skip by) (:nth once) (:first last) (:by last) (:head head) (:nth1 post) (:tail once) (:nth1 tail) (:swap first) (:by nth) (:head nth) (:swap nth) (:head skip) (:head once) (:swap post) (:rep pre) (:first count) (:swap last) (:once) (:post tail) (:once swap) (:nth1 rev) (:once head) (:first by) (:nth1 first) (:nth) (:swap skip) (:skip tail) (:post skip) (:nth1 once) (:post by) (:head post) (:last nth) (:last count) (:count first) (:pre once) (:nth1 pre) (:pre) (:rep head) (:last first) (:pre post) (:count post) (:tail nth) (:nth post) (:by count) (:once post) (:rep skip) (:nth skip) (:nth last) (:rep post) (:last once) (:pre tail) (:first nth1) (:first skip) (:nth count) (:first post) (:pre first) (:skip swap) (:count tail) (:post count) (:tail rep) (:rep rev) (:pre swap) (:skip skip) (:tail last) (:head first) (:tail pre) (:nth nth1) (:once skip) (:count nth1) (:count swap) (:post re

In [15]:
defs:Sequence

by count first head last nth nth1 once post pre rep rev skip swap tail

In [16]:
sample.data rev first tail last skip rep nth1 once count nth pre post : 2

(:tail post) (:count count) (:nth1 nth) (:nth count) (:pre nth) (:first) (:pre first) (:skip rep) (:post rep) (:tail count) (:tail rev) (:count post) (:pre skip) (:first last) (:last) (:skip pre) (:rep skip) (:last post) (:nth1 skip) (:count once) (:pre) (:once rev) (:nth nth1) (:tail) (:first pre) (:rev pre) (:last pre) (:pre last) (:skip first) (:tail skip) (:once nth) (:nth1) (:rep tail) (:once tail) (:nth rep) (:once count) (:post skip) (:rep nth) (:rev skip) ◎ (:post nth1) (:pre nth1) (:nth1 tail) (:rep rep) (:nth pre) (:count tail) (:nth nth) (:first nth1) (:tail nth1) (:post once) (:pre tail) (:count pre) (:skip last) (:rep first) (:skip once) (:nth rev) (:skip count) (:tail tail) (:tail last) (:pre pre) (:post post) (:once first) (:pre rev) (:once last) (:post rev) (:count last) (:first nth) (:skip post) (:post) (:nth last) (:nth skip) (:nth tail) (:rev nth1) (:first post) (:nth once) (:skip skip) (:tail first) (:rev last) (:rev first) (:nth1 post) (:nth first) (:post first) (:

In [17]:
#
#    To create a typical sample, we start with A, B and {$} for language purposes, add a few "variables" (such as x? y?) and
#    mix in the above definitions. 
#    
sample.data <A> <B> x? y? <{$}> (defs:Apply Basic) : 2 

(:{B : left} right) (:pass {B : (y:)}) (:bin {B get}) (:sum {bin : B}) (:if {apby : B}) (:apby {const B}) (:kernel get) (:right) (:is {B : ab}) (:null {kernel : B}) (:right {sum B}) (:prod {ker B}) (:null {nif B}) (:{B nif} co) (:has B) (:right ker) (:{ker B} plus) (:has const) (:if {B star}) (:(y:) {arg : B}) (:const {B : is}) (:ap {(x:) : B}) (:{B : pass} put) (:{B : plus} isnt) (:{B as} isnt) (:is {domain B}) (:has {domain : B}) (:nif {B nif}) (:ap pass) (:{B arg} prod) (:{B : bin} get) (:aq {B left}) (:{B : left} const) (:has {as : B}) (:(x:) {B : domain}) (:{sum : B} const) (:B {apby : B}) (:apby {B is}) (:{B pass} star) (:{put B} plus) (:{pass B} right) (:{B : ap} isnt) (:null put) (:is null) (:(x:) kernel) (:nif {plus : B}) (:{B : right} B) (:has {plus : B}) (:{has : B} B) (:const {B : domain}) (:{B : (y:)} kernel) (:put prod) (:left {is B}) (:{(y:) : B} is) (:kernel {isnt B}) (:has {B const}) (:{ap : B} get) (:put {B : domain}) (:{B : ab} get) (:has {B : prod}) (:has {pass : B}

In [18]:
Let Sample : sample.data <A> <B> x? y? <{$}>  : 2 



In [19]:
#
#   Sample? is then a collection of data that can be used, for instance, to test for theorems or search for
#   algebraic structures of interest, as shown in study tutorials. 
#
count : Sample? 

138

For larger-scale searching, one would typically increase the width beyond 2, search in parallel and, perhaps, add more variables.

This can be useful for theorem testing and for finding algebraic structures of interest, as illustrated in the theorem tutorial and in various study notebooks.