## Sampling the space of pure data 



One easy way to produce sample data is with **rep**, which repeats its arguments once for each input atom...

In [1]:
rep a : 1 2 3 4 
rep a b : 1 2 3 4 
rep (:) : 1 2 3 4 

a a a a
a b a b a b a b
◎ ◎ ◎ ◎
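As a rough illustration of the behavior shown above, here is a small Python model of **rep**. It is only a sketch: the function name `rep` and the use of Python lists to stand in for coda sequences are assumptions made for this example, not coda's implementation.

```python
def rep(args, inputs):
    """Model of rep: emit the whole argument sequence once per input atom.

    `args` and `inputs` are plain Python lists standing in for coda
    sequences (an assumption of this sketch).
    """
    out = []
    for _ in inputs:
        out.extend(args)  # one copy of the arguments per input atom
    return out


print(rep(["a"], [1, 2, 3, 4]))
print(rep(["a", "b"], [1, 2, 3, 4]))
```

Only the number of input atoms matters here, not their values, which matches the outputs shown in the cell above.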

In most sampling situations, we want a specified collection d1, d2, d3, ..., where each of d1, d2, d3 is itself data. Writing d1 d2 d3 directly does not work, because concatenation merges the items and loses the boundaries between them. Instead, an easy solution is to produce the sequence

* (:d1) (:d2) (:d3)

where each data item in the collection has its own container.  We'll do a bit of this by hand...

In [2]:
(put : x y z ) (put : a b c )

(:x y z) (:a b c)
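The reason for the containers can be sketched in Python, where list concatenation plays the role of coda concatenation and a nested list plays the role of `(:d1) (:d2)`. The names `concat` and `wrap_each` are hypothetical helpers invented for this illustration.

```python
def concat(*items):
    """Concatenate the items: boundaries between them are lost."""
    flat = []
    for item in items:
        flat.extend(item)
    return flat


def wrap_each(*items):
    """Give each item its own container, as in (:d1) (:d2) ..."""
    return [list(item) for item in items]


d1 = ["x", "y", "z"]
d2 = ["a", "b", "c"]

flat = concat(d1, d2)        # six atoms, no record of where d1 ended
wrapped = wrap_each(d1, d2)  # two containers, each item recoverable

print(flat)
print(wrapped)
```

From `flat` there is no way to tell whether the collection was `x y z` and `a b c`, or `x y` and `z a b c`; `wrapped` keeps that information.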

In [3]:
Def repn : {rep A : first B : nat : 0} 



In [4]:
repn a : 5 

a a a a a

In [5]:
ap {put : repn a:B} : first 5 : nat : 0 

◎ (:a) (:a a) (:a a a) (:a a a a)

For convenience, **sample.odd** and **sample.even** produce odd-length and even-length sequences of the simplest atom (:) ("Hydrogen"), packaged as above...

In [6]:
sample.odd : 4 
sample.even : 4

(:◎) (:◎ ◎ ◎) (:◎ ◎ ◎ ◎ ◎) (:◎ ◎ ◎ ◎ ◎ ◎ ◎)
◎ (:◎ ◎) (:◎ ◎ ◎ ◎) (:◎ ◎ ◎ ◎ ◎ ◎)

The **sample.pure** operator lets one sample the whole space of pure data via 

* sample.pure : width depth 

where **width** is the maximum data length and **depth** is the maximum depth.  For example...

In [7]:
#
#  sample.pure : <width> <depth> 
#
sample.pure : 2 2 

◎ (:◎) (:(:◎)) (:(:◎ ◎)) (:𝟬) (:𝝞) (:◎◎) (:) (:◎) (:◎◎) (:◎ ◎) (:◎ (:◎)) (:◎ (:◎ ◎)) (:◎ 𝟬) (:◎ 𝝞) (:◎ ◎◎) (:◎ ) (:◎ ◎) (:◎ ◎◎) (:(:◎) ◎) (:(:◎) (:◎)) (:(:◎) (:◎ ◎)) (:(:◎) 𝟬) (:(:◎) 𝝞) (:(:◎) ◎◎) (:(:◎) ) (:(:◎) ◎) (:(:◎) ◎◎) (:(:◎ ◎) ◎) (:(:◎ ◎) (:◎)) (:(:◎ ◎) (:◎ ◎)) (:(:◎ ◎) 𝟬) (:(:◎ ◎) 𝝞) (:(:◎ ◎) ◎◎) (:(:◎ ◎) ) (:(:◎ ◎) ◎) (:(:◎ ◎) ◎◎) (:𝟬 ◎) (:𝟬 (:◎)) (:𝟬 (:◎ ◎)) (:𝟬 𝟬) (:𝟬 𝝞) (:𝟬 ◎◎) (:𝟬 ) (:𝟬 ◎) (:𝟬 ◎◎) (:𝝞 ◎) (:𝝞 (:◎)) (:𝝞 (:◎ ◎)) (:𝝞 𝟬) (:𝝞 𝝞) (:𝝞 ◎◎) (:𝝞 ) (:𝝞 ◎) (:𝝞 ◎◎) (:◎◎ ◎) (:◎◎ (:◎)) (:◎◎ (:◎ ◎)) (:◎◎ 𝟬) (:◎◎ 𝝞) (:◎◎ ◎◎) (:◎◎ ) (:◎◎ ◎) (:◎◎ ◎◎) (: ◎) (: (:◎)) (: (:◎ ◎)) (: 𝟬) (: 𝝞) (: ◎◎) (: ) (: ◎) (: ◎◎) (:◎ ◎) (:◎ (:◎)) (:◎ (:◎ ◎)) (:◎ 𝟬) (:◎ 𝝞) (:◎ ◎◎) (:◎ ) (:◎ ◎) (:◎ ◎◎) (:◎◎ ◎) (:◎◎ (:◎)) (:◎◎ (:◎ ◎)) (:◎◎ 𝟬) (:◎◎ 𝝞) (:◎◎ ◎◎) (:◎◎ ) (:◎◎ ◎) (:◎◎ ◎◎)

The direct approach is rarely useful because the universe of pure data grows extremely rapidly with **width** and **depth**, so much so that

* sample.pure : 2 3 

is already challenging for a laptop: according to the table below, it contains 68,583,243 items.

### Size of sample.pure versus width and depth

|         | width 1 |    width 2 | width 3 | width 4 |    width 5 |
|---------|--------:|-----------:|--------:|--------:|-----------:|
| depth 1 |       2 |          3 |       4 |       5 |          6 |
| depth 2 |       5 |         91 |   4,369 | 406,901 | 62,193,781 |
| depth 3 |      26 | 68,583,243 |       ? |       ? |          ? |
| depth 4 |     677 |          ? |       ? |       ? |          ? |
| depth 5 | 458,330 |          ? |       ? |       ? |          ? |
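The filled-in cells of the table happen to fit a simple recurrence, sketched below in Python. This recurrence is inferred purely from the numbers in the table and is an assumption, not something taken from coda's implementation: the depth-1 count is `width + 1`, and each further level treats every ordered pair of previous-level data as one atom, then counts sequences of 0 to `width` such atoms. The function name `pure_count` is hypothetical.

```python
def pure_count(width: int, depth: int) -> int:
    """Size of the sample under the recurrence inferred from the table.

    Assumed recurrence (fits every filled-in cell, but is not confirmed
    against the coda source):
        N(w, 1)     = w + 1
        N(w, d + 1) = sum over k = 0..w of (N(w, d) ** 2) ** k
    """
    if width < 0 or depth < 1:
        raise ValueError("width must be >= 0 and depth must be >= 1")
    n = width + 1
    for _ in range(depth - 1):
        atoms = n * n  # assumed: one atom per ordered pair of previous data
        n = sum(atoms ** k for k in range(width + 1))
    return n


# Reproduce the known cells of the table.
for d in range(1, 6):
    print(d, pure_count(1, d))
print(pure_count(2, 2), pure_count(2, 3))
```

If the recurrence is right, the cells marked `?` are far beyond enumeration; `pure_count(3, 3)`, for example, is on the order of 10^21.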

### Practical searching 

Although **sample.pure** samples all pure data and, therefore, all mathematical objects, these samples are typically far too large to be practically helpful.  The operator **sample.data** is much more practical.  It takes a sequence of codas and a width:

* sample.data <...sequence of codas..> : width 

A few examples will illustrate. 

In [8]:
#
#   With a single argument coda "a" and width 5, the sample contains the sequences of length 0 through 4
#
sample.data a : 5 

◎ (:a a) (:a a a) (:a) (:a a a a)

In [9]:
#
#   Note that the sample is delivered, as usual, as (:data1) (:data2) ... 
#
#   If you provide two codas, a and b, all sequences built from a and b (up to the width) are included. 
#
sample.data a b : 5

(:a a a b) (:b a a) ◎ (:a) (:a b b b) (:a a) (:a a b a) (:b b b a) (:b a b b) (:b b a) (:a b a) (:a a a a) (:b) (:a b b a) (:b b a b) (:a b a b) (:b a a b) (:b b b b) (:a b b) (:b a a a) (:a b a a) (:b a b a) (:a a a) (:a a b) (:b a) (:a b) (:b b a a) (:b a b) (:b b) (:b b b) (:a a b b)
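The two samples above can be modeled in Python with `itertools.product`. This is a sketch under assumptions: the function name `sample_sequences` is hypothetical, tuples stand in for the `(:...)` containers, and the length bound (0 up to `width - 1`) is read off the outputs shown above rather than taken from coda's documentation.

```python
from itertools import product


def sample_sequences(alphabet, width):
    """All sequences over `alphabet` with length 0 .. width - 1.

    The bound `width - 1` is inferred from the notebook outputs above
    (an assumption of this sketch).
    """
    out = []
    for length in range(width):
        out.extend(product(alphabet, repeat=length))
    return out


one = sample_sequences(["a"], 5)
two = sample_sequences(["a", "b"], 5)

print(len(one), len(two))
```

Unlike the notebook output, this sketch returns the sequences ordered by length; the coda sample arrives in no particular order. The count grows geometrically with the number of codas, which is why the width is kept small in the examples.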

In [10]:
#
#   Including the coda <{$}> generates source code intermixed with sequences of the other codas
#   that you prescribe. 
#
sample.data <A> <B> <{$}> a : 3

(:A A) (:A) (:B a) (:B A) (:a) (:{B : A} A) (:A B) (:a {a : B}) (:{A : B}) (:a A) (:a a) (:B {B A}) (:{a : B}) (:{B : a}) (:a B) (:{B : a} B) (:B {A : B}) (:B {B : a}) (:{B A} B) (:A {a B}) (:A {A : B}) (:B {B a}) (:{B A} A) (:{B} B) (:A {B A}) (:a {B a}) (:{A B} a) (:{B : A} B) (:A a) (:{a B} a) (:{A B} B) (:{a : B} B) (:A {a : B}) (:A {B : A}) (:{B a}) (:a {B : a}) (:{A : B} B) (:{A B}) (:{A : B} A) (:a {B}) (:{a B}) (:a {B A}) (:B {B}) (:a {A : B}) (:{B A}) (:{A : B} a) (:a {B : A}) (:{a B} A) (:a {A B}) (:{a : B} A) (:B) (:{B} a) (:{a : B} a) (:A {B}) (:{B a} A) (:{B : A}) (:B {a B}) (:{B : a} a) (:a {a B}) (:B {B : A}) (:A {B a}) (:A {A B}) (:{A B} A) (:{B}) (:{B A} a) (:{B : a} A) (:B {A B}) (:{B : A} a) (:A {B : a}) ◎ (:{B a} a) (:{B a} B) (:{a B} B) (:B B) (:B {a : B}) (:{B} A)

In [11]:
#
#   One can add some chosen number of "variables" 
#
sample.data <A> <B> <{$}> x? y? (x:) : 2 

(:{B : (x:)}) ◎ (:{(y:) B}) (:{(x:) B}) (:{(x:) : B}) (:(x:({}:))) (:{A : B}) (:{B : (x:({}:))}) (:{B : (y:)}) (:{(y:) : B}) (:(x:)) (:{B : A}) (:{(x:({}:)) B}) (:B) (:A) (:{B A}) (:{B (y:)}) (:{B (x:({}:))}) (:{B (x:)}) (:{A B}) (:{(x:({}:)) : B}) (:{B}) (:(y:))

In [12]:
#
#   To meaningfully search, one should add domains representing established definitions.  To do this, first see 
#   all currently available definitions like so...
#
defs:

◎ 𝟬 𝝞 bool not = equal or and nor xor nand xnor iff imply some ap aq by ax pass null bin put get domain left right if nif * collect use1 def undefined invariant nat code_sort ints int_sum int_prod int_sort int_min int_max int_inv int_div nats floats float_sum float_prod float_sort float_min float_max float_inv float_div eval1 with readpath dir help sources demo info module defs language log logs log+ log- rev first tail last rep nth1 once count localdef coda source home codes digits letters printable alphabet allcodes pure wrap startswith endswith join split up1 down1 sample.even sample.odd sample.atom sample.pure sample.window sample.data nth pre post let in equiv use Let Def eval down up while kernel ker hasnt has isnt nchar is repn

In [13]:
#
#    Each definition is in a "module" which is either a Python file (.py) or a coda file (.co). 
#
module : defs:

   Logic Logic Logic Logic Logic Logic Logic Logic Logic Logic Logic Logic Logic Apply Apply Apply Apply Basic Basic Basic Basic Basic Basic Basic Basic Basic Basic Basic Collect Define Define Define Define Number Number Number Number Number Number Number Number Number Number Number Number Number Number Number Number Number Number Number Evaluate Evaluate IO IO Help Help Help Help Help Help Language Log Log Log Log Sequence Sequence Sequence Sequence Sequence Sequence Sequence Sequence Source Source Source Source Text Text Text Text Text Text Text Text Text Text Text Text Path Path Sample Sample Sample Sample Sample Sample Sequence Sequence Sequence  Logic Collect Define Define  Evaluate Path Path Apply Apply Apply Basic Basic Basic Text Basic 

In [14]:
once : module : defs:

IO Define Number Apply Help  Logic Log Evaluate Text Path Sequence Collect Source Basic Sample Language

In [15]:
#
#    Typically one would want some but not all definitions for a search.  For instance, one typically doesn't want to search 
#    over IO operations or Sample operations (to avoid recursion) or over help system operations.  A typical choice is...
#
defs : Apply Basic Logic Number Sequence  

bool not = equal or and nor xor nand xnor iff imply some ap aq by ax pass null bin put get domain left right if nif * nat code_sort ints int_sum int_prod int_sort int_min int_max int_inv int_div nats floats float_sum float_prod float_sort float_min float_max float_inv float_div rev first tail last rep nth1 once count nth pre post in while kernel ker hasnt has isnt is

In [16]:
sample.data (defs:Basic) : 3

(:put right) (:isnt has) (:isnt nif) (:put null) (:nif) (:* domain) (:nif get) (:if has) (:* *) (:right get) (:right right) (:is is) (:left null) (:has *) (:domain has) (:nif nif) (:right isnt) (:nif isnt) (:if put) (:is put) (:if bin) (:if domain) (:put is) (:is if) (:hasnt right) (:is get) (:hasnt put) (:pass pass) (:* put) (:pass domain) (:get bin) (:bin put) (:bin bin) (:has) (:pass isnt) (:if *) (:right put) (:put left) (:hasnt domain) (:* pass) (:domain put) (:get hasnt) (:bin) (:isnt *) (:null left) (:left isnt) (:* bin) (:hasnt left) (:if right) (:domain get) (:pass right) (:right bin) (:isnt pass) (:if pass) (:* left) (:hasnt bin) (:domain right) (:domain bin) (:domain domain) (:get is) (:get nif) (:put hasnt) (:bin null) (:left bin) (:get *) (:right domain) (:domain) (:nif null) (:isnt null) (:pass is) (:null isnt) (:right *) (:pass null) (:* right) (:has is) (:left has) (:hasnt pass) (:domain is) (:is) (:right left) (:null right) (:hasnt has) (:* is) (:left if) (:is bin) (:n

In [17]:
#
#    To create a typical sample, we start with A, B and {$} for language purposes, add a few "variables" (here x? y?) and 
#    mix in the above definitions. 
#    
sample.data <A> <B> x? y? <{$}> (defs:Apply Basic) : 2 

(:{B : has}) (:{B : bin}) (:{put : B}) (:{B : get}) (:kernel) (:{B : A}) (:has) (:(y:)) (:{nif B}) (:{B : nif}) (:{B is}) (:{by B}) (:{domain : B}) (:{B null}) (:{B : ap}) (:is) (:A) (:{B : kernel}) (:*) (:null) (:{ax : B}) (:bin) (:{put B}) (:{B : domain}) (:{by : B}) (:isnt) (:left) (:{B put}) (:{aq B}) (:{(y:) : B}) (:{B domain}) (:{kernel : B}) (:ap) (:{has : B}) (:{get : B}) (:{right : B}) (:{B : by}) (:{B : (y:)}) (:{nif : B}) (:{right B}) (:{B : null}) (:{B right}) (:{B : if}) (:right) (:{B : isnt}) (:{has B}) (:B) (:{B if}) (:{B (y:)}) (:{A : B}) (:{if : B}) (:if) (:{B while}) (:{B *}) (:{B : is}) (:{bin : B}) (:{B}) (:{B A}) (:{B : (x:)}) (:{B : left}) (:{B (x:)}) (:{B get}) (:{while : B}) (:{ker B}) (:{B bin}) (:{B ker}) (:{B ap}) (:{B left}) (:{isnt B}) (:pass) (:{A B}) (:{aq : B}) (:{B by}) (:{B : *}) (:domain) (:{B : aq}) (:while) (:ker) (:{if B}) (:get) (:{B isnt}) (:{* B}) (:{B : right}) (:{is B}) (:{B : hasnt}) (:{B : ker}) (:{* : B}) (:{B : while}) (:{B : ax}) ◎ (:(x:)

In [18]:
Let Sample : sample.data <A> <B> x? y? <{$}>  : 2 



In [19]:
#
#   Sample? is then a collection of data that can be used, for instance, to test for theorems or search for 
#   algebraic structures of interest, as shown in study tutorials. 
#
count : Sample? 
undefined : Sample?

18
(x:) (y:)

For larger-scale searching, one would typically increase the width beyond 2, search in parallel, and perhaps add more variables.

This can be useful for theorem testing and for searching for algebraic structures of interest as illustrated in the theorem tutorial and in various study notebooks. 

In [20]:
Def Idempotent : bool * ap {(A:(get:B))=(A:A:(get:B))}



In [21]:
Idempotent pass : atoms? (:a b c)

(bool pass:(ap {(A:(get:B))=(A:A:(get:B))} pass:(atoms:)) (ap {(A:(get:B))=(A:A:(get:B))} pass:))

In [22]:
Idempotent rev : atoms? (:a b c)

◎
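The property tested by `Idempotent` compares applying A once with applying it twice to each item of a sample. A minimal Python model of that check is sketched below; `is_idempotent` and the three stand-in functions are hypothetical names for this illustration and are not coda definitions.

```python
def is_idempotent(f, sample):
    """True when f(f(x)) == f(x) for every x in the sample."""
    return all(f(f(x)) == f(x) for x in sample)


def identity(x):
    return x


def reverse(x):
    return tuple(reversed(x))


def sort(x):
    return tuple(sorted(x))


# Tuples stand in for the wrapped sample items (:...).
sample = [(), ("a",), ("a", "b"), ("b", "a", "a"), ("a", "b", "c")]

for f in (identity, reverse, sort):
    print(f.__name__, is_idempotent(f, sample))
```

As with any sample-based test, a failure on the sample disproves idempotence, while a pass only shows that no counterexample was found in that sample.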

In [23]:
Let 1 : (sample.data a b : 4) (:(:a b)) (bin a:b) (sample.atom:3)



In [24]:
1? 

(:b a) (:b b b) ◎ (:a) (:b b) (:b a b) (:a b a) (:b) (:a b b) (:a a) (:b a a) (:b b a) (:a a a) (:a a b) (:a b) (:(:a b)) (bin a:b) ◎ (:◎) (:◎ ◎)

In [25]:
defs : Basic 
defs : Apply

pass null bin put get domain left right if nif * hasnt has isnt is
ap aq by ax while kernel ker

In [26]:
pre 1 pass | : Idempotent pass : 1? 
pre 2 null | : Idempotent null : 1?
pre 3 bin  | : Idempotent bin : 1?
pre 4 put  | : Idempotent put : 1?
pre 5 get  | : Idempotent get : 1? 
pre 6 domain| : Idempotent domain : 1?
pre 7 left | : Idempotent left : 1? 
pre 8 right | : Idempotent right : 1? 
pre 9 if | : Idempotent if : 1?
pre 10 nif | : Idempotent nif : 1? 
pre 11 star | : Idempotent * : 1? 
pre 12 hasnt | : Idempotent hasnt : 1? 
pre 13 has | : Idempotent has : 1? 
pre 14 isnt | : Idempotent isnt : 1? 
pre 15 is | : Idempotent is : 1? 

1 pass |
2 null |
3 bin | ◎
4 put | ◎
5 get | ◎
6 domain| ◎
7 left |
8 right | ◎
9 if |
10 nif |
11 star | ◎
12 hasnt |
13 has |
14 isnt |
15 is |

In [30]:
#
#   Doing the same thing in one line, here for the first five Basic definitions 
#
ap {put bin B : Idempotent B : 1?} : first 5 : defs : Basic 

(bin pass:) (bin null:) (bin bin:◎) (bin put:◎) (bin get:◎)