## Sampling the space of pure data 



One easy way to produce sample data is with **rep**, which repeats its arguments once for each input atom...

In [1]:
rep a : 1 2 3 4 
rep a b : 1 2 3 4 
rep (:) : 1 2 3 4 

a a a a
a b a b a b a b
◎ ◎ ◎ ◎
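
For readers more familiar with Python, the behavior of **rep** shown above can be modeled roughly as follows. This is only an illustrative sketch: the function `rep` below is a hypothetical Python stand-in that treats data as plain lists, not the actual Coda implementation.

```python
def rep(args, inputs):
    """Hypothetical model of Coda's rep: emit the whole argument
    sequence once for every atom of the input."""
    out = []
    for _ in inputs:          # the input values themselves are ignored
        out.extend(args)      # only their count matters
    return out


print(rep(["a"], [1, 2, 3, 4]))       # ['a', 'a', 'a', 'a']
print(rep(["a", "b"], [1, 2, 3, 4]))  # ['a', 'b', 'a', 'b', 'a', 'b', 'a', 'b']
```

The model makes one point explicit: the output length is the number of arguments times the number of input atoms.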

In most sampling situations, we want a specified collection d1, d2, d3, ..., where each of d1, d2, d3 is data.  Simply producing d1 d2 d3 does not work, because concatenation merges the items and loses the boundaries between them.  An easy solution is instead to produce the sequence

* (:d1) (:d2) (:d3)

where each data in the collection has its own container.  We'll do a bit of this by hand...

In [2]:
(put : x y z ) (put : a b c )

(:x y z) (:a b c)
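
The reason for the containers can be seen in a small Python analogy. It is a sketch under the assumption that data behaves like a flat list and a container like a tuple; the names `d1`, `d2`, `flat` and `packaged` are illustrative only.

```python
d1 = ["x", "y", "z"]
d2 = ["a", "b", "c"]

# Plain concatenation: the boundary between d1 and d2 is lost.
flat = d1 + d2

# One container per item: the collection keeps its structure.
packaged = [tuple(d1), tuple(d2)]

print(flat)           # ['x', 'y', 'z', 'a', 'b', 'c']
print(len(packaged))  # 2
```

From `flat` alone there is no way to tell where one item ends and the next begins, whereas `packaged` still holds exactly two items.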

In [4]:
Def repn : {rep A : first B : nat : 0} 



In [5]:
repn a : 5 

a a a a a

In [6]:
eval : repn a : 5

a a a a a

In [7]:
ap {put : repn a:B} : first 5 : nat : 0 

◎ (:a) (:a a) (:a a a) (:a a a a)
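
The same construction can be mirrored in Python to make the data flow explicit. This is a hedged sketch: `nat`, `first`, `rep` and `repn` are hypothetical Python stand-ins named after the Coda operators, and each container is modeled as a tuple.

```python
from itertools import count, islice


def nat(start):
    """Natural numbers from start, as a lazy stream."""
    return count(start)


def first(n, seq):
    """The first n items of a sequence."""
    return list(islice(seq, n))


def rep(args, inputs):
    """Repeat the argument sequence once per input atom."""
    return [a for _ in inputs for a in args]


def repn(args, n):
    """Model of 'repn A : B' as 'rep A : first B : nat : 0'."""
    return rep(args, first(n, nat(0)))


# Model of: ap {put : repn a:B} : first 5 : nat : 0
samples = [tuple(repn(["a"], n)) for n in first(5, nat(0))]
print(samples)
# [(), ('a',), ('a', 'a'), ('a', 'a', 'a'), ('a', 'a', 'a', 'a')]
```

Each element of `samples` is a separate container, so the empty case and the four non-empty cases remain distinguishable.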

For convenience, **sample.odd** and **sample.even** produce odd and even sequences of the simplest atom (:) ("Hydrogen"), packaged as above...

In [8]:
sample.odd : 4 
sample.even : 4

(sample.odd:({ 4}:))
(sample.even:({ 4}:))

The **sample.pure** operator lets one sample the whole space of pure data via 

* sample.pure : width depth 

where **width** is the maximum data length and **depth** is the maximum depth.  For example...

In [None]:
#
#  sample.pure : <width> <depth> 
#
sample.pure : 2 2 

The direct approach is rarely useful because the universe of pure data grows extremely rapidly with **width** and **depth**, so much so that

* sample.pure : 2 3 

is already challenging for a laptop.  

### Size of sample.pure versus width and depth:

|         | width 1 |    width 2 | width 3 | width 4 |    width 5 |
|---------|--------:|-----------:|--------:|--------:|-----------:|
| depth 1 |       2 |          3 |       4 |       5 |          6 |
| depth 2 |       5 |         91 |   4,369 | 406,901 | 62,193,781 |
| depth 3 |      26 | 68,583,243 |       ? |       ? |          ? |
| depth 4 |     677 |          ? |       ? |       ? |          ? |
| depth 5 | 458,330 |          ? |       ? |       ? |          ? |
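
The known entries in this table are consistent with a simple recurrence. If N data exist at one depth, then a coda (a pair of such data) can be chosen in N² ways, and a data of the next depth is a sequence of 0 to **width** such codas. Note that this recurrence is inferred from the numbers in the table, not taken from the **sample.pure** implementation, and the function name `pure_count` is hypothetical.

```python
def pure_count(width: int, depth: int) -> int:
    """Count pure data under the assumed recurrence
    N(w, 0) = 1,  N(w, d+1) = sum_{k=0..w} (N(w, d)**2)**k."""
    n = 1  # depth 0: only the empty data
    for _ in range(depth):
        codas = n * n  # a coda pairs two data of the previous depth
        n = sum(codas ** k for k in range(width + 1))
    return n


print(pure_count(2, 2))  # 91
print(pure_count(1, 5))  # 458330
print(pure_count(2, 3))  # 68583243
```

Under this assumption the count is roughly squared and then raised to the power **width** at each new depth, which is why the table fills in so slowly.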

### Practical searching 

Although **sample.pure** samples all pure data and, therefore, all mathematical objects, these samples are typically far too large to be useful in practice.  The operator **sample.data** is much more practical: it takes a sequence of codas and a width.

* sample.data <...sequence of codas..> : width 

A few examples will illustrate. 

In [None]:
#
#   Just one argument coda "a" and width <= 5 gives 
#
sample.data a : 5 

In [None]:
#
#   Note that the sample is delivered, as usual, as (:data1) (:data2) ... 
#
#   If you provide two codas, a and b, all permutations are included. 
#
sample.data a b : 5
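
As a rough mental model, the enumeration can be sketched in Python. This assumes that "all permutations" means every ordered sequence, with repetition, of length 0 up to the width, drawn from the given codas; that reading is an assumption, and `sample_data` is a hypothetical stand-in rather than the real operator.

```python
from itertools import product


def sample_data(codas, width):
    """Yield every sequence of length 0..width over the given codas,
    each as its own tuple (the analogue of one container)."""
    for length in range(width + 1):
        for seq in product(codas, repeat=length):
            yield seq


samples = list(sample_data(["a", "b"], 2))
print(samples)
# [(), ('a',), ('b',), ('a', 'a'), ('a', 'b'), ('b', 'a'), ('b', 'b')]
```

Under this reading, k codas at width w give 1 + k + k² + ... + k^w samples, which is far smaller than the full space of pure data but still grows quickly.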

In [None]:
#
#   Including the coda <{$}> generates source code intermixed with permutations of the other codas
#   that you prescribe. 
#
sample.data <A> <B> <{$}> a : 3

In [None]:
#
#   One can add some chosen number of "variables" 
#
sample.data <A> <B> <{$}> x? y? (x:) : 2 

In [None]:
#
#   To meaningfully search, one should add domains representing established definitions.  To do this, first see 
#   all currently available definitions like so...
#
defs:

In [None]:
#
#    Each definition is in a "module" which is either a python file (.py) or a coda file (.co). 
#
module : defs:

In [None]:
#
#    Typically one would want some but not all definitions for a search.  For instance, one typically doesn't want to search 
#    over IO operations or Sample operations (to avoid recursion) or over help system operations.  A typical choice is...
#
defs : Apply Basic Logic Number Sequence  

In [None]:
#
#    To create a typical sample, we start with <A>, <B> and <{$}> for language purposes and mix in the above definitions. 
#    A few "variables" (x? y? z?) can be added as well, as in the earlier cell. 
#    
let sample : sample.data <A> <B> <{$}> (defs:Apply Basic Logic Number Sequence) : 2 
sample?

In [None]:
#
#   sample? is then a collection of data that can be used, for instance, to test for theorems or search for 
#   algebraic structures of interest, as shown in study tutorials. 
#
count : sample? 
undefined : sample?

For larger-scale searching, one would typically increase the width beyond 2, search in parallel and perhaps add more variables.

This can be useful for theorem testing and for searching for algebraic structures of interest as illustrated in the theorem tutorial and in various study notebooks. 

In [None]:
def idempotent : { (B:x?)=(B:B:x?) } 
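
The property being defined here, applying an operation twice gives the same result as applying it once, can be checked over a sample in the following way. This is a minimal Python sketch: `is_idempotent`, `counterexamples` and the two example operations are hypothetical, and data is modeled as tuples.

```python
def is_idempotent(f, samples):
    """True if f(x) == f(f(x)) for every x in the sample."""
    return all(f(x) == f(f(x)) for x in samples)


def counterexamples(f, samples):
    """Sample items on which f fails to be idempotent."""
    return [x for x in samples if f(x) != f(f(x))]


samples = [(), ("a",), ("a", "b"), ("b", "a", "a")]


def first1(t):
    """Keep only the first item: idempotent."""
    return t[:1]


def rev(t):
    """Reverse the sequence: not idempotent in general."""
    return t[::-1]


print(is_idempotent(first1, samples))  # True
print(counterexamples(rev, samples))   # [('a', 'b'), ('b', 'a', 'a')]
```

Passing on a finite sample is only evidence, not a proof, whereas a single counterexample settles the question.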

In [None]:
let 1 : ap put : defs : Apply Basic Logic Number Sequence

In [None]:
let atoms : sample.atom : 10 

In [None]:
atoms?

In [None]:
1?

In [None]:
count : 1?

In [None]:
idempotent : bool

In [None]:
ap {put bin : idempotent : right : B} : nth 2 : 1?

In [None]:
nth 60 : 1? 

In [None]:
ap {put bin : idempotent : right : B} : nth 64 : 1?

In [None]:
nth 64 : 1?

In [None]:
step 30 : idempotent : while 

In [None]:
getDefaultDepth:2