## Sampling the space of pure data 



One easy way to produce sample data is with **ap co**, which repeats its arguments once for each input atom...

In [1]:
ap co a : 1 2 3 4 
ap co a b : 1 2 3 4 
ap co (:) : 1 2 3 4 

(co a:1) (co a:2) (co a:3) (co a:4)
(co a b:1) (co a b:2) (co a b:3) (co a b:4)
(co ◎:1) (co ◎:2) (co ◎:3) (co ◎:4)

In most sampling situations, we want a specified collection d1, d2, d3, ..., where each of d1, d2, d3 is itself data.  Producing d1 d2 d3 directly does not work, because concatenation erases the boundaries between the items.  An easy solution is instead to produce the sequence

* (:d1) (:d2) (:d3)

where each data item in the collection has its own container.  We'll do a bit of this by hand...

In [2]:
(put : x y z ) (put : a b c )

(:x y z) (:a b c)
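The loss of boundaries under concatenation has a close analogue in ordinary list handling. The sketch below is a Python analogy only, not Coda code: flat list concatenation plays the role of d1 d2 d3, and wrapping each item in its own list plays the role of (:d1) (:d2) (:d3). The names `flat` and `packaged` are illustrative.

```python
# Python analogy (not Coda): lists stand in for data, nesting for (:...) containers.
d1 = ["x", "y", "z"]
d2 = ["a", "b", "c"]

# Concatenation, like writing d1 d2: the boundary between the items is lost.
flat = d1 + d2

# One container per item, like (:d1) (:d2): the items remain distinguishable.
packaged = [d1, d2]

print(flat)      # ['x', 'y', 'z', 'a', 'b', 'c']
print(packaged)  # [['x', 'y', 'z'], ['a', 'b', 'c']]
```

From `flat` alone there is no way to tell where the first item ended, whereas `packaged` can be iterated item by item.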

For convenience, **sample.odd** and **sample.even** produce odd-length and even-length sequences of the simplest atom (:) ("Hydrogen"), packaged as above...

In [6]:
sample.odd : 4 
sample.even : 4

(:◎) (:◎ ◎ ◎) (:◎ ◎ ◎ ◎ ◎) (:◎ ◎ ◎ ◎ ◎ ◎ ◎)
◎ (:◎ ◎) (:◎ ◎ ◎ ◎) (:◎ ◎ ◎ ◎ ◎ ◎)
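The pattern in this output can be modelled in a few lines of Python. This is a sketch inferred from the output shown, not the Coda source: the helper names `odd_sample` and `even_sample` are hypothetical, and the string "◎" simply stands in for the atom.

```python
# Model of the outputs shown above (an assumption, not the Coda implementation):
# each sample item is a list of identical atoms, here the string "◎".
ATOM = "◎"

def odd_sample(n):
    """Return n containers holding 1, 3, 5, ... atoms."""
    return [[ATOM] * (2 * k + 1) for k in range(n)]

def even_sample(n):
    """Return n containers holding 0, 2, 4, ... atoms."""
    return [[ATOM] * (2 * k) for k in range(n)]

print([len(item) for item in odd_sample(4)])   # [1, 3, 5, 7]
print([len(item) for item in even_sample(4)])  # [0, 2, 4, 6]
```

The empty first item of the even sample corresponds to the bare ◎ at the start of the second output line.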

The **sample.pure** operator lets one sample the whole space of pure data via 

* sample.pure : width depth 

where **width** is the maximum data length and **depth** is the maximum depth.  For example...

In [7]:
#
#  sample.pure : <width> <depth> 
#
sample.pure : 2 2 

◎ (:◎) (:(:◎)) (:(:◎ ◎)) (:𝟬) (:𝝞) (:◎◎) (:) (:◎) (:◎◎) (:◎ ◎) (:◎ (:◎)) (:◎ (:◎ ◎)) (:◎ 𝟬) (:◎ 𝝞) (:◎ ◎◎) (:◎ ) (:◎ ◎) (:◎ ◎◎) (:(:◎) ◎) (:(:◎) (:◎)) (:(:◎) (:◎ ◎)) (:(:◎) 𝟬) (:(:◎) 𝝞) (:(:◎) ◎◎) (:(:◎) ) (:(:◎) ◎) (:(:◎) ◎◎) (:(:◎ ◎) ◎) (:(:◎ ◎) (:◎)) (:(:◎ ◎) (:◎ ◎)) (:(:◎ ◎) 𝟬) (:(:◎ ◎) 𝝞) (:(:◎ ◎) ◎◎) (:(:◎ ◎) ) (:(:◎ ◎) ◎) (:(:◎ ◎) ◎◎) (:𝟬 ◎) (:𝟬 (:◎)) (:𝟬 (:◎ ◎)) (:𝟬 𝟬) (:𝟬 𝝞) (:𝟬 ◎◎) (:𝟬 ) (:𝟬 ◎) (:𝟬 ◎◎) (:𝝞 ◎) (:𝝞 (:◎)) (:𝝞 (:◎ ◎)) (:𝝞 𝟬) (:𝝞 𝝞) (:𝝞 ◎◎) (:𝝞 ) (:𝝞 ◎) (:𝝞 ◎◎) (:◎◎ ◎) (:◎◎ (:◎)) (:◎◎ (:◎ ◎)) (:◎◎ 𝟬) (:◎◎ 𝝞) (:◎◎ ◎◎) (:◎◎ ) (:◎◎ ◎) (:◎◎ ◎◎) (: ◎) (: (:◎)) (: (:◎ ◎)) (: 𝟬) (: 𝝞) (: ◎◎) (: ) (: ◎) (: ◎◎) (:◎ ◎) (:◎ (:◎)) (:◎ (:◎ ◎)) (:◎ 𝟬) (:◎ 𝝞) (:◎ ◎◎) (:◎ ) (:◎ ◎) (:◎ ◎◎) (:◎◎ ◎) (:◎◎ (:◎)) (:◎◎ (:◎ ◎)) (:◎◎ 𝟬) (:◎◎ 𝝞) (:◎◎ ◎◎) (:◎◎ ) (:◎◎ ◎) (:◎◎ ◎◎)

The direct approach is rarely useful because the universe of pure data grows extremely rapidly with **width** and **depth**, so much so that

* sample.pure : 2 3 

is already challenging for a laptop.  

### Size of sample.pure versus width and depth:

|         | width 1 | width 2    | width 3 | width 4 | width 5    |
|---------|--------:|-----------:|--------:|--------:|-----------:|
| depth 1 |       2 |          3 |       4 |       5 |          6 |
| depth 2 |       5 |         91 |   4,369 | 406,901 | 62,193,781 |
| depth 3 |      26 | 68,583,243 |       ? |       ? |          ? |
| depth 4 |     677 |          ? |       ? |       ? |          ? |
| depth 5 | 458,330 |          ? |       ? |       ? |          ? |
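The filled-in entries of this table are numerically consistent with a simple recurrence: the depth-1 size is width + 1, and each further depth replaces the size N by the sum of (N²)^k for k from 0 to width. The Python sketch below checks that pattern against the known entries. The recurrence is inferred from the numbers alone and is an assumption, not a statement about how **sample.pure** is implemented; `pure_size` is a hypothetical helper name.

```python
def pure_size(width, depth):
    """Conjectured size of sample.pure : width depth (pattern inferred from the table)."""
    size = width + 1                      # depth-1 row of the table
    for _ in range(depth - 1):
        square = size * size              # square of the previous depth's size
        size = sum(square ** k for k in range(width + 1))
    return size

# Compare with entries that appear in the table.
print(pure_size(1, 5))  # 458330
print(pure_size(2, 2))  # 91
print(pure_size(2, 3))  # 68583243
print(pure_size(5, 2))  # 62193781
```

Under this conjectured pattern the function can also be evaluated at the `?` cells, though those values are not confirmed by the source.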

Of course, the space of pure data includes all data, meaning all mathematical/logical/computational objects.  Undecided data ("variables"), for example, already appear in sample.pure : 2 2; one way to detect them is by counting.

In [8]:
ap {count : get : B} : sample.pure : 2 2 

(count:(({get }:◎):({ B}:◎))) (count:(({get }:(:◎)):({ B}:(:◎)))) (count:(({get }:(:(:◎))):({ B}:(:(:◎))))) (count:(({get }:(:(:◎ ◎))):({ B}:(:(:◎ ◎))))) (count:(({get }:(:𝟬)):({ B}:(:𝟬)))) (count:(({get }:(:𝝞)):({ B}:(:𝝞)))) (count:(({get }:(:◎◎)):({ B}:(:◎◎)))) (count:(({get }:(:)):({ B}:(:)))) (count:(({get }:(:◎)):({ B}:(:◎)))) (count:(({get }:(:◎◎)):({ B}:(:◎◎)))) (count:(({get }:(:◎ ◎)):({ B}:(:◎ ◎)))) (count:(({get }:(:◎ (:◎))):({ B}:(:◎ (:◎))))) (count:(({get }:(:◎ (:◎ ◎))):({ B}:(:◎ (:◎ ◎))))) (count:(({get }:(:◎ 𝟬)):({ B}:(:◎ 𝟬)))) (count:(({get }:(:◎ 𝝞)):({ B}:(:◎ 𝝞)))) (count:(({get }:(:◎ ◎◎)):({ B}:(:◎ ◎◎)))) (count:(({get }:(:◎ )):({ B}:(:◎ )))) (count:(({get }:(:◎ ◎)):({ B}:(:◎ ◎)))) (count:(({get }:(:◎ ◎◎)):({ B}:(:◎ ◎◎)))) (count:(({get }:(:(:◎) ◎)):({ B}:(:(:◎) ◎)))) (count:(({get }:(:(:◎) (:◎))):({ B}:(:(:◎) (:◎))))) (count:(({get }:(:(:◎) (:◎ ◎))):({ B}:(:(:◎) (:◎ ◎))))) (count:(({get }:(:(:◎) 𝟬)):({ B}:(:(:◎) 𝟬)))) (count:(({get }:(:(:◎) 𝝞)):({ B}:(:(:◎) 𝝞)))) (cou

### Practical searching 

Although **sample.pure** samples all pure data and, therefore, all mathematical objects, these samples are typically far too large to be practically helpful.  The operator **sample.data** is much more practical.  It takes a sequence of codas and a width:

* sample.data <...sequence of codas..> : width 

A few examples will illustrate. 

In [9]:
#
#   Just one argument coda "a" and width <= 5 gives 
#
sample.data a : 5 

◎ (:a a a a a) (:a a a a) (:a a) (:a) (:a a a)

In [10]:
#
#   Note that the sample is delivered, as usual, as (:data1) (:data2) ... 
#
#   If you provide two codas, a and b, all sequences of a and b up to the given width are included. 
#
sample.data a b : 5

(:b b a a b) (:b b a a) (:b a b a) (:a b b b a) (:a a b a b) (:a a b b) (:a b b a a) (:b a) (:a) (:a a a b) (:a b a b b) (:b a a b a) (:a a a a) (:b a a a b) (:a a b a) (:b a b) (:a b b a b) (:a a a b b) (:b a b b b) (:a b a b a) (:a a b a a) (:a a b b a) (:a a a b a) (:b b) (:b a b b) (:a b b a) (:b b b b a) (:a a b) (:a a a a a) (:b b b a b) (:b b b) (:a b) (:a a a) (:a b a a) (:a b a b) (:b b b b b) (:a b a a b) (:b b a a a) (:b a b b a) (:a a b b b) (:b a a) (:b a a a) (:b) (:b a b a b) (:a b b) (:a a) (:b b b a a) (:b b b a) (:b a a b b) (:b b a b) (:b a a b) (:b a b a a) (:b a a a a) (:b b a) (:b b a b b) (:a b b b) (:a b a a a) (:a b b b b) (:a a a a b) (:b b a b a) (:a b a) (:b b b b) ◎
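For plain codas such as a and b, the two outputs above match a straightforward enumeration: every sequence of the given codas with length from 0 up to the width. The Python sketch below models that enumeration with `itertools.product`. It is a model of the observed output, not the Coda implementation; `sample_sequences` is a hypothetical name, and the ordering differs from the order Coda prints.

```python
from itertools import product

def sample_sequences(codas, width):
    """All sequences of the given codas with length 0..width (a model, not the Coda implementation)."""
    return [seq for n in range(width + 1) for seq in product(codas, repeat=n)]

one = sample_sequences(["a"], 5)
two = sample_sequences(["a", "b"], 5)

print(len(one))  # 6
print(len(two))  # 63
```

With n codas and width w the model yields 1 + n + n² + ... + n^w items, which is why even modest widths become large quickly. Samples that include language literals such as <A>, <B> and <{$}> are built differently and are not covered by this sketch.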

In [11]:
#
#   You can add language literals <A> and <B> and a special symbol <{$}> to create language expressions.
#
sample.data <A> <B> <{$}> a b (defs:Basic) : 2 

(:right {B is}) (:{const B} isnt) (:{B right} const) (:const {get : B}) (:nif {isnt : B}) (:const {plus : B}) (:{domain B} null) (:left {B : prod}) (:{const B} get) (:{B star}) (:a null) (:{B : plus} is) (:{a B} const) (:{star : B} is) (:{B : a} is) (:{B prod} arg) (:hasnt {B pass}) (:{B : if} right) (:const {a B}) (:{B : plus} arg) (:left {B : put}) (:{B : is} B) (:{left B} if) (:{B : bin} B) (:const {B : if}) (:left {B nif}) (:domain {B a}) (:{a B} has) (:domain get) (:get {B plus}) (:is get) (:A left) (:pass {B : put}) (:nif A) (:{right B} get) (:{A : B} star) (:{const B} arg) (:{B null} prod) (:const {B : star}) (:{B : plus} plus) (:right {isnt : B}) (:{b : B} b) (:sum {hasnt : B}) (:{isnt : B} has) (:{sum : B} a) (:left a) (:{B : left} null) (:if isnt) (:A const) (:{B : put} pass) (:hasnt {if B}) (:{const B} right) (:{hasnt : B} null) (:star {isnt B}) (:{B : b} domain) (:const bin) (:sum {null B}) (:{hasnt B} sum) (:is {bin B}) (:{bin : B} const) (:sum right) (:{star : B} put) (:{

In [12]:
#
#   To meaningfully search, one should add domains representing established definitions.  To do this, first see 
#   all currently available definitions like so...
#
defs:

= Def Let allcodes alphabet ap aq ar arg as assign bin bool by cases coda codes collect const count def defs demo digits dir domain down down1 endswith equal equiv eval eval1 first float_div float_inv float_max float_min float_prod float_sort float_sum floats get has hasnt help home if import in info int_div int_inv int_max int_min int_prod int_sort int_sum ints is isnt join ker kernel language last left let letters localdef log logging logs module multi nat nat_max nat_min nat_prod nat_sort nat_sum nats nchar nif not nth nth1 null once out pass pause permutation plus post pre printable prod pure put readpath rep repn rev right sample.atom sample.data sample.even sample.odd sample.pure sample.window skip some source sources split star startswith stat step sum tail term text_sort theorem up up1 use use1 with wrap ◎ 𝝞 𝟬

In [13]:
#
#    Each definition is in a "module" which is either a Python file (.py) or a coda file (.co). 
#
once : module : defs:

Logic  Define Text Apply Basic Variable Sequence Search Source Collect Help IO Path Evaluate Evaluation Number Import Language Log Time Sample

In [14]:
#
#    Typically one would want some but not all definitions for a search.  For instance, one typically doesn't want to search 
#    over IO operations or Sample operations (to avoid recursion) or over help system operations.  A typical choice is...
#
defs : Apply Basic Logic Number Sequence  

= ap aq ar arg as bin bool by const count domain equal first float_div float_inv float_max float_min float_prod float_sort float_sum floats get has hasnt if int_div int_inv int_max int_min int_prod int_sort int_sum ints is isnt ker kernel last left nat nat_max nat_min nat_prod nat_sort nat_sum nats nif not nth nth1 null once pass plus post pre prod put rep rev right skip some star sum tail text_sort

In [15]:
sample.data (defs:Basic) : 2

(:put star) (:const has) (:null left) (:put isnt) (:isnt isnt) (:domain left) (:if star) (:bin has) (:domain get) (:domain is) (:left plus) (:const const) (:left left) (:get right) (:is arg) (:null isnt) (:left put) (:left is) (:arg left) (:hasnt hasnt) (:sum has) (:arg right) (:get prod) (:if hasnt) (:sum arg) (:isnt const) (:sum right) (:isnt null) (:null nif) (:bin bin) (:prod left) (:null sum) (:hasnt right) (:nif isnt) (:arg is) (:sum isnt) (:pass is) (:bin prod) (:star put) (:is right) (:domain prod) (:prod arg) (:nif if) (:left get) (:domain if) (:null domain) (:if right) (:is hasnt) (:bin plus) (:nif null) (:star const) (:star isnt) (:get nif) (:is pass) (:plus right) (:has sum) (:nif prod) (:const hasnt) (:hasnt null) (:nif left) (:const pass) (:is plus) (:put sum) (:get const) (:isnt sum) (:arg isnt) (:prod isnt) (:pass) (:sum get) (:plus prod) (:has has) (:plus if) (:plus is) (:const right) (:put if) (:arg arg) (:pass plus) (:plus arg) (:pass has) (:const left) (:arg get) (:

In [16]:
count : defs : Sequence

13

In [17]:
sample.data (defs:Sequence) : 2

(:once nth1) (:once nth) (:last once) (:once rep) (:count last) (:once once) (:once rev) (:skip by) (:skip first) (:nth rev) (:count) (:skip) (:by nth) (:rev nth1) (:last by) (:rev by) (:nth nth1) (:pre by) (:once skip) (:last rev) (:nth1 tail) (:pre skip) (:by post) (:count by) (:post once) (:skip nth1) (:pre rev) (:tail nth1) (:count pre) ◎ (:nth1 skip) (:last rep) (:count count) (:nth once) (:last nth1) (:pre nth1) (:tail skip) (:nth1 count) (:skip once) (:rep count) (:post skip) (:rep skip) (:rep once) (:rev last) (:count once) (:rev count) (:by by) (:nth1 by) (:post last) (:rep first) (:post count) (:first skip) (:once by) (:by tail) (:tail count) (:once last) (:tail by) (:post) (:nth rep) (:nth post) (:first pre) (:once count) (:last tail) (:rev nth) (:nth last) (:nth1 first) (:first tail) (:rep) (:rev skip) (:first rev) (:nth1) (:pre nth) (:skip rep) (:skip post) (:rep by) (:post nth1) (:nth1 post) (:count nth1) (:nth pre) (:pre post) (:skip count) (:by) (:rep rep) (:tail rev) (

In [18]:
defs:Sequence

by count first last nth nth1 once post pre rep rev skip tail

In [19]:
sample.data rev first tail last skip rep nth1 once count nth pre post : 2

(:nth skip) (:post count) (:rep post) (:pre post) (:nth) (:pre first) (:pre once) (:pre pre) (:rep nth1) (:rev nth) (:tail skip) (:last nth1) (:tail count) (:post post) (:post nth) (:tail nth) (:rev rep) (:last skip) (:first count) (:post rep) (:last rep) (:nth rev) (:once rev) (:first once) (:pre tail) (:nth1 rev) (:post) (:pre last) (:first) (:nth first) (:nth1 nth) (:pre nth1) (:rep nth) (:tail last) (:post last) (:tail) (:rep rep) (:once post) (:last count) (:rev pre) (:first skip) (:count pre) (:post rev) (:nth1 tail) (:nth nth) (:pre count) (:first nth) (:nth count) (:rep tail) (:rev last) (:once count) (:rev) (:count) (:count post) (:tail nth1) (:skip pre) (:nth rep) (:skip count) (:tail once) (:nth last) (:post skip) (:tail tail) (:skip tail) (:post once) (:once first) (:nth1 rep) (:first post) (:first rev) (:skip skip) (:once last) (:tail pre) (:last tail) (:first last) (:first rep) (:rev count) (:last post) (:skip first) (:rep) (:nth1 pre) (:count skip) (:count once) (:post n

In [20]:
#
#    To create a typical sample, we start with A, B and {$} for language purposes, add a few "variables" (x? y? z?) and 
#    mix in the above definitions. 
#    
sample.data <A> <B> x? y? <{$}> (defs:Apply Basic) : 2 

(:{B ap} bin) (:pass {has B}) (:A {get : B}) (:{B : hasnt} right) (:{B : aq} domain) (:const {B prod}) (:aq plus) (:{B : ap} sum) (:{nif B} aq) (:right {put B}) (:{B A} null) (:B {nif : B}) (:{plus : B} ap) (:plus {prod : B}) (:{B : nif} ap) (:get {B : sum}) (:{prod : B} is) (:B {as : B}) (:isnt {if B}) (:right {B : ap}) (:{B : nif} is) (:left {star B}) (:ar {put B}) (:{B has} as) (:put prod) (:{star : B} hasnt) (:{bin B} star) (:domain {B : if}) (:const if) (:{aq : B} plus) (:kernel {B ar}) (:as {star B}) (:{left B} pass) (:pass {B}) (:if {B : left}) (:put {hasnt B}) (:ar {B : kernel}) (:{(x:) : B} left) (:bin {star : B}) (:hasnt {B : sum}) (:pass) (:{B if} ar) (:{B left} aq) (:nif {B : (x:)}) (:{hasnt : B} left) (:get {null : B}) (:{as B} A) (:as arg) (:{B : null} A) (:{ap B} (x:)) (:prod {B prod}) (:{B plus} ar) (:{B right} bin) (:{kernel : B} has) (:{B : get} ap) (:hasnt {B : bin}) (:const {right B}) (:{B ar} put) (:prod B) (:{B ap} const) (:as {domain B}) (:ker {B A}) (:hasnt prod

In [21]:
Let Sample : sample.data <A> <B> x? y? <{$}>  : 2 



In [22]:
#
#   Sample? is then a collection of data that can be used, for instance, to test for theorems or search for 
#   algebraic structures of interest, as shown in the study tutorials. 
#
count : Sample? 

138

For larger-scale searching, one would typically increase the width beyond 2, search in parallel and, perhaps, add more variables.

Such samples are useful for theorem testing and for searching for algebraic structures of interest, as illustrated in the theorem tutorial and in various study notebooks. 