# DAS Evolution Queries - Sentences Dataset

This notebook demonstrates how to use Evolution-based queries with Hyperon MeTTa and DAS.

Evolution queries resemble regular pattern-matching queries: they expect the same kind of input (a query) and deliver the same kind of result (an iterator over query answers). The difference is that an Evolution query goes through an evolutionary algorithm before delivering its answers, while a regular query is simply executed in the query engine. Here's how it works.

1. The caller submits a query to the evolution agent. In addition to the query itself, the caller also provides a fitness function, which evaluates the quality of a query answer and returns a score in \[0, 1\], and a secondary query which we call the "correlation query", whose meaning is explained below.
2. The evolution agent executes the query in the query engine and uses the first N results (N is another parameter of the evolution query request) to build a population of query answers.
3. All N query answers in the population are evaluated using the fitness function.
4. The best M individuals (i.e. the M query answers with the largest fitness values) are selected to sample the next generation of the population. (In practice, the selection mixes simply picking the best individuals with tournament selection; the balance between the two is another evolution parameter.)
5. To sample the next generation, we first use the "correlation query" passed as an evolution parameter. For each of the selected query answers, we use elements from the answer (variable values or the rewritten links themselves) to customize the correlation query. Then this query is executed in the query engine and its results are used to update the Hebbian Network related to the given context in the AttentionBroker and to stimulate some of the elements in the query answer. This stimulation also triggers activation spreading in the Hebbian Network. All of this stimulation and activation spreading happens ONLY IN THE ATTENTION BROKER, which keeps separate Hebbian Networks for different contexts; importance updating DOESN'T affect the atomspace itself.
6. After the importance update in the context, the main query is executed once again and the next generation of the population is sampled by taking the best N individuals (query answers), as we did initially.
7. The evolution agent repeats steps 2-6 until a stop criterion (another evolution parameter) is met. While new generations are sampled and evaluated with the fitness function, every time the agent sees a query answer that is better (i.e. has a larger fitness value) than the last one it delivered to the caller, this new best solution is delivered immediately, regardless of the generation in which it appears. This way, in addition to the stop criterion passed as an evolution parameter, the caller can also interrupt the evolution process by aborting the query once it has found a solution it considers good enough.
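
The steps above can be sketched in plain Python. This is only a toy illustration, not the DAS implementation: the function `evolve`, its parameters (`population_size`, `n_selected`, `elite_fraction`) and the `importance` dictionary are hypothetical stand-ins. In particular, the Hebbian Network and activation spreading are replaced by a simple rule that boosts sentences sharing a word with a selected answer.

```python
import random

def letter_fraction(sentence, letter="c"):
    """Fitness in [0, 1]: share of non-space characters equal to `letter`."""
    chars = sentence.replace(" ", "")
    return chars.count(letter) / len(chars)

def evolve(candidates, fitness, population_size=4, n_selected=2,
           elite_fraction=0.5, max_generations=5, seed=0):
    """Toy evolution loop; yields (answer, score) whenever the best improves."""
    rng = random.Random(seed)
    # Hypothetical stand-in for the per-context importance kept by the broker.
    importance = {c: 1.0 for c in candidates}
    best = -1.0
    for _ in range(max_generations):
        # Steps 2 and 6: sample a population, biased towards important answers.
        weights = {c: importance[c] * rng.random() for c in candidates}
        population = sorted(candidates, key=weights.get, reverse=True)[:population_size]
        # Step 3: evaluate every individual with the fitness function.
        scored = sorted(population, key=fitness, reverse=True)
        # Step 7: deliver an answer as soon as it beats the last one delivered.
        if fitness(scored[0]) > best:
            best = fitness(scored[0])
            yield scored[0], best
        # Step 4: part elitism, part tournament selection.
        n_elite = int(elite_fraction * n_selected)
        selected = scored[:n_elite]
        while len(selected) < n_selected:
            a, b = rng.sample(population, 2)
            selected.append(max(a, b, key=fitness))
        # Step 5: boost answers correlated with the selected ones (here:
        # sharing at least one word). The candidate data itself is untouched.
        for winner in selected:
            words = set(winner.split())
            for c in candidates:
                if words & set(c.split()):
                    importance[c] += 1.0

sentences = ["bbb aaa ccc", "bbb ccc cca", "bbb ddd eee", "bbb cce dde", "bbb aab bba"]
for answer, score in evolve(sentences, letter_fraction):
    print(round(score, 3), answer)
```

Because `evolve` is a generator, the caller receives each improved answer as soon as it is found and can stop iterating at any time, which mirrors the early-abort behaviour described in step 7.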

## Load Sentences Dataset (Optional)

If not already loaded, use das-cli to load a sentences dataset containing 100K sentences with 10 words each (words starting with letters a-e).


In [1]:
!das-cli metta load /tmp/100K_sentences_10_words_a-e.metta

das-cli-mongodb-40021 is running on port 40021
das-cli-redis-40020 is running on port 40020
Loading metta file /tmp/100K_sentences_10_words_a-e.metta...
Done.


## Setup

Initialize `hyperon` MeTTa environment and create a helper function to run MeTTa programs.


In [2]:
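import hyperon

# A single MeTTa interpreter shared by every cell in this notebook.
metta = hyperon.MeTTa()


def run(program='!(+ 1 2)'):
    """Run a MeTTa program and print each result atom on its own line."""
    # metta.run returns one list of result atoms per `!` expression.
    for result in metta.run(program):
        for child in result:
            print(child)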
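
The helper above only prints its results. When answers need further processing in Python, a variant that returns them can be handy. The sketch below is an assumption-laden convenience, not part of the DAS module: `flatten_results` is a hypothetical name, and it relies on `metta.run` returning one list of atoms per `!` expression, as the nested loops in `run` suggest.

```python
def flatten_results(results):
    """Flatten the nested result lists into a flat list of strings."""
    # `results` is expected to look like [[atom, atom, ...], [atom, ...], ...].
    return [str(atom) for group in results for atom in group]

# Stand-in data with the same nesting as a MeTTa result (no interpreter needed).
example = [["3"], ["eba", "dcb"]]
print(flatten_results(example))

# With the interpreter from this notebook (requires hyperon):
# answers = flatten_results(metta.run('!(+ 1 2)'))
```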

## Import DAS Module

Import the DAS module into the MeTTa environment.


In [3]:
run('!(import! &self das)')

()


## Connect to DAS

Bind a DAS connection to the `&das` space. The first parameter specifies the client's host and port range (47000-47999) and the second must be the address of a known peer (e.g. the Query Agent at localhost:40002).


In [4]:
run('!(bind! &das (new-das! (localhost:47000-47999) (localhost:40002)))')

()


## Simple Query: Find Words in Sentence

Find all words contained in a specific sentence. This is a basic pattern match to verify the dataset is loaded.


In [5]:
run('!(match &das (Contains (Sentence "bce ecc dcb ced bbb bca bce cad eba ede") (Word $W)) $W)')

"eba"
"dcb"
"cad"
"ede"
"bca"
"ced"
"bce"
"bbb"
"ecc"


## Define Helper Functions and Evolution Parameters

Define utility functions for text analysis:
- **String operations**: length calculation, character counting, space removal
- **Fitness function (ff)**: Calculates the fraction of a sentence's non-space characters that are a given letter, which yields a score in \[0, 1\]
- **Query definition**: Pattern to search for sentences containing the word "bbb"
- **Correlation parameters**: Define how evolution should correlate results across generations


In [6]:
run('''
(= (str-length $s) (* ((py-dot "" len) $s) 1.0))
(= (count-letters $s $c) (* ((py-dot $s count) $c) 1.0))
(= (remove-spaces $s) ((py-dot $s replace) " " ""))
(= (prep-sentence $s) (remove-spaces (index-atom $s 1)))
(= 
  (ff $s $c) 
  (/ 
    (count-letters (prep-sentence $s) $c) 
    (str-length (prep-sentence $s))
  )
)

(= 
  (query) 
  (or 
    (Contains $sentence1 (Word "bbb")) 
    (Contains $sentence1 (Word "bbb"))
  )
)

(=
  (correlation-queries)
  (
    (Contains $sentence1 $word1)
  )
)

(=
  (correlation-replacements)
  (
    (sentence1 sentence1)
  )
)

(=
  (correlation-mappings)
  (
    (sentence1 word1)
  )
)
''')
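
For readers less familiar with MeTTa, the fitness function can be mirrored in plain Python. This is an equivalent sketch for illustration, not code used by DAS. It takes the sentence text directly, whereas the MeTTa version first extracts the string from the `(Sentence "...")` expression with `index-atom`.

```python
def ff(sentence: str, letter: str) -> float:
    """Fraction of non-space characters in `sentence` equal to `letter`."""
    text = sentence.replace(" ", "")        # same role as remove-spaces
    return text.count(letter) / len(text)   # count-letters / str-length

# 11 of the 30 letters in this sentence are "c".
print(ff("acc cdb ccc cda acc bac edb ceb dab bbb", "c"))
```

Since every sentence in the dataset has 10 three-letter words, the denominator is always 30 and the score is just the number of occurrences of the letter divided by 30.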

## Check Current Evolution Parameters

Display the current DAS evolution parameters to see default settings.


In [7]:
run('!(das-get-params!)')

()
DAS Params:
'attention_update_flag': Bool(false)
'count_flag': Bool(false)
'elitism_rate': Double(0.08)
'enforce_cache_recreation': Bool(false)
'initial_rent_rate': Double(0.1)
'initial_spreading_rate_lowerbound': Double(0.9)
'initial_spreading_rate_upperbound': Double(0.9)
'max_answers': UnsignedInt(1000)
'max_bundle_size': UnsignedInt(1000)
'max_generations': UnsignedInt(10)
'populate_metta_mapping': Bool(true)
'population_size': UnsignedInt(50)
'positive_importance_flag': Bool(false)
'selection_rate': Double(0.1)
'total_attention_tokens': UnsignedInt(100000)
'unique_assignment_flag': Bool(true)
'use_cache': Bool(true)
'use_metta_as_query_tokens': Bool(true)


## Set Maximum Generations

Configure the evolution to run for 5 generations only. Each generation refines the search based on correlation analysis.


In [8]:
run('!(das-set-param! (max_generations 5))')

()
DAS Param Updated: 'max_generations': UnsignedInt(5)
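
When several parameters need changing, the MeTTa program can be assembled in Python. The sketch below only builds the program text, following the `das-set-param!` form used above. The helper name `set_params_program` is hypothetical, and it is limited to numeric parameters because this notebook does not show how boolean values are written.

```python
def set_params_program(params: dict) -> str:
    """Build one `das-set-param!` call per (name, value) pair."""
    lines = [f"!(das-set-param! ({name} {value}))" for name, value in params.items()]
    return "\n".join(lines)

# Parameter names taken from the das-get-params! listing above.
program = set_params_program({"max_generations": 5, "population_size": 50})
print(program)

# With the helper from this notebook (requires hyperon and DAS):
# run(program)
```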


## Run Evolution Query

Execute an evolution-based query that:
1. Searches for sentences containing "bbb"
2. Scores each matching sentence by the frequency of the letter "c" (the fitness function)
3. Evolves over 5 generations to find sentences with the highest "c" frequency
4. Uses correlation mappings to refine results across generations


In [9]:
run('''
!(match 
  &das 
  (
    EVOLUTION 
    (!(query) !(correlation-queries) !(correlation-replacements) !(correlation-mappings))
    ((ff $sentence1 "c")) 
  ) 
  $sentence1
)
''')

(Sentence "bbb eae acb abd dad aba eea bee dde dba")
(Sentence "deb dea eca ada bbe ddb bbb dbd bac abe")
(Sentence "dcc ebe aaa bbb aeb bae dda bbc ada eaa")
(Sentence "bab cbd abb bba aad edc dbd ebd bbb cbe")
(Sentence "eda bca bda ceb bbc bbb eaa ade beb daa")
(Sentence "ade cad aed ebb bdb cec bbb bab dad dad")
(Sentence "bbb aba cec bdb cae dab bdb aad ddd bdd")
(Sentence "aba baa eeb bbb eea bcb bee dee eca ece")
(Sentence "cec adc bbb bad aed ebb ade bae eeb eae")
(Sentence "baa bbc bbb edd cad dba aee eda ceb eae")
(Sentence "cee acd bbb dee add aae dee ede bcc bad")
(Sentence "ade dce bba ded edb ecd bbc bee bec bbb")
(Sentence "dda eeb aeb bee bab edc bbb ced bba ccb")
(Sentence "edb eeb dda bda aca bbb eca dec ebd aca")
(Sentence "deb eaa cad dea bbb eae eec aca ddd bcd")
(Sentence "bbb aea dba bee cab cde cbe eca aeb aab")
(Sentence "bbe bae bbb ddd cdb bbc eac daa dbb bcc")
(Sentence "bdd bab ace cda add acc bbb dae dbc bbd")
(Sentence "bbb daa dea dda eae dcb bbb dcb aac

## Test Fitness Function - Example 1

Calculate the frequency of letter "c" in a sample sentence to verify the fitness function works correctly.


In [10]:
run('!(ff (Sentence "acc cdb ccc cda acc bac edb ceb dab bbb") "c")')

0.36666666666666664


## Test Fitness Function - Example 2

Calculate the frequency of letter "c" in another sample sentence for comparison.


In [11]:
run('!(ff (Sentence "cca ddc cbd ace bbe bad cdd ccb bbb dcc") "c")')

0.3333333333333333
