# Analysing Distributions

The equation tree can be used to extract information from existing distributions of equations, for example from priors obtained by scraping (see https://autoresearch.github.io/equation-scraper/tutorials/equation_scraper_tutorial/).

## Installation

In [None]:
!pip install equation-tree

## Equation Database

Here, we use a list of sympy equations to demonstrate the functionality.

In [1]:
# import functionality from sympy
from sympy import sympify

eq_1 = sympify('x_1 + x_2')
eq_2 = sympify('exp(x_1) * 2.5')
eq_3 = sympify('sin(x_1) + 2 * cos(x_2)')
equation_list = [eq_1, eq_2, eq_3]

## Analyse the List

We can obtain information about equations and lists of equations:

In [4]:
from equation_tree import get_frequencies

# Show the relative frequencies of the attributes in the equation list
frequencies = get_frequencies(equation_list)
frequencies

{'max_depth': {3: 0.3333333333333333,
  4: 0.3333333333333333,
  7: 0.3333333333333333},
 'depth': {1: 0.3333333333333333,
  2: 0.3333333333333333,
  3: 0.3333333333333333},
 'structures': {'[0, 1, 1]': 0.3333333333333333,
  '[0, 1, 1, 2]': 0.3333333333333333,
  '[0, 1, 2, 1, 2, 2, 3]': 0.3333333333333333},
 'features': {'constants': 0.2857142857142857,
  'variables': 0.7142857142857143},
 'functions': {'exp': 0.3333333333333333,
  'sin': 0.3333333333333333,
  'cos': 0.3333333333333333},
 'operators': {'+': 0.5, '*': 0.5},
 'function_conditionals': {'exp': {'features': {'constants': 0.0,
    'variables': 1.0},
   'functions': {},
   'operators': {}},
  'sin': {'features': {'constants': 0.0, 'variables': 1.0},
   'functions': {},
   'operators': {}},
  'cos': {'features': {'constants': 0.0, 'variables': 1.0},
   'functions': {},
   'operators': {}}},
 'operator_conditionals': {'+': {'features': {'constants': 0.0,
    'variables': 1.0},
   'functions': {'sin': 1.0},
   'operators': {'*':
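
The keys under `'structures'` appear to list the depth of every node of the expression tree in pre-order (root first, then each subtree from left to right). This is an inference from the three example equations, not a statement about how the library computes them. The sketch below reproduces the three keys from hand-built trees. The tuple encoding `(label, *children)` and the helper `depth_list` are hypothetical and not part of `equation_tree`.

```python
def depth_list(node, depth=0):
    """Return the depths of all nodes in pre-order.

    A node is a tuple (label, *children); this encoding is an
    illustrative assumption, not the library's data structure.
    """
    _label, *children = node
    depths = [depth]
    for child in children:
        depths.extend(depth_list(child, depth + 1))
    return depths


# Hand-built trees for the three equations of this tutorial
tree_1 = ("+", ("x_1",), ("x_2",))                    # x_1 + x_2
tree_2 = ("*", ("2.5",), ("exp", ("x_1",)))           # exp(x_1) * 2.5
tree_3 = ("+", ("sin", ("x_1",)),
          ("*", ("2",), ("cos", ("x_2",))))           # sin(x_1) + 2 * cos(x_2)

for tree in (tree_1, tree_2, tree_3):
    print(depth_list(tree))
```

Under this reading, the largest entry of each list matches the corresponding `'depth'` key (1, 2 and 3) in the output above.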

Instead of frequencies, we can also obtain absolute counts:

In [3]:
from equation_tree import get_counts

counts = get_counts(equation_list)
counts

{'max_depth': {3: 1, 4: 1, 7: 1},
 'depth': {1: 1, 2: 1, 3: 1},
 'structures': {'[0, 1, 1]': 1, '[0, 1, 1, 2]': 1, '[0, 1, 2, 1, 2, 2, 3]': 1},
 'features': {'constants': 2, 'variables': 5},
 'functions': {'exp': 1, 'sin': 1, 'cos': 1},
 'operators': {'+': 2, '*': 2},
 'function_conditionals': {'exp': {'features': {'constants': 0,
    'variables': 1},
   'functions': {},
   'operators': {}},
  'sin': {'features': {'constants': 0, 'variables': 1},
   'functions': {},
   'operators': {}},
  'cos': {'features': {'constants': 0, 'variables': 1},
   'functions': {},
   'operators': {}}},
 'operator_conditionals': {'+': {'features': {'constants': 0, 'variables': 2},
   'functions': {'sin': 1},
   'operators': {'*': 1}},
  '*': {'features': {'constants': 2, 'variables': 0},
   'functions': {'exp': 1, 'cos': 1},
   'operators': {}}}}
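
For the flat attributes, the frequencies shown earlier are consistent with dividing each count by the total of its attribute. The following sketch checks this on an excerpt of the counts above. The helper `normalize` is hypothetical and only illustrates how the numbers relate; it is not the library's implementation, and it does not handle the nested conditional entries.

```python
def normalize(counts):
    """Turn a flat {key: count} mapping into relative frequencies."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: value / total for key, value in counts.items()}


# Excerpt of the counts printed above (flat attributes only)
counts_excerpt = {
    "features": {"constants": 2, "variables": 5},
    "functions": {"exp": 1, "sin": 1, "cos": 1},
    "operators": {"+": 2, "*": 2},
}

frequencies_excerpt = {
    name: normalize(attribute_counts)
    for name, attribute_counts in counts_excerpt.items()
}
print(frequencies_excerpt)
```

For instance, 2 constants out of 7 features gives 2/7, which agrees with the `'constants'` frequency reported by `get_frequencies`.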

Note: We can directly use the obtained frequencies to sample new equations:

In [6]:
from equation_tree import sample

# sample equations
equations = sample(10, frequencies)
print(equations)

Processing: 100%|██████████| 10/10 [00:00<00:00, 90.79iteration/s]

[c_1*cos(x_1), 2*x_1, 2*x_1, 2*x_1, 2*x_1, x_1 + sin(x_1), c_1*cos(x_1), c_1*cos(x_1), x_1 + sin(x_1), x_1 + sin(x_1)]

In [7]:
# check the frequencies
print(get_frequencies(equations))

{'max_depth': {4: 0.6, 3: 0.4}, 'depth': {2: 0.6, 1: 0.4}, 'structures': {'[0, 1, 1, 2]': 0.6, '[0, 1, 1]': 0.4}, 'features': {'constants': 0.35, 'variables': 0.65}, 'functions': {'cos': 0.5, 'sin': 0.5}, 'operators': {'*': 0.7, '+': 0.3}, 'function_conditionals': {'cos': {'features': {'constants': 0.0, 'variables': 1.0}, 'functions': {}, 'operators': {}}, 'sin': {'features': {'constants': 0.0, 'variables': 1.0}, 'functions': {}, 'operators': {}}}, 'operator_conditionals': {'*': {'features': {'constants': 0.6363636363636364, 'variables': 0.36363636363636365}, 'functions': {'cos': 1.0}, 'operators': {}}, '+': {'features': {'constants': 0.0, 'variables': 1.0}, 'functions': {'sin': 1.0}, 'operators': {}}}}
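
With only 10 samples, the sampled frequencies deviate noticeably from the ones used for sampling. One way to quantify the gap for a single attribute is the total variation distance between the two frequency dictionaries. The sketch below does this for the operator and function frequencies printed in this tutorial. The helper `total_variation` is a hypothetical illustration and not part of `equation_tree`.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    Keys missing from one of the dictionaries are treated as probability 0.
    The result is 0 for identical distributions and 1 for disjoint ones.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Frequencies of the original equation list (used for sampling)
prior_operators = {"+": 0.5, "*": 0.5}
prior_functions = {"exp": 1 / 3, "sin": 1 / 3, "cos": 1 / 3}

# Frequencies of the 10 sampled equations
sampled_operators = {"*": 0.7, "+": 0.3}
sampled_functions = {"cos": 0.5, "sin": 0.5}

print("operators:", total_variation(prior_operators, sampled_operators))
print("functions:", total_variation(prior_functions, sampled_functions))
```

A larger sample size would be expected to shrink these distances, since the frequencies act as a prior for the sampler rather than as exact targets.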
