Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is it even possible to determine whether two arbitrary expression is equivalent? #7

Open
Phryxia opened this issue Aug 19, 2021 · 0 comments
Assignees
Labels
theory Theoretical foundations wip Indicates that this issue is currently on writing

Comments

@Phryxia
Copy link
Owner

Phryxia commented Aug 19, 2021

Before you proceed, please see the model definitions in #8 .

Definition of mathematical equivalence

Let <s, X, f>, <t, X, g> ∈ E.

Two entities are mathematically equivalent on X

iff ∀xX f(x) = g(x)

We're going to use the notation simply fg if they are equivalent. It's important to match each domain, otherwise it doesn't make any sense.

Note that following property is very essential for solving this problem.

  • fgf - g0

Determining mathematical equivalence

If the domain is finite than we can put every values and check whether they are same or not. But in most of cases, the domain is inifinite set.

Another approach is using preserving transformation. For example, f + g is equivalent to g + f then we can use this property to find the path between two expressions. But this is also intractable since the number of preserving transformations is unknown, and even if is known, the state space is inifinite so it cannot be guaranteed to determine in finite time.

Reducing the space

Note that E can be partitioned by grouping the equivalent entities. If there is an algorithm to find representative entity of each partition, then determining equivalence is easy.

We can make the entire problem into smaller one by reducing the space. For example, there are n! ways to express the sum of n different variables, but by sorting them we can reduce the set of every expression summing n different variables into one unique expression.

g0: Every child nodes must be optimized before optimize parent

But it doesn't have to be optimized in every-depth. What it means is this: Consider the small additive tree where x, y, z is not non additive node.

  +
 / \
x   +
   / \
  y   z

Old implementation optimized this like below.

optimize x
optimize y
optimize z
in optimize (y+z)
  try to merge y, z
in optimize x+(y+z)
  try to merge x, y
  try to merge x, z
  try to merge y, z

It was really terrible. Merging n variable requires n*(n-1)/2 comparisons and each comparison uses equivalency comparison, and again it calls optimize recursively for internal terms. I don't want to even estimate its complexity.

Just forget about the old one. If we use same strategy, such tree having total n leaf cause following complexity:

T(n) = n*lg(n) + T(n - 1)
     = n*lg(n) + (n - 1)*lg(n - 1) + T(n - 2)
     = ...
     = n*lg(n) + n*lg(n - 1) + n*lg(n - 2) + ... + n*lg(1)
       - (lg(n - 1) + 2*lg(n - 2) + 3*lg(n - 3) + ... )
     < n*lg(n!) - n*(n+1)/2*lg(n)
     = O(n^2 * lg(n))

Very terrible. New implementation will optimizes them in one batch, so that the complexity can be reduced into O(n * lg(n)). (still equivalent comparison occurs, but this time they are already optimized well)

optimize x
optimize y
optimize z
sort x, y, z
try to merge x, y
try to merge y, z

g1: Every pure-constant-only expressions should be computed

2 * (1 + 3) = 8

g2: In multiplicative and additive expression, child should be ordered by their text rendered form. But constant is exception, and it should be always be first.

a * y * x => a * x * y
x + 1 + e^2 => 1 + e^2 + x

g3: Valid domain of the optimized equation must be equal or large than original domain

Consider this.

x * (x + 1)^0.4

Valid domain of this is x > -1. But if we optimize above equation into below form, it's domain is changed to x > 0.

(x^3.5 + x^2.5)^0.4

This is much smaller than original one. Therefore you shouldn't do this.

Exponential

Every power expressions should be one of these:

  • (has variable) ^ (only constant)
  • (only constant) ^ (has variable)

Keep inside of the exponent minimal if it is possible. This is because power has many equivalent form so that structural complexity increases.

ex) 2^(x+1) => 2*2^x, (x+y)^(2*e) => (x^2+2*x*y+y^2)^e

e0: Nested exponent should be transformed into multiplication

(f ^ g) ^ h = f ^ (g * h)

New child g * h should be reduced using multiplication rules.

e1: If base has variable and exponent has variable, always put base into exponent using log.

f ^ g = e ^ (ln(f) * g)

New child ln(f) * g should be reduced using multiplication rules.

This is because of undecidability of reducing below type.

e ^ (ln(x) * ln(y)) = x ^ ln(y) = y ^ ln(x)

This assures (base is constant and exponent is not constant) or (base is not constant and exponent is constant)

This doesn't matter when one of them is constant.

e ^ (ln(x) * ln(2))
2^ln(x) => x^ln(2) by e5

e ^ (0.1 * ln(x))

x ^ 0.1

e2: If base is non constant additive and exponent is integer constant, spread using binomial theorem.

(f + g) ^ n = f ^ n + n * f ^ (n + (-1)) * g + ...

New children should be reduced using exponential rules.

If there are more than two terms, apply multinomial theorem instead.

e3: If base is multiplicative, spread them.

This assures every base of exponential to be atomic.

(2 * e) ^ h = 2 ^ h * e ^ h

New children should be reduced.

Note that

e4: If base is pure constant and exponent is additive and there is constant term, split constant.

2 ^ (f + 3) = 8 * 2 ^ f

e5: If base is constant and exponent is pure log (or coefficient multiplied), pull down the content into base.

a ^ ln(f) = f ^ ln(a)
a ^ (c * ln(f)) = (f ^ c) ^ ln(a)
e ^ ln(f) = f
e ^ (c * ln(f)) = f ^ c

In the second case and fourth case, new child should be reduced.

Multiplication

m0: Remove / operator

Every expressions using binary division can be replaced by exponentiation.

f / g = f * g ^ (-1)

New child g ^ (-1) should be reduced using exponential rules.

m1: Always spread expression using distributive property

Every power of integer of additive expression should be spread using binomial theorem. Every multiplication of composited expressions should be spread .

(f + g) * h = f * h + g * h

This is because there are many ways to express with factored form, but there is only one (if sorted) additive form. Note that c is guaranteed to be "non integer pure constant" or named constant.

New children should be reduced.

m2: Always group exponent of multiplicative using exponential property

Every multiplication of same base should be grouped using exponential property.

f ^ g * f ^ h = f ^ (g + h)

This is because we want all bases to be unique in multiplication.

New child g + h should be reduced using additive rules, and then f ^ (g + h) should be reduced using exponential rules.

m3: Always group constants

m4: Remove identity

1 * f = f

m5: Remove zero

0 * f = 0

Addition

a0: Remove - operator

Every expressions using binary subtraction can be replaced by multiplication of minus one.

f - g = f + (-1) * g

Since there property is similar to addition, we can reduce the complexity of implementation.

(-1) * g should be reduced using multiplication rules.

a1: Always group coefficient of additive

Every addition of same chunk should be grouped. Note that a and b must be constant. Otherwise it violates previous rule.

a * f + b * f = (a + b) * f

a2: Remove identity

f + 0 = f
@Phryxia Phryxia added the implementation Related to new feature implementation label Aug 19, 2021
@Phryxia Phryxia self-assigned this Aug 19, 2021
@Phryxia Phryxia added wip Indicates that this issue is currently on writing theory Theoretical foundations and removed implementation Related to new feature implementation labels Aug 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
theory Theoretical foundations wip Indicates that this issue is currently on writing
Projects
None yet
Development

No branches or pull requests

1 participant