Make PySR look for strictly factorised functions #465

manuel-morales-a · 2023-11-15T09:36:28Z

Hello everyone,

Say I have a function that takes two features of an input and returns an output (for example, $f(x_0, x_1) = x_1 (1 + x_0^2)$).

Is there a way to tell PySR to look only for analytical expressions that are strictly factorised in terms of their input features? Specifically, I am looking for solutions where the function can be expressed as a product of two separate functions, each depending on one input feature only, like $f(x_0, x_1) = g_{0}(x_0) g_{1}(x_1)$.

Thank you for your insights!

Answered by MilesCranmer

MilesCranmer (Maintainer) · 2023-11-15T11:27:53Z

Good question. You could do this with a custom loss function that checks if the expression is factorized, otherwise returns a large loss:

function eval_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}
    # Check if expression is factorized:
    penalty_term = L(0)

    # Make sure root is degree 2:
    if tree.degree != 2
        penalty_term += L(10000)
    else
        # Make sure operator is *
        if options.operators.binops[tree.op] != *
            penalty_term += L(1000)
        else
            # Split the expression into two subexpressions at the root node:
            g0 = tree.l
            g1 = tree.r
            # Check if it's factorized:
            has_x1_in_g0 = any(node -> node.degree==0 && node.constant==false && node.feature==2, g0)
            has_x0_in_g1 = any(node -> node.degree==0 && node.constant==false && node.feature==1, g1)
            is_factorized = !has_x1_in_g0 && !has_x0_in_g1
            penalty_term += is_factorized ? L(0) : L(100)
        end
    end

    prediction, flag = eval_tree_array(tree, dataset.X, options)
    if !flag
        return L(Inf)
    end
    return (
        penalty_term 
        + sum((prediction .- dataset.y) .^ 2) / length(prediction)
    )
end

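Here I make the penalty term shrink in steps as the expression gets closer to satisfying the constraints (10000 if the root is not a binary operator, 1000 if that operator is not *, 100 if g0 contains x1 or g1 contains x0), so that the genetic algorithm has a "direction" towards the right factorization.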
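To see the graded penalty in isolation, here is a minimal pure-Python sketch of the same three checks on a toy expression tree. The `Node` dataclass, `uses_feature`, and `factorization_penalty` are hypothetical names made up for illustration; they are not PySR or SymbolicRegression.jl types, and the feature indices are 1-based only to mirror the Julia snippet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Toy expression node (hypothetical; not the SymbolicRegression.jl Node)."""
    degree: int                      # 0 = leaf, 1 = unary op, 2 = binary op
    op: Optional[str] = None         # operator name when degree > 0
    feature: Optional[int] = None    # 1-based feature index; None for constants
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def uses_feature(node: Optional[Node], feature: int) -> bool:
    """Return True if any variable leaf under `node` refers to `feature`."""
    if node is None:
        return False
    if node.degree == 0:
        return node.feature == feature
    return uses_feature(node.left, feature) or uses_feature(node.right, feature)


def factorization_penalty(tree: Node) -> float:
    """Stepwise penalty: larger the further the tree is from g0(x0) * g1(x1)."""
    if tree.degree != 2:
        return 10000.0               # root is not a binary operator
    if tree.op != "*":
        return 1000.0                # root is binary, but not a product
    if uses_feature(tree.left, 2) or uses_feature(tree.right, 1):
        return 100.0                 # product, but the factors mix features
    return 0.0


# Build (1 + x0*x0) * x1 and the same product with the factors swapped.
x0 = Node(degree=0, feature=1)
x1 = Node(degree=0, feature=2)
one = Node(degree=0)                 # constant leaf
g0 = Node(degree=2, op="+", left=one,
          right=Node(degree=2, op="*", left=x0, right=x0))
good = Node(degree=2, op="*", left=g0, right=x1)
swapped = Node(degree=2, op="*", left=x1, right=g0)

print(factorization_penalty(good))     # 0.0
print(factorization_penalty(swapped))  # 100.0
```

Note that the check is order-sensitive: the left factor must depend only on the first feature and the right factor only on the second, so the swapped form is still penalised even though it is mathematically the same product. If that matters for your problem, you could accept either ordering before adding the last penalty.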

Then you can pass the eval_loss definition to the full_objective parameter as a string: https://astroautomata.com/PySR/api/#the-objective
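On the Python side, that could look roughly like the sketch below. It assumes the Julia source of eval_loss is available as a Python string, and that your installed PySR still accepts the `full_objective` keyword used in this thread (the name may differ in other releases, so check the API page linked above). `pysr_kwargs` and `build_model` are hypothetical helper names, and the operator list is only an example.

```python
def pysr_kwargs(objective_source: str) -> dict:
    """Collect keyword arguments for PySRRegressor (illustrative helper)."""
    return {
        # "*" has to be an allowed operator, otherwise no candidate can
        # ever have a product at its root and every tree is penalised.
        "binary_operators": ["+", "*"],
        # The Julia function definition, passed verbatim as a string.
        "full_objective": objective_source,
    }


def build_model(objective_source: str):
    """Create a regressor that scores candidates with the custom objective."""
    # Imported lazily so that this module loads even without PySR installed.
    from pysr import PySRRegressor

    return PySRRegressor(**pysr_kwargs(objective_source))


# Usage sketch (assumes `eval_loss_source` holds the Julia code above,
# and X has exactly two columns, x0 and x1):
#
#     model = build_model(eval_loss_source)
#     model.fit(X, y)
```

Keeping the Julia code in its own string (or file) makes it easier to edit the penalty values without touching the rest of the Python script.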

3 replies

MilesCranmer Nov 15, 2023
Maintainer

Oops, wait, I mixed up the flag

MilesCranmer Nov 15, 2023
Maintainer

Edit: fixed!

manuel-morales-a Nov 16, 2023
Author

Nice! I'm catching up with the Julia syntax and certainly will try your snippet. Thanks, Miles!
