Functions that produce nan/inf values #84

granawkins · 2022-08-06T05:05:58Z

Some of the operators we support will produce unusable values (nan or inf) in the course of normal use:

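| Operator | X == 0 | X > 1e3 | X < 0 | X > 1 |
| --- | --- | --- | --- | --- |
| `/` | nan* | | | |
| `**` | | inf | | |
| `sqrt` | nan* | | nan | |
| `log` | -inf | | nan | |
| `log1p` | | | nan | |
| `arcsin` | | | | nan |
| `arccos` | | | | nan |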

*We currently use helper functions for division and square root which ignore 0s.
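A minimal sketch of how a few of these values show up in practice, assuming the operators map onto NumPy functions (the thread does not name the backend, so that mapping is an assumption). The inputs are illustrative only; `np.isfinite` is one way to detect both nan and inf in a single check.

```python
import numpy as np

# Suppress the RuntimeWarnings NumPy emits for these operations.
with np.errstate(all="ignore"):
    results = {
        "0 / 0": np.divide(0.0, 0.0),      # nan
        "10 ** 1e3": np.power(10.0, 1e3),  # overflows to inf
        "sqrt(-1)": np.sqrt(-1.0),         # nan
        "log(0)": np.log(0.0),             # -inf
        "log(-1)": np.log(-1.0),           # nan
        "arcsin(2)": np.arcsin(2.0),       # nan
        "arccos(2)": np.arccos(2.0),       # nan
    }

for expr, value in results.items():
    # np.isfinite is False for nan, inf and -inf alike.
    print(f"{expr:>10} -> {value}  finite: {np.isfinite(value)}")
```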

What to do?

Here are 3 ideas:

1. Deal with them case-by-case.
   - `/` and `sqrt` seem ok for now.
   - `log1p` is a built-in function that extends `log` by ignoring 0s. We could add a helper which does `sign(x) * log1p(abs(x))`.
   - `arccos` and `arcsin` are maybe rare enough, we could add a check in `karoo.fit()` when using them that -1 < X < 1, else raise a `ValueError`.
   - That leaves `**`. X > 1e3 happens frequently with small numbers too when combined with other operators, e.g. `2 ** (1 / .001)`. Replacing with 0 is the simplest option, but it's a big nonlinearity (as X increases, outputs get exponentially larger and then drop to 0).
2. Accept a kwarg with a replacement value (e.g. 0) in the case that a nan and/or inf is produced. Basically like we do in the *'s above, for everything.
3. If and when a tree produces a nan or inf, just remove it from the gene pool and don't bother scoring it. This is basically the method used by swim, i.e. eliminate trees with less than the minimum number of nodes.

I lean toward 3.
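For comparison, idea 2 could look roughly like the sketch below. It assumes NumPy; `safe_apply` and its `replacement` kwarg are hypothetical names for illustration, not part of the existing code.

```python
import numpy as np

def safe_apply(func, x, replacement=0.0):
    """Apply func elementwise and swap any nan/inf for `replacement`.

    Hypothetical helper: the name and signature are assumptions.
    """
    with np.errstate(all="ignore"):
        out = func(np.asarray(x, dtype=float))
    # nan_to_num can map nan, +inf and -inf to caller-chosen values.
    return np.nan_to_num(
        out, nan=replacement, posinf=replacement, neginf=replacement
    )

x = np.array([-1.0, 0.0, 1.0, np.e])
cleaned = safe_apply(np.log, x)  # log(-1) and log(0) are replaced with 0.0
print(cleaned)
```

The drawback noted above still applies: the replacement value introduces a discontinuity wherever the raw output stops being finite.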


asksak · 2022-08-06T08:18:56Z

If idea 3 is applied consistently to all cases, then I think it would be the best available resolution.

granawkins · 2022-08-20T06:21:48Z

The best approach seems to be:

- Keep the helper functions for `/` and `sqrt`, and add one for `log`: `sign(x) * log1p(abs(x))`.
- Add an `unfit=False` attribute to `Tree`s. After predicting each tree, if the output contains `nan` or `inf`, set `unfit=True`.
- Skip `unfit` trees when scoring.
- Remove `unfit` trees from `gene_pool`.
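The steps above can be sketched as follows. This is an illustration under assumptions, not the code merged for this issue: the `Tree` class here is a stand-in that wraps a single NumPy callable, and `signed_log` is a hypothetical name for the proposed `log` helper.

```python
from dataclasses import dataclass
from typing import Callable

import numpy as np

def signed_log(x):
    """Proposed log helper: sign(x) * log1p(abs(x)), finite for all finite x."""
    return np.sign(x) * np.log1p(np.abs(x))

@dataclass
class Tree:
    # Stand-in for a real expression tree (assumption for this sketch).
    func: Callable[[np.ndarray], np.ndarray]
    unfit: bool = False

    def predict(self, X):
        with np.errstate(all="ignore"):
            y = self.func(X)
        # Flag the tree if any output is nan or +/-inf.
        self.unfit = not np.all(np.isfinite(y))
        return y

X = np.array([-2.0, 0.0, 3.0])
population = [Tree(signed_log), Tree(np.log), Tree(np.sqrt)]

for tree in population:
    tree.predict(X)

# Only trees that are still fit get scored and kept in the gene pool.
gene_pool = [tree for tree in population if not tree.unfit]
print(len(gene_pool), "of", len(population), "trees remain")
```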

kstaats · 2022-08-20T06:33:55Z

Ok. So the tree is removed altogether, but I assume a new tree is added to replace each one that is removed, so that the population remains at max.


granawkins · 2022-08-20T07:34:38Z

Every generation starts with `tree_pop_max` trees, e.g. 100. If 10 are unfit, then the remaining 90 are used to generate the next population of 100. It's the same approach we use to handle `tree_depth_min`.
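A toy sketch of that bookkeeping, with the numbers from the example. The dict-based trees and the copy-a-random-parent step are placeholders (assumptions) standing in for the real tree objects and genetic operators.

```python
import random

random.seed(0)
tree_pop_max = 100

# Placeholder population: the first 10 trees are marked unfit.
population = [{"id": i, "unfit": i < 10} for i in range(tree_pop_max)]

# Unfit trees are dropped, so the gene pool is smaller than the population.
gene_pool = [tree for tree in population if not tree["unfit"]]

# The next generation is refilled to tree_pop_max from the survivors only.
# Copying a random parent stands in for reproduction/mutation/crossover.
next_population = [
    dict(random.choice(gene_pool), id=i) for i in range(tree_pop_max)
]

print(len(gene_pool), "parents ->", len(next_population), "children")
```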


kstaats · 2022-08-20T22:56:44Z

Let's discuss, as there is another method ...


granawkins · 2022-08-25T07:39:02Z

This was implemented in #85

granawkins · 2022-10-11T08:15:07Z

Can you explain briefly so I can get moving?


granawkins closed this as completed Aug 25, 2022
