Add PySR & SymbolicRegression.jl to suite #62
Conversation
Hi @MilesCranmer, thanks for this PR! Looks like there's a TypeError being thrown:

also, OperonRegressor (@foolnotion) is failing - is this something new, @foolnotion?
@lacava yes, there have been some major changes, sorry about that. I will PR an updated install script.
Thanks, just fixed!
Hm, still seeing a type error...
It seems your PySRRegressor isn't sklearn compatible. Have a look at this code: https://gist.github.com/folivetti/609bc9b854c51968ef90aa675ccaa60d I guess you just need to import
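For reference, a minimal sketch of what sklearn compatibility means here, assuming a hypothetical estimator class (the class, parameter names, and placeholder logic below are illustrative, not PySR's actual implementation):

from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.utils.validation import check_X_y, check_array

class MySymbolicRegressor(BaseEstimator, RegressorMixin):
    """Hypothetical estimator showing the interface srbench expects."""

    def __init__(self, niterations=40, population_size=33):
        # Constructor arguments must be stored unchanged as attributes
        # so that get_params()/set_params()/clone() work.
        self.niterations = niterations
        self.population_size = population_size

    def fit(self, X, y):
        # Validate inputs the way sklearn expects.
        X, y = check_X_y(X, y)
        # ... run the actual symbolic regression search here and store
        # fitted state with a trailing underscore, e.g. self.equation_ ...
        self.n_features_in_ = X.shape[1]
        return self  # fit must return self

    def predict(self, X):
        X = check_array(X)
        # ... evaluate the stored best equation on X ...
        return X[:, 0] * 0.0  # placeholder prediction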
Fixed!
It should actually work now. Sorry for the delay. The way PySR was being run inside the srbench test changed the definition of
Here it is working (using PR #65 for quicker CI): https://github.com/MilesCranmer/srbench/runs/4909749492?check_suite_focus=true I reduced the number of possible hyperparams since it was fairly slow. Will the hyperparams I give affect the benchmark, or will you use different ones? The hyperparams I listed were for speedy testing but may not have great performance.
@lacava I've been trying to get Julia built directly inside the conda environment, but it's much more difficult than I thought. Is it okay if I switch back to building it from the GitHub action as before? It seemed to work with that.
Nevermind, seems like it's working directly from conda now! Ready for review.
We'll populate the tuned directory for methods that are tested on PMLB. Check out issue #24 for a discussion on setting the hyperparameters.
Thanks, will read in-depth soon. Just so I understand, you are limiting methods to 500k evaluations, as in 500k total equations tried on a dataset? PySR can do 500k evaluations in ~1 second on a 4-core laptop... would this mean that PySR's benchmark will only allow for 1 second of search time? (PySR's algorithm is very evaluation-oriented.) Maybe there is a fixed wall-clock time test where PySR can be benchmarked as well? (The reason it can get to this speed is basically because I optimized the equation evaluation for like six months... I even fuse subtrees of operators into a single compiled kernel to get a speedup! At one point I even tried automatically caching parts of equations too.)
This sounds impressive and would be at least an order of magnitude faster than Operon. I assume the number of evaluations per second depends on the number of rows in the dataset and on the avg. size of individuals.
Here's an example with the equation x1 * x2 + cos(x3):

using BenchmarkTools
using SymbolicRegression
# 200 rows, 3 features:
X = randn(3, 200)
# Enable + * / - cos sin operators:
options = Options(binary_operators=(+, *, /, -), unary_operators=(cos, sin))
# Create equation
tree = Node("x1") * Node("x2") + cos(Node("x3"))
# Set up evaluation function:
testfunc() = evalTreeArray(tree, X, options)
# Evaluate performance:
@btime testfunc()

This gives roughly 550,000 evaluations per second on a single core, so 4 cores would be equivalent to 2,201,430 evaluations per second. In practice, despite the optimization, the equation eval is still the most expensive part, so the mutation parts shouldn't add much. (The equation itself isn't compiled, but the operations used to evaluate the tree are, so this performance should be similar for a random tree.)
This discussion should probably live somewhere other than this PR, but in short, our goal with SRBench has been to benchmark methods, not implementations (to the extent possible). We do measure and compare running time alongside accuracy and complexity and report them all together. FWIW, the synthetic benchmark upped the evals to 1M, but removed the tuning step (hence the 'tuned' folder).
I agree this makes sense in principle, but often choices in the implementation inform what methods are even possible, so I don't think these should be disentangled. For example, in PySR there are several choices made in designing the algorithm to increase evaluation speed, which changed the underlying evolutionary algorithm*. Maybe the best compromise would be to have separate benchmarks - one for raw wall-clock performance, and one for sample efficiency? From a user's perspective, I think the wall-clock benchmark would be much more useful in terms of what tool to use. But yes, maybe best left to a separate thread.

* for example, it's often faster to split a population of individuals into small fixed-size groups, then operate on the groups independently (in parallel) before merging, even though this changes the algorithm itself.
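To make the footnote concrete, here is a rough sketch of that group-splitting idea using Python's multiprocessing; it illustrates the general technique only (the group size, the toy "individuals", and evolve_group are made up, and this is not PySR's actual code):

import random
from multiprocessing import Pool

def evolve_group(group):
    # Placeholder: mutate/evaluate one small group of individuals
    # independently of all the others (the real search work goes here).
    return [ind + random.gauss(0, 0.1) for ind in group]

def evolve_population(population, group_size=50, processes=4):
    # Split the population into small fixed-size groups...
    groups = [population[i:i + group_size]
              for i in range(0, len(population), group_size)]
    # ...operate on the groups independently and in parallel...
    with Pool(processes) as pool:
        evolved = pool.map(evolve_group, groups)
    # ...then merge them back into a single population.
    return [ind for group in evolved for ind in group]

if __name__ == "__main__":
    # Toy population of floats, just to show the control flow.
    pop = [random.random() for _ in range(1000)]
    pop = evolve_population(pop)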
The methods all seem to be passing the tests, but then this is happening at the end:
Hm, this is interesting. So this is obviously a C function, yet PySR has no C dependencies in the main search code which is pure Julia. The other PySR deps are very stable libraries like numpy/sympy which I wouldn't expect to give such an issue. However, I know that PyJulia directly links into the Python binary which can change some behaviour, like signals. Does this script run all algorithms in one single kernel evaluation, or does it start a new Python kernel for each algorithm? I wonder if there is some interaction between algorithms. |
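One way to rule out cross-algorithm interaction (sketched here as an assumption about how a runner could be structured, not how srbench actually works) is to spawn a fresh Python process per method, so that anything a method does to the interpreter, such as installing signal handlers, cannot leak into the others:

import multiprocessing as mp

def run_one_method(name, result_queue):
    # Import and fit the method inside the child process so its side
    # effects (signal handlers, linked libraries, ...) stay isolated.
    # from some_package import SomeRegressor  # hypothetical import
    # SomeRegressor().fit(X_train, y_train)
    result_queue.put((name, "ok"))

if __name__ == "__main__":
    # "spawn" starts a completely fresh interpreter for every method.
    ctx = mp.get_context("spawn")
    results = ctx.Queue()
    for name in ["PySRRegressor", "OperonRegressor"]:
        p = ctx.Process(target=run_one_method, args=(name, results))
        p.start()
        p.join()
        print(results.get())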
The hyperparams you give will affect the benchmark (whenever we fix them and run it). For v2.0, we limited to six combos. However, many hyperparameters shouldn't affect the speed of the tests. When you call evaluate_model.py with the tests=True flag, it skips tuning. You can edit that conditional if you want to further restrict settings for the test. (Probably should handle that better in the future.)
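Roughly the kind of conditional being described, as a hypothetical sketch (this is not the actual evaluate_model.py code, and the grid values are made up):

def select_hyperparams(full_grid, test_mode):
    if test_mode:
        # In test mode, skip tuning and use a single fast combination.
        return [full_grid[0]]
    # Otherwise, tune over the full grid (e.g. the six combos).
    return full_grid

grid = [
    {"niterations": 5, "population_size": 20},    # fast settings for CI
    {"niterations": 40, "population_size": 100},  # heavier search settings
]
print(select_hyperparams(grid, test_mode=True))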
For tests, pytest uses a single consistent kernel, I imagine. I guess that is why you turned it into a script.
Looks good. I have one question about Julia in the CI setup; otherwise ready to merge, I think.
.github/workflows/test.yml
@@ -49,6 +49,14 @@ jobs:
conda list
conda activate srbench

# This is useful for up-to-date CI, but
I'm not too familiar with Julia; what is happening here? Is it over and above including Julia in environment.yml?
If it's going to potentially break CI when Julia packages update, I'd rather use a fixed version and update that as needed for srbench dependencies.
This can be removed now. Basically, there is a problem with the Julia package servers in that sometimes they are up to a few days out-of-date. I included the most up-to-date PySR version in this PR which required this forced package server update. However, it's been a few days so the servers are all likely up-to-date!
I sent a PR to your fork for this.
remove julia pkg server, add PySR to readme
Add PySR & SymbolicRegression.jl to suite
This adds the install script and the scikit-learn API. I also added a Julia install step to the GitHub action.
I wasn't sure what the tuned directory was for, so I have left it for now. I also wasn't sure how you typically deal with choices of operators, so I have added some different choices to the hyperparameter search. Let me know what else I need to add. Thanks!
Cheers,
Miles