[Feature] Easy loading from equation file #167

Merged: 24 commits, Aug 10, 2022

MilesCranmer
Owner

@MilesCranmer MilesCranmer commented Jul 25, 2022

This makes it easier to load models directly from a saved equation file. There are two ways to use this:

  1. You can use pysr.load(...) to load from a standalone CSV file of equations. In this case you must pass the attributes needed to initialize the model: binary_operators, unary_operators, and n_features_in. If you have custom variable names, you also need to pass those, as well as nout if the number of outputs is not 1.
  2. PySRRegressor will now automatically create a pickle file of the model parameters when running fit(...). This file has the same name as the equation file, but with a .pkl extension. This lets you run, e.g., pysr.load("equations.csv"), and have the equations loaded together with all the other parameters. (You still need to pass extra_sympy_mappings and extra_torch_mappings yourself, as they are unpicklable.)
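
A rough illustration of the two ways above, using only the standard library. This is a toy sketch, not PySR's code: `sidecar_path`, `save_params`, and `load_params` are hypothetical helpers, and the only behavior taken from the description is that the pickle file shares the equation file's name with a .pkl extension.

```python
import pickle
import tempfile
from pathlib import Path


def sidecar_path(equation_file):
    """Pickle path next to the equation file: same name, .pkl extension."""
    return Path(equation_file).with_suffix(".pkl")


def save_params(equation_file, params):
    """Write the picklable parameters next to the equation file."""
    with open(sidecar_path(equation_file), "wb") as f:
        pickle.dump(params, f)


def load_params(equation_file, **overrides):
    """Use the sidecar pickle if it exists; explicit keyword arguments win."""
    params = {}
    pkl = sidecar_path(equation_file)
    if pkl.exists():
        with open(pkl, "rb") as f:
            params = pickle.load(f)
    params.update(overrides)
    return params


with tempfile.TemporaryDirectory() as tmp:
    eq_file = Path(tmp) / "equations.csv"

    # Way 1: no pickle file exists, so every attribute is passed by hand.
    manual = load_params(
        eq_file,
        binary_operators=["+", "*"],
        unary_operators=["cos"],
        n_features_in=2,
    )

    # Way 2: a pickle file was written earlier, so the path alone is enough.
    save_params(eq_file, manual)
    auto = load_params(eq_file)
```

Letting explicit arguments override the pickled values is a design choice of this sketch; it also gives a natural place to re-supply values that cannot be pickled.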

@tttc3 what do you think of this?

I am also wondering if it might be more intuitive to have a PySRRegressor.from_file(...) constructor.

@MilesCranmer
Owner Author

@kazewong this makes it easier to load from past runs

@tttc3
Contributor

tttc3 commented Jul 27, 2022

Making it simpler to load past runs is a nice addition!

I might have missed something, but I think the automatically created pickle file is of the unfitted model as it is called before pysr._run(...). Is this the desired behaviour?

For me, the from_file constructor would be more intuitive. You can already use pickle to save and load a PySR instance without any additional code, ignoring the issue of the non-serializable attributes. The main feature that's being added is the ability to create a new instance of a PySR model from a file. This file could still be the .pkl, in which case this constructor is mostly a convenience function for handling the reinitialization of the non-serializable attributes.
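
A minimal sketch of what such a constructor could look like, assuming a toy class rather than PySRRegressor itself. `ToyModel`, its `save` method, and the `extra_mappings` attribute are hypothetical; for simplicity the sketch pickles a dictionary of parameters instead of the instance, and the caller passes the non-serializable attribute again at load time.

```python
import pickle
import tempfile
from pathlib import Path


class ToyModel:
    """Hypothetical stand-in for a regressor; not PySR's class."""

    # Attributes that are left out of the pickle (e.g. dicts of lambdas).
    UNPICKLABLE = ("extra_mappings",)

    def __init__(self, binary_operators, extra_mappings=None):
        self.binary_operators = binary_operators
        self.extra_mappings = extra_mappings or {}

    def save(self, path):
        """Pickle every attribute except the non-serializable ones."""
        state = {k: v for k, v in vars(self).items() if k not in self.UNPICKLABLE}
        with open(path, "wb") as f:
            pickle.dump(state, f)

    @classmethod
    def from_file(cls, path, extra_mappings=None):
        """Rebuild a model from the pickle and re-attach what was left out."""
        with open(path, "rb") as f:
            state = pickle.load(f)
        return cls(**state, extra_mappings=extra_mappings)


with tempfile.TemporaryDirectory() as tmp:
    pkl_file = Path(tmp) / "model.pkl"
    original = ToyModel(["+", "*"], extra_mappings={"inv": lambda x: 1 / x})
    original.save(pkl_file)

    # The lambda never reached the file, so it is supplied again here.
    restored = ToyModel.from_file(pkl_file, extra_mappings={"inv": lambda x: 1 / x})
```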

@MilesCranmer
Owner Author

MilesCranmer commented Jul 27, 2022

> I might have missed something, but I think the automatically created pickle file is of the unfitted model as it is called before pysr._run(...). Is this the desired behaviour?

It was the desired behavior, but your comment is making me reconsider. I guess I can see two scenarios where people would want to load their model:

  1. PySR run is ongoing, or PySR run crashed, and you want to load up and visualize the output equations (which are automatically saved during the run).
  2. PySR run completed successfully, and now you want to analyze the equations in a new python process.

For 1., it is necessary to checkpoint the parameters before the actual equation search finishes. With the loaded model, you can do model.refresh(), and it will load the checkpointed equations.

For 2., I think you could have the equations stored in the pickle file itself, and perhaps not need the additional equation file.

Maybe a nice solution for both these is to dump the pickle file twice: once before the fit runs to checkpoint the model parameters in case the search quits early, and once with the fitted equations (to the same file). What do you think?
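
The double-dump idea can be sketched as follows. This is a toy model of the proposal, not the merged implementation: `toy_search`, `toy_fit`, `dump_state`, and `read_equations` are hypothetical names, and the CSV columns are made up for illustration.

```python
import csv
import pickle
import tempfile
from pathlib import Path


def dump_state(pkl_path, params, equations=None):
    """Overwrite the checkpoint with the current parameters and equations."""
    with open(pkl_path, "wb") as f:
        pickle.dump({"params": params, "equations": equations}, f)


def read_equations(equation_file):
    """Read whatever rows the search has written to the CSV so far."""
    with open(equation_file, newline="") as f:
        return list(csv.DictReader(f))


def toy_search(equation_file, crash):
    """Hypothetical search: saves one equation, then optionally fails."""
    with open(equation_file, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["complexity", "loss", "equation"])
        writer.writerow([3, 0.5, "x0 + x1"])
    if crash:
        raise RuntimeError("search interrupted")


def toy_fit(equation_file, params, crash=False):
    """Dump before the search (scenario 1) and again after it (scenario 2)."""
    pkl_path = Path(equation_file).with_suffix(".pkl")
    dump_state(pkl_path, params)  # first dump: parameters only
    toy_search(equation_file, crash)  # may raise partway through
    dump_state(pkl_path, params, read_equations(equation_file))  # second dump


def load_state(equation_file):
    """Load the checkpoint that sits next to the equation file."""
    with open(Path(equation_file).with_suffix(".pkl"), "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    params = {"binary_operators": ["+", "*"]}

    # Scenario 1: the run crashes, but parameters and the CSV survive.
    crashed_file = Path(tmp) / "crashed.csv"
    try:
        toy_fit(crashed_file, params, crash=True)
    except RuntimeError:
        pass
    crashed_state = load_state(crashed_file)
    recovered = read_equations(crashed_file)  # plays the role of a refresh

    # Scenario 2: the run finishes, so the pickle alone holds the equations.
    finished_file = Path(tmp) / "finished.csv"
    toy_fit(finished_file, params)
    finished_state = load_state(finished_file)
```

After a crash the checkpoint holds only the parameters, and the equations come from re-reading the CSV; after a clean run the same pickle file carries both.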

@tttc3
Contributor

tttc3 commented Aug 1, 2022

I think dumping the pickle file twice is a good, simple solution.

@Pablo-Lemos

Loading is really helpful, works perfectly as far as I can tell!

@MilesCranmer MilesCranmer merged commit 1099283 into master Aug 10, 2022
@MilesCranmer MilesCranmer deleted the loading branch August 10, 2022 07:34
@MilesCranmer
Owner Author

Thanks for the feedback! The merged version saves the pickle file twice, so now you can send someone that pickle file and they can evaluate your equations.

@MilesCranmer MilesCranmer added this to the v0.10.0 milestone Aug 10, 2022