Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add 'results_function' parameter for improving fitting capabilities #77 #78

Merged
merged 18 commits into from
Mar 28, 2023

Conversation

joelvdavies
Copy link
Contributor

@joelvdavies joelvdavies commented Mar 21, 2023

Adds a parameter results_function that can take all the usual constants, variables and functions available to the input file and additionally two extra 'x' and 'y' representing the output of muspinsim's simulation and runs it before outputting the results. This allows fitting on functions that modify the data e.g.

results_function
    A*y+B

For a linear fit, offsetting the y values and scaling them.

This also detects when fitting parameters are present in any sections of the input other than this function, and in the case they are only used in results_function the FittingRunner will cache the results and only run the simulation once, but repeatedly fit with this function so that it can run significantly faster for large systems.

Other notes

  • Also adds least-squares as an additional fitting method (it behaves better than the default nelder-mead in the case of linear fitting but it also typically requires more function evaluations so is not as suitable for atomistic fitting)
  • Includes docs for these changes
  • Fixes bug preventing fitting with OpenMPI using mpirun -n
  • Removes a limit of a maximum of 4 fitting variables

Closes #77

@joelvdavies joelvdavies added the enhancement New feature or request label Mar 21, 2023
@joelvdavies joelvdavies self-assigned this Mar 21, 2023
@joelvdavies joelvdavies changed the base branch from main to v2.3.0 March 21, 2023 13:11
@joelvdavies joelvdavies added the documentation Improvements or additions to documentation label Mar 22, 2023
@joelvdavies joelvdavies marked this pull request as ready for review March 23, 2023 09:07
Copy link
Contributor

@patrick-austin patrick-austin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were a few places where the logic confused me a bit, I think they can either be simplified or would benefit from slightly expanded documentation (or they're the way they are for good reason which I haven't realised). However on the whole it's really impressive that you managed to get all this functionality into the existing architecture without needing to radically change anything.

(I also haven't manually tested this out yet so will do that as well)
Had one problem, not related to this but while testing I followed the documentation here:

*Example:*
```plaintext
fitting_data
load('results.dat')
```

Which is not accurate: attempting to use single quotes errors (Could not parse input...), it needs to be double quotes (other places in the documentation do use double quotes correctly, don't know why this one is different - just my bad luck to base my config off that one).

But other than that it seems to work (aside from failures to even take a single step from the starting values but that's not our problem).

I just remembered the whole discussion about ranges of simulation vs the "experimental" data to fit to. Currently, we silently overwrite the values provided by the user, and just use the ones from the "experimental" data:

# If we're fitting, we can't have file ranges
finfo = params["fitting_info"]
if finfo["fit"]:
if len(self._file_ranges) > 0:
raise MuSpinConfigError("Can not have file ranges when fitting")
# The x axis is overridden, whatever it is
xname = list(self._x_range.keys())[0]
self._constants.pop(xname, None) # Just in case it was here
self._x_range[xname] = finfo["data"][:, 0]
if xname == "t":
# Special case
self._time_N = len(self._x_range[xname])

I appreciate that any advanced handling of the x ranges may be out of scope for this PR (or indeed the final week of the rotation). I also think overwriting here is justifiable, and certainly avoids a lot of the complexity I ended up rambling about. However, I do not think this should happen without warning the user. I ran loads of these, confused why nothing was breaking even when I provided times starting 100 seconds after my "experimental" data. I think logging at level warn is probably enough to cover ourselves here.

If you did have any thoughts on more complex handling, I suppose they should be documented on #77 (and that left open after merging) or another issue.

docs/docs/input.md Outdated Show resolved Hide resolved
muspinsim/fitting.py Outdated Show resolved Hide resolved
muspinsim/input/input.py Outdated Show resolved Hide resolved
muspinsim/experiment.py Outdated Show resolved Hide resolved
muspinsim/experiment.py Outdated Show resolved Hide resolved
muspinsim/experiment.py Outdated Show resolved Hide resolved
muspinsim/tests/test_experiment.py Outdated Show resolved Hide resolved
joelvdavies and others added 3 commits March 27, 2023 09:14
Co-authored-by: patrick-austin <61705287+patrick-austin@users.noreply.github.com>
…oscopy-computational-project/muspinsim into investigation-of-fitting-#77
@joelvdavies
Copy link
Contributor Author

joelvdavies commented Mar 27, 2023

I just remembered the whole discussion about ranges of simulation vs the "experimental" data to fit to. Currently, we silently overwrite the values provided by the user, and just use the ones from the "experimental" data:

# If we're fitting, we can't have file ranges
finfo = params["fitting_info"]
if finfo["fit"]:
if len(self._file_ranges) > 0:
raise MuSpinConfigError("Can not have file ranges when fitting")
# The x axis is overridden, whatever it is
xname = list(self._x_range.keys())[0]
self._constants.pop(xname, None) # Just in case it was here
self._x_range[xname] = finfo["data"][:, 0]
if xname == "t":
# Special case
self._time_N = len(self._x_range[xname])

I appreciate that any advanced handling of the x ranges may be out of scope for this PR (or indeed the final week of the rotation). I also think overwriting here is justifiable, and certainly avoids a lot of the complexity I ended up rambling about. However, I do not think this should happen without warning the user. I ran loads of these, confused why nothing was breaking even when I provided times starting 100 seconds after my "experimental" data. I think logging at level warn is probably enough to cover ourselves here.

If you did have any thoughts on more complex handling, I suppose they should be documented on #77 (and that left open after merging) or another issue.

I will add a warning and make a note on the issue. Considering at this point in the code it will already have some value (either the default or user defined), we may as well always log it. I also noticed this would cause issues with celio, as it relies on always starting at an initial time of 0, its only the spacing and number of times that is actually used, so I will also add an error there.

@patrick-austin
Copy link
Contributor

I also noticed this would cause issues with celio, as it relies on always starting at an initial time of 0, its only the spacing and number of times that is actually used, so I will also add an error there.

What should happen if I have a start time of 0, but non even spacing? E.g. 0, 1, 2, 10? I just tried this and it seemed to run without erroring but I may not have set it up properly (it's a garbage input with celio 1 stuck at the bottom).

@joelvdavies
Copy link
Contributor Author

I also noticed this would cause issues with celio, as it relies on always starting at an initial time of 0, its only the spacing and number of times that is actually used, so I will also add an error there.

What should happen if I have a start time of 0, but non even spacing? E.g. 0, 1, 2, 10? I just tried this and it seemed to run without erroring but I may not have set it up properly (it's a garbage input with celio 1 stuck at the bottom).

Ah yes, it doesn't currently error, but it doesn't make sense either, we take the first value and expect all subsequent times to have the same spacing. I will prevent this too.

Copy link
Contributor

@patrick-austin patrick-austin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to get the Celio validation errors in alongside the overwrite warning

@joelvdavies joelvdavies merged commit d7a8653 into v2.3.0 Mar 28, 2023
@joelvdavies joelvdavies deleted the investigation-of-fitting-#77 branch March 28, 2023 07:59
This was referenced Mar 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve fitting of experimental data
2 participants