Add 'results_function' parameter for improving fitting capabilities #77 #78

joelvdavies · 2023-03-21T13:07:50Z

Adds a results_function parameter. It accepts all the usual constants, variables and functions available in the input file, plus two extra variables, 'x' and 'y', which hold the output of muspinsim's simulation. The function is applied to the simulation output before the results are written. This allows fitting with functions that modify the data, e.g.

results_function
    A*y+B

This performs a linear fit, scaling the y values by A and offsetting them by B.
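As a rough illustration of what the example above computes, the sketch below applies a linear results function to some simulated output. It is plain Python, not muspinsim's implementation; the name `apply_results_function` and the sample values are hypothetical.

```python
def apply_results_function(x, y, A, B):
    """Hypothetical stand-in for a results_function of 'A*y+B'.

    x and y are the simulated x-axis values and results; A and B play
    the role of fitting variables. x is available but unused here.
    """
    return [A * yi + B for yi in y]


# Made-up simulation output, for illustration only
x = [0.0, 1.0, 2.0]
y = [0.5, 0.25, 0.125]

scaled = apply_results_function(x, y, A=2.0, B=1.0)
print(scaled)  # [2.0, 1.5, 1.25]
```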

This also detects whether fitting parameters are present in any section of the input other than this function. If they are used only in results_function, the FittingRunner caches the simulation results and runs the simulation only once, then repeatedly applies this function during fitting, which is significantly faster for large systems.
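The caching idea can be sketched as follows. This is a minimal illustration under assumed names (`run_simulation`, `make_cached_model` and `sum_squared_error` are hypothetical), not the actual FittingRunner code: the expensive simulation runs once, and every trial set of fitting variables only re-evaluates the cheap function on the cached output.

```python
def run_simulation(x):
    """Hypothetical expensive simulation; counts how often it runs."""
    run_simulation.calls += 1
    return [0.5 ** xi for xi in x]


run_simulation.calls = 0


def make_cached_model(x):
    """Return a model that simulates once and reuses the result."""
    cache = {}

    def model(A, B):
        if "y" not in cache:
            cache["y"] = run_simulation(x)  # only reached on the first call
        return [A * yi + B for yi in cache["y"]]

    return model


def sum_squared_error(model, params, data):
    """Cost that a fitter would minimise over (A, B)."""
    A, B = params
    return sum((m - d) ** 2 for m, d in zip(model(A, B), data))


x = [0.0, 1.0, 2.0, 3.0]
data = [2.0 * 0.5 ** xi + 1.0 for xi in x]  # synthetic "experimental" data
model = make_cached_model(x)

# A fitter evaluates many trial parameter sets; three are shown here
trials = [(1.0, 0.0), (1.5, 0.5), (2.0, 1.0)]
errors = [sum_squared_error(model, p, data) for p in trials]
print(run_simulation.calls)  # 1
```

If a fitting variable also appeared in the simulated system itself, the cached output would be stale after each parameter change, so this shortcut would not apply.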

Other notes

- Adds least-squares as an additional fitting method. It behaves better than the default nelder-mead for linear fitting, but it typically requires more function evaluations, so it is less suitable for atomistic fitting.
  - It seems nelder-mead is simply more prone to getting stuck, judging by the solution here: https://stackoverflow.com/questions/72152400/nelder-mead-isnt-converging-scipy-why-is-there-only-one-initial-point
- Includes docs for these changes
- Fixes a bug preventing fitting with OpenMPI using mpirun -n
- Removes the limit of a maximum of 4 fitting variables

Closes #77

patrick-austin

There were a few places where the logic confused me a bit. I think they can either be simplified or would benefit from slightly expanded documentation (or they are the way they are for a good reason that I haven't realised). On the whole, however, it's really impressive that you managed to get all this functionality into the existing architecture without needing to radically change anything.

~~(I also haven't manually tested this out yet so will do that as well)~~
I had one problem, unrelated to this PR: while testing, I followed the documentation here:

muspinsim/docs/docs/input.md

Lines 311 to 315 in 3dc9fba

    *Example:*
    ```plaintext
    fitting_data
        load('results.dat')
    ```

This is not accurate: single quotes cause an error (Could not parse input...); double quotes are required. Other places in the documentation do use double quotes correctly, so I don't know why this one is different. It was just my bad luck to base my config on that one.

~~But other than that it seems to work (aside from failures to even take a single step from the starting values but that's not our problem).~~

I just remembered the whole discussion about ranges of simulation vs the "experimental" data to fit to. Currently, we silently overwrite the values provided by the user, and just use the ones from the "experimental" data:

muspinsim/muspinsim/simconfig.py

Lines 177 to 188 in 3dc9fba

    # If we're fitting, we can't have file ranges
    finfo = params["fitting_info"]
    if finfo["fit"]:
        if len(self._file_ranges) > 0:
            raise MuSpinConfigError("Can not have file ranges when fitting")
        # The x axis is overridden, whatever it is
        xname = list(self._x_range.keys())[0]
        self._constants.pop(xname, None)  # Just in case it was here
        self._x_range[xname] = finfo["data"][:, 0]
        if xname == "t":
            # Special case
            self._time_N = len(self._x_range[xname])

I appreciate that any advanced handling of the x ranges may be out of scope for this PR (or indeed the final week of the rotation). I also think overwriting here is justifiable, and certainly avoids a lot of the complexity I ended up rambling about. However, I do not think this should happen without warning the user. I ran loads of these, confused why nothing was breaking even when I provided times starting 100 seconds after my "experimental" data. I think logging at level warn is probably enough to cover ourselves here.

If you did have any thoughts on more complex handling, I suppose they should be documented on #77 (and that left open after merging) or another issue.

docs/docs/input.md

muspinsim/fitting.py

muspinsim/input/input.py

muspinsim/experiment.py

muspinsim/tests/test_experiment.py

joelvdavies · 2023-03-27T08:48:33Z


I will add a warning and make a note on the issue. Since at this point in the code the x axis will already have some value (either the default or user defined), we may as well always log it. I also noticed this would cause issues with celio, as it relies on always starting at an initial time of 0; it's only the spacing and number of times that are actually used. I will also add an error there.
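A minimal sketch of the kind of warning discussed here, using Python's standard `logging` module. The function name `override_x_range`, the logger name and the message wording are all hypothetical; this is not the code that was committed.

```python
import logging

logger = logging.getLogger("fitting_sketch")  # hypothetical logger name


def override_x_range(x_range, xname, data_x):
    """Replace x_range[xname] with the fitting data's x values.

    Always logs a warning, because the range already holds either the
    default or a user-defined value at this point.
    """
    logger.warning(
        "Overriding '%s' range %s with %d values from the fitting data",
        xname,
        x_range.get(xname),
        len(data_x),
    )
    x_range[xname] = list(data_x)
    return x_range


# User asked for times starting at 100, but the data starts at 0
x_range = {"t": [100.0, 101.0, 102.0]}
override_x_range(x_range, "t", [0.0, 0.1, 0.2])
```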

patrick-austin · 2023-03-27T14:09:21Z

> I also noticed this would cause issues with celio, as it relies on always starting at an initial time of 0; it's only the spacing and number of times that are actually used. I will also add an error there.

What should happen if I have a start time of 0, but uneven spacing? E.g. 0, 1, 2, 10? I just tried this and it seemed to run without erroring, but I may not have set it up properly (it's a garbage input with celio 1 stuck at the bottom).

joelvdavies · 2023-03-27T14:16:17Z

> What should happen if I have a start time of 0, but uneven spacing? E.g. 0, 1, 2, 10? I just tried this and it seemed to run without erroring, but I may not have set it up properly (it's a garbage input with celio 1 stuck at the bottom).

Ah yes, it doesn't currently error, but it doesn't make sense either: we take the first value and expect all subsequent times to have the same spacing. I will prevent this too.
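The two checks described in this thread (times must start at 0 and be evenly spaced) could look roughly like the sketch below. `ConfigError`, `validate_celio_times` and the tolerances are hypothetical choices for illustration, not the validation that was merged.

```python
import math


class ConfigError(ValueError):
    """Hypothetical stand-in for a configuration error type."""


def validate_celio_times(times, rel_tol=1e-9, abs_tol=1e-12):
    """Raise ConfigError unless times start at 0 with even spacing."""
    if len(times) == 0 or times[0] != 0:
        raise ConfigError("Times must start at 0 when using celio")
    if len(times) < 3:
        return  # zero or one interval: nothing to compare
    dt = times[1] - times[0]
    for earlier, later in zip(times[1:], times[2:]):
        # Compare each spacing with the first, allowing for rounding
        if not math.isclose(later - earlier, dt, rel_tol=rel_tol, abs_tol=abs_tol):
            raise ConfigError("Times must be evenly spaced when using celio")


validate_celio_times([0.0, 0.5, 1.0, 1.5])  # passes silently

try:
    validate_celio_times([0.0, 1.0, 2.0, 10.0])  # the example from above
except ConfigError as err:
    print(err)  # Times must be evenly spaced when using celio
```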

patrick-austin

Great to get the Celio validation errors in alongside the overwrite warning

joelvdavies added 3 commits March 20, 2023 12:14

Add results_function parameter and use during fitting

5cb46fb

Fix default value and ensure still works when not defined

4565963

Cache simulation where possible to speedup fitting

903b007

joelvdavies added the enhancement New feature or request label Mar 21, 2023

joelvdavies self-assigned this Mar 21, 2023

Fix error when fitting with mpirun

c51068f

joelvdavies changed the base branch from main to v2.3.0 March 21, 2023 13:11

joelvdavies added 3 commits March 21, 2023 15:13

Fix existing unit tests

f2d9d1c

Add some unit tests for MuSpinInput

f06ce49

Add unit tests to experiment, fitting and config

bd23a62

joelvdavies force-pushed the investigation-of-fitting-#77 branch from 05983e9 to bd23a62 Compare March 22, 2023 09:21

Fix reserved variable name clashes in tests

b407dc0

joelvdavies force-pushed the investigation-of-fitting-#77 branch from 895ed18 to b407dc0 Compare March 22, 2023 10:17

joelvdavies added 2 commits March 22, 2023 12:44

Add documentation for results_function

e975b7d

Remove usage of x and y in the fitting example as they are now reserved

e158e3d

joelvdavies added the documentation Improvements or additions to documentation label Mar 22, 2023

joelvdavies added 2 commits March 22, 2023 15:28

Add least-squares as an extra fitting method

ec65407

Add some more keyword unit tests to test_input

1bbfc99

joelvdavies requested a review from patrick-austin March 23, 2023 09:07

joelvdavies marked this pull request as ready for review March 23, 2023 09:07

patrick-austin requested changes Mar 24, 2023

joelvdavies and others added 3 commits March 27, 2023 09:14

Update docs/docs/input.md

87168c8

Co-authored-by: patrick-austin <61705287+patrick-austin@users.noreply.github.com>

Apply some suggestions

2126cfa

Merge branch 'investigation-of-fitting-#77' of github.com:muon-spectr…

81ff829

…oscopy-computational-project/muspinsim into investigation-of-fitting-#77

Add warning about overriding x_axis values during fitting

28af575

joelvdavies mentioned this pull request Mar 27, 2023

Improve fitting of experimental data #77

Closed

Add error when times don't start at 0 when using Celio's

503b1aa

joelvdavies force-pushed the investigation-of-fitting-#77 branch from fb50da9 to 503b1aa Compare March 27, 2023 09:25

joelvdavies requested a review from patrick-austin March 27, 2023 09:37

Add error when celio's is used with uneven spacing in the times

3278a41

joelvdavies force-pushed the investigation-of-fitting-#77 branch from 9d81da1 to 3278a41 Compare March 27, 2023 14:44

patrick-austin approved these changes Mar 27, 2023

joelvdavies merged commit d7a8653 into v2.3.0 Mar 28, 2023

joelvdavies deleted the investigation-of-fitting-#77 branch March 28, 2023 07:59

This was referenced Mar 29, 2023

Fix ALC fitting #81 #82

Merged

Release v2.3.0 #83

Closed
