New: Linear fitting 2.0 #1462

Closed

Conversation

thomasaarholt
Contributor

@thomasaarholt thomasaarholt commented Feb 19, 2017

Description of the change

This PR adds support for multivariate linear regression in HyperSpy. Linear fitting is approximately 20 times faster than leastsq for a random dataset.

A linear component is one whose free parameters only scale it in the "up" direction; that is, the model is linear in those parameters. For instance, the expression a*x**2 + b*x + c is linear in all three parameters. This also works in 2D, for instance a*x**2 + b*y**2. For components with parameters that change the shape in other ways, like the Exponential component A*exp(-x/k), A may be free but k must be fixed.
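The idea can be sketched with plain NumPy (a toy illustration, not this PR's implementation): for a model that is linear in its parameters, each free parameter contributes one column (its basis function) to a design matrix, and a single lstsq call recovers all of them at once.

```python
import numpy as np

# y = a*x**2 + b*x + c is linear in a, b and c
x = np.linspace(0, 10, 101)
a_true, b_true, c_true = 0.5, -2.0, 3.0
y = a_true * x**2 + b_true * x + c_true

# Design matrix: one column per free parameter's basis function
A = np.column_stack([x**2, x, np.ones_like(x)])

# One direct solve -- no iteration and no starting values needed
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coeffs)  # close to [0.5, -2.0, 3.0]
```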

It can be used with m.multifit in order to fit across the entire navigation space simultaneously. In this case, if all parameters have equal initial values (e.g. if m.assign_current_values_to_all() has been called), the components' functions need only be calculated for one pixel, and vectorised NumPy can perform the regression extremely quickly (thanks Paul Quinn!). If the components vary across the sample, for instance if an EELS CL onset_energy has been manually changed across several regions of the sample, this is supported as well.
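The vectorised multi-pixel case can be sketched the same way (again a toy illustration, not the PR's code): when the basis functions are identical at every pixel, the design matrix is built once and numpy.linalg.lstsq solves every pixel in a single call by passing a 2D right-hand side.

```python
import numpy as np

# Shared basis functions -> build the design matrix once
x = np.linspace(0, 10, 101)
A = np.column_stack([x**2, x, np.ones_like(x)])   # (n_channels, n_params)

# Fake 50x50 navigation space: each pixel has its own (a, b, c)
rng = np.random.default_rng(0)
true = rng.uniform(1.0, 2.0, size=(3, 50 * 50))   # (n_params, n_pixels)
Y = A @ true                                      # (n_channels, n_pixels)

# A single vectorised solve fits every pixel simultaneously
coeffs, *_ = np.linalg.lstsq(A, Y, rcond=None)    # (n_params, n_pixels)
print(np.allclose(coeffs, true))  # True
```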

This PR requires numpy>1.10.

Progress of the PR

  • Linear regression for a single pixel
  • Linear regression for multiple pixels
  • Support convolved components
  • Support twinned components
  • Change most components' .function to support full navigation space
  • Calculate EELS GOS across all navigation space
  • Advanced support for Expression component
  • update docstring
  • update user guide
  • need help to ensure EELS fine structure is working
  • support fit_component properly
  • support double eels power law, eels_vignetting and voigt components. (Thanks @joverbee)
  • support new Expression-based Polynomial (Make Polynomial an Expression + bugfixes #1989)
  • add tests
  • try to reduce ram usage
  • support lazy (dask)
  • ready for review.

Minimum Working Example

import hyperspy.api as hs
import numpy as np
from hyperspy._components.gaussian import Gaussian
from hyperspy._components.expression import Expression
from time import time

# Create a large signal with 3 gaussians and one polynomial background
nav_shape = (50,50)
x = np.arange(1024)

Gs = []
for centre, sigma in zip((300,500,800), (10,70,7)):
    Gs.append(Gaussian(A=60000, centre=centre, sigma=sigma))

P = Expression('a*x**2 + b*x + c', 'Poly', a=0.01, b=0.3, c=1.4)

for G in Gs:
    G.sigma.free = False
    G.centre.free = False

data = np.random.random(nav_shape)[...,None] * np.array([G.function(x) for G in Gs]).sum(-2)
data += np.random.random(nav_shape)[...,None] * P.function(x)

s = hs.signals.Signal1D(data)

# Create a model
Cs = Gs + [P]
m = s.create_model()
m.extend(Cs)

# reset values to be 1
for G in Gs:
    G.A.value = 1 
P.a.value = P.b.value = P.c.value = 1
m.assign_current_values_to_all()

# Linear fit with all components equal
if True:
    # Running this takes 0.65 seconds
    t = time()
    m.multifit(fitter='linear')
    print(time() - t)

# Normal nonlinear fit
if False: # Set True to time this
    # Running this takes 13 seconds on my computer
    m.fit('leastsq')
    m.assign_current_values_to_all()
    t = time()
    m.multifit(fitter='leastsq')
    print(time() - t)

# Changing the value of one component to calculate starting component functions for every pixel
# Takes 1.2 seconds on my computer
if False:
    G.sigma.map['values'][0,0] += 0.00001
    t = time()
    m.multifit(fitter='linear')
    print(time() - t)

@thomasaarholt
Contributor Author

Right, I think this is ready to be reviewed. If people would like to test it, then try:

%matplotlib qt4
import hyperspy.api as hs
import numpy as np

eds = hs.datasets.example_signals.EDS_TEM_Spectrum()
eds.add_elements(["Cu"])
eds2 = hs.stack([eds, eds])
eds2.data = np.stack([eds.data for _ in range(200)])  # np.stack needs a sequence, not a generator
eds2.get_dimensions_from_data()
m = eds2.create_model()
m.remove(0) # Remove non-linear background

m.multifit(fitter="lsq_linear", bounded=True)

@thomasaarholt
Contributor Author

I'm tempted to make scipy's lsq_linear the "linear" fitter instead of the numpy.linalg.lstsq solver. lsq_linear supports bounds and seems to be faster. (Thanks @tjof2!)

@tjof2
Contributor

tjof2 commented Feb 20, 2017

I would certainly recommend scipy's lsq_linear as the default, especially if it includes bounds.
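For reference, a minimal comparison of the two solvers under discussion (illustrative only): scipy.optimize.lsq_linear accepts a bounds argument, which numpy.linalg.lstsq does not.

```python
import numpy as np
from scipy.optimize import lsq_linear

x = np.linspace(0, 10, 101)
A = np.column_stack([x, np.ones_like(x)])
y = 2.0 * x - 1.0

# Unbounded: equivalent to np.linalg.lstsq
res = lsq_linear(A, y)
print(res.x)  # close to [2.0, -1.0]

# Bounded: e.g. force non-negative coefficients (physically sensible
# for intensities) -- something np.linalg.lstsq cannot do
res_nn = lsq_linear(A, y, bounds=(0, np.inf))
print(res_nn.x)  # both entries >= 0
```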

@tjof2
Contributor

tjof2 commented Feb 20, 2017

Next big question: Documentation? Building on http://hyperspy.org/hyperspy-doc/current/user_guide/model.html#fitting-the-model-to-the-data - it would be good to have a brief summary of when linear is used and why.

(I'm well aware I was intending to update it myself as to why one should choose BFGS or CG etc... Blame the thesis!)

@thomasaarholt
Contributor Author

I would appreciate help on estimating the chisq and red_chisq parameters, as I don't really know anything about estimating errors with fitting.
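One common recipe for a plain, unweighted linear least-squares fit (a sketch of the standard formulas, not necessarily what HyperSpy should adopt): chisq is the sum of squared residuals, red_chisq divides it by the degrees of freedom n - p, and parameter standard errors come from the diagonal of red_chisq * inv(A.T @ A).

```python
import numpy as np

# Noisy linear data: y = 2x - 1 + noise
x = np.linspace(0, 10, 101)
A = np.column_stack([x, np.ones_like(x)])
rng = np.random.default_rng(1)
y = 2.0 * x - 1.0 + rng.normal(scale=0.1, size=x.size)

p, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ p

n, nparams = A.shape
chisq = float(resid @ resid)        # sum of squared residuals
red_chisq = chisq / (n - nparams)   # per degree of freedom; ~noise variance

# Parameter covariance and standard errors
cov = red_chisq * np.linalg.inv(A.T @ A)
std = np.sqrt(np.diag(cov))
print(p, std)
```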

@thomasaarholt
Contributor Author

thomasaarholt commented Feb 21, 2017

Proposed documentation here
Feel free to edit it, it will automatically save.

@tjof2
Contributor

tjof2 commented Feb 21, 2017

@thomasaarholt I added some paragraph breaks and a bit about MLE (which I think is worth mentioning)

@thomasaarholt
Contributor Author

Absolutely. I was hoping someone would notice my lack of mentioning MLE and feel like they should add it. Thanks :)

Currently the acronym used for MLE is "ml". Should we stick with MLE in the text, or change it? Could put "...typically the squared error term ('ls'), but can also use maximum likelihood estimation ('ml')...".

@tjof2
Contributor

tjof2 commented Feb 21, 2017

@francisco-dlp thoughts? MLE is more common as the acronym, I think, but leaving it as ('ml') is perhaps a good idea. As you suggest, @thomasaarholt, that's a good way to clarify.

@thomasaarholt
Contributor Author

Rebased from the linear_parameters branch/PR.

Member

@francisco-dlp francisco-dlp left a comment

Looks good except for the comments above and:

  • When the model includes convolution (e.g. eelsmodel with low-loss), a linear model remains linear but the component's data must be convolved with the low-loss.
  • Instead of using the linearity check only to raise an error, wouldn't it be useful to use it to automatically detect when the model is linear and use the linear fitter instead of nlls? Some may not be aware of the possibility of fitting a linear model faster, so if HyperSpy does it for them it'll be welcome, as it's just a couple of extra lines.

This could use the previous function to reduce typing.
self._set_p0()
deactivated_components = []
Member

deactivating components should be placed in a try/except/finally statement so that, if something goes wrong, the model doesn't end up in a funny state e.g.

try:
    # Deactivate components
    # do something
except:
    raise
finally:
    # reactivate components
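The same guarantee can also be packaged as a context manager (a sketch using a stand-in Component class; the names are illustrative, not HyperSpy API):

```python
from contextlib import contextmanager

class Component:
    """Stand-in for a model component with an `active` flag."""
    def __init__(self, active=True):
        self.active = active

@contextmanager
def components_deactivated(components):
    # Remember and clear the active flags...
    previous = [c.active for c in components]
    for c in components:
        c.active = False
    try:
        yield components
    finally:
        # ...and always restore them, even if fitting raised
        for c, was_active in zip(components, previous):
            c.active = was_active

comps = [Component(), Component(active=False)]
try:
    with components_deactivated(comps):
        raise RuntimeError("fit failed")
except RuntimeError:
    pass
print([c.active for c in comps])  # [True, False]
```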

Member

It looks that this method needs to be updated with #1453

@@ -838,6 +840,67 @@ def _model_function(self, param):
to_return = self.__call__(non_convolved=False, onlyactive=True)
return to_return

def get_raw_signal_for_linear_fitting(self):
Member

All the new model methods should be private as they aren't for user consumption.

signal_to_fit_to = signal_to_fit_to / self.signal.axes_manager[-1].scale
return signal_to_fit_to

def check_all_active_components_are_linear(self):
Member

This needs updating with the new Component.is_linear

return False
return True

def get_nonlinear_components(self):
Member

This could be a property, e.g. nonlinear_components. Also, why not a linear_components property? It would be useful for users to know what's making their model nonlinear.

break
return nonlinear

def check_and_set_linear_parameters_not_zero(self):
Member

Make it private

fitter = preferences.Model.default_fitter
if self.check_all_active_components_are_linear():
fitter = "linear"
else: fitter = preferences.Model.default_fitter
Member

Add new line to make it look better

signal_axis = self.axis.axis[np.where(self.channel_switches)]
component_data = np.array([component.function(signal_axis)
for component in self if len(component.free_parameters) > 0 and component.active])
print(len(component_data))
Member

Remove debugging print

output = linear(component_data.T, y)
fit_coefficients = output["x"]
self.p0 = tuple([oldp0*fit_coefficient for oldp0, fit_coefficient in zip(self.p0, fit_coefficients)])
self.fit_output = output
Member

What about std?

params, residuals, rank, singular = linear_lstsq(component_data.T, y)
fit_coefficients = params
self.fit_output = [params, residuals, rank, singular]
self.p0 = tuple([oldp0*fit_coefficient for oldp0, fit_coefficient in zip(self.p0, fit_coefficients)])
Member

What about std?

output = nnls(component_data.T, y)
fit_coefficients = output[0]
self.p0 = tuple([oldp0*fit_coefficient for oldp0, fit_coefficient in zip(self.p0, fit_coefficients)])
self.fit_output = output
Member

What about std?

@thomasaarholt
Contributor Author

Actually, hold off on checking this out. I'm making some more changes.

@joverbee

@thomasaarholt ,

Just out of curiosity, I was wondering how far you got with this. In EELSMODEL the linear fit has also changed a lot for core-loss fitting, but it is essential that you make the fine structure and the power law linear. In that case most core-loss tasks can be fully linearised, which leads to high speed and, more importantly, to a single convergence point with no local minima (very desirable).

For hints on the power law, check out this component
https://github.com/joverbee/eelsmodel/blob/master/src/components/fast_BG.cpp#L47

and the fine structure for linear fitting can be found here
https://github.com/joverbee/eelsmodel/blob/master/src/components/gdoslin.cpp

If these things are already in, all the better.

@thomasaarholt
Contributor Author

Hi @joverbee! I hit a bit of a dead-end with this.
After managing to support these various cases:

  • convolved vs non-convolved
  • linear and non-linear components
  • constant terms
  • components twinned with other components (never finished this)

...I realised that the least squares fitting was now faster than my linear approach. That's on a pixel-by-pixel fitting approach where the function is rerun on the spectra in each pixel.
At the same time, there was some bug that I couldn't pin down, and at that point I ran out of steam.

Now, feeding the fitting function a multidimensional array and fitting everything simultaneously would be a lot faster (and is what I was working towards), but since I had a bug somewhere, I didn't want to go ahead with the multi-pixel fitting. This would also mean adding a new method to the API, and I wasn't sure I wanted to go that route before I had something that actually worked properly on a given spectrum.

I may very well pick this up again at some point and look at it with fresh eyes, but for now it's in my backlog.

I did not consider the EELS fine structure in my approach. I'm not sure how it is modelled (I tried looking at your link, but it wasn't quite clear), since I've never needed it myself. I did bear in mind the power law, and was thinking of ways to handle it elegantly on a component-by-component basis when a particular component demands further calculation to make it linear (in this case, taking the log).

On the other hand: eelsmodel looks interesting! What is it, exactly? :) A readme file would clarify the repo a lot!

@thomasaarholt
Contributor Author

thomasaarholt commented Jun 18, 2018

I've also been meaning to clean up the (rather embarrassing) commit history on this, but haven't gotten round to it yet.

@joverbee

@thomasaarholt
EELSMODEL is a model based fitting program which formed the basis of much of the fitting that is going on in hyperspy (at least in my opinion) and the ideas behind it are explained in several Ultramicroscopy papers with my name amongst the authors.

I don't see how least squares can ever be faster than linear fitting, as in linear fitting you do one matrix inversion while in least squares you do an iterative update of many of them. If this is the case, then you are still doing something wrong. For EELSMODEL the gain is enormous, and the added benefit of a single convergence point is very important as well.

Sorry for the code in the fine structure part being a bit opaque, but the idea is this: give the model some extra shape freedom in the edge onset region of a Hartree-Slater model by -adding- a piecewise linear (or spline) function to a limited region near the onset. Together with the linear convolution with the low loss, this makes a very good description of the actual shape of EELS edges. In earlier times I had this fine structure -multiplied- with the atomic cross section, but even though it makes physical sense in terms of density of states, it makes the model nonlinear and forces us into nonlinear optimisation...

Hope this helps. I sympathise with your struggle with GitHub and committing to big projects; I am also far from proficient and got into big difficulties before which had nothing to do with the actual coding.

@thomasaarholt
Contributor Author

thomasaarholt commented Jun 18, 2018

To clarify: the actual inversion calculation is of course super fast. It is constructing all the components to fit that is slow. For each component in the model I need to check whether it is linear, whether it should be convolved, whether it has a constant term, etc. This essentially constitutes a for-loop which needs to run at every new pixel, and is a bit slower than I'd like.

The solution to this is calculating the components using parameter.map['values'] and similar, which return an array of a given component's parameter values across all pixels in the navigation space.

Then instead of performing the (slow) component calculation and the (fast) linear fitting for each pixel, I'll be calculating the component values only once, and linear fitting only once.

Thanks for the clarification on the EELS model!

@thomasaarholt
Contributor Author

@joverbee Would you mind clarifying how you linearise your (double) power law background?
Do you 1) just select some background region free of other components, and then take the log of the data and the power law and then just do linear fitting? How do you handle the double power law in this case?
Or 2) Do you do some pre-fitting of r1 and r2 and then keep them constant for a linear fitting?
Or do you do something else entirely? I see you do a partial derivative calculation for each variable in the double power law. How does that factor in?
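For a single power law, option 1 can be sketched as follows (an illustration of the log trick, not EELSMODEL's fast_BG approach): taking logs turns A*x**-r into log(A) - r*log(x), which is linear in log(A) and r. Note that a sum of two power laws does not linearise this way, which is presumably why the double power law needs a different treatment.

```python
import numpy as np

x = np.linspace(100.0, 500.0, 200)   # energy-loss axis (arbitrary units)
A_true, r_true = 5e4, 2.5
y = A_true * x ** -r_true

# log(y) = log(A) - r*log(x): linear in [log(A), r]
M = np.column_stack([np.ones_like(x), -np.log(x)])
(logA, r), *_ = np.linalg.lstsq(M, np.log(y), rcond=None)
print(np.exp(logA), r)  # close to 5e4 and 2.5
```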

Documentation 2.0. Thanks @tjof2

clearer docs

corrections to docs

Deleted some unnecessary lines from linear fitting docs
@thomasaarholt thomasaarholt force-pushed the Linear_fit branch 2 times, most recently from 5546fac to 20af8d4 on July 5, 2018
MODEL: Underscore bug

MODEL: default fitter no longer from prefs

MODEL: Working with twins but not in pseudocomponents

fix twin situation

fix twin situation

moved test files

removed old case for appending components

Change test case to use artificial data

add standard error

Remove old code

TEST: Add and remove test cases, and fix hydrogenic

Cleaned up linear fitting procedure, simplified linear model estimation procedure and formatted code

Removed unused code, cleaned up with pep8

moved error message

Rewrote component function calls to support multidimensional models

fixed small bugs with previous commits

more fixes to the previous, now working

made expression fix_free separation 2D-able

corrected shape

Add 2D model support for linear fitting

Added test cases for model2D linear fitting

model.py typo

hotfix for Zanetta

large multifit update - mostly working

need to fix expression multi

getting closer

 minor

new twin tactic

much changes

beginning of EELS CL support

further changes to eels

modified convolve. Nearly there

further GOS changes

hello

fixed Model2D
updated bleasdale to take parameters in function call

remove print

readied most comps for linear multifit

minor

fixing components and adding tests
remove unused code

docstring and PEP improvements

update docs

remove old lsq_linear reference

remove old import

fixed fit_component and signal_range
@dnjohnstone
Contributor

@thomasaarholt are you planning on finishing this? It would be quite handy!

@thomasaarholt
Contributor Author

thomasaarholt commented Jul 26, 2019

@dnjohnstone I was hoping to do so, but increasingly it looks like it's just a lot of work to get properly right and I haven't had the time to sit down and do it.

@ericpre's work adding function_nd methods to all components is really nice and definitely simplifies things, but at least as of the last update (July last year), there were a lot of chunks in hyperspy that needed to fall into place before the linear fitting was usable.

Some of the problem is that I kept expanding the scope as I went along, and so it turned into an ever-expanding workload. I still do want to finish it, because I had a lot of great discussions and help from various people that made this all very neat, but there are so many aspects to it that I'm worried that I would introduce errors that would go unnoticed for some time.

@dnjohnstone do you have a specific usecase in mind?

@francisco-dlp
Member

@thomasaarholt, do you think that this could be ready for the 1.6 release?

@francisco-dlp francisco-dlp mentioned this pull request Jun 8, 2020
@thomasaarholt
Contributor Author

I spent yesterday editing this PR to support linear fitting in its most basic form. I'm submitting a new PR on this on Saturday. That should be ready for 1.6.

I will then improve it by one or two PRs to support a faster implementation for certain cases (where the starting values of the components to be fit are the same across all navigation pixels), and finally as a "fit the whole navigation space at once if memory allows".

@francisco-dlp
Member

That's great.

(where the starting values of the components to be fit are the same across all navigation pixels),

I didn't get this. Am I wrong in thinking that the linear model shouldn't use starting parameters?

"fit the whole navigation space at once if memory allows".

It'll be desirable to get this for this release too. I'll be more available from next week and, if you like, I can try to help with this one.

@thomasaarholt
Contributor Author

Regarding your question - you're sorta correct, but the linear parameters need to be nonzero at the beginning in order to generate a component that isn't just zeroes.

At the moment I check each pixel to make sure the parameters aren't zero, and if so set it to one before fitting.

I could just set all values to be one before beginning the fitting.

@francisco-dlp
Member

There may be no need to do that, you could use the _f() method of the Expression component to call the function with only the linear parameter set to 1.

@thomasaarholt
Contributor Author

There may be no need to do that, you could use the _f() method of the Expression component to call the function with only the linear parameter set to 1.

Yes, I believe you are right. I'll incorporate that.

@thomasaarholt
Contributor Author

@francisco-dlp something to chew on for after the "basic linear fit" PR I'm working on is finished:
If a user performs m.multifit(fitter='linear'), it would be nice to fit the entire navigation space simultaneously if there is memory for it.

If the dataset is too big, one would fit on a pixel-by-pixel method.

However, I think that estimating whether one has enough memory to linear-fit the entire dataset in one go can either be fragile or overestimate the amount of memory needed. Instead one could have a separate fitter='linear_simultaneous'.

What do you think?

@thomasaarholt
Contributor Author

Hm, how about if we always try to fit simultaneously with linear, unless there's an argument like "fit_pixel_by_pixel" (not that name :p)? If we run out of memory, we try to catch that and print a message informing the user about the above argument.
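That fallback logic could look roughly like this (all names hypothetical, just to illustrate catching MemoryError):

```python
def fit_whole_map(model):
    """Hypothetical: one vectorised solve over all navigation pixels."""
    raise MemoryError  # pretend the full design matrix didn't fit in RAM

def fit_pixel_by_pixel(model):
    """Hypothetical: loop over pixels, one small solve each."""
    return "pixel-by-pixel"

def multifit_linear(model):
    try:
        return fit_whole_map(model)
    except MemoryError:
        print("Not enough memory to fit all pixels at once; "
              "falling back to pixel-by-pixel fitting.")
        return fit_pixel_by_pixel(model)

result = multifit_linear(None)
print(result)  # pixel-by-pixel
```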

@thomasaarholt thomasaarholt mentioned this pull request Jun 15, 2020
@thomasaarholt
Copy link
Contributor Author

Closed in favour of #2422.

@francisco-dlp francisco-dlp removed this from the v1.6 milestone Aug 4, 2020