Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Pearson4 fit model (original Wikipedia version) #800

Merged
merged 4 commits into from
Sep 16, 2022

Conversation

lellid
Copy link
Contributor

@lellid lellid commented Aug 8, 2022

Description

This adds the Pearson4 fit model (original version from Wikipedia), which is a skewed version of Pearson7.

Type of Changes
  • Bug fix
  • New feature
  • Refactoring / maintenance
  • Documentation / examples
Tested on

Python: 3.10.4 | packaged by conda-forge | (main, Mar 30 2022, 08:38:02) [MSC v.1916 64 bit (AMD64)]
lmfit: 0.0.post2625+g0b9947a.d20220808, scipy: 1.9.0, numpy: 1.23.1, asteval: 0.9.27, uncertainties: 3.1.7

Verification

Have you

  • included docstrings that follow PEP 257?
  • verified that existing tests pass locally?
  • verified that the documentation builds locally?
  • squashed/minimized your commits and written descriptive commit messages?
  • added or updated existing tests to cover the changes? (not neccessary, new model))
  • updated the documentation and/or added an entry to the release notes (doc/whatsnew.rst)?
  • added an example? (not neccessary, only a model)

@newville
Copy link
Member

newville commented Aug 9, 2022

@lellid Thanks! This looks like a very good start to me, and I'll try to take a more careful look at it. FWIW, I didn't understand the failures (files were modified by isort?). Maybe @reneeotten can comment on that...

@reneeotten
Copy link
Contributor

FWIW, I didn't understand the failures (files were modified by isort?). Maybe @reneeotten can comment on that...

I can, the changes/additions to import packages do not follow the guidelines of the pre-commit hooks we use. That's something that could be verified locally before submitting a PR, but in short the diff for lineshapes.py in the imports should read like:

-from numpy import (arctan, copysign, cos, exp, isclose, isnan, log, pi, real,
-                   sin, sqrt, where)
+from numpy import (arctan, copysign, cos, exp, isclose, isnan, log, log1p, pi,
+                   real, sin, sqrt, where)
+from scipy.special import betaln as betalnfcn
 from scipy.special import erf, erfc
 from scipy.special import gamma as gamfcn
+from scipy.special import loggamma as loggammafcn
 from scipy.special import wofz

lmfit/models.py Outdated Show resolved Hide resolved
Copy link
Contributor

@reneeotten reneeotten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the PR @lellid - a few small comments after a quick look through

@newville
Copy link
Member

@reneeotten Ah, thanks -- "imports out of preferred order"!

@reneeotten
Copy link
Contributor

reneeotten commented Aug 10, 2022

@reneeotten Ah, thanks -- "imports out of preferred order"!

well... the main thing is you're not supposed to have several "import ..... as ....." on one line as was done in this PR:
from scipy.special import gamma as gamfcn, loggamma as loggammafcn, betaln as betalnfcn

- add tests
- improve documentation
- fix flake8 complainings
@lellid
Copy link
Contributor Author

lellid commented Aug 15, 2022

Hi,

I just updated this PR. I

  • added tests (code coverage is not complaining any more)
  • put the import statement in lineshapes.py on separate lines
  • improved the documentation in models.py
  • added Wiki link in lineshapes.py

@newville
Copy link
Member

newville commented Sep 4, 2022

@lellid @reneeotten if I understand correctly, we're all in favor of merging this, except for some "not-quite-as-beautiful-as-we-like" import statements. Please fix those so we can merge.

@newville newville mentioned this pull request Sep 4, 2022
11 tasks
@lellid
Copy link
Contributor Author

lellid commented Sep 5, 2022

I used iSort to sort the using statements, so now it passes the tests.

@newville
Copy link
Member

newville commented Sep 5, 2022

@lellid Thanks! I'll plan to merge in a couple of days to give time for final comments or suggestions.

lmfit/models.py Outdated Show resolved Hide resolved
@reneeotten
Copy link
Contributor

@lellid Thanks! I'll plan to merge in a couple of days to give time for final comments or suggestions.

thanks @lellid - for me the two comments about the docstrings (I left them again above) remain, after making these changes - or explaining why I'm wrong ;) - the commits should be squashed into one - then it's ready-to-go!

@newville
Copy link
Member

newville commented Sep 7, 2022

@reneeotten @lellid Yes, let's please have consistent code and docstrings! I think we are happy to merge this, but we kind of need a version we can support, which does include being able to explain the actual definition used. We discussed this earlier at #799, but this PR was created as a separate PR after that discussion.

To reiterate, I strongly suggest following the Wikipedia definition or having a really clearly documented explanation of why that form was used instead of the Wikipedia definition.

@lellid
Copy link
Contributor Author

lellid commented Sep 7, 2022

Hi,

I consolidated the duplication of the docstring.

To the other question about the Wikipedia definition and my implementation:

It is simply a numerical thing:

Of course I can use a naive implementation of $(1+arg^2)^{-m}$, but guess what happens if m is 1000 and arg is 10? Underflow. Or for instance, what is $Gamma(1000)$? Overflow! Finally, both terms together would give some reasonable result, but first, you get underflow multiplied by overflow.

So I logarithmize everything, and put all terms then in the $\exp$ function. For instance, $(1+arg^2)^{-m} = \exp(-m \times \log(1+arg^2)) = \exp(-m \times log1p(arg^2))$

I have seen other rather naive implementations in your code (for instance, Pearson7). Maybe, in the future, I will make some pull requests to correct that.

@reneeotten
Copy link
Contributor

I consolidated the duplication of the docstring.

thanks!

To the other question about the Wikipedia definition and my implementation:

It is simply a numerical thing:

Of course I can use a naive implementation of (1+arg2)−m, but guess what happens if m is 1000 and arg is 10? Underflow. Or for instance, what is Gamma(1000)? Overflow! Finally, both terms together would give some reasonable result, but first, you get underflow multiplied by overflow.

So I logarithmize everything, and put all terms then in the exp function. For instance, (1+arg2)−m=exp⁡(−m×log⁡(1+arg2))=exp⁡(−m×log1p(arg2))

That might be clear to you since you've probably stared at that equation for a while and looked at the math, but from a first read it wasn't to me. I also haven't checked that this is all implemented correctly, but I assume that you did... Of course all of this could be considered a "me problem" but perhaps you could just add a note to the function in lineshapes.py to mention it; just so that one doesn't need to go back to this PR in a few months/years to remember this ;)

I have seen other rather naive implementations in your code (for instance, Pearson7). Maybe, in the future, I will make some pull requests to correct that.

Well, naive or not it appears to have been working well for people for quite some time - of course improvements are always welcome.

Thanks again @lellid - this PR should be almost at the finish line.

@lellid
Copy link
Contributor Author

lellid commented Sep 8, 2022

Hi,

what I was really wondering about was why you don't have any tests for the function values in your code? At least I haven't seen any. Would make those numerical tweakings a lot more secure, if you can be sure that the values come out that you expect.

@newville
Copy link
Member

newville commented Sep 8, 2022

@lellid

I consolidated the duplication of the docstring.

OK, thanks. Well, at some point we might fix the capital "I" to "j" (Python's preferred sqrt(-1)).

To the other question about the Wikipedia definition and my implementation:

It is simply a numerical thing:

Of course I can use a naive implementation of , but guess what happens if m is 1000 and arg is 10? Underflow. Or for instance, what is ? Overflow! Finally, both terms together would give some reasonable result, but first, you get underflow multiplied by overflow.

So I logarithmize everything, and put all terms then in the function

These functions are kind of prone to under/overflow. I would not necessarily agree that it is obvious that taking log and then doing exp of that will help in general, but that's to say that it would not help here...

I have seen other rather naive implementations in your code (for instance, Pearson7). Maybe, in the future, I will make some pull requests to correct that.

Do you mean to use scipy.special.beta instead of using the equivalent with gamma? Yeah, that would be okay.

But, I'm not sure that the general approach of "work in log space and then take exponential" is necessarily worth the complication. Readability definitely counts.

@newville
Copy link
Member

newville commented Sep 8, 2022

@lellid

what I was really wondering about was why you don't have any tests for the function values in your code?

Probably because we feel like we don't need those specifically.

The tests of fitting with these functions do test the values of parameters, so if the functions break, that will very likely be reflected in the fit results.

At least I haven't seen any. Would make those numerical tweakings a lot more secure, if you can be sure that the values come out that you expect.

Well, I guess we expect that we are not changing the lineshape definitions very much ;)

@newville
Copy link
Member

@reneeotten @lellid I'm just going to squash and merge this. If there are more issues to resolve, they can be done afterward.

@newville newville merged commit 97ea5f1 into lmfit:master Sep 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants