generalized linear_regression #1716

Closed: ThomasLecocq wants to merge 26 commits into base: master

6 participants
@ThomasLecocq
Contributor

ThomasLecocq commented Mar 13, 2017

This PR adds a generalized linear regression to obspy.signal. It supports ordinary least squares (OLS) and weighted least squares (WLS), each with or without an intercept.

PR Checklist

  • All tests still pass.
  • Any new features or fixed regressions are covered via new tests.
  • Any new or changed features are fully documented.
  • Significant changes have been added to CHANGELOG.txt.
  • First time contributors have added their name to CONTRIBUTORS.txt.

TODO

  • Test

ThomasLecocq added some commits Mar 13, 2017

Contributor

Jollyfant commented Mar 13, 2017

I'm just curious: does OLS stand for orthogonal least squares (which is for bivariate data) or ordinary least squares, i.e. the classical approach?

Contributor

ThomasLecocq commented Mar 13, 2017

Ordinary least squares, plus the weighted version.

The goal is to remove the need for statsmodels in my MSNoise routines that are moving into ObsPy.
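As a point of reference for the OLS/WLS distinction above, here is a minimal NumPy-only sketch. It is not the code from this PR: it uses `np.polyfit`, whose `w` argument multiplies the unsquared residuals, and the data and weights are made-up values.

```python
import numpy as np

# Synthetic data lying exactly on y = 2*x + 1 (illustrative values only).
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0

# Ordinary least squares: every sample counts equally.
ols_slope, ols_intercept = np.polyfit(x, y, 1)

# Weighted least squares: `w` scales the unsquared residuals, so a larger
# weight means the sample is trusted more.
w = np.linspace(1.0, 2.0, x.size)
wls_slope, wls_intercept = np.polyfit(x, y, 1, w=w)
```

Because the data are noise-free, both fits recover the same line; with noisy data the weighted fit follows the heavily weighted samples more closely.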

ThomasLecocq added some commits Mar 13, 2017

barsch Mar 13, 2017

Member

weight

Contributor

ThomasLecocq commented on 682042e Mar 13, 2017

raaaah, indeed !

ThomasLecocq added some commits Mar 13, 2017

ThomasLecocq referenced this pull request Mar 13, 2017

Feature "Moving Window Cross-Spectrum" #1719 (open, 4 of 6 tasks complete)

ThomasLecocq added this to the 1.1.0 milestone Mar 13, 2017

Contributor

ThomasLecocq commented Mar 14, 2017

good to go once reviewed by you guyz :)

obspy/signal/regression.py (outdated diff)
    1/sigma.
    :param p0: Initial guess for the parameters. If None, then the initial
    values will all be 0 (Different from SciPy where all are 1)
    :param intercept: If False: solves y=a*x ; if True: solves y=a*x+b.

@Jollyfant

Jollyfant Mar 14, 2017

Contributor

It is semantics, but maybe we should call this floating_intercept instead of intercept (or anchor_intercept, but then the boolean would need to flip), because in one case you are anchoring the regression to the origin. Both regressions naturally have an intercept, and this threw me off initially.

CodeCanary fails but that is unrelated. Other than that I can approve. Nice tests!

@Jollyfant

Jollyfant Mar 14, 2017

Contributor

constant to me says y = b so it is not a good candidate IMHO.

@ThomasLecocq

ThomasLecocq Mar 14, 2017

Contributor

or invert the arg: force_y_intercept_through_origin ... I don't know which one is easier to understand...

@Jollyfant

Jollyfant Mar 14, 2017

Contributor

Sure intercept_origin works if flipped.

@ThomasLecocq

ThomasLecocq Mar 14, 2017

Contributor

Perfect ! I agree on that, it'll be easier to understand !
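To make the agreed semantics concrete, here is a minimal sketch of how such a flag could switch between the two models. `fit_line` is a hypothetical helper built on `np.linalg.lstsq`, not the function from this PR: `intercept_origin=True` fits `y = a*x` through the origin, `False` fits `y = a*x + b`.

```python
import numpy as np


def fit_line(x, y, intercept_origin=True):
    """
    Fit a straight line by ordinary least squares.

    Hypothetical helper, only meant to illustrate the flag semantics.

    :param intercept_origin: If ``True``, fit ``y = a*x``; if ``False``,
        fit ``y = a*x + b``.
    :returns: ``(a,)`` or ``(a, b)`` depending on the flag.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if intercept_origin:
        # One column: only the slope is a free parameter.
        design = x[:, np.newaxis]
    else:
        # Extra column of ones so that the intercept is fitted as well.
        design = np.column_stack([x, np.ones_like(x)])
    params = np.linalg.lstsq(design, y, rcond=None)[0]
    return tuple(float(p) for p in params)
```

Calling `fit_line(x, y)` returns a one-element tuple, while `fit_line(x, y, intercept_origin=False)` returns slope and intercept, which mirrors the two return shapes described in the docstring under review.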

Contributor

ThomasLecocq commented Mar 14, 2017

ah and I don't know how to trigger the doc bot ? @megies

Contributor

Jollyfant commented Mar 14, 2017

+DOCS

Contributor

ThomasLecocq commented Mar 24, 2017

all good, OK for you boyz ? @krischer @megies

obspy/signal/regression.py (outdated diff)
def linear_regression(xdata, ydata, weights=None, p0=None,
                      intercept_origin=True):
    """ Use linear least squares to fit a function, f, to data. This method

@QuLogic

QuLogic Mar 24, 2017

Member

First line should be standalone:

"""
Use linear least squares to fit a function, f, to data.

This method is a ....

ThomasLecocq added some commits Mar 27, 2017

Update regression.py
docstring
@megies

Some minor docstring changes, otherwise this is good to go, I think (without having tried it out).

obspy/signal/regression.py (outdated diff)
def linear_regression(xdata, ydata, weights=None, p0=None,
                      intercept_origin=True):
    """ Use linear least squares to fit a function, f, to data.

@megies

megies Mar 28, 2017

Member

Please stick to our docstring conventions.. :-)

  • three double quotes
  • short one-line (if possible) summary
  • blank line
  • more details
def xyz(...):
    """
    Short summary on first line

    More detailed information,
    multiple lines etc.
    """
obspy/signal/regression.py (outdated diff)
def linear_regression(xdata, ydata, weights=None, p0=None,
                      intercept_origin=True):
    """ Use linear least squares to fit a function, f, to data.

@megies

megies Mar 28, 2017

Member

-> Use least squares to fit a linear function to data. (letter f as function name is not used otherwise)

@ThomasLecocq

ThomasLecocq Mar 28, 2017

Contributor

ok

obspy/signal/regression.py (outdated diff)
    :meth:`scipy.optimize.minpack.curve_fit`; allowing for Ordinary Least
    Square and Weighted Least Square regressions:
    * OLS with origin intercept : ``linear_regression(xdata, ydata)``

@megies

megies Mar 28, 2017

Member

"origin intercept" sounds awkward, no? Maybe OLS through origin or OLS without intercept, .. or something else..?

@ThomasLecocq

ThomasLecocq Mar 28, 2017

Contributor

+1 for through origin

obspy/signal/regression.py (outdated diff)
    uncertainties are assumed to be 1. In SciPy vocabulary, our weights are
    1/sigma.
    :param p0: Initial guess for the parameters. If None, then the initial
    values will all be 0 (Different from SciPy where all are 1)

@megies

megies Mar 28, 2017

Member

I guess there is a good reason for this deviation from the SciPy defaults, with MSNoise in the back of your mind? But in general it might be worth considering sticking to the SciPy defaults.

@ThomasLecocq

ThomasLecocq Mar 28, 2017

Contributor

it's mostly because it's soooo complicated in scipy/statsmodels ... if we allow "sigma": then we should state "if you want to pass weights, then you should pass 1/weights"... :(
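A minimal sketch of the conversion being discussed, assuming the fit is delegated to `scipy.optimize.curve_fit` as the docstring suggests. The model function `line`, the data and the weights are illustrative assumptions; the point is only that weights in this convention become `sigma = 1 / weights`, and that the standard errors come from the diagonal of the returned covariance matrix.

```python
import numpy as np
from scipy.optimize import curve_fit


def line(x, a, b):
    # Straight-line model with a free intercept (illustrative).
    return a * x + b


# Made-up noisy data around y = 3*x - 2 and made-up weights.
rng = np.random.RandomState(0)
x = np.arange(1.0, 11.0)
y = 3.0 * x - 2.0 + rng.normal(scale=0.1, size=x.size)
weights = np.linspace(1.0, 2.0, x.size)

# curve_fit expects uncertainties, so weights are passed as 1 / weights.
sigma = 1.0 / weights
p, cov = curve_fit(line, x, y, p0=[0.0, 0.0], sigma=sigma)

slope, intercept = p
std_slope, std_intercept = np.sqrt(np.diag(cov))
```

With this convention a larger weight means a smaller assumed uncertainty, which is why the inversion has to happen somewhere: either inside the function, or in the caller's head.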

obspy/signal/regression.py (outdated diff)
    :param xdata: The independent variable where the data is measured.
    :param ydata: The dependent data - nominally f(xdata, ...)
    :param weights: If not None, the uncertainties in the ydata array. These
    are used as weights in the least-squares problem. If None, the

@megies

megies Mar 28, 2017

Member
  • please use 4 spaces as indents on follow-up lines in parameter specs..
  • builtins / literals (like None) should be enclosed in two backticks or monospaced font in our docs
    :param weights: If not ``None``, the uncertainties in the ydata array. These
        are used as weights in the least-squares problem. If None, the
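
Put together, the conventions requested in this review (summary line after the opening quotes, blank line, 4-space indents on follow-up lines, double backticks around literals) might look like this. The function `scale` is a made-up example, not part of the PR.

```python
def scale(values, factor=None):
    """
    Multiply every value by a constant factor.

    Hypothetical example, only meant to show the docstring layout.

    :param values: Sequence of numbers to scale.
    :param factor: If not ``None``, the multiplier applied to each value. If
        ``None``, the values are returned unchanged (same as a factor
        of ``1``).
    :rtype: list
    :returns: The scaled values.
    """
    if factor is None:
        factor = 1
    return [v * factor for v in values]
```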

@ThomasLecocq

ThomasLecocq Mar 28, 2017

Contributor

ok

obspy/signal/regression.py (outdated diff)
                      intercept_origin=True):
    """ Use linear least squares to fit a function, f, to data.
    This method is a generalized version of
    :meth:`scipy.optimize.minpack.curve_fit`; allowing for Ordinary Least

@megies

megies Mar 28, 2017

Member

This should be :func:`...`, I think. In any case, the link doesn't work: http://docs.obspy.org/pull-requests/1716/packages/autogen/obspy.signal.regression.linear_regression.html

@ThomasLecocq

ThomasLecocq Mar 28, 2017

Contributor

ok

obspy/signal/regression.py (outdated diff)
    1/sigma.
    :param p0: Initial guess for the parameters. If None, then the initial
    values will all be 0 (Different from SciPy where all are 1)
    :param intercept_origin: If True: solves y=a*x (default); if False:

@megies

megies Mar 28, 2017

Member

Again, enclose formulas etc in two backticks, please.

@ThomasLecocq

ThomasLecocq Mar 28, 2017

Contributor

ok

slope, intercept = p
std_slope = np.sqrt(cov[0, 0])
std_intercept = np.sqrt(cov[1, 1])
return slope, intercept, std_slope, std_intercept

@megies

megies Mar 28, 2017

Member

Python modules in our repo usually have the doctest footer. Even though there are no doctests currently, it would not hurt to add it.

@ThomasLecocq

ThomasLecocq Mar 28, 2017

Contributor

euh ? what's that ?

@ThomasLecocq

ThomasLecocq Mar 28, 2017

Contributor

ok boss :)
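
For context, a sketch of what such a footer typically looks like: a `doctest` hook at the bottom of the module that runs any examples found in docstrings when the file is executed directly. The function `double` is a placeholder, and the exact form used in the ObsPy repository is assumed here rather than quoted.

```python
def double(value):
    """
    Return twice the given value.

    >>> double(21)
    42
    """
    return 2 * value


if __name__ == '__main__':
    # Run all docstring examples in this module; modules without any
    # examples are skipped quietly.
    import doctest
    doctest.testmod(exclude_empty=True)
```

Even with no doctests in the module yet, the hook is harmless and means examples added later are picked up automatically.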

obspy/signal/regression.py (outdated diff)
    :rtype: tuple
    :returns: (slope, std_slope) if `intercept_origin` is `True`;
    (slope, intercept, std_slope, std_intercept) if `False`.

@megies

megies Mar 28, 2017

Member

again.. two backticks for monospace -> "... if ``intercept_origin=True`` ..."

@ThomasLecocq

ThomasLecocq Mar 28, 2017

Contributor

ok

obspy/signal/regression.py (outdated diff)
def linear_regression(xdata, ydata, weights=None, p0=None,
                      intercept_origin=True):

@megies

megies Mar 28, 2017

Member

I find the naming intercept_origin not very intuitive, maybe rename that option to allow_intercept?

@ThomasLecocq

ThomasLecocq Mar 28, 2017

Contributor

well, see above: that is where we started, and we arrived at intercept_origin :(

ThomasLecocq added some commits Mar 28, 2017

docstring
+DOCS
Merge remote-tracking branch 'origin/linear_regression' into linear_regression2

Conflicts:
	obspy/signal/regression.py
Contributor

ThomasLecocq commented Mar 29, 2017

@megies : all done

@krischer

Some final comments. Sorry for joining the discussion so late, I was a bit short on time. After these it's IMHO good to merge :)

If you have no time right now, please let me know and I'll do it.

obspy/signal/regression.py (outdated diff)
# Email: Thomas.Lecocq@seismology.be
#
# Copyright (C) 2017 Thomas Lecocq
# --------------------------------------------------------------------

@krischer

krischer Mar 29, 2017

Member

Can you merge that information with the copyright header right below it? These two also kind of contradict each other.

@ThomasLecocq

ThomasLecocq Mar 29, 2017

Contributor

ok, I actually followed the header in trigger.py :)

obspy/signal/regression.py (outdated diff)
from future.builtins import *  # NOQA
import scipy.optimize
import numpy as np

@krischer

krischer Mar 29, 2017

Member

Please put numpy before scipy.

@ThomasLecocq

ThomasLecocq Mar 29, 2017

Contributor

ok

obspy/signal/regression.py (outdated diff)
def linear_regression(xdata, ydata, weights=None, p0=None,
                      intercept_origin=True):
    """ Use least squares to fit a linear function to data.

@krischer

krischer Mar 29, 2017

Member

Can you do

"""
Use least squares...

with a newline after the triple quotes

@ThomasLecocq

ThomasLecocq Mar 29, 2017

Contributor

ok

obspy/signal/regression.py (outdated diff)
    (slope, intercept, std_slope, std_intercept) if ``False``.
    """
    if weights is not None:
        sigma = 1./weights

@krischer

krischer Mar 29, 2017

Member

Add spaces around / - not sure why flake8 does not complain.

@ThomasLecocq

ThomasLecocq Mar 29, 2017

Contributor

ok

ThomasLecocq added some commits Mar 29, 2017

Merge remote-tracking branch 'origin/linear_regression' into linear_regression

Conflicts:
	obspy/signal/regression.py
Member

megies commented Mar 30, 2017

@ThomasLecocq did you forget to add a test file? http://tests.obspy.org/75817/#1

But on the other hand it sounds like the file is related to #1719? Or are you using the same test file in both PRs?

Contributor

ThomasLecocq commented Mar 30, 2017

F@#{|^|^[#{^@[[|@#{[@#^[{#[^# git stuff.... my two PRs got crappy mixed together. so this one here shouldn't have the MWCS stuff... pfffr.....
what's the easiest? I create a new branch of current master, cherry pick changes for linear_regression and PR those ?

Member

krischer commented Mar 30, 2017

I create a new branch of current master, cherry pick changes for linear_regression and PR those ?

Sounds like a good way to resolve the mess.

Member

megies commented Mar 30, 2017

F@#{|^|^[#{^@[[|@#{[@#^[{#[^# git stuff.... my two PRs got crappy mixed together. so this one here shouldn't have the MWCS stuff... pfffr.....
what's the easiest? I create a new branch of current master, cherry pick changes for linear_regression and PR those ?

Need help? I can fix it if need be by squashing everything together; I don't think we need any granularity here.

Contributor

ThomasLecocq commented Apr 1, 2017

ok I will do that tomorrow, easier...

Member

krischer commented Apr 7, 2017

Continued in #1747.

krischer closed this Apr 7, 2017

QuLogic added the duplicate label Apr 12, 2017
