Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Carljv improved lowess rebased2 #856

Merged

Conversation

josef-pkt
Copy link
Member

rebased version of the cython lowess version PR #266

I left the old lowess in there as smoothers_lowess_old.py, just in case

unit tests run without errors or failures

Added a wrapper function to check input (type casting), optional sorting, ...
see comments below

@josef-pkt
Copy link
Member Author

rebased again and force pushed (includes update to .travis.yml)

let's see if we get TravisCI on py3

@josef-pkt
Copy link
Member Author

travisCI is now green on both py 2.7 and py 3.2

I just saw that the cython function doesn't have a pop-up signature, that's annoying because I had to look at the source to see what argument determines the bandwidth choice (frac).

@josef-pkt
Copy link
Member Author

some usage limitations in the design

I was thinking whether we can attach a pandas index to the returned array.

The problem is that we don't know the internal reindexing.

  1. lowess drops not isfinite() observations
  2. lowess sorts the arrays by exog

as a consequence the returned arrays don't have the same index order (and might be shorter) than the original arrays.
This doesn't matter if the arrays are already sorted and complete, but then we don't need to check for not finite and don't need to sort. (use case plotting residual for example)
This might not matter if we just plot and matplotlib reorders the plot line by exog (x-axis)

However, if we want to compare the returned array with the original array (and align them), then in the current implementation, we would have to redo the isfinite check and the argsort.

Note: the old version was doing the same thing, but now it will be inside the compiled code.

# same length.
if exog.ndim != 1:
raise ValueError('exog must be a vector')
if endog.ndim != 1:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ndim checks are redundant, since the signature already requires it in the compiled version.

@josef-pkt
Copy link
Member Author

Pretty much finished, except for dropping the x from the returned array.

@josef-pkt
Copy link
Member Author

changing the return shape and dropping x doesn't break anything in the graphics test suite.
I thought we have an option to add a lowess line to a regression diagnostics plot

add_lowess in graphics doesn't have unit tests

@josef-pkt
Copy link
Member Author

I didn't remember the regression_plots correctly. We are not sorting in the partial_residual plot for example (ex_regression_plots.py). That means we still need to sort the lowess curve to get the correct matplotlib lines.

only regressionplots.plot_fit sorts by the exog that is on the x-axis of the plot.

need to rethink the interface again. return_sorted=True to get old behavior?

@josef-pkt
Copy link
Member Author

I'm back to returning the sorted array by default as before.
This behavior is more convenient for pure plotting purposes. (The only thing I don't like is the column_stack instead of returning 2 arrays)

The new behavior, when we change options from the default, is more useful when we consider lowess as a robust local linear smoother, and we want to interpret the fittedvalues in the same way as in the regression models.
The overhead can now also be reduced if we want to do cross-validation or similar.

@josef-pkt
Copy link
Member Author

I think I'm done.

@josef-pkt
Copy link
Member Author

rebased and force pushed, (maybe shouldn't have done it)

ready for merge, when TravisCI is green

@josef-pkt
Copy link
Member Author

TravisCI doesn't start, merging anyway

josef-pkt added a commit that referenced this pull request Jun 22, 2013
@josef-pkt josef-pkt merged commit 76c7f0b into statsmodels:master Jun 22, 2013
@josef-pkt josef-pkt mentioned this pull request Jun 22, 2013
@josef-pkt josef-pkt deleted the carljv_improved-lowess_rebased2 branch June 22, 2013 18:36
PierreBdR pushed a commit to PierreBdR/statsmodels that referenced this pull request Sep 2, 2014
…ess_rebased2

Carljv improved cythonized lowess closes statsmodels#266
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants