Feature request: remove output messages, restricted result range #3

vkhodygo · 2018-03-23T16:33:07Z

Hi again!

Sorry for disturbing, but I would like to ask you a few more things.
First, is it possible to add a key parameter which turns off all the output messages (except errors)?
I have to deal with massive numbers of files and I need to see my own print messages.

Second, can your algorithm in general use boundaries for resulting parameters? Say, I know in general, that the values of slopes a priori lie in range [0:2], thus, the result has to be in the range.

Sincerely,
V.

cjekel · 2018-03-24T13:10:45Z

First, is it possible to add a key parameter which turns off all the output messages (except errors)?
I have to deal with massive numbers of files and I need to see my own print messages.

you can use something like

fit(4,disp=False)

to fit to four line segments while turning off the optimization output...

What often gets printed are numpy warnings, e.g. divide by zero. See https://stackoverflow.com/questions/14463277/how-to-disable-python-warnings for how to disable warnings in Python.
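As a minimal sketch of that suggestion, the standard `warnings` module can silence `RuntimeWarning` only around a chosen call. The helper `run_quietly` and the toy function `noisy_ratio` are hypothetical names used for illustration; they are not part of pwlf.

```python
import warnings
import numpy as np

def run_quietly(func, *args, **kwargs):
    """Call func while ignoring RuntimeWarning, then restore the old filters."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=RuntimeWarning)
        return func(*args, **kwargs)

def noisy_ratio(a, b):
    # 0/0 on numpy arrays normally emits "RuntimeWarning: invalid value ..."
    return np.asarray(a, dtype=float) / np.asarray(b, dtype=float)

# The second element is 0/0, which yields nan without printing a warning here.
ratio = run_quietly(noisy_ratio, [1.0, 0.0], [2.0, 0.0])
```

Because the filter is scoped to the `with` block, warnings raised elsewhere in the script are still shown.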

Does this help?

The only print() in the code is

print(res)

which displays the optimization results.

Would you like a keyword to turn this off?

Second, can your algorithm in general use boundaries for resulting parameters? Say, I know in general, that the values of slopes a priori lie in range [0:2], thus, the result has to be in the range.

The fit doesn't solve for the slopes, it actually solves for the y locations provided x break points. From this solution the slopes can be calculated.
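To illustrate the last point, here is a small sketch of how slopes follow from the break point locations: each slope is the rise over the run between two neighbouring break points. The function name `slopes_from_breaks` and the numbers are made up for the example and are not taken from pwlf.

```python
import numpy as np

def slopes_from_breaks(x_breaks, y_breaks):
    """Slope of each segment from the (x, y) locations of its end points."""
    x_breaks = np.asarray(x_breaks, dtype=float)
    y_breaks = np.asarray(y_breaks, dtype=float)
    # np.diff gives the run (x) and the rise (y) of every segment
    return np.diff(y_breaks) / np.diff(x_breaks)

# Two segments: (0, 0) -> (1, 2) and (1, 2) -> (3, 3)
slopes = slopes_from_breaks([0.0, 1.0, 3.0], [0.0, 2.0, 3.0])
```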

I'm working on implementing fixing the x,y locations at the boundaries (beginning and end). There have been a few people who have requested this, but it's not ready (or working) at the moment.

vkhodygo · 2018-03-24T15:13:34Z

you can use something like

fit(4,disp=False)

No, I still get messages like

fun: 4.895025824377757e-07
message: 'Optimization terminated successfully.'
nfev: 393
nit: 12
success: True
x: array([2.6693644 , 3.32241918])
And when I try to use fitfast, it becomes even worse (especially with 4 cores running)

What often gets printed are numpy warnings, e.g. divide by zero.

I observe some messages, indeed, they look like this:

RuntimeWarning: invalid value encountered in double_scalars
  A[i,i] = A[i,i] + sum(((sepDataX[i] - breaks[i+1]) ** 2)) / ((breaks[i+1] - breaks[i]) ** 2)
/home/user/.local/lib/python3.6/site-packages/pwlf/pwlf.py:248: RuntimeWarning: invalid value encountered in double_scalars
  A[i,i+1] = A[i,i+1] - sum((sepDataX[i] - breaks[i]) * (sepDataX[i] - breaks[i+1])) / ((breaks[i+1] - breaks[i]) ** 2)
/home/user/.local/lib/python3.6/site-packages/pwlf/pwlf.py:249: RuntimeWarning: invalid value encountered in double_scalars
  B[i] = B[i] + (-sum(sepDataX[i] * sepDataY[i]) + breaks[i+1] * sum(sepDataY[i])) / (breaks[i+1] - breaks[i])
/home/user/.local/lib/python3.6/site-packages/pwlf/pwlf.py:241: RuntimeWarning: invalid value encountered in double_scalars
  A[i,i-1] = A[i,i-1] - sum((sepDataX[i-1] - breaks[i-1]) * (sepDataX[i-1] - breaks[i])) / ((breaks[i] - breaks[i-1]) ** 2)
/home/user/.local/lib/python3.6/site-packages/pwlf/pwlf.py:242: RuntimeWarning: invalid value encountered in double_scalars
  A[i,i] = A[i,i] + sum((sepDataX[i-1] - breaks[i-1]) ** 2) / ((breaks[i] - breaks[i-1]) ** 2)
/home/user/.local/lib/python3.6/site-packages/pwlf/pwlf.py:243: RuntimeWarning: invalid value encountered in double_scalars
  B[i] = B[i] + (sum(sepDataX[i-1] * sepDataY[i-1]) - breaks[i-1] * sum(sepDataY[i-1])) / (breaks[i] - breaks[i-1])

and

/home/user/.local/lib/python3.6/site-packages/numpy/core/_methods.py:112: RuntimeWarning: invalid value encountered in subtract
  x = asanyarray(arr - arrmean)

Is that what you mean?

Would you like a keyword to turn this off?

Yes, that would be great. Since I can't be sure that results in such cases are correct, I need to know what datasets lead to this (and I have a few thousands =/).
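One possible way to find out which datasets trigger these warnings is to record them per dataset instead of letting them scroll past. The sketch below uses only the standard `warnings` module; `collect_runtime_warnings`, `toy_fit` and the `datasets` dictionary are hypothetical stand-ins for the real fitting call and files.

```python
import warnings
import numpy as np

def collect_runtime_warnings(func, *args, **kwargs):
    """Run func and return (result, list of RuntimeWarning messages)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        result = func(*args, **kwargs)
    messages = [str(w.message) for w in caught
                if issubclass(w.category, RuntimeWarning)]
    return result, messages

def toy_fit(values):
    # Stand-in for a real fit: standardising constant data divides 0 by 0
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

datasets = {"ok": [1.0, 2.0, 3.0], "constant": [1.0, 1.0, 1.0]}
flagged = []
for name, values in datasets.items():
    _, messages = collect_runtime_warnings(toy_fit, values)
    if messages:
        flagged.append(name)  # this dataset needs a closer look
```

After the loop, `flagged` lists the dataset names whose run produced at least one `RuntimeWarning`.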

I'm working on implementing fixing the x,y locations at the boundaries (beginning and end). There have been a few people who have requested this, but it's not ready (or working) at the moment.

Well, that means that I have to do more things manually.

P.S. It seems that your fitfast doesn't work as planned, however, I need to check, what data breaks it, and open a new issue.

cjekel · 2018-03-25T17:58:52Z

Thanks for the PR!

66c0245 now defaults to prints being off. Use

piecewise_lin_fit(x, y, disp_res=True)

to turn prints on. This doesn't get rid of numpy warnings.

9959ec2 fitfast() now defaults to a population of 2. This should be faster than the differential evolution for all cases, at the cost of possibly not finding a good solution. Increase the population of fitfast() to find a better solution.
Can you describe your application for boundary slopes? Are you trying to force a solution range, or speed up the optimization implementation?

vkhodygo · 2018-04-16T15:03:52Z

Good, now I can see only important messages!
I haven't tried to use updated fitfast yet, hope it works properly now.
I know that the slopes can't be, say, negative or greater than 2. Thus, any values outside that range can't be accepted. I know that scipy, for example, allows bounds on parameters, but your package is better for my purposes.
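Until bounds are supported in the fit itself, the acceptance rule described above can be applied after the fact. This is only a sketch of such a manual check; `slopes_in_range` is a hypothetical helper and the slope values are invented for illustration.

```python
import numpy as np

def slopes_in_range(slopes, low=0.0, high=2.0):
    """True only if every fitted slope lies inside [low, high]."""
    slopes = np.asarray(slopes, dtype=float)
    return bool(np.all((slopes >= low) & (slopes <= high)))

accepted = slopes_in_range([0.4, 1.2, 1.9])    # all inside [0, 2]
rejected = slopes_in_range([1.9, -0.7, 2.0])   # one negative slope
```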

P.S. Your updated version definitely looks faster, however, now I get very strange results. This drives me crazy %)

Same dataset (msd shifted), but correct (at least acceptable) results only in the case of old algo. I use default code from your examples and 3 segments to fit the data.

cjekel · 2018-04-16T15:37:51Z

Edit*:
I've found a working example that breaks... will be working on a hotfix. Sorry about this.

Can you send me that msd shifted dataset to troubleshoot? Or reproduce on a simple data set? Are you using version 0.2.3?

In the meantime you can revert to the old release by running

[sudo] pip uninstall pwlf
[sudo] pip install pwlf==0.1.7

cjekel · 2018-04-16T16:10:26Z

Fixed the weird prediction issues in 0.2.4. Sorry about that, not sure how that escaped my test function! I need to think about that more...

I know that the slopes can't be, say, negative or greater than 2. Thus, any values outside that range can't be accepted. I know that scipy, for example, allows bounds on parameters, but your package is better for my purposes.

Okay. I need to think about this more, but I think it can be done by setting up inequality constraints.

They would look something like
b_l <= b_1 <= b_h
Then
1 <= b_1/b_l
and
1 >= b_1/b_h

this might be useful for future https://math.stackexchange.com/questions/69613/linear-least-squares-with-inequality-constraints
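For reference, here is a rough sketch of what a bounded fit could look like when the break points are already known. It is not how pwlf works: it assumes fixed break points, parametrises a continuous piecewise linear model directly by an intercept and the segment slopes, and hands the box bounds to `scipy.optimize.lsq_linear`. The function name `fit_bounded_slopes` and the test data are made up.

```python
import numpy as np
from scipy.optimize import lsq_linear

def fit_bounded_slopes(x, y, breaks, slope_low, slope_high):
    """Least-squares fit with fixed breaks and slopes kept in [slope_low, slope_high]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    breaks = np.asarray(breaks, dtype=float)
    lengths = np.diff(breaks)
    # Column k holds how far x has travelled inside segment k, so that
    # y = y0 + sum_k slope_k * travelled_k is continuous at every break.
    travelled = np.clip(x[:, None] - breaks[None, :-1], 0.0, lengths[None, :])
    A = np.column_stack([np.ones_like(x), travelled])
    # The intercept is left free; only the slopes are bounded.
    lower = np.concatenate([[-np.inf], np.full(lengths.size, slope_low)])
    upper = np.concatenate([[np.inf], np.full(lengths.size, slope_high)])
    res = lsq_linear(A, y, bounds=(lower, upper))
    return res.x[0], res.x[1:]

# Noise-free data with slopes 1.5 and 0.5, both inside the bounds [0, 2]
x = np.linspace(0.0, 2.0, 21)
y = 0.5 + 1.5 * np.clip(x, 0.0, 1.0) + 0.5 * np.clip(x - 1.0, 0.0, 1.0)
y0, slopes = fit_bounded_slopes(x, y, [0.0, 1.0, 2.0], 0.0, 2.0)

# A single segment with true slope 3 is held at or below the upper bound
_, steep = fit_bounded_slopes(x, 3.0 * x, [0.0, 2.0], 0.0, 2.0)
```

Writing the bounds as simple box constraints on the slopes also avoids the ratio form, which would not be defined for a lower bound of zero.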

vkhodygo · 2018-04-16T17:07:14Z

Okay. I need to think about this more, but I think it can be done by setting up inequality constraints.

Thank you.
The attached file is one of those with strange values. The first column contains time, the last one is the data I need. I import it, skip the first row with zero values and use numpy.log10() to linearize it.

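import numpy as np
import pandas as pd
# Raw string for the regex separator; .values replaces .as_matrix(), which pandas 1.0 removed
data = pd.read_csv(file, sep=r'\s+', engine='python', usecols=(0, 3), skiprows=1, names=('lag', 'msd_shift'))
lag = np.log10(data['lag'].values)
msd_shift = np.log10(data['msd_shift'].values)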

data.zip

cjekel · 2018-04-16T17:30:13Z

Works well in version 0.2.4!

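import numpy as np
import matplotlib.pyplot as plt
import pwlf
import pandas as pd
# Read the lag and shifted MSD columns, skipping the first row of zeros
data = pd.read_csv('ref_msd.16.bin', sep=r'\s+', engine='python', usecols=(0, 3), skiprows=1, names=('lag', 'msd_shift'))
# .values replaces .as_matrix(), which pandas 1.0 removed
lag = np.log10(data['lag'].values)
msd_shift = np.log10(data['msd_shift'].values)
# Fit three line segments to the log-log data
myPWLF = pwlf.PiecewiseLinFit(lag, msd_shift)
myPWLF.fit(3)
yhat = myPWLF.predict(lag)
# Plot the data against the fitted piecewise linear function
plt.figure()
plt.plot(lag, msd_shift, 'o')
plt.plot(lag, yhat)
plt.show()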

vkhodygo · 2018-04-16T17:37:32Z

Great!
However, could you please take a look at the values of slopes: they are a little bit strange:
myPWLF.slopes
array([ 1.89755427, -0.67540119, 2.00421613])

Upd. I'm pretty sure, that the plot itself is correct. I have found the old one based on the initial algorithm. It seems that the first slope is identical in both cases, however, it's not clear what gives such behaviour.

Upd2. I feel that this is related to this question.

cjekel · 2018-04-16T19:21:47Z

I ended up creating a new function to evaluate the slopes by predicting at the break points. The results should be similar to the previous version. See d59e0be for the function.

I will push 0.2.5 to pypi shortly.

Edit: My previous interpretation of the beta parameters as slopes was incorrect. New 0.2.5 release should give you similar slope values as you had before.

vkhodygo · 2018-04-16T20:20:37Z

I've updated it but no result so far. I still get the same values:
>>> myPWLF.slopes
array([ 1.89755342, -0.67540057, 2.00421982])
>>> myPWLF.beta
array([-4.09022781, 1.89755342, -0.67540057, 2.00421982])

Edit. Sorry, everything works fine. I'm simply used to python3 (and pip3), and your upgrade instructions are for Python 2.
Edit #2: I think it works as it should. You can close this issue.

vkhodygo closed this as completed May 8, 2018
