Optuna execution error of multiple pta indicators #15

Closed
QAQOAO opened this issue Oct 14, 2021 · 9 comments

@QAQOAO

QAQOAO commented Oct 14, 2021

After adapting the code for my own use, I found that several pandas-ta indicators raise strange errors similar to my previous issue (tos_stdevall), although the message is slightly different.

The error appears for pta.stochrsi, pta.tsi, and pta.smi. At first I thought it was specific to one indicator, so I dropped that indicator and reran. After running “all” three times, however, pta.stochrsi, pta.tsi, and pta.smi each failed in turn. I now suspect these errors share a cause: each of these indicators takes multiple period parameters. There may be more indicators with the same error that I have not found yet. I hope you can solve it soon. Thanks again.

Below is the error message for pta.smi.


RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 123, in _objective
res = [eval_res(X, self.function, self.idx, trial, sym=sym) for sym, X in X.groupby(level=1)]
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 123, in <listcomp>
res = [eval_res(X, self.function, self.idx, trial, sym=sym) for sym, X in X.groupby(level=1)]
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 98, in eval_res
res = eval(function)
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.7/dist-packages/pandas_ta/momentum/smi.py", line 23, in smi
tsi_df = tsi(close, fast=fast, slow=slow, signal=signal, scalar=scalar)
File "/usr/local/lib/python3.7/dist-packages/pandas_ta/momentum/tsi.py", line 34, in tsi
tsi_signal = ma(mamode, tsi, length=signal)
File "/usr/local/lib/python3.7/dist-packages/pandas_ta/overlap/ma.py", line 73, in ma
else: return ema(source, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/pandas_ta/overlap/ema.py", line 22, in ema
ema = EMA(close, length)
File "/usr/local/lib/python3.7/dist-packages/talib/__init__.py", line 35, in wrapper
result = func(*args, **kwargs)
File "talib/_func.pxi", line 2931, in talib._ta_lib.EMA
File "talib/_func.pxi", line 68, in talib._ta_lib.check_begidx1
Exception: inputs are all NaN

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/multiprocess/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 189, in fit
n_trials=self.n_trials, callbacks=[_early_stopping_opt])
File "/usr/local/lib/python3.7/dist-packages/optuna/study/study.py", line 409, in optimize
show_progress_bar=show_progress_bar,
File "/usr/local/lib/python3.7/dist-packages/optuna/study/_optimize.py", line 76, in _optimize
progress_bar=progress_bar,
File "/usr/local/lib/python3.7/dist-packages/optuna/study/_optimize.py", line 163, in _optimize_sequential
trial = _run_trial(study, func, catch)
File "/usr/local/lib/python3.7/dist-packages/optuna/study/_optimize.py", line 264, in _run_trial
raise func_err
File "/usr/local/lib/python3.7/dist-packages/optuna/study/_optimize.py", line 213, in _run_trial
value_or_values = func(trial)
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 188, in <lambda>
self.study.optimize(lambda trial: _objective(self, trial, X, y),
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 128, in _objective
raise RuntimeError(f"Optuna execution error: {self.function}")
RuntimeError: Optuna execution error: pta.smi(X.close, fast=trial.suggest_int('fast', 2, 30), slow=trial.suggest_int('slow', 2, 30), signal=trial.suggest_int('signal', 2, 30), scalar=trial.suggest_int('scalar', 2, 30), )
"""

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)
in ()
7 ranges=[(2, 30)],
8 trials=100,
----> 9 early_stop=50,
10 )
11

2 frames
/usr/local/lib/python3.7/dist-packages/multiprocess/pool.py in get(self, timeout)
655 return self._value
656 else:
--> 657 raise self._value
658
659 def _set(self, i, obj):

RuntimeError: Optuna execution error: pta.smi(X.close, fast=trial.suggest_int('fast', 2, 30), slow=trial.suggest_int('slow', 2, 30), signal=trial.suggest_int('signal', 2, 30), scalar=trial.suggest_int('scalar', 2, 30), )

——————————————————————-

Below is the error message for pta.stochrsi.

RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 123, in _objective
res = [eval_res(X, self.function, self.idx, trial, sym=sym) for sym, X in X.groupby(level=1)]
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 123, in <listcomp>
res = [eval_res(X, self.function, self.idx, trial, sym=sym) for sym, X in X.groupby(level=1)]
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 98, in eval_res
res = eval(function)
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.7/dist-packages/pandas_ta/momentum/stochrsi.py", line 30, in stochrsi
stochrsi_d = ma(mamode, stochrsi_k, length=d)
File "/usr/local/lib/python3.7/dist-packages/pandas_ta/overlap/ma.py", line 65, in ma
elif name == "sma": return sma(source, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/pandas_ta/overlap/sma.py", line 20, in sma
sma = SMA(close, length)
File "/usr/local/lib/python3.7/dist-packages/talib/__init__.py", line 35, in wrapper
result = func(*args, **kwargs)
File "talib/_func.pxi", line 4538, in talib._ta_lib.SMA
File "talib/_func.pxi", line 68, in talib._ta_lib.check_begidx1
Exception: inputs are all NaN

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/multiprocess/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 189, in fit
n_trials=self.n_trials, callbacks=[_early_stopping_opt])
File "/usr/local/lib/python3.7/dist-packages/optuna/study/study.py", line 409, in optimize
show_progress_bar=show_progress_bar,
File "/usr/local/lib/python3.7/dist-packages/optuna/study/_optimize.py", line 76, in _optimize
progress_bar=progress_bar,
File "/usr/local/lib/python3.7/dist-packages/optuna/study/_optimize.py", line 163, in _optimize_sequential
trial = _run_trial(study, func, catch)
File "/usr/local/lib/python3.7/dist-packages/optuna/study/_optimize.py", line 264, in _run_trial
raise func_err
File "/usr/local/lib/python3.7/dist-packages/optuna/study/_optimize.py", line 213, in _run_trial
value_or_values = func(trial)
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 188, in <lambda>
self.study.optimize(lambda trial: _objective(self, trial, X, y),
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 128, in _objective
raise RuntimeError(f"Optuna execution error: {self.function}")
RuntimeError: Optuna execution error: pta.stochrsi(X.close, length=trial.suggest_int('length', 2, 30), k=trial.suggest_int('k', 2, 30), d=trial.suggest_int('d', 2, 30), )
"""

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)
in ()
27 ranges=[(2, 30)],
28 trials=100,
---> 29 early_stop=50,
30 )
31

2 frames
/usr/local/lib/python3.7/dist-packages/multiprocess/pool.py in get(self, timeout)
655 return self._value
656 else:
--> 657 raise self._value
658
659 def _set(self, i, obj):

RuntimeError: Optuna execution error: pta.stochrsi(X.close, length=trial.suggest_int('length', 2, 30), k=trial.suggest_int('k', 2, 30), d=trial.suggest_int('d', 2, 30), )

@jmrichardson
Owner

Hmmm. I am not able to reproduce on my end:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 123, in _objective
res = [eval_res(X, self.function, self.idx, trial, sym=sym) for sym, X in X.groupby(level=1)]
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 123, in <listcomp>
res = [eval_res(X, self.function, self.idx, trial, sym=sym) for sym, X in X.groupby(level=1)]
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 98, in eval_res
res = eval(function)
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.7/dist-packages/pandas_ta/momentum/smi.py", line 23, in smi
tsi_df = tsi(close, fast=fast, slow=slow, signal=signal, scalar=scalar)
File "/usr/local/lib/python3.7/dist-packages/pandas_ta/momentum/tsi.py", line 34, in tsi
tsi_signal = ma(mamode, tsi, length=signal)
File "/usr/local/lib/python3.7/dist-packages/pandas_ta/overlap/ma.py", line 73, in ma
else: return ema(source, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/pandas_ta/overlap/ema.py", line 22, in ema
ema = EMA(close, length)
File "/usr/local/lib/python3.7/dist-packages/talib/__init__.py", line 35, in wrapper
result = func(*args, **kwargs)
File "talib/_func.pxi", line 2931, in talib._ta_lib.EMA
File "talib/_func.pxi", line 68, in talib._ta_lib.check_begidx1
Exception: inputs are all NaN

The error above says that the inputs are all NaN. Tuneta requires the dataset to have a date/symbol index and OHLCV columns. Is your dataset in the same format?
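As a rough illustration of that layout, here is a minimal sketch that builds a small frame indexed by (date, symbol) with lowercase OHLCV columns and runs a few sanity checks on it. The `check_format` helper, the level names, and the sample values are assumptions made for this example and are not part of tuneta. The lowercase `close` column and the symbol sitting at index level 1 are inferred from `X.close` and `X.groupby(level=1)` in the traceback.

```python
import numpy as np
import pandas as pd

OHLCV = ["open", "high", "low", "close", "volume"]

def check_format(X: pd.DataFrame) -> list:
    """Return a list of problems found; an empty list means the frame looks fine."""
    problems = []
    if not isinstance(X.index, pd.MultiIndex) or X.index.nlevels != 2:
        problems.append("index is not a 2-level MultiIndex of (date, symbol)")
    elif not pd.api.types.is_datetime64_any_dtype(X.index.levels[0]):
        problems.append("date level (level 0) is not datetime64")
    for col in OHLCV:
        if col not in X.columns:
            problems.append(f"missing column: {col}")
        elif X[col].isna().all():
            problems.append(f"column is all NaN: {col}")
    return problems

# Hypothetical sample data: 5 days for two made-up symbols.
dates = pd.date_range("2021-10-11", periods=5, freq="D")
index = pd.MultiIndex.from_product([dates, ["AAA", "BBB"]], names=["date", "symbol"])
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.uniform(10, 20, size=(len(index), len(OHLCV))),
                 index=index, columns=OHLCV)

print(check_format(X))                          # no problems for this frame
print(check_format(X.drop(columns=["close"])))  # reports the missing column
```

Running a check like this on the real `X_train` before calling `fit` would rule out the most obvious format problems.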

And this:

RuntimeError: Optuna execution error: pta.smi(X.close, fast=trial.suggest_int('fast', 2, 30), slow=trial.suggest_int('slow', 2, 30), signal=trial.suggest_int('signal', 2, 30), scalar=trial.suggest_int('scalar', 2, 30), )

There seems to be an issue with running pta.smi. Unfortunately, I can't see what the error message from pandas-ta is.

I just tried a number of different parameter combinations for pta.smi, and all of them completed successfully. I have also run several tests with 'all' and with just those three indicators, with no issues:

    tt.fit(X_train, y_train,
        indicators=['pta.stochrsi', 'pta.tsi', 'pta.smi'],
        # indicators=['all'],
        ranges=[(2, 30)],
        trials=100,
        early_stop=50,
    )

I just added some try/except logic to output the specific error message to see if that sheds some light on the situation. Try installing with:

pip install -U git+https://github.com/jmrichardson/tuneta
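For readers wondering what such try/except logic might look like, the idea of surfacing the underlying error can be sketched with exception chaining. This is an illustration only and not tuneta's actual code: `eval_indicator` and `failing_indicator` are hypothetical names, and the message format merely mimics the one in the traceback.

```python
def failing_indicator(values):
    # Stand-in for an indicator call that fails deep inside a library.
    raise ValueError("inputs are all NaN")

def eval_indicator(name, func, values):
    """Run an indicator and re-raise any failure with the original message attached."""
    try:
        return func(values)
    except Exception as e:
        # "from e" keeps the original exception as __cause__, and the new
        # message repeats its text so it is visible even in a one-line log.
        raise RuntimeError(f"Optuna execution error: {name}: {e}") from e

try:
    eval_indicator("pta.smi", failing_indicator, [1.0, 2.0])
except RuntimeError as err:
    print(err)                  # wrapper message, including the original text
    print(repr(err.__cause__))  # the original exception object
```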

Also, can you send me what your X_train dataframe looks like to see if there is anything there that could be causing the problem?

@QAQOAO
Author

QAQOAO commented Oct 14, 2021

My data contains no NaN values. When I use your example (only a few stocks), no error appears.
The only change I made was adding more stocks (roughly 100). Some of these stocks had their IPO very late, so at the start date they were not yet listed while others already were. Their values for that period should be NaN, but I filled them with “ffill and bfill”. I am not sure whether this is the cause.

Below is the X_train.

[Two screenshots of the X_train DataFrame]

Also, in your prune_dataframe.py, catch22 produces 22 features, so after adding the five ohlcv columns there should be 27 columns listed below “Indicator Correlation to target”.

However, I counted all the column names and found that “SC_FluctAnal_2_rsrangefit_50_1_logi_prop_r1” and “SC_FluctAnal_2_dfa_50_1_2_logi_prop_r1” did not appear below “Indicator Correlation to target”.

Similarly, when I used my custom features, at least one column disappeared. I have no idea why some features would disappear. It might be some kind of bug.

By the way, my input data uses the same format as “tune market”. Does tuneta currently support “prune market”, or only a single ticker?

@jmrichardson
Owner

Perfect, your data looks good. Did you install the latest update and get the error message? I want to see why pandas-ta has an error for those indicators.

> However, I counted all the column names and found that “SC_FluctAnal_2_rsrangefit_50_1_logi_prop_r1” and “SC_FluctAnal_2_dfa_50_1_2_logi_prop_r1” did not appear below “Indicator Correlation to target”.

Sorry, I should have documented this somewhere: I remove all features with zero correlation. On second thought, maybe I shouldn't do that. What do you think? The code is on line 134 of tune_ta.py:

self.fitted = [f for f in self.fitted if f.study.user_attrs['best_trial'].user_attrs['correlation'] > 0]
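For illustration, the filter above can be mimicked on plain data and extended to report what it drops, which would make the silent removal visible. This is a sketch under assumptions: the `(name, correlation)` tuples and their values are made up, and `split_by_correlation` is a hypothetical helper that is not part of tuneta.

```python
def split_by_correlation(features, threshold=0.0):
    """Split (name, correlation) pairs into kept and dropped lists.

    A feature is kept only when its correlation is strictly greater than
    the threshold, mirroring the `> 0` test in the line above.
    """
    kept = [(name, corr) for name, corr in features if corr > threshold]
    dropped = [(name, corr) for name, corr in features if not corr > threshold]
    return kept, dropped

# Made-up feature names and correlations.
features = [("feature_a", 0.12), ("feature_b", 0.0), ("feature_c", 0.03)]
kept, dropped = split_by_correlation(features)

for name, corr in dropped:
    print(f"dropped {name}: correlation {corr}")
```

Writing the test as `not corr > threshold` also sends NaN correlations to the dropped list, so they show up in the report instead of vanishing.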

> By the way, my input data uses the same format as “tune market”. Does tuneta currently support “prune market”, or only a single ticker?

It supports pruning for both. It also supports pruning pre-existing features (see prune_dataframe for an example).

When you can, send the error messages from the latest update of tuneta, which now has logic to display a more descriptive error message (hopefully).

@jmrichardson
Owner

Thinking more about the bfill, I wonder whether the indicators have a problem with data that begins with constant values. Maybe remove those stocks and run again, just as a test?
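One way to find the affected tickers is to measure how many rows at the start of each symbol's series repeat the first value, which is what a backfill before the IPO date produces. A minimal sketch, assuming a (date, symbol) MultiIndex with a level named `symbol` and a `close` column; `leading_constant_run` and the sample data are hypothetical.

```python
import pandas as pd

def leading_constant_run(s: pd.Series) -> int:
    """Number of rows at the start of s that equal its first value."""
    if s.empty:
        return 0
    changed = s.ne(s.iloc[0])
    # If the value never changes, the whole series is one constant run.
    return len(s) if not changed.any() else int(changed.to_numpy().argmax())

dates = pd.date_range("2021-01-04", periods=6, freq="D")
close = {
    "OLD": [10.0, 11.0, 12.0, 11.5, 12.5, 13.0],
    # "NEW" listed late: its first three rows were backfilled with the
    # first real price, so four rows in a row carry the same value.
    "NEW": [50.0, 50.0, 50.0, 50.0, 51.0, 52.0],
}
frames = [
    pd.DataFrame(
        {"close": values},
        index=pd.MultiIndex.from_product([dates, [sym]], names=["date", "symbol"]),
    )
    for sym, values in close.items()
]
X = pd.concat(frames).sort_index()

runs = X.groupby(level="symbol")["close"].apply(leading_constant_run)
print(runs[runs > 1])  # symbols whose series starts with a repeated value
```

Symbols flagged this way are candidates to drop, or to trim to their first real observation, before rerunning the test.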

@QAQOAO
Author

QAQOAO commented Oct 14, 2021

First of all, I always use !pip install -U git[tuneta url], so I think the version is not the issue.

As for the rule of excluding zero-correlation features, I think either printing a note along with the output or documenting the rule would be fine.

Secondly, regarding the execution error: it still occurred after I dropped several "bfilled" tickers.

Moreover, "pta" is not the only library with an execution error. After I changed ["all"] to ["fta", "tta"], I received another error, shown below.

RuntimeError: Optuna execution error: fta.VORTEX(X, period=trial.suggest_int('period', 2, 30), )

So it seems that many indicators have this kind of issue.

I suggest adding more try/except handling so that when an indicator fails, "tt" skips it and moves on to the next indicator.
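The skip-on-failure behaviour suggested here could look roughly like the following. It is a generic sketch and not tuneta's implementation: `run_all`, the indicator names, and the toy functions are all hypothetical.

```python
def run_all(indicators, data):
    """Evaluate every indicator, collecting failures instead of stopping."""
    results, failures = {}, {}
    for name, func in indicators.items():
        try:
            results[name] = func(data)
        except Exception as e:
            # Record the reason and carry on with the next indicator.
            failures[name] = f"{type(e).__name__}: {e}"
    return results, failures

def mean(values):
    return sum(values) / len(values)

def broken(values):
    # Stand-in for an indicator that raises inside a library call.
    raise ValueError("inputs are all NaN")

results, failures = run_all({"mean": mean, "broken": broken, "max": max},
                            [1.0, 2.0, 3.0])
print(results)   # values from the indicators that succeeded
print(failures)  # reasons for the ones that were skipped
```

Keeping the failure reasons in a separate dictionary means skipped indicators can still be reported at the end of the run.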

Finally, I found you on the AI4Finance Slack. Can I DM you there so the conversation is more convenient?

@QAQOAO
Author

QAQOAO commented Oct 14, 2021

I just realized that the cause might be the frequency of y, since my target is designed to be weekly.

As a result, after merging with ohlcv, the date index of the DataFrame has a one-week interval.
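A quick way to test the frequency hypothesis is to count rows per symbol. With a weekly index each symbol has roughly a fifth of the rows, and indicators that chain several smoothing periods need a longer warm-up before they yield non-NaN values. The sketch below flags symbols with few rows. It assumes a (date, symbol) MultiIndex with a level named `symbol`; `short_symbols` and the threshold of 90 rows (three periods of up to 30, a rough heuristic) are assumptions for illustration, not a rule taken from tuneta or pandas-ta.

```python
import pandas as pd

def short_symbols(X: pd.DataFrame, min_rows: int) -> pd.Series:
    """Row counts for symbols that have fewer than min_rows observations."""
    counts = X.groupby(level="symbol").size()
    return counts[counts < min_rows]

def make_frame(symbol, dates):
    # Hypothetical single-symbol frame with a dummy close column.
    index = pd.MultiIndex.from_product([dates, [symbol]], names=["date", "symbol"])
    return pd.DataFrame({"close": [float(i) for i in range(len(dates))]}, index=index)

daily = pd.date_range("2021-01-04", periods=120, freq="B")
weekly = daily[::5]  # keep one business day per week

X = pd.concat([make_frame("DAILY", daily), make_frame("WEEKLY", weekly)]).sort_index()

print(short_symbols(X, min_rows=90))  # lists symbols below the threshold
```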

Nevertheless, after switching to daily data and rerunning "all indicators", I still get the error below.

This time, however, "tt" seems to get past "pta.smi", "pta.tsi", and "pta.stochrsi", which gave me errors before.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 129, in _objective
res = [eval_res(X, self.function, self.idx, trial, sym=sym) for sym, X in X.groupby(level=1)]
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 129, in <listcomp>
res = [eval_res(X, self.function, self.idx, trial, sym=sym) for sym, X in X.groupby(level=1)]
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 104, in eval_res
raise Exception(e)
Exception: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/multiprocess/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 195, in fit
n_trials=self.n_trials, callbacks=[_early_stopping_opt])
File "/usr/local/lib/python3.7/dist-packages/optuna/study/study.py", line 409, in optimize
show_progress_bar=show_progress_bar,
File "/usr/local/lib/python3.7/dist-packages/optuna/study/_optimize.py", line 76, in _optimize
progress_bar=progress_bar,
File "/usr/local/lib/python3.7/dist-packages/optuna/study/_optimize.py", line 163, in _optimize_sequential
trial = _run_trial(study, func, catch)
File "/usr/local/lib/python3.7/dist-packages/optuna/study/_optimize.py", line 264, in _run_trial
raise func_err
File "/usr/local/lib/python3.7/dist-packages/optuna/study/_optimize.py", line 213, in _run_trial
value_or_values = func(trial)
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 194, in <lambda>
self.study.optimize(lambda trial: _objective(self, trial, X, y),
File "/usr/local/lib/python3.7/dist-packages/tuneta/optimize.py", line 134, in _objective
raise RuntimeError(f"Optuna execution error: {self.function}")
RuntimeError: Optuna execution error: fta.VORTEX(X, period=trial.suggest_int('period', 2, 30), )

@jmrichardson
Owner

Good morning. Yes, let's chat on Slack; it should go a bit faster than here. I realized I had a nested try/except which may have been hiding the more descriptive error message, so I removed the outer "try" and pushed the change to the repo. Go ahead and give it a shot, and ping me on Slack.

@QAQOAO
Author

QAQOAO commented Oct 15, 2021

I noticed that you have turned off notifications on Slack. Just a reminder that I have already sent you a DM there.

@jmrichardson
Owner

Glad we got the issues resolved. The fix involved several modifications to tuneta and ensuring the training data has a datetime index instead of an object-dtype index.
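For anyone hitting the same "Cannot cast array data from dtype('O')" error, the index conversion can be sketched as follows. This is a minimal example under assumptions: the dates are stored as strings in level 0 of a (date, symbol) MultiIndex, and `ensure_datetime_level` is a hypothetical helper that is not part of tuneta.

```python
import pandas as pd

def ensure_datetime_level(X: pd.DataFrame, level: int = 0) -> pd.DataFrame:
    """Return X with the given MultiIndex level converted to datetime64.

    X is returned unchanged if the level is already datetime64.
    """
    dates = X.index.levels[level]
    if pd.api.types.is_datetime64_any_dtype(dates):
        return X
    out = X.copy()
    out.index = out.index.set_levels(pd.to_datetime(dates), level=level)
    return out

# Made-up frame whose date level holds strings instead of timestamps.
index = pd.MultiIndex.from_product(
    [["2021-10-11", "2021-10-12"], ["AAA", "BBB"]], names=["date", "symbol"]
)
X = pd.DataFrame({"close": [1.0, 2.0, 3.0, 4.0]}, index=index)

fixed = ensure_datetime_level(X)
print(X.index.levels[0].dtype)      # dtype before conversion (string dates)
print(fixed.index.levels[0].dtype)  # dtype after conversion (datetime)
```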
