Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected behaviour with replace-missing, first value propagated up despite :down strategy #305

Closed
alex314159 opened this issue Jun 18, 2022 · 7 comments

Comments

@alex314159
Copy link

Hi Chris,

replace-missing is very useful especially with time series data, when used with strategy :down it typically means "if market was closed for this particular security on this day, use the previous day". However replace-missing behaves unexpectedly when the column starts with nil values: it propagates the first value into the past. So [nil nil 5 3 nil 2 4] becomes [5 5 5 3 3 2 4] instead of [nil nil 5 3 3 2 4] (latter is also the pandas behaviour). In real life the nil at the beginning means "the security didn't exist yet".

Would it be possible to add a strategy that does the above? I've tried looking into the code but it's too complex for me... of course a low performance version would be (reduce (fn [a b] (conj a (if (nil? b) (peek a) b))) [] coll)

Thank you!

@cnuernber
Copy link
Collaborator

Yes for sure - I agree behavior is wrong

@cnuernber
Copy link
Collaborator

@genmeblog - Will fixing this the suggested way cause regressions in TC?

@genmeblog
Copy link

Yeah, it caused a regression. However I need to check why I initially implemented it this way. The proposed behavior is actually the proper one.
Maybe it's good to introduce "updown" and "downup" strategies to allow previous behavior? (R has such: https://tidyr.tidyverse.org/reference/fill.html)

@genmeblog
Copy link

genmeblog commented Jun 19, 2022

Additionally I think the other functionality is removed.
When strategy is "down" or "up" and the user provides a value, the remaining missing values after applying requested strategy should be filled with a given value.

Just :down
[nil nil 5 3 nil 2 4] -> [nil nil 5 3 3 2 4]

:down with a value 999, ie (replace-missing ds column :down 999)
[nil nil 5 3 nil 2 4] -> [999 999 5 3 3 2 4]

I propose to keep this path.

@cnuernber
Copy link
Collaborator

Got it. Yep - makes sense.

@cnuernber cnuernber reopened this Jun 19, 2022
@genmeblog
Copy link

Thanks. To summarize:

  • :up - as proposed, fills gaps above (only)
  • :down - fills gaps below
  • :updown - first up then down
  • :downup - first down then up
  • if value is provided to :up or :down, fill the remaining missings with a value.

@cnuernber
Copy link
Collaborator

cnuernber commented Jun 19, 2022

Updated with the proximally above specification with the addition that value is taken into consideration even when using updown or downup in that if the value is provided it overrides the secondary direction.

Release 6.090.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants