Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recursion limit reached when joining period and datetimeindexes #3899

Closed
cpcloud opened this issue Jun 14, 2013 · 17 comments · Fixed by #5101

Comments

@cpcloud
Copy link
Member

commented Jun 14, 2013

one should probably not be trying to join these types of indices. even so, a stack overflow is unacceptable. there should at least be an error message...

code:

In [9]: df = mkdf(10, 10, data_gen_f=lambda *args: randint(2), c_idx_type='p', r_idx_type='dt')

In [10]: s = df.iloc[:5, 0]

In [11]: s.index.join(df.columns, how='outer')
@cpcloud

This comment has been minimized.

Copy link
Member Author

commented Jun 14, 2013

cc #3900

@cpcloud

This comment has been minimized.

Copy link
Member Author

commented Jun 14, 2013

@jreback should joins be allowed here (resulting in an object index)? alternative is to raise a type error disallowing joins between period and datetime indices. more generally, to revisit the issue about datetimeindex joins, should anything besides "date-able" strings and integers be allowed to join, the result being an object index if they were, otherwise raise a typeerror...i think this particular bug might be easy to fix if we choose to raise, but legacy support will most likely be broken since the latter would be more strict with allowed joins...

@jreback

This comment has been minimized.

Copy link
Contributor

commented Jun 14, 2013

u can convert period to datetime (and vice versa)
so I think u should do that

though still fuzzy when/if this is useful

@cpcloud

This comment has been minimized.

Copy link
Member Author

commented Jun 14, 2013

i don't think it is that useful and maybe this bug is even a bit academic, lmk if u think this not worth fixing. these kinds of bugs are notoriously difficult because the code path to the cycle has the potential to be enormous. thankfully, i just discovered the cycle, and it's not that long. i believe the problem is that object datetime indices are not being converted to datetime indices and thus self and other dtype keep testing as not equal ad infinitum.

another way to see it is that periodindexes should be kept as object here and thus the dtypes will test equal on the next recursive call and the code will terminate.

either way if the dtypes are not equal the code attempts conversion to object but the __new__ method intercepts and periodindexes always get constructed as periodindexes then join is called recursively and never finishes

@jreback

This comment has been minimized.

Copy link
Contributor

commented Jun 14, 2013

that's why it might make sense to only allow certain types of joins (and conversions)
raising otherwise

@cpcloud

This comment has been minimized.

Copy link
Member Author

commented Jun 14, 2013

okay...also this bug hidden by the "fake" unicode error that i submitted a pr for last night. later today i'll put up a table with what i think makes sense for conversions and would be nice if u could suggest additions/deletions...then will fix it according to that. thanks

@cpcloud

This comment has been minimized.

Copy link
Member Author

commented Jun 14, 2013

i think implementing this _assert_can_do_setop method in the index base class is probably a good idea. and then have a _join method (overloaded by subclasses) that is called by join

@cpcloud

This comment has been minimized.

Copy link
Member Author

commented Jun 14, 2013

@jreback so one solution that i have working is to convert period/datetime indices to the other depending on which is doing the join. the obvs prb here is all period can convert to datetime where as not all datetime can cvt to period. possible solutions:

  1. let periodindex raise when it cannot convert the rhs datetimeindex to period

or

  1. convert everything to datetime indexes
@cpcloud

This comment has been minimized.

Copy link
Member Author

commented Jun 15, 2013

i think 1) is the best option since it seems weird to have the periodindex join method return a datetimeindex even if the other is a datetimeindex. i think it should raise if the datetimeindex can't be converted to a periodindex and of course convert if it can

@jreback

This comment has been minimized.

Copy link
Contributor

commented Jun 15, 2013

let push to 0.12

@jreback

This comment has been minimized.

Copy link
Contributor

commented Sep 21, 2013

what are we doing with this? punt/push?

@cpcloud

This comment has been minimized.

Copy link
Member Author

commented Sep 21, 2013

i'll fix by 0.13

@cpcloud

This comment has been minimized.

Copy link
Member Author

commented Sep 21, 2013

just need to clarify what the exact issue is

@cpcloud

This comment has been minimized.

Copy link
Member Author

commented Sep 21, 2013

as in write it down here so that we can discuss it

@jreback

This comment has been minimized.

Copy link
Contributor

commented Sep 30, 2013

why don't we just raise for now? and then let's figure out in 0.14 (if it should even be allowed)

@cpcloud

This comment has been minimized.

Copy link
Member Author

commented Oct 3, 2013

pr in the works

@cpcloud

This comment has been minimized.

Copy link
Member Author

commented Oct 3, 2013

entire problem was that PeriodIndex.astype(object) was returning an ndarray which would then get converted back to period iindex then reconverted back to raw ndarray and so on until the recursion limit was reeached

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.