BUG: possible indexing/selection bug #319
Comments
I can mail the login if anyone wants it (Jeff has it too and can share it with you if you want).
Is there any way you can provide a simple script that reproduces the failure? Additionally, is there any way that you could provide a failing script that doesn't use subprocess? Thanks!
@scopatz I couldn't find a smaller example. I don't need subprocess; that's just to run ptrepack (which shows that it does select properly with no index).
@scopatz emailed you the login info.
@jreback A bash script that does the same thing would be greatly appreciated, then. I want to ensure that our workflows are exactly the same so that we don't waste a lot of time. Also, I might not be able to get to this for a few days.
@scopatz found a reproducible example (I changed the top section).
Thanks @jreback!
Has there been any progress with this? We have a couple of reproducible examples, and nasty workarounds are required to avoid using the index, which is seemingly unreliable. Thanks.
@scopatz any progress on this?
@FrancescAlted thanks!
see here: pandas-dev/pandas#5913
Narrowed it down to this:
- create a table with a larger `expectedrows` than what is actually stored
- create an index on a column
- select via a pretty small start/stop range (e.g. in the below example, if you use a chunksize of 1M, then it doesn't show up, but 500k makes it fail).

If I don't pass `expectedrows`, then this works as expected!

Code to reproduce:

Output; the output for each chunk should be `[-20000, -19999]`; extraneous values are being selected that are not in the selection spec.
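
For readers who want to try the three steps listed above, here is a minimal sketch. It is not the reporter's script: it assumes PyTables 3.x and NumPy, and the helper names (`chunk_ranges`, `check_indexed_selection`), the column name `value`, the synthetic data, and the row counts are all hypothetical choices made for illustration. It compares each indexed chunked query against a plain NumPy filter over the same rows.

```python
def chunk_ranges(nrows, chunksize):
    """Return (start, stop) pairs covering range(nrows) in chunksize steps."""
    return [(start, min(start + chunksize, nrows))
            for start in range(0, nrows, chunksize)]


def check_indexed_selection(path, nrows=2_000_000, expectedrows=20_000_000,
                            chunksize=500_000):
    """Return the (start, stop) chunks where an indexed query and a plain
    NumPy filter over the same rows disagree."""
    # Imported here so chunk_ranges stays usable without PyTables installed.
    import numpy as np
    import tables

    # Hypothetical data: values cycle through -20000..-1, so every chunk
    # contains both target values.
    values = (np.arange(nrows, dtype="i8") % 20_000) - 20_000
    data = np.zeros(nrows, dtype=[("value", "i8")])
    data["value"] = values
    condition = "(value == -20000) | (value == -19999)"

    mismatches = []
    with tables.open_file(path, mode="w") as h5:
        # Step 1: expectedrows is deliberately larger than the rows stored.
        table = h5.create_table("/", "data", description=data.dtype,
                                expectedrows=expectedrows)
        table.append(data)
        table.flush()
        # Step 2: index the column used in the query.
        table.cols.value.create_index()

        # Step 3: select through small start/stop windows.
        for start, stop in chunk_ranges(nrows, chunksize):
            got = table.read_where(condition, start=start, stop=stop)["value"]
            chunk = values[start:stop]
            want = chunk[(chunk == -20000) | (chunk == -19999)]
            if not np.array_equal(np.sort(got), np.sort(want)):
                mismatches.append((start, stop))
    return mismatches
```

Calling `check_indexed_selection("repro.h5")` should return an empty list when the index behaves consistently; any chunk it reports is one where the indexed query returned rows the brute-force filter did not, or the reverse. Whether this particular data layout triggers the behaviour described in this issue is not confirmed.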