Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-10617: [Python] Fix RecordBatchStreamReader iteration with Python 3.8 #8677

Closed

Conversation

sighingnow
Copy link
Contributor

No description provided.

@github-actions
Copy link

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reporting this. Could you add a test?

@pitrou pitrou changed the title ARROW-10617: [Python] Fixes RecordBatchStreamReader's iteraction bug in python 3.8. ARROW-10617: [Python] Fix RecordBatchStreamReader iteration with Python 3.8 Nov 16, 2020
@sighingnow
Copy link
Contributor Author

I do think a test is needed, but the case has already been tested in test_ipc.py (e.g., test_stream_simple_roundtrip) but seems it was skiped in the Python 3.8 case (in Github Action). I'm not sure how could I enable the test case, and the code I pasted in the Jira issue is taken from pyarrow's example, it should be easy to reproduce.

BTW, the appveyor failure is not caused by this PR.

@jorisvandenbossche
Copy link
Member

Indeed, it seems the full test_ipc.py file is being skipped on the "Python 3.8 Without Pandas" build (https://github.com/apache/arrow/pull/8677/checks?check_run_id=1406477146). And this is because we don't have a 3.8 build with pandas, see this comment at the top of the file:

# TODO(wesm): The IPC tests depend a lot on pandas currently, so all excluded
# when it is not installed
pytestmark = pytest.mark.pandas

(so certainly a TODO we should resolve)

@jorisvandenbossche
Copy link
Member

But don't we have a ursabot build with python 3.8? (they don't seem active / be triggered on this PR, though)

@sighingnow
Copy link
Contributor Author

Hi @jorgecarleitao Then could we get this one merged first ? Or should I enable pandas with python 3.8 tests on github workflow?

@jorisvandenbossche
Copy link
Member

Then could we get this one merged first ?

Ideally we would first ensure we have such a build, so we can know it is failing before applying this patch (so we know that the issue would actually be caught by our tests).

(now, I just mentioned on the issue that I can't reproduce the failure)

@pitrou
Copy link
Member

pitrou commented Dec 7, 2020

I can't reproduce the failure either. @sighingnow, I suggest you investigate a bit more?

@pitrou
Copy link
Member

pitrou commented Dec 7, 2020

Hmm, perhaps you're using a development version of Cython?

@pitrou pitrou force-pushed the ht/fix-recordbatch-reader-py38 branch from da12abb to 19e0ad7 Compare December 7, 2020 17:43
@pitrou
Copy link
Member

pitrou commented Dec 7, 2020

In any case, the change looks sane, so I'm gonna merge if CI is ok.

@pitrou pitrou closed this in 49f23a1 Dec 7, 2020
@sighingnow
Copy link
Contributor Author

Thank you, Pitrou!

@sighingnow
Copy link
Contributor Author

Hmm, perhaps you're using a development version of Cython?

Thanks for you insight! I just realized that I'm working with Cython 0.3a6.

GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
…on 3.8

Closes apache#8677 from sighingnow/ht/fix-recordbatch-reader-py38

Lead-authored-by: Tao He <linzhu.ht@alibaba-inc.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants