This repository was archived by the owner on Feb 2, 2024. It is now read-only.


PokhodenkoSA
Contributor

@PokhodenkoSA PokhodenkoSA commented Nov 28, 2019

This PR makes the PyArrow thread count equal to the Numba thread count.
I do not know which threading layer PyArrow uses; it may not overlap with Numba's threading layers at all.
By default, PyArrow uses all available CPUs for reading.

Collaborator

@AlexanderKalistratov AlexanderKalistratov left a comment

LGTM

@pep8speaks

pep8speaks commented Dec 3, 2019

Hello @PokhodenkoSA! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 76:5: E722 do not use bare 'except'
Line 81:9: E128 continuation line under-indented for visual indent

Comment last updated at 2019-12-11 08:00:26 UTC

@PokhodenkoSA
Contributor Author

PokhodenkoSA commented Dec 3, 2019

This PR has grown very large. I will split it into several smaller PRs, each with a dedicated improvement. For now I am using this PR only to prepare the benchmark report.

Add pyarrow_cpu_count context manager which always restores cpu_count to its previous value.
Use single config data in all places
Add PyArrow benchmark record
read_csv data size (1m,10)
Show size [rows,cols]
Implement data file caching and move functions for generating to gen_csv.py
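
For readers who want to see the idea behind the first commit above, here is a minimal sketch of a context manager that sets a thread-pool size and always restores the previous value. It is not the code merged in this PR: the name `temporary_cpu_count`, its getter/setter parameters, and the `FakePool` stand-in are hypothetical, chosen so the example runs without PyArrow or Numba installed.

```python
from contextlib import contextmanager


@contextmanager
def temporary_cpu_count(count, get_count, set_count):
    """Set a thread-pool size for the duration of a block, then restore it.

    get_count / set_count are callables supplied by the caller
    (hypothetical parameters, not part of any library API).
    """
    previous = get_count()
    set_count(count)
    try:
        yield count
    finally:
        # Runs even if the body raises, so the previous value always comes back.
        set_count(previous)


class FakePool:
    """Hypothetical stand-in for a library with a global thread-pool size."""

    def __init__(self, size):
        self.size = size

    def cpu_count(self):
        return self.size

    def set_cpu_count(self, n):
        self.size = n


pool = FakePool(8)
with temporary_cpu_count(2, pool.cpu_count, pool.set_cpu_count):
    inside = pool.cpu_count()  # size while the block is active
after = pool.cpu_count()       # size once the block has exited
```

With the real libraries, the same helper could presumably be called with `pyarrow.cpu_count` and `pyarrow.set_cpu_count` as the getter and setter, and `numba.config.NUMBA_NUM_THREADS` as the count, which matches the goal stated in the PR description. Restoring the value in a `finally` clause is what keeps a failing benchmark from leaking a changed thread count into later runs.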
@PokhodenkoSA PokhodenkoSA merged commit 76d588d into IntelPython:master Dec 11, 2019
5 participants