-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-17106: [Python] Move init code to core and expose only API #13802
Conversation
Opened ARROW-17320 to refine which symbols should be designated as private/public from |
@github-actions crossbow submit -g python |
Revision: 5c81e45 Submitted crossbow builds: ursacomputing/crossbow @ actions-474d060045 |
Default is to not re-export symbols with leading _
@github-actions crossbow submit -g python |
Revision: 8578c01 Submitted crossbow builds: ursacomputing/crossbow @ actions-db593693db |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this. Hopefully we can soon define a curated set of exported APIs.
python/pyarrow/parquet/core.py
Outdated
__all__ = [ | ||
m for m, _ in inspect.getmembers(sys.modules[__name__]) | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this is the same as the following, but much slower?
__all__ = [ | |
m for m, _ in inspect.getmembers(sys.modules[__name__]) | |
] | |
__all__ = list(__dict__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing somehting, but I get
-> 3463 __all__ = list(__dict__)
NameError: name '__dict__' is not defined
Changing it to list(sys.modules[__name__].__dict__)
and a smidgen faster. Thanks!
In [10]: N = 1_000
In [11]: # __dict__
In [12]: timeit.timeit('_ = importlib.reload(core)', number=N, globals=globals()) / N
Out[12]: 0.0005946487420005723
In [13]: # inspect.getmembers
In [14]: timeit.timeit('_ = importlib.reload(core)', number=N, globals=globals()) / N
Out[14]: 0.000692891882998083
In [15]:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, since you're using IPython, you probably know it but the %timeit
macro is useful :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recently switched to it actually, thanks for the tip! 👍
Hmm, the crash on the Sphinx & Numpydoc CI job is unfortunate but certainly not related I think. |
Benchmark runs are scheduled for baseline = 210cf06 and contender = a2f3666. a2f3666 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Will close ARROW-17106
Moves all current code in
pyarrow/parquet/__init__.py
topyarrow/parquet/core.py
and re-exports the public API from core.py inside of the init file.