ARROW-4583: [Plasma] Fix some small bugs reported by code scan tool #3656
guoyuhong wants to merge 12 commits into apache:master
@guoyuhong Out of curiosity, which "code scan tool" did you use?
Coverity just started offering free scans to open source projects again. @pitrou can you review?
cpp/src/arrow/util/logging.h
There are lint tool conflicts here, so I disabled clang-format.
@pcmoritz Please take a look at this PR.
Ping... @pcmoritz
pitrou left a comment
Thanks for doing this @guoyuhong. Here are some comments.
cpp/src/arrow/python/deserialize.cc
Ah, sorry for not being specific here. It's better not to override the second parameter, so that the proper status code is returned (for example, a Python MemoryError would be turned into an OutOfMemory Status, etc.).
(attempting to fix it directly from GitHub...)
Thanks! I checked out common.h. Yes, ConvertPyError() is better.
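To illustrate the point about not overriding the status code, here is a minimal standalone sketch. It is not the code in common.h: the names `Code` and `ConvertErrorSketch` are hypothetical, the exception is represented by a plain string instead of a real Python error, and the `KeyError` and `TypeError` mappings are assumptions added for illustration (only the `MemoryError` case comes from the comment above).

```cpp
#include <string>

// Hypothetical status codes standing in for the library's real ones.
enum class Code { kUnknownError, kOutOfMemory, kKeyError, kTypeError, kIOError };

// `exc_name` stands in for the type of the pending Python exception.
// When the caller leaves `code` at its default, the result is derived from
// the exception type; passing an explicit `code` overrides that mapping and
// hides the more specific information.
inline Code ConvertErrorSketch(const std::string& exc_name,
                               Code code = Code::kUnknownError) {
  if (code != Code::kUnknownError) return code;  // caller forced a code
  if (exc_name == "MemoryError") return Code::kOutOfMemory;
  if (exc_name == "KeyError") return Code::kKeyError;    // assumed mapping
  if (exc_name == "TypeError") return Code::kTypeError;  // assumed mapping
  return Code::kUnknownError;
}
```

With the default argument, `ConvertErrorSketch("MemoryError")` yields `Code::kOutOfMemory`, whereas `ConvertErrorSketch("MemoryError", Code::kIOError)` yields `Code::kIOError` and loses the out-of-memory information.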
Rebased, will wait for CI.
Hmm, I really don't understand what the merge script is telling me here: @wesm did something weird happen on git master?
@pitrou master had a hiccup about 5 hours ago at 19:30 UTC; I accidentally created a merge commit in master (using the GitHub UI -- I have since asked INFRA to disallow merge commits through the UI) and quickly rebased and force-pushed master. It's possible that this PR was impacted. I have also seen the merge script misbehave at times like this when the GitHub API returns bad results now and then.
@wesm I've rebased it since then.
Finally merged. Thanks @guoyuhong!
@pitrou Thanks!
We used a static code scan tool to scan the Arrow code. It reported several possible bugs:
1. The return value of `PyDict_SetItem` is not used.
2. Currently, `EventLoop::Shutdown` must be called explicitly, which is error-prone and causes a leak when the user forgets to call it.
3. There is an unclosed file descriptor in `io.cc` when the path name is too long.

Besides, we also made the following small changes:
1. When we use Plasma in Yarn and a node uses too much memory, a SIGTERM signal is sent to Plasma. Currently, Plasma exits silently. We also add some logging to the Plasma store to help us debug.
2. `ARROW_LOG` always evaluates the output expression even when it is not enabled, which is inefficient.
3. The constructor of the Java class `ObjectStoreData` is private, which is inconvenient when we want to create a mock Plasma store.
4. Fix a call to `ObjectStoreData` which misplaces `meta` and `data` according to https://github.com/apache/arrow/blob/master/java/plasma/src/main/java/org/apache/arrow/plasma/ObjectStoreLink.java#L32 .

Author: Yuhong Guo <yuhong.gyh@antfin.com>
Author: Antoine Pitrou <antoine@python.org>

Closes apache#3656 from guoyuhong/fixPlasma and squashes the following commits:

634e36a <Antoine Pitrou> Use default argument value to `ConvertPyError`
b547f2f <Yuhong Guo> remove if from ARROW_LOG
440f097 <Yuhong Guo> Address comment
d3eb22f <Yuhong Guo> Lint
79b4af3 <Yuhong Guo> Fix and Lint
434a039 <Yuhong Guo> Make constructor of ObjectStoreData public
b2ddba6 <Yuhong Guo> Fix misplace of meta and data in PlasmaClient.java
a667402 <Yuhong Guo> Do not evaluate logging strings when logging is not enabled.
5be7b89 <Yuhong Guo> Fix unclosed fd reported by code scan tool
3a917d4 <Yuhong Guo> Fix not used return value in deserialize.cc reported by code scan tool
ed56a48 <Yuhong Guo> Fix possible unclosed EventLoop reported by code scanning tool
3fed926 <Yuhong Guo> Add plasma log
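The description notes that `ARROW_LOG` evaluated its output expression even when logging was disabled. One common way to avoid that cost is to put the streaming expression in the untaken branch of a conditional operator. The sketch below shows the general technique only; it is not the macro in cpp/src/arrow/util/logging.h, and `SKETCH_LOG`, `Severity`, `MinSeverity`, `LogMessage`, `Voidify`, and `ExpensiveDescription` are all hypothetical names introduced for illustration.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical severity levels; not Arrow's actual definitions.
enum class Severity { kDebug = 0, kInfo = 1, kError = 2 };

// Global threshold: messages below this level are disabled.
inline Severity& MinSeverity() {
  static Severity level = Severity::kInfo;
  return level;
}

inline bool IsEnabled(Severity s) { return s >= MinSeverity(); }

// Collects one message and writes it to stderr when destroyed.
class LogMessage {
 public:
  explicit LogMessage(Severity s) { stream_ << "[" << static_cast<int>(s) << "] "; }
  ~LogMessage() { std::cerr << stream_.str() << std::endl; }
  std::ostream& stream() { return stream_; }

 private:
  std::ostringstream stream_;
};

// Turns the streaming expression into void so both branches of the
// conditional operator in SKETCH_LOG have the same type.
struct Voidify {
  void operator&(std::ostream&) {}
};

// When the level is disabled, the right-hand branch (including every
// operand streamed with <<) is never evaluated.
#define SKETCH_LOG(severity) \
  !IsEnabled(severity) ? (void)0 : Voidify() & LogMessage(severity).stream()

// Helpers that make it observable whether the operand was evaluated.
inline int& CallCount() {
  static int count = 0;
  return count;
}

inline std::string ExpensiveDescription() {
  ++CallCount();
  return "expensive payload";
}
```

Because `<<` binds more tightly than `&`, which in turn binds more tightly than `?:`, a statement such as `SKETCH_LOG(Severity::kDebug) << ExpensiveDescription();` calls `ExpensiveDescription()` only when debug logging is enabled.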